Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Year over Year Variance - How best to perform - Excel Pivot or PowerPivot or SQL 1

Status
Not open for further replies.

BxWill

MIS
Mar 30, 2009
367
US
My objective is to perform a monthly import of a text file containing two years of data into MS Excel 2013 and create a Year over Year, by month analysis using vba or some other process and determine the following;

Step 1: Top 5 months with the greatest variances - Year over Year
Step 2: Then, the Top 5 providers for the 5 months in Step 1
Step 3: Then, the Top 5 customers (by PaidAmt) for the 5 providers in Step 2

The text file contains 2013 and 2014 data and consists of the following columns;
Date, Provider, CustomerName, PaidAmt

Currently, I manually create the three pivot tables and perform the analysis which is somewhat time consuming.

Pivot table #1 - I have the PaidAmt by Year and Month. For example,
Row: Month
Column: Year
Values: Sum of PaidAmt, Difference, %-age Increase(Decrease)

Pivot table #2 - I have the PaidAmt by Month and providers. And, the third pivot table is the PaidAmt by Provider and by CustomerName.

Note, I cannot rank on the calculated column, "%-age Increase(Decrease)", within a pivot table - whether creating the pivot table manually or with VBA.

Any suggestions on how best to "automate" this process?

Also contemplated the use of a PowerPivot-based data model with time intelligence functionality or importing the text file into Sql Server using SSIS and then within SSMS, create a query? However, I am not sure that this can be performed using SQL.


Thanks in advance for any insight.
 
What is the purpose of this query? It escapes me, if indeed this is your composite master table of 2013 thru the latest in 2015???

Regarding the error, surely you didn't use my sConn code, did you? Your Excel version might be different than mine.

Back to the reason for the query. If this is just to get the entire table into the MSTR sheet, I would not do it this way! Reason? Because it's a ONE TIME EVENT! Rather develop the process for IMPORTING into your yet undefined Import sheet and transferring that data to you tMSTR table on your MSTR sheet. Otherwise an inadvertent REFRESH in you current IMPORT query, will wipe out any monthly updates to your tMSTR table.

With further review of your code, I see that you're importing a table from a workbook. That makes it even more reason that this is a ONE TIME EVENT, that I'd just COPY 'n' PASTE. Your on-going process for importing, I believe, is from a TEXT file for monthly updates, so I would NOT muck it up with this non-text query!

Skip

[glasses]Just traded in my OLD subtlety...
for a NUance![tongue]
 
Skip,

Still at it...

Just to clarify;

At this point, I have an MS Excel workbook with the three worksheets
1. MSTR - contains a manually created query table - Table name "tMSTR" - based on text file
2. PVT1 - Structured table - Table name "tPVT1" - contains PaidAmt by month by year - created via array formula. (Can be created by pivot of text file directly?)
3. PVT2 - Should display the providers for each of the Top 5 months.
4. PVT3 - Should display the customer names for the providers in Step 3

Currently, the all inclusive text file is imported every month into the worksheet "MSTR." Eventually, I may consider separating the current month's data from the other data (data for 2013, 2014 and YTD 2015).

Yes, the sub procedure "QueryTableExample()" appears to not be necessary because of the contents of worksheet "MSTR."

Yes, did copy the connection properties that you provided. Any insight as to how to determine the properties for a external connection to a external text file? To a range on another worksheet? If I recall, there was a process to connect to a external file using notepad and then save with extension ".odl" or something similar. Also, I recorded a macro to import the text file and then reviewed.

Sorted the worksheet "PVT1" in descending order based on the variance in the paid amounts - 2014 vs. 2013, highlighted the months (column containing 1 through 12) and range named the top 5

Then, upon running the sub procedure "PVT2()", the error message is
"Subscript out of range" and the line "With Sheets("Sheet2").ListObjects("tPVT1").QueryTable " is highlighted

Code:
Sub PVT2()
    Dim sSQL As String, sPath As String, sDB As String, sConn As String
    sPath = "C:\Users\Tom\Documents"
    sDB = "Test.txt"
    
    sConn = "ODBC;DSN=Excel Files;"
    'sConn = "Text;"
    sConn = sConn & "DBQ=" & sPath & "\" & sDB & ";"
    sConn = sConn & "DefaultDir=" & sPath & ";"
    sConn = sConn & "DriverId=1046;MaxBufferSize=2048;PageTimeout=5;"

    '--build SQL statement
    sSQL = sSQL & "transform Sum(a.PaidAmt)"
    sSQL = sSQL & vbLf
    sSQL = sSQL & "SELECT"
    sSQL = sSQL & "  a.Provider"
    sSQL = sSQL & vbLf
    sSQL = sSQL & "FROM `MSTR$` a"
    sSQL = sSQL & vbLf
    sSQL = sSQL & "WHERE month(a.Date) in (" & MakeList([rTop5Months]) & ")"
    sSQL = sSQL & vbLf
    sSQL = sSQL & "group by"
    sSQL = sSQL & "  a.Provider"
    sSQL = sSQL & vbLf
    sSQL = sSQL & "pivot year(a.Date)"
    
    'Error on line below:  "Subscript out of range"
    'Due to not setting up a query table on worksheet "PVT2" ?? 
    'Question - Query table of what?  Is Query table needed?
    'Question - Sub procedure should output pivot or records about top 25 providers on worksheet "PVT2" - 5 providers for each of the selected 5 months.  
    
With Sheets("Sheet2").ListObjects("tPVT1").QueryTable 

        .Connection = sConn
        .CommandText = sSQL
        .BackgroundQuery = False
        .Refresh
    End With
End Sub

Any additional comments regarding a resolution of the errors?
 
Currently, the all inclusive text file is imported every month into the worksheet "MSTR." Eventually, I may consider separating the current month's data from the other data (data for 2013, 2014 and YTD 2015).

I mistakenly thought that you imported monthly addenda, but it seems that your import into MSTR is the complete file each time. My apologies. However, please do not consider separating this data, as that will complicate your process.

Regarding the connection string, see my post of 12 Oct 15 14:03, the second, table setup, where after having added the QT manually, you can get the connection string and SQL string in the Debug.Prints from the Immediate Window.

After adding the PVT1 QT, did you name the table tPVT1? And with PVT2?

Skip,

[glasses]Just traded in my OLD subtlety...
for a NUance![tongue]
 
Resuming where I left off...

I am thinking that I can use just one query table (tMSTR), one structured table on worksheet "PVT2" where I rank providers and one structured table on worksheet "PVT3" where I rank customers.


Still have questions regarding the connection string.

Should the connection string within Sub PVT2() be for the external text file or tMSTR, the Excel table within the workbook that is sourced from the text file?


I did read about "Memory Leakage" when querying Excel data within MS Excel... Is this something I should be concerned with?

Below, I have revised the code. Note, I did add two additional columns to the text file - "Month","Year."

Code:
Sub PVT2_Revised()
    Dim sSQL As String, sPath As String, sDB As String, sConn As String

'Connecting to tMSTR table  
'    sPath = "C:\Users\Bill\Documents"
'    sDB = "Test.xlsm"
'    sConn = "ODBC;DSN=Excel Table;"
'    sConn = sConn & "DBQ=" & sPath & "\" & sDB & ";"
'    sConn = sConn & "DefaultDir=" & sPath & ";"
'    sConn = sConn & "DriverId=790;FIL=excel 8.0;MaxBufferSize=2048;PageTimeout=5;"

'Connecting to Text File
    sPath = "C:\Users\Bill\Documents"
    sDB = "SampleTextFile4a.txt"
    sConn = "TEXT;"
    sConn = sConn & "DBQ=" & sPath & "\" & sDB & ";"
    sConn = sConn & "DefaultDir=" & sPath & ";"
    
    '--build SQL statement
    'sSQL = sSQL & "transform Sum(a.PaidAmt)"
    'sSQL = sSQL & vbLf
    sSQL = sSQL & "SELECT"
    sSQL = sSQL & "  a.Provider"
    sSQL = sSQL & vbLf
    sSQL = sSQL & "FROM `MSTR$` a"
    sSQL = sSQL & vbLf
    sSQL = sSQL & "WHERE a.Month in (" & MakeList([rTop5Months]) & ")"
    sSQL = sSQL & vbLf
    sSQL = sSQL & "group by"
    sSQL = sSQL & "  a.Provider"
    'sSQL = sSQL & vbLf
    'sSQL = sSQL & "pivot year(a.Date)"
    
    'Debug.Print sSQL

End Sub

How would Sub PVT2() change if I want to perform the following;

1. Output the list of providers from the sql script above to column B (starting at row 4) of the structured table (tProv) on worksheet "PVT2?"

Prior to pasting the data to worksheet "PVT2", I need to clear column B that contains the provider names.

Did perform extensive research as to how this can be accomplished but have not been able to obtain a definitive answer.

Some examples used recordsets with ADO. Not quite sure as to the advantages/disadvantages with the current approach vs. recordsets with ADO...

Thanks in advance for any additional insight.
 
Continuing my research, it appears that thread707-1535834 may have some applicability...

Will review...

 
Regarding memory leaks, I've used the technique of querying sheets in my workbook extensively, often with several 100k+ row result sets from other data bases and a dozen sheets with QTs from the large internal sheets, over a period of more than 10 years and any memory leaks have not hampered my applications.

Regarding you 2 new columns, Month & Year, why not just ONE DATE column; first of the month? Any month/year would need to be converted to a date to be useful. I see no advantage to store a month/year! Rather a disadvantage!

Regarding PVT2, I need to review the work I've done earlier and your workbook sample.

Skip,

[glasses]Just traded in my OLD subtlety...
for a NUance![tongue]
 
And some thoughts in additional to the previous post.

If you are importing the entire table every month from your Test.xlsm workbook to MSTR, then why do that? I'd query your Test.xlsm table directly from PVT1 & PVT2, unless you want the data in the MSTR sheet. This has nothing to do with memory leaks.

Your PVT2 sheet query, the TRANSFORM you posted above, will repopulate the resultset each time your execute the code. Don't need to delete anything, if I understand you correctly.

Skip,

[glasses]Just traded in my OLD subtlety...
for a NUance![tongue]
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top