Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Year over Year Variance - How best to perform - Excel Pivot or PowerPivot or SQL 1

Status
Not open for further replies.

BxWill

MIS
Mar 30, 2009
367
US
My objective is to perform a monthly import of a text file containing two years of data into MS Excel 2013 and create a Year over Year, by month analysis using vba or some other process and determine the following;

Step 1: Top 5 months with the greatest variances - Year over Year
Step 2: Then, the Top 5 providers for the 5 months in Step 1
Step 3: Then, the Top 5 customers (by PaidAmt) for the 5 providers in Step 2

The text file contains 2013 and 2014 data and consists of the following columns;
Date, Provider, CustomerName, PaidAmt

Currently, I manually create the three pivot tables and perform the analysis which is somewhat time consuming.

Pivot table #1 - I have the PaidAmt by Year and Month. For example,
Row: Month
Column: Year
Values: Sum of PaidAmt, Difference, %-age Increase(Decrease)

Pivot table #2 - I have the PaidAmt by Month and providers. And, the third pivot table is the PaidAmt by Provider and by CustomerName.

Note, I cannot rank on the calculated column, "%-age Increase(Decrease)", within a pivot table - whether creating the pivot table manually or with VBA.

Any suggestions on how best to "automate" this process?

Also contemplated the use of a PowerPivot-based data model with time intelligence functionality or importing the text file into Sql Server using SSIS and then within SSMS, create a query? However, I am not sure that this can be performed using SQL.


Thanks in advance for any insight.
 
Hi,

Once you have created your three PTs, why do you need to create any other PTs?

Am I missing something?

Skip,

[glasses]Just traded in my OLD subtlety...
for a NUance![tongue]
 
Manually creating the three pivot tables and ranking the items within the first pivot table is a time consuming process.

I would like to perform this analysis relatively quickly instead of manually creating the first pivot table and then manually ranking the months based on the %-age increase(decrease) in the paid amounts, then filtering the second pivot table on the month with the largest increase by having the "month" in the page field - noting the top 5 providers that is driving the variance, then going to the third pivot table and filtering initially on the top 1 provider and noting the top 5 customers driving the variance for this provider, then filtering on the top 2 provider and noting the top 5 customers driving the variance for this provider, and so on.

 
The manual effort for creating the three PTs is a ONE TIME EVENT.

I fail to understand what manual ranking means. Is this not a sort of some sort?

The filtering of the second PT based on a result of the first (or maybe the result of a summary formula from your source table) could be accomplished without intervention, it appears.

You might ought to post your workbook so we can talk specifics.

Also, how are you calculating your variances?

Skip,

[glasses]Just traded in my OLD subtlety...
for a NUance![tongue]
 
Okay, I've developed a process.

You first need to IMPORT the next month's data into a MSTR sheet, containing your Date, Provider, CustomerName, PaidAmt. This sheet/table will be an accumulation of data.

Then you will need several QueryTables to perform your data analysis.

QT1 to rank the top 5 months for variance in PaidAmt, using this query:
Code:
Transform sum(a.PaidAmt) 

SELECT month(a.Date) as [MO]

FROM [b]`MSTR$`[/b] a

GROUP BY month(a.Date)

pivot year(a.date)

The QuertTable resultset will have a calculated column VAR that will be sorted high to low, so that the first 5 rows of your table will be the 5 months with the highest variance year to year. That range of months, will be converted to a string to use in an IN () statement in the next query to return the Providers for those months.

Am I getting the picture? Can you see what the next step might be?

Skip,

[glasses]Just traded in my OLD subtlety...
for a NUance![tongue]
 
Thanks for the insight.

In my initial post, I stated that the data was for 2013 and 2014.

Within the next few weeks, data for 2015 should be compared with 2014.

Therefore, every month, I receive the text file that contains all of 2014 data and the year to date data for 2015. Note, due to customers sometimes paying late, the PaidAmt figure for the 2015 data will usually fluctuate for the months prior to the current month.

Yes, by "ranking", I mean sorting the pivot data on the calculated field, in descending order.

It is not possible to sort on "%-age Increase(Decrease)", a calculated field that is based on a comparison of 2014 data with 2013 data.

The error message received is something like:
"AutoSort and AutoShow cannot be used with custom calculations that use positional references. Do you want to turn off AutoSort/Show?"

This is because there will be blank columns within the pivot table...

Therefore, upon creating the pivot table, I copy the entire pivot table to another section of the worksheet and use paste/special values and then sort on the column titled "%-age Increase(Decrease)." Alternatively, I could use a getPivot formula to extract data from the pivot table and then sort.

Will explore your method further.

The method you are using appears to be similar to the use of temporary tables within T-SQL?


 
Therefore, every month, I receive the text file..."

Just keep appending data each month.

"Yes, by "ranking", I mean sorting the pivot data on the calculated field, in descending order."

I'd suggest NOT using a PivotTable. Rather successive queries.

"It is not possible to sort on "%-age Increase(Decrease)", a calculated field that is based on a comparison of 2014 data with 2013 data."

That's why! It IS possible to sort a query resultset with additional calculated columns!

All the sequential queries process can be designed and coded and run start to finish. Have done this many times with more complex process than I believe you have stipulated.

Again, it would help if you posted your workbook or a reasonable facsimile.

BTW, DFW here y'all. ;-)


Skip,

[glasses]Just traded in my OLD subtlety...
for a NUance![tongue]
 
So here's a thought regarding 3 years of data:

Your query returns
[pre]
MO 2013 2014 2015
[/pre]

Then you can add Varience columns for VAR13-14 and VAR14-15 all in one table.

Depending on the Varience you're workin on is the column you sort on.

Skip,

[glasses]Just traded in my OLD subtlety...
for a NUance![tongue]
 
Actually, you don't need a query for the first "pivot"

Here's the table structure...
[pre]
MO 2013 2014 2015 VAR13-14 VAR14-15

1
2
3
4
5
6
7
8
9
10
11
12
[/pre]

so then the suming formula for the PaidAmt in B2 is...
[tt]
B2: =SUMPRODUCT((MONTH(tMSTR[Date])=$A2)*(YEAR(tMSTR[Date])=VALUE(B$1))*(tMSTR[PaidAmt]))
[/tt]
...where the table in the MSTR sheet is a Structured Table named tMSTR. Copy/Paste this formula under the YEAR headings. Then enter your variance formula under the VARyy-yy headings.

Oh yes, and this new table is also a Structured Table, named tPVT1.

Skip,

[glasses]Just traded in my OLD subtlety...
for a NUance![tongue]


 
Sample of the data is below

Date------------CustomerName----Provider-----------PaidAmt
1/1/2013--------Jill Anderson----Sela Liquidators---2,028.60
1/1/2013--------Hill Vance-------Taylor Mfg---------265.00
1/9/2013--------Yial Hillo-------Taylor Mfg---------725.04
1/10/2013-------Cynthia Hall-----Multan-----------75.00
1/27/2013-------Rayna Smith------Lucas Mfg--------145.00
2/1/2013--------James Smith------Nevada Tool------52.65
2/6/2013--------Samantha Lovea---ICM-------------450.00
2/25/2013-------Jane Anderson----Childress-------1,159.19
2/28/2013-------Margaret Smith---Lakeside--------237.71
3/1/2013--------Hill Vance-------Taylor Mfg------480.00
3/1/2013--------Samantha Lovea---ICM-------------456.00
3/1/2013--------Mary Horton------Region Mfg------138.98

Currently reviewing the use of query tables via VBA.

Initially, I created the code below to import from the text file.

Code:
Set shMSTR = Workbooks(1).Worksheets(1) 
Set qtResults = shMSTR.QueryTables.Add( _ 
    Connection := "TEXT;C:\My Documents\Data_October2015.txt", 
    Destination := shMSTR.Cells(1,1)) 
With qtResults 
    .TextFileParsingType = xlFixedWidth 
    .TextFileFixedColumnWidths := Array(12,25,35,15) 
    .TextFileColumnDataTypes := _ 
        Array(xlGeneralFormat,xlGeneralFormat,xlGeneralFormat,xlGeneralFormat) 
    .Refresh 
End With

Then, I created the following, that currently errors out on the line ".CommandText = sSQL"
Error is "Object variable or With Block variable not set."

Any insight as to a resolution to the error?

Any additional examples of how to create the multiple query tables and rank based on the percentage increase in the PaidAmt in 2014 relative to 2013?

Code:
Sub QueryTableExample()
Dim sSQL As String
Dim qt As QueryTable
Dim rDest As Range

'--build SQL statement
sSQL = "Transform Sum(a.PaidAmt)" & _
"SELECT month(a.Date) as [MO]" & _
"FROM `MSTR$` a" & _
"GROUP BY month(a.Date)" & _
"pivot Year(a.Date);"

Set rDest = Sheets("Results").Range("A1")
rDest.CurrentRegion.Clear  'optional- delete existing table

'--populate QueryTable
With qt
    .CommandText = sSQL  '<---- Error: "Object Variable not set
    .CommandType = xlCmdSql
    .AdjustColumnWidth = True  'add any other table properties here
    .BackgroundQuery = False
    .Refresh
End With

Set qt = Nothing
Set rDest = Nothing
End Sub



 
First of all, you do NOT want to use code to ADD querytables. Adding a QT is a ONE TIME EVENT, generally speaking. I can count on one finger the number of times in 20 years that I have used code to add a QT. In fact, you ought to delete the sheet to destroy any MULTIPLE QTs you might have added inadvertently, of run a...
Code:
dim lo as listobject
for each lo in activesheet.listobjects
  lo.delete
next
...as QTs are a property of ListObjects since Excel 2007.

Set up your QT MANUALLY. Get the SQL working just the way you need it to run. Then, use this sub code to 1) get the connecton and 2) get the SQL, each of which can be COPIED from the IMMEDIATE Window.
Code:
Sub QueryTableSetup()
    Dim sSQL As String, sPath As String, sDB As String, sConn As String
    
    sPath = ""
    
    sDB = ""
    
    sConn = ""

    '--build SQL statement
    sSQL = ""
    
    With Sheets("Results").ListObjects(1).QueryTable
        Debug.Print .Connection
        Debug.Print .CommandText
    End With
End Sub
Copy your connection string into the sConn variable.
Copy your SQL string into the sSQL variable.
Naturally, each of these should be massaged for ease of readability & maintainability.

Here;s how it looks in my code sheet.
Code:
Sub QueryTableExample()
    Dim sSQL As String, sPath As String, sDB As String, sConn As String
    
    sPath = "C:\Users\Skip\Documents"
    
    sDB = "tt_bx_summary.xlsm"
    
    sConn = "ODBC;DSN=Excel Files;"
    sConn = sConn & "DBQ=" & sPath & "\" & sDB & ";"
    sConn = sConn & "DefaultDir=" & sPath & ";"
    sConn = sConn & "DriverId=1046;MaxBufferSize=2048;PageTimeout=5;"

    '--build SQL statement
    sSQL = "TRANSFORM Sum(a.PaidAmt)"
    sSQL = sSQL & vbLf
    sSQL = sSQL & "SELECT month(a.Date) as [MO]"
    sSQL = sSQL & vbLf
    sSQL = sSQL & "FROM `MSTR$` a"
    sSQL = sSQL & vbLf
    sSQL = sSQL & "GROUP BY month(a.Date)"
    sSQL = sSQL & vbLf
    sSQL = sSQL & "PIVOT Year(a.Date)"
    
    With Sheets("Results").ListObjects(1).QueryTable
        .Connection = sConn
        .CommandText = sSQL  '<---- Error: "Object Variable not set
        .BackgroundQuery = False
        .Refresh
    End With
End Sub
[highlight #FCE94F]SORRY, I HAD SOME PUNCTUATION ERRORS THAT I CORRECTED IN THE ABOVE CODE[/highlight]

Notice how I could change the path or the DB (workbook) quite easily.

Skip,

[glasses]Just traded in my OLD subtlety...
for a NUance![tongue]
 
Okay, the next step is getting the Top 5 from the first table and using that in the query for the second table.

Here's a function to return an IN() list from a range in Excel. So the first thing is to select the top 5 months in the resultset for the first query and NAME that range rTop5Months.

This is the function that you'll need to return that list...
Code:
Function MakeList(rng As Range)
    Dim r As Range
    For Each r In rng
        MakeList = MakeList & "'" & r.Value & "'" & ","
    Next
    MakeList = Left(MakeList, Len(MakeList) - 1)
End Function

Then your second query might look something like this (remember to ADD the QT first before running this code)...
Code:
Sub PVT2()
    Dim sSQL As String, sPath As String, sDB As String, sConn As String
    
    sPath = "C:\Users\Skip\Documents"
    
    sDB = "tt_bx_summary.xlsm"
    
    sConn = "ODBC;DSN=Excel Files;"
    sConn = sConn & "DBQ=" & sPath & "\" & sDB & ";"
    sConn = sConn & "DefaultDir=" & sPath & ";"
    sConn = sConn & "DriverId=1046;MaxBufferSize=2048;PageTimeout=5;"

    '--build SQL statement
    sSQL = sSQL & "transform Sum(a.PaidAmt)"
    sSQL = sSQL & vbLf
    sSQL = sSQL & "SELECT"
    sSQL = sSQL & "  a.Provider"
    sSQL = sSQL & vbLf
    sSQL = sSQL & "FROM `MSTR$` a"
    sSQL = sSQL & vbLf
    sSQL = sSQL & "WHERE month(a.Date) in ([highlight #FCE94F]" & MakeList([[b]rTop5Months[/b]]) & "[/highlight])"
    sSQL = sSQL & vbLf
    sSQL = sSQL & "group by"
    sSQL = sSQL & "  a.Provider"
    sSQL = sSQL & vbLf
    sSQL = sSQL & "pivot year(a.Date)"
    
    With Sheets("Results").ListObjects("tPVT2").QueryTable
        .Connection = sConn
        .CommandText = sSQL
        .BackgroundQuery = False
        .Refresh
    End With
End Sub

BTW, this same technique can be used for your rTop5Providers or rTop5Customers or...

Skip,

[glasses]Just traded in my OLD subtlety...
for a NUance![tongue]
 
Interesting concept - the use of query tables...

Currently reviewing...

Are there any disadvantages and/or limitations with the use of this approach?




 
It might depend on the particular application, but I have been using queries in Excel, from all kinds of external data sources like Oracle, DB2, SQL Server, Access, Excel, text files and then using sucessive queries within Excel, often based on user interaction (controls). You might call them complex interactive reporting or decision support systems in Excel.

Skip,

[glasses]Just traded in my OLD subtlety...
for a NUance![tongue]
 
Let me give you a perspective. Quite often, when you get a text file from someone in your company, you discover that this data came from some sort of IT maintained system. BTW, over the past 15 years I was not in an IT department, but I discovered, got access to and learned to use the databases that housed the data my management needed to run their organization: that is data in a form they were not getting.

So very likely the monthly text file you get is derived from one or more tables in your company's IT system. I'd make it a priority to find out and get access and learn how that database works. They probably rehost data in tables that will not adversely affect production systems when reporting users access them.

You crawl, then walk, then run.

Just my 2 cents.

Skip,

[glasses]Just traded in my OLD subtlety...
for a NUance![tongue]
 
Receive error message - "Subscript out of range" when using the following code:

Code:
Sub QueryTableExample()
    Dim sSQL As String, sPath As String, sDB As String, sConn As String
    
    sPath = "C:\Users\Tom\Documents"
    
    sDB = "Test.xlsm"
    
    sConn = "ODBC;DSN=Excel Files;"
    sConn = sConn & "DBQ=" & sPath & "\" & sDB & ";"
    sConn = sConn & "DefaultDir=" & sPath & ";"
    sConn = sConn & "DriverId=1046;MaxBufferSize=2048;PageTimeout=5;"

    '--build SQL statement
    sSQL = "TRANSFORM Sum(a.PaidAmt)"
    sSQL = sSQL & vbLf
    sSQL = sSQL & "SELECT month(a.Date) as [MO]"
    sSQL = sSQL & vbLf
    sSQL = sSQL & "FROM `MSTR$` a"
    sSQL = sSQL & vbLf
    sSQL = sSQL & "GROUP BY month(a.Date)"
    sSQL = sSQL & vbLf
    sSQL = sSQL & "PIVOT Year(a.Date)"
    
    With Sheets("Results").ListObjects(1).QueryTable  '<--- Error: Subscript out of range
        .Connection = sConn
        .CommandText = sSQL
        .BackgroundQuery = False
        .Refresh
    End With
End Sub


Due to the error, I decided to count the number of query tables on worksheet "MSTR using the code below. It appears that the number = 0, which is incorrect. It should be "1."

Therefore, I am trouble shooting. Any idea as to the cause/resolution of error?

Code:
Sub Count_QT_New()
  Dim lQT As Long
  Dim LO As ListObject

  For Each LO In ActiveSheet.ListObjects
    If LO.SourceType = 3 Then 'xlSrcQuery
        lQT = lQT + 1
    End If
  Next LO
  Debug.Print lQT
End Sub
 
Well you subscript error is on Results, not MSTR. So did you add a query to Results? Apparently not. Or you deleted the LO that was there? There needs to be a LO.QT there via a manually added query.

As for MSTR, how did a LO get there? Only by 1) adding a query (LO.QT) or 2) inserting a Structured Table (LO) via Insert>Tables>Table. However, on this sheet, you're looking for a LO that has a QT property. WHY? If your table is a result is 2), then your code returns a false answer, as there could very well be a LO there and there should!

But your error was with reference to Results!

???

Skip,

[glasses]Just traded in my OLD subtlety...
for a NUance![tongue]
 
Ok.

Did revise the code;

From: "With Sheets("Results").ListObjects(1).QueryTable"

To: "With Sheets("MSTR").ListObjects(1).QueryTable"

Also, reviewed the worksheet "MSTR", deleted and reimported the text file.

The error received now is "Run time Error 1004: Application-defined or object-defined error"
and the following line within the code is highlighted.

Code:
With Sheets("MSTR").ListObjects(1).QueryTable

Note, upon reimporting the text file, I converted to a table. Received a dialog box - "Your selection overlaps one or more external data ranges. Do you want to convert the selection to a table and remove all external connections?"

Therefore, it appears that upon converting to a table, I have lost all connections to the text file...

I believe that I need to maintain the connection considering that I need to refresh the connection to "pull in" the additional records for the current month as well as all of the records for 2015 that may contain revised PaidAmt figures). (The Accounting Dept. places the updated text file on the network drive that contains 2014 data and YTD 2015 data.)

Any additional insight as to a resolution to the error received?
 
I think you have a problem in your process. I'd see you IMPORTING in some other sheet, the SINGLE MONTH that is in the text file and then APPENDING that DATA to the table in MSTR (the MASTER table), from which you will do your year over year variance analysis & reporting.

BTW, the table that you use to IMPORT the text file data IS A ListObject.QueryTable already, so you don't need to convert it to a table (i.e. A Structured Table). All you need do is give it an appropriate table name using your preferred convention. My convention is to name my Structured Tables like tMSTR or tPVT1. That's my convention. If it were a pre-2007 QueryTqble (no ListObjects before vers 2007), then I would name it qMSRT, for instance. My named ranges are like rTop5Months.

So I'd expect that MSTR would have data for all 12 months for 2013 & 2014 and part of 2015, BUT all you have given me is a 3 month 2013 table. ???

Where did I go astray?

Skip,

[glasses]Just traded in my OLD subtlety...
for a NUance![tongue]
 
Yes, MSTR worksheet has 12 months of data for 2013 and 2014. There is 9 months of data for 2015 - Jan through Sept.

Did review the process and it is as initially planned - using the MSTR table (all inclusive table containing 2013 through 2015 data) to perform the year over year variance analysis and reporting. Also, using another worksheet for the current month's data.

Recreated the entire process and have encountered a error - Object Variable or With Block variable not set and the line ".Connection = sConn" is highlighted.

What am I missing??

Code:
Sub QueryTableExample()
    Dim sSQL As String, sPath As String, sDB As String, sConn As String
    
    sPath = "C:\Users\Tom\Documents"
    
    sDB = "Test.xlsm"
           
    sConn = "ODBC;DSN=Excel Files;"  '1st iteration
    'sConn = "TEXT;"  '2nd iteration;  'Error persists
       
    sConn = sConn & "DBQ=" & sPath & "\" & sDB & ";"
    sConn = sConn & "DefaultDir=" & sPath & ";"
    sConn = sConn & "DriverId=1046;MaxBufferSize=2048;PageTimeout=5;"

    '--build SQL statement
    sSQL = "TRANSFORM Sum(a.PaidAmt)"
    sSQL = sSQL & vbLf
    sSQL = sSQL & "SELECT month(a.Date) as [MO]"
    sSQL = sSQL & vbLf
    sSQL = sSQL & "FROM `MSTR$` a"
    sSQL = sSQL & vbLf
    sSQL = sSQL & "GROUP BY month(a.Date)"
    sSQL = sSQL & vbLf
    sSQL = sSQL & "PIVOT Year(a.Date)"
    
    Debug.Print sSQL
    
    With Sheets("MSTR").ListObjects(1).QueryTable
                
        .Connection = sConn   <----- ERROR: Object Variable or With Block Variable not set
        .CommandText = sSQL
        .BackgroundQuery = False
        .Refresh
    End With
End Sub

 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top