shadedecho
Programmer
ok... I have developed a reputation on this site for long complicated posts (I like to think that's because I don't ask questions that are simple because I somehow figure those out on my own...) anyway, this one is no different --
NOTE: not for the faint of heart!
This question relates to a web-based timesheet system I have written for my company, and all the tables that store the information each employee enters daily. i have a hugely complex set of cross-linked tables, and I am trying to do one master query across all of those tables (using joins) to come up with a result set that has a field for each of the variables and calculated values that I need, and each record in the result set corresponds to one employee.
Here's the foundation for my question:
I have an employee table, which is inner-linked on a many-to-one basis with the SupID (supervisor's employee ID) field pointing to another record's EmpID (employee ID).
I should now mention that all other records are related to an employee record ONLY by a calculated value... by this, I mean that a "UID" is created on the fly with MD5(CONCAT(`FirstName`,`LastName`,`EmpID`)) from the employee table and that UID (32 character string) is what is stored in each timesheet or child or previous adjustment record.
Next, I have a timesheet table, which has the main record of a timesheet for a certain pay period, the unique identifier is TSID. Then, linked to the main timesheet record are records from two "child" tables, project_worked and leavetime_used, both on a many-to-one relationship to the main timesheet record (linking to the TSID).
The project_worked records are flagged with a "default_holiday" if they are holiday pay, and one of several other (irrevlevant) codes if not. So I have to get a total of non-holiday hours (flag <> 'default_holiday') and a total of holiday hours (flag = 'default_holiday').
The project_worked (and leavetime_used) records are stored with a "date" field, meaning they correspond to a particular day that an employee logged time for. overtime is calculated based on calendar weeks (sun-sat), but our pay period starts and ends in the middle of the calendar week, so overtime calculation has to "look back" to previous days, all the way back to the sunday preceeding the payperiod startdate, to get an accurate calendar week total. The way our pay period dates fall, it is POSSIBLE for there to be a total of 3 full calendar weeks in question that the OT calculation would need to occur on (ie, the pay period starts on a saturday, the last day of a period, and ends 16 days later, on a monday).
Also, we have this notion of "previous adjustments" which are logged in the current time period, but are adjusting the last day of the previous pay period. these are all stored in one table, but are flagged with a field to be "project_worked" or "leavetime_used". the previous adjustments are line items, that each have their own primary key, but related previous adjustments (ie, adjustments for an employee in a particular pay period) all have a common "documentID". also, previous adjustments are linked to a "current pay period" with the "period_start_date".
So, the challenge is, I need to have the query use the SUM() function to sum up based on the "total_time" field in my child tables like project_worked, leavetime_used, and previousTS_adjustments. Also, I need a calendar week total using SUM from each of the 3 possible calendar weeks (don't worry, I programatically input the appropriate start and end dates when creating this query). This way, I can examine each week's total to see if > 40 to be displayed elsewhere as OT.
Sound complicated enough?
Here's the query I've written so far to attempt this feat...
Note, $$$___$$$ denotes a variable that I programatically build into the query string in PHP, so just assume it to be some constant.
so, the good news is that I don't get any syntax errors from all that stuff. bad news, it is taking INCREDIBLY too much time to execute... right now, the query has been running for nearly 10 minutes and the page just timed out. I only have about 200 employee records, and each employee has one timesheet (so far) and maybe 30 child records. This is NOT a huge data set at this point, I must be doing something incredibly inefficient for this to be occuring.
As you can see in the query, I select out some fields, and then select out some "SUM"s of the "total_time" field in the various other tables... because i need to get the total in several different fashions (holiday, non-holiday, week1, etc etc) I am joining the same table to the mix several times over, each time applying a different ON clause for that particular condition, like the date being between the start and end dates of the respective calendar week.
Can anyone help me out in where I might be going wrong with this query?
NOTE: not for the faint of heart!
This question relates to a web-based timesheet system I have written for my company, and all the tables that store the information each employee enters daily. i have a hugely complex set of cross-linked tables, and I am trying to do one master query across all of those tables (using joins) to come up with a result set that has a field for each of the variables and calculated values that I need, and each record in the result set corresponds to one employee.
Here's the foundation for my question:
I have an employee table, which is inner-linked on a many-to-one basis with the SupID (supervisor's employee ID) field pointing to another record's EmpID (employee ID).
I should now mention that all other records are related to an employee record ONLY by a calculated value... by this, I mean that a "UID" is created on the fly with MD5(CONCAT(`FirstName`,`LastName`,`EmpID`)) from the employee table and that UID (32 character string) is what is stored in each timesheet or child or previous adjustment record.
Next, I have a timesheet table, which has the main record of a timesheet for a certain pay period, the unique identifier is TSID. Then, linked to the main timesheet record are records from two "child" tables, project_worked and leavetime_used, both on a many-to-one relationship to the main timesheet record (linking to the TSID).
The project_worked records are flagged with a "default_holiday" if they are holiday pay, and one of several other (irrevlevant) codes if not. So I have to get a total of non-holiday hours (flag <> 'default_holiday') and a total of holiday hours (flag = 'default_holiday').
The project_worked (and leavetime_used) records are stored with a "date" field, meaning they correspond to a particular day that an employee logged time for. overtime is calculated based on calendar weeks (sun-sat), but our pay period starts and ends in the middle of the calendar week, so overtime calculation has to "look back" to previous days, all the way back to the sunday preceeding the payperiod startdate, to get an accurate calendar week total. The way our pay period dates fall, it is POSSIBLE for there to be a total of 3 full calendar weeks in question that the OT calculation would need to occur on (ie, the pay period starts on a saturday, the last day of a period, and ends 16 days later, on a monday).
Also, we have this notion of "previous adjustments" which are logged in the current time period, but are adjusting the last day of the previous pay period. these are all stored in one table, but are flagged with a field to be "project_worked" or "leavetime_used". the previous adjustments are line items, that each have their own primary key, but related previous adjustments (ie, adjustments for an employee in a particular pay period) all have a common "documentID". also, previous adjustments are linked to a "current pay period" with the "period_start_date".
So, the challenge is, I need to have the query use the SUM() function to sum up based on the "total_time" field in my child tables like project_worked, leavetime_used, and previousTS_adjustments. Also, I need a calendar week total using SUM from each of the 3 possible calendar weeks (don't worry, I programatically input the appropriate start and end dates when creating this query). This way, I can examine each week's total to see if > 40 to be displayed elsewhere as OT.
Sound complicated enough?
Here's the query I've written so far to attempt this feat...
Note, $$$___$$$ denotes a variable that I programatically build into the query string in PHP, so just assume it to be some constant.
Code:
SELECT emp.`actual_name`,
emp.`EmpID`,
MD5(CONCAT(emp.`FirstName`,emp.`LastName`,emp.`EmpID`)) AS 'empUID',
emp.`SupID`,
sup.`actual_name`,
MD5(CONCAT(sup.`FirstName`,sup.`LastName`,sup.`EmpID`)) AS 'supUID',
ts.`TSID`,
SUM(ts_pw.`total_time`) AS 'ts_pw_total',
SUM(ts_lt.`total_time`) AS 'ts_lt_total',
SUM(ts_pw_holiday.`total_time`) AS 'ts_holiday_total',
SUM(ts_pw_week1.`total_time`) AS 'ts_pw_week1_total',
SUM(ts_pw_week2.`total_time`) AS 'ts_pw_week2_total',
SUM(ts_pw_week3.`total_time`) AS 'ts_pw_week3_total',
pa.`documentID`,
SUM(pa_pw.`total_time`) AS 'pa_pw_total',
SUM(pa_lt.`total_time`) AS 'pa_lt_total'
FROM `employee` emp INNER JOIN `employee` sup ON sup.`EmpID` = emp.`SupID`
LEFT OUTER JOIN `timesheet` ts ON ts.UID = MD5(CONCAT(emp.`FirstName`,emp.`LastName`,emp.`EmpID`))
LEFT OUTER JOIN `project_worked` ts_pw ON ts.`TSID` = ts_pw.`TSID` AND ts_pw.`flag` <> 'default_holiday'
LEFT OUTER JOIN `project_worked` ts_pw_holiday ON ts.`TSID` = ts_pw_holiday.`TSID` AND ts_pw_holiday.`flag` = 'default_holiday'
LEFT OUTER JOIN `project_worked` ts_pw_week1 ON ts.`TSID` = ts_pw_week1.`TSID` AND ts_pw_week1.`flag` <> 'default_holiday' AND (TO_DAYS(ts_pw_week1.`date`) BETWEEN TO_DAYS('$$$calendar1start$$$') AND TO_DAYS('$$$calendar1end$$$'))
LEFT OUTER JOIN `project_worked` ts_pw_week2 ON ts.`TSID` = ts_pw_week2.`TSID` AND ts_pw_week1.`flag` <> 'default_holiday' AND (TO_DAYS(ts_pw_week2.`date`) BETWEEN TO_DAYS('$$$calendar2start$$$') AND TO_DAYS('$$$calendar2end$$$'))
LEFT OUTER JOIN `project_worked` ts_pw_week3 ON ts.`TSID` = ts_pw_week3.`TSID` AND ts_pw_week1.`flag` <> 'default_holiday' AND (TO_DAYS(ts_pw_week3.`date`) BETWEEN TO_DAYS('$$$calendar3start$$$') AND TO_DAYS('$$$calendar3end$$$'))
LEFT OUTER JOIN `leavetime_used` ts_lt ON ts.`TSID` = ts_lt.`TSID`
LEFT OUTER JOIN `previousTS_adjustments` pa ON pa.`UID` = MD5(CONCAT(emp.`FirstName`,emp.`LastName`,emp.`EmpID`))
LEFT OUTER JOIN `previousTS_adjustments` pa_pw ON pa_pw.`documentID` = pa.`documentID` AND pa_pw.`type` = 'work'
LEFT OUTER JOIN `previousTS_adjustments` pa_lt ON pa_lt.`documentID` = pa.`documentID` AND pa_lt.`type` = 'leave'
WHERE sup.`EmpID` = emp.`SupID` AND
ts.`start_date` = '$$$periodstart$$$' AND
ts.`end_date` = '$$$periodend$$$' AND
ts.`status` = 'processed' AND
pa.`status` = 'processed' AND
pa.`period_start_date` = '2003-03-12'
GROUP BY emp.`actual_name`
so, the good news is that I don't get any syntax errors from all that stuff. bad news, it is taking INCREDIBLY too much time to execute... right now, the query has been running for nearly 10 minutes and the page just timed out. I only have about 200 employee records, and each employee has one timesheet (so far) and maybe 30 child records. This is NOT a huge data set at this point, I must be doing something incredibly inefficient for this to be occuring.
As you can see in the query, I select out some fields, and then select out some "SUM"s of the "total_time" field in the various other tables... because i need to get the total in several different fashions (holiday, non-holiday, week1, etc etc) I am joining the same table to the mix several times over, each time applying a different ON clause for that particular condition, like the date being between the start and end dates of the respective calendar week.
Can anyone help me out in where I might be going wrong with this query?