Tek-Tips is the largest IT community on the Internet today!


very fast queries 1

Status
Not open for further replies.

jepatte

Programmer
Jul 4, 2005
75
US
I have a site that has voter information for a state. There are over 2.5 million records in the main table, with 50 fields. The site allows users to choose many different filters to get counts, e.g. race, age, gender, vote history, etc. No matter what my indexes are, it still takes anywhere from 3 to 60 seconds to return results.

Here's the thing: there is another site like mine that is even more complex, with many more fields, and it can return any count, no matter how complex, in less than one second. Can someone tell me how this is possible? Do you think they really have it reading all 2.5 million records and returning results that fast?

My server has SQL Server 2000 and 4 GB of RAM on a Windows 2003 box. I'm not sure what the other site is using.
 
My database is normalized. Each voter has 50+ properties. How else can you normalize that?

Age
Name
Race
Gender
VoteHistory - 20 columns which are Y/N
Addresses
Phone
Email
Districts - 10 columns...
 
I don't know who wrote this, but did you do any load/stress/performance testing with a realistic dataset before going live?

I have close to a billion rows and get subsecond performance (and yes, you can select on several different criteria).

Denis The SQL Menace
--------------------
SQL Server Code,Tips and Tricks, Performance Tuning
SQLBlog.com, Google Interview Questions
 
So it's best to have multiple tables, all with one-to-one relationships, so that you have fewer columns? Even though vote history is different for every single person, put that in a different table? So my database will have, say, 5 tables, all with 2.5 million records?
 
Also, I'm not doing a SELECT *, I'm just selecting 6 columns.
 
You will have more tables, but you might not need to join all of them (if you don't return the address, for example, then you don't need that table in the query, etc.).

Since you will be able to store many more rows per page, your I/O should be much better.
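Something like this, purely as a sketch (the table and column names here are made up, not taken from the actual schema):

```sql
-- Hypothetical split of the wide voter table into narrow one-to-one tables
CREATE TABLE Voter (
    VoterID int          NOT NULL PRIMARY KEY,
    Name    varchar(100) NULL,
    Age     tinyint      NULL,
    Race    char(1)      NULL,
    Gender  char(1)      NULL
);

CREATE TABLE VoterAddress (
    VoterID int NOT NULL PRIMARY KEY
            REFERENCES Voter (VoterID),
    Address varchar(100) NULL,
    Phone   varchar(20)  NULL,
    Email   varchar(100) NULL
);

-- A count by demographics never touches VoterAddress,
-- so far fewer pages have to be read from disk
SELECT COUNT(*)
FROM Voter
WHERE Gender = 'F' AND Age BETWEEN 18 AND 25;
```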

Denis The SQL Menace
--------------------
SQL Server Code,Tips and Tricks, Performance Tuning
SQLBlog.com, Google Interview Questions
 
So even if my SQL statement only selects a few fields, does the I/O have to include everything?
i.e.
Select field1, field2, field3 from <tableWith10fields>
is much faster than
Select field1, field2, field3 from <tableWith50fields>

 
Just so I understand this: a page is what's considered against the table, not the recordset?

ie.
T1 = 10 fields
SELECT F1 FROM T1 WHERE xzy = ?
....is faster than....
T1 = 50 fields
SELECT F1 FROM T1 WHERE xzy = ?
 
I don't think you understand.

SQL Server stores the data in your database (obviously). What you probably don't know is HOW the data is stored. Data is stored on the disk in 8 kilobyte 'pieces'. Each piece is a Page. When data is read from the hard drive, SQL Server will read the entire page even if there are records contained within the page that are later filtered out.

Now, each column you have in a table takes up space. For example, an integer takes 4 bytes. A varchar takes up to its defined size (a varchar(20) takes up to 20 bytes). A row cannot exceed 8060 bytes; the remaining bytes of the 8192-byte page are reserved for internal use.

Now, suppose each row in your table takes 5000 bytes. You will only get 1 row per page. Alternatively, if your table is 'narrow' and only contains a couple of columns, and those columns take 100 bytes, you will get approximately 80 rows per page.

So... having narrower tables allows SQL Server to store more rows per page. More rows per page means less physical I/O (hard drive access). Less I/O means better performance.
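To put the numbers above into a query (these are the SQL Server 2000 figures; integer division mirrors the fact that a row cannot span pages):

```sql
-- Rows that fit on one page, given the 8060-byte usable row space
SELECT 8060 / 5000 AS WideRowsPerPage,    -- 1 row per page for a 5000-byte row
       8060 / 100  AS NarrowRowsPerPage;  -- 80 rows per page for a 100-byte row
```

With 2.5 million rows, that is roughly the difference between reading 2.5 million pages and reading about 31,000.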

Make sense?

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
And the votehistory table could look different too, if they really are yes/no fields (no NULLs allowed). Then I guess you could get away with saving only the 'yes' rows, and from 20 columns you could break it down to 2, perhaps 3, columns. And do the 10 district columns contain NULL values?

Christiaan Baes
Belgium

My Blog
 
Thanks George, that clears things up a lot.

Christiaan, I'm not quite sure I understand your VoteHistory response. Each person currently has fields Gen00, Gen02, Gen04, Gen06, etc., each either Y or N.

I'm not sure how to make 20 columns into 2 or 3?

Do I need a row for every possibility, then do something like....
VoterHistoryKey, VoteHistoryData
In VoteHistoryData, have values like YNNYN, etc
Meaning Gen00Gen02Gen04Gen06, etc
 
genhistory table

voterid (id from the voter table)
gennumber (00, 02, 04,...)
yes/no field (optional)

A second possibility could be

voterid (see above)
Options (1000100010001001)
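As a sketch of the first option (names are assumed, not from the actual schema), only the elections a voter actually voted in get a row:

```sql
-- Only 'yes' votes get a row, so most of the 20 Y/N columns disappear
CREATE TABLE GenHistory (
    VoterID   int     NOT NULL,  -- FK to the voter table
    GenNumber char(2) NOT NULL,  -- '00', '02', '04', ...
    PRIMARY KEY (VoterID, GenNumber)
);

-- Count voters who voted in the 2004 general election
SELECT COUNT(*)
FROM GenHistory
WHERE GenNumber = '04';

-- The second option would instead pack the history into one string, e.g.
-- WHERE SUBSTRING(Options, 3, 1) = '1'  -- position 3 = the '04' election
```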

Christiaan Baes
Belgium

My Blog
 
I haven't read all the replies in this thread, but on reading the original post my first thought was, "I wonder if the other site is using SQL Server Analysis Services?"

Just a thought....

Smeat

 
Nah, it's just SQL 2000; I'm not sure if Analysis Services is even possible with SQL 2000.
 
So I can see that if I have a table of 2-3 million records I should move my vote history to another table. But should I really move my address, phone, and email to separate tables as well?

Is it really bad practice to have Name, Address, City, State, Zip, Phone, Email in a single table?
 