Tek-Tips is the largest IT community on the Internet today!


very fast queries 1

Status
Not open for further replies.

jepatte

Programmer
Jul 4, 2005
75
US
I have a site that has voter information for a state. There are over 2.5 million records in the main table, with 50 fields. The site allows users to choose many different filters to get counts, e.g. race, age, gender, vote history, etc. No matter what my indexes are, it still takes anywhere from 3 to 60 seconds to return results.

Here's the thing: there is another site like mine that is even more complex, with many more fields, and it can return any count, no matter how complex, in less than one second. Can someone tell me how this is possible? Do you think they really have it reading all 2.5 million records and returning results that fast?

My server has SQL Server 2000 and 4 GB of RAM on a Windows 2003 box. I'm not sure what the other site is using.
 
My database is normalized. Each voter has 50+ properties. How else can you normalize that?

Age
Name
Race
Gender
VoteHistory - 20 columns which are Y/N
Addresses
Phone
Email
Districts - 10 columns...
 
I don't know who wrote this, but did you do any load/stress/performance testing with a realistic dataset before going live?

I have close to a billion rows and get subsecond performance (and yes, you can select on several different criteria).

Denis The SQL Menace
--------------------
SQL Server Code,Tips and Tricks, Performance Tuning
SQLBlog.com, Google Interview Questions
 
So it's best to have multiple tables, all with one-to-one relationships, so that you have fewer columns? Even though vote history is different for every single person, put that in a different table? So my database will have, say, 5 tables, all with 2.5 million records?
 
Also, I'm not doing a SELECT *, I'm just selecting 6 columns.
 
You will have more tables, but you might not need to join all of them (if you don't return the address, for example, then you don't need that table in the query, etc.).

Since you will be able to store many more rows per page, your I/O should be much better.
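Something like this, purely as a sketch (the table and column names here are made up, not taken from the actual schema):

```sql
-- Hypothetical split of the wide voter table into narrow one-to-one tables
CREATE TABLE Voter (
    VoterID int          NOT NULL PRIMARY KEY,
    Name    varchar(100) NULL,
    Age     tinyint      NULL,
    Race    char(1)      NULL,
    Gender  char(1)      NULL
);

CREATE TABLE VoterAddress (
    VoterID int NOT NULL PRIMARY KEY
            REFERENCES Voter (VoterID),
    Address varchar(100) NULL,
    Phone   varchar(20)  NULL,
    Email   varchar(100) NULL
);

-- A count by demographics never touches VoterAddress,
-- so far fewer pages have to be read from disk
SELECT COUNT(*)
FROM Voter
WHERE Gender = 'F' AND Age BETWEEN 18 AND 25;
```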

Denis The SQL Menace
--------------------
SQL Server Code,Tips and Tricks, Performance Tuning
SQLBlog.com, Google Interview Questions
 
So even if my SQL statement only selects a few fields, does the I/O have to include everything?
i.e.
Select field1, field2, field3 from <tableWith10fields>
is much faster than
Select field1, field2, field3 from <tableWith50fields>

 
Just so I understand this: a page is what's considered against the table, not the recordset?

ie.
T1 = 10 fields
SELECT F1 FROM T1 WHERE xzy = ?
....is faster than....
T1 = 50 fields
SELECT F1 FROM T1 WHERE xzy = ?
 
I don't think you understand.

SQL Server stores the data in your database (obviously). What you probably don't know is HOW the data is stored. Data is stored on the disk in 8 kilobyte 'pieces'. Each piece is a Page. When data is read from the hard drive, SQL Server will read the entire page even if there are records contained within the page that are later filtered out.

Now, each column you have in a table takes up space. For example, an integer takes 4 bytes. A varchar takes up to its defined size (a varchar(20) takes up to 20 bytes). A row cannot exceed 8060 bytes; the remaining bytes of the 8192-byte page are reserved for internal use.

Now, suppose each row in your table takes 5000 bytes. You will only get 1 row per page. Alternatively, if your table is 'narrow' and only contains a couple of columns, and those columns take 100 bytes, you will get approximately 80 rows per page.

So... having narrower tables allows SQL Server to store more rows per page. More rows per page means less physical I/O (hard drive access). Less I/O means better performance.
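To put the numbers above into a query (these are the SQL Server 2000 figures; integer division mirrors the fact that a row cannot span pages):

```sql
-- Rows that fit on one page, given the 8060-byte usable row space
SELECT 8060 / 5000 AS WideRowsPerPage,    -- 1 row per page for a 5000-byte row
       8060 / 100  AS NarrowRowsPerPage;  -- 80 rows per page for a 100-byte row
```

With 2.5 million rows, that is roughly the difference between reading 2.5 million pages and reading about 31,000.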

Make sense?

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
And the votehistory table could look different too, if they really are yes/no fields (no NULLs allowed). Then I guess you could get away with saving only the 'yes' rows, and from 20 columns you could break it down to 2, perhaps 3, columns. And do the 10 district columns contain NULL values?

Christiaan Baes
Belgium

My Blog
 
Thanks George, that clears things up a lot.

Christiaan, I'm not quite sure I understand your VoteHistory response. Each person currently has fields Gen00, Gen02, Gen04, Gen06, etc., each either Y or N.

I'm not sure how to make 20 columns into 2 or 3?

Do I need a row for every possibility, then do something like....
VoterHistoryKey, VoteHistoryData
In VoteHistoryData, have values like YNNYN, etc
Meaning Gen00Gen02Gen04Gen06, etc
 
genhistory table

voterid (id from the voter table)
gennumber (00, 02, 04,...)
yes/no field (optional)

A second possibility could be

voterid (see above)
Options (1000100010001001)
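As a sketch of the first option (names are assumed, not from the actual schema), only the elections a voter actually voted in get a row:

```sql
-- Only 'yes' votes get a row, so most of the 20 Y/N columns disappear
CREATE TABLE GenHistory (
    VoterID   int     NOT NULL,  -- FK to the voter table
    GenNumber char(2) NOT NULL,  -- '00', '02', '04', ...
    PRIMARY KEY (VoterID, GenNumber)
);

-- Count voters who voted in the 2004 general election
SELECT COUNT(*)
FROM GenHistory
WHERE GenNumber = '04';

-- The second option would instead pack the history into one string, e.g.
-- WHERE SUBSTRING(Options, 3, 1) = '1'  -- position 3 = the '04' election
```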

Christiaan Baes
Belgium

My Blog
 
I haven't read all the replies in this thread, but on reading the original post my first thought was, "I wonder if the other site is using SQL Server Analysis Services?"

Just a thought....

Smeat

 
Nah, it's just SQL 2000; I'm not sure if Analysis Services is even possible with SQL 2000.
 
So I can see that if I have a table of 2-3 million records I should move my vote history to another table. But should I really move my address, phone, and email to separate tables as well?

Is it really bad practice to have Name, Address, City, State, Zip, Phone, Email in a single table?
 