Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Create index takes long time

Status
Not open for further replies.

karche168

Technical User
Jan 7, 2004
12
US
I have 1 table in a db which has about 8GB with 110 million rows and here is the table desc.
+-------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+-------------+------+-----+---------+-------+
| LastName | varchar(15) | YES | | NULL | |
| FullName | varchar(15) | YES | | NULL | |
| HouseNumber | varchar(10) | YES | | NULL | |
| Street | varchar(26) | YES | | NULL | |
| Unit | varchar(8) | YES | | NULL | |
| City | varchar(20) | YES | | NULL | |
| State | char(2) | YES | | NULL | |
| ZipCode | varchar(9) | YES | | NULL | |
| PhoneNumber | varchar(10) | YES | | NULL | |
| DMA | varchar(5) | YES | MUL | NULL | |
| PublishDate | varchar(12) | YES | | NULL | |
+-------------+-------------+------+-----+---------+-------+

When I create the index on (dma, state, city, lastname, fullname), it takes 9 hours. Is there a way to speed up the process? It is running in Windows2003 with 2 CPU and I am using the large db configuration which comes from mysql installation.

Thanks
 
First off, 9 hours for an index this large is pretty good.

But there are things that can be improved.

You will probably get more millage out of your indexes
if you avoid multi-column indexes like the index in question.
The reason is when you do SELECTS that filter on state, city,
lastname, or fullname, this index will NOT be used unless you
also filter on DMA.
So I would suggest you limit this index to just dma,
and create new indexes on the other fields you frequently SELECT on.

Whenever you create an index on VARCHAR you have the option
of limiting the number of characters that are indexed. This is
usually a good thing to do since it saves disk space and makes indexes
smaller and they fit in memory better. I would
probably restrict every varchar index to use just the first
6 characters:
INDEX index_name (col_name(6))

Stuff your machine with as much RAM as you can afford.

One thing I don't understand as why this one-time
cost of 9 hours is a big deal. Once the index is created you
shouldn't have to recreate it except in emergency. And even
if you do have to recreate it your database will still function
while the index is being recreated in the background.
 
Thanks again, vanekl.
I have 2 CPU, but seems like it is only use 1 most of time. The 1st CPU is about 50-80%, 2nd CPU is mostly around 20%. Is there a way or configure issue that mysql not use the 2nd CPU?

Toatl Physical memory is 1GB, VM is 3 GB and always see 30% in used so memory is fine.

For the creating index. In general for others type of db (SQL, oracle, etc) to speed up the loading process is to create table, then create index. Is it true for mysql too?
In my case, it took 1 hour to import data and 9 hours for creating index. Will that be faster to create table and index, then import the data?

Thanks
 
It will be faster to create the index after all the data has
been imported.

I don't know of any way to configure MySQL so that it only
uses one CPU.

If you have 1 GB RAM and 3 GB virtual memory and you are
using 30% of all memory (4 GB), then you have exhausted RAM
and are hitting the disk drive. If you want to speed things
up you will avoid hitting the disk drive and try to minimize
use of virtual memory. More RAM would definitely speed things
up.
 
It would be useful to check how MySQL uses indexes. Some daatbases eg Microsoft Jet will process several indexes at the same time. So you could have an index on each field separately and still get good performance.

Obviously a good way to minimise index build time is to sort the data into the same order before loading. But you may not want it in index order, and with that huge index it may take you eight and three quarter hours to sort it...

 
Big question is do you really need a composite index. Are you just trying to get a fast look up of a large name and address file.

If the application/user is trying to use any one of a combination of field as a search item then it will work faster with separate indexes.

If the application/user has to specify all fields then I can see the point in a composite index but then you would be better off generating a fixed length key field perhaps
5 characters of DMA
2 chars state
1st5 characters of city
1st5 characters of lastname
1st5 characters of firstname

or a separate table of keys (22 characters long) and an integervalue that using this integergervalue to look up the records from the physically large name and address table

What are you trying to achieve ?
 
hvass, the application will use all fields for search, so I need to build the combination index.
As you suggest above with the fixed length index, what will happen if I search City, Seattle. Will that do a scan on all S?

For a separate table of keys method, do I need to import those 22 chars into a new table, then link to the large table to look up the record?

I also need to build these indexes on the table
(zipcode, lastname, fullname)
(phonenumber, lastname, fullname)
(State, City, Street, HouseNumber)

Any suggest what I need to do. Thanks
 
You don't have enough RAM to create all these
multi-column indexes. Just your first multi-column index
by itself will consume about 3 GB. You should be
considering single-column indexes, and limiting
the length of the varchar columns in the indexes.
 
vanekl, what do you mean not enough RAM? Does mysql store all index in memory? The index space used is about 3 GB.
By the way, is there a way to find out how much space an index used?

 
The whole idea behind indexes is that they should be small enough
to fit into RAM so they can be quickly accessed, otherwise
if you have to go to disk to read the index you are saving
very little time because you might as well just be reading
the raw records from disk instead.

Yes, MySQL tries to store all indexes in RAM, depending on
how you tuned the MySQL parameters. The majority of RAM is
used to to hold indexes, that's how important they are to
database speed. 'key_buffer_size' has the biggest impact on
how many of your indexes can be kept in RAM, but it also
depends on the table type you are using as to which variables
need to be tuned.

If you are using myisam tables then your indexes are in the
.MYI files.
 
the configure is something like this
skip-locking
set-variable = key_buffer=256M
set-variable = max_allowed_packet=1M
set-variable = table_cache=256
set-variable = sort_buffer=1M
set-variable = record_buffer=1M
set-variable = myisam_sort_buffer_size=64M
set-variable = thread_cache=8
# Try number of CPU's*2 for thread_concurrency
set-variable = thread_concurrency=4
log-bin
server-id = 1

which comes with mysql installation (large size db).
so what is key_buffer = 256 M means?
The table is myisam type and store in .MYI files, so does it store in memory?

Another question, if mysql stores all index on memory and I don't have enough RAM. How mysql handel it in this situation and what about other normal process?

Sorry, I am new to mysql so get a lot of question.
Thanks for the replies
 
I think I am with vanekl on this you need to start with single field indexes and keep the indexes relatively small by limiting them to the first characters of the field

Again what are you trying to do? to allow a user to search for

Mr Jones
118 Wuthering Heights
Seatle
Washington

by trying

where city like "S%"
and lastname like "Jon%"

Give us some more details on what the application is and what your current sql select statements you hope to run against the table looks like.
 
MySQL will TRY to store your indexes in RAM. If it cannot
store the entire index in RAM then the remainder will remain
on disk. The index can still be used, but it will be slower.
If you need more speed, then redesign your indexes, increase
RAM, and it may help to adjust the MySQL variables.

256 M means that MySQL is using 256 Megabytes of RAM to
store indexes (out of a total 1 Gigabyte or RAM on your system).
This is within the normal range for this parameter considering the
amount of RAM on your system. This number should be set to somewhere
between 256 and 512 MB. (You should experiment.)

The .MYI indexes will be stored in RAM if they are used by
your queries. If they aren't used by your queries then they
will not be read into RAM. If they are read into RAM they will
be replaced by other indexes as soon as other indexes are
required by other queries.
MySQL still has enough RAM left over to handle normal
processing, but things will not be as fast as they could be
if indexes are so big they wont even fit in RAM.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top