Dealing with 750 million records in a MySQL table


rickrude11 (IS-IT--Management), Jul 15, 2007, NZ
Hi experts,

I have a table that will potentially grow to 750 million records. To deal with this, I need to split the table into multiple tables via a trigger or something. Can anyone help with how to do that? I don't need line by line syntax or anything, just some general logic.

Here is the basic structure of table called MAIN.

DEVICE | FIELD1 | FIELD2 | FIELD3
     5 | data   | data   | data
     5 | data   | data   | data
    33 | data   | data   | data
     2 | data   | data   | data
    61 | data   | data   | data
    12 | data   | data   | data

There are up to 100 'DEVICES' and I would like to fork each distinct device into a new table of its own. I tried using the DEVICE field as the table name in a trigger but failed.

I would like to end up with a separate table per device. Based on the above mock table, I should end up with 5 different tables called table_5, table_33, table_2, table_61, table_12. And of course having the corresponding rows in those tables.

DB admins look at me funny when I ask this; I don't know why. I am no DB guy by any means, but I have been assigned this task nevertheless.

Thanks!
 
I don't really understand what you are asking here. I'll give my best shot at answering what I think your question is though.

I assume you've already created the tables (table_5, table_33, etc). So then you could create a stored procedure (routine) and have some conditional logic to determine what table you want to insert/update/delete on.

Ex (P_DEVICE is an input parameter):

IF (P_DEVICE = '33') THEN
    INSERT INTO `table_33` ...
ELSEIF (P_DEVICE = '5') THEN
    INSERT INTO `table_5` ...
END IF;


Is that what you were asking? You could also do this in a trigger after the row hits the MAIN table (the same logic above would work). If you do this, though, you're going to be duplicating a ton of data: the MAIN table is still going to have 750 million rows, and the "sub" tables will all add up to the same total.
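
For completeness, a minimal sketch of that trigger idea, assuming the per-device tables already exist and using the mock table/column names from the original post (the trigger name is just illustrative):

-- assumes table_5, table_33, etc. already exist with the same columns as MAIN
DELIMITER $$
CREATE TRIGGER main_route AFTER INSERT ON MAIN
FOR EACH ROW
BEGIN
    IF NEW.device = 33 THEN
        INSERT INTO table_33 (device, field1, field2, field3)
        VALUES (NEW.device, NEW.field1, NEW.field2, NEW.field3);
    ELSEIF NEW.device = 5 THEN
        INSERT INTO table_5 (device, field1, field2, field3)
        VALUES (NEW.device, NEW.field1, NEW.field2, NEW.field3);
    END IF;
END$$
DELIMITER ;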

Regards,

J
 
Sort of, sorry for not being clear. I need this done dynamically. Creating the tables manually is OK, but I am unable to write IF statements because the device numbers could be anything (not just 1-100).

So I need something like this...

SET _device_table = CONCAT('table_', NEW.device);
CREATE TABLE IF NOT EXISTS `_device_table` LIKE MAIN;
INSERT blah INTO `_device_table`;

I know the above doesn't work because it treats the table variable as a literal!
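
From what I can tell, the closest thing would be dynamic SQL via PREPARE/EXECUTE, which MySQL only allows inside a stored procedure (not inside a trigger), so the loading process would have to call a procedure instead. A rough sketch; the procedure and parameter names are made up and the columns are just from my mock table:

-- rough sketch only; names are illustrative
DELIMITER $$
CREATE PROCEDURE insert_into_device_table(
    IN p_device INT, IN p_f1 INT, IN p_f2 INT, IN p_f3 VARCHAR(16))
BEGIN
    SET @tbl = CONCAT('table_', p_device);

    -- create the per-device table the first time a new device shows up
    SET @ddl = CONCAT('CREATE TABLE IF NOT EXISTS `', @tbl, '` LIKE MAIN');
    PREPARE stmt FROM @ddl;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;

    -- EXECUTE ... USING only takes user variables, so copy the parameters first
    SET @d = p_device;
    SET @f1 = p_f1;
    SET @f2 = p_f2;
    SET @f3 = p_f3;
    SET @ins = CONCAT('INSERT INTO `', @tbl, '` VALUES (?, ?, ?, ?)');
    PREPARE stmt FROM @ins;
    EXECUTE stmt USING @d, @f1, @f2, @f3;
    DEALLOCATE PREPARE stmt;
END$$
DELIMITER ;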


Cheers.

 
Ehm, isn't this what table partitioning is meant for? MySQL has provisions for it. Maybe that is why the DBAs are looking at you funny.
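
As a rough illustration, partitioning the one big table on the device column could look something like this (column names and types are just the ones from the mock table, and HASH is only one of the partitioning schemes available):

-- illustrative only; real column names/types will differ
CREATE TABLE main_partitioned (
    device INT NOT NULL,
    field1 INT,
    field2 INT,
    field3 VARCHAR(16)
)
PARTITION BY HASH(device)
PARTITIONS 100;

-- a query that filters on device only has to read the matching partition
SELECT * FROM main_partitioned WHERE device = 33;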

+++ Despite being wrong in every important aspect, that is a very good analogy +++
Hex (in Darwin's Watch)
 
Well DonQ, this is why I am asking. Now to google 'table partitioning' :D
 
OK, so table partitioning looks promising. However, would it not be so much simpler to just split them into separate tables?

I would rather do that; why am I the only one in the world who wants to do this? I must be missing something.
 
rickrude11 said:
To deal with this, I need to split the table into multiple tables via a trigger or something.

What gave you this idea? :)


there's nothing wrong with a single table of 750 million rows

unless, of course, you don't optimize your queries with indexes; then the queries will do table scans, and splitting the table into numerous smaller tables isn't really the way to deal with that
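
for example, a composite index matching the per-device queries goes a long way (the column names here are just the mock ones from the original post):

-- illustrative only
CREATE INDEX ix_main_device ON MAIN (device, field1);

-- with the index in place, a query like this no longer scans 750 million rows
SELECT device, field1, field2
  FROM MAIN
 WHERE device = 33;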

:)

r937.com | rudy.ca
Buy my new book Simply SQL from Amazon
 
There's nothing wrong in theory with a MySQL table of 750 million rows, but in practice I found that there were problems when I tried to replicate what rick wanted to do.

I gave up, after more than 12 hours, when trying to index the table, even after tweaking the MySQL start-up parameters.

Obviously it depends on the operating environment. My system is a 4GB RAM, overclocked Intel Core i7 running under 32-bit Windows XP. So, whilst not state of the art, it is reasonably powerful for a home computer.

It would be interesting to know the maximum size MySQL tables people are actually using and getting a reasonable performance.

What I don't understand is how the data is getting into the (large) MySQL table in the first place. Why not amend this process to insert the data into a table of the appropriate (device) name instead of inserting the data into one large table?



Andrew
 
I gave up on the idea of having MySQL handle 750 million records because it obviously can't be done. If it could, it wouldn't be that hard to find a solution.

I modified the process of data collection as towerbase had suggested, but I was trying to avoid that because it is ugly.

Thanks towerbase for the time you put in to testing this.

 
Just out of interest - what are you recording in the table? Do you really need the level of granularity implied by your estimate of 750 million rows - that is an awful lot of information to rake through. What data type are the three columns you have called FIELD1..3?
 
SimonSellick said:
Just out of interest - what are you recording in the table? Do you really need the level of granularity implied by your estimate of 750 million rows - that is an awful lot of information to rake through. What data type are the three columns you have called FIELD1..3?

I am logging source/destination IP addresses and ports along with bytes, times, etc. (netflow), and I want to keep 30 days of data for up to 100 sites. I dumbed down the example table; all the data types are either int(32), int(16), tinyint(4) or varchar(16). The idea is that each branch manager will have their own login and be able to monitor their own users and hosts.
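
For illustration, a stripped-down version of that kind of row could look like this (the column names are simplified stand-ins, not the real schema, and storing IPv4 addresses as INT UNSIGNED via INET_ATON is just one way to keep rows small):

-- simplified netflow-style table; names are stand-ins
CREATE TABLE flows (
    device    SMALLINT UNSIGNED NOT NULL,   -- up to 100 sites
    src_ip    INT UNSIGNED      NOT NULL,   -- e.g. INET_ATON('10.1.2.3')
    dst_ip    INT UNSIGNED      NOT NULL,
    src_port  SMALLINT UNSIGNED NOT NULL,
    dst_port  SMALLINT UNSIGNED NOT NULL,
    bytes     INT UNSIGNED      NOT NULL,
    seen_at   DATETIME          NOT NULL
);

INSERT INTO flows VALUES
    (33, INET_ATON('10.1.2.3'), INET_ATON('192.168.0.9'), 51514, 443, 1480, NOW());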

r937 said:

That looks like a worthwhile read there. Even with me splitting the source into multiple tables, I still have up to 7.5 million records per table. Thanks.
 
The article boils down to the fact that you need to keep the data in memory as disk I/O is very expensive.
So there are two solutions for very large tables:
a) put enough RAM in your server
b) use solid-state disks (they can cope with higher IOPS)
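
Concretely, the usual knob is the InnoDB buffer pool. A minimal sketch, assuming InnoDB (the 2 GB value is only an illustration; SET GLOBAL works from MySQL 5.7 on, older versions need innodb_buffer_pool_size in my.cnf plus a restart):

-- check the current setting
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- raise it at runtime (MySQL 5.7+); the value here is just an example
SET GLOBAL innodb_buffer_pool_size = 2 * 1024 * 1024 * 1024;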

If I were you, I would still put everything in one table, as this enables an easier design...

Cheers,
Daddy

-----------------------------------------------------
What You See Is What You Get
Never underestimate tha powah of tha google!
 