Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

mysql trigger to copy data into different tables

Status
Not open for further replies.

rickrude11

IS-IT--Management
Jul 15, 2007
71
NZ
Hi experts.

I have a table that collects netflow traffic from multiple devices and it is MASSIVE. To be able to report on that in a timely manor I would like to split it up into separate tables based on the IP address of the sending device.

The reason for this is because all the reports will be run based soley on each sender, not as a whole (make sense?)

Is this possible using a trigger? The senders IP address is in a field called 'senderip', so I would like to create a table dynamically called `senderip_flows` for each sender's IP address, and add each row.

Maybe there is a better way

Thanks for any input.

rick
 
MASSIVE is not a very descriptive term. How many records?

Anyway, MySQL is capable of responsive queries on a table containing many millions of records providing the table is properly indexed and the query is optimised.

I would not recommend splitting your main table into separate tables. This is likely to increase the complexity of your system and cause all kinds of problems.

Perhaps you could tell us the size and structure of the table, which fields are indexed and an example of the query that is giving (or anticipated to give) poor performance?

Andrew
 
Thanks for replying.

At the moment there are only 4 devices sending data, and there are 30 million records for 30 days worth of data. I won't be keeping any more than 30 days, and I would expect no more than 100 devices.

The table is simple...

senderip, srcaddr, dstaddr,srcport,dstport,bytes,timestamp

The queries are also simple...

select sum(bytes) where senderip = 'x' and dstport = '80' and month(timestamp) = '6'.

The queries will never be for multiple senders.

I was summarizing the data into an hourly summary table but the trigger started taking too long to finish, so I canned it.

as for indexing, i have indexed all except the timestamp field.

What do you reckon? Criticize away, I am no expert in this field. This is my first mysql project...
 
Having implied that MySQL would be able to handle your query I thought that I should try populating a table with 750 million (yes, MASSIVE is the right word) records and actually running your query.

It took over 3 hours to load the data (using LOAD DATA commands) on a Windows XP machine running MySQL 5.1.40-community. I then tried to index the senderip field but gave up after several hours. I then ran REPAIR TABLE which took 5214 seconds.

Without an index, your sample query took 266 seconds.

I'm having another go at indexing now.

Note that I'm doing other work on the computer (overclocked Intel Core i7) so the times are indicative only.

Andrew
 
266 seconds is quicker than I thought! Still needs to be much quicker, because there could be multiple queries queuing up from multiple users :|

Yes, I will be needing to upgrade my hardware (don't shoot me for running this on a virtual machine....)

Thanks for looking at this Andrew.

 
I tried to index the table but abandoned the run after about 12 hours. I then increased the size of the MyISam Sort Buffer from about 36MB to 1GB and restarted MySQL. Tried to index again but gave up after about 10 hours.

Came across this:


which indicates that 100 million rows is too many, never mind 750 million rows.

So contrary to my initial suggestion, it looks as though you need to split up your table. One table per device looks like the way to go but without knowing more about your application and what information you want to get out of your system, I don't think I can help you any further. Sorry.



Andrew
 
The information I need is network traffic information from each of the senders, such as total amount of 'http' traffic, or smtp traffic etc. Behind each of them is a small network consisting of ~5 - ~50 users, and these devices monitor their internet usage basically.

If I could split this large table into separate tables based on the senderip field, I think I will be good, but I am not sure how or even if it's possible to create a new table on insert based on a query.

Can you help with that?

cheers,
rick

 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top