Is Size an issue....

Status
Not open for further replies.

jpadie

Technical User
Nov 24, 2003
10,094
FR
I am about to start helping on a data-mining project where the starting data weighs in at 350GB. This is the compressed size on a set of backup tapes, so I presume the uncompressed size will be greater.

I do not need the data-mining scripts to be fast, but I do need rapid view access to the recordsets. I am also hoping not to deploy expensive servers on this project; with luck it will not be a big job.

Does anyone have active experience using MySQL with data of this size? Will it cope on low- to mid-spec machines?

Bear in mind:

The first few tasks will be removing columns that are obviously irrelevant to our purpose.
The next few will involve an analysis of which further columns we can delete.
Then a cleansing exercise involving simplification of various columns.
Then a transformation on the remaining recordset.
Lastly, a single query across the entire recordset that is intended to result in a single number. The data is several years of financial data from which we are trying to derive an index that will be maintained monthly going forward. Depending on where we end up, we might well decide to recalculate the index on the fly for each additional base record; it all depends on how long the generation query takes.
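
To make the first and last steps concrete, here is a rough Python sketch, not a definitive implementation. It assumes the data can be exported as CSV with a header row (the thread does not say what format the tapes hold), and it uses a plain running mean as a stand-in for the real index formula, which is not specified here. The names `prune_columns` and `RunningMean` are hypothetical.

```python
import csv
import io


def prune_columns(src, dst, keep):
    """Stream rows from src to dst, writing only the columns named in keep.

    Rows are handled one at a time, so memory use stays flat however
    large the input is. Assumes CSV input with a header row.
    """
    reader = csv.DictReader(src)
    # extrasaction="ignore" silently drops any column not listed in keep.
    writer = csv.DictWriter(dst, fieldnames=keep, extrasaction="ignore")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)


class RunningMean:
    """Placeholder index: a mean kept up to date per added record.

    The real index formula is unknown; this only shows that an aggregate
    can be updated incrementally instead of re-querying the whole set.
    """

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def add(self, value):
        """Fold one new record in and return the updated mean."""
        self.count += 1
        self.total += value
        return self.total / self.count


if __name__ == "__main__":
    # Tiny in-memory stand-in for the real export; column names are made up.
    source = io.StringIO("id,amount,notes\n1,10.0,x\n2,30.0,y\n")
    pruned = io.StringIO()
    prune_columns(source, pruned, ["id", "amount"])

    pruned.seek(0)
    index = RunningMean()
    for record in csv.DictReader(pruned):
        latest = index.add(float(record["amount"]))
    print(latest)
```

Dropping the unwanted columns before the data is loaded would shrink what the database has to hold, and an incrementally maintained aggregate avoids rerunning the full query for each new base record, provided the eventual index can be expressed that way.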

thanks in advance for any insights you may have

Justin
 
I haven't run anything with that much data; the closest I've been is around 90GB.

This was run on a dual Xeon 2.8GHz with 4GB of RAM, and the hardware didn't get much above 3% most of the time.
Our dev server is an HP DL140 (around £540 in the UK), which performs just fine and currently holds around 75GB of data.

All I can really offer is: throw as much RAM at it as you can. You don't say what you consider a low-mid spec.



______________________________________________________________________
There's no present like the time, they say. - Henry's Cat.
 
Thanks. The budget is about £500, which is what I meant by low-mid spec. It might be an interesting task for a Yonah and a very big SATA disk...
 