
Problem with selecting from a large table in Perl

Status
Not open for further replies.

n827

Programmer
Dec 27, 2005
1
US
I have a large table with 4 million rows. I need to do an operation on each of the rows. I am using Perl.

When I use the following code on a small table, it works just fine, but on the large table it gets stuck on line 3. As soon as line 3 executes, my hard drive starts thrashing and never stops; it never reaches line 4. My diagnosis follows the code fragment below:

-------------------------------------code fragment begin
1 my $sql = 'select myColumn from myTable';
2 my $sth = $dbh->prepare($sql) || die("dbh prep failed");
3 $sth->execute() || die("execute sth failed");
4 my $colValue;
5 $sth->bind_columns(\$colValue);
6 while ($sth->fetch()) {
7 # process $colValue here
8 }
-------------------------------------code fragment end

I think this is because MySQL is trying to load all 4 million rows into memory. Is there a way, perhaps a command-line switch, to make it process one row at a time? I have tried adding SQL_NO_CACHE to my SELECT statement, but that made no difference at all.

On a related note, when I was trying to export the four million rows using the mysql command line client, I had a similar problem, but I found its -q switch (from the manual: -q or --quick means "Don't cache result, print it row by row.").

I hope there is something similar for my Perl problem above too. Or am I doing something completely wrong in the way I am doing this? I am a newbie to MySQL.
 
Recent MySQL versions allow the use of server-side cursors, but a simpler and faster way would probably be to retrieve and process the data a manageable chunk at a time:
[tt]
SELECT mycolumn FROM mytable ORDER BY mycolumn LIMIT 0,100000
...
SELECT mycolumn FROM mytable ORDER BY mycolumn LIMIT 100000,100000
...
SELECT mycolumn FROM mytable ORDER BY mycolumn LIMIT 200000,100000
[/tt]
For this, you would need to have an index on mycolumn. Also, you would probably need to lock the table before starting, if there is a chance of it getting updated by someone else before you're finished.
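The chunked SELECTs above can be wrapped in a Perl loop. This is only a sketch, reusing the myTable/myColumn names from the original post and assuming DBI with DBD::mysql; connection details (database name, user, password) are placeholders. LIMIT values are interpolated with sprintf rather than bound as placeholders, since DBD::mysql's client-side statement emulation quotes bound values, which MySQL rejects in a LIMIT clause.

```perl
use strict;
use warnings;
use DBI;

# Placeholder connection details -- adjust for your setup.
my $dbh = DBI->connect('DBI:mysql:database=mydb', 'user', 'password',
                       { RaiseError => 1 });

my $chunk  = 100_000;   # rows per SELECT
my $offset = 0;

while (1) {
    # Interpolate the (integer) LIMIT values directly; placeholders
    # would be quoted by the driver and break the LIMIT syntax.
    my $sql = sprintf(
        'SELECT myColumn FROM myTable ORDER BY myColumn LIMIT %d, %d',
        $offset, $chunk);
    my $sth = $dbh->prepare($sql);
    $sth->execute();

    my $colValue;
    my $rows = 0;
    $sth->bind_columns(\$colValue);
    while ($sth->fetch()) {
        $rows++;
        # process $colValue here
    }

    # Fewer rows than requested means this was the last chunk.
    last if $rows < $chunk;
    $offset += $chunk;
}

$dbh->disconnect();
```

As noted above, this only gives consistent results if the table cannot change between chunks, so lock it (or work on a snapshot) for the duration of the loop.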

 
I should have mentioned that you could ORDER BY any column in the table, as long as it has a unique index.
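As an alternative to chunking, DBD::mysql can also stream the result set unbuffered via its mysql_use_result prepare attribute, so rows are fetched from the server one at a time instead of being loaded into client memory. A minimal sketch, again with placeholder connection details and assuming a driver version that supports the attribute:

```perl
use strict;
use warnings;
use DBI;

# Placeholder connection details -- adjust for your setup.
my $dbh = DBI->connect('DBI:mysql:database=mydb', 'user', 'password',
                       { RaiseError => 1 });

# mysql_use_result => 1 makes the driver stream rows from the server
# instead of buffering the entire result set on the client.
my $sth = $dbh->prepare('SELECT myColumn FROM myTable',
                        { mysql_use_result => 1 });
$sth->execute();

my $colValue;
$sth->bind_columns(\$colValue);
while ($sth->fetch()) {
    # process $colValue here
}

$dbh->disconnect();
```

The trade-off is that the connection stays busy and the table remains read-locked until every row has been fetched, so keep the per-row processing fast or use a separate connection for any writes.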
 