Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chris Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Bulk Inserting from Perl to SQL Server 2005 2

Status
Not open for further replies.

mmatchyn

Programmer
May 26, 2009
3
US
What is the best way using Perl to take a txt file with 1 million rows and import it into SQL Server 2005?

I can already do this with C# using SqlBulkCopy. I need to do this in Perl in order to run it from a linux box.

I have tried BCP and it has issues connecting to SQL Server 2005. It uses bad join statements.

I tried freetds. Is there any other proven method for doing a fast bulk insert from Perl to SQL Server 2005. In C# I can do 1 million rows in 1 minute.
 
I just use DBI and create one huge insert statement. It's pretty fast.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[noevil]
Travis - Those who say it cannot be done are usually interrupted by someone else doing it; Give the wrong symptoms, get the wrong solutions;
 
well I use both Win32::ODBC; & DBI...

I hand rolled my own SQL module to make it nice and simple..

You can use it , ripp it apart , improve it or just plain ignore it, up to you :)

Code:
######################
# Set Error Trapping #
######################

use CGI::Carp qw(fatalsToBrowser warningsToBrowser); 
use warnings;
use strict;

##########################
# Use WIN32::ODBC Module #
##########################
use Win32::ODBC;

##########################
# Set Package Name Space #
##########################
package Sql;

################
# Start Module #
################

BEGIN {

	# Invoke Exporter
	use Exporter;

	# Set Variables
	our (@ISA, @EXPORT);
	@ISA = qw(Exporter);

	# Define global vars and subs to be exported
	@EXPORT = qw( &getSQL &insSQL &updSQL &delSQL &cntSQL &sumSQL $DSN $MDB $HLP );

}


########################################################################
########################### GLOBAL SUBROUTINES #########################
########################################################################

########################
#  INIT DSN VARIABLES  #
########################

[b]our $MDB = "DSN=YOUR_DSN;UID=YOUR_USERID;PWD=YOUR_PASSWORD;";[/b]

###############################################
############## Get SQL Routine ################
###############################################

sub getSQL {

#_0 = Table
#_1 = Columns
#_2 = Where
#_3 = Order By

# Define Record Set Array & Hash
my @rs;

#Build SQL Statement
my $sel = "SELECT $_[1] FROM $_[0] WHERE $_[2]";

# Check for ORDER BY
if($_[3]){$sel .= " ORDER BY $_[3]";}

# Open DB Connection
my $db = new Win32::ODBC($MDB) || die "getSQL Error Connecting: " . Win32::ODBC::Error();

# Run SQL Command
if(!$db->Sql("$sel")) {

	# Loop SQL Record Set
		while($db->FetchRow()){ 
				# Build Array of Hashes with SQL Data
				my %dt = $db->DataHash();
			    $rs[@rs] = \%dt;	
		}
	# Close DB Connection
	$db->Close();

	# Return Record Set Array of Hashes
	@rs;

} 
else{die "Error in getSQL ($sel)" . Win32::ODBC::Error();}

}

##################################################
############## Update SQL Routine ################
##################################################
sub updSQL {

#_0 = Table
#_1 = Values
#_2 = Where

# Set Variables
my ($rowcount);

#Build SQL Statement
my $sel = "UPDATE $_[0] SET $_[1] WHERE $_[2]";

# Open DB Connection
my $db = new Win32::ODBC($MDB) || die "getSQL Error Connecting: " . Win32::ODBC::Error();

# Run SQL Command
if(! $db->Sql("$sel") ) {
	# Set rows updated count 
	$rowcount = $db->RowCount(); 

	#Close DB Connection
	$db->Close(); 
} 
else{die "Error in updSQL ($sel)" . Win32::ODBC::Error();}

# Return number of rows updated
$rowcount;

}

##################################################
############## Insert SQL Routine ################
##################################################

sub insSQL {

#_0 = Table
#_1 = Columns
#_2 = Values

# Set Variables
my ($rowcount);

#Build SQL Statement
my $sel = "INSERT INTO $_[0] ($_[1]) VALUES ($_[2])";

# Open DB Connection
my $db = new Win32::ODBC($MDB) || die "getSQL Error Connecting: " . Win32::ODBC::Error();

# Run SQL Command
if(! $db->Sql("$sel") ) {
	# Set rows inserted count 
	$rowcount = $db->RowCount(); 
	#Close DB Connection
	$db->Close(); 
} 
else{die "Error in insSQL ($sel)" . Win32::ODBC::Error();}

# Return number of rows inserted
$rowcount;

}

##################################################
############## Delete SQL Routine ################
##################################################

sub delSQL {

#_0 = Table
#_1 = Where

# Set Variables
my ($rowcount);

#Build SQL Statement
my $sel = "DELETE FROM $_[0] WHERE $_[1]";

# Open DB Connection
my $db = new Win32::ODBC($MDB) || die "getSQL Error Connecting: " . Win32::ODBC::Error();

# Run SQL Command
if(! $db->Sql("$sel") ) {
	# Set rows deleted count 
	$rowcount = $db->RowCount(); 
	#Close DB Connection
	$db->Close(); 
} 
else{die "Error in delSQL ($sel)" . Win32::ODBC::Error();}

# Return number of rows deleted
$rowcount;

}

#################################################
############## COUNT SQL Routine ################
#################################################

sub cntSQL {

#_0 = Table
#_1 = Where

# Define Record Set Array & Hash
my @rs;

#Build SQL Statement
my $sel = "SELECT COUNT(1) as COUNT FROM $_[0] WHERE $_[1]";

# Open DB Connection
my $db = new Win32::ODBC($MDB) || die "getSQL Error Connecting: " . Win32::ODBC::Error();


# Run SQL Command
if(!$db->Sql("$sel")) {

	# Loop SQL Record Set
		while($db->FetchRow()){ 
				# Build Array of Hashes with SQL Data
				my %dt = $db->DataHash();
			    $rs[@rs] = \%dt;	
		}
	# Close DB Connection
	$db->Close();

	# Return Count
	$rs[0]->{'COUNT'};

} 
else{die "Error in cntSQL ($sel)" . Win32::ODBC::Error();}

}

###############################################
############## SUM SQL Routine ################
###############################################

sub sumSQL {

#_0 = Table
#_1 = Column
#_2 = Where

# Define Record Set Array & Hash
my @rs;

#Build SQL Statement
my $sel = "SELECT SUM($_[1]) as MYTOT FROM $_[0] WHERE $_[2]";

# Open DB Connection
my $db = new Win32::ODBC($MDB) || die "getSQL Error Connecting: " . Win32::ODBC::Error();

# Run SQL Command
if(!$db->Sql("$sel")) {

	# Loop SQL Record Set
		while($db->FetchRow()){ 
				# Build Array of Hashes with SQL Data
				my %dt = $db->DataHash();
			    $rs[@rs] = \%dt;	
		}
	# Close DB Connection
	$db->Close();

	# Return Count
	$rs[0]->{'MYTOT'};

} 
else{die "Error in sumSQL ($sel)" . Win32::ODBC::Error();}

}

###########################
# END OF MODULE RETURN 1; #
###########################
1;
Remember to change the $MDB global to your connection string.

you then call the functions like so..
Code:
my @rs = &getSQL("my_table","my_column1,my_column2,my_column3","userid=my_id");

you get the idea!

"In complete darkness we are all the same, only our knowledge and wisdom separates us, don't let your eyes deceive you."

"If a shortcut was meant to be easy, it wouldn't be a shortcut, it would be the way!
 
DBI will not work. I need to insert 1 million rows in 1 minute. I can do this in C# with a bulk insert. But it does not work in Perl. Also it must work from a linux pc so I can't use any win32 modules.
 
I find that statement very stange. DBI can easily insert 1 million rows in 1 minute. It's the structure of your SQL query that is going to determine how things are being handled by SQL.

if you do this:
insert into table (col,col1) values (val,val1);
insert into table (col,col1) values (val2,val3);
insert into table (col,col1) values (val4,val5);

it will take a lot longer than doing
insert into table (col,col1) values (val,val1),(val2,val3),val4,val5);

proper syntax not promised but you get the idea.

If you do individual inserts it has to do a prepare on it each time. You can prepare the statement once and call a execute on it for each set of values if you like (read the DBI man pages you'll get the idea).

I think ignoring DBI just because you don't believe it isn't fast is going to hurt you in the long run. There is nothing slow about DBI if used correctly.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[noevil]
Travis - Those who say it cannot be done are usually interrupted by someone else doing it; Give the wrong symptoms, get the wrong solutions;
 
DBI really will work.

A million rows *is* a lot to insert from a linux box over a network to a SQL/Server box.

1 - The SQL/Server must be able to accept rows fast enough - I'm assuming that you've checked this with your C# routine.

2 - You must be able to xfer that million rows over to the SQL box in as short a time as possible. Check that the network is as fast as possible and as clear of other traffic as it can be.

3 - Insert as many rows in one statement as you can.

If you can't make the DBI version work - xfer the rows over the network to your C# thingy - that picks them up and inserts them. If you want to be clever (efficient) you could have the C# routine listen on a socket for the rows.

Mike

 
I've also written the above sql.pm module to work with DBI :)
Code:
######################
# Set Error Trapping #
######################

use CGI::Carp qw(fatalsToBrowser warningsToBrowser); 
use warnings;
use strict;

##################
# Use DBI Module #
##################
use DBI();

##########################
# Set Package Name Space #
##########################
package SQL;

################
# Start Module #
################

BEGIN {

	# Invoke Exporter
	use Exporter;

	# Set Variables
	our (@ISA, @EXPORT);
	@ISA = qw(Exporter);

	# Define global vars and subs to be exported
	@EXPORT = qw( &getSQL &insSQL &updSQL &delSQL &cntSQL &sumSQL $DSN $FILEDSN $IMN $HLP);

}


########################################################################
########################### GLOBAL SUBROUTINES #########################
########################################################################

###############################################
############## Get SQL Routine ################
###############################################

sub getSQL {

#_0 = Table
#_1 = Columns
#_2 = Where
#_3 = Order By

# Define Record Set Array & Hash
my @rs;

#Build SQL Statement
my $sel = "SELECT $_[1] FROM $_[0] WHERE $_[2]";

# Check for ORDER BY
if($_[3]){$sel .= " ORDER BY $_[3]";}

# Open DB Connection
my $db = DBI->connect("DBI:mysql:YOUR_DSN:localhost","YOUR_ID","YOUR_PASSWORD",{'RaiseError' => 1});

# Run SQL Command
my $sth = $db->prepare("$sel") || die "Error in SQL : $sel";
$sth->execute();

	# Loop SQL Record Set
    while (my $ref = $sth->fetchrow_hashref()) {
        # Build Array of Hashes with SQL Data
        $rs[@rs] = \%$ref;	
	}

	# Close DB Connection
    $sth->finish();

    # Disconnect from the database.
    $db->disconnect();

	# Return Record Set Array of Hashes
	@rs;

}

##################################################
############## Update SQL Routine ################
##################################################
sub updSQL {

#_0 = Table
#_1 = Values
#_2 = Where

#Build SQL Statement
my $sel = "UPDATE $_[0] SET $_[1] WHERE $_[2]";

# Open DB Connection
my $db = DBI->connect("DBI:mysql:YOUR_DSN:localhost","YOUR_ID","YOUR_PASSWORD",{'RaiseError' => 1});

# Run SQL Command
my $sth = $db->prepare("$sel");
$sth->execute();

my $rowcount = $sth->rows; 

# Close DB Connection
$sth->finish();

# Disconnect from the database.
$db->disconnect();

# Return number of rows updated
$rowcount;

}

##################################################
############## Insert SQL Routine ################
##################################################

sub insSQL {

#_0 = Table
#_1 = Columns
#_2 = Values

#Build SQL Statement
my $sel = "INSERT INTO $_[0] ($_[1]) VALUES ($_[2])";

# Open DB Connection
my $db = DBI->connect("DBI:mysql:YOUR_DSN:localhost","YOUR_ID","YOUR_PASSWORD",{'RaiseError' => 1});

# Run SQL Command
my $sth = $db->prepare("$sel");
$sth->execute();

my $rowcount = $sth->rows; 

# Close DB Connection
$sth->finish();

# Disconnect from the database.
$db->disconnect();

# Return number of rows inserted
$rowcount;

}

##################################################
############## Delete SQL Routine ################
##################################################

sub delSQL {

#_0 = Table
#_1 = Where

#Build SQL Statement
my $sel = "DELETE FROM $_[0] WHERE $_[1]";

# Open DB Connection
my $db = DBI->connect("DBI:mysql:YOUR_DSN:localhost","YOUR_ID","YOUR_PASSWORD",{'RaiseError' => 1});

# Run SQL Command
my $sth = $db->prepare("$sel");
$sth->execute();

my $rowcount = $sth->rows; 

# Close DB Connection
$sth->finish();

# Disconnect from the database.
$db->disconnect();

# Return number of rows deleted
$rowcount;

}

#################################################
############## COUNT SQL Routine ################
#################################################

sub cntSQL {

#_0 = Table
#_1 = Where

# Define Record Set Array & Hash
my @rs;

#Build SQL Statement
my $sel = "SELECT COUNT(1) as COUNT FROM $_[0] WHERE $_[1]";

# Open DB Connection
my $db = DBI->connect("DBI:mysql:YOUR_DSN:localhost","YOUR_ID","YOUR_PASSWORD",{'RaiseError' => 1});

# Run SQL Command
my $sth = $db->prepare("$sel");
$sth->execute();

	# Loop SQL Record Set
    while (my $ref = $sth->fetchrow_hashref()) {
        # Build Array of Hashes with SQL Data
        $rs[@rs] = \%$ref;	
	}

	# Close DB Connection
    $sth->finish();

    # Disconnect from the database.
    $db->disconnect();

	# Return Count
	$rs[0]->{'COUNT'};
}

###############################################
############## SUM SQL Routine ################
###############################################

sub sumSQL {

#_0 = Table
#_1 = Column
#_2 = Where

# Define Record Set Array & Hash
my @rs;

#Build SQL Statement
my $sel = "SELECT SUM($_[1]) as MYTOT FROM $_[0] WHERE $_[2]";

# Open DB Connection
my $db = DBI->connect("DBI:mysql:YOUR_DSN:localhost","YOUR_ID","YOUR_PASSWORD",{'RaiseError' => 1});

# Run SQL Command
my $sth = $db->prepare("$sel");
$sth->execute();

	# Loop SQL Record Set
    while (my $ref = $sth->fetchrow_hashref()) {
        # Build Array of Hashes with SQL Data
        $rs[@rs] = \%$ref;	
	}

	# Close DB Connection
    $sth->finish();

    # Disconnect from the database.
    $db->disconnect();

	# Return Count
	$rs[0]->{'MYTOT'};

}

###########################
# END OF MODULE RETURN 1; #
###########################
1;

All you need to do is change every instance of the followong connection string to suit your needs and it runs the same way as the Win32:ODBC version does :)

Code:
# Open DB Connection
my $db = DBI->connect("DBI:mysql:YOUR_DSN:localhost","YOUR_ID","YOUR_PASSWORD",{'RaiseError' => 1});

"In complete darkness we are all the same, only our knowledge and wisdom separates us, don't let your eyes deceive you."

"If a shortcut was meant to be easy, it wouldn't be a shortcut, it would be the way!
 
No honest!

When I moved hosting a few months back I found my new host company didn't have Win32:ODBC installed, so I had no choice but to rewrite my script to use DBI as it was the only DB module installed.

Believe me, otherwise I wouldn't have bothered! [lol]

"In complete darkness we are all the same, only our knowledge and wisdom separates us, don't let your eyes deceive you."

"If a shortcut was meant to be easy, it wouldn't be a shortcut, it would be the way!
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top