Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

HELP - Trying to upload file contents to database on a nightly

Status
Not open for further replies.

shann0nw

IS-IT--Management
May 22, 2003
5
AU
Hi,

I have created a Java program which I call each night at approximately
7PM whereby I pick up all files residing in a local directory in a
pre-defined format. The format is basically part,quantity with the
file name being [id].txt.

From this, I create two PreparedStatements.

1 - INSERT INTO
([id], [part], [quantity])
AND
2 - UPDATE
SET QUANTITY = ? WHERE ID = [id] and PART = part

The file contains approximately 110000 lines and it is assumed that
the file is a full file. I loop through the file calling the
PreparedStatement(2) and update each part with the new quantity. If I
hit an error whereby the part does not exist, I insert the part and
quantity via PreparedStatement(1).

The problem I have is that this program generally takes 8-9 hours to
complete which is quite long. We are looking to expand it to update
many files but at this rate we will have to wait weeks to complete all
files!!!

The format of the file can not be changed as it is sent to use from
another company and they have defined the format.

Is there a quicker better way??????

Please help, I'm at a loss to figure out how to speed this up.

Shannon
 
Hi Shannon,
What you are doing sounds fine, and I'm surprised that the insert/update of only 110,000 rows is taking 8-9 hours.

You may want to look into the stats of a number of runs to ascertain whether the majority of the rows exist or not. From the way you are doing it at the moment, you are generally expecting the rows to exist. If (generally) they don't exist, you will be issuing two calls when you could get away with just one, the insert. I'd have a look at this to see if some improvement could be had. Is it possible for the company that sends the file to supply two files, one where the part exists, and one for new parts?

The other thing that occurs to me is that you mention 'prepared' statements. Are you issuing dynamic SQL via the PREPARE statement? If so, this will add to the overhead quite dramatically, as DB2 will have to work out the access path at the time of execution each time the statement is executed. I have never used DB2 in a Java program, but is it possible to bind it into a plan before execution? This would substantially speed up execution.

Hope some of this helps.

Marc
 
Hi MarcLodge,

Thanks for your response.

I really can't ask them to send two files and really, it should always only be an update and only be an insert when a new part is brought in to stock.

I've included a snippet of what I'm doing in Java to give you an idea.

Code:
BufferedReader bufferedInput = new BufferedReader (new FileReader(uploadFile));

updateString = "UPDATE TABLE SET QUANTITY = ? WHERE SITE_ID = ? AND PART_NUMBER = ? ";
updateStmt = con.prepareStatement(updateString);
						
insertString = "INSERT INTO BATCH_PARTS_TB (SITE_ID, PART_NUMBER, QUANTITY) VALUES (? , ? , ?)";
insertStmt = con.prepareStatement(insertString);

while ((line = bufferedInput.readLine()) != null) {

 StringTokenizer st = new StringTokenizer(line, "^");
 while (st.hasMoreTokens()) {

  part = st.nextToken().trim().toUpperCase();
  partQty = st.nextToken().trim();
  partAvailQty = st.nextToken().trim();
  updateFlag = st.nextToken().trim().toUpperCase();

 }

 if (updateFlag.equals("U")) {
  updateStmt.setString(3, part);
  updateStmt.setInt(1, Integer.parseInt(partQty));
  updateStmt.setInt(2,Integer.parseInt(siteId));

  result = updateStmt.executeUpdate();
 }
 else if (updateFlag.equals("N")) {
  insertStmt.setInt(1,Integer.parseInt(siteId));
  insertStmt.setString(2, part);
  insertStmt.setInt(3, Integer.parseInt(partQty));
  insertStmt.executeUpdate();
  insertCount++;
 }
}
 
Hi Shannon,
If I understand your code correctly, what you are doing is 100% right. Am I right in saying that a flag is tagged on to the end of each record which determines whether an update or insert is executed? I had assumed that you performed an update, and if a 'not found' sqlcode was returned, you performed the insert, or something similar.

I think the problem lies in the fact that the SQL is prepared each time an insert/update is executed. As I said in the previous post, I've never used Java and DB2. Is it possible to BIND the SQL somehow, and produce a PACKAGE or a PLAN? This would speed up the access as DB2 sorts out the paths at BIND time rather than having to do it dynamically at runtime. If this is not possible, you might want to consider writing the SQL as a stored procedure, which you would then BIND, and call from your Java. This (I think) would speed up the access.

The other thing that springs to mind is to have a look at the actual table in DB2 to see if anything can be done. Are RUNSTATS runb against if fairly regularly? Is the table sized correctly? Have a look and see if anything looks odd.

Let me know if this is of any help.

Marc
 
Marc,

Yes, you are correct in the assumption the the U stands for Update and the N stands for new line. There is actually another part of the script which processes the older file formats which performs an update saving the result in to a variable and testing if it is equal to 0. (The result is generally the number of lines that were affected.

If it does equal 0 then we perform an insert. By using the ? in the SQL statement and then using the setString(1,part) statement I am in fact binding the variables on each loop so I believe this is the most effective method to perform the SQL update.
I will however look in to the other items you mentioned in your last paragraph and see if any of those things can make a difference.

I've got a strong feeling now that ths slow down is occuring because I am committing every row. I might try to commit every 500 or something similar.

Regards
 
Shannon,
As I've already said, I've never used Java and DB2, as I am more mainframe orientated. On that platform, you can issue dynamic SQL (as I think you are doing) or compile the SQL statically into a DBRM and then bind this into a plan or package. This means that the bind is only performed once, irrespective of how many rows of data are processed. I'd like to think that Java can do this too, as in my eyes you are having to perform 110,000 binds which seems kind of unnecessary.

Sorry I can't help more, I'll have to hunt out a good book on Java and DB2 and try to keep up with all this new technology!

Marc
 
OK,

I was under the impression that what I was doing was binding on each loop. I might try reading the entire file and then binding and then writing. I'll see how I go.
 
Hi Shannon,
this may not help at all, and I may be preaching what you are already aware of, but I had a look around this afternoon and discovered this document: and this one: that seem to imply that there is a dynamic version of Java available through JDBC and a static version SQLJ. It appears that the static version does the binds pre-run, and can be more efficient.

Do you have the opportunity to test this SQLJ method out at your site to see if it produces a performance enhancement?

Marc
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top