Correct Approach to Data Importation


shadyness (MIS), Dec 21, 2001
I will be going to work for a new client next week, and I understand the first task is to revamp their current importation tool. So I am just looking for general input from the DBA community here.

It appears that a previous developer built an importation tool using Cold Fusion as the GUI (for staff use), with all of the actual processing done in nested stored procs. The task is to import data that arrives once a month in very different formats (Access, text files, FoxPro, Excel). However, the layouts are fixed, so each month's file looks the same as the previous month's.

What happens is, when they go to load data, all of the resources are tapped because the SS2K base is running the sprocs that inspect, clean, form, and import the data. Each file imported only represents a few hundred rows of data, so we are not talking about a lot of data here. I think the "bogging down" they describe is occurring in the data preparation phase, not the actual importation.

I am thinking about designing an Access database (to replace the CF and T-SQL tool now in use) that performs all of these functions using a tailored VBA module for each format, builds temp tables for export, and then lets SQL Server import the result. I am just looking at the overall approach, so any suggestions, input, or stories of previous experience would be most welcome!
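Roughly, I am picturing one VBA routine per source format, each landing its file in a staging table, something like this sketch (the file paths, spec names, and table names are all invented for illustration):

    ' Pull each monthly file into a local staging table, one call per format.
    ' All names and paths below are placeholders.
    Public Sub ImportMonthlyFiles()
        ' Delimited text, using an import specification saved in Access
        DoCmd.TransferText acImportDelim, "ClientA_ImportSpec", "stgClientA", "C:\Imports\ClientA.txt", False
        ' Excel workbook, first row holds the field names
        DoCmd.TransferSpreadsheet acImport, acSpreadsheetTypeExcel8, "stgClientB", "C:\Imports\ClientB.xls", True
        ' Another Access database: import a table straight across
        DoCmd.TransferDatabase acImport, "Microsoft Access", "C:\Imports\ClientC.mdb", acTable, "tblMonthly", "stgClientC"
        ' FoxPro would come in the same way, via TransferDatabase with an ODBC connect string
    End Sub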

Thanks!


<== Some people say they are afraid of heights. With me it's table widths. ==>
 
If the destination of the data is SQL Server, why not use DTS? It's easy to create/use/reuse DTS packages, much of the transformation is built-in, and you can use VB scripts if needed.
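For instance, a column-level cleanup that would otherwise need a sproc can live in an ActiveX transformation script inside the package. A minimal sketch (the column names are made up):

    ' DTS ActiveX transformation: runs once per source row.
    ' Column names are placeholders.
    Function Main()
        DTSDestination("CustomerName") = Trim(DTSSource("CustomerName"))
        DTSDestination("StateCode") = UCase(Trim(DTSSource("StateCode")))
        Main = DTSTransformStat_OK
    End Function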

Of course, DTS tends to hog CPU, too.

It's been a while since I've done this; you might want to move to the DTS forum.
 
You can use DTS regardless, as long as both your source and destination have OLE DB providers (and with the OLE DB provider for ODBC, that covers just about everything).

So you could pump VSAM data to dBase with DTS if you wanted to.

But I'd say use DTS also. I think this is an appropriate forum for his/her question, and the solution may not end up being a DTS solution... as long as he/she isn't cross-posting... you are not x-posting, are you shadyness? *evil grin*
 
SemperFi, eh? Well, I don't know if you are/were a Marine, but if you are, then you can understand the idea of "adapt and overcome". I am going to gather as much data as possible in order to overcome this problem. I recently purchased SQL Server 2000 DTS Programming from Wrox Press. I think I should break that open and give it a whirl. Thanks for the input!



<== Some people say they are afraid of heights. With me it's table widths. ==>
 
I'm in the "use DTS" camp too. If you have a book on it, that will help immensely, as data transformations in DTS can get very complex. Also, you can schedule packages to run during non-peak hours, which is probably a good idea as well.
 
I know that DTS is such a powerful tool that some companies on other database platforms purchase an SS2K license just to use DTS for data movement between parties outside the SQL base. Well then, given the case for DTS, does anyone have any details regarding DTS versus sprocs? From what I understand, the more interpreted the language, the longer a process takes (and some of the sprocs currently used are fairly long and complex). Would the client see a significant time reduction for a data upload process if it were done in DTS instead of the complex nested sprocs currently employed?

I know this is certainly more of a conversation for the new DTS forum, but since we are already on the topic...

<== Some people say they are afraid of heights. With me it's table widths. ==>
 
I have a similar process monthly. I receive a lot of data in a wide variety of formats. The data requires a LOT of manipulation prior to loading into SQL. I do the cleanup processing in MS Access then use DTS to import the data.

Given the above, here are my comments:

- If your data manipulations are not overly complex, I would use DTS. This way everything can live in SQL Server.

- If your data manipulations are complex, Access is a great platform, especially if your tables aren't overly large. Another nice plus of using Access is that you can have Access tables that mirror your SQL tables, load the processed data into those mirror tables, and import from them. This makes it easy to determine what is wrong with your data if you have any problems with the DTS import.

- If you go with Access, you can even have Access run the DTS import at the end of your processing. Just set a reference to the DTS object library and run the existing package (or you can create the package from Access, but that is a bit of a pain).
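The call from Access is only a few lines once that reference is set. A rough sketch, with the server and package names invented:

    ' Requires a reference to the Microsoft DTSPackage Object Library.
    ' Server and package names are placeholders.
    Public Sub RunImportPackage()
        Dim pkg As DTS.Package
        Set pkg = New DTS.Package
        ' Load the saved package from SQL Server over a trusted connection
        pkg.LoadFromSQLServer "DEVSERVER", , , DTSSQLStgFlag_UseTrustedConnection, _
            , , , "MonthlyImport"
        pkg.Execute
        pkg.UnInitialize
        Set pkg = Nothing
    End Sub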

I'm both an Access and a DTS fan, so I guess I always try to find a solution that uses my two favorite tools.

Michael
 
VogonPoet:

I know as a developer I am supposed to convey an aura of assurance in whatever choice I make, but in this case your line of thinking aligns with my own, and that gives me some reassurance, since no text I have found suggests doing something like this (though none of them present importation examples this complex). I know that all of this can be done using DTS packages, but each data type not only has to be formed into an acceptable table format, it also has to be inspected, as some of the data is not applicable for import and must be disregarded. This is where I think an Access 'cleaner' and a DTS 'mover' could work nicely. As I described above, the tool in use employs Cold Fusion (which calls the sprocs) as a GUI, but I could just make some simple Access form screens (or do it all in VB if Access licenses are an issue) and they would still have a GUI-based tool for their import sequence.

I have to say that I too am a huge Access fan, and I am really getting into advanced DTS as of late. I tend to think of Access as SS2K's little brother that can do some bothersome smaller chores when the big brother is too busy. I think the people who disregard Access as a serious tool do not have a firm grasp of VBA and the power of the reference library.

Thank you for your input, and I am of course still open to hearing more discussion on this import topic!


<== Some people say they are afraid of heights. With me it's table widths. ==>
 
Are you saying that the only way you can tell whether a row should be imported is to visually inspect it?
 
No, the data does not need to be visually inspected; perhaps just open a file to make sure the client didn't accidentally send an Excel file with their list of chores. But after building one table (from the client's text files, yes, as in .txt files) of 3,096 records, only 1,056 were needed for import, so the rest were dropped. At present the current tool uses stored procedures for everything: to parse the text file data, build the table, find which rows are needed (in this case), and import only those rows. I am thinking that maybe Access should import the text files and build the temp tables (basically apply all of the table-forming logic), and then hail DTS to run a package to import that final product.
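Here is roughly what I mean for the text feeds; the keep/drop test below is just a stand-in, since the real rule comes from each client's file spec:

    ' Read the raw text file and keep only the qualifying rows in a
    ' staging table. The filter rule and all names below are made up.
    Public Sub LoadTextFeed(ByVal sPath As String)
        Dim rs As DAO.Recordset
        Dim nFile As Integer
        Dim sLine As String
        Set rs = CurrentDb.OpenRecordset("stgTextFeed", dbOpenDynaset)
        nFile = FreeFile
        Open sPath For Input As #nFile
        Do While Not EOF(nFile)
            Line Input #nFile, sLine
            ' Hypothetical rule: keep only detail rows flagged "D" in column 1
            If Left$(sLine, 1) = "D" Then
                rs.AddNew
                rs!RawLine = sLine
                rs.Update
            End If
        Loop
        Close #nFile
        rs.Close
    End Sub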

Anybody have a different take, or does that logic seem sound?

<== Some people say they are afraid of heights. With me it's table widths. ==>
 
The reason I asked is that if you can automate the selection process, you can handle that within DTS. There's no need to involve an Access front-end. Either way, you're using VBScript to do the processing, so you simplify your app by cutting down on the number of tools and data passes. You did say that performance was an issue, so the fewer passes over the data, the better.
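For instance, the row selection can sit right inside the transformation, so rejected rows never land in a staging table at all. A minimal sketch, with the column name and the test itself invented:

    ' DTS ActiveX transformation: filter rows during the load itself.
    ' The column name and the test are placeholders.
    Function Main()
        If Left(DTSSource("RawLine"), 1) = "D" Then
            DTSDestination("RawLine") = DTSSource("RawLine")
            Main = DTSTransformStat_OK
        Else
            Main = DTSTransformStat_SkipRow   ' row is discarded, never written
        End If
    End Function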
 
I understand that all of it can be performed inside DTS, but look at it this way: there are currently 18 clients sending data in many different formats, and that data has to be manipulated and formed into acceptable tables. As far as hardware goes, there is a development server and the dedicated production web server. Currently they are only running a two-tier system of CF to an SS2K base. Though DTS can handle all of the logic required for the table formation and importation, this disrupts the web usage (even though it is run during late-night, off-peak hours). True, it will be using VBScript either way, but you could run the Access process on the development server, and then all the production base would have to do is import the data. Really, you could do all of this in DTS and use only the dev server's resources for this process, but it still has to have a front-end that can be used by non-DBA staff. Which is why Access first came to mind: the ability to minimize SQL Server's resource usage, and a quick and easy front-end construction environment (as opposed to me building something entirely in VB or making some janky connection of Access form-screen buttons to DTS packages).

harebrain, I get what you mean as far as minimizing my apps, and in all reality I am not trying to argue for an Access front-end; I am trying to find the best answer. In fact, I would rather manipulate everything in DTS. But which app do you think would handle the job faster and with the lowest CPU strain?


<== Some people say they are afraid of heights. With me it's table widths. ==>
 
Great. I don't have anything against Access; I just want to help you streamline your processing.

You can run a DTS process on a non-server platform (such as your development machine), which similarly relieves the server of the transformation load (just not the data load, which would be impossible).

As others have mentioned, you can schedule a package to run automatically (off-peak), but you can also grant non-DBA users or groups permission to execute a DTS package manually. If you make Enterprise Manager available to them, it's basically a pushbutton (visual) operation. I know, some users can't navigate their way out of "My Computer" and are paralyzed by more than one choice, but then you are stuck.

I'm sure there are others here (hint, hint) who are more familiar with this than my old experience leaves me.

I hope this is coming across more as helpful than ornery. :)
 
Access is definitely not the most efficient program on the planet, but as you stated, you can run it on another server. In fact, to state the obvious, you do not even need to run it on a server.
 
The tools for transforming data are so much better in DTS that I would not go through Access.

If you want to keep the processing off your production server, run the DTS packages on your development server and do all the record processing there. Then run another DTS package to send the data over to Production. At that point it is just a matter of inserting or updating records with data that is known to be correctly formatted, so the package running on Production should take as little time as possible. You can even use the Execute Package task to roll all your different DTS packages into one master DTS package that can be executed from one schedule or command.
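And if you would rather chain the child packages by hand instead of with the Execute Package task, an ActiveX Script task inside the master package can load and run each one. A rough sketch along those lines (server and package names invented):

    ' ActiveX Script task in a master package: run one child package.
    ' Server and package names are placeholders.
    Function Main()
        Dim pkg
        Set pkg = CreateObject("DTS.Package")
        ' 256 = DTSSQLStgFlag_UseTrustedConnection
        pkg.LoadFromSQLServer "DEVSERVER", "", "", 256, "", "", "", "CleanAndStage"
        pkg.Execute
        pkg.UnInitialize
        Set pkg = Nothing
        Main = DTSTaskExecResult_Success
    End Function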
 
Well SQLSister, after researching some advanced DTS programming documentation, I have decided to do just that: program the whole thing using DTS packages run on the Dev Server, with a VB front-end for use by the non-programming staff. A trial run I performed using some of the more complex data uploaded considerably faster than the current sproc system.

Thank you everyone for your input. I only hope that I can be as helpful in the future.

<== Some people say they are afraid of heights. With me it's table widths. ==>
 