Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations SkipVought on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

CSV file validation to be loaded in SQL table

Status
Not open for further replies.

BTrees

IS-IT--Management
Aug 12, 2006
45
0
0
CA
Hi All,

I have a requirement of validating CSV file before loading into staged-folder, and later have to load into sql table.
I need to validate metadata (the structure of the file must be same as target sql table)
No. of columns should be equal to the target sql table
order of columns should be same as target sql table
Data types of columns (no text values should exist in numeric field of csv file)
looking for some easy and efficient way achieve this.
Thanks for help

 
For a similar case, I wrote a validation program in VBscript
 
powershell, vb or c# script.

note that it is not necessarily easy - and if the files are big you are better off loading onto a staging table and do the validation there

Regards

Frederico Fonseca
SysSoft Integrated Ltd

FAQ219-2884
FAQ181-2886
 
Thanks Mikrom, regarding your validation script, is there something that you can share please
 
Hi BTrees,

The validation script is a very simple program which reads the CSV-file line by line,
it splits every line into fields (columns) and validates every field.
If something is wrong with the data, it shows the message with row and column, where the error in the CSV file occured.

To demonstrate it, I created a simple example for you:

Given is this CSV file
csv_data.csv
Code:
pers_name;pers_id;contract_no;confirm;contract_date;bank_code;created_by;date_created
foo;1234567890;123456701;yes;20140515;1111;ABC;20190225
bar;2345678901;234567802;yes;20140516;2222;XAZ;20190110
baz;3456789012;345678903;no;20180122;3333;PQR;20191230

and here is the validation script
csv_validate.vbs
Code:
[COLOR=#804040][b]option[/b][/color] [COLOR=#804040][b]explicit[/b][/color]

[COLOR=#0000ff]' show help[/color]
[COLOR=#804040][b]call[/b][/color] help

[COLOR=#0000ff]' get file path from Excel File Open Dialog[/color]
[COLOR=#804040][b]dim[/b][/color] csv_path
csv_path [COLOR=#804040][b]=[/b][/color] get_csvPath[COLOR=#804040][b]()[/b][/color]

[COLOR=#804040][b]dim[/b][/color] fso[COLOR=#804040][b],[/b][/color] csv_file
[COLOR=#804040][b]set[/b][/color] fso [COLOR=#804040][b]=[/b][/color] [COLOR=#008080]CreateObject[/color][COLOR=#804040][b]([/b][/color][COLOR=#ff00ff]"Scripting.Filesystemobject"[/color][COLOR=#804040][b])[/b][/color]
[COLOR=#804040][b]const[/b][/color] ForReading [COLOR=#804040][b]=[/b][/color] [COLOR=#ff00ff]1[/color][COLOR=#804040][b],[/b][/color] ForWriting [COLOR=#804040][b]=[/b][/color] [COLOR=#ff00ff]2[/color][COLOR=#804040][b],[/b][/color] ForAppending [COLOR=#804040][b]=[/b][/color] [COLOR=#ff00ff]8[/color] 
[COLOR=#804040][b]set[/b][/color] csv_file [COLOR=#804040][b]=[/b][/color] fso[COLOR=#804040][b].[/b][/color]OpenTextFile[COLOR=#804040][b]([/b][/color]csv_path[COLOR=#804040][b],[/b][/color] ForReading[COLOR=#804040][b])[/b][/color]

[COLOR=#804040][b]dim[/b][/color] line_num[COLOR=#804040][b],[/b][/color] csv_line
line_num [COLOR=#804040][b]=[/b][/color] [COLOR=#ff00ff]0[/color]
[COLOR=#804040][b]do[/b][/color] [COLOR=#804040][b]until[/b][/color] csv_file[COLOR=#804040][b].[/b][/color]AtEndOfStream
  line_num [COLOR=#804040][b]=[/b][/color] line_num [COLOR=#804040][b]+[/b][/color] [COLOR=#ff00ff]1[/color]
  csv_line [COLOR=#804040][b]=[/b][/color] csv_file[COLOR=#804040][b].[/b][/color][COLOR=#a020f0]ReadLine[/color][COLOR=#804040][b]()[/b][/color]
  [COLOR=#804040][b]if[/b][/color] line_num [COLOR=#804040][b]>[/b][/color] [COLOR=#ff00ff]1[/color] [COLOR=#804040][b]then[/b][/color]
    [COLOR=#804040][b]call[/b][/color] validate_csv_line[COLOR=#804040][b]([/b][/color]line_num[COLOR=#804040][b],[/b][/color] csv_line[COLOR=#804040][b])[/b][/color]
  [COLOR=#804040][b]end[/b][/color] [COLOR=#804040][b]if[/b][/color]
[COLOR=#804040][b]loop[/b][/color]
csv_file[COLOR=#804040][b].[/b][/color][COLOR=#804040][b]close[/b][/color]

[COLOR=#0000ff]' At end: close all, release objects from memory and quit [/color]
[COLOR=#804040][b]set[/b][/color] fso [COLOR=#804040][b]=[/b][/color] [COLOR=#804040][b]nothing[/b][/color]
[COLOR=#008080]msgbox[/color] [COLOR=#ff00ff]"End of the validation "[/color] [COLOR=#804040][b]&[/b][/color] [COLOR=#ff00ff]VbCrLf[/color][COLOR=#804040][b], _[/b][/color]
       [COLOR=#ff00ff]vbExclamation[/color][COLOR=#804040][b],[/b][/color] [COLOR=#ff00ff]"Warning"[/color]

[COLOR=#0000ff]'************************** Subroutines/Functions *****************************[/color]

[COLOR=#804040][b]sub[/b][/color] help
  [COLOR=#804040][b]dim[/b][/color] msg
  msg [COLOR=#804040][b]= _[/b][/color]
  [COLOR=#ff00ff]"This program validates specific CSV-file"[/color] [COLOR=#804040][b]&[/b][/color] [COLOR=#ff00ff]vbCr[/color]
  [COLOR=#804040][b]call[/b][/color] [COLOR=#008080]msgbox[/color] [COLOR=#804040][b]([/b][/color]msg[COLOR=#804040][b],[/b][/color] [COLOR=#ff00ff]vbInformation[/color][COLOR=#804040][b],[/b][/color] [COLOR=#ff00ff]"Info"[/color][COLOR=#804040][b])[/b][/color]
[COLOR=#804040][b]end[/b][/color] [COLOR=#804040][b]sub[/b][/color]

[COLOR=#804040][b]function[/b][/color] get_csvPath[COLOR=#804040][b]()[/b][/color]
  [COLOR=#804040][b]dim[/b][/color] filename[COLOR=#804040][b],[/b][/color] objExcel
  
 [COLOR=#0000ff] ' Create an invisible version of Excel[/color]
  [COLOR=#804040][b]set[/b][/color] objExcel [COLOR=#804040][b]=[/b][/color] [COLOR=#008080]CreateObject[/color][COLOR=#804040][b]([/b][/color][COLOR=#ff00ff]"Excel.Application"[/color][COLOR=#804040][b])[/b][/color]

  filename [COLOR=#804040][b]=[/b][/color] objExcel[COLOR=#804040][b].[/b][/color]GetOpenFilename [COLOR=#804040][b]([/b][/color][COLOR=#ff00ff]"All CSV-Files (*.csv),*.csv"[/color][COLOR=#804040][b])[/b][/color]
  
 [COLOR=#0000ff] ' At end: close all, release objects from memory and quit [/color]
  objExcel[COLOR=#804040][b].[/b][/color][COLOR=#a020f0]quit[/color]
  [COLOR=#804040][b]set[/b][/color] objExcel [COLOR=#804040][b]=[/b][/color] [COLOR=#804040][b]nothing[/b][/color]
  
  [COLOR=#804040][b]if[/b][/color] filename [COLOR=#804040][b]=[/b][/color] [COLOR=#ff00ff]False[/color] [COLOR=#804040][b]then[/b][/color]
   [COLOR=#0000ff] 'wscript.echo "'Open File Dialog' was cancelled"[/color]
    [COLOR=#008080]msgbox[/color] [COLOR=#ff00ff]"No CSV-file selected !"[/color][COLOR=#804040][b],[/b][/color] [COLOR=#ff00ff]vbCritical[/color][COLOR=#804040][b],[/b][/color] [COLOR=#ff00ff]"Error"[/color]
    wscript[COLOR=#804040][b].[/b][/color][COLOR=#a020f0]quit[/color]
  [COLOR=#804040][b]end[/b][/color] [COLOR=#804040][b]if[/b][/color]
 [COLOR=#0000ff] ' return result[/color]
  get_csvPath [COLOR=#804040][b]=[/b][/color] filename
[COLOR=#804040][b]end[/b][/color] [COLOR=#804040][b]function[/b][/color]

[COLOR=#804040][b]sub[/b][/color] validate_csv_line[COLOR=#804040][b]([/b][/color]line_num[COLOR=#804040][b],[/b][/color] input_string[COLOR=#804040][b])[/b][/color]
 [COLOR=#0000ff] ' Validation check of the individual fields[/color]
  [COLOR=#804040][b]dim[/b][/color] csv_array[COLOR=#804040][b], _[/b][/color]
      pers_name[COLOR=#804040][b],[/b][/color] pers_id[COLOR=#804040][b],[/b][/color] contract_no[COLOR=#804040][b],[/b][/color] confirm[COLOR=#804040][b],[/b][/color] datum[COLOR=#804040][b], _[/b][/color]
      bank_code[COLOR=#804040][b],[/b][/color] created_by
      
  csv_array [COLOR=#804040][b]=[/b][/color] [COLOR=#a020f0]split[/color][COLOR=#804040][b]([/b][/color]input_string[COLOR=#804040][b],[/b][/color][COLOR=#ff00ff]";"[/color][COLOR=#804040][b])[/b][/color]

 [COLOR=#0000ff] ' person name[/color]
  pers_name [COLOR=#804040][b]=[/b][/color] csv_array[COLOR=#804040][b]([/b][/color][COLOR=#ff00ff]0[/color][COLOR=#804040][b])[/b][/color]
  [COLOR=#804040][b]if[/b][/color] pers_name [COLOR=#804040][b]=[/b][/color] [COLOR=#ff00ff]""[/color] [COLOR=#804040][b]or[/b][/color] [COLOR=#008080]isNumeric[/color][COLOR=#804040][b]([/b][/color]pers_name[COLOR=#804040][b])[/b][/color] [COLOR=#804040][b]then[/b][/color]
    [COLOR=#804040][b]call[/b][/color] field_error[COLOR=#804040][b]([/b][/color]line_num[COLOR=#804040][b],[/b][/color] [COLOR=#ff00ff]1[/color][COLOR=#804040][b],[/b][/color] [COLOR=#ff00ff]"person name"[/color] [COLOR=#804040][b],[/b][/color] priezvisko[COLOR=#804040][b])[/b][/color]
  [COLOR=#804040][b]end[/b][/color] [COLOR=#804040][b]if[/b][/color]

 [COLOR=#0000ff] ' person id[/color]
  pers_id [COLOR=#804040][b]=[/b][/color] csv_array[COLOR=#804040][b]([/b][/color][COLOR=#ff00ff]1[/color][COLOR=#804040][b])[/b][/color]
  [COLOR=#804040][b]if[/b][/color] [COLOR=#008080]isNumeric[/color][COLOR=#804040][b]([/b][/color]pers_id[COLOR=#804040][b])[/b][/color] [COLOR=#804040][b]and[/b][/color] [COLOR=#804040][b]len[/b][/color][COLOR=#804040][b]([/b][/color]pers_id[COLOR=#804040][b])[/b][/color] [COLOR=#804040][b]=[/b][/color] [COLOR=#ff00ff]10[/color] [COLOR=#804040][b]then[/b][/color]
   [COLOR=#0000ff] 'pass[/color]
  [COLOR=#804040][b]else[/b][/color]
    [COLOR=#804040][b]call[/b][/color] field_error[COLOR=#804040][b]([/b][/color]line_num[COLOR=#804040][b],[/b][/color] [COLOR=#ff00ff]2[/color][COLOR=#804040][b],[/b][/color] [COLOR=#ff00ff]"person id"[/color] [COLOR=#804040][b],[/b][/color] pers_id[COLOR=#804040][b])[/b][/color]
  [COLOR=#804040][b]end[/b][/color] [COLOR=#804040][b]if[/b][/color]

 [COLOR=#0000ff] ' contract number[/color]
  contract_no [COLOR=#804040][b]=[/b][/color] csv_array[COLOR=#804040][b]([/b][/color][COLOR=#ff00ff]2[/color][COLOR=#804040][b])[/b][/color]
  [COLOR=#804040][b]if[/b][/color] [COLOR=#008080]isNumeric[/color][COLOR=#804040][b]([/b][/color]contract_no[COLOR=#804040][b])[/b][/color] [COLOR=#804040][b]and[/b][/color] [COLOR=#804040][b]len[/b][/color][COLOR=#804040][b]([/b][/color]contract_no[COLOR=#804040][b])[/b][/color] [COLOR=#804040][b]=[/b][/color] [COLOR=#ff00ff]9[/color] [COLOR=#804040][b]then[/b][/color]
   [COLOR=#0000ff] 'pass[/color]
  [COLOR=#804040][b]else[/b][/color]
    [COLOR=#804040][b]call[/b][/color] field_error[COLOR=#804040][b]([/b][/color]line_num[COLOR=#804040][b],[/b][/color] [COLOR=#ff00ff]3[/color][COLOR=#804040][b],[/b][/color] [COLOR=#ff00ff]"contract number"[/color] [COLOR=#804040][b],[/b][/color] contract_no[COLOR=#804040][b])[/b][/color]
  [COLOR=#804040][b]end[/b][/color] [COLOR=#804040][b]if[/b][/color]

 [COLOR=#0000ff] ' confirm[/color]
  confirm [COLOR=#804040][b]=[/b][/color] csv_array[COLOR=#804040][b]([/b][/color][COLOR=#ff00ff]3[/color][COLOR=#804040][b])[/b][/color]
  [COLOR=#804040][b]if[/b][/color] confirm [COLOR=#804040][b]=[/b][/color] [COLOR=#ff00ff]"yes"[/color] [COLOR=#804040][b]or[/b][/color] confirm [COLOR=#804040][b]=[/b][/color] [COLOR=#ff00ff]"no"[/color] [COLOR=#804040][b]then[/b][/color]
   [COLOR=#0000ff] 'pass[/color]
  [COLOR=#804040][b]else[/b][/color]
    [COLOR=#804040][b]call[/b][/color] field_error[COLOR=#804040][b]([/b][/color]line_num[COLOR=#804040][b],[/b][/color] [COLOR=#ff00ff]4[/color][COLOR=#804040][b],[/b][/color] [COLOR=#ff00ff]"confirm"[/color] [COLOR=#804040][b],[/b][/color] confirm[COLOR=#804040][b])[/b][/color]
  [COLOR=#804040][b]end[/b][/color] [COLOR=#804040][b]if[/b][/color]  

 [COLOR=#0000ff] ' date in format YYYYMMDD[/color]
  datum [COLOR=#804040][b]=[/b][/color] csv_array[COLOR=#804040][b]([/b][/color][COLOR=#ff00ff]4[/color][COLOR=#804040][b])[/b][/color]
  [COLOR=#804040][b]if[/b][/color] [COLOR=#008080]isNumeric[/color][COLOR=#804040][b]([/b][/color]datum[COLOR=#804040][b])[/b][/color] [COLOR=#804040][b]and[/b][/color] [COLOR=#804040][b]([/b][/color]datum [COLOR=#804040][b]=[/b][/color] [COLOR=#ff00ff]0[/color] [COLOR=#804040][b]or[/b][/color] [COLOR=#804040][b]len[/b][/color][COLOR=#804040][b]([/b][/color]datum[COLOR=#804040][b])[/b][/color] [COLOR=#804040][b]=[/b][/color] [COLOR=#ff00ff]8[/color][COLOR=#804040][b])[/b][/color] [COLOR=#804040][b]then[/b][/color]
   [COLOR=#0000ff] 'pass[/color]
  [COLOR=#804040][b]else[/b][/color]
    [COLOR=#804040][b]call[/b][/color] field_error[COLOR=#804040][b]([/b][/color]line_num[COLOR=#804040][b],[/b][/color] [COLOR=#ff00ff]5[/color][COLOR=#804040][b],[/b][/color] [COLOR=#ff00ff]"contract date"[/color] [COLOR=#804040][b],[/b][/color] datum[COLOR=#804040][b])[/b][/color]
  [COLOR=#804040][b]end[/b][/color] [COLOR=#804040][b]if[/b][/color]

  bank_code [COLOR=#804040][b]=[/b][/color] csv_array[COLOR=#804040][b]([/b][/color][COLOR=#ff00ff]5[/color][COLOR=#804040][b])[/b][/color]
  [COLOR=#804040][b]if[/b][/color] [COLOR=#008080]isNumeric[/color][COLOR=#804040][b]([/b][/color]bank_code[COLOR=#804040][b])[/b][/color] [COLOR=#804040][b]and[/b][/color] [COLOR=#804040][b]([/b][/color]bank_code [COLOR=#804040][b]=[/b][/color] [COLOR=#ff00ff]0[/color] [COLOR=#804040][b]or[/b][/color] [COLOR=#804040][b]len[/b][/color][COLOR=#804040][b]([/b][/color]bank_code[COLOR=#804040][b])[/b][/color] [COLOR=#804040][b]=[/b][/color] [COLOR=#ff00ff]4[/color][COLOR=#804040][b])[/b][/color] [COLOR=#804040][b]then[/b][/color]
   [COLOR=#0000ff] 'pass[/color]
  [COLOR=#804040][b]else[/b][/color]
    [COLOR=#804040][b]call[/b][/color] field_error[COLOR=#804040][b]([/b][/color]line_num[COLOR=#804040][b],[/b][/color] [COLOR=#ff00ff]6[/color][COLOR=#804040][b],[/b][/color] [COLOR=#ff00ff]"bank code"[/color] [COLOR=#804040][b],[/b][/color] bank_code[COLOR=#804040][b])[/b][/color]
  [COLOR=#804040][b]end[/b][/color] [COLOR=#804040][b]if[/b][/color]

 [COLOR=#0000ff] ' created by[/color]
  created_by [COLOR=#804040][b]=[/b][/color] csv_array[COLOR=#804040][b]([/b][/color][COLOR=#ff00ff]6[/color][COLOR=#804040][b])[/b][/color]
  [COLOR=#804040][b]if[/b][/color] created_by [COLOR=#804040][b]=[/b][/color] [COLOR=#ff00ff]""[/color] [COLOR=#804040][b]or[/b][/color] [COLOR=#008080]isNumeric[/color][COLOR=#804040][b]([/b][/color]created_by[COLOR=#804040][b])[/b][/color] [COLOR=#804040][b]then[/b][/color]
    [COLOR=#804040][b]call[/b][/color] field_error[COLOR=#804040][b]([/b][/color]line_num[COLOR=#804040][b],[/b][/color] [COLOR=#ff00ff]7[/color][COLOR=#804040][b],[/b][/color] [COLOR=#ff00ff]"created by"[/color] [COLOR=#804040][b],[/b][/color] created_by[COLOR=#804040][b])[/b][/color]
  [COLOR=#804040][b]end[/b][/color] [COLOR=#804040][b]if[/b][/color]  

 [COLOR=#0000ff] ' date in format YYYYMMDD[/color]
  datum [COLOR=#804040][b]=[/b][/color] csv_array[COLOR=#804040][b]([/b][/color][COLOR=#ff00ff]7[/color][COLOR=#804040][b])[/b][/color]
  [COLOR=#804040][b]if[/b][/color] [COLOR=#008080]isNumeric[/color][COLOR=#804040][b]([/b][/color]datum[COLOR=#804040][b])[/b][/color] [COLOR=#804040][b]and[/b][/color] [COLOR=#804040][b]([/b][/color]datum [COLOR=#804040][b]=[/b][/color] [COLOR=#ff00ff]0[/color] [COLOR=#804040][b]or[/b][/color] [COLOR=#804040][b]len[/b][/color][COLOR=#804040][b]([/b][/color]datum[COLOR=#804040][b])[/b][/color] [COLOR=#804040][b]=[/b][/color] [COLOR=#ff00ff]8[/color][COLOR=#804040][b])[/b][/color] [COLOR=#804040][b]then[/b][/color]
   [COLOR=#0000ff] 'pass[/color]
  [COLOR=#804040][b]else[/b][/color]
    [COLOR=#804040][b]call[/b][/color] field_error[COLOR=#804040][b]([/b][/color]line_num[COLOR=#804040][b],[/b][/color] [COLOR=#ff00ff]8[/color][COLOR=#804040][b],[/b][/color] [COLOR=#ff00ff]"date created"[/color] [COLOR=#804040][b],[/b][/color] datum[COLOR=#804040][b])[/b][/color]
  [COLOR=#804040][b]end[/b][/color] [COLOR=#804040][b]if[/b][/color]
[COLOR=#804040][b]end[/b][/color] [COLOR=#804040][b]sub[/b][/color]

[COLOR=#804040][b]sub[/b][/color] field_error[COLOR=#804040][b]([/b][/color]line_num[COLOR=#804040][b],[/b][/color] col_num[COLOR=#804040][b],[/b][/color] field_name[COLOR=#804040][b],[/b][/color] field_value[COLOR=#804040][b])[/b][/color]
    [COLOR=#804040][b]dim[/b][/color] msg
    msg [COLOR=#804040][b]=[/b][/color] [COLOR=#ff00ff]"Error in CSV data:"[/color] [COLOR=#804040][b]&[/b][/color] [COLOR=#ff00ff]vbCr[/color] [COLOR=#804040][b]& _[/b][/color]
          [COLOR=#ff00ff]"           row: "[/color] [COLOR=#804040][b]&[/b][/color] line_num [COLOR=#804040][b]&[/b][/color] [COLOR=#ff00ff]vbCr[/color] [COLOR=#804040][b]& _[/b][/color]
          [COLOR=#ff00ff]"           col: "[/color] [COLOR=#804040][b]&[/b][/color] col_num [COLOR=#804040][b]&[/b][/color] [COLOR=#ff00ff]vbCr[/color] [COLOR=#804040][b]& _[/b][/color]
          field_name [COLOR=#804040][b]&[/b][/color] [COLOR=#ff00ff]" = "[/color]  [COLOR=#804040][b]&[/b][/color] field_value
    [COLOR=#804040][b]call[/b][/color] [COLOR=#008080]msgbox[/color][COLOR=#804040][b]([/b][/color]msg[COLOR=#804040][b],[/b][/color] [COLOR=#ff00ff]vbCritical[/color][COLOR=#804040][b],[/b][/color] [COLOR=#ff00ff]"Error"[/color][COLOR=#804040][b])[/b][/color]
   [COLOR=#0000ff] 'wscript.quit[/color]
[COLOR=#804040][b]end[/b][/color] [COLOR=#804040][b]sub[/b][/color]

You can save the script and the CSV-file in your directory and try it out, how it works
(for the working open file dialog you need to have Excel installed on your PC).

For example, when you look at the CSV data posted above, in the column=4 are only yes and no values and if you look in the script then you will see that only these two values are allowed in the column=4.
If you modify the file so, that you change the no (Row=4, Col=4) to Hi then the script shows the error message:

err_msg_o4x9kl.jpg
 
Thank you Mikrom, it looks very help full I am trying this.
 
Hope this helps you a little bit to create a validation program for your specific CSV-file
 
I know this is a Microsoft forum, but unix/linux utilities such as awk and perl are available to run in that environment. awk is well suited to do the three tasks you listed, count columns, column order, and data type checking.

==================================
advanced cognitive capabilities and other marketing buzzwords explained with sarcastic simplicity


 
Hi johnherman,
I fully agree with you. I'm using awk very often. It is best suited for this task. I mostly use VBscript only because it is installed on every Windows by default and the user does not need to install any additional software.
 
mikrom,
I know you can write VBScript in Notepad, but is there any particular app that you write your VBScript code with? Something that you can step thru the code? With Intelisense, maybe...?


---- Andy

There is a great need for a sarcasm font.
 
Hi Andrzejek,
Instead of notepad I'm using gvim. I don't have any advanced tool.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top