Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Import Online CSV Page

Status
Not open for further replies.

lagcat

Technical User
May 18, 2007
52
GB
Hi people

i wont make out i know anything towards SQL except general maintenace when it comes to using and changing data like i am trying to test i am going to need help

i have an online CSV format direct from this website: This is the NON-CSV version:
the CSV page is located here:
ths displays a CSV format of the data i want to import to SQL

this updates every 30minutes

i would like to create a table which will import this table on say a 15 minute cycle so i do not miss the update of figures as they are business critical to our production

i am guessing for this i need to create a job? or some sort of script to run and retrieve this information to update my table - i am happy to keep all the history so the table can grow as large as it like so users can create reports on the current data

you may say...just let them browse to the website themselve....2 reasons why not:
bandwidth: our connection is not great so one server importing this data compared to 200odd users continuously refreshing helps
Internet Control: in production areas we do not allow internet access - i could create policies on the firewall to allow only to one website but keeping it all in SQL is an alot safer option

to start with i will make a table called bmreports which seems obvious
second: create the required columns - MID / Provider / Settlement Date / Settlement Period / Price / Volume

i do not really want the MID or Provider columns but i guess its easier to import the whole lot

third: i will set the correct data type for each column so the import will work correctly

forth: this is where i am stuck on the importing

here is a paste of the CSV data below if you do not wish to go the the websites - as it is just displayed on a webpage rather than an instant downloadable CSV i do not know if this causes problems?

HDR,MARKET INDEX DATA
MID,APXMIDP,20101222,1,42.46,418.500
MID,APXMIDP,20101222,2,42.63,477.000
MID,APXMIDP,20101222,3,41.11,332.500
MID,APXMIDP,20101222,4,41.04,305.000
MID,APXMIDP,20101222,5,41.03,399.000
MID,APXMIDP,20101222,6,42.36,516.000
MID,APXMIDP,20101222,7,42.67,291.000
MID,APXMIDP,20101222,8,41.28,223.000
MID,APXMIDP,20101222,9,40.90,221.000
MID,APXMIDP,20101222,10,40.82,224.500
MID,APXMIDP,20101222,11,44.17,379.000
MID,APXMIDP,20101222,12,43.71,381.500
MID,APXMIDP,20101222,13,45.98,352.500
MID,APXMIDP,20101222,14,44.41,741.500
MID,APXMIDP,20101222,15,49.59,829.500
MID,APXMIDP,20101222,16,48.61,992.000
MID,APXMIDP,20101222,17,49.57,742.000
MID,APXMIDP,20101222,18,52.10,783.500
MID,APXMIDP,20101222,19,60.40,608.000
MID,APXMIDP,20101222,20,63.82,603.500
MID,APXMIDP,20101222,21,72.94,958.000
MID,APXMIDP,20101222,22,76.09,1053.500
MID,APXMIDP,20101222,23,82.17,648.500
MID,APXMIDP,20101222,24,81.76,693.500
MID,APXMIDP,20101222,25,81.96,610.000
MID,APXMIDP,20101222,26,80.79,550.000
MID,APXMIDP,20101222,27,74.17,695.500
MID,APXMIDP,20101222,28,74.99,663.000
MID,APXMIDP,20101222,29,75.16,658.500
MID,APXMIDP,20101222,30,73.98,687.000
MID,APXMIDP,20101222,31,62.79,1010.000
MID,APXMIDP,20101222,32,64.67,1020.500
MID,APXMIDP,20101222,33,76.82,718.000
MID,APXMIDP,20101222,34,87.44,931.000
MID,APXMIDP,20101222,35,83.56,840.500
MID,APXMIDP,20101222,36,81.63,822.500
MID,APXMIDP,20101222,37,79.18,888.500
MID,APXMIDP,20101222,38,74.13,1036.500
MID,APXMIDP,20101222,39,58.17,1403.000
MID,APXMIDP,20101222,40,56.47,1173.500
MID,APXMIDP,20101222,41,55.80,1005.000
MID,APXMIDP,20101222,42,54.15,1332.000
MID,APXMIDP,20101222,43,49.87,1555.500
MID,APXMIDP,20101222,44,49.81,1307.500
MID,APXMIDP,20101222,45,48.30,1517.500
MID,APXMIDP,20101222,46,47.52,1936.500
MID,APXMIDP,20101222,47,43.44,574.000
MID,APXMIDP,20101222,48,42.97,276.000
MID,APXMIDP,20101223,1,42.96,262.500
MID,APXMIDP,20101223,2,42.99,401.000
MID,APXMIDP,20101223,3,42.53,142.500
MID,APXMIDP,20101223,4,42.30,210.500
MID,APXMIDP,20101223,5,41.58,181.500
MID,APXMIDP,20101223,6,42.93,159.000
MID,APXMIDP,20101223,7,40.54,453.500
MID,APXMIDP,20101223,8,40.79,427.000
MID,APXMIDP,20101223,9,40.51,467.000
MID,APXMIDP,20101223,10,40.52,484.500
MID,APXMIDP,20101223,11,41.11,341.000
MID,APXMIDP,20101223,12,41.46,327.000
MID,APXMIDP,20101223,13,45.07,550.500
MID,APXMIDP,20101223,14,44.21,757.500
MID,APXMIDP,20101223,15,45.89,606.000
MID,APXMIDP,20101223,16,46.56,436.500
MID,APXMIDP,20101223,17,46.86,335.000
MID,APXMIDP,20101223,18,48.96,419.500
FTR,66


i do not know what FTR,66 mean at the end so these need to be ignored as i guess it would imports bad data into the table

any help on this would be great - even a step by step guide if you are feeling generous
i have a test server/database ready to go but i have been through so many websites and still not found a working answer

even to know if it was possible that would be great

Cheers

CCENT, CCNA
MCP, MCSA
Comptia: Network Essentials, Security +, A+
 
The first row and the last row in your data causes serious problems with importing. The problem is, there are only 2 columns in each of these rows, and the other rows (with actual data) have 6 columns.

To accommodate your data, I would suggest that you import the data in to a temp table first. Remove the rows you don't want, and then copy the data from the temp table in to the real table. I wouldn't necessarily suggest this if you were importing millions of rows at a time, but it appears as though you will be importing less than 100 rows at a time, so this method should be fine.

I tested this functionality on a table in my database. To create the real table, I used this:

Code:
Create 
Table  bmreports(
          MID VarChar(3) NOT NULL, 
          Provider VarChar(20) NOT NULL, 
          SettlementDate DateTime NOT NULL, 
          SettlementPeriod Int NOT NULL, 
          Price Decimal(10,2) NOT NULL, 
          Volume Decimal(20,4) NOT NULL)

You will probably want to have a primary key (and therefore a clustered index) on this table. This will improve your query speeds. It appears as though a combination of SettlementDate and SettlementPeriod are unique. Well... they are unique in your sample data. To create the primary key...

Code:
Alter table bmreports Add Constraint PK_bmreports Primary Key (SettlementDate, SettlementPeriod)

Now for the importing. To test this, I copy/pasted your data to notepad and saved the file to my computer with the name C:\MarketData.csv. You'll probably want to change that in the code you see below. You probably won't need to change anything else.

Code:
[green]-- Create a temp table with a single column[/green]
Create Table #bmReports(Data VarChar(1000))

[green]-- Load the data from the file in to temp table[/green]
Bulk Insert #bmreports from 'C:\MarketData.csv'
With (ROWTERMINATOR='\n')

[green]-- Delete the first and last row[/green]
[green]-- This works by removing any row with just one comma[/green]
Delete #bmreports
Where  LEN(data)-LEN(REPLACE(data, ',', '')) = 1

[green]-- Create the columns to hold the data[/green]
Alter 
Table  #bmreports 
Add    MID VarChar(10),
       Provider VarChar(20),
       SettlementDate DateTime,
       SettlementPeriod Int,
       Price Decimal(10,2),
       Volume Decimal(20,4)

[green]-- At this point, the data for all 6 columns[/green]
[green]-- are in one column (data).  Now we need to[/green]
[green]-- separate them in to individual columns.[/green]

[green]-- extract MID data[/green]
Update #bmReports
Set    MID = LEFT(Data, CharIndex(',', Data)-1),
       Data = RIGHT(Data, Len(Data)-CharIndex(',',Data))

[green]-- Extract Provider data[/green]
Update #bmReports
Set    Provider = LEFT(Data, CharIndex(',', Data)-1),
       Data = RIGHT(Data, Len(Data)-CharIndex(',',Data))

[green]-- Extract SettlementDate[/green]
Update #bmReports
Set    SettlementDate= LEFT(Data, CharIndex(',', Data)-1),
       Data = RIGHT(Data, Len(Data)-CharIndex(',',Data))

[green]-- Extract SettlementPeriod[/green]
Update #bmReports
Set    SettlementPeriod = LEFT(Data, CharIndex(',', Data)-1),
       Data = RIGHT(Data, Len(Data)-CharIndex(',',Data))

[green]-- Extract Price[/green]
Update #bmReports
Set    Price = LEFT(Data, CharIndex(',', Data)-1),
       Data = RIGHT(Data, Len(Data)-CharIndex(',',Data))

[green]-- Last Column is Volume[/green]
Update #bmReports
Set    Volume = Data

[green]-- Now we need to move this data in to the[/green]
[green]-- real table.  Since date and period appear[/green]
[green]-- to be unique, let's join to the real[/green]
[green]-- table and make sure we don't insert[/green]
[green]-- duplicates[/green]
Insert Into bmReports(mid, provider, settlementdate, settlementperiod, price, volume)
Select #bmreports.mid, 
       #bmreports.provider, 
       #bmreports.settlementdate, 
       #bmreports.settlementperiod, 
       #bmreports.price, 
       #bmreports.volume
From   #bmreports
       Inner Join bmreports   
          On #bmreports.SettlementDate = bmreports.SettlementDate
          and #bmReports.SettlementPeriod = bmreports.SettlementPeriod
Where  bmreports.SettlementDate is NULL
       and bmreports.SettlementPeriod is NULL

All of this assumes that the data already exists in a file on your computer. How you do this is up to you, and is beyond the scope of the advice I am comfortable giving.

Hope this helps.

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Thankyou very much i appreciate the time you have put in

i am just working through this now

i am having one problem with movement of the data

it completes with 0 rows effected...when i check in the bmreports table it is blank

but all the other code completed succesfully with no errors

i am trying to get a ftp location from the company now so i can try and make it more automated but they are throwing in extra costs so i might have to look for a way to copy the webpage automatically....there must be a way

CCENT, CCNA
MCP, MCSA
Comptia: Network Essentials, Security +, A+
 
I think I made a small mistake in the last query. Change it to this:

Code:
-- Now we need to move this data in to the
-- real table.  Since date and period appear
-- to be unique, let's join to the real
-- table and make sure we don't insert
-- duplicates
Insert Into bmReports(mid, provider, settlementdate, settlementperiod, price, volume)
Select #bmreports.mid, 
       #bmreports.provider, 
       #bmreports.settlementdate, 
       #bmreports.settlementperiod, 
       #bmreports.price, 
       #bmreports.volume
From   #bmreports
       [!]Left[/!] Join bmreports   
          On #bmreports.SettlementDate = bmreports.SettlementDate
          and #bmReports.SettlementPeriod = bmreports.SettlementPeriod
Where  bmreports.SettlementDate is NULL
       and bmreports.SettlementPeriod is NULL

Note that I changed Inner Join to Left Join.

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top