Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

CFFILE Dedupe

Status
Not open for further replies.

boatguy

Programmer
Oct 22, 2001
153
US
Ok, here's my question. I want to use cffile to upload a pipe delimited text file, but in the process, I want to dedupe the file and then save it to the server as csv. Can this be accomplished?
 
i think so.

<cfset oldFile = 'myFile.txt'>
<cfset newFile = ReReplace(oldFile, '.txt', '.csv', 'all')>

then

<cffile action = "upload"
fileField = "newFile"
destination = "c:\...."
>


 
Thanks Falconseye. Now, is there a way to remove duplicates?
 
Not duplicate files, duplicate records in the file. The system my clients are pulling the records from are loaded with duplicates and they forget to remove the duplicate records before uploading.
 
oops, then you will need to do that when you read that file. that makes things much complicated though.
i would probably put them in a temp table then go thru the records and remove the dups.
hope it helps...

 
Removing duplicate records in the CSV file, isn't all that difficult.

Provided that each line in the file represents a row, and each row has the same number of columns (text delimited by the pipe), then you can use the <cfhttp> tag to automagicly convert the csv into a query. Once the contents are in the context of a query, you can QoQ to extract distinct rows are then write them off to a file.

For example:

Code:
<!--- Upload the file --->
<cffile action="upload" 
		filefield="theFileFormfield" 
		destination="C:\path\to\your\webroot\fileuploads" 
		nameconflict="makeUnique" 
		accept="text/plain">

<!--- Read the File using cfhttp --->
<!--- Obviously you will want to adjust the values of text qualifier, firstRowAsHeaders and Columns as you see fit --->
<!--- Look at the live docs page if you are unsure of the attribute meanings --->
<!--- [URL unfurl="true"]http://livedocs.macromedia.com/coldfusion/7/htmldocs/wwhelp/wwhimpl/common/html/wwhelp.htm?context=ColdFusion_Documentation&file=00000272.htm#1632966[/URL] --->
<cfhttp url="[URL unfurl="true"]http://www.yourdomain.com/fileuploads/#cffile.ServerFile#"[/URL]
		name="qryNewFileContents"
		delimiter="|"
		textqualifier="'"
		firstRowAsHeaders="no"
		columns="COL1,COL2,COL3"></cfhttp>
		
<!--- Perform a Query of Queries (QoQ) to extract distinct rows from the query --->		
<cfquery name="qryOnlyDistincts" dbtype="query">
	SELECT DISTINCT( COL1 ) AS distinctCol, COL2, COL3
	FROM qryNewFileContents
	ORDER BY COL1, COL2, COL3
</cfquery>

<!--- Write the contents of your query out to a new file, using the .csv file extentsion --->
<!--- If the file that you are writing is large, you might want to use a StringBuffer instead of the 'normal' --->
<!--- cfset, however for this example I will keep it simple. --->
<cfset newFileContents = "">
<cfset newDelim = "|">
<cfset endLine = chr(10) & chr(13)>

<cfloop query="qryOnlyDistincts">
	<cfset newFileConents = newFileContents & delim & distinctCol & delim & COL2 & delim & COL3 >
	<!--- Add a newline unless it's the last record in the query --->
	<cfif currentRow NEQ recordCount>
		<cfset newFileContents = newFileContents & endLine >
	</cfif>
</cfloop>

<!--- Using the name of the uploaded file, tack on the csv file extention --->
<cffile action="write" file="C:\path\to\new\file\location\#cffile.clientFileName#.csv" output="#newFileContents#">

This might not be the most elegant solution but it should work, provided that you can use the aggregate function DISTINCT with the QoQ.

Hope this Helps,

JAlpino
 
Thanks Jalpino,

I am getting an error:
Code:
Invalid CFML construct found on line 45 at column 92.  
ColdFusion was looking at the following text:
Name

The CFML compiler was processing:

a cfset tag beginning on line 45, column 6.
a cfset tag beginning on line 45, column 6.
 
  
The error occurred in C:\Inetpub\[URL unfurl="true"]wwwroot\Bennington_LM\admin\fileuploader2.cfm:[/URL] line 45
 
43 : 
44 : <cfloop query="qryOnlyDistincts">
45 :     <cfset newFileConents = newFileContents & delim & email & delim & Date & delim & First Name & delim & Last Name & delim & Lead Type & delim & Address & delim & Address2 & delim & City & delim & State & delim & Country & delim & Zip Code & delim & Phone & delim & Purchase Time & delim & Boat Owner & delim & Owned Categories & delim & Interested Categories>
46 :     <!--- Add a newline unless it's the last record in the query --->
47 :     <cfif currentRow NEQ recordCount>

Here's how I have that loop formatted:

Code:
<cfloop query="qryOnlyDistincts">
    <cfset newFileConents = newFileContents & delim & email & delim & Date & delim & First Name & delim & Last Name & delim & Lead Type & delim & Address & delim & Address2 & delim & City & delim & State & delim & Country & delim & Zip Code & delim & Phone & delim & Purchase Time & delim & Boat Owner & delim & Owned Categories & delim & Interested Categories>
    <!--- Add a newline unless it's the last record in the query --->
    <cfif currentRow NEQ recordCount>
        <cfset newFileContents = newFileContents & endLine >
    </cfif>
</cfloop>

What do I have wrong?
 
Could it be
Code:
45 :     <cfset newFileCon[COLOR=red]t[/color]ents = newFileContents & delim

________________________________________________________________
If you want to get the best response to a question, please check out FAQ222-2244 first.
'If we're supposed to work in Hex, why have we only got A fingers?'
Drive a Steam Roller
 
I think the problem might be the space in field First Name. Could it be that CF is treating First and Name as two different field names? Try removing the space and leaving the var as:
Code:
<cfset newFileConents = newFileContents & delim & email & delim & Date & delim & FirstName & delim & LastName & delim & LeadType & delim & Address & delim & Address2 & delim & City & delim & State & delim & Country & delim & ZipCode & delim & Phone & delim & PurchaseTime & delim & BoatOwner & delim & OwnedCategories & delim & InterestedCategories>

____________________________________
Just Imagine.
 
you will need to modify how you set your newFileContents var.

first, the number of cols in your qryOnlyDistincts must be equal to your number of fields in the text field. so, if you have say 15 cols, you will do

<cfquery name="qryOnlyDistincts" dbtype="query">
SELECT DISTINCT( COL1 ) AS distinctCol, COL2, COL3,
COL4, ........ COL15
FROM qryNewFileContents
ORDER BY COL1, COL2, COL3, ....., COL15
</cfquery>

then in the loop

<cfloop query="qryOnlyDistincts">
<cfset newFileContents = newFileContents & delim & distinctCol & delim & COL2 & delim & COL3 & delim & COL4
& delim & COL5 ...... & delim & COL15>
<!--- Add a newline unless it's the last record in the query --->
<cfif currentRow NEQ recordCount>
<cfset newFileContents = newFileContents & endLine >
</cfif>
</cfloop>


hope it helps...


 
hi boatguy,

I'm guessing you got the jist of what I was trying to show you, I hope it works out for you and if it doesn't, keep posting back to the forum so we can help solve it.

As far as the error exception goes, GUJUm0deL picked off the code that was throwing the error, there should not be a space in your variable names.

johnwm did a good job in pointing out a 'bug' that would not have thrown an exception but would have caused a lot of headache when you tried to output 'newFileConents', it would not be what you were expecting.

jalpino
 
You've all been a tremendous help, but alas I am still getting the error:

Current error:
Code:
Invalid CFML construct found on line 40 at column 46.  
ColdFusion was looking at the following text:
,

The CFML compiler was processing:

an expression beginning with "newFileContents", on line 40, column 30.This message is usually caused by a problem in the expressions structure.
a cfset tag beginning on line 40, column 6.
a cfset tag beginning on line 40, column 6.
 
  
The error occurred in C:\Inetpub\[URL unfurl="true"]wwwroot\Bennington_LM\admin\fileuploader2.cfm:[/URL] line 40
 
38 : 
39 : <cfloop query="qryOnlyDistincts">
40 :     <cfset newFileContents = newFileContents&,&email&,&Date&,&FirstName&,&LastName&,&LeadType&,&Address&,&Address2&,&City&,&State&,&Country&,&ZipCode&,&Phone&,&PurchaseTime&,&BoatOwner&,&OwnedCategories&,&InterestedCategories>
41 :     <!--- Add a newline unless it's the last record in the query --->
42 :     <cfif currentRow NEQ recordCount>
I tried it by just removing the spaces from the field names and then I removed all spaces. It just doesn't seem to like that comma after NewFileContents.

Any more ideas?
 
Hi Boatguy,

The reason for your error this time is because of the commas in your string (on line 40). Replace the comma with that variable called 'newDelim' or wrap the commas in double quotes like so:

<cfset newFileContents = newFileContents & "," & email & "," & Date & "," & FirstName & "," & LastName & "," & LeadType & "," ....

In order to concatinate CF Variables with plain text strings, the strings must be enclosed in either a single quotes or double quotes. Be consistant on which ever you choose.

I would recommend using the variable approach instead of plain text. If you decide that you want to change the comma down the road to something else like a semi-colon ';', you won't have to change it in a bunch of places but rather you would just change the value of that one variable.


Good Luck,

JAlpino
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top