Parse CSV

Status
Not open for further replies.

GROKING

Programmer
Mar 8, 2005
US

Hello,

I have a csv file with several columns but I only want to keep 3 of them.

"me","you","you and me","nobody"
"me","you","you and me and nobody","nobody"

So I want to create a new file with just columns 2 and 3, for instance. Column 3 is variable length.

Any ideas would be helpful.
 
Read the file line by line, use the Split function to split each line into an array of strings (splitting on the comma), then reassemble the line using only the fields you want and write it to your new file.

-Rick
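
The Split approach described above could look roughly like the sketch below. The names KeepColumns2And3 and FilterFile are made up for illustration, and it assumes no field contains a comma of its own, since a plain Split would cut such a field in two. The quotes around each field are kept as they are.

```vbnet
Imports System
Imports System.IO

Module SplitSketch
    ' Hypothetical helper: returns columns 2 and 3 (counting from 1) of one line.
    ' Assumption: no field contains an embedded comma.
    Function KeepColumns2And3(line As String) As String
        Dim fields() As String = line.Split(","c)
        If fields.Length < 3 Then
            Return line ' too few fields, pass the line through unchanged
        End If
        Return fields(1) & "," & fields(2)
    End Function

    ' Hypothetical helper: reads inputPath line by line and writes the
    ' reduced lines to outputPath.
    Sub FilterFile(inputPath As String, outputPath As String)
        Using reader As New StreamReader(inputPath), writer As New StreamWriter(outputPath)
            Dim line As String = reader.ReadLine()
            While line IsNot Nothing
                writer.WriteLine(KeepColumns2And3(line))
                line = reader.ReadLine()
            End While
        End Using
    End Sub
End Module
```

Because the file is read one line at a time, memory use stays flat even for large files.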

 
If you have very large files and exactly 4 columns, you could use a regular expression, something like this:
Code:
Imports System.Text.RegularExpressions

' Keeps only columns 2 and 3 of every line made of exactly four quoted fields.
Private Function ChangeFile(fileText As String) As String
    Dim matchPattern As String = _
        "^(""[^""]*"")\s*,\s*(""[^""]*"")\s*,\s*(""[^""]*"")\s*,\s*(""[^""]*"")\s*$"
    Dim replacePattern As String = "$2,$3"
    Dim options As RegexOptions = RegexOptions.Multiline

    ' Lines that do not match the pattern are left unchanged.
    If Regex.IsMatch(fileText, matchPattern, options) Then
        fileText = Regex.Replace(fileText, matchPattern, replacePattern, options)
    End If
    Return fileText
End Function 'ChangeFile
What this does is match:
^ - the beginning of the line
( - beginning of group #1
" - a double quote
[^"]* - anything that is not a ", zero or more times
" - a double quote
) - end of group #1
\s* - any white space, zero or more times
, - a comma
\s* - any white space, zero or more times

The same quoted-field group is repeated for groups #2, #3 and #4, with the \s*,\s* separator between each pair of fields. Group #4 is followed only by \s*, with no trailing comma.

$ - end of line

Groups within parentheses are numbered from left to right as groups 1, 2, 3 and 4.

Then the replacement pattern $2,$3 says replace with
$2 - group #2
, - a comma
$3 - group #3

The Multiline option makes ^ and $ match at the beginning and end of every line instead of only at the beginning and end of the entire string. Enjoy...
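
To see the replacement in action, here is a small sketch that runs the same pattern over the two sample lines from the original question. The module name RegexSketch and the helper KeepColumns2And3 are made up for illustration, and the sample text is joined with a bare line feed.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module RegexSketch
    ' Same four-quoted-field pattern as explained above.
    Private Const MatchPattern As String = _
        "^(""[^""]*"")\s*,\s*(""[^""]*"")\s*,\s*(""[^""]*"")\s*,\s*(""[^""]*"")\s*$"

    ' Hypothetical helper: keeps groups 2 and 3 of every matching line.
    Function KeepColumns2And3(fileText As String) As String
        Return Regex.Replace(fileText, MatchPattern, "$2,$3", RegexOptions.Multiline)
    End Function

    Sub Main()
        Dim sample As String = _
            """me"",""you"",""you and me"",""nobody""" & vbLf & _
            """me"",""you"",""you and me and nobody"",""nobody"""
        Console.WriteLine(KeepColumns2And3(sample))
        ' Prints:
        ' "you","you and me"
        ' "you","you and me and nobody"
    End Sub
End Module
```

One thing to watch: in .NET, $ in Multiline mode matches just before a line feed, and the trailing \s* also matches a carriage return. With Windows (CR/LF) line endings the carriage return is therefore part of the match, and the rewritten lines end in a bare line feed.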

Have a great day!

j2consulting@yahoo.com
 