I'm new to perl and need help building a pipe delimited text file with rows that contain the correct location of the delimiter.
I'm trying to modify a text file that has record elements delimited by spaces. But I don't want to replace all of the spaces in each row. Then I need to remove any rows that are missing data in the fields. This is to prepare a file for import into a database using an import utility.
Here's what I have so far, input file looks like this and has several hundred rows.
<code>
0211121253 Mike Dell ID065533
empty_field Bill Apple ID394033
0211121253 Leo De Vinci ID02014
0211121253 Joe Del La Marco ID653335
2033223222 John Michael Smith ID345234
...
</code>
<code>
#!/usr/bin/perl
$header = "ACCOUNT|FNAME|LNAME|ID\n";
@data =();
open INFILE, "< dat.txt"
or die "Cant open input file : $!";
# print $header;
while (<INFILE>) {
chomp;
push (@data, $_);
$_ =~ s/\s/\|/g;
print "$_\n";
}
close INFILE;
</code>
After processing, the output file should look like this:
<code>
ACCOUNT|FNAME|LNAME|ID # <-header added
0211121253|Mike|Dell|ID065533
0211121253|Leo|De Vinci|ID02014 # <-space left in place
0211121253|Joe|Del La Marco|ID653335 #same here
2033223222|John Michael|Smith|ID345234 # <-space left place
...
</code>
The script should also delete any rows with empty elements.
I'm not sure what would work best, the join, split, push, shift/unshift operators, or a regex that replaces the specified space with a pipe character? If using a regex, can I specify replacing the first space, after the ACCOUNT field, then the space after the FNAME field? However, the first or last names may have a space in them. The last record has a first name that has a space and the other records might have spaces in the last name field. I may be able to get a file(s) that have a list of the last or first names only, and use that as a reference for delimiting.
I code use regex to delimit the first space and the preceding space of the last field. Then use some way of pulling in the first and last name fields putting them in between the account and ID pipe delimited fields.
Unfortunately, I can't get the text file already delimited correctly, thus all this work, but I can get separate files that have only the individual fields in them (e.g, first_name.txt, last_name.txt, id.txt, etc). Then find away to build the dat.txt file with delimiters properly located.
Thank you in advanced for your help,
~Shotokan
I'm trying to modify a text file that has record elements delimited by spaces. But I don't want to replace all of the spaces in each row. Then I need to remove any rows that are missing data in the fields. This is to prepare a file for import into a database using an import utility.
Here's what I have so far, input file looks like this and has several hundred rows.
<code>
0211121253 Mike Dell ID065533
empty_field Bill Apple ID394033
0211121253 Leo De Vinci ID02014
0211121253 Joe Del La Marco ID653335
2033223222 John Michael Smith ID345234
...
</code>
<code>
#!/usr/bin/perl
$header = "ACCOUNT|FNAME|LNAME|ID\n";
@data =();
open INFILE, "< dat.txt"
or die "Cant open input file : $!";
# print $header;
while (<INFILE>) {
chomp;
push (@data, $_);
$_ =~ s/\s/\|/g;
print "$_\n";
}
close INFILE;
</code>
After processing, the output file should look like this:
<code>
ACCOUNT|FNAME|LNAME|ID # <-header added
0211121253|Mike|Dell|ID065533
0211121253|Leo|De Vinci|ID02014 # <-space left in place
0211121253|Joe|Del La Marco|ID653335 #same here
2033223222|John Michael|Smith|ID345234 # <-space left place
...
</code>
The script should also delete any rows with empty elements.
I'm not sure what would work best, the join, split, push, shift/unshift operators, or a regex that replaces the specified space with a pipe character? If using a regex, can I specify replacing the first space, after the ACCOUNT field, then the space after the FNAME field? However, the first or last names may have a space in them. The last record has a first name that has a space and the other records might have spaces in the last name field. I may be able to get a file(s) that have a list of the last or first names only, and use that as a reference for delimiting.
I code use regex to delimit the first space and the preceding space of the last field. Then use some way of pulling in the first and last name fields putting them in between the account and ID pipe delimited fields.
Unfortunately, I can't get the text file already delimited correctly, thus all this work, but I can get separate files that have only the individual fields in them (e.g, first_name.txt, last_name.txt, id.txt, etc). Then find away to build the dat.txt file with delimiters properly located.
Thank you in advanced for your help,
~Shotokan