
Reading files... again :>

Status
Not open for further replies.

Rhoon

Programmer
Mar 23, 2001
55
US
Hey all,

How would one skip the first 3 lines of a file that you need to open? Say the first three lines are always the same, like:

Name ....
Data....
Garbage data...

but the garbage could run longer than that, say 4 or 5 lines. The next line ALWAYS starts with the date, in the form

04/03/2001 .... everything else...

Is there an easy way to tell where to start reading data from?

If I'm not too clear on what I'm trying to get, let me know and I'll try to explain more. Thanks in advance.

Rhoon
 
Well, I'd write a loop that reads lines and throws them away until you find one in the format you're looking for. That should work well enough. Sounds like a good problem for Perl, though.

Disclaimer:
Beware: Studies have shown that research causes cancer in lab rats.
 
1.
If you know exactly how many bytes to skip, you can fseek() to the
correct position.

2.
If the data you don't want is in the form of fixed-size structures,
you can fread() them so that, after the freads, you will
be at the correct offset.

3. If you know the exact number of lines you want to skip,
count "\n" characters as you read and discard everything before the last one.

4. If you are on Unix/Linux, just mmap() the file; searching the
file then boils down to searching a string.

Hope this helps
Do not rejoice that ur code works.
it might be a special case of an error :-(
 
It sounds like you're looking for a line that contains a pattern and once you've hit this pattern, you want to start processing the data. If the file is a text file and the line you're looking for is guaranteed to be distinct from any other lines you read before it, this should be enough to get you started:

/*** incomplete/untested/part pseudocode ***/

#define MAXLINE 100 /* or some suitable number */

/* ... */

char buf[MAXLINE+1];
FILE *fp;
int ok=0;

/* ... */

fp=fopen("data_file.txt","r");

if (fp!=NULL) {
    while (fgets(buf,sizeof buf,fp)!=NULL) {
        /* see if the contents of buf match
         * the pattern
         */
        if (line agrees with pattern) {
            ok=1;
        }
        /* once the pattern has been seen, every line is data */
        if (ok) {
            /* process line */
        }
    }
}

If you're talking about simple pattern matching, it should be trivial to parse the line using one or more of the str* family of functions -- strchr(), strtok(), strstr(), strcspn() etc.
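
As a rough illustration of the "line agrees with pattern" test above, here is a minimal sketch that assumes the first data line always begins with a date shaped like 04/03/2001 (two digits, slash, two digits, slash, four digits). The function names starts_with_date() and find_first_data_line() are made up for this example and are not from the thread.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Returns 1 if line begins with a date shaped like MM/DD/YYYY, else 0.
 * Only the shape is checked, not whether the date is valid. */
int starts_with_date(const char *line)
{
    static const char pattern[] = "dd/dd/dddd"; /* 'd' stands for any digit */
    size_t i;

    for (i = 0; pattern[i] != '\0'; ++i) {
        if (pattern[i] == 'd') {
            /* A short line ends in '\0', which is not a digit */
            if (!isdigit((unsigned char)line[i]))
                return 0;
        } else if (line[i] != pattern[i]) {
            return 0;
        }
    }
    return 1;
}

/* Reads lines from fp until one starts with a date and leaves it in buf.
 * Returns 1 if such a line was found, 0 if end of file came first. */
int find_first_data_line(FILE *fp, char *buf, int size)
{
    while (fgets(buf, size, fp) != NULL) {
        if (starts_with_date(buf))
            return 1;
    }
    return 0;
}
```

After find_first_data_line() returns 1, buf holds the first data line and the following fgets() calls return the rest of the data, so the header can be 3, 4 or 5 lines long without any change to the code.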

If you run into more difficulties, post what you have so far and we'll try to help.

Russ
bobbitts@hotmail.com
 

This is what I have so far. fp1 is the file I'm reading from and fp2 is the file I'm writing to; the printf is just a test to make sure I'm reading the information correctly. This is the data I'm reading out of the file:

Date,Time,Call Leg Setup Time,Gateway Identifier,Connection Id,Call Leg Direction,Call Leg Type,Call Leg Connect Time,Call Leg Disconnect Time,Call Leg Disconnect Cause,Remote Gateway IP Address,Calling-Station-Id,Called-Station-Id,Acct-Status-Type,Description
03/25/2001,00:00:00,04:57:10.775 UTC Sun Mar 25 2001,netnyc36.113.,B4C20D3A 300301CF 0 776EEA8,answer,Telephony,,,,,,,Start,
03/25/2001,00:00:01,04:56:58.747 UTC Sun Mar 25 2001,netnyc36.113.,B4C20D3A 300301AB 0 7768AE0,originate,VoIP,04:57:11.827 UTC Sun Mar 25 2001,04:57:11.827 UTC Sun Mar 25 2001,10,203.91.128.86,,01188031721576,Stop,



Here's the code I'm using. I'm curious whether this is the best way to do it. I haven't been able to get past the host name (netnyc36.113. in the first example); I get stuck reading in B4C20D3A 300301AB 0 7768AE0. As you can see, everything is separated by commas. I was thinking about reading the fields in one at a time, but it's a mess of numbers and characters combined together...

int convert(FILE *fp1, FILE *fp2) {

int i;
int date[3];
float time[3];
float callsetuptime[4];
char datetime1[10], datetime2[5], datetime3[5];
int day, year;
char hostname[50];
char connectID[10], connectID2[10], connectID3[5], connectID4[10];
char CLdir[20];
char CLtype[20];
float CLconnecttime[4];
float CLdisconnectime[4];
int CLDiscause;
int ip[4];
unsigned long CSid;
unsigned long CSDid;
char AStype[10];


fscanf(fp2, "%d/%d/%d,%f:%f:%f,%f:%f:%f %c%c%c %c%c%c %c%c%c %d %d,%13s.,%8s %7s %1s %7s,%c%c%c%c%c%c%c,%s,%f:%f:%f.%f %s,%f:%f:%f.%f %s,%d,%d.%d.%d.%d,%lu,%lu,%s",

&date[0], &date[1], &date[2],
&time[0], &time[1], &time[2],
&callsetuptime[0], &callsetuptime[1], &callsetuptime[2],
&datetime1[0], &datetime1[1], &datetime1[2],
&datetime2[0], &datetime2[1], &datetime2[2],
&datetime3[0], &datetime3[1], &datetime3[2], &day, &year,
&hostname,
&connectID, &connectID2, &connectID3, &connectID4,
&CLdir[0], &CLdir[1],&CLdir[2],&CLdir[3],&CLdir[4],&CLdir[5],&CLdir[6], &CLtype,
&CLconnecttime[0], &CLconnecttime[1], &CLconnecttime[2], &CLconnecttime[3],
&CLdisconnectime[0], &CLdisconnectime[1], &CLdisconnectime[2], &CLdisconnectime[3],
&CLDiscause,
&ip[0], &ip[1], &ip[2], &ip[3],
&CSid, &CSDid, &AStype);

printf("

Date: %d/%d/%d
Time: %.0f:%.0f:%.2f
Call Setup Time: %.0f:%.0f:%.0f %c%c%c %c%c%c %c%c%c %d %d
Host Name: %s
Connect ID: %s %s %s %s
Call Leg Direction: %c%c%c%c%c%c%c
Call Leg Type: %s
Call Leg Connect Time: %f:%f:%f.%f %s
Call Leg Disconnect Time: %f:%f:%f.%f %s
Call Leg Disconnect Cause: %d
Remote Gateway IP: %d.%d.%d.%d
Calling Station ID: %lu
Called Station ID: %lu
Account Status Type: %s\n",

date[0], date[1], date[2],
time[0], time[1], time[2],
callsetuptime[0], callsetuptime[1], callsetuptime[2],
datetime1[0], datetime1[1], datetime1[2],
datetime2[0], datetime2[1], datetime2[2],
datetime3[0], datetime3[1], datetime3[2], day, year,
hostname,
connectID, connectID2, connectID3, connectID4,
CLdir[0], CLdir[1],CLdir[2],CLdir[3],CLdir[4],CLdir[5],CLdir[6], CLtype,
CLconnecttime[0], CLconnecttime[1], CLconnecttime[2], CLconnecttime[3],
CLdisconnectime[0], CLdisconnectime[1], CLdisconnectime[2], CLdisconnectime[3],
CLDiscause,
ip[0], ip[1], ip[2], ip[3],
CSid, CSDid, AStype);


Any help is much appreciated...

Rhoon
 
Whew! That's quite an fscanf() call you have there! :) I didn't test your code and I don't know what the problem is on quick visual inspection. However, you should always check the return value of fscanf() to see if it was able to parse the line as expected.
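
To show what checking the return value looks like, here is a small sketch that parses only the first two fields of a line with sscanf(), which returns the number of items converted just as fscanf() does. The helper name parse_date_time() and the choice of int for every number are assumptions made for this example.

```c
#include <stdio.h>

/* Parses "MM/DD/YYYY,HH:MM:SS" from the start of line.
 * Returns 1 only if all six numbers were converted, 0 otherwise. */
int parse_date_time(const char *line, int date[3], int tod[3])
{
    /* sscanf() returns how many conversions succeeded,
     * so anything other than 6 means the line did not match. */
    int n = sscanf(line, "%d/%d/%d,%d:%d:%d",
                   &date[0], &date[1], &date[2],
                   &tod[0], &tod[1], &tod[2]);

    return n == 6;
}
```

With this check, the column header line (which starts with "Date,Time,...") is rejected because no conversion succeeds, instead of leaving the variables silently unset.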

Personally, I would do this another way. First of all, fscanf() can be a tricky function, particularly when the data in the file isn't EXACTLY as you expect it to be. Secondly, things like this can make code maintenance a nightmare! I prefer not to have to tweak huge format strings when the log file and/or program requirements change at some point.

Also, you might want to think about encapsulating all of your variables inside of a struct. To take things a step further, you might also consider making the code more modular by placing the struct in its own header file and placing functions that manipulate the struct in the corresponding .c file.

Having said that, this looks like a job for Perl ;-)

At any rate, below is an alternative way to do it using strtok(). Note that the code should be divided up into functions and not just placed all inside main(). Hope this will help:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXLINE 500 /* Or whatever agrees with your maximum line size */

/* Instead of *magic* numbers in the struct below */
#define DATE_SZ 10
#define TIME_SZ 8
#define CALL_SETUP_TIME_SZ 32 /* e.g. "04:57:10.775 UTC Sun Mar 25 2001" */

/* And so on */

/* To keep track of which field you're parsing in the current entry */
enum field_index {
    DATE,
    TIME,
    CALL_SETUP_TIME
    /* And so on */
};

/* To hold the different entries */
struct log_entry {
    char date[DATE_SZ+1];
    char time[TIME_SZ+1];
    char call_setup_time[CALL_SETUP_TIME_SZ+1];
    /* And so on */
};

int main(void)
{
    FILE *fp=fopen("crap.txt","r");
    char buf[MAXLINE+1];
    char *word;
    enum field_index idx;
    struct log_entry entry;

    if (fp==NULL) {
        fputs("Failed to open log file\n",stderr);
        return EXIT_FAILURE;
    }

    /* loop through each line in the file */
    while (fgets(buf,sizeof buf,fp)!=NULL) {
        /* Get the 1st field of this line.
         * Note: strtok() treats a run of commas as one delimiter,
         * so empty fields are skipped rather than returned.
         */
        word=strtok(buf,",");
        /* Is this a line we're interested in (1st field looks like a date)? */
        if (word!=NULL && strchr(word,'/')!=NULL) {
            /* Yes, loop through the rest of the fields */
            for (idx=DATE;
                 word!=NULL && *word!='\n' && *word!='\r';
                 ++idx) {
                /* strncat() onto an empty string copies at most
                 * the given number of chars and always terminates
                 */
                switch(idx) {
                case DATE:
                    /* 1st field, date */
                    entry.date[0]='\0';
                    strncat(entry.date,word,DATE_SZ);
                    break;
                case TIME:
                    /* 2nd field, time */
                    entry.time[0]='\0';
                    strncat(entry.time,word,TIME_SZ);
                    break;
                case CALL_SETUP_TIME:
                    /* 3rd field, call setup time */
                    entry.call_setup_time[0]='\0';
                    strncat(entry.call_setup_time,word,CALL_SETUP_TIME_SZ);
                    break;
                default:
                    /* Fields not handled yet */
                    break;
                }
                word=strtok(NULL,",");
            }
            /* At this point, write the structure to a file or
             * whatever you need to do, maybe put this part
             * in a separate function and just pass the struct
             * to it
             */
        }
    }

    fclose(fp);
    return 0;
}
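
One caveat with strtok() on this particular log: the sample lines contain runs of commas (",,,,,,,") for empty fields, and strtok() skips over those, so later fields would shift position. A possible workaround, sketched below under the assumption that fields never contain quoted or escaped commas, is a small splitter that keeps empty fields. The name split_csv() is invented for this example.

```c
#include <stddef.h>
#include <string.h>

/* Splits line in place on commas, keeping empty fields.
 * Stores at most max pointers in fields[] and returns how many were stored.
 * The line is modified: commas and the line ending become '\0'. */
size_t split_csv(char *line, char *fields[], size_t max)
{
    size_t n = 0;
    char *p = line;

    line[strcspn(line, "\r\n")] = '\0'; /* drop the line ending, if any */

    while (n < max) {
        fields[n++] = p;        /* current field starts here */
        p = strchr(p, ',');     /* find the end of this field */
        if (p == NULL)
            break;              /* that was the last field */
        *p++ = '\0';            /* terminate it and step past the comma */
    }
    return n;
}
```

With this, the field index always matches the column in the header line, so fields[13] would be the Acct-Status-Type column whether or not the columns before it are empty.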


Russ
bobbitts@hotmail.com
 
Start reading character by character, checking for EOF and \n. Count three '\n' characters and you've skipped the first three lines.
Easy...
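
For what it's worth, the newline-counting approach might look something like this minimal sketch. The function name skip_lines() is made up for this example; it only helps when the number of header lines is known in advance.

```c
#include <stdio.h>

/* Consumes up to n complete lines from fp, one character at a time.
 * Returns the number of lines actually skipped, which is less than n
 * if end of file (or a read error) comes first. */
int skip_lines(FILE *fp, int n)
{
    int skipped = 0;
    int c;

    while (skipped < n && (c = fgetc(fp)) != EOF) {
        if (c == '\n')
            ++skipped;
    }
    return skipped;
}
```

After skip_lines(fp, 3) returns 3, the next read from fp starts at the first character of the fourth line.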

 