Parsing file and adding like fields

Status
Not open for further replies.

johngiggs

Technical User
Oct 30, 2002
492
US
I'm trying to parse a file, combine all of the records for each client, and add up the elapsed time, MB, etc.

The sample input looks like this:

Client: server1.domain.com
Backup ID: server1.domain.com_1190696991
Policy: virtualHosts_group1
Policy Type: Standard
Sched Label: AppSvr_cumul_inc
Schedule Type: Cumulative Incremental Backup
Retention Level: 2 weeks (1)
Backup Time: 09/25/2007 01:09:51
Elapsed Time: 000:01:54
Expiration Time: 10/09/2007 01:09:51
Compressed: no
Encrypted: no
Kilobytes: 32
Number of Files: 0
Primary Copy: 1
Image Type: 0 (Regular)
Keyword: (none specified)
Ext Security Info: no
File Restore Raw: no
Image Dump Level: 0
File System Only: no
Object Descriptor: (none specified)
Multiplexed: yes
TIR Available: no
Backup Copy: Standard (0)

Client: server1.domain.com
Backup ID: server1.domain.com_1190696991
Policy: virtualHosts_group1
Policy Type: Standard
Sched Label: AppSvr_cumul_inc
Schedule Type: Cumulative Incremental Backup
Retention Level: 2 weeks (1)
Backup Time: 09/25/2007 01:09:51
Elapsed Time: 000:01:54
Expiration Time: 10/09/2007 01:09:51
Compressed: no
Encrypted: no
Kilobytes: 32
Number of Files: 0
Primary Copy: 1
Image Type: 0 (Regular)
Keyword: (none specified)
Ext Security Info: no
File Restore Raw: no
Image Dump Level: 0
File System Only: no
Object Descriptor: (none specified)
Multiplexed: yes
TIR Available: no
Backup Copy: Standard (0)

Client: server2.domain.com
Backup ID: server2.domain.com_1190696991
Policy: virtualHosts_group1
Policy Type: Standard
Sched Label: AppSvr_cumul_inc
Schedule Type: Cumulative Incremental Backup
Retention Level: 2 weeks (1)
Backup Time: 09/25/2007 01:09:51
Elapsed Time: 000:01:54
Expiration Time: 10/09/2007 01:09:51
Compressed: no
Encrypted: no
Kilobytes: 32
Number of Files: 0
Primary Copy: 1
Image Type: 0 (Regular)
Keyword: (none specified)
Ext Security Info: no
File Restore Raw: no
Image Dump Level: 0
File System Only: no
Object Descriptor: (none specified)
Multiplexed: yes
TIR Available: no
Backup Copy: Standard (0)

The desired output is:

Client    Policy               Date        Time      Run Time   MB  # of Files
=========================================================================================
server1   virtualHosts_group1  09/25/2007  01:09:51  000:01:54  0   0

Each server should have its run time and MB added together across every record for that server found in the file.

So if there are 5 backups for server1, the total backup time and MB will be added up.

I currently have an awk script that parses the file, but it does not add up the total time and MB when a client has multiple backups in the file.

Any help will be greatly appreciated.

Thanks,

John
 
This should get you started.
Code:
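# Split each line into a label ($1) and a value ($2) at ": ".
BEGIN { FS = ": +" }

# Short host name: everything before the first "."
"Client" == $1 { server = substr($2,1,index($2,".")-1)
  keys[ server ]++ }   # number of backup records seen for this server
# Policy and backup time keep the last value seen for the server.
"Policy" == $1 { data[ server, "policy" ] = $2 }
"Backup Time" == $1 { data[ server, "time" ] = $2 }
# Elapsed time (in seconds) and kilobytes are summed per server.
"Elapsed Time" == $1 { data[ server,"elapsed"] += secs($2) }
"Kilobytes" == $1 { data[ server, "kbytes" ] += $2 }

END {
  # Columns: server, policy, backup time, total seconds, total MB, record count
  for ( key in keys )
    print key, data[ key, "policy" ], data[key,"time"],
      sprintf( "%4d", data[key,"elapsed"] ),
      sprintf( "%5.2f", data[key,"kbytes"]/1024),
      sprintf( "%3d", keys[ key ] )
}

# Convert "HHH:MM:SS" to seconds; n, array and i are local variables.
function secs( str,      n, array, i )
{ split( str, array, /:/ )
  for (i=1; i<4; i++)
  { n *= 60
    n += array[i]
  }
  return n
}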
I hope this helps.
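The script above prints the summed elapsed time as a raw number of seconds, while the desired output shows the run time as HHH:MM:SS. Below is a minimal sketch of converting the total back, assuming the same "Elapsed Time: HHH:MM:SS" input lines. The names `total_elapsed` and `hms` are hypothetical helpers for illustration and are not part of the original script.

```sh
# total_elapsed: hypothetical wrapper; reads "Elapsed Time: HHH:MM:SS" lines
# on stdin and prints their sum in the same HHH:MM:SS form.
total_elapsed() {
  awk '
    BEGIN { FS = ": +" }
    $1 == "Elapsed Time" { total += secs($2) }
    END { print hms(total) }

    # "HHH:MM:SS" -> seconds (same idea as secs() in the post above)
    function secs(str,    n, part, i) {
      split(str, part, ":")
      n = 0
      for (i = 1; i <= 3; i++) n = n * 60 + part[i]
      return n
    }

    # seconds -> "HHH:MM:SS" (hypothetical helper, not in the original script)
    function hms(t) {
      return sprintf("%03d:%02d:%02d", int(t / 3600), int((t % 3600) / 60), t % 60)
    }
  '
}

# Two backups of 1 min 54 s each, as in the sample input for server1.
printf 'Elapsed Time: 000:01:54\nElapsed Time: 000:01:54\n' | total_elapsed
# prints 000:03:48
```

In the full script, the same `hms()` call could replace the `sprintf( "%4d", ... )` on the elapsed column.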
 
futurelet,

Thank you for your help! I tweaked it a bit and it's working great.

John
 
After re-reviewing the output, I noticed that some of the information for some of the "Client"s isn't being displayed although all of the information is in the input file.

Does anyone see any flaws in the setup of this awk script?

Any help would be greatly appreciated.

Thanks,

John
 
Can you give a more detailed description of the problem? Or show some input data from which the program generates bad output?
 
futurelet,

I think I figured out what the problem was. Some of the "Client"s do not have .domain.com, so when the awk script was parsing the file, it was ignoring all clients without that characteristic.

I have since updated it to use:

Code:
"Client" == $1 { if ($2 ~ ".com") server = substr($2,1,index($2,".")-1); else server = $2
  keys[ server ]++ }

instead of

Code:
"Client" == $1 { server = substr($2,1,index($2,".")-1)
  keys[ server ]++ }

I think I'm all set now.

Thanks,

John
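One caveat with the `$2 ~ ".com"` test: the string on the right of `~` is used as a regular expression, so the `.` matches any character and the pattern is not anchored to the end of the name. A dotless name such as `telecom1` would match and end up as an empty key, while a dotted name that does not end in `.com` would keep its full domain. Below is a sketch of a stricter check, assuming the goal is simply "keep everything before the first dot, if there is one". The wrapper name `short_name` and the hosts `telecom1` and `db7.example.net` are made up for illustration.

```sh
# short_name: hypothetical wrapper; reads "Client: <host>" lines on stdin
# and prints the host name up to (not including) the first dot.
short_name() {
  awk '
    BEGIN { FS = ": +" }
    $1 == "Client" {
      dot = index($2, ".")          # 0 when the name has no dot at all
      server = (dot > 0) ? substr($2, 1, dot - 1) : $2
      print server
    }
  '
}

printf 'Client: server1.domain.com\nClient: telecom1\nClient: db7.example.net\n' | short_name
# prints:
#   server1
#   telecom1
#   db7
```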
 
I'm glad you were able to fix it. Another approach would be to make sure that every client has a ".".
Code:
server = substr($2,1,index($2 ".", ".")-1)
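The reason this works: `index()` searches the name with a "." appended, so a name without a dot reports its first dot at position length + 1, and `substr()` then returns the whole name. A small sketch to try it on its own, assuming one host name per input line; `strip_domain` is a hypothetical wrapper and `server3` is a made-up dotless host.

```sh
# strip_domain: hypothetical wrapper around the one-liner above; reads one
# host name per line and prints the part before the first dot.
strip_domain() {
  awk '{ print substr($0, 1, index($0 ".", ".") - 1) }'
}

printf 'server1.domain.com\nserver3\n' | strip_domain
# prints:
#   server1
#   server3
```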
 