Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Extrract Column Names and Values from Text File

Status
Not open for further replies.

NIXDBA

Programmer
May 21, 2012
1
US
I have a log file that I want to extract the field name and the field value
and write them to a text file for importation it a database table for reporting
purposes.

Each record starts with the following Line: DB Table at:


How can I extract the desired data from this file .

Example: dbt_dbid=4 dbt_dbid is the field name, 4 is the field value

dbt_flmode: 0x0000 (0x00000000) dbt_flmode is the field name, 0x0000 (0x00000000) is the field value


This file is not formatted with one record(field) on each line in some instances more than one field is on
the same line


I want to outputevery column from the text file with it's corresponding value in the following format


Column Name Value

dbt_xxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxx

Example: dbt_verstimestamp= May 15 2012 3:37PM dbt_dbname=TestDB

SO how would I grab the date 'May 15 2012 3:37PM' and grab the field dbt_dbname and it value 'TestDB'

then read the next field. With the file formated with more than one field on each line I think the
delimiter for the next field would be dbt_


Sample Text File:

DBCC execution completed. If DBCC printed error messages, contact a user with System Administrator (SA) role.

DBTABLES (ACTIVE):

DB Table at: 0x0000000021986BA0


dbt_dbid=31515 dbt_stat=0xc (0x0008 (DBT_AUTOTRUNC), 0x0004 (DBT_SELBULK))
dbt_extstat=0x0 (0x0000)
dbt_stat2=0xffff8000 (0x8000 (DBT2_MIXED_LOG_DATA))
dbt_stat3=0x20000 (0x00020000 (DBT3_SYSPARTITIONS_EXISTS))
dbt_stat4=0x0 (0x00000000)
dbt_runstat=0x0(0x0000)
dbt_state=0x2(0x0002 (DBST_ACTIVE)) dbt_keep=9
dbt_hdeskeep=0
dbt_next=0x000000002197DE80
dbt_systask_keep=0
dbt_detachxact_keep 0
dbt_dcompver_default=1
dbt_lock=0 dbt_dbaid=1
dbt_verstimestamp= May 1 2012 10:01AM
dbt_dbname=sybmgmtdb
dbt_logrows=48
dbt_lastlogbp=0x00000000256F1338
dbt_logsema=00000000040BEE10
dbt_nextseq=79
dbt_oldseq=12
dbt_dbinfobuf.dbi_logvers=7
dbt_dbinfobuf.dbi_upgdvers=35 ... .dbi_upgd_minor=1720
dbt_dbinfobuf.dbi_dbinfovers=5
dbt_dbinfobuf.dbi_sarg_vers=2
dbt_threshstat=0x0
dbt_thresholds=0x00000000219873B8
dbt_thresh_spin=0x000000002011E300
dbt_maxthresh=256
thc_segment 1 thc_level 947 thc_status 0xe
thc_segment 2 thc_level 16 thc_status 0x5 <-- last chance threshold
thc_segment 2 thc_level 145 thc_status 0xa
dbt_nextid=748526669 dbt_nextidstat=0x0
dbt_dflinfo=0x0000000000000000
dbt_dflstat=0x0
dbt_dumpthreadlock=0
dbt_dbts=0x0000 0x000102d2
dbt_xdesqueue next=0x0000000021986C20 prev=0x0000000021986C20
dbt_xdesqueue_spin=0x000000002011E100
dbt_seqspin=0x000000002011E1C0
dbt_append_log_reqs=0
dbt_append_log_waits=0
dbt_logical_reads=0
dbt_physical_reads=0
dbt_apf_reads=0
dbt_physical_pages_read=0
dbt_physical_writes=0
dbt_physical_pages_written=0
dbt_assigned_cnt=0
dbt_sharable_tab_cnt=0
dbt_lock_reqs=0
dbt_lock_waits=0
dbt_cat_lock_reqs=0
dbt_cat_lock_waits=0
dbt_commits=0x0000 0x000102d2
dbt_commits_HWM=0x0000 0x00000000
dbt_commitsqueue next=0x0000000021986CC0 prev=0x0000000021986CC0
dbt_lastckptts=0x0000 0x00000000
dbt_lwm_ts =0x0000 0x00000008
dbt_desqueue 0x0000000021986F98 next=0x0000000021804DD8 prev=0x000000002190D5F8
dbt_udes=0x00000000219870B0 dbt_mastxlate=0x00000000200407B8
dbt_xlate=0x0000000021986FB0
dbt_dbaudflags=0x0
dbt_deftabaud=0x0
dbt_defviewawd=0x0
dbt_defpraud=0x0
dbt_xstat=0
dbt_slotid=6
dbt_ddlcount=0
dbt_logxlate=0x0000000000000000
dbt_logxlate_maxoff=0
dbt_xclrwritexlate=0x0000000000000000
dbt_protstamp=0
dbt_logflush=78
dbt_delayed_commit_seq=1
dbt_backup_start: 0x0000000021986F60 (dtdays=0, dttime=0) (uninitialized)
dbt_pagelkprom=[lwm=200 hwm=200 pct=100 status=(0x0008 (LKPROM_PAGE), 0x0001 (LKPROM_SERVER)) seqno=1]
dbt_rowlkprom=[lwm=200 hwm=200 pct=100 status=(0x0010 (LKPROM_ROW), 0x0001 (LKPROM_SERVER)) seqno=1]
dbt_singleuser=0x0000000000000000
dbt_seg = 00000000219871B8 segment 0 has no thresholds.
dbt_seg[0]: sg_unreservedpgs=5613
dbt_seg[1]: sg_unreservedpgs=5613, sg_thbelow_idx=1, sg_thbelow=0x00000000219873B8, sg_thabove_idx=0, sg_thabove=0x0000000000000000
dbt_seg[2]: sg_unreservedpgs=5613, sg_thbelow_idx=3, sg_thbelow=0x00000000219873C8, sg_thabove_idx=0, sg_thabove=0x0000000000000000
dbt_logsuspended=0
dbt_repl_context=0000000021987BD8 dbt_repl_spin=000000002011E280
dbt_secondary_truncpg=0
dbt_rep_stat=0x0
dbt_rep_work=0x0, dbt_rep_level=0
dbt_rep_gen_id=0
dbt_sqlrep_threshold=50
dbt_dbcache=0 dbt_sysindcache=-1
dbt_csysindcache=-1 ha_suspect_info=0x0000000000000000
dbt_supergam array
96 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0
dbt_logsema=0x00000000040BEE10
dbt_logsize=5,
dbt_plcspace=0,
dbt_dbinstcount=0
dbt_logallocs: high=0, low=3085
dbt_logallocs_at_recalc: high=-1, low=-1
dbt_logdeallocs_at_recalc: high=-1, low=-1
dbt_logfreespace_from_recalc=0 dbt_pg24objid=20

Async Log service disabled

dbt_systabvers=0x0000000000000000, No translation required for any catalogs

Database Disk Map:

DMAPFM (0x0000000021986D28) contains 1 frame; UNRSVDFM (0x0000000021986E48) contains 1 frame:
lstart is logical page #; vstart is virtual (2K) page #; size is # of logical pages.

dm_segmap: 0x00000007 (0x00000004 (LOG_SEGMENT), 0x00000002 (SEG_DEFAULT), 0x00000001 (SYS_SEGMENT))
dm_lstart=0 (dm_hi_lpage=9471) dm_vstart=0 (dm_hi_vpage=37887)
dm_lsize=9472 (74 Mb) *dbt_unrsvd=5613
dm_vstart=[ 0x0000000021986D50 vpgdevno=4 vpvpn=0 vdisk=0x0000000020D0E030 ]

dbt_durability= 1 (DBT_DURABILITY_FULL)
dbt_rec_options=0x0000 (0x0000)
dbt_temp_durability= 0 (DBT_DURABILITY_UNDEFINED)
dbt_flmode: 0x0000 (0x00000000)
dbt_lobcomplvl: 0
dbt_inrowloblen: 0

Execution time: 0.063 seconds






 
It's an awkward one to parse (no pun intended!), isn't it? Inconsistent field separators (some with "=", some with ":", some with just spaces), etc.

How do you propose to display the values like thc_*, dbi_*, sg_*, next, prev and the dbt_supergram array?

And how do you propose to display lines that don't contain name/value pairs, such as:

Code:
DMAPFM (0x0000000021986D28) contains 1 frame; UNRSVDFM (0x0000000021986E48) contains 1 frame:
lstart is logical page #; vstart is virtual (2K) page #; size is # of logical pages.

I'd start by running it through sed (this example uses GNU sed with -r for extended regular expressions, but you could modify the regexp for normal sed flavours) to at least get the variables on separate lines:

Code:
sed -r '
        s/(thc|dbt|dm)_[a-z]+[: =]/&/g
        s/,$//
' inputfile

Annihilannic
[small]tgmlify - code syntax highlighting for your tek-tips posts[/small]
 
How many files have you got ? It will be faster to modify each file yourself than by a script.
 
Hmm... dunno what happened to my post, it was supposed to look like this:

Code:
sed -r '
        s/(thc|dbt|dm)_[a-z]+[: =]/&/g
        s/,$//
' inputfile

Annihilannic
[small]tgmlify - code syntax highlighting for your tek-tips posts[/small]
 
Damn... it still looks the same. Before the & there should be a backslash and a new line.

Must be the new Tek-Tips posting features... I'll follow up with site owners.

Annihilannic
[small]tgmlify - code syntax highlighting for your tek-tips posts[/small]
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top