Text file issues - reposted

Status
Not open for further replies.

thiagarr

Technical User
Oct 20, 2006
86
US
K,

Thank you for the post. I had a look after posting, and though I was not able to scroll and see the whole text, I was
able to view some of the code, and when I copied and pasted the data into Notepad, it was complete. I have now word-wrapped
the whole thing and removed the unwanted parts of the text. I hope this reads much better than the previous post.

My apologies for the inconvenience caused to all.

I use CR XI and the data source is a UNIX log file that has about 200,000 lines. Sample data is as follows:

Code:
2007.01.09  13:41:30 Client request = <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE GetUserRequest SYSTEM
"GetUserRequest.dtd"><GetUserRequest><CommonRequest CustomerSubType="FSM" CustomerType="FSM"><TxNumber>802</TxNumber>

2007.01.09  13:41:30 Client request = <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE GetUserRequest SYSTEM
"GetUserRequest.dtd"><GetUserRequest><CommonRequest CustomerSubType="FSM" CustomerType="FSM"><TxNumber>802</TxNumber>

2007.01.09  13:41:32 Client request = <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE GetCountryProhibitsRequest 
SYSTEM"GetCountryProhibitsRequest.dtd"><GetCountryProhibitsRequest><CommonRequest CustomerSubType="WEBSITE" 

2007.01.09  13:41:34 BACKEND response sending to Servlet = <?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE EditCommodityCountryReply SYSTEM "EditCommodityCountryReply.dtd">

<EditCommodityCountryReply><CommonReply><TxNumber>963</TxNumber><TxUUID>00g00kafac</TxUUID></CommonReply><Error>
<ErrorCode>77003</ErrorCode><ErrorMsg/></Error></EditCommodityCountryReply>

2007.01.09  13:41:59 Invoking method TradeXpress.CustomerProfile.CustomerProfileEJB_jruilg_ELOImpl.retrieveCommodityRecords

2007.01.09  13:41:59 weblogic OBJECTBROKER.PROCESSMESSAGE  TX= 851 Invoke time= 7

2007.01.09  13:42:00 BACKEND response sending to Servlet = <?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE DetermineDocsForCommodityReply SYSTEM "DetermineDocsForCommodityReply.dtd">

<DetermineDocsForCommodityReply><CommonReply><TxNumber>920</TxNumber><TxUUID>g0g00o4f17</TxUUID></CommonReply>
<Error><ErrorCode>55002</ErrorCode><ErrorMsg>[854]</ErrorMsg></Error></DetermineDocsForCommodityReply>

2007.01.09  13:42:00 OBJECTBROKER.PROCESSMESSAGE  Invoking method TradeXpress.DocDetermination.DocDeterminationEJB_jisdhc_ELOImpl.determineDocsforCommodity


I inserted a DUMMY blank line after every line in the code shown here so that the original text lines can be identified
properly (the lines are long enough that they wrapped in the Preview screen, so I added carriage returns to make the code
viewable on screen). Neither the blank lines nor these extra carriage returns are in the source file.

Out of these lines, all I am interested in is finding the lines that contain the text <ErrorCode> and using each such
line together with the two previous lines, concatenated in the original order. For example, the first occurrence will
become as follows (the log file somehow seems to insert carriage returns where they should not be).


Code:
2007.01.09  13:41:34 BACKEND response sending to Servlet = <?xml version="1.0" encoding="UTF-8"?><!DOCTYPE
EditCommodityCountryReply SYSTEM "EditCommodityCountryReply.dtd"><EditCommodityCountryReply><CommonReply><TxNumber>
963</TxNumber><TxUUID>00g00kafac</TxUUID></CommonReply><Error><ErrorCode>77003</ErrorCode><ErrorMsg/></Error>
</EditCommodityCountryReply>
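
The concatenation described above can be sketched outside CR. This is only a minimal Python illustration, assuming that every `<ErrorCode>` line is preceded by exactly two lines belonging to the same message, as in the sample; the function name `error_records` and the `context` parameter are made up for the example.

```python
from collections import deque

def error_records(lines, context=2, marker="<ErrorCode>"):
    """Yield each line containing `marker`, prefixed by the `context` lines before it."""
    window = deque(maxlen=context)  # remembers only the most recent previous lines
    for raw in lines:
        line = raw.rstrip("\r\n")  # drop the stray carriage returns / newlines
        if marker in line:
            yield "".join(list(window) + [line])
            window.clear()
        else:
            window.append(line)

# Lines taken from the sample data in this post (without the dummy blank lines).
sample = [
    '2007.01.09  13:41:34 BACKEND response sending to Servlet = <?xml version="1.0" encoding="UTF-8"?>',
    '<!DOCTYPE EditCommodityCountryReply SYSTEM "EditCommodityCountryReply.dtd">',
    '<EditCommodityCountryReply><CommonReply><TxNumber>963</TxNumber><TxUUID>00g00kafac</TxUUID>'
    '</CommonReply><Error><ErrorCode>77003</ErrorCode><ErrorMsg/></Error></EditCommodityCountryReply>',
    '2007.01.09  13:41:59 weblogic OBJECTBROKER.PROCESSMESSAGE  TX= 851 Invoke time= 7',
]

records = list(error_records(sample))
```

Note that the match is case-sensitive here, so the marker has to be spelled `<ErrorCode>` as it appears in the log.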


I have CR text driver defined to read lines that are 2,000 characters long in FIXED format while reading this file.

For want of any other ideas, I started as follows:

@concat:

if left({error35_log.Title},9) = "<!DOCTYPE" then
previous({error35_log.Title}) & {error35_log.Title} & next({error35_log.Title})

From here, I use another formula to identify the lines with errorcode in them (the lines that I really want for use in
the report):

@errorcode_data:
if instr({@concat},"<ErrorCode>",1) <> 0
then
{@concat}
else
"NODATA" //NODATA is used to suppress the unwanted lines in Details section

After this step, I have different formulas to extract the information I need to report:

@Date
left({@errorcode_data},10)

@errorcode_string:
((extractstring({@errorcode_data},"<ErrorCode>","</ErrorCode>")))

@TXN_string:
(extractstring({@errorcode_data},"<TxNumber>","</TxNumber>"))
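
For anyone checking the expected values outside CR, the three extractions above can be mimicked roughly as follows. This is a Python stand-in, not Crystal syntax; `extract_string` is a hypothetical helper, and returning an empty string when a delimiter is missing is a choice made for this sketch rather than a statement about how Crystal's ExtractString behaves.

```python
def extract_string(text, start, end):
    """Return the text between the first `start` and the next `end`, or "" if absent."""
    i = text.find(start)
    if i == -1:
        return ""
    i += len(start)
    j = text.find(end, i)
    return text[i:j] if j != -1 else ""

# The concatenated record shown earlier in this post.
record = (
    '2007.01.09  13:41:34 BACKEND response sending to Servlet = '
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<!DOCTYPE EditCommodityCountryReply SYSTEM "EditCommodityCountryReply.dtd">'
    '<EditCommodityCountryReply><CommonReply><TxNumber>963</TxNumber>'
    '<TxUUID>00g00kafac</TxUUID></CommonReply><Error><ErrorCode>77003</ErrorCode>'
    '<ErrorMsg/></Error></EditCommodityCountryReply>'
)

date_text = record[:10]                                            # like left(...,10)
error_code = extract_string(record, "<ErrorCode>", "</ErrorCode>")
txn_number = extract_string(record, "<TxNumber>", "</TxNumber>")
```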

I tried using tonumber(extractstring({@errorcode_data},"<ErrorCode>","</ErrorCode>")) in the above formula and I get
the error 'non-numeric values', but when I use tonumber({@errorcode_string}) in a separate formula, it works fine. Any ideas?

I am sure there are more straightforward methods to achieve the above and I can use any ideas / help in doing that.

I thought a record selection formula would be the best approach, but I am not sure if I can combine the three lines into
one while doing that. It would make the report run faster, as normally only about 1% of the total lines have the error
data.

Also, I want to group on the errorcode and TxNumber fields, but I do not get an option to use the formulas in a group (only
my original field is available for grouping or summary). I would like to know how I can achieve that.

I would appreciate any help / guidance I can get on this.
Thank you very much to all of you for your support.

TR
 
Are the lines wrapped in the file?

I'd probably opt to read the file into a database, perhaps an MS Access table, and then report off of that.
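
As a rough illustration of the database route, here is a sketch using SQLite from Python in place of MS Access. SQLite, the table name `error_log`, and the column names are all assumptions for the example, and whether CR can report off such a file depends on the drivers available. The two rows are the error records from the sample data.

```python
import sqlite3

# (date, time, TxNumber, ErrorCode) for the two error records in the sample data.
rows = [
    ("2007.01.09", "13:41:34", 963, 77003),
    ("2007.01.09", "13:42:00", 920, 55002),
]

conn = sqlite3.connect(":memory:")  # use a file path instead to keep the data
conn.execute(
    "CREATE TABLE error_log ("
    "log_date TEXT, log_time TEXT, tx_number INTEGER, error_code INTEGER)"
)
conn.executemany("INSERT INTO error_log VALUES (?, ?, ?, ?)", rows)
conn.commit()

# Once each record is a real row, grouping is an ordinary query.
counts = conn.execute(
    "SELECT error_code, COUNT(*) FROM error_log "
    "GROUP BY error_code ORDER BY error_code"
).fetchall()
```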

-k
 
K,

Thank you.

No, the data is not wrapped in the file. The original file has a carriage return character at the end of each line in the file. I viewed the characters in the TextPad editor. The problem is that these characters also appear in the wrong places (as mentioned). Unfortunately, we also do not have access to an Access database.

I have no issue in reading the long lines using a CR Text driver configured to read up to 4,000 characters in a line of Fixed type (as mentioned in my post).

I also tried a record selection formula to bring only the lines with "<errorcode>" in them. That works fine and I am able to create the formulas I mentioned and do grouping based on the formulas (using the exact logic). Only issue here is the date field is missing.

If I can sort out the grouping issue in the sample I posted, I will get what I am looking for.


Thanks and regards,

TR
 
To be honest, your post is too time consuming to go through, and you're overthinking this.

What we need is example data and expected output. Posting numerous formulas that aren't providing the desired results bores me and I lose interest. What is to be gained from showing us what doesn't work?

So if I understand this correctly, you want to parse out some values and use some values as a group field.

Using the Previous function will prevent the formula from being used as a group, so you need to rethink your approach. And don't insert carriage returns; it doesn't make sense to bugger up the data.

In general, you need to ALWAYS return a value for every row if you want to group on it, even if it's "".

So, test for values, and then use an if to return values:

//date
left({table.field},10)

//@errorcode_string:
extractstring({table.field},"<ErrorCode>","</ErrorCode>")

//@TXN_string:
extractstring({table.field},"<TxNumber>","</TxNumber>")

You can group on the above.
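
The "return a value for every row" rule can be pictured with a small Python sketch (not Crystal syntax). The helper name `error_code_key` is made up for the example; the point is only that the key is computed from the row itself and falls back to "" when there is nothing to extract.

```python
from collections import defaultdict

def error_code_key(line):
    """Group key computed from this row alone; "" when the row has no error code."""
    start, end = "<ErrorCode>", "</ErrorCode>"
    i = line.find(start)
    if i == -1:
        return ""
    j = line.find(end, i)
    return line[i + len(start):j] if j != -1 else ""

# A few physical lines from the sample data.
lines = [
    '2007.01.09  13:41:59 weblogic OBJECTBROKER.PROCESSMESSAGE  TX= 851 Invoke time= 7',
    '<EditCommodityCountryReply><CommonReply><TxNumber>963</TxNumber><TxUUID>00g00kafac</TxUUID>'
    '</CommonReply><Error><ErrorCode>77003</ErrorCode><ErrorMsg/></Error></EditCommodityCountryReply>',
    '<!DOCTYPE DetermineDocsForCommodityReply SYSTEM "DetermineDocsForCommodityReply.dtd">',
    '<DetermineDocsForCommodityReply><CommonReply><TxNumber>920</TxNumber><TxUUID>g0g00o4f17</TxUUID>'
    '</CommonReply><Error><ErrorCode>55002</ErrorCode><ErrorMsg>[854]</ErrorMsg></Error>'
    '</DetermineDocsForCommodityReply>',
]

groups = defaultdict(list)
for line in lines:
    groups[error_code_key(line)].append(line)  # every row lands in some group
```

Rows without an error code all fall into the "" group, which can then be suppressed or filtered out.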

So stop overcomplicating this, read the data where appropriate, and filter using the record selection as required, such as:

left({table.field},1) <> "<"

You can also convert the date to a real date, etc.
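
For reference, the date conversion looks roughly like this in Python (a sketch, not Crystal syntax). It assumes the line starts with a `yyyy.mm.dd hh:mm:ss` stamp as in the sample; `parse_log_timestamp` is a name invented for the example.

```python
from datetime import datetime

def parse_log_timestamp(line):
    """Parse the leading 'yyyy.mm.dd  hh:mm:ss' stamp of a log line.

    Raises ValueError for lines that do not start with a timestamp.
    """
    # split() ignores how many spaces separate the date from the time
    date_part, time_part = line.split()[:2]
    return datetime.strptime(f"{date_part} {time_part}", "%Y.%m.%d %H:%M:%S")

stamp = parse_log_timestamp(
    "2007.01.09  13:41:59 weblogic OBJECTBROKER.PROCESSMESSAGE  TX= 851 Invoke time= 7"
)
```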

Anyway, this should get you close.

-k
 
K,

Thank you. My apologies on the overkill of the data in the post.

Your explanation about the Previous function answers my question on why I am not able to group on the formula based on it. Is there any way I can concatenate the three lines in the file based on a condition in the line, instead of using the Previous and Next functions? Maybe I will declare a variable for each one and concatenate the variables?

I inserted carriage returns only in the example data I posted and not in the data source file.




Thanks and regards,

TR
 
Oooooh, I see, I thought you had inserted a CR in the data.

You need to define a means to extract data from each row as it's read if you require grouping.

Every row must have a value, and no, you can't concatenate data and then treat it as a single row of data within Crystal, hence my earlier suggestion of reading it into a database.

-k
 
K,

Thank you.

I realize your point now. I will keep working at it and post back, if I solve it.

I am also looking at REXX scripting to work on the data source file to bring only the concatenated lines I need from the file into another file, which I can use in CR.
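
The same preprocessing idea, sketched in Python rather than REXX. It assumes that every real log entry starts with a `yyyy.mm.dd hh:mm:ss` stamp and that any line without one is a continuation of the entry before it; the function names and file paths are hypothetical.

```python
import re

# A logical record starts with a stamp such as "2007.01.09  13:41:34".
TIMESTAMP = re.compile(r"^\d{4}\.\d{2}\.\d{2}\s+\d{2}:\d{2}:\d{2}")

def logical_records(lines):
    """Join each timestamped line with the continuation lines that follow it."""
    current = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if TIMESTAMP.match(line) and current:
            yield "".join(current)
            current = []
        current.append(line)
    if current:
        yield "".join(current)

def filter_errors(lines, marker="<ErrorCode>"):
    """Yield only the joined records that contain `marker`."""
    return (rec for rec in logical_records(lines) if marker in rec)

def write_filtered(src_path, dst_path):
    """Write one joined error record per line to dst_path (paths are placeholders)."""
    with open(src_path, encoding="utf-8", errors="replace") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for rec in filter_errors(src):
            dst.write(rec + "\n")

# Lines from the sample data, as they would be read from the file.
sample = [
    '2007.01.09  13:41:34 BACKEND response sending to Servlet = <?xml version="1.0" encoding="UTF-8"?>\n',
    '<!DOCTYPE EditCommodityCountryReply SYSTEM "EditCommodityCountryReply.dtd">\n',
    '<EditCommodityCountryReply><CommonReply><TxNumber>963</TxNumber><TxUUID>00g00kafac</TxUUID>'
    '</CommonReply><Error><ErrorCode>77003</ErrorCode><ErrorMsg/></Error></EditCommodityCountryReply>\n',
    '2007.01.09  13:41:59 weblogic OBJECTBROKER.PROCESSMESSAGE  TX= 851 Invoke time= 7\n',
    '2007.01.09  13:42:00 BACKEND response sending to Servlet = <?xml version="1.0" encoding="UTF-8"?>\n',
    '<!DOCTYPE DetermineDocsForCommodityReply SYSTEM "DetermineDocsForCommodityReply.dtd">\n',
    '<DetermineDocsForCommodityReply><CommonReply><TxNumber>920</TxNumber><TxUUID>g0g00o4f17</TxUUID>'
    '</CommonReply><Error><ErrorCode>55002</ErrorCode><ErrorMsg>[854]</ErrorMsg></Error>'
    '</DetermineDocsForCommodityReply>\n',
]

filtered = list(filter_errors(sample))
```

Because each output line already carries its own date, error code and TxNumber, every row in the new file has its own values, which is what grouping in CR requires.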

Thanks and regards,

TR
 
Hi,

I just want to bounce a thought I had and see if it is workable.

I can get the Recordnumber from the system though the data source has only one field. Is it possible to display a record based on the Record Number?

For example, if a condition is True and I obtain the record number, is it possible to display, or use in a formula, the next record based on the (record number + 1) formula? This way, I can extract the strings I need from here. If yes, will I still be able to group on the extracted fields?

I hope this makes sense.

Thanks and regards,

TR
 
If you use the NEXT or PREVIOUS function you can't group by that field.

But yes, you can use the recordnumber field.

-k
 
Hi K,

Thank you. I will keep working on that.

I searched the forum for ideas on how to display a record based on the recordnumber, with no luck.

Basically, in the formula, I want to use the field where the recordnumber = X + 1, where X is the recordnumber of the field matching my original search criteria (thus simulating the NEXT function).

Any thoughts on how I can achieve that?

Thank you for your patience and support.



Thanks and regards,

TR
 
I guess there's no way to get this across: each row needs its own value if you are going to perform grouping.

If you reference another value, it is no longer that row's value, hence no grouping.

I think that you should spend more time analyzing how you can use the data as is; there must be rules there you can code for.

-k
 
K,

Thank you for the details and for staying with me to help out. I will keep at it and post back when I find an answer.

:)

Thanks and regards,

TR
 