Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Reposted - How to extract specific text from BLOB file 1

Status
Not open for further replies.

Doogster

Technical User
Nov 6, 2006
5
AU
Hi folks

I posted this yesterday in the VBA Forum, but overnight it vanished. I got a notification that MiggyD had responded to it, but as the thread was deleted, I couldn't see the response.....

Anyway, here's the original thread (let's hope it too doesn't vanish):

I have several large BLOB files from a Uniface application which contain text and junk characters. I need to extract certain pieces of text from the BLOB file. Is this possible using VBA using either Excel or Access (versions 2003)?

Here is a small portion of one BLOB file:

ENTRY_LINES=SECTION1=N026F005_SECTION=1N026F015_WEEK_NO
=N026F020_TIME_TYPE=WORKN026F022_ALLOW_TYPE=N02
6F025_DATE=20051229N026F030_TIME_FROM=0000000005300000
N026F035_BREAK_FROM=0000000010300000N026F040_BREAK_T
O=0000000011000000N026F045_BREAK_HOURS=N026F050_TIM
E_TO=0000000016000000

In the above example the text "N026F025_DATE=" occurs many times. I need to extract each instance of that text and the characters after the equals sign (which in most cases are different).

Any hints would be greatly appreciated.

Regards. Doogster
 
Probably you'd be best served by using InStr inside a do while loop. The difficulty will be when parts of the target data are on different (consecutive) lines. There are several ways around this depending on what you can expect about the file and the strings you're looking for. Will it always be the same number of characters after the "="? Do you know that the "N026F025_DATE=xxx...xxx" string will never span lines? How long are the lines?

_________________
Bob Rashkin
 
Thanks Bob

No, there will not always be the same number of chars after the =, which is what makes this so frustrating. The N026.... does span lines.

I was thinking of using the GetChunk method, while using InStr to locate each instance of the data I want to extract. Hopefully I can then copy the data I want them move on to the next instance using a loop.

Cheers. Doogster
 
If the lines aren't too long (and I don't know what length is too long), you should probably always append the next line. That is, use InStr to find where "N026..." is. Then lop off the left part of the string to there. Then append the next line, and so on. You could build up quite a long string if you don't find the target in several lines, but you avoid the spanning problem.

_________________
Bob Rashkin
 
Doogster,
How big is your blob file? If it's not too large you might[ol]
[li]Read the whole file into a [tt]String[/tt][/li]
[li]Identify what the actual characters '' represent.[/li]
[li]Use that to [tt]Split()[/tt] the [tt]String[/tt] into an array.[/li]
[li]Identify the elements in the array you want:[ul]
[li]Excel: Loop through the array looking for '[tt]N026F025_DATE=[/tt]'[/li]
[li]Access: Use [tt]VBA.Filter()[/tt] to create an array with the elements you want.[/li][/ul][/li]
[/ol]

Hope this helps,
CMP

[small]For the best results do what I'm thinking, not what I'm saying.[/small]
(GMT-07:00) Mountain Time (US & Canada)
 
Hi CMP

Thanks for that suggestion. Each BLOB file is around 20,000 characters, so a string is not appropriate.

Cheers. Doogster
 
Open the file binarily, and step through each character, adding it to a string, until you reach whatever delimiter you set up? Once a string contains a complete 'record' save it (maybe a string array?) and start processing the next one?

 
Doogster said:
Each BLOB file is around 20,000 characters, so a string is not appropriate.
Why? See the test code and output below, took less than a second to parse the file.
Code:
Sub ReadFile()
Debug.Print "Start", Now
Const cDelim = ""
Dim f As Integer
Dim s As String
Dim a() As String
f = FreeFile
Debug.Print "Open File", Now
Open cFile For Input As #f
s = Input(LOF(f), #f)
Close #f
Debug.Print "Close File", Now
a = Split(s, cDelim)
Debug.Print "Split", Now
Debug.Print "Char: " & Len(s), "Elements: " & UBound(a)
End Sub
Output
[tt]Start 11/9/2006 6:29:33 AM
Open File 11/9/2006 6:29:33 AM
Close File 11/9/2006 6:29:33 AM
Split 11/9/2006 6:29:33 AM
Char: 20286 Elements: 630[/tt]

Hope this helps,
CMP

[small]For the best results do what I'm thinking, not what I'm saying.[/small]
(GMT-07:00) Mountain Time (US & Canada)
 
CautionMP: Nice, clean, and simple. 20K is pretty small beer for a file these days. Have a star...

Is it worth checking the value returned by LOF() to implement a safety limit on size ?

Steve

[small]"Every program can be reduced by one instruction, and every program has at least one bug. Therefore, any program can be reduced to one instruction which doesn't work." (Object::perlDesignPatterns)[/small]
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top