SUPER-HARD QUESTION about textptr-patindex-readtext

FlashMerlot · Mar 27, 2006

Need advice on the "best-approach" to accomplish this ...

I'm using TEXT var types in a MS SQL 8.0 database. My TEXT fields can contain (literally) up to 2 gb each. I need to find every instance of "Hello World" appearing anyplace in the 2gb text field.

Finding which particular records, happen to contain a TEXT field which happens contains "Hello World" SOMEPLACE ... is actually rather easy. Full-Text search works great to find the record.

HOWEVER ... now let's assume a record is found ... How to I locate EACH INDIVIDUAL INSTANCES of the words "Hello World" burried SOMEPLACE in a 2 gigabyte text bucket? Why? Because I want to return 500charsBEFORE+"Hello World"+AFTERchars500. e.g. not just the words "Hello World" but the CONTEXT-IN-WHICH-THE-WORDS-APPEAR for each instance.

Obviously, using a solution based in the Client is not feasible, since such solution would require over-the-wire-transmissions of theoretically dozens-and-dozens of gigabytes from the server back to the client for additional sub-processing. A solution that works at-the-server is the only reasonable approach? Right?

So ... using TSQL, how do I "walk" a 2 gb text field? Extracting 500+HelloWorld+500 ....

Remember, PATINDEX does not offer a "Start-Looking-Here-Offset" location.

Any advice greatly appreciated!

mrdenny · Mar 28, 2006

I beleive that something like this should do the trick.

Code:

declare @StartChar as bigint
set @StartChar = (select PATINDEX('%Hello World%', TextField) from table where key = 'value')
while @StartChar <> 0
BEGIN
    select substring(TextField, @StartChar, 1000)
    from table
    where key = 'value'

    set @StartChar = (select PatIndex('%Hello World%', Substring(TextField, @StartChar, 20000000000000000000)) from table where key = 'value')
END

This hasn't been tested or anything, but it should put you in the right direction.

Denny
MCSA (2003) / MCDBA (SQL 2000) / MCTS (SQL 2005)

--Anything is possible. All it takes is a little research. (Me)
[noevil]

http://www.mrdenny.com

FlashMerlot · Mar 30, 2006

Rats! Sometimes it's SO SIMPLE!
I've written your sample about 5 times.
Every time I wrote it, it failed!

BUT, I did not use a BIGINT
Changed my code from int to bigint.
Shazam!
Worked first time!

So close, yet so far!

Thanks MrDenny!

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

SUPER-HARD QUESTION about textptr-patindex-readtext

FlashMerlot

Programmer

mrdenny

Programmer

FlashMerlot

Programmer

Similar threads

Part and Inventory Search

Sponsor