Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Remove Text from Memo

Status
Not open for further replies.

smilbert

IS-IT--Management
Aug 28, 2006
36
US
I've migrated data from several systems into a new MS SQL server. One of the fields contains the text pulled from .TXT and .DOC files.

There are various series' of text that I'd like to remove from the field (or replace with ""). I'm pretty comfortable with Enterprise Manager, but pretty much a newbie to writing queries and replacing text.

Samples of what I'd like to change:
>3 Carriage Returns with 2 CR
TAB+TAB+"420 Libbey Parkway"
"? Fax: (781) 340-0503"
Text from "Patient : " until "Clinician :" (with wildcard text between)

Any help on the best way to accomplish this (for a newbie) would be greatly appreciated.

THANKS! -Scott
 
See Books Online topic CHAR, a function that returns a character given a number. It is useful to represent things like CR in queries, for example, CHAR(54) is the ASCII character 6 and CHAR(13) is a carriage return.

Here are some ways to use REPLACE() and a couple other string functions.
Code:
SELECT a_text_column, REPLACE(a_text_column, CHAR(13), '')
FROM MyTable
WHERE CHARINDEX( CHAR(13), a_text_column) > 0
   OR PATINDEX( '%' + CHAR(13) + '%', a_text_column ) > 0

Once you devise a SELECT statement like this that gives the result you wish, write an UPDATE query using the REPLACE expression.

Replacing text between Patient and Clinician is more challenging. More later.
 
This query has a REPLACE() expression which yields an edited result. It also includes the various pieces of that complex expression to show what is being calculated. You might use these in developing the REPLACE expression, but they are not needed in the UPDATE query which you would eventually use to edit the text.
Code:
SELECT messageid
	, PATINDEX('%Patient: %', message)
	, PATINDEX('% Clinician%', message)
	, (   PATINDEX('% Clinician%', message) - 
                 PATINDEX('%Patient: %', message)   )
	, (   PATINDEX('% Clinician%', message) - 
                 PATINDEX('%Patient: %', message) - 8   )
	, SUBSTRING( message, 
                     (PATINDEX('%Patient: ', message) + 8), 
                     (PATINDEX('% Clinician%', message) 
                         - PATINDEX('%Patient: %',  message) 
                         - 8 )
                   )
	, REPLACE(CONVERT(VARCHAR(8000), message), 
                  SUBSTRING( message, 
                             (PATINDEX('%Patient: %', 
                                 message) + 8),
                             (PATINDEX('% Clinician%', 
                                  message)
                                  - PATINDEX('%Patient: %',
                                              message) 
                                  - 8 )
                           ), 
                  '')
FROM MyTable
WHERE messageid = 469808
  AND (PATINDEX('% Clinician%', message) 
       - PATINDEX('%Patient: %', message) 
       - 8) > 0

The central issue that makes this so difficult is that REPLACE(string_expression_to_edit, string_to_find, replacement_string) does not allow wild cards or pattern matching in the string_to_find. So we must calculate the string to find for each row.

The caluculation finds the beginning and ending positions of Patient and Clinician using PATINDEX which allows pattern matching.

Another limitation of REPLACE is that it must work on a character datatype, it does not work on TEXT, and SQL Server does not automatically CAST TEXT to VARCHAR.

The WHERE clause limits the damage, I mean the editing, to rows which contain the pattern we are looking for. Otherwise messages without that pattern will generate an error because the calculated length will be negative, 0 - 0 -8 .

Finally, notice that I developed this on a single message, 469808. I strongly advise you to similarly limit the number of rows in the UPDATE statement also. Because, data has a way of surprising us, the solution we think will work usually does something wacky somewhere because we could not envision all the permutations of data that matched our expressions.

With that thought in mind, you should also make a copy of the table before you start UPDATEing, because these changes are impossible to reverse. And something will go wrong, that I can guarantee.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top