Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

replace special characters like 'á' in table not working 1

Status
Not open for further replies.

pms18066

Programmer
Jan 12, 2005
94
US
I need to replace some special characters in a name field. Im pretty sure my syntax is correct because it works when I run it by itself. However when I run the whole procedure it doesnt update. the code is below.. this is part of a larger proc that is called as part of 10 steps. One .sql file is called and that one calls this on. After loading the data I still see the special characters in the field as if it ignored my replace statement. If I run the whole batch, verify the characters exist and then just run these statements, the characters are removed. Is there a certain commit or something I need to do to get these statements to work when run as a batch?

-- cleanup of raw1 data

declare @specchar varchar(10)
set @specchar = 'á'

begin
set @sql ='update raw1 set name=replace(name,''$specchar$'','''')'
set @sql = replace(@sql,'$specchar$', @specchar)
exec ( @sql )
end
begin
set @sql ='update raw1 set Affiliate=replace(Affiliate,''$specchar$'','''')'
set @sql = replace(@sql,'$specchar$', @specchar)
exec ( @sql )
end
begin
set @sql ='update raw1 set Address=replace(Address,''$specchar$'','''')'
set @sql = replace(@sql,'$specchar$', @specchar)
exec ( @sql )
end
begin
set @sql ='update raw1 set ScopeofDisqualification=replace(ScopeofDisqualification,''$specchar$'','''')'
set @sql = replace(@sql,'$specchar$', @specchar)
exec ( @sql )
end
begin
set @sql ='update raw1 set Office=replace(Office,''$specchar$'','''')'
set @sql = replace(@sql,'$specchar$', @specchar)
exec ( @sql )
end



/*
truncate table ident
truncate table aka
truncate table crimes
*/


...The inserts from the raw table into the 3 data tables is performed here -- works fine
 
Are you 100% sure that this code is actually being run? I would check that first.

I have some comments I'd like to make regarding this code (not related to your original question).... if you're interested.

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Id love to hear them. It may be worth mentioning I did have these as straight replace statements, but changed them to exec(sql) while trying to debug this issue. There is no reason why they arent being run though, as everything else in this proc is running as expected except for these updates.

I appreciate your advice!!
 
In the code above, why do we need to use EXEC(@sql) at all? I see nothing dynamic in any of the update statements.
 
you r right, there is no need to do this dynamically.

The original code, which did not work either, was like this...

update raw1 set name=replace(name,'á','')
update raw1 set Affiliate=replace(Affiliate,'á','')
.
.
.



with no begin/end around them either, again trying to figure out why they werent working...

We load the data into the raw1 table and then do any custom changes needed, such as in this case. we then turn around and load this data into three seperate tables using insert statements inside of this same proc, just down further.

Thanks again for your help!!
 
Alright, here we go.

First, like you and markros pointed out, there's no reason to be using dynamic SQL. If you are hardcoding the @specchar value, then there really isn't any security risk associated with this code, and the performance will be roughly the same, so it doesn't really hurt that much, either.

Mostly my comments are centered around performance issues. The correct approach to take (when optimizing this for speed) depends on the data itself. If the Raw1 table only has a dozen rows, the approach you take won't really matter because it's gonna be fast anyway. But, if there are a lot of rows in the table, you should be able to dramatically speed up the replaces.

From a performance perspective, every row in all 5 columns in this table will be affected every time you run this code. This is a potential/probably performance problem because it's unlikely that all your data needs to be affected.

In this case, I suspect the special characters you are replacing do not appear very often in your data set. For example, you may have 1,000 rows and only 3 of them contain this special character. If this is the case, then I would suggest that you add a where clause to each query, like this:

Code:
update raw1 set name=replace(name,'á','')
[!]Where name like '%á%'[/!]

When you run the above query, the execution plan will show a table scan or an index scan. What happens is.... SQL needs to examine every row in the table to determine if there is a match (this is called scanning). If there is a match, then the replace occurs. Without the where clause, every row will be updated. Every one. If the data does not contain á, it will still be updated, it's just getting updated with the same data. If there is an update trigger on this table, the update trigger will fire, further slowing things down. By scanning the table for the special character, and only updating the rows that actually need updating, you will improve performance. You will only notice the improvements if there is a relatively small number of rows that actually need updating.

If many/most/all rows contain your special character, there is still room for improvement. Thinking about the original goal again... we want to replace the special character wherever it appears in any of the 5 columns in this table. There is no reason to update each column separately, though. You could easily do this...

Code:
update raw1 
set    name                   =replace(name,'á',''),
       Affiliate              =replace(name,'á',''),
       Address                =replace(name,'á',''),
       ScopeofDisqualification=replace(name,'á',''),
       Office                 =replace(name,'á','')

Notice, there is no where clause on this query. This will affect each of the 5 columns for every row in the table. But... since it's all done in one query, it will still execute faster than the 5 individual queries.

Basically....

if your special character rarely appears in the data
I recommend 5 individual updates but with WHERE clauses to minimize the number of rows affected.

if your special character appears in most of your data
I recommend updating all 5 columns with just one update statement.

Make sense?

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
In addition to excellent points by George, would it work if you declare your parameter as nvarchar(10) ?
 
perfect sense! Thanks for the clear explanation of how the scanning functions, that is most helpful. However this still doesnt solve the issue as to why the replace isnt working. Ive changed the statements as you mentioned and still the characters are there. Ill show the code as it is now in its entirety....


-- create table to hold raw file header
if exists (select * from dbo.sysobjects where id = object_id(N'[dbo].[raw1h]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table raw1h

create table raw1h ( line varchar(8000) )


-- create table to hold raw file (fields in raw1 based on header info)
if exists (select * from dbo.sysobjects where id = object_id(N'[dbo].[raw1]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table raw1

create table raw1 (Name varchar(100), Affiliate varchar(100), Address varchar(100), ScopeOfDisqualification varchar(100), DisqualificationStartDate varchar(100), DisqualificationEndDate varchar(100), DisqualificationListDate varchar(100), Office varchar(100))

-- save previously known good header
declare @header varchar(8000)
set @header='Name Affiliate Address Scope of Disqualification Disqualification Start Date Disqualification End Date Disqualification List Date Office '


-- import 1 line of raw file (to examine header only)
set @sql ='bulk insert raw1h from ''$rawfile$'' with ( maxerrors=1, lastrow=1)'
set @sql = replace(@sql,'$rawfile$', @rawfile)
exec ( @sql )


-- check if bulk insert returned any rows
if not exists(select * from raw1h)
begin
select @err=1, @errmsg='error reading header'
goto error
end

-- verify header is expected
if not exists(select * from raw1h where line=@header)
begin
select @err=1, @errmsg='unexpected header'
goto error
end


-- import raw file data
set @sql ='bulk insert raw1 from ''$rawfile$'' with (maxerrors=1)'
set @sql = replace(@sql,'$rawfile$', @rawfile)
exec ( @sql )



-- verify some raw data imported
if not exists(select top 10 * from raw1)
begin
select @err=1, @errmsg='no raw data'
goto error
end


-- cleanup of raw1 data

update raw1 set name=replace(name,'"','')
update raw1 set Affiliate=replace(Affiliate,'"','')
update raw1 set Address=replace(Address,'"','')
update raw1 set ScopeofDisqualification=replace(ScopeofDisqualification,'"','')
update raw1 set Office=replace(Office,'"','')

update raw1 set name=replace(name,'á',''),
Affiliate=replace(Affiliate,'á',''),
Address=replace(Address,'á',''),
ScopeofDisqualification=replace(ScopeofDisqualification,'á',''),
Office=replace(Office,'á','')
Where name like '%á%'


/*
truncate table ident
truncate table aka
truncate table crimes
*/




-- process IDENT table

insert into i.....
 
This isn't correct:

[tt][blue]
update raw1 set name=replace(name,'á',''),
Affiliate=replace(Affiliate,'á',''),
Address=replace(Address,'á',''),
ScopeofDisqualification=replace(ScopeofDisqualification,'á',''),
Office=replace(Office,'á','')
Where name like '%á%'
[/blue][/tt]

If the special character does not exist in name, but does exist in any of the other columns, it will not get updated.

Code:
update raw1 set name=replace(name,'á',''),
Where name like '%á%'

update raw1 set Affiliate=replace(Affiliate,'á','')
Where Affiliate like '%á%'

update raw1 set Address=replace(Address,'á','')
Where Address like '%á%'

update raw1 set ScopeofDisqualification=replace(ScopeofDisqualification,'á','')
Where ScopeofDisqualification like '%á%'

update raw1 set Office=replace(Office,'á','')
Where office like '%á%'

How are you running this code? Are you running this in a query window (SQL Server Management Studio)?

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Oops, sorry about that part. I have changed it to have the correct where clauses now.

I am running it from mgmt studio. I have the main script open and run it. It has a call to execute the load script which is where the "problem" code is located. When it is run like this it doesnt work. The main script creates the database and schema and then loads(calls the load script)removes duplicates and then creates some documentation.

When truncate the tables and I run the load script by itself again, from mgmt studio, it works fine. Arghh!! there has to be some disconnect here that is keeping this from working.
 
Is it possible that the data is getting updated in the Raw1 table, but not getting copied over to the other tables (ident, aka, crimes)?

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
OK here goes. It must have something to do with the way the script gets called from the other script..OSQL???

When I changed the special character to Char(225) it works great! BTW, This data was being loaded from a tab delimited file which was generated from an excel spreadsheet. This file is supplied from a vendor so it is out of my control as to what is in the xls file. The following is the new update stmts. I still would appreciate it if someone could tell me WHY SQL server handles it differentley from the script as opposed to running it directly in Mgmt studio though...

update raw1 set name=replace(name,char(225),'')Where name like '%'+char(225)+'%'
update raw1 set Affiliate=replace(Affiliate,char(225),'') where Affiliate like '%'+char(225)+'%'
update raw1 set Address=replace(Address,char(225),'') Where Address like '%'+char(225)+'%'
update raw1 set ScopeofDisqualification=replace(ScopeofDisqualification,char(225),'') Where ScopeofDisqualification like '%'+char(225)+'%'
update raw1 set Office=replace(Office,char(225),'') Where Office like '%'+char(225)+'%'
 
I wonder if this is a character encoding issue. When you save .sql files, the default encoding (on my computer) is "Western European (windows) codepage 1252". That's not to say it's the same for everyone though.

If you open a query window and copy/paste this code....

Code:
declare @specchar varchar(10)
set @specchar = 'á'

then, click file -> save, you will see a "Save File As" dialogue. Take a real close look at the "Save" button. There is actually 2 different clickable areas on this button (horrible design in my opinion). Click the down arrow part of the button and you can choose "save with encoding". Now, if you change the encoding to "US-ASCII - Codepage 20127" and click save, the 'á' will get saved as '?'.

I don't know for sure if this is the actual issue, but it is one possible explanation.


-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Im thinking that the OSQL works is the issue. When the scripts are run they are calling stored procs which use osql to do the work. When I run them directly in mgmt studio they arent. Im recreating the stored procs to use sqlcmd instead and see if that works differently still. Not sure but it has piqued my interest enuff to give it a try.

Thanks for everyones time in trying to figure this out with me!!!
 
Have you considered collation? Make sure that you don't have an accent-insensitive collation going...
 
I tested this using sqlcmd instead of osql and it works great. Whatever the issue is is fixed in sqlcmd. Im thinking that the special character in this case is not being read correctly by the osql engine. Cant explain why but we have one sql 2000 server to convert. Once this happens Im changing the sprocs to use sqlcmd so we dont see this issue again. Ive got my fingers crossed!!

Thanks again for everyones help!!
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top