Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations derfloh on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Test for profanity

Status
Not open for further replies.

tmcneil

Technical User
Nov 17, 2000
294
US
Hi All,

I'm close to buttoning up a stored procedure that parses out words in a sentence into a word array. I'm using a SQL statement to see if each word in that sentence is kosher. If there are any profanities, then I would like to catch it and print the list of those words onto the GUI.
Code:
PROCEDURE RestrictedContent(InMessage  IN VARCHAR2,
                            pList      OUT VARCHAR2,
				            ErrorMsg   OUT VARCHAR2,
						    ErrorCode  OUT NUMBER)
AS
    warray     WORD_ARRAY;
    word       VARCHAR2(100) := '';
    rword      VARCHAR2(100) := '';
BEGIN
  -- Initialize ErrorCode and ErrorMsg
  ErrorCode        := 0;
  ErrorMsg         := 'Success';
  
  IF InMessage IS NULL THEN
      RETURN;
  END IF;
  
  UTILITIES_PKG.fprintf('debug.txt', 'RestrictedContent: InMessage: ' || InMessage);
  
  -- Call ArrayFromContent to create word array
  ArrayFromContent(InMessage, warray);
  
  -- Print out warray to debug
  FOR i in 1..warray.COUNT LOOP
    UTILITIES_PKG.fprintf('debug.txt', 'RestrictedContent: windex: ' || i);
    UTILITIES_PKG.fprintf('debug.txt', 'RestrictedContent: warray: ' || warray(i));
  END LOOP;
  
  pList := '';
  -- Compare each word to RESTRICTED_WORDS table
  FOR i in 1..warray.COUNT LOOP
    word := warray(i);
        
    BEGIN
      SELECT RESTRICTED_WORD INTO rword
      FROM RESTRICTED_WORDS
      WHERE LOWER(RESTRICTED_WORD) LIKE LOWER('%'||word||'%');
      
      UTILITIES_PKG.fprintf('debug.txt', 'RestrictedContent: Match: '  || rword || ' ' || warray(i));
      
      --  Build comma-delimited list, plist, of profanity words...
	  IF LENGTH(rword) > 0 THEN
	    IF LENGTH(pList) > 0 THEN
	      pList := CONCAT(pList, ', ');
	    END IF;
	    pList := CONCAT(pList, rword);
	  END IF;
	  
	  UTILITIES_PKG.fprintf('debug.txt', 'RestrictedContent: pList: '  || pList);
      
    EXCEPTION
    WHEN OTHERS
    THEN
      ErrorMsg  := 'Select Restricted Words, Did not find a match!';
      UTILITIES_PKG.fprintf('debug.txt', 'RestrictedContent: Error: ' || ErrorMsg);
    END;
  END LOOP;

EXCEPTION
WHEN OTHERS
THEN
    UTILITIES_PKG.fprintf('debug.txt', 'RestrictedContent: Error: ' || ErrorMsg);
    ErrorMsg  := 'Restricted Words failed!';
    ErrorCode := -20000;
    --RAISE_APPLICATION_ERROR(-20000, 'Restricted Words timeout!');

END RestrictedContent;


PROCEDURE ArrayFromContent(InMessage IN VARCHAR2,
						   warray    IN OUT WORD_ARRAY)
AS
    word       VARCHAR2(100);
    pos        NUMBER;
    windex     NUMBER;
    Content    VARCHAR2(512);
BEGIN

  windex  := 0;
  
  -- Remove all leading and trailing spaces
  --Content := LTRIM(RTRIM(InMessage));
  Content := TRIM(BOTH ' ' FROM InMessage);
  
  -- Use TRANSLATE to replace numbers and some punctuation with empty spaces
  Content := TRANSLATE(Content,'0123456789-?!/\,;:(){}"',' ');
  
  -- Use REPLACE to replace remaining punctuation with empty spaces
  Content := REPLACE(Content, CHR(10), ' ');  --linefeed character
  Content := REPLACE(Content, CHR(13), ' ');  --carriage return character
  Content := REPLACE(Content, '@', ' ');
  Content := REPLACE(Content, '.', ' ');
  Content := REPLACE(Content, 'http', ' ');
  Content := REPLACE(Content, '[URL unfurl="true"]www',[/URL] ' ');
  
  UTILITIES_PKG.fprintf('debug.txt', 'ArrayFromContent: Content: ' || Content);
  
  pos := INSTR(Content, ' ');

  --  Build array of words from message content...
  WHILE pos > 0 LOOP
  
    -- Fetch first word from Content...
    -- Trim off any trailing spaces from word
	word := RTRIM(SUBSTR(Content, 1, pos-1));
	--UTILITIES_PKG.fprintf('debug.txt', 'ArrayFromContent: word:' || word || ':');
    windex := windex + 1;

	IF LENGTH(word) > 0 THEN
	  warray(windex) := word;
	END IF;

    -- Trim off any leading spaces from remaining content
    -- this is precautionary
	Content := LTRIM(SUBSTR(Content, pos + 1, LENGTH(Content) - pos + 1));
	--UTILITIES_PKG.fprintf('debug.txt', 'ArrayFromContent: Content: ' || Content);
	pos     := INSTR(Content, ' ');
  END LOOP;
  
  -- Handle trailing word from Content...
  IF LENGTH(Content) > 0 THEN
    warray(windex + 1) := Content;
  END IF;

END ArrayFromContent;

Well, I do have one problem. If a swear word is used twice and not separated by a space between the two words, then it makes it through the filter. This is not what I want and could create some problems down the road. I would like to create a filter that will test for this kind of problem and retrieve the correct word or words based on the content. I'm thinking that I have to update this select statement or write som if blocks around it.
Code:
SELECT RESTRICTED_WORD INTO rword
      FROM RESTRICTED_WORDS
      WHERE LOWER(RESTRICTED_WORD) LIKE LOWER('%'||word||'%');

Any ideas?

Todd
 
How about

WHERE INSTR(LOWER(RESTRICTED_WORD),LOWER('%'||word||'%');
) > 0

Marty
 
typo left a semicolon in there

WHERE INSTR(LOWER(RESTRICTED_WORD),LOWER('%'||word||'%')
) > 0

what do you do for words like bass?

Marty

 
...or words like "shitake mushroom", "mishit" (as in baseball or cricket), the hundreds of non-profane words that contain the string "ass", all of the words with both profane and non-profane double entendres (which I won't list here for the sake of propriety), and the spammers' favourite: slightly mis-spelling words or using numerals to "fake out" software such as yours?

[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
[I can provide you with low-cost, remote Database Administration services: see our website and contact me via www.dasages.com]
 
I always like Santa's comments!

I still have something wrong, had to remove the %

WHERE INSTR(LOWER(RESTRICTED_WORD),LOWER(word)) > 0

Also why use lower on the bad word from your array. Since you made the array why not have them all lower case?

Marty

 
And the winner (in my estimation) for the best "red-faced URL" was the original address for "Experts Exchange": "www.expertsexchange.com"[blush]. Many filters would not allow access and when the organisation realised its mistake, they changed the URL to its current "
[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
[I can provide you with low-cost, remote Database Administration services: see our website and contact me via www.dasages.com]
 
possibly getting slightly OT, but there some lovely examples here:
Yours,

fish

["]As soon as we started programming, we found to our surprise that it wasn't as easy to get programs right as we had thought. Debugging had to be discovered. I can remember the exact instant when I realized that a large part of my life from then on was going to be spent in finding mistakes in my own programs.["]
--Maur
 
Just curious, are you in a school setting that you feel that you have to censor people’s thoughts? Any filter you write will always fail. For example if I write an article about "Breast Cancer", are you going to filter it. Unless there is a real reason to filter, then don't. If people are writing inappropriate comments, then talk to them, discipline them, or fire them. But don't trample on peoples first amendment rights without a real need.

Bill
Oracle DBA/Developer
New York State, USA
 
Bill,

I am all in favour of The First Amendement, but Todd may have responsibility for an application where there is a certain expectation of decorum, just as we have here on Tek-Tips. If, for example, Bill, you chose to test out your First Amendment latitude here by telling someone to **** off or eat **** or the like, the posting would probably last as long as it took the owner of Tek-Tips to read the "Red Flag" request that I'm certain you'd receive. You would probably lose your First Amendment "rights" on Tek-Tips.

So, please don't confuse the First Amendment with some delusion that you (or anyone) has the right to say whatever they want, whenever they want...Don't confuse "The Bill of Rights" with "The Bill of Wrongs".

[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
[I can provide you with low-cost, remote Database Administration services: see our website and contact me via www.dasages.com]
 
Santa, Thats the exact point I was trying to make. Don't use automatic filtering that will always fail. If a user users profanty that is inapproporate for the setting. Tell them not to do it again. If they do, then block them or fire them (depending on the setting). I just have a knee jeck reaction against automatic censoring and this wasn't the forum to rant, sorry.

Bill
Oracle DBA/Developer
New York State, USA
 
Gotha, Bill...with your clarification (which you offered in your earlier post, but I was just reacting too hastily), I couldn't agree with your more ![2thumbsup]

[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
[I can provide you with low-cost, remote Database Administration services: see our website and contact me via www.dasages.com]
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top