To find html tags in memo 2


ozgurtekin

Programmer
Apr 22, 2003
17
TR
I am writing a program in Delphi and I want to find the tags of an HTML page which I have loaded into a memo object.
There are some lines like this:
<title>Welcome
</title>

I only want to find the '<title>' part or the 'Welcome' part.
I'd like code which can delete 'Welcome' or make it possible for me to change it.

 
If possible, I'd suggest loading the HTML page into a TRichEdit, as TRichEdit has a built-in method called FindText which you can use to search for "<title>" and other tags.

Example:
Code:
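  // FoundAt is an Integer; FindText returns the zero-based position of the match, or -1 if there is none
  FoundAt := RichEdit1.FindText('<title>', 0, Length(RichEdit1.Text), []);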

Clive [infinity]
Ex nihilo, nihil fit (Out of nothing, nothing comes)
 
In order to remove "Welcome", you could use FindText to locate where this word begins, then make sure it is selected (you can use the SelStart and SelLength properties of TRichEdit to do this). Then you could use TRichEdit.ClearSelection to remove it from the control.
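
A rough sketch of those steps, not tested against any particular Delphi version. The unit name RichEditUtils, the procedure RemoveText and its parameter names are made up for illustration; only FindText, SelStart, SelLength and ClearSelection come from TRichEdit itself.

```pascal
unit RichEditUtils;  // hypothetical unit name

interface

uses
  ComCtrls;  // TRichEdit lives here (Vcl.ComCtrls in newer Delphi versions)

procedure RemoveText(RichEdit: TRichEdit; const Target: string);

implementation

// Finds the first occurrence of Target in the control and deletes it.
// Does nothing if Target is not found.
procedure RemoveText(RichEdit: TRichEdit; const Target: string);
var
  FoundAt: Integer;
begin
  // Search the whole text; FindText returns -1 when there is no match
  FoundAt := RichEdit.FindText(Target, 0, Length(RichEdit.Text), []);
  if FoundAt <> -1 then
  begin
    RichEdit.SelStart := FoundAt;          // start of the match
    RichEdit.SelLength := Length(Target);  // select exactly the matched text
    RichEdit.ClearSelection;               // delete the selection
  end;
end;

end.
```

It would be called as RemoveText(RichEdit1, 'Welcome'). To change the word instead of deleting it, assigning the new text to SelText in place of the ClearSelection call should do it.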

Clive [infinity]
Ex nihilo, nihil fit (Out of nothing, nothing comes)
 
How about this (which I haven't tested)?
Code:
procedure ChangeWelcome ( const filename, msg: string );
var
  html: TStringList;
begin
  html := TStringList.Create;
  try
    html.LoadFromFile ( filename );
    html.text := StringReplace (   html.text, '<title>Welcome', '<title>'+msg, [] );
    html.SaveToFile ( filename );
  finally
    html.free;
  end;
end;

Andrew
 
Because you already have the HTML in a memo, the code could be shrunk to:
Code:
  memo.lines.text := StringReplace ( memo.lines.text, '<title>Welcome', '<title>'+msg, [] );
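
The StringReplace approach only works while the title is literally 'Welcome'. If the current title text is not known in advance, one option is to replace whatever sits between the tags. This is a minimal sketch; SetTitle and the program name are invented for the example, and it assumes a plain <title> tag with no attributes.

```pascal
program TitleDemo;  // hypothetical demo program
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Returns Html with the text between <title> and </title> replaced by NewTitle.
// Html is returned unchanged if either tag is missing or they are out of order.
function SetTitle(const Html, NewTitle: string): string;
var
  Lower: string;
  OpenPos, TextStart, ClosePos: Integer;
begin
  Result := Html;
  Lower := LowerCase(Html);  // tag names are case-insensitive in HTML
  OpenPos := Pos('<title>', Lower);
  if OpenPos = 0 then
    Exit;
  TextStart := OpenPos + Length('<title>');
  ClosePos := Pos('</title>', Lower);
  if ClosePos < TextStart then
    Exit;  // closing tag missing, or found before the opening tag
  Result := Copy(Html, 1, TextStart - 1) + NewTitle + Copy(Html, ClosePos, MaxInt);
end;

begin
  // The line break after 'Welcome' is part of the old title text, so it goes too
  WriteLn(SetTitle('<title>Welcome'#13#10'</title>', 'Hello'));  // prints <title>Hello</title>
end.
```

With the memo from the original question this would be used as memo.lines.text := SetTitle(memo.lines.text, msg);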

Andrew
 
If you want to find more than just <title> or 'Welcome', you might want to look at using something like SAX (Simple API for XML).

SAX lets you parse pretty much any type of document, raising an event each time an element, some text, etc. is encountered. I've used it with XML-like documents and can almost guarantee it will work with HTML. It can take a little while to learn (it took me 2 hours to read the tutorial material and get my head around how to use it for what I wanted). A brief example from my code is below:

...
FReader := TSAXXMLReader.Create(nil);
FReader.Vendor := 'Keith Wood';
FReader.Options := FReader.Options + [soNamespaces];
FHandler := TSAXContentHandler.Create(nil);
FError := TSAXErrorHandler.Create(nil);

FReader.ContentHandler := FHandler;
FReader.ErrorHandler := FError;

FHandler.OnStartDocument := saxOnStartDocument;
FHandler.OnEndDocument := saxOnEndDocument;
FHandler.OnStartElement := saxOnStartElement;
FHandler.OnEndElement := saxOnEndElement;
FHandler.OnCharacters := saxOnCharacters;
FError.OnWarning := saxOnWarning;
FError.OnError := saxOnError;
FError.OnFatalError := saxOnFatal;
...

The saxOn... procedure declarations are:

procedure saxOnStartDocument(Sender: TObject);
procedure saxOnEndDocument(Sender: TObject);
procedure saxOnStartElement(Sender: TObject; const NamespaceURI, LocalName, QName: SAXString; const Atts: IAttributes);
procedure saxOnCharacters(Sender: TObject; const PCh: SAXString);
procedure saxOnEndElement(Sender: TObject; const NamespaceURI, LocalName, QName: SAXString);
procedure saxOnWarning(Sender: TObject; const Error: ISAXParseError);
procedure saxOnError(Sender: TObject; const Error: ISAXParseError);
procedure saxOnFatal(Sender: TObject; const Error: ISAXParseError);


As you can see, you can get to the element attributes in the OnStartElement event and the text in the OnCharacters event.

Anyway, I've included a link below:

 
Code:
// Puts every tag on a line of its own and removes line breaks inside tags
function TagsToLines(S: string): string;
var
  InsideTag: Boolean;
  i: LongInt;
begin
  InsideTag := False;
  Result := S;
  // Walk backwards so inserts and deletes never shift the characters still to be visited
  for i := Length(Result) downto 1 do begin
    case Result[i] of
      '>': begin
        Insert(#13#10, Result, i + 1);  // line break after the tag
        InsideTag := True;
      end;
      '<': begin
        Insert(#13#10, Result, i);      // line break before the tag
        InsideTag := False;
      end;
      #10, #13: if InsideTag then Delete(Result, i, 1);
    end;
  end;
end;

procedure TForm1.Button1Click(Sender: TObject);
var
  i: LongInt;
begin
  RichEdit1.Lines.Text := TagsToLines(RichEdit1.Lines.Text);
  for i := 0 to RichEdit1.Lines.Count - 1 do begin
    if (RichEdit1.Lines[i] <> '') then begin
      if (RichEdit1.Lines[i][1] = '<') then begin
        { Do something with the TAG }
      end else begin
        { Do something with the TEXT }
      end;
    end else begin
      { Do something with the empty line }
    end;
  end;
end;
 