Split problem with </a> - Escape char?

NoCoolHandle · Aug 4, 2008

Hi..

I have a little control I am trying to write that acts like a stock ticker.. The issue is that it is a div tag that contains "a" tags.

This prevents me from just taking the first char and adding it to the end.

What I need to do is strip the text out of the a tag and if the a tag already exists at the end of the line, then just append to the end of the last a tag the beginning of the first.

The easiest way I could think of doing this was to use the string split method on the "</a>".

However this seems to have "escape" characters in it that prevent it from splitting the string into one "A" tag per parameter.

I have tried all sorts of "split parameters", but have had no success. (I am getting to the point of not doing this with the split, but...)

Any help here would be appreciated.

TIA

Rob

tperri · Aug 4, 2008

Seems like a situation for RegEx to me.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

http://www.lessthandot.com

- The Complete IT Community

NoCoolHandle · Aug 4, 2008

so being a total regex newbie, how would I do that?

Currently I am heading down a known path of IndexOf....

jmeckley · Aug 4, 2008

System.Text.RegularExpressions is the first place to start. The pseudo regex would look something like this:
^<a+>$1</a>$
where
^ is the beginning of a line
$ is the end of the line
+ is any characters following the a
$1 is all text between > and </a>. In this case it gives you the text of the link.

Jason Meckley
Programmer
Specialty Bakers, Inc.

NoCoolHandle · Aug 4, 2008

Ok..

so that is the equiv of

stringVariable.indexOf("</a>")

so what is the benefit of

^<a+>$1</a>$

seems way more complex..

Can I use this in a split?

Do I still need to loop through the string to find each instance?

Why use regex? (Did I mention I am a newbie to regex - except on the surface)

Rob

jmeckley · Aug 5, 2008

so what is the benefit

the benefit is wild cards and complex search patterns.

Can I use this in a split?

No, a split is looking for an exact match.
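
For the exact-match case, a minimal C# sketch, assuming the goal is simply to cut the string at every closing tag. String.Split has an overload that takes a string array and treats "</a>" as one delimiter, whereas the char-based overloads split on each character separately (possibly the source of the apparent "escape character" behaviour). Regex.Split is the pattern-based counterpart. The class name, method names and sample markup are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitSketch
{
    // Sample markup is an assumption, modelled on the links discussed in this thread.
    public const string Html =
        "<a href=books.xml>books</a><a href=feedhost.aspx>feed</a>";

    // Exact-match split: the string[] overload treats "</a>" as a single delimiter.
    public static string[] SplitExact(string html)
    {
        return html.Split(new[] { "</a>" }, StringSplitOptions.RemoveEmptyEntries);
    }

    // Pattern-based split: Regex.Split takes a regular expression as the delimiter.
    // Unlike SplitExact, it keeps the empty string after the last closing tag.
    public static string[] SplitByPattern(string html)
    {
        return Regex.Split(html, @"</a\s*>");
    }

    public static void Main()
    {
        foreach (string piece in SplitExact(Html))
            Console.WriteLine(piece);
    }
}
```

Either way the pieces still carry their opening tags, so the link text has to be extracted in a further step.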

Do I still need to loop through the string to find each instance?

If you want to find each match, then yes. There is a Regex.Matches() function that provides this functionality.

Why use regex?

Regex is another useful tool, just like generic lists, IDb[X] objects and Streams. The power of regex is its efficient string searching capabilities.

Think of regex like a manual transmission and split as an automatic. Either way you can drive a car, but you have more control with a manual transmission. It also requires more knowledge about how to handle the car.

Jason Meckley
Programmer
Specialty Bakers, Inc.

NoCoolHandle · Aug 5, 2008

Thanks Jason..

I will start to play with this.

Rob

NoCoolHandle · Aug 6, 2008

Ok..
so what is wrong with this.. (It doesn't seem to get any matches)

Code:

        Dim re As RegularExpressions.Regex
        Dim m As RegularExpressions.Match
        Dim s As String = "<a href=books.xml alt=books>this is some book stuff</a><a href=feedhost.aspx alt=""test bed"">page to host feed</a><a href=StyleSheet.css alt=""some style sheet"">a stype sheet.</a>"
        'm = re.Match(s, "^<a+>$1</a>$")
        Dim mc As MatchCollection
        mc = re.Matches(s, "^<a+>$1</a>$")

        Response.Write(mc.Count)
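
A small C# sketch of why the count comes back as zero, assuming a shortened version of the input string. The .NET engine takes the pseudo-pattern literally: `a+` means one or more letters a (so the space after `<a` already fails), `$1` is only a group reference inside a replacement string and in a pattern reads as an end anchor followed by a literal 1, and `^` / `$` pin the match to the start and end of the string. The class and method names here are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternCheckSketch
{
    // Counts the matches of a pattern in the given input.
    public static int CountMatches(string input, string pattern)
    {
        return Regex.Matches(input, pattern).Count;
    }

    public static void Main()
    {
        // Hypothetical input: two adjacent links, shortened from the string above.
        string s = "<a href=books.xml>books</a><a href=feedhost.aspx>feed</a>";

        // Pseudo-pattern from earlier in the thread, taken literally by the engine.
        Console.WriteLine(CountMatches(s, "^<a+>$1</a>$"));
        // Opening tag with any attributes, lazily captured text, closing tag.
        Console.WriteLine(CountMatches(s, @"<a\b[^>]*>(.*?)</a>"));
    }
}
```

With this input the first count is 0 and the second is 2.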

jmeckley · Aug 6, 2008

The pattern is a bit more complex than I originally thought. Fortunately, searching for HTML tags is a common practice.

here is sample code for parsing the string above.

Code:

using System;
using System.Text.RegularExpressions;

namespace RegexTest
{
    internal class Program
    {
        // Sample markup: three anchor tags back to back, with no separators.
        private const string XML = @"<a href=books.xml alt=books>this is some book stuff</a><a href=feedhost.aspx alt=""test bed"">page to host feed</a><a href=StyleSheet.css alt=some style sheet>a stype sheet.</a>";
        // Opening tag with any attributes, lazily captured link text, closing tag.
        private const string pattern = @"<a\b[^>]*>(.*?)</a>";

        private static void Main()
        {
            MatchCollection matches = Regex.Matches(XML, pattern);
            Console.WriteLine("Pattern: {1} Matches: {0}", matches.Count, pattern);
            // $1 expands to the first capture group: the text between <a ...> and </a>.
            foreach (Match match in matches)
                Console.WriteLine("\t{0}", match.Result("$1"));
            Console.ReadLine(); // keep the console window open
        }
        }
    }
}

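The pattern breakdown:
<a the literal start of the tag
\b a word boundary: the tag name must end right after the a, so <a href=...> matches but <abbr> does not
[^>] any single character that is not a right angle bracket
* zero or more of the preceding item, so [^>]* consumes the attributes
> the end of the opening tag
(.*?) as few characters as possible (possibly none), stored in a "regex variable" (capture group 1)
</a> plain text
When displaying the text, $1 means the value captured by the 1st set of parentheses.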
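
The same capture can also be read through the Groups collection instead of a replacement-style string. A minimal sketch, assuming the pattern from the sample code; the class name, method name and shortened input are illustrative only:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkTextSketch
{
    // Same pattern as in the sample code; group 1 is the lazily captured link text.
    private const string Pattern = @"<a\b[^>]*>(.*?)</a>";

    // Returns the text of every link, read from capture group 1.
    public static List<string> LinkTexts(string html)
    {
        var texts = new List<string>();
        foreach (Match match in Regex.Matches(html, Pattern))
            texts.Add(match.Groups[1].Value); // same value as match.Result("$1")
        return texts;
    }

    public static void Main()
    {
        // Hypothetical input, shortened from the thread's sample string.
        string html = "<a href=books.xml alt=books>this is some book stuff</a>" +
                      "<a href=feedhost.aspx alt=\"test bed\">page to host feed</a>";
        foreach (string text in LinkTexts(html))
            Console.WriteLine(text);
    }
}
```

Groups[0] holds the whole match including the tags, which is useful when the tag itself has to be kept.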

Jason Meckley
Programmer
Specialty Bakers, Inc.

NoCoolHandle · Aug 6, 2008

Thanks m8

I was having a hard time getting anything out or knowing what to search for.

Rob

NoCoolHandle · Aug 7, 2008

Jason,

again thanks. I ended up building a page to test this with different options.

test page

As you say.. very useful.

Rob

RyanEK · Aug 14, 2008

Thanks for sharing that test page.
