Pattern matching in ASP - Kindly help

cfdeveloper · Oct 22, 2004

I'm really hoping someone can help me. I am a ColdFusion developer trying to rewrite some CF code in ASP, and I am stuck.
To give you an overview, I'm working on a job logging system. The customer calls the help desk to log a job; the help desk creates a new job and adds the comments to the log. Each comment is wrapped in a user-defined tag in the following format and saved in the database, e.g.:

<newLog username='cfcoder'><span class='timestamp'>cfcoder | 20/10/2004 16:34:12</span><br>new test comments<hr></newLog>

As you may have noticed, the comment (in this case "new test comments") is wrapped in the user-defined tag <newLog>.

The customer front end is written in ASP. Currently, if a customer views this job, the page displays the <newLog> tag as well. I want to add some ASP that uses pattern matching to display only the comment. I would really appreciate it if someone could show me how to rewrite my code in ASP. I have attached my ColdFusion code.

Code:

<cfscript>
function GetHeaders(header){
    // regexp for timestamp
    var timestampRegexp = "\d{1,2}/\d{1,2}/\d{4}\s+\d{1,2}:\d{1,2}:\d{1,2}";
    var stTmp = StructNew();
    var stReturn = StructNew();
    // remove *** if necessary
    header = REReplace(header,"(^\s*[*]{3}\s*|\s*[*]{3}\s*$)","","all");
    // find timestamp
    stTmp = REFind(timestampRegexp,header,1,true);
    if(stTmp.pos[1]){
        // if there is a timestamp, the user is everything in the header but the timestamp
        // a header looks like this: CFCODER01 | 14/11/2003 14:18:56
        // get the list of all usernames from the header for the log (one or more)
        stReturn.user = Trim(Removechars(header,stTmp.pos[1],stTmp.len[1]));
        // get the timestamp
        // get the list of all timestamps from the header for the log (one or more)
        stReturn.timestamp = REReplace(Mid(header,stTmp.pos[1],stTmp.len[1]),"\s+"," ");
    }
    else{
        // if there is NO timestamp, the user is everything in the header
        stReturn.user = Trim(header);
        // create a fake timestamp for normalization
        stReturn.timestamp = "00/00/0000 00:00:00";
    }
    // remove trailing pipe from user if necessary
    stReturn.user = REReplace(stReturn.user,"\s*\|$","");
    // remove log tags
    stReturn.user = REReplaceNoCase(stReturn.user,"</?(new|legacy)Log\b.*?>","","all");
    // determine log type
    stReturn.log = "legacyLog";
    if(FindNoCase("<newLog",header)) stReturn.log = "newLog";
    return stReturn;
}
function FormatComment(stComment){
    var oldchars = "<,>,""";
    var newchars = "&lt;,&gt;,&quot;";
    return '<#stComment.log# userName="#stComment.user#"><span class="timestamp">#stComment.user# | #stComment.timestamp#</span><br>    <pre style="font-family:Arial, Verdana;">#ReplaceList(stComment.text,oldchars,newchars)#</pre><hr></#stComment.log#>';
}
function GetComments(str){
    var start = 1;
    var cnt = 0;
    var i = 0;
    var stComments = StructNew();
    var bComments = true;
    var stTmp = StructNew();
    /* regular expression for header: line surrounded with *** OR line ending with a timestamp
    (?m) multi-line mode - matches before and after newlines
    (^ start of line instead of start of string)
    ($ end of line instead of end of string) 
    ^        start of line
    [ \t]*    any number of spaces or tabs
    (        start group (for OR)
    [*]{3}    three stars
    [^\n]*?    Any characters except new line any number of times 
            (non greedy operator *?) 
    [*]{3}    three stars
    |        OR
    [^\n]*?    Any characters except new line any number of times
    \d{1,2}    1-2 digits
    /        /
    \d{1,2}    1-2 digits
    /        /
    \d{4}        4 digits
    [ \t]+    one or more spaces or tabs
    \d{1,2}    1-2 digits
    :        :
    \d{1,2}    1-2 digits
    :        :
    \d{1,2}    1-2 digits
    )        end group
    [ \t]*    any number of spaces or tabs
    $ end of line */
    var commentRegexp = "(?m)^[ \t]*([*]{3}[^\n]*?[*]{3}|[^\n]*?\d{1,2}/\d{1,2}/\d{4}[ \t]+\d{1,2}:\d{1,2}:\d{1,2})[ \t]*$";
    /*
        Initialize structure
        The structure has 3 keys:
        1. "slug" containing the slug
        2. "comments" is an array of comments. each element in the array is a structure containing 4 keys:
            "user": user from the header
            "timestamp": timestamp from the header
            "text": text of the comment
            "comment": full formatted comment
        3. "formatted": the "comments" field with formatted headers
    */
    stComments.comments = ArrayNew(1);
    stComments.slug = "";
    // normalize returns to newline character (chr(10))
    str = REReplace(str,"\r(\n)","\1","all");
    // first get rid of redundant tags
    str = REReplaceNoCase(str,"</?(span|pre|hr)\b.*?>","","all");
    // convert <br> to newline
    str = REReplaceNoCase(str,"<br\b.*?>",Chr(10),"all");
    // GET THE SLUG
    // find first header
    stTmp = REFind(commentRegexp,str,start,true);
    if(stTmp.pos[1]){
        // if there is a header, the slug is everything until that header
        if(stTmp.pos[1] GT 1) stComments.slug = Trim(Left(str,stTmp.pos[1]-1));
        // insert the first header into the array
        ArrayAppend(stComments.comments, GetHeaders(Mid(str,stTmp.pos[1],stTmp.len[1])));
        // next start position
        start = stTmp.pos[1]+stTmp.len[1];
    }else{
        // if there are no comments, everything is the slug
        stComments.slug = Trim(str);
        bComments = false;
    }
    // get comments if any
    while(bComments){
        // find next header
        stTmp = REFind(commentRegexp,str,start,true);
        // get index of last comment added to array
        i = ArrayLen(stComments.comments);
        if(stTmp.pos[1]){
            // there is a next header
            // get number of characters for last comment
            cnt = stTmp.pos[1]-start;
            // extract last comment text (and get rid of double linefeeds and log tags)
            stComments.comments[i].text = REReplace(Trim(Mid(str,start,cnt)),"(\n)\n+","\1","all");
            stComments.comments[i].text = REReplaceNoCase(stComments.comments[i].text,"</?(new|legacy)Log\b.*?>","","all");
            // construct entire comment
            stComments.comments[i].comment = FormatComment(stComments.comments[i]);
            // Get next header
            ArrayAppend(stComments.comments, GetHeaders(Mid(str,stTmp.pos[1],stTmp.len[1])));
            // next starting position
            start = stTmp.pos[1]+stTmp.len[1];
        }
        else{
            // there is no next header
            // get number of characters for last comment (until end of string)
            cnt = Len(str)-start+1;
            // extract last comment text (and get rid of double linefeeds)
            stComments.comments[i].text = REReplace(Trim(Mid(str,start,cnt)),"(\n)\n+","\1","all");
            stComments.comments[i].text = REReplaceNoCase(stComments.comments[i].text,"</?(new|legacy)Log\b.*?>","","all");
            // construct entire comment
            stComments.comments[i].comment = FormatComment(stComments.comments[i]);
            break;
        }
    }
    // rebuild the entire formatted comment field
    // start with the slug
    stComments.formatted = stComments.slug;
    // loop over formatted comments
    for(i=1;i LE ArrayLen(stComments.comments);i=i+1){
        // add a double newline if there is already something in the comment field
        if(Len(stComments.formatted)) stComments.formatted = stComments.formatted & chr(10) & chr(10);
        // add the formatted comment
        stComments.formatted = stComments.formatted & stComments.comments[i].comment;
    }
    return stComments;
}
function GetCommentsNew(str){
    var start = 1;
    var cnt = 0;
    var i = 0;
    var stComments = StructNew();
    var bComments = true;
    var stTmp = StructNew();
    // regular expression for a comment: an opening <newLog> or <legacyLog> tag, its contents, and the matching closing tag
    var commentRegexp = "<(newLog|legacyLog)\b[^>]*>(.*?)</\1>";
    /*
        Initialize structure
        The structure has 3 keys:
        1. "slug" containing the slug
        2. "comments" is an array of comments. each element in the array is a structure containing 4 keys:
            "user": empty
            "timestamp": empty
            "text": empty
            "comment": full formatted comment
        3. "formatted": the "comments" field with formatted headers
    */
    stComments.comments = ArrayNew(1);
    stComments.slug = "";
    // normalize returns to newline character (chr(10))
    str = REReplace(str,"\r(\n)","\1","all");
    // GET THE SLUG
    // find first comment
    stTmp = REFind("<(newLog|legacyLog)\b",str,start,true);
    if(stTmp.pos[1]){
        // if there is a comment, the slug is everything until that comment
        if(stTmp.pos[1] GT 1) stComments.slug = Trim(Left(str,stTmp.pos[1]-1));
    }else{
        // if there are no comments, everything is the slug
        stComments.slug = Trim(str);
        bComments = false;
    }
    // get comments if any
    while(bComments){
        // find comment
        stTmp = REFind(commentRegexp,str,start,true);
        if(stTmp.pos[1]){
            // get index of comment to add
            i = ArrayLen(stComments.comments) + 1;
            stComments.comments[i] = StructNew();
            stComments.comments[i].user = "";
            stComments.comments[i].timestamp = "";
            stComments.comments[i].text = "";
            stComments.comments[i].comment = Mid(str,stTmp.pos[1],stTmp.len[1]);
            // next starting position
            start = stTmp.pos[1]+stTmp.len[1];
        }
        else{
            // there is no next comment
            break;
        }
    }
    // rebuild the entire formatted comment field
    // start with the slug
    stComments.formatted = stComments.slug;
    // loop over formatted comments
    for(i=1;i LE ArrayLen(stComments.comments);i=i+1){
        // add a double newline if there is already something in the comment field
        if(Len(stComments.formatted)) stComments.formatted = stComments.formatted & chr(10) & chr(10);
        // add the formatted comment
        stComments.formatted = stComments.formatted & stComments.comments[i].comment;
    }
    return stComments;
}
function GetCommentsWrapper(txt){
    if(REFindNoCase("<(newLog|legacyLog)\b[^>]*>",txt))
        return GetCommentsNew(txt);
    else return GetComments(txt);
}
</cfscript>
<cfset stComments = GetCommentsWrapper("<newLog username='cfcoder'><span class='timestamp'>cfcoder | 20/10/2004 16:34:12</span><br>new test comments<hr></newLog>")>
<cfset nLog = ArrayLen(stComments.comments)>

<cfif nLog>
    <cfoutput>
        <cfloop from='1' to='#ArrayLen(stComments.comments)#' index='i'>
            #Replace(stComments.comments[i].comment,"""",'&quot;','All')##Chr(10)#
        </cfloop>
    </cfoutput>
</cfif>

I look forward to hearing from someone

Best regards
cfcoder
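
For readers following along in classic ASP, here is a minimal VBScript sketch of the timestamp/user split that GetHeaders performs above. It is not a full port: it skips the *** stripping and the log-tag cleanup. ParseHeader is a hypothetical name, and returning a Scripting.Dictionary in place of the ColdFusion struct is an assumption about how the result might be carried around.

```vbscript
<%
' Sketch only: ParseHeader is a hypothetical helper, not part of the original code.
' It returns a Scripting.Dictionary with the keys "user" and "timestamp".
Function ParseHeader(ByVal header)
    Dim re, matches, d
    Set d = Server.CreateObject("Scripting.Dictionary")
    Set re = New RegExp
    re.Global = False
    ' same timestamp pattern as the ColdFusion code: dd/mm/yyyy hh:mm:ss
    re.Pattern = "\d{1,2}/\d{1,2}/\d{4}\s+\d{1,2}:\d{1,2}:\d{1,2}"
    Set matches = re.Execute(header)
    If matches.Count > 0 Then
        d("timestamp") = matches(0).Value
        ' FirstIndex is zero-based: keep what precedes and what follows the timestamp
        d("user") = Trim(Left(header, matches(0).FirstIndex) & _
                         Mid(header, matches(0).FirstIndex + matches(0).Length + 1))
    Else
        ' no timestamp found: use the same placeholder as the ColdFusion code
        d("timestamp") = "00/00/0000 00:00:00"
        d("user") = Trim(header)
    End If
    ' remove a trailing pipe from the user if necessary
    re.Pattern = "\s*\|\s*$"
    d("user") = re.Replace(d("user"), "")
    Set ParseHeader = d
End Function

Dim h
Set h = ParseHeader("CFCODER01 | 14/11/2003 14:18:56")
Response.Write h("user") & " / " & h("timestamp")
%>
```

The header is passed ByVal because VBScript passes arguments by reference by default, and the caller's string should not be changed.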

TheConeHead · Oct 22, 2004

How much of your salary do I get?

[conehead]

trollacious · Oct 22, 2004

Are you looking for VBScript ASP, or JScript ASP?

cfdeveloper · Oct 22, 2004

VBScript ASP

cfdeveloper · Oct 22, 2004

I really need someone to help me here, conehead. If I don't get this done today, I won't be getting any salary.

TheConeHead · Oct 22, 2004

http://www.monster.com

Genimuse · Oct 22, 2004

Sorry, cfdeveloper, you're just asking too much for this forum. Folks here are really great and giving, but you're not likely to find someone to spend hours working through your code and rewriting it for free. Try

http://www.cfm2asp.com/

vbkris · Oct 22, 2004

I can give you a small piece of code that strips all tags:
<%
set RegEx=new RegExp
RegEx.global=true
RegEx.ingnorecase=true
RegEx.pattern="<.*?>"

TheString="<asdasdasd>Comment</asd>"

TheString=RegEx.replace(TheString,"")
response.write TheString
%>

Known is handfull, Unknown is worldfull

cfdeveloper · Oct 26, 2004

function GetCommentsWrapper(txt){
    if(REFindNoCase("<(newLog|legacyLog)\b[^>]*>",txt))
        return GetCommentsNew(txt);
    else return GetComments(txt);
}

I want to display everything between the opening <newLog> or <legacyLog> tag and the closing </newLog> or </legacyLog> tag. The above function works, but I'm trying to rewrite the code in ASP. I tried this but it does not work.

vbkris, your code does not work; it displays nothing.
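
As a rough illustration of the request above, the sketch below uses the VBScript RegExp object to collect everything between each opening <newLog> or <legacyLog> tag and its matching closing tag. GetLogBodies is a hypothetical name, and the pattern is adapted from the commentRegexp in GetCommentsNew: [\s\S] replaces the dot because the dot does not match line breaks in VBScript regular expressions.

```vbscript
<%
' Sketch only: GetLogBodies is a hypothetical helper, not part of the original code.
' It returns the contents of every <newLog>/<legacyLog> pair, one per line.
Function GetLogBodies(ByVal txt)
    Dim re, m, result
    Set re = New RegExp
    re.Global = True        ' find every log entry, not just the first
    re.IgnoreCase = True    ' accept <newlog>, <NewLog>, and so on
    ' \1 refers back to the tag name so the closing tag must match the opening tag
    re.Pattern = "<(newLog|legacyLog)\b[^>]*>([\s\S]*?)</\1>"
    result = ""
    For Each m In re.Execute(txt)
        If Len(result) > 0 Then result = result & vbLf
        ' SubMatches(0) is the tag name, SubMatches(1) is what the tag wraps
        result = result & m.SubMatches(1)
    Next
    GetLogBodies = result
End Function

Dim sample
sample = "<newLog username='cfcoder'><span class='timestamp'>cfcoder | 20/10/2004 16:34:12</span><br>new test comments<hr></newLog>"
' the inner markup (span, br, hr) is written as HTML; the log tag itself is gone
Response.Write GetLogBodies(sample)
%>
```

If no log tag is present, the function returns an empty string, so a caller could fall back to showing the raw text in that case.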

vbkris · Oct 26, 2004

sorry there was a mistake, try this:

Code:

set RegEx=new RegExp
RegEx.global=true
RegEx.ignorecase=true
RegEx.pattern="<.*?>"

TheString="<asdasdasd>Comment</asd>"

TheString=RegEx.replace(TheString,"")
response.write TheString

It will strip out all <...> tags.
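
Stripping every tag from the sample log entry would still leave the "cfcoder | 20/10/2004 16:34:12" header text in front of the comment. A minimal sketch of one way around that, assuming the header always sits inside a span with class "timestamp" as in the sample, is to remove that span together with its text first. CommentTextOnly is a hypothetical name.

```vbscript
<%
' Sketch only: CommentTextOnly is a hypothetical helper, not part of the original code.
Function CommentTextOnly(ByVal txt)
    Dim re
    Set re = New RegExp
    re.Global = True
    re.IgnoreCase = True
    ' 1. drop the timestamp header, including the text inside the span
    re.Pattern = "<span\b[^>]*class=['""]timestamp['""][^>]*>[\s\S]*?</span>"
    txt = re.Replace(txt, "")
    ' 2. turn <br> into a line feed so multi-line comments keep their line breaks
    re.Pattern = "<br\b[^>]*>"
    txt = re.Replace(txt, vbLf)
    ' 3. remove every remaining tag (newLog, legacyLog, pre, hr, ...)
    re.Pattern = "<[^>]*>"
    txt = re.Replace(txt, "")
    ' 4. trim leading and trailing whitespace; VBScript's Trim only removes spaces
    re.Pattern = "^\s+|\s+$"
    CommentTextOnly = re.Replace(txt, "")
End Function

Dim entry
entry = "<newLog username='cfcoder'><span class='timestamp'>cfcoder | 20/10/2004 16:34:12</span><br>new test comments<hr></newLog>"
' for this sample the result should be just: new test comments
Response.Write Replace(CommentTextOnly(entry), vbLf, "<br>")
%>
```

The argument is declared ByVal so that the caller's string is left untouched; VBScript passes arguments by reference unless told otherwise.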

Known is handfull, Unknown is worldfull

cfdeveloper · Oct 29, 2004

Cheers vbkris. Thanks for your help
Best regards

vbkris · Nov 1, 2004

Hi,
By the way, beware of the RegEx speed when it comes to BIG strings. Refer to this thread:
thread333-941032

Known is handfull, Unknown is worldfull
