Hello,
I have the following regular expressions that I am trying to use in my VB.NET 2.0 code. This code attempts to react to an XmlException to try to fix the malformed XML.
Code:
'Trying to parse information from the following exception message:
'The 'td' start tag on line 36 does not match the end tag of 'a'. Line 36, position 56.
Private Function FixTagMismatch(ByVal ex As XmlException) As Boolean
    Dim message As String = ex.Message
    Dim success As Boolean = False
    Const textInsideQuotes As String = "'([^']*)'"
    Const lineNumberSearch As String = "Line\s(\d*),"
    Const positionSearch As String = "position\s(\d*)."
    Const rexOptions As Integer = CInt(RegexOptions.IgnoreCase _
        Or RegexOptions.Multiline _
        Or RegexOptions.Singleline _
        Or RegexOptions.CultureInvariant _
        Or RegexOptions.Compiled)
    Dim rex As New Regex(textInsideQuotes, rexOptions)
    Dim matches As MatchCollection = rex.Matches(message) 'Regex.Matches(message, textInsideQuotes)
    Dim tag1 As String = String.Empty
    Dim tag2 As String = String.Empty

    ' Retrieve the two tags in question.
    If matches.Count = 2 Then
        success = True
        tag1 = matches.Item(0).Value
        tag2 = matches.Item(1).Value
    End If

    ' Retrieve the line number
    rex = New Regex(lineNumberSearch, rexOptions)
    Dim tmp As String = rex.Match(message).Value
    'Dim line As Integer = CInt(rex.Match(message).Value)

    ' Retrieve the position of the problem
    rex = New Regex(positionSearch, rexOptions)
    Dim tmp2 As String = rex.Match(message).Value
    'Dim pos As Integer = CInt(rex.Match(message).Value)

    'Dim reader As New StringReader(Me.ModifiedText)
    Dim errorLine As String = String.Empty
    'Dim builder As New StringBuilder

    '' Travel to the line in question.
    'For i As Integer = 1 To line - 1
    '    builder.Append(reader.ReadLine() & vbNewLine)
    'Next
    'errorLine = reader.ReadLine()

    ' Fix the problem on this line
    Dim search As String = "<tag1[^>]*>[^<tag2.*?>].*?</tag3>"
    search = search.Replace("tag1", tag1)
    search = search.Replace("tag2", tag2)
    rex = New Regex(search, rexOptions)
    matches = rex.Matches(Me.ModifiedText)
    For Each mat As Match In matches
        Debug.WriteLine(mat.Value)
    Next

    'builder.Append(errorLine & vbNewLine)
    'builder.Append(reader.ReadToEnd())
    'reader.Close()
    Return success
End Function
I have several problems with this method. The first regex successfully finds the single-quoted strings in the message (e.g. 'td' and 'a' in this example). However, the expressions used to fetch the line number and the position of the error do not work: each one returns the entire matched string (e.g. "Line 36," and "position 56.") instead of just the integer inside the parentheses.
I am new to regular expressions, so any help would be greatly appreciated. I thought the "lineNumberSearch" and "positionSearch" constants did the same thing as the "textInsideQuotes" pattern, but apparently they do not. I created and tested these patterns in Expresso, where they worked, but they do not give the expected result when I run my .NET 2.0 code.
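To make the symptom above concrete, here is a minimal sketch of the difference between Match.Value (the whole matched text) and Match.Groups(1).Value (only the first parenthesized capture). The module name CaptureGroupSketch is made up for illustration, and the patterns are slightly changed from mine as an assumption about what was intended: \d+ instead of \d* so at least one digit is required, and \. so the period is literal. I have not confirmed this is the whole fix for my method.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module CaptureGroupSketch ' hypothetical name, for illustration only
    Sub Main()
        ' Sample message copied from the comment at the top of FixTagMismatch.
        Dim message As String = "The 'td' start tag on line 36 does not match the end tag of 'a'. Line 36, position 56."

        ' Assumed variants of lineNumberSearch / positionSearch:
        ' \d+ needs at least one digit, \. matches a literal period.
        Dim lineMatch As Match = Regex.Match(message, "Line\s(\d+),")
        Dim posMatch As Match = Regex.Match(message, "position\s(\d+)\.")

        ' Value is everything the pattern matched; Groups(1) is only the (\d+) part.
        Console.WriteLine(lineMatch.Value)           ' whole match, including "Line" and the comma
        Console.WriteLine(lineMatch.Groups(1).Value) ' digits only

        ' Check Success before parsing so a non-matching message does not throw.
        If lineMatch.Success AndAlso posMatch.Success Then
            Dim line As Integer = Integer.Parse(lineMatch.Groups(1).Value)
            Dim pos As Integer = Integer.Parse(posMatch.Groups(1).Value)
            Console.WriteLine("line={0}, pos={1}", line, pos)
        End If
    End Sub
End Module
```

The same distinction would apply to the tag names: matches.Item(0).Value includes the surrounding quotes, while matches.Item(0).Groups(1).Value would be just the text between them.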
Finally, with the "search" string, I am trying to build a pattern that locates the mismatched tags in the HTML source code. Below is the fragment that I am using for testing.
Code:
<table cellpadding="3" cellspacing="0" bordercolor="#CCCCCC" border="1">
<tr align="Center" bgcolor="#CCCCCC">
<td valign="top" class="tablefont" colspan="2"><b>Service Classification for 2006</b></td>
<td valign="top" class="tablefont" width="29%"><b>EDI Load Profile Code</b></td>
<tr>
<td valign="top" class="tablefont" width="31%">SC-1, SC1B</td>
<td valign="top" class="tablefont" width="40%"><a href="../../non_html/sc1std_06.xls">Standard Service</a> (xls)</td>
<td valign="top" class="tablefont" width="29%">1SC1, 2SC1</td></tr>
<tr>
<td valign="top" class="tablefont" width="31%">SC-1C</td>
<td valign="top" class="tablefont" width="40%"><a href="../../non_html/sc1c_06.xls">Optional Large Time of Use</a> (xls)</td>
<td valign="top" class="tablefont" width="29%">1SC1C, 2SC1C </td></tr>
<tr>
<td valign="top" class="tablefont" rowspan="2" width="31%">SC-2</td>
<td valign="top" class="tablefont" width="40%"><a href="../../non_html/sc2nd_06.xls">Non-Demand</a> (xls)</td>
<td valign="top" class="tablefont" width="29%">1SC2, 2SC2 </td></tr>
<tr>
<td valign="top" class="tablefont" width="40%"><a href="../../non_html/sc2dem_06.xls">Demand</a> (xls)</td>
<td valign="top" class="tablefont" width="29%">2SC2D, 3SC2D, 1SC2D</td></tr>
<tr>
<td valign="top" class="tablefont" rowspan="4" width="31%">SC-3</td>
<td valign="top" class="tablefont" width="40%"><a href="../../non_html/sc3sec_06.xls">Secondary</a> (xls)</td>
<td valign="top" class="tablefont" width="29%">1SC3</td></tr>
<tr>
<td valign="top" class="tablefont" width="40%"><a href="../../non_html/sc3pri_06.xls">Primary</a> (xls)</td>
<td valign="top" class="tablefont" width="29%">2SC3</td></tr>
<tr>
<td valign="top" class="tablefont" width="40%"><a href="../../non_html/sc3sub_06.xls">Subtransmission</a> (xls)</td>
<td valign="top" class="tablefont" width="29%">3SC3</td></tr>
<tr>
<td valign="top" class="tablefont" width="40%"><a href="../../non_html/sc3tra_06.xls">Transmission</a> (xls)</td>
<td valign="top" class="tablefont" width="29%">4SC3</td></tr>
<tr>
<td valign="top" class="tablefont" width="31%">Private Area Lighting</td>
<td valign="top" class="tablefont" width="40%"><a href="../../non_html/pal_06.xls">Private Area Lighting</a> (xls)</td>
<td valign="top" class="tablefont" width="29%">1SC1L</a> (xls)</td></tr>
<tr>
<td valign="top" class="tablefont" width="31%">Traffic Signals</td>
<td valign="top" class="tablefont" width="40%"><a href="../../non_html/traffic_06.xls">Traffic Signals</a> (xls)</td>
<td valign="top" class="tablefont" width="29%">1SC4L</td></tr>
<tr>
<td valign="top" class="tablefont" width="31%">Street Lighting</td>
<td valign="top" class="tablefont" width="40%"><a href="../../non_html/stlght_06.xls">Street Lighting</a> (xls)</td>
<td valign="top" class="tablefont" width="29%">1SC2L, 1SC3L, 1SC5L, 1SC6L</td></tr>
</table>
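For the "search" pattern, here is a rough sketch of one reading of what I am after, not a confirmed solution. It assumes the goal is to find an opening tag1 element that reaches a closing tag2 before any opening tag2 or closing tag1, as in the "1SC1L</a>" cell above. BuildStrayEndTagPattern and the module name are hypothetical, and the two-line sample is shortened from the fragment. Note that [^<tag2.*?>] in my original string is a character class (any single character not in that set), not "anything except tag2", which is why this sketch uses a negative lookahead instead. Regular expressions are fragile on real HTML, so this should be treated as a diagnostic aid only.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module StrayEndTagSketch ' hypothetical name, for illustration only
    ' Hypothetical helper: matches <outerTag ...> followed by a closing </innerTag>
    ' with no opening <innerTag ...> and no </outerTag> in between.
    Function BuildStrayEndTagPattern(ByVal outerTag As String, ByVal innerTag As String) As String
        ' Escape the names so they are treated as literal text inside the pattern.
        Dim outer As String = Regex.Escape(outerTag)
        Dim inner As String = Regex.Escape(innerTag)
        Return "<" & outer & "\b[^>]*>(?:(?!<" & inner & "\b|</" & outer & ">).)*?</" & inner & ">"
    End Function

    Sub Main()
        ' Two cells shortened from the test fragment: the first is well formed,
        ' the second has a closing </a> without an opening <a>.
        Dim html As String = _
            "<td width=""40%""><a href=""pal_06.xls"">Private Area Lighting</a> (xls)</td>" & Environment.NewLine & _
            "<td width=""29%"">1SC1L</a> (xls)</td></tr>"

        Dim rex As New Regex(BuildStrayEndTagPattern("td", "a"), _
                             RegexOptions.IgnoreCase Or RegexOptions.Singleline)

        ' Only the second cell should be reported.
        For Each m As Match In rex.Matches(html)
            Console.WriteLine(m.Value)
        Next
    End Sub
End Module
```

The tag names passed in would be the Groups(1) values taken from the exception message, without the surrounding quotes.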
Thank you for your help!
Nick Ruiz
Associate Integrator
PPLSolutions IT Billing and Transactions