Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chris Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Strip HTML from XL cell 3

Status
Not open for further replies.

RonRepp

Technical User
Feb 25, 2005
1,031
US
Hi all:

A client has given me imported csv files that have HTML markup in them. Can someone steer me to a good link on the subject? Or, better yet, is there a function to easily do this?

Dim HTML As String
HTML = "<b>b</b>"
HTML = Replace(HTML, "<b>", "")
MsgBox HTML


This works easily enough, but then you have to delete the original tag, too. To strip out any/all HTML markup coming through an imported csv file could be infinitesimal. I'm looking for a way to loop through the text quickly, much like I've seen algorithms for alphanumeric number replacements.

Any help would be greatly appreciated.

Ron

Ron Repp

If gray hair is a sign of wisdom, then I'm a genius.

My newest novel: Wooden Warriors
 
A starting point:
Code:
Function HTML2Text(HTML As String)
Dim Text As String, i As Long, x As String
For i = 1 To Len(HTML)
    x = Mid(HTML, i, 1)
    If x <> "<" Then
        Text = Text & x
    Else
        While Mid(HTML, i, 1) <> ">"
            i = i + 1
        Wend
    End If
Next
HTML2Text = Text
End Function

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
Thanks, PHV. It's been too long for me. Glad I can count on you guys to always help.

Ron Repp

If gray hair is a sign of wisdom, then I'm a genius.

My newest novel: Wooden Warriors
 
Just a thought, which I can test for the moment as I don't have Office on this machine, but ...

Couldn't you start an instance through VBA of VBScript and use RegEx. I'm sure I've done something similar in the past. It's just annoying that I can't cobble together an example at the moment.
 
Add a reference to Microsoft HTML Object Library (MSHTML.TLB).
The code:

Dim HTML As String
HTML = "<b>b</b>"
Dim htmlDoc As MSHTML.HTMLDocument
Set htmlDoc = New MSHTML.HTMLDocument
htmlDoc.body.innerHTML = HTML
MsgBox htmlDoc.body.innerText


combo
 
>Add a reference to Microsoft HTML Object Library

This was why I was wondering how complex the HTML was likely to be
 
Combo and Strongm:

I understand from where you're coming, but I have to paste the text from an XL cell (actually CSV) into a PowerPoint slide. Now, if there's a way to just transfer the HTML into the TextBox on PP, I'd love to see it. I searched all over and everyone says it cannot be done.

PHV's example did exactly what I wanted: got rid of all the formatting. I've been fudging w/taking the bold and then bolding the characters within PP. This works OK, but a lot of code for so few BOLD references. My client does use a lot of bulleted lists, however, and an example in this area would be very helpful.

Thanks...

Ron Repp

If gray hair is a sign of wisdom, then I'm a genius.

My newest novel: Wooden Warriors
 
Combo and StrongM:

The reference to the MSHTML.TLB example worked as well. Thanks.

What the client would like is to actually have the PP slide show the HTML in the textbox (like a Web page). Transferring the text there is easy, and I've been dynamically coding the characters w/in the textbox. i.e., if it's BOLD, then that ## of characters is bold. The bulleted lists are a bit more difficult for this old brain.

Thanks,


Ron Repp

If gray hair is a sign of wisdom, then I'm a genius.

My newest novel: Wooden Warriors
 
>got rid of all the formatting

But so does combos, and it'll get rid of far more complex tags.

But it seems to me you want the formatting - "I've been fudging w/taking the bold and then bolding the characters within PP" and "if there's a way to just transfer the HTML into the TextBox". Well, there is.

Basically the TextRange object has a PasteSpecial method that'll paste HTML from the clipboard. So the trick is how to get the HTML onto that clipboard , and we can do that (although it isn't straightforward, and makes up the bulk of the following code). Here's an example that assumes at least 1 slide with at least 1 shape on it. Should work on all versions of PPT from 2003 onwards.

Code:
[blue]Option Explicit

Private Declare Function _
   RegisterClipboardFormat _
   Lib "user32" _
   Alias "RegisterClipboardFormatA" _
   (ByVal lpString As String) As Long
   
Private Declare Function CloseClipboard Lib "user32" () As Long
Private Declare Function OpenClipboard Lib "user32" (ByVal hWnd As Long) As Long
Private Declare Function GlobalAlloc Lib "kernel32" (ByVal wFlags As Long, ByVal dwBytes As Long) As Long
Private Declare Function SetClipboardData Lib "user32" (ByVal wFormat As Long, ByVal hMem As Long) As Long
Private Declare Function EmptyClipboard Lib "user32" () As Long
Private Declare Function GlobalLock Lib "kernel32" (ByVal hMem As Long) As Long
Private Declare Function GlobalUnlock Lib "kernel32" (ByVal hMem As Long) As Long
Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" (pDest As Any, pSource As Any, ByVal cbLength As Long)
   
Dim m_cfHTMLClipFormat As Long
Const RegHtml As String = "HTML Format"
Private Const m_sDescription = _
                  "Version:1.0" & vbCrLf & _
                  "StartHTML:aaaaaaaaaa" & vbCrLf & _
                  "EndHTML:bbbbbbbbbb" & vbCrLf & _
                  "StartFragment:cccccccccc" & vbCrLf & _
                  "EndFragment:dddddddddd" & vbCrLf
                
Public Sub test()
    InitHTMLFormat
    PutHTMLClipboard "<B>Test</B> Spoon" ' Obviously your source would be contents of cells from sheet
    ActivePresentation.Slides(1).Shapes(1).TextFrame.TextRange.PasteSpecial ppPasteHTML
End Sub

Private Sub InitHTMLFormat()
    m_cfHTMLClipFormat = RegisterClipboardFormat(RegHtml)
End Sub

' Thanks to KB274326
Private Sub PutHTMLClipboard(sHtmlFragment As String, _
   Optional sContextStart As String = "<HTML><BODY>", _
   Optional sContextEnd As String = "</BODY></HTML>")
   
   Dim sData As String
   
   If m_cfHTMLClipFormat = 0 Then Exit Sub
   
   ' Build HTML fragment header required by clipboard (and consuming apps)
   ' Add the starting and ending tags for the HTML fragment
   sContextStart = sContextStart & "<!--StartFragment -->"
   sContextEnd = "<!--EndFragment -->" & sContextEnd
   
   'Build the HTML given the description, the fragment and the context.
   'And, replace the offset place holders in the description with values
   'for the offsets of StartHMTL, EndHTML, StartFragment and EndFragment.
   sData = m_sDescription & sContextStart & sHtmlFragment & sContextEnd
   sData = Replace(sData, "aaaaaaaaaa", _
                   Format(Len(m_sDescription), "0000000000"))
   sData = Replace(sData, "bbbbbbbbbb", Format(Len(sData), "0000000000"))
   sData = Replace(sData, "cccccccccc", Format(Len(m_sDescription & _
                   sContextStart), "0000000000"))
   sData = Replace(sData, "dddddddddd", Format(Len(m_sDescription & _
                   sContextStart & sHtmlFragment), "0000000000"))

   'Add the HTML code to the clipboard
   If CBool(OpenClipboard(0)) Then
   
      Dim hMemHandle As Long, lpData As Long
      
      hMemHandle = GlobalAlloc(0, Len(sData) + 10)
      
      If CBool(hMemHandle) Then
               
         lpData = GlobalLock(hMemHandle)
         If lpData <> 0 Then
            CopyMemory ByVal lpData, ByVal sData, Len(sData)
            GlobalUnlock hMemHandle
            EmptyClipboard
            SetClipboardData m_cfHTMLClipFormat, hMemHandle
         End If
      
      End If
      Call CloseClipboard
   End If
End Sub[/blue]
 
StongM:

Excellent! I knew I should have come up here before reading how it couldn't be done (for hours and numerous sites). I was even going to try adding an OLE obj to the slide and see if I could do that way.

Thanks so much,


Ron Repp

If gray hair is a sign of wisdom, then I'm a genius.

My newest novel: Wooden Warriors
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top