In any case, here's an easy way, using regular expressions, to strip out all the HTML from a string.
The stripHTML function (listed below) accepts a single string, the text whose HTML tags are to be stripped. The regular expression pattern <(.|\n)+?> matches a <, then one or more characters, then the nearest >. The (.|\n) alternation lets a tag span line breaks, because . on its own does not match a newline, and the lazy +? quantifier stops each match at the first > rather than the last. The Replace method of the regular expression object then replaces every match with an empty string (""). Finally, any remaining < and > signs are replaced with their HTML-encoded forms, &lt; and &gt;.
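To see what the pattern actually matches, here is a small sketch. The helper name listTagMatches and the sample string are hypothetical and used only for illustration; they are not part of the original function.

```vbscript
<%
' Hypothetical helper for illustration only:
' returns every substring the pattern matches, separated by " | "
Function listTagMatches(strHTML)
    Dim re, m, result
    Set re = New RegExp
    re.Global = True                 ' find every match, not just the first
    re.Pattern = "<(.|\n)+?>"        ' same pattern as described above
    result = ""
    For Each m In re.Execute(strHTML)
        If result <> "" Then result = result & " | "
        result = result & m.Value
    Next
    listTagMatches = result
    Set re = Nothing
End Function

' HTML-encode the result so the tags are visible in the browser
Response.Write Server.HTMLEncode(listTagMatches("<p>Hello <b>world</b></p>"))
' Displays: <p> | <b> | </b> | </p>
%>
```

Note that the pattern has no idea what a real tag is: it matches anything between a < and the next >. A string such as "if a < b and c > d" would lose the "< b and c >" part, so this approach suits simple input rather than arbitrary text.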
Something to consider: if you strip out the <BR> tags and you're re-displaying the string, it will all run together. The fix there would be to do a straight replace BEFORE sending the string to the stripHTML Function:
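<%
'Arguments 4-6 are start (1), count (-1 = all) and compare (vbTextCompare = ignore case)
TheString = Replace(TheString, "<BR>", "{}", 1, -1, vbTextCompare)
%>
This replaces all the break tags (in both upper and lowercase) with {}, an opening and a closing brace right next to each other. Then send TheString to the stripHTML function.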
Here is the stripHTML function itself:
<%
Function stripHTML(strHTML)
  'Strips the HTML tags from strHTML
  Dim objRegExp, strOutput
  Set objRegExp = New RegExp

  objRegExp.IgnoreCase = True
  objRegExp.Global = True          'Match every tag, not just the first
  objRegExp.Pattern = "<(.|\n)+?>"

  'Replace all HTML tag matches with the empty string
  strOutput = objRegExp.Replace(strHTML, "")

  'Replace all remaining < and > with their HTML-encoded forms
  strOutput = Replace(strOutput, "<", "&lt;")
  strOutput = Replace(strOutput, ">", "&gt;")

  stripHTML = strOutput    'Return the value of strOutput
  Set objRegExp = Nothing
End Function
%>
Once TheString has been through stripHTML, you have a string with all the HTML stripped out and all the break tags replaced with {}. Now do one last Replace to put the break tags back in:
<%
TheString = Replace(TheString, "{}", "<BR>")
%>
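The three steps (placeholder, strip, restore) can also be folded into a single helper. This is a minimal sketch, not the original code: the name stripHTMLKeepBreaks is hypothetical, and matching <br/> and <br /> with a regular expression is an added assumption that goes beyond the plain <BR> replace.

```vbscript
<%
' Hypothetical helper (an illustration, not part of the original code):
' strips all tags but keeps line breaks
Function stripHTMLKeepBreaks(strHTML)
    Dim re, s
    Set re = New RegExp
    re.IgnoreCase = True
    re.Global = True

    ' 1. Swap <br>, <BR>, <br/> and <br /> for the {} placeholder
    '    (handling the self-closing forms is an assumption added here)
    re.Pattern = "<br\s*/?>"
    s = re.Replace(strHTML, "{}")

    ' 2. Remove all remaining tags, then encode stray angle brackets
    re.Pattern = "<(.|\n)+?>"
    s = re.Replace(s, "")
    s = Replace(s, "<", "&lt;")
    s = Replace(s, ">", "&gt;")

    ' 3. Put the break tags back
    stripHTMLKeepBreaks = Replace(s, "{}", "<BR>")
    Set re = Nothing
End Function

Response.Write stripHTMLKeepBreaks("<p>Line one<br />Line <i>two</i></p>")
' Writes: Line one<BR>Line two
%>
```

One caveat with the placeholder approach: any literal {} already present in the input also comes back as <BR>, so choose a placeholder that cannot occur in your data.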
That's it!