Use Regex To Find Specific String Not In Html Tag
Solution 1:
This should do it:
(?<!<[^>]*)_mystring_
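It uses a negative lookbehind to check that the matched string is not preceded by a < that has no corresponding >. The lookbehind is variable-length, so this only works in regex engines that allow that (.NET does, for example).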
Solution 2:
When your regex processor doesn't support variable-length lookbehind, try this:
(<.+?>[^<>]*?)(_mystring_)([^<>]*?<.+?>)
Preserve capture groups 1 and 3 and replace capture group 2.
For example, in Eclipse, find:
(<.+?>[^<>]*?)(_mystring_)([^<>]*?<.+?>)
and replace with:
$1_newString_$3
(Other regex processors might use a different capture group syntax, such as \1.)
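As a rough illustration of the same find-and-replace outside Eclipse, here is a minimal Python sketch. The function name replace_between_tags and the sample inputs are made up for this example; the pattern is the one given above, and a callback is used instead of $1/$3 so the replacement text is inserted literally.

```python
import re

# Same pattern as above: group 1 = preceding tag plus text,
# group 2 = the search string, group 3 = text plus following tag.
PATTERN = re.compile(r"(<.+?>[^<>]*?)(_mystring_)([^<>]*?<.+?>)")


def replace_between_tags(html, replacement="_newString_"):
    """Replace _mystring_ only where it sits in text between two tags."""
    # Keep groups 1 and 3 untouched and swap out group 2.
    return PATTERN.sub(lambda m: m.group(1) + replacement + m.group(3), html)


if __name__ == "__main__":
    print(replace_between_tags('<a href="_mystring_">_mystring_</a>'))
```

Note that each match consumes the surrounding tags and text, so a single pass replaces at most one occurrence per run of text; rerunning the replace until nothing changes is one way around that.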
Solution 3:
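Another search regex that worked for me. The negative lookahead rejects a match when a > follows before any <, which means the string is inside a tag: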
(?![^<]*>)_mystring_
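This lookahead version needs no variable-length lookbehind, so it can be tried in most engines. A small Python sketch, assuming the standard re module; the function name replace_text_only and the sample strings are hypothetical.

```python
import re

# Negative lookahead: fail if, starting here, we can reach a ">" without
# first passing a "<" (that would mean we are inside a tag).
PATTERN = re.compile(r"(?![^<]*>)_mystring_")


def replace_text_only(html, replacement="_newString_"):
    """Replace _mystring_ where it appears outside of <...> tags."""
    # A callback keeps the replacement literal (no backslash escapes).
    return PATTERN.sub(lambda m: replacement, html)


if __name__ == "__main__":
    print(replace_text_only('<a href="_mystring_">_mystring_ here</a>'))
```

One thing to watch for: a literal > in the text itself (for example "_mystring_ > 3") also makes the lookahead fail, so that occurrence is skipped.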
Solution 4:
A quick and dirty alternative is to use a regex replace function with a callback to encode the contents of tags (everything between < and >), for example with base64, then run your search, then run a second callback to decode the tag contents.
This can also save a lot of head scratching when you need to exclude specific tags from a regex search: first obfuscate them and wrap them in a marker that won't match your search, then run your search, then deobfuscate whatever is inside the markers.
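The encode, search, decode round trip could look something like the following Python sketch. The marker characters (\x00 and \x01) and all function names are assumptions made for this example; anything that cannot occur in the input would do as a marker.

```python
import base64
import re

TAG = re.compile(r"<[^>]*>")
# Hypothetical markers: \x00 opens and \x01 closes an encoded tag.
MARKED = re.compile(r"\x00([A-Za-z0-9+/=]*)\x01")


def hide_tags(html):
    """Replace every <...> tag with a base64 payload wrapped in markers."""
    def encode(m):
        payload = base64.b64encode(m.group(0).encode("utf-8")).decode("ascii")
        return "\x00" + payload + "\x01"
    return TAG.sub(encode, html)


def restore_tags(text):
    """Decode every marked payload back into the original tag."""
    def decode(m):
        return base64.b64decode(m.group(1)).decode("utf-8")
    return MARKED.sub(decode, text)


def replace_hiding_tags(html, needle, replacement):
    """Run a plain search and replace while the tags are obfuscated."""
    # Caveat: a needle made only of base64 characters could still match
    # inside a payload; _mystring_ is safe because "_" is not in base64.
    hidden = hide_tags(html)
    return restore_tags(hidden.replace(needle, replacement))


if __name__ == "__main__":
    print(replace_hiding_tags('<a href="_mystring_">_mystring_</a>',
                              "_mystring_", "_newString_"))
```

To exclude only specific tags, the TAG pattern would be narrowed to those tags; the rest of the round trip stays the same.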
Solution 5:
Why use regex?
For XHTML, load it into XDocument / XmlDocument; for (non-X)HTML, the Html Agility Pack would seem a more sensible choice.
Either way, that will parse the HTML into a DOM so you can iterate over the nodes and inspect them.
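The answer above names .NET libraries; to show the same idea of inspecting parsed text nodes instead of raw markup, here is a sketch using Python's standard-library html.parser as a stand-in. It is event-based rather than a full DOM, and the class and function names are invented for this example.

```python
from html.parser import HTMLParser


class TextSearch(HTMLParser):
    """Collect text nodes that contain the search string."""

    def __init__(self, needle):
        super().__init__()
        self.needle = needle
        self.hits = []

    def handle_data(self, data):
        # Called only for text between tags, never for tag names
        # or attribute values.
        if self.needle in data:
            self.hits.append(data)


def find_in_text(html, needle):
    """Return the text nodes of html that contain needle."""
    parser = TextSearch(needle)
    parser.feed(html)
    parser.close()
    return parser.hits


if __name__ == "__main__":
    print(find_in_text('<a href="_mystring_">see _mystring_ here</a>',
                       "_mystring_"))
```

Because the parser decides what counts as a tag, attribute values never reach handle_data, so no lookaround tricks are needed.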