regex replace urls with anchors ONLY if not already in anchors

I've seen similar questions asked before, but none with a working solution.

I am trying to replace all urls on a page with anchor tags, but only those which aren't already within anchor tags.

So a bare URL in the text should be replaced with

<a href=""></a>

But only if it's not already within an anchor tag.

Any thoughts?



I think you need to do a two-pass operation. Split the source into

PART1 <a href=...>blah</a> PART2 <a href=...>blah</a> PART3...

Then replace URLs with <a href="url">url</a> in each of PART1, PART2, etc., leave the existing anchors untouched, and paste it all back together.

Doing it within a single regex is going to be a headache, if not impossible, depending on your dialect.
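To make the two-pass idea concrete, here is a minimal sketch in JavaScript. The function name linkifyOutsideAnchors and both patterns are my own assumptions for illustration, not something from the question. It relies on String.prototype.split keeping captured groups in the result, so existing anchors land at the odd indices.

```javascript
// Hypothetical helper: wrap bare URLs in anchors, skipping existing <a>...</a>.
function linkifyOutsideAnchors(html) {
  // Pass 1: split on whole anchor elements. Because the pattern is a
  // capturing group, the anchors themselves are kept at odd indices.
  var parts = html.split(/(<a\b[^>]*>[\s\S]*?<\/a>)/i);

  // Pass 2: rewrite URLs only in the pieces between anchors (even indices).
  return parts.map(function (part, i) {
    if (i % 2 === 1) {
      return part; // an existing anchor, leave it untouched
    }
    // Assumed URL pattern: http(s):// up to whitespace, a quote or an angle bracket.
    return part.replace(/https?:\/\/[^\s<>"']+/g, function (url) {
      return '<a href="' + url + '">' + url + '</a>';
    });
  }).join('');
}

console.log(linkifyOutsideAnchors(
  'go http://a.com or <a href="http://b.com">here</a> end'
));
```

This only protects URLs inside anchors. A URL in some other attribute, such as an img src, would still be wrapped, so it is probably only safe on markup you know is simple.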


For jobs like this, I normally recommend people do it with code rather than regex because regex gets really messy, really fast. However, if you do want a regex, here is a workable solution. Please go to the link to get a full understanding and view of test cases I used.


with replacement

<a href="http\1://\2">\2</a>

I do not promise that it is perfect, but it does handle a lot of cases. Let me know if there are any test cases it needs to be fixed for.
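For readers who want something to experiment with, here is a rough single-regex sketch in JavaScript. The pattern is my own assumption and is not the pattern from this answer. It only reuses the same replacement shape, where group 1 is the optional "s" and group 2 is the rest of the URL. Two negative lookaheads reject a URL that sits inside a tag (for example in an href value) or that is followed directly by a closing </a>.

```javascript
// Hypothetical pattern, for illustration only.
// (s?)            group 1: optional "s" of https
// ([^\s<>"']+)    group 2: the rest of the URL
// (?![^<>]*>)     not inside a tag such as <a href="...">
// (?![^<]*<\/a>)  not in text that runs straight into </a>
function linkifyWithRegex(html) {
  var pattern = /http(s?):\/\/([^\s<>"']+)(?![^<>]*>)(?![^<]*<\/a>)/gi;
  return html.replace(pattern, '<a href="http$1://$2">$2</a>');
}

console.log(linkifyWithRegex('see http://a.com now'));
console.log(linkifyWithRegex('<a href="https://b.com">https://b.com</a>'));
```

It has known gaps: a URL in anchor text that is followed by another tag before </a> is still rewritten, and a URL in plain text is skipped when a literal ">" follows it before the next "<".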


If you are doing it on the client side, it might be worth doing it by walking the document tree.

Look through the text nodes (nodeName == "#text"). If a node contains a substring starting with http or https and its parent tag is not A, replace that substring using a pattern such as <a href="\1">\1</a>.

Consider this as a starting point:

 // collect every leaf element whose content contains 'http' and which is not a link
 var textTags = [].slice.call(document.getElementsByTagName('*'))
         .filter(function (n) {
                return !n.children.length
                   && n.nodeName != 'A' && n.nodeName != 'INPUT'
                   && (n.innerHTML.indexOf('http') > -1); });

 for (var i = 0; i < textTags.length; i++) {
   // your code to replace links with whatever you want
 }
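A fuller sketch of the tree-walking approach, working on text nodes directly rather than on innerHTML. The names URL_PATTERN, splitOnUrls and linkifyTextNodes are hypothetical, and the URL pattern is an assumption. The string splitting is kept in a separate pure function so it can be tried outside a browser.

```javascript
// Assumed URL pattern: http:// or https:// up to whitespace or '<'.
var URL_PATTERN = /https?:\/\/[^\s<]+/g;

// Pure helper: split a string into plain-text pieces and URL pieces.
function splitOnUrls(text) {
  var pieces = [];
  var last = 0;
  var m;
  URL_PATTERN.lastIndex = 0; // reset the shared global regex before scanning
  while ((m = URL_PATTERN.exec(text)) !== null) {
    if (m.index > last) {
      pieces.push({ url: false, value: text.slice(last, m.index) });
    }
    pieces.push({ url: true, value: m[0] });
    last = m.index + m[0].length;
  }
  if (last < text.length) {
    pieces.push({ url: false, value: text.slice(last) });
  }
  return pieces;
}

// Browser only: wrap bare URLs found in text nodes under root,
// skipping text that already has an <a> ancestor.
function linkifyTextNodes(root) {
  var doc = root.ownerDocument || root;
  var walker = doc.createTreeWalker(root, NodeFilter.SHOW_TEXT);
  var targets = [];
  var node;
  // Collect first, mutate later, so the walker is not disturbed.
  while ((node = walker.nextNode())) {
    var parent = node.parentElement;
    if (parent && !parent.closest('a, script, style, textarea') &&
        node.nodeValue.indexOf('http') > -1) {
      targets.push(node);
    }
  }
  targets.forEach(function (textNode) {
    var frag = doc.createDocumentFragment();
    splitOnUrls(textNode.nodeValue).forEach(function (piece) {
      if (piece.url) {
        var a = doc.createElement('a');
        a.href = piece.value;
        a.textContent = piece.value;
        frag.appendChild(a);
      } else {
        frag.appendChild(doc.createTextNode(piece.value));
      }
    });
    textNode.parentNode.replaceChild(frag, textNode);
  });
}
```

In a page you would call linkifyTextNodes(document.body). Because new anchors are built with createElement and textContent, nothing is re-parsed as HTML, which avoids the escaping problems of editing innerHTML.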

