What's a fast, straightforward “find any of these strings in this text” for JavaScript?

What's the best method to find any of a list of substrings in a specific string?

This works, but can't be right.

var searchEngines = [
    new RegExp("www.google."),
    new RegExp("www.yahoo."),
    new RegExp("search.yahoo."),
    new RegExp("www.bing.")
  ];

function isSearchEngine(url){
  for (let i=0,len=searchEngines.length; i < len; i++){
    if (searchEngines[i].exec(url)) {
      return true;
    }
  }
  return false;
}

Anything to speed this up, really...

Answers:

Answer

Try a single regex using the | character for alternative values. Now instead of looping through an array, you can simply return a single regex test.

function isSearchEngine(url){
  return /www\.google\.|www\.yahoo\.|search\.yahoo\.|www\.bing\./i.test(url);
}

If your match strings are in an array, try something like this:

    function isSearchEngine2(url, array){
      var fullRegString = array.join("|");//add regex escape characters here if necessary
      return new RegExp(fullRegString).test(url);
    }

    //array of strings we want to match -- ideally add escape characters to these if necessary
    var searchEngines = [
      "www.google.",
      "www.yahoo.",
      "search.yahoo.",
      "www.bing."
    ];

    console.log(isSearchEngine2('www.google.com', searchEngines));//true -correct
    console.log(isSearchEngine2('abcdefg', searchEngines));//false - correct
    console.log(isSearchEngine2('wwwAgoogleAcom', searchEngines));//true -incorrect mis-match because of '.' matching all
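
The escaping step mentioned in the comments above might look like the following sketch. `escapeRegExp` and `buildMatcher` are hypothetical helper names, not part of the original answer, and the `i` flag is carried over from the first snippet. Note that an empty array would produce an empty pattern, which matches every string, so this assumes the list is non-empty.

```javascript
// Hypothetical helper: escape regex metacharacters so each entry matches literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build one case-insensitive alternation from literal strings.
// Assumes `strings` is non-empty (an empty pattern would match everything).
function buildMatcher(strings) {
  return new RegExp(strings.map(escapeRegExp).join('|'), 'i');
}

const searchEngineList = ['www.google.', 'www.yahoo.', 'search.yahoo.', 'www.bing.'];
const searchEnginePattern = buildMatcher(searchEngineList);

console.log(searchEnginePattern.test('www.google.com')); // true
console.log(searchEnginePattern.test('abcdefg'));        // false
console.log(searchEnginePattern.test('wwwAgoogleAcom')); // false, "." now matches only a dot
```

Building the pattern once and reusing it avoids recompiling the regex on every call.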

Answer

Here is something a little more generic. This will return the string you pass in if it is found in the string you are searching against.

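function findIn (str, here) {
    // indexOf returns -1 when str is absent; check that explicitly,
    // because slicing from -1 can return unrelated characters.
    const location = here.indexOf(str);
    if (location !== -1) {
        return here.slice(location, location + str.length);
    } else {
        return `Sorry but I cannot find ${str}`;
    }
}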

/** examples
console.log(findIn('hoo', "www.yahoo.com/news/some-archive/2103547001450"));

console.log(findIn('www', "www.yahoo.com/news/some-archive/2103547001450"));

console.log(findIn('news', "www.yahoo.com/news/some-archive/2103547001450"));

console.log(findIn('arch', "www.yahoo.com/news/some-archive/2103547001450"));
*/
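
To tie this back to the original question (any of several substrings), the same idea extends to a list. A minimal sketch: `findFirstIn` is a hypothetical name, and returning `null` on a miss is an assumption chosen so callers can tell a miss from a match. It returns the first entry in list order that occurs in the text, not the entry that occurs earliest in the text.

```javascript
// Hypothetical helper: return the first needle (in array order) contained
// in haystack, or null when none of them occur.
function findFirstIn(needles, haystack) {
  const match = needles.find(needle => haystack.includes(needle));
  return match === undefined ? null : match;
}

console.log(findFirstIn(['www.bing.', 'www.yahoo.'], 'www.yahoo.com/news')); // "www.yahoo."
console.log(findFirstIn(['www.bing.'], 'example.org'));                      // null
```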

Answer

Are you expecting a simple true/false from the URL, or are you expecting to find multiple searchEngines in one string? I assume it's the former, as URLs don't really contain multiple addresses.

Generally, String.indexOf() has the best performance for matching characters. Here's a benchmark I did a while back on various string parsing methods. The benchmark is set up to test whether multiple words are all present rather than a single one, so RegExp.test() takes the cake there, but its performance suffers badly when the result is false. String.indexOf() was by far the most reliable for true/false matches and easily the most performant when testing one string for one single value (I don't have a benchmark for that, sorry).

However, you're doing this in a loop to test for multiple things. As you can see on the benchmark, RegExp.test() is the most performant on successes. If we can assume most of the URLs you're passing to the function contain one of those strings, I would recommend using that:

var searchEngines = [
    "www.google.",
    "www.yahoo.",
    "search.yahoo.",
    "www.bing."
  ];

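// Build the pattern once instead of on every call. The 'g' flag is dropped:
// it is unnecessary for a yes/no test and makes test() stateful on a reused regex.
// Dots are escaped so "." matches only a literal dot.
var searchEngineRegex = new RegExp(
  searchEngines.map(s => s.replace(/\./g, '\\.')).join('|'), 'i');

function isSearchEngine(url){
  return searchEngineRegex.test(url); // returns true/false
}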
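
For comparison, the plain-substring approach discussed above needs no regex and no escaping. A minimal sketch using `Array.prototype.some` with `String.prototype.indexOf`: the names `engineSubstrings` and `isSearchEngineByIndexOf` are hypothetical, and the lowercasing step (to mimic the `i` flag) is an assumption. Which variant is faster for your data is worth measuring rather than assuming.

```javascript
// Same list as above, under a different (hypothetical) name to avoid clashes.
const engineSubstrings = ['www.google.', 'www.yahoo.', 'search.yahoo.', 'www.bing.'];

// Returns true as soon as any substring is found; some() stops at the first hit.
function isSearchEngineByIndexOf(url) {
  const lowered = url.toLowerCase(); // assumption: matching should ignore case
  return engineSubstrings.some(s => lowered.indexOf(s) !== -1);
}

console.log(isSearchEngineByIndexOf('http://WWW.Google.com/search?q=x')); // true
console.log(isSearchEngineByIndexOf('http://example.org/'));              // false
console.log(isSearchEngineByIndexOf('wwwAgoogleAcom'));                   // false
```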
