Optimize regular expression for filtering thousands of HTML select options

Background

I developed a jQuery-based shuttle widget for HTML select elements because I could not find one that was minimally codified and offered a regular expression filter that compensated for diacritics.

Problem

When a few thousand entries are added to the select, the regular expression filter slows to a crawl. You can see the problem as follows:

  1. Browse to: http://jsfiddle.net/U8Xre/2/
  2. Click the input field in the result panel.
  3. Type any regular expression (e.g., ^a.*ai).

Code

I believe the culprit is lurking here:

var options = $src.empty().scrollTop( 0 ).data( "options" );
var search = $.trim( $input.val() );
var regex = new RegExp( search, 'gi' );
var len = options.length;
var $html = $(document.createElement( 'option' ));
for( var i = 0; i < len; i++ ) {
  var o = options[ i ];
  if( o.text.dediacritics().match( regex ) !== null ) {
    $src.append( $html.clone().text( o.text ).val( o.value ) );
  }
}
$src.css( 'width', $input.width() + 4 );

Where $src is the source $('#select') and String.prototype.dediacritics is defined as in the fiddle. The code above runs for every single keypress. There is one more relevant snippet:

// Create a copy of the source options to use when matching the regex.
var $options = [];
$src.find( "option" ).each( function() {
  $options.push( { value: $(this).val(), text: $(this).text() } );
});
$src.data( "options", $options );

This makes a copy of the options from the source list, but only runs once. (This results in a duplication bug when shuttling options, but adding the above code into the input event handler would slow the filter even more.)

Question

How can the code be made to perform regular expression filtering on lists up to 5,000 words in nearly real-time?

Thank you!

Answers:

Answer

I'd guess the harder job is the repeated call to dediacritics() (with its many regex-replaces) than doing the search (though I haven't done any profiling). So, you should cache these de-diacriticified strings and search only through them. Btw, test is usually faster than match.

Also, you should avoid as many DOM operations as possible - you have a lot of them when emptying and re-appending the whole options list onkeypress.

// once:
var options = [],
    src = $src[0]; // or whatever to get the DOM element
$.each( src.options, function() {
    options.push( { el: this, text: $(this).text().dediacritics(), hidden:false } );
});
// you might put it on the element via .data(), but need not

// onkeypress:
var regex = new RegExp( $.trim($input.val()), 'i' );
var curEl = src.firstChild;
for (var i=0; i<options.length; i++) {
    var option = options[i];
    if (regex.test( option.text )) {
        if (option.hidden)
            src.insertBefore(option.el, curEl);
        curEl = option.el.nextSibling;
        option.hidden = false;
    } else {
        if (!option.hidden) {
            curEl = option.el.nextSibling;
            src.removeChild(option.el);
        }
        option.hidden = true;
    }
}

Demo: This is blazingly fast ("realtime"), but you can feel the time it needs to construct the options array when calling dediacritics() 5000 times.

Answer

I suggest you to

  • create a multi-line string containing a list of all option names, each in a separate line
  • apply regex to this multi-line string to filter its content by removing non-matching lines
  • update html with matching lines as options of select element
Answer

A minor comment, if you aren't using the result of the regex match, then you should use a regex test:

  if( o.text.dediacritics().match( regex ) !== null ) {

use a test:

  if( regex.test(o.text.dediacritics()) ) {

Tags

Recent Questions

Top Questions

Home Tags Terms of Service Privacy Policy DMCA Contact Us

©2020 All rights reserved.