XPath or querySelector?

XPath can do everything querySelector can do, and more, so when would you ever choose the latter? I haven't seen any speed benchmarks comparing the two, so right now I'm choosing based on syntax conciseness, which seems kind of arbitrary.

Edit: I probably should have stated that I'm writing Greasemonkey scripts for Firefox, so I'm not worried about cross-browser compatibility, and would rather not include any libraries.

Answers:

Answer

What browser are you using? In Safari (or the iPhone), querySelector and querySelectorAll are much faster than XPath. IE doesn't support XPath at all, and IE6 and IE7 don't support querySelector. The fastest cross-browser selector engine is Sizzle, created by John Resig. Sizzle is also the main selector engine used in jQuery. It uses querySelector where appropriate and normal DOM methods where querySelector is unavailable.
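The "use querySelector where available, plain DOM methods otherwise" idea can be sketched roughly as below. This is not Sizzle's actual code: `selectAll` and `toArray` are hypothetical names, and the fallback branch is assumed to handle only a bare tag name.

```javascript
// Hypothetical helper (not Sizzle's implementation): copy an array-like
// node list into a real array using a plain loop, which also works in
// very old browsers.
function toArray(list) {
  var out = [];
  for (var i = 0; i < list.length; i++) {
    out.push(list[i]);
  }
  return out;
}

// Prefer the native selector API; fall back to getElementsByTagName
// when querySelectorAll is missing.
function selectAll(root, selector) {
  if (typeof root.querySelectorAll === "function") {
    // Native path: the browser parses and runs the CSS selector itself.
    return toArray(root.querySelectorAll(selector));
  }
  // Fallback path: this sketch only understands a bare tag name such as "a".
  if (/^[a-z][a-z0-9]*$/i.test(selector)) {
    return toArray(root.getElementsByTagName(selector));
  }
  throw new Error("Selector needs querySelectorAll: " + selector);
}
```

A real selector engine covers far more selector shapes in its fallback path; the point here is only the feature test that picks the branch.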

Answer

In terms of functionality, your best bet will be to use a library that includes a selector engine, and many of them (e.g. MooTools, Dojo, Prototype) are already using XPath internally to execute some classes of queries. You should be able to count on a good library choosing the fastest method for you.

XPath might be able to do everything that querySelector can do (I think this statement is a bit suspect, but that is beside the point), but querySelector and querySelectorAll are not supported by all browsers, so really we should be comparing XPath to native DOM querying methods (i.e. getElementsByTagName, getElementById, querySelector, standard traversal and filtering methods, etc.).

Using native DOM filtering methods requires knowledge of browser quirks and limitations, and quickly becomes impractical for complex querying unless you use a library (e.g. jQuery or MooTools) to iron out the inconsistencies. The reason that native DOM techniques (whether through a proxy like jQuery, or custom implementations) are often chosen over XPath is that they offer much more flexibility than XPath. For example, if you want to filter for checked inputs, "hidden" elements or disabled inputs, XPath comes up short, but jQuery gives you :checked, :hidden and :disabled pseudo-classes.

Answer

The main reason to prefer querySelector is that you already know CSS selectors but haven't learned XPath yet. Beyond that, XPath syntax can be more complex even for simple queries, so if you don't need the power XPath provides, it might be easier to use CSS selectors instead.

You should be aware of two things:

  • id selectors don't work with querySelector when used on pure XML (or aren't reliable at least)
  • querySelector only works with selectors that the browser currently supports, so if it doesn't support some CSS3 selectors you can't use those.
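The second point shows up at runtime as an exception: when the browser cannot parse a selector, querySelector throws a syntax error instead of returning null. A minimal sketch of guarding against that is below. `tryQuery` is a hypothetical name, and treating an unsupported selector as "no match" is an assumption of this sketch.

```javascript
// Hypothetical wrapper: return null instead of throwing when the
// browser does not understand the selector.
function tryQuery(root, selector) {
  try {
    return root.querySelector(selector);
  } catch (e) {
    // Unparseable selectors raise a DOMException named "SyntaxError"
    // (legacy numeric code 12, SYNTAX_ERR).
    if (e && (e.name === "SyntaxError" || e.code === 12)) {
      return null;
    }
    throw e; // anything else is a real bug, so let it surface
  }
}
```

In a user script this could be called as `tryQuery(document, "li:nth-of-type(2n)")`, with the caller deciding what to do when null comes back.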

Answer

CSS syntax is awesome for two reasons:

  • It is an order of magnitude faster and less resource intensive than the more complex XPath.
  • When what you want to find can be found with a CSS selector, a corresponding XPath query doing the same would most of the time be much longer and harder to read.

Case in point: take this CSS selector: h1.header > a[rel~="author"]

Its shortest functional XPath equivalent would be //h1[contains(concat(" ",normalize-space(@class)," ")," header ")]/a[contains(concat(" ",normalize-space(@rel)," ")," author ")]

…which is both much harder to read and write.

If you wrote this XPath instead: //h1[@class="header"]/a[@rel="author"]

…you would incorrectly have missed markup like <h1 class="article header"><a rel="author external" href="/mike">...</a></h1>
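To make the comparison concrete, here is a small sketch that builds the token-matching XPath from a helper and runs both queries side by side. `hasToken` and `findAuthorLinks` are hypothetical names. Note that XPath 1.0 joins strings with concat(), since its + operator is numeric addition. The helper assumes the attribute name and token contain no quotes or spaces.

```javascript
// Hypothetical helper: XPath 1.0 predicate matching one whitespace-separated
// token in an attribute, like CSS .name or [attr~="name"].
function hasToken(attr, token) {
  return 'contains(concat(" ", normalize-space(@' + attr + '), " "), " ' + token + ' ")';
}

// XPath counterpart of the CSS selector h1.header > a[rel~="author"]
var cssSelector = 'h1.header > a[rel~="author"]';
var xpath = "//h1[" + hasToken("class", "header") + "]/a[" + hasToken("rel", "author") + "]";

// Browser-only part: run both queries against a document and report
// how many nodes each one finds.
function findAuthorLinks(doc) {
  var viaCss = doc.querySelectorAll(cssSelector);
  var viaXPath = doc.evaluate(
    xpath, doc, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
  return { css: viaCss.length, xpath: viaXPath.snapshotLength };
}
```

On a page containing the markup above, calling `findAuthorLinks(document)` should report the same count for both queries, whereas the naive @class="header" version would miss that link.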

When you really need XPath, though, it's the only option, unless you want to walk around the DOM manually with code (which gets hideous fast).

Personally (and I am one of the maintainers of Greasemonkey), I use the very tiny on.js library for all my node slicing needs. It gives me a combination of both XPath (for when I need that) and CSS (which I tend to use almost all the time). I like it mostly because it lets me move all the code that digs up the parts of a page I need to digest into the script header, so my code gets served all the stuff it needs and can be all about actually doing fun or great things to web pages.

Web browsers are very heavily optimized for running JavaScript really fast, and I would recommend using whatever makes you most efficient and happy as a developer over what makes the browser run the least amount of code. One of the side benefits of on.js in particular, though, is that it automatically keeps a script from running at all on pages where the nodes you thought were around turn out not to be, instead of letting it destroy the page.
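The "don't run unless the nodes are there" idea can be approximated without any library. The sketch below is not on.js's API: `runIfPresent` is a hypothetical helper that takes named lookup functions and calls the script body only if every lookup finds something.

```javascript
// Hypothetical helper (not on.js): run `action` only when every named
// lookup returns a node; otherwise do nothing and report false.
function runIfPresent(lookups, action) {
  var found = {};
  for (var name in lookups) {
    if (!Object.prototype.hasOwnProperty.call(lookups, name)) continue;
    var node = lookups[name]();
    if (node === null || node === undefined) {
      return false; // a required node is missing: leave the page untouched
    }
    found[name] = node;
  }
  action(found);
  return true;
}

// Example use in a user script (browser only, selectors are placeholders):
// runIfPresent({
//   title: function () { return document.querySelector("h1.header"); },
//   author: function () { return document.querySelector('a[rel~="author"]'); }
// }, function (nodes) {
//   nodes.title.appendChild(nodes.author.cloneNode(true));
// });
```

Keeping the lookups together at the top of the script also gives a single place to see which parts of the page the script depends on.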
