Multiple hash signs in URL

Or maybe you call it "sharp" - the # symbol.

I've come across one instance where #! and # are used simultaneously in a single URL. From reading other articles, including the RFC, I can't tell whether that is a legal combination or not. When it encounters such a page, the Mozilla browser (Iceweasel in this case) displays the URL with two #'s, while Chrome displays only one but dies shortly afterwards (the tab containing the page becomes unresponsive and crashes, though that may be unrelated).

Now, my question is: is it legal to have both in one URL, is it legal but redundant (so it should be normalized), or is it just a bug in the Mozilla browser? And suppose I'm making an AJAX request or navigating the browser history: what should I do if I encounter this situation?

RFC 3986, which should clarify this, just in case: http://tools.ietf.org/html/rfc3986#section-3.4

Also, how Google's crawlers see things: https://developers.google.com/webmasters/ajax-crawling/docs/specification

Answers:

Answer

The format for a fragment only allows slashes, question marks, and pchars. If you look up the RFC, you'll see that the hash mark is not a valid pchar.

However, browsers will try their best to read invalid URLs by treating repeated hashes as though they were escaped, as you can see by checking the value of window.location.hash (in IE, Firefox, and Chrome) for

http://www.example.com/hey#foo#bar

which is the same as window.location.hash for

http://www.example.com/hey#foo%23bar
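
As a rough illustration outside the browser, the sketch below uses Python's standard urllib.parse, which is an assumption of this example and not something the answer relies on. It shows a lenient parser cutting at the first "#" and keeping the second one inside the fragment, so the two URLs above only match once the escaped form is percent-decoded. It says nothing about what a particular browser reports in window.location.hash.

```python
from urllib.parse import urlsplit, unquote

# The two URLs from the answer above.
raw = urlsplit("http://www.example.com/hey#foo#bar")
escaped = urlsplit("http://www.example.com/hey#foo%23bar")

# urlsplit cuts at the FIRST "#"; everything after it is the fragment.
print(raw.path)          # /hey
print(raw.fragment)      # foo#bar
print(escaped.fragment)  # foo%23bar

# After percent-decoding, the escaped form carries the same fragment text.
print(unquote(escaped.fragment) == raw.fragment)  # True
```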

Answer

It may be legal, as @apsillers mentioned, but I would avoid it unless necessary, since it makes the URL confusing.

This kind of URL:

http://www.example.com/hey#foo#bar

seems really confusing to me, and will be even more confusing to regular users and possibly search engines.

Answer

My answer is a clear no, at least when referring to RFC 3986, but you have to look at more than just section 3.4.

Section 3 defines the structure of a URI as follows:

     foo://example.com:8042/over/there?name=ferret#nose
     \_/   \______________/\_________/ \_________/ \__/
      |           |            |            |        |
   scheme     authority       path        query   fragment

(I only took the upper part of the diagram, which is the part relevant for URLs.)

So, to answer your question, you have to look at all the parts:

  • The scheme may not contain a hash sign (only ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )).
  • The authority may not contain a hash (I won't go into detail here) and is even 'terminated by the next slash ("/"), question mark ("?"), or number sign ("#")'.
  • The path 'consists of a sequence of path segments separated by a slash ("/") character.' The path segments in turn can only consist of pchars, see e.g. this answer. So no hashes here! The path is also terminated 'by the first question mark ("?") or number sign ("#"), or by the end of the URI'.
  • The query part (indicated by the first "?") may only consist of pchar, "/" or "?" and is 'terminated by a number sign ("#") character or by the end of the URI.'

So, no hashes are allowed so far except for terminating the URI, which is not what we want if we would like to use at least one hash ;-)

Finally:

  • The fragment is 'indicated by the presence of a number sign ("#")' and also consists only of pchar, "/" or "?". It is 'terminated by the end of the URI'.
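
The fragment rule can be checked mechanically. The following is a minimal sketch, assuming the RFC 3986 definitions of pchar (unreserved characters, percent-encoded triplets, sub-delims, ":" and "@") and fragment = *( pchar / "/" / "?" ). The helper name is_valid_fragment is made up for this example.

```python
import re

# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"  (RFC 3986)
_PCHAR = r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})"

# fragment = *( pchar / "/" / "?" )
_FRAGMENT = re.compile(rf"(?:{_PCHAR}|[/?])*")


def is_valid_fragment(fragment: str) -> bool:
    """Return True if the text after the first "#" matches the fragment grammar."""
    return _FRAGMENT.fullmatch(fragment) is not None


print(is_valid_fragment("foo"))        # True
print(is_valid_fragment("foo#bar"))    # False: "#" is not a pchar
print(is_valid_fragment("foo%23bar"))  # True: the hash is percent-encoded
```

Note that a hashbang fragment such as "!/users/42" passes this check, because "!" is a sub-delim. It is only a second literal "#" that breaks the grammar.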

To sum up: only one "#" is allowed in a compliant URL (or URI), as the marker for the URL fragment. Hash signs that are supposed to be part of the path (at least from the looks of it, as there are slashes afterwards) are especially problematic, because they officially terminate the path part.
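
To see where the path officially ends, here is a small sketch that uses the component-splitting regular expression from RFC 3986, Appendix B. The function name split_uri and the example URL are hypothetical and chosen only for illustration.

```python
import re

# Regular expression from RFC 3986, Appendix B, for splitting a URI reference.
URI_PARTS = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri(uri: str) -> dict:
    """Split a URI into its five components; absent parts are None."""
    m = URI_PARTS.match(uri)  # every group is optional, so this always matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


# Hypothetical SPA-style URL with a "path" and "query" placed after the hash.
parts = split_uri("http://example.com/app#/users/42?tab=info#details")
print(parts["path"])      # /app
print(parts["query"])     # None
print(parts["fragment"])  # /users/42?tab=info#details
```

The first "#" ends the path, so the slashes, the "?" and the second "#" all end up inside the fragment, where the second "#" is not valid.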

This can cause problems, e.g. in single-page applications, where such URLs are used because the navigation after the hash is done on the client side, not on the server. In this case, the SPA should make sure that it correctly handles the rest of the URL when it receives it, which can include a query and fragment that may have been URL-encoded (depending on the browser).
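
One way such handling could look is sketched below. Everything here is an assumption made for illustration: the function name parse_hash_route, the optional "!" prefix, and the decision to treat an escaped "%23" as an inner "#" that some browser encoded. A real application would need to pick its own rules.

```python
from urllib.parse import parse_qs, unquote


def parse_hash_route(fragment: str):
    """Split the text after the first "#" into (route path, query dict, inner anchor)."""
    # Drop the "!" of a "#!" (hashbang) prefix, if present.
    rest = fragment[1:] if fragment.startswith("!") else fragment
    # Assumption: "%23" stands for an inner "#" that the browser escaped.
    rest = rest.replace("%23", "#")
    # Mirror normal URL parsing: cut off the inner fragment first, then the query.
    rest, _, anchor = rest.partition("#")
    path, _, query = rest.partition("?")
    return unquote(path), parse_qs(query), unquote(anchor)


# The literal and the escaped form of the inner "#" give the same result.
print(parse_hash_route("!/users/42?tab=info#details"))
print(parse_hash_route("!/users/42?tab=info%23details"))
# both print: ('/users/42', {'tab': ['info']}, 'details')
```

Normalizing "%23" before splitting keeps both browser behaviours on one code path, at the cost of no longer being able to carry a literal "#" inside the route's query values.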
