How to extract base URL from a string in JavaScript?

I'm trying to find a relatively easy and reliable method to extract the base URL from a string variable using JavaScript (or jQuery).

For example, given something like:

http://www.sitename.com/article/2009/09/14/this-is-an-article/

I'd like to get:

http://www.sitename.com/

Is a regular expression the best bet? If so, what statement could I use to assign the base URL extracted from a given string to a new variable?

I've done some searching on this, but everything I find in the JavaScript world seems to revolve around gathering this information from the actual document URL using location.host or similar.

Answers:

Answer

Edit: Some have complained that this doesn't take the protocol into account, so I've updated the code, since it is marked as the accepted answer. For those who prefer one-liners: sorry, but that is what code minifiers are for. Code should be human-readable, and in my opinion this way is better.

var pathArray = "https://somedomain.com".split( '/' );
var protocol = pathArray[0];
var host = pathArray[2];
var url = protocol + '//' + host;

Or use David's solution below.
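The split approach above can be wrapped in a small helper. This is only a minimal sketch: the name baseUrlFromSplit is hypothetical, and it assumes the input is an absolute URL of the form scheme://host/..., returning null for anything else.

```javascript
// Hypothetical helper: rebuilds "scheme://host" from the pieces of split('/').
// Assumes an absolute URL such as "https://somedomain.com/path".
function baseUrlFromSplit(str) {
  var pathArray = str.split('/');
  var protocol = pathArray[0]; // e.g. "https:"
  var host = pathArray[2];     // e.g. "somedomain.com" (may include a port)

  // Bail out if the string is not in the expected "scheme://host" shape.
  var looksLikeScheme = /^[A-Za-z][A-Za-z0-9+.-]*:$/.test(protocol);
  if (!looksLikeScheme || pathArray[1] !== '' || !host) {
    return null;
  }
  return protocol + '//' + host;
}

console.log(baseUrlFromSplit('http://www.sitename.com/article/2009/09/14/this-is-an-article/'));
```

Returning null instead of a half-built string makes it easier for the caller to notice inputs that have no scheme.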

Answer

WebKit-based browsers, Firefox as of version 21 and current versions of Internet Explorer (IE 10 and 11) implement location.origin.

location.origin includes the protocol, the domain and optionally the port of the URL.

For example, location.origin of the URL http://www.sitename.com/article/2009/09/14/this-is-an-article/ is http://www.sitename.com.

To target browsers without support for location.origin use the following concise polyfill:

if (typeof location.origin === 'undefined')
    location.origin = location.protocol + '//' + location.host;
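The same fallback can be sketched as a function that takes any location-like object, which makes it possible to try outside a browser. The names originOf and fakeLocation are hypothetical; in a page you would pass window.location instead.

```javascript
// Hypothetical helper: returns loc.origin when present, otherwise
// builds it from protocol and host (host already includes the port, if any).
function originOf(loc) {
  if (typeof loc.origin !== 'undefined') {
    return loc.origin;
  }
  return loc.protocol + '//' + loc.host;
}

// Assumed stand-in for window.location, for illustration only.
var fakeLocation = { protocol: 'http:', host: 'www.sitename.com:8080' };
console.log(originOf(fakeLocation)); // "http://www.sitename.com:8080"
```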
Answer

You don't need jQuery. For the current page, just use

location.hostname
Answer

There is no reason to split a string to get the path, hostname, etc. from a link. You just need to use an anchor element:

//create a new element link with your link
var a = document.createElement("a");
a.href="http://www.sitename.com/article/2009/09/14/this-is-an-article/";

//hide it from view when it is added
a.style.display="none";

//add it
document.body.appendChild(a);

//read the link's "features"
alert(a.protocol);
alert(a.hostname);
alert(a.pathname);
alert(a.port);
alert(a.hash);

//remove it
document.body.removeChild(a);

You can also do this easily with jQuery by appending the element and reading its properties.

Answer
var host = location.protocol + '//' + location.host + '/';
Answer

A lightweight but complete approach to getting basic values from a string representation of a URL is Douglas Crockford's regexp rule:

var yourUrl = "http://www.sitename.com/article/2009/09/14/this-is-an-article/";
var parse_url = /^(?:([A-Za-z]+):)?(\/{0,3})([0-9.\-A-Za-z]+)(?::(\d+))?(?:\/([^?#]*))?(?:\?([^#]*))?(?:#(.*))?$/;
var parts = parse_url.exec( yourUrl );
var result = parts[1]+':'+parts[2]+parts[3]+'/' ;
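The snippet above assumes the match succeeds and that a scheme is present, and it leaves out the port. A slightly more defensive sketch, using the same pattern, could look like this; the helper name baseFromParts is hypothetical.

```javascript
var parse_url = /^(?:([A-Za-z]+):)?(\/{0,3})([0-9.\-A-Za-z]+)(?::(\d+))?(?:\/([^?#]*))?(?:\?([^#]*))?(?:#(.*))?$/;

// Hypothetical helper: returns scheme + slashes + host (+ port) + "/",
// or null when the pattern does not match at all.
function baseFromParts(str) {
  var parts = parse_url.exec(str);
  if (!parts) {
    return null;
  }
  var scheme = parts[1] ? parts[1] + ':' : ''; // group 1: scheme, may be absent
  var port = parts[4] ? ':' + parts[4] : '';   // group 4: port, may be absent
  return scheme + parts[2] + parts[3] + port + '/';
}

console.log(baseFromParts('http://www.sitename.com:8080/a/b')); // "http://www.sitename.com:8080/"
```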

If you are looking for a more powerful URL manipulation toolkit, try URI.js. It supports getters, setters, URL normalization, etc., all with a nice chainable API.

If you are looking for a jQuery plugin, then jquery.url.js should help you.

A simpler way to do it is by using an anchor element, as @epascarello suggested. This has the disadvantage that you have to create a DOM element. However, the element can be cached in a closure and reused for multiple URLs:

var parseUrl = (function () {
  var a = document.createElement('a');
  return function (url) {
    a.href = url;
    return {
      host: a.host,
      hostname: a.hostname,
      pathname: a.pathname,
      port: a.port,
      protocol: a.protocol,
      search: a.search,
      hash: a.hash
    };
  }
})();

Use it like so:

parseUrl('http://google.com');
Answer

The URL API object avoids splitting and reassembling URLs manually.

 let url = new URL('https://stackoverflow.com/questions/1420881');
 alert(url.origin);
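Because the URL constructor throws a TypeError when the string is not an absolute URL, it can help to wrap the call. A minimal sketch, with a hypothetical helper name originOrNull:

```javascript
// Hypothetical helper: returns the origin of an absolute URL string,
// or null when the string cannot be parsed as one.
function originOrNull(str) {
  try {
    return new URL(str).origin;
  } catch (e) {
    return null; // new URL() throws for relative or malformed input
  }
}

console.log(originOrNull('http://www.sitename.com/article/2009/09/14/this-is-an-article/')); // "http://www.sitename.com"
console.log(originOrNull('/my/path')); // null
```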
Answer

If you are extracting information from window.location.href (the address bar), then use this code to get http://www.sitename.com/:

var loc = location;
var url = loc.protocol + "//" + loc.host + "/";

If you have a string, str, that is an arbitrary URL (not window.location.href), then use regular expressions:

var url = str.match(/^(([a-z]+:)?(\/\/)?[^\/]+\/).*$/)[1];

I, like everyone in the Universe, hate reading regular expressions, so I'll break it down in English:

  • Find one or more lowercase letters followed by a colon (the protocol, which can be omitted)
  • Followed by // (can also be omitted)
  • Followed by any characters except / (the hostname and port)
  • Followed by /
  • Followed by whatever (the path, less the beginning /).

No need to create DOM elements or do anything crazy.
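Note that the pattern above requires a / after the host, and that calling [1] on a failed match throws. The following is a sketch of a looser variant, not the original pattern: it stops at the first /, ? or # and appends the trailing slash itself. The name baseUrlByRegex is hypothetical.

```javascript
// Hypothetical helper using a looser variant of the regex described above.
function baseUrlByRegex(str) {
  // optional scheme, optional "//", then everything up to the next "/", "?" or "#"
  var match = /^((?:[a-z][a-z0-9+.-]*:)?(?:\/\/)?[^\/?#]+)/i.exec(str);
  return match ? match[1] + '/' : null;
}

console.log(baseUrlByRegex('http://www.sitename.com/article/2009/09/14/this-is-an-article/')); // "http://www.sitename.com/"
console.log(baseUrlByRegex('http://www.sitename.com')); // "http://www.sitename.com/"
```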

Answer

I use a simple regex that extracts the host from the URL:

function get_host(url){
    return url.replace(/^((\w+:)?\/\/[^\/]+\/?).*$/,'$1');
}

and use it like this:

var url = 'http://www.sitename.com/article/2009/09/14/this-is-an-article/';
var host = get_host(url);

Note: if the URL has no / after the host, the result will not end with a / either.

Here are some tests:

describe('get_host', function(){
    it('should return the host', function(){
        var url = 'http://www.sitename.com/article/2009/09/14/this-is-an-article/';
        assert.equal(get_host(url),'http://www.sitename.com/');
    });
    it('should not have a / if the url has no /', function(){
        var url = 'http://www.sitename.com';
        assert.equal(get_host(url),'http://www.sitename.com');
    });
    it('should deal with https', function(){
        var url = 'https://www.sitename.com/article/2009/09/14/this-is-an-article/';
        assert.equal(get_host(url),'https://www.sitename.com/');
    });
    it('should deal with no protocol urls', function(){
        var url = '//www.sitename.com/article/2009/09/14/this-is-an-article/';
        assert.equal(get_host(url),'//www.sitename.com/');
    });
    it('should deal with ports', function(){
        var url = 'http://www.sitename.com:8080/article/2009/09/14/this-is-an-article/';
        assert.equal(get_host(url),'http://www.sitename.com:8080/');
    });
    it('should deal with localhost', function(){
        var url = 'http://localhost/article/2009/09/14/this-is-an-article/';
        assert.equal(get_host(url),'http://localhost/');
    });
    it('should deal with numeric ip', function(){
        var url = 'http://192.168.18.1/article/2009/09/14/this-is-an-article/';
        assert.equal(get_host(url),'http://192.168.18.1/');
    });
});
Answer

To get the origin of any URL, including paths within a website (/my/path), scheme-less URLs (//example.com/my/path), or full URLs (http://example.com/my/path), I put together a quick function.

In the snippet below, all three calls should log https://stacksnippets.net.

function getOrigin(url)
{
  if(/^\/\//.test(url))
  { // no scheme, use current scheme, extract domain
    url = window.location.protocol + url;
  }
  else if(/^\//.test(url))
  { // just path, use whole origin
    url = window.location.origin + url;
  }
  return url.match(/^([^/]+\/\/[^/]+)/)[0];
}

console.log(getOrigin('https://stacksnippets.net/my/path'));
console.log(getOrigin('//stacksnippets.net/my/path'));
console.log(getOrigin('/my/path'));
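The same three cases can also be handled by the URL constructor's optional second argument, which resolves relative and scheme-less inputs against a base. In this sketch the helper name getOriginWithBase and the base value are assumptions; in a page the base would typically be window.location.href.

```javascript
// Hypothetical helper: resolves url against base, then returns the origin.
function getOriginWithBase(url, base) {
  return new URL(url, base).origin;
}

// Assumed stand-in for window.location.href, for illustration only.
var base = 'https://stacksnippets.net/js';

console.log(getOriginWithBase('https://stacksnippets.net/my/path', base)); // "https://stacksnippets.net"
console.log(getOriginWithBase('//example.com/my/path', base));             // "https://example.com"
console.log(getOriginWithBase('/my/path', base));                          // "https://stacksnippets.net"
```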

Answer

This works for me:

var getBaseUrl = function (url) {
  if (url) {
    var parts = url.split('://');
    
    if (parts.length > 1) {
      return parts[0] + '://' + parts[1].split('/')[0] + '/';
    } else {
      return parts[0].split('/')[0] + '/';
    }
  }
};

Answer

A good way is to use the native JavaScript URL API. It provides many useful URL parts.

For example:

const url = 'https://stackoverflow.com/questions/1420881/how-to-extract-base-url-from-a-string-in-javascript'

const urlObject = new URL(url);

console.log(urlObject);


// RESULT: 
//________________________________
hash: "",
host: "stackoverflow.com",
hostname: "stackoverflow.com",
href: "https://stackoverflow.com/questions/1420881/how-to-extract-base-url-from-a-string-in-javascript",
origin: "https://stackoverflow.com",
password: "",
pathname: "/questions/1420881/how-to-extract-base-url-from-a-string-in-javascript",
port: "",
protocol: "https:",
search: "",
searchParams: [object URLSearchParams]
... + some other methods

As you can see here you can just access whatever you need.

For example: console.log(urlObject.host); // "stackoverflow.com"

doc for URL

Answer
String.prototype.url = function() {
  const a = $('<a />').attr('href', this)[0];
  // or, if you are not using jQuery:
  // const a = document.createElement('a'); a.setAttribute('href', this);
  let origin = a.protocol + '//' + a.hostname;
  if (a.port.length > 0) {
    origin = `${origin}:${a.port}`;
  }
  const {host, hostname, pathname, port, protocol, search, hash} = a;
  return {origin, host, hostname, pathname, port, protocol, search, hash};

}

Then :

'http://mysite:5050/pke45#23'.url()
 //OUTPUT : {host: "mysite:5050", hostname: "mysite", pathname: "/pke45", port: "5050", protocol: "http:",hash:"#23",origin:"http://mysite:5050"}

For your request, you need :

 'http://mysite:5050/pke45#23'.url().origin

Review 07-2017: It can also be written more elegantly, with more features:

const parseUrl = (string, prop) =>  {
  const a = document.createElement('a'); 
  a.setAttribute('href', string);
  const {host, hostname, pathname, port, protocol, search, hash} = a;
  const origin = `${protocol}//${hostname}${port.length ? `:${port}`:''}`;
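  // look the requested part up by name instead of calling eval on caller input
  const parts = {origin, host, hostname, pathname, port, protocol, search, hash};
  return prop ? parts[prop] : parts;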
}

Then

parseUrl('http://mysite:5050/pke45#23')
// {origin: "http://mysite:5050", host: "mysite:5050", hostname: "mysite", pathname: "/pke45", port: "5050"…}


parseUrl('http://mysite:5050/pke45#23', 'origin')
// "http://mysite:5050"

Cool!
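The same idea can be sketched without a DOM element by using the built-in URL object, with a plain property lookup to select a single part. This is an illustrative variant, not the code above; the name parseUrlParts is hypothetical, and it assumes the input is an absolute URL.

```javascript
// Hypothetical variant based on the URL API instead of an anchor element.
const parseUrlParts = (string, prop) => {
  const { origin, host, hostname, pathname, port, protocol, search, hash } = new URL(string);
  const parts = { origin, host, hostname, pathname, port, protocol, search, hash };
  // Return one named part when prop is given, otherwise the whole object.
  return prop ? parts[prop] : parts;
};

console.log(parseUrlParts('http://mysite:5050/pke45#23', 'origin')); // "http://mysite:5050"
console.log(parseUrlParts('http://mysite:5050/pke45#23').pathname);  // "/pke45"
```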

Answer

If you're using jQuery, this is a neat way to work with elements in JavaScript without adding them to the DOM:

var myAnchor = $("<a />");

//set href    
myAnchor.attr('href', 'http://example.com/path/to/myfile');

//your link's features (DOM properties, so read them with .prop() in jQuery 1.6+)
var hostname = myAnchor.prop('hostname'); // example.com
var pathname = myAnchor.prop('pathname'); // /path/to/myfile
//...etc
Answer

You can use the code below to get different parts of the current URL:

alert("document.URL : "+document.URL);
alert("document.location.href : "+document.location.href);
alert("document.location.origin : "+document.location.origin);
alert("document.location.hostname : "+document.location.hostname);
alert("document.location.host : "+document.location.host);
alert("document.location.pathname : "+document.location.pathname);
Answer
function getBaseURL() {
    var url = location.href;  // entire url including querystring - also: window.location.href;
    var baseURL = url.substring(0, url.indexOf('/', 14));


    if (baseURL.indexOf('http://localhost') != -1) {
        // Base Url for localhost
        var url = location.href;  // window.location.href;
        var pathname = location.pathname;  // window.location.pathname;
        var index1 = url.indexOf(pathname);
        var index2 = url.indexOf("/", index1 + 1);
        var baseLocalUrl = url.substr(0, index2);

        return baseLocalUrl + "/";
    }
    else {
        // Root Url for domain name
        return baseURL + "/";
    }

}

You then can use it like this...

var str = 'http://en.wikipedia.org/wiki/Knopf?q=1&t=2';
var url = str.toUrl();

The value of url will be...

{
"original":"http://en.wikipedia.org/wiki/Knopf?q=1&t=2",
"protocol":"http:",
"domain":"wikipedia.org",
"host":"en.wikipedia.org",
"relativePath":"wiki"
}

The url object also has two methods.

var paramQ = url.getParameter('q');

In this case the value of paramQ will be 1.

var allParameters = url.getParameters();

The value of allParameters will be the parameter names only.

["q","t"]

Tested on IE, Chrome and Firefox.
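For comparison, similar parameter lookups are available from the built-in URL and URLSearchParams objects. This sketch does not reproduce the toUrl helper used above; it only shows roughly equivalent calls, and note that values come back as strings.

```javascript
var str = 'http://en.wikipedia.org/wiki/Knopf?q=1&t=2';
var parsed = new URL(str);

// Single parameter by name (a string, or null when absent)
var paramQ = parsed.searchParams.get('q');

// All parameter names, in order of appearance
var allParameters = Array.from(parsed.searchParams.keys());

console.log(parsed.host);   // "en.wikipedia.org"
console.log(paramQ);        // "1"
console.log(allParameters); // [ 'q', 't' ]
```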

Answer

Instead of having to account for window.location.protocol and window.location.origin, and possibly missing a specified port number, etc., just grab everything up to the 3rd "/":

// get nth occurrence of a character c in the calling string
String.prototype.nthIndex = function (n, c) {
    var index = -1;
    while (n-- > 0) {
        index++;
        if (this.substring(index) == "") return -1; // don't run off the end
        index += this.substring(index).indexOf(c);
    }
    return index;
}

// get the base URL of the current page by taking everything up to the third "/" in the URL
function getBaseURL() {
    return document.URL.substring(0, document.URL.nthIndex(3,"/") + 1);
}
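The same "up to the third slash" idea can be sketched without extending String.prototype, and with an arbitrary URL string rather than document.URL. The names nthIndexOf and baseUrlUpToThirdSlash are hypothetical, and the sketch assumes an absolute URL of the form scheme://host/...

```javascript
// Hypothetical helper: index of the nth occurrence of ch in str, or -1.
function nthIndexOf(str, ch, n) {
  var index = -1;
  for (var i = 0; i < n; i++) {
    index = str.indexOf(ch, index + 1);
    if (index === -1) return -1; // fewer than n occurrences
  }
  return index;
}

// Everything up to and including the third "/"; if there is no third "/",
// the whole string is treated as the base and a "/" is appended.
function baseUrlUpToThirdSlash(url) {
  var end = nthIndexOf(url, '/', 3);
  return end === -1 ? url + '/' : url.substring(0, end + 1);
}

console.log(baseUrlUpToThirdSlash('http://www.sitename.com:8080/article/2009/')); // "http://www.sitename.com:8080/"
```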
Answer

This works:

location.href.split(location.pathname)[0];
Answer

You can do it using a regex:

/(http:\/\/)?(www)[^\/]+\//i

Does that fit?

Answer
var tilllastbackslashregex = new RegExp(/^.*\//);
baseUrl = tilllastbackslashregex.exec(window.location.href);

window.location.href gives the current URL from the browser's address bar.

It can be anything, such as https://stackoverflow.com/abc/xyz or https://www.google.com/search?q=abc. tilllastbackslashregex.exec() runs the regex and returns a match array whose first element is the string up to the last forward slash, i.e. https://stackoverflow.com/abc/ or https://www.google.com/ respectively.
