Must-Know URL Hash Techniques for AJAX Applications

By coding the page state into the URL, even single-page web applications can support deep bookmarks and the browser’s back button. The most widely accepted approach is to use the location hash, i.e. the local part of the URL after the “#”. This article explains the technique and the pitfalls you should be aware of, based on my team’s experience from building an AJAX interface for Solr. You will also learn about the HTML5 History API, a second, more modern technique.

Modern web applications often load data via AJAX without leaving the original page; some sites consist of only a single page. The idea was pioneered by Outlook Web Access and, more prominently for Web users, Google Mail in order to create a desktop-like experience. Other benefits include less bandwidth used by the client, faster response times and ultimately a more interactive feel.

After the first enthusiasm about all these new applications, users found the first serious drawbacks hitting them hard. The beloved back button was not working anymore, bookmarks always led to the first and not the current page and links could not be forwarded.

All these problems have been known for a long time. New techniques now offer elegant solutions, which I will discuss in this article.

From Stateless Web Sites to Single-Page Web Applications

In the beginning, the web was stateless, reflecting the stateless nature of the HTTP protocol. Most web sites were purely informational and content-driven, such as newspapers. New content was requested by navigating to URLs, and a whole new page with a new URL was shown.

When the first applications like online shops were created, a state or session would have proved immensely useful; otherwise all URLs had to be dynamic. The web was still young and dynamic, so cookies were invented for this and have become immensely popular since then. The request-response cycle was unaffected; the HTTP requests were still stateless.

The situation changed again when more and more Javascript was used. Javascript could now change the rendering of the local page in ways known only to the browser. By leveraging the XMLHttpRequest object, it soon became possible to get information from the server and incorporate it into the document object model (DOM) in the browser (AJAX). A client-side action can trigger a request to the server, which brings the displayed URL totally out of sync with the state of the web application. This is still the case in most web applications today!

Coding the Application State into the URL

Some clever developers have found out that changes in the URL (also via Javascript) will be interpreted as new pages by the browser. This immediately enables the back button. And if done correctly, the URL once again uniquely designates the current page even if that has been initially created by a single request but modified by a multitude of subsequent AJAX requests.

A URL that is in sync with the web page state does not come for free, though. For this to work correctly, some things have to be taken into account:

  1. The URL must be changeable without reloading the page.
  2. Each and every action that modifies the current page must trigger a URL change.
  3. When the AJAX-changed page is opened using the URL (from a bookmark), the URL must be interpreted to “replay” the changes which are recorded in it.
  4. When the back button is hit, Javascript must detect this.

As an example, consider the scenario in the figure below, where a bookmarked page “/url#2” is called from a browser. Only the part before the “#” (i.e. “/url”) is sent to the server, so this URL is requested. The Javascript code in the browser needs to interpret the local part (i.e. “2”) by itself and determine which actions to take. In our case, this triggers a subsequent AJAX request. Note that the URL in the browser then does not have to be changed again.

Another scenario is shown in the following figure. Here, a user hits the back button. The hashchange event fires and the Javascript code in the browser must decide how to get the appropriate content, in this case also via an AJAX request like in the previous figure.
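Both scenarios end in the same step: reading the local part and deciding what to load. The following is a minimal sketch, assuming the hash holds nothing but a page identifier as in “/url#2”. The names pageFromHash and loadPage are hypothetical, and the AJAX request itself is left out:

```javascript
// Extract the page identifier from a hash such as "#2".
// Returns null when the hash is empty, i.e. the plain "/url" was opened.
function pageFromHash(hash) {
  const id = hash.replace(/^#/, "");
  return id === "" ? null : id;
}

// Hypothetical placeholder: a real application would start its AJAX request here.
function loadPage(id) {
  console.log("would load page", id);
}

// Browser-only wiring; skipped when the script runs outside a browser.
if (typeof window !== "undefined") {
  const restore = function () {
    const id = pageFromHash(window.location.hash);
    if (id !== null) loadPage(id);
  };
  window.addEventListener("load", restore);       // bookmark or direct link
  window.addEventListener("hashchange", restore); // back/forward button
}
```

The load listener is needed because the hashchange event does not fire for the hash that is already present when the page is first opened.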

Unique URLs (AJAX Patterns) has some great details about this technique.

Changing the Local Part of the URL

Although it might sound easy, even the first point in the list above turns out to be complicated. Of course, Javascript can modify the URL in the browser using location.href, but the browser will then load the page from the modified URL. This is obviously not the desired result.

Clever programmers soon found a solution by only modifying the local part of the URL which then does not trigger a page reload. The local part is defined to be everything after the first hash “#” in the URL. You can directly modify this local part by using the location.hash property. The article “Using Javascript’s Location Object to Work with URLs” is a great introduction.
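How the state is written into the local part is up to the application. As a sketch under the assumption that the state is a flat object of simple values, it could be serialized as key/value pairs. The function names stateToHash, hashToState and rememberState are made up for this illustration:

```javascript
// Serialize a flat state object into a hash string.
// Keys are sorted so that equal states always produce the same hash.
function stateToHash(state) {
  const params = new URLSearchParams();
  for (const key of Object.keys(state).sort()) {
    params.set(key, String(state[key]));
  }
  return "#" + params.toString();
}

// Inverse operation. Note that all values come back as strings.
function hashToState(hash) {
  const state = {};
  for (const [key, value] of new URLSearchParams(hash.replace(/^#/, ""))) {
    state[key] = value;
  }
  return state;
}

// Browser-only: assigning location.hash changes the URL without a page reload.
function rememberState(state) {
  window.location.hash = stateToHash(state);
}

// Example: stateToHash({ q: "solr", page: 2 }) returns "#page=2&q=solr"
```

Sorting the keys is a design choice: it keeps the hash stable for identical states, so a repeated action does not produce a new URL.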

This technique has become very popular. A lot of sites use it; the most famous one is probably Twitter. If you open the Tweets page of somebody, you will see that the URL ends in something like “#!/myachinghead”. Effectively, this means that there is only a single web page (apart from “about” etc.) and everything else is loaded via AJAX.

Links on the page to other “pages” are all intercepted via Javascript and trigger changes in the local part of the URL. For example with Twitter, see what happens if you click on other users but also on the tabs on the tweets page. The URL will change in a well-defined way which is suitable for bookmarks and interpreting all actions which have been performed.

By using this technique, bookmarking is easy. The URL structure of Twitter is fortunately quite simple: the local part is interpreted as the username and the corresponding tweets are loaded asynchronously via AJAX. The same is true for subpages like “Favorites”, “Following” etc.
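The link interception described above can be sketched roughly as follows. This is not Twitter’s actual code; the helper names pathToHashbang and hashbangSegments are hypothetical, and the sketch assumes that internal links use root-relative paths such as “/myachinghead/favorites”:

```javascript
// Convert a link target such as "/myachinghead/favorites"
// into the hash form "#!/myachinghead/favorites".
function pathToHashbang(path) {
  return "#!" + (path.startsWith("/") ? path : "/" + path);
}

// Split "#!/myachinghead/favorites" into its segments.
// Returns an empty array for hashes that do not use the "#!/" prefix.
function hashbangSegments(hash) {
  if (!hash.startsWith("#!/")) return [];
  return hash.slice(3).split("/").filter(Boolean);
}

// Browser-only wiring: intercept clicks on links whose href starts with "/"
// and change only the hash instead of letting the browser load a new page.
if (typeof document !== "undefined") {
  document.addEventListener("click", function (event) {
    const link = event.target.closest("a[href^='/']");
    if (!link) return;
    event.preventDefault();
    window.location.hash = pathToHashbang(link.getAttribute("href"));
  });
}
```

A hashchange handler could then treat the first segment as the username and the second as the subpage.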

Detecting Changes in the Local Part of a URL

As the technique has become quite popular, browsers now provide a dedicated event for changes to the local part of the URL, called hashchange. You can easily catch this event by using something like:

window.onhashchange = function () { /* interpret location.hash and update the page */ };

Unfortunately, older browsers have no easy interface for listening to hash changes. An alternative is to use setInterval in order to check periodically for a changed hash.
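A polling fallback could look roughly like the sketch below. It is written in older Javascript syntax on purpose, since it targets older browsers. The name createHashPoller and the interval of 100 ms are arbitrary choices for this illustration:

```javascript
// Create a checker that compares the current hash with the last one seen
// and calls onChange(newHash, oldHash) when they differ.
// getHash is passed in so that the logic can also run without a browser.
function createHashPoller(getHash, onChange) {
  var lastHash = getHash();
  return function check() {
    var current = getHash();
    if (current !== lastHash) {
      var previous = lastHash;
      lastHash = current;
      onChange(current, previous);
    }
  };
}

// Browser-only wiring: prefer the native event, fall back to polling.
if (typeof window !== "undefined") {
  var handle = function () { /* interpret window.location.hash here */ };
  if ("onhashchange" in window) {
    window.onhashchange = handle;
  } else {
    var poll = createHashPoller(function () { return window.location.hash; }, handle);
    setInterval(poll, 100); // the interval length is an arbitrary choice
  }
}
```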

Frameworks like jQuery have plugins for hash handling, e.g. jQuery BBQ. These plugins degrade gracefully in older browsers and thus keep the differences between the browser APIs out of your code.

The following figure shows the workflow when changing hashes: clicking an active element must only change the hash (i.e. serialize the current application state into it). Immediately after that, the browser fires a hashchange event, which the Javascript code detects. The local part of the URL must then be interpreted, which in this case generates an AJAX request to the server. Note that if the element is clicked again and the serialized data is unchanged, no event is triggered (as the hash has not changed), so the application is optimized as a side effect.

Twitter is also an excellent example of what problems can arise using this solution:

  • Search for the Twitter page of somebody using Google (or Bing, it doesn’t matter). Click the result and see how the URL changes (the “#!/” is inserted). The reason is that search engines will never index pages which only differ in the local part of the URL. For search engines to work properly, some tricks like sitemap.xml, rewrite rules etc. have to be used. See our blog series about SEO to get more information. Also, Google provides an interesting document “Making AJAX Applications Crawlable”.
  • Open the Tweets page of a user, click “about” (the company link) and then use the back button. The result is not what you expect! The back button works fine when jumping between tweets of users, though.

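Using a real URL, and not just one that differs in the local part, offers many exciting possibilities. This becomes possible with the pushState method of the HTML5 History API, which is described in the section “Using the HTML5 History API” below.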

Example 1: AJAX-ifying a conventional page-based website

For example, if you convert a normal page-based (i.e. request-response based) website into an AJAX-based one, you can keep your current URL structure but set the URLs on the client side by using pushState. The associated AJAX call creates a request to the server. You can (and should!) use the same URL as for the whole page and detect on the server side whether a request is an AJAX request. AJAX requests are then rendered as e.g. JSON (which can be stored as the state object in pushState), whereas normal requests are rendered as complete HTML.

Following this approach, your server-side logic needs no modification; only the rendering must be changed. Bookmarking still works in the usual way, as the bookmarked page will be requested as a full HTML page even if the application has become a single-page application under the covers. This technique has been used successfully by e.g. GitHub for quite some time. It is so seamless that you might not even have noticed it.
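The server-side decision can be sketched as below. This is only an illustration under several assumptions: the server logic is written in Javascript, the client marks its AJAX requests with the header “X-Requested-With: XMLHttpRequest” (jQuery does this by default for same-origin requests), and header names arrive in lower case. The function names and the shape of the data object are hypothetical:

```javascript
// Escape text before placing it into HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Decide whether a request came from the page's own AJAX code.
// Assumption: the client sets the X-Requested-With header.
function isAjaxRequest(headers) {
  return String(headers["x-requested-with"] || "").toLowerCase() === "xmlhttprequest";
}

// Render the same data either as JSON (AJAX) or as a complete HTML page.
// The data object with title and text is a made-up example.
function renderResponse(headers, data) {
  if (isAjaxRequest(headers)) {
    return { contentType: "application/json", body: JSON.stringify(data) };
  }
  const html =
    "<!DOCTYPE html><html><head><title>" + escapeHtml(data.title) +
    "</title></head><body><p>" + escapeHtml(data.text) + "</p></body></html>";
  return { contentType: "text/html", body: html };
}
```

Because the same URL now returns two different representations, caches between client and server need to be told which request header the response depends on, for instance with a Vary response header.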

Example 2: Creating an HTML search interface

In a real-world project, we have created a large-scale search application using Apache Solr as a backend. The API is exposed via Tomcat, which intelligently distributes search requests among the cluster, and we created a REST interface with JSON as data transport.

Putting an AJAX-enabled Web application on top of that proved to be quite easy. JSON data can be used directly in Javascript. Each search creates an AJAX request to the server and updates the results table. Trouble hit us as soon as the first users were testing the application. They were used to working with the back button which didn’t work as expected. A solution had to be found.

As legacy browsers still had to be supported, we chose to implement a URL hash. As jQuery was already in use in the project, we went with jQuery BBQ. The application was changed to “serialize” its state into the URL. URL changes then raise the hashchange event, which in turn triggers the search. We got some nice add-ons for free:

  • The first search starts automatically.
  • Modifying the URL directly also creates a search.
  • Hitting search again without changing the parameters does not change the hash and no event is raised, i.e. no search is performed.
  • Reload works out of the box.
  • Users can (again) use bookmarks e.g. for wrong results and check back later.
  • Users can forward URLs to their colleagues for cross-checks.
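The core of this mechanism, turning the hash into a search request, can be illustrated without jQuery BBQ. The sketch below is not the project’s code: the endpoint “/search” and the parameter names q, start and rows are assumptions made for the example.

```javascript
// Translate the hash into the URL of the AJAX search request.
// Missing parameters fall back to defaults; "*:*" is Solr's match-all query.
function searchUrlFromHash(hash, endpoint) {
  const fromHash = new URLSearchParams(hash.replace(/^#/, ""));
  const query = new URLSearchParams();
  query.set("q", fromHash.get("q") || "*:*");
  query.set("start", fromHash.get("start") || "0");
  query.set("rows", fromHash.get("rows") || "10");
  return endpoint + "?" + query.toString();
}

// Browser-only wiring; skipped when the script runs outside a browser.
if (typeof window !== "undefined") {
  const runSearch = function () {
    const url = searchUrlFromHash(window.location.hash, "/search");
    // Hypothetical: request `url` via AJAX and fill the results table.
    console.log("would request", url);
  };
  window.addEventListener("hashchange", runSearch);
  runSearch(); // first search on page load; also covers bookmarks and reload
}
```

Since the search is driven only by the hash, editing the URL by hand, reloading the page and opening a forwarded link all take the same code path.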

After this change had been successfully rolled out, the customers were much happier. The application got a more desktop-like feel without sacrificing the convenience of the Web’s ubiquitous back button and bookmarks.

Solutions for more complex Web Applications

The discussed solution works fine for small websites or dialog-driven applications. As soon as the application gets more complex, new approaches will prove more efficient.

There are quite a few client-side frameworks, each with its individual strengths. The more popular ones are Backbone.js and Knockout.js. For their differences, see the discussion “Knockout.js vs Backbone.js (vs ?)”. Depending on the requirements, other paradigms like SproutCore, GWT or Vaadin should also be considered.

Using the HTML5 History API

Because this functionality is already so important and will gain even more relevance in the future, HTML5 provides a different, more elegant solution than messing with URLs and hashchange events. This solution is the History API, a dedicated interface designed solely for forward/backward navigation in the browser. For this post, two parts of it are mainly of interest:

  1. If a URL has to change, the method history.pushState(data, title, url) can be called. Besides the new URL (url), which replaces the current URL without reloading the page, it takes a data object and a title parameter. The title is more or less irrelevant, whereas the data parameter can be used to encapsulate the state of the page. If the back button is pressed, this state can be retrieved and used to render the page correctly. Using this state therefore makes it possible to handle navigation in a more fine-grained way than on a pure URL basis (it can also be viewed as a local session storage which is context-sensitive to each navigational step in the page). Interesting reads on this topic are: “location.hash is dead. Long live HTML5 pushState!”, “Degradable JavaScript Applications Using HTML5 pushState”, and the jQuery plugins BBQ and History.js with their fallback capabilities to the old hashchange functionality.
  2. The History API also defines an event handler on the window object called onpopstate. You should define a function and assign it to window.onpopstate; it is then responsible for reconstructing the original page. You have access to the URL of the desired page (via location) and to the state object (via the event’s state property). Both can be used for rendering the page, e.g. by performing AJAX requests or by using the pushed variables.
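The interplay of the two can be sketched as follows. Note that pushState takes the state object first, then the title, then the URL. The helper createNavigator and the render callback are hypothetical; the history object is passed in so that the logic can also be tried outside a browser:

```javascript
// Build a small navigation helper around a History-like object
// (window.history in a browser). render(state, url) is a hypothetical
// callback that updates the page.
function createNavigator(historyApi, render) {
  return {
    // Forward navigation: record the new state and URL, then render.
    // pushState itself does not fire a popstate event, so render is called here.
    go: function (state, url) {
      historyApi.pushState(state, "", url);
      render(state, url);
    },
    // Back/forward navigation: the browser hands back the stored state.
    // event.state is null for entries that were not created via pushState.
    handlePopState: function (event, url) {
      render(event.state, url);
    }
  };
}

// Browser-only wiring; skipped when the script runs outside a browser.
if (typeof window !== "undefined") {
  const nav = createNavigator(window.history, function (state, url) {
    console.log("would render", url, state);
  });
  window.onpopstate = function (event) {
    nav.handlePopState(event, window.location.pathname);
  };
}
```

The render callback can either use the stored state directly or issue an AJAX request for the given URL, as described in the list above.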

The two figures above show the workflow when using the HTML5 History API. The upper figure describes forward navigation and includes pushing the new state and the AJAX request to the server. The lower figure shows that the URL changes and a popstate event fires when the back button is pressed. The application can either use local state storage or (in our case) use an AJAX call to update the page.