How do redirects work with websites?

Depending on the type of redirect you use (there are five main types, plus a sixth based on the DNS system), what the visitor sees and how the redirect behaves will vary, as each one works at a different stage of the HTTP and DNS protocols.

To start off with, we'll cover how the HTTP protocol works (and its interaction with the DNS protocol) on a normal request (without any form of redirect).

A Standard Connection

For this standard connection we're going to take you through the steps of a request to view example.com/test.html.

Once the URL has been entered into the address bar and the user clicks on the Go button (or presses Enter), the browser, via the OS, will need to make a DNS lookup for the domain name to find out which computer it needs to contact to get the page.

The browser will ask the Operating System to get the IP address for example.com (also known as the A record). However, the OS cannot do this by itself - it needs to ask a full DNS server to perform the lookup (which is usually your ISP's or network's DNS servers).

Within a few milliseconds, the ISP's DNS servers will return the record (if it's available) to your computer, which will then pass it on to the browser.
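
As a rough illustration of the lookup step described above, the following Python sketch asks the operating system's resolver for the IPv4 addresses of a name. The function name resolve_ipv4 is our own and not part of any browser; it only shows the kind of call an application makes when it hands the lookup to the OS.

```python
import socket


def resolve_ipv4(hostname: str) -> list[str]:
    """Ask the operating system's resolver for the IPv4 addresses of hostname."""
    # getaddrinfo hands the lookup to the OS, which in turn queries the
    # configured DNS servers (literal IP addresses are answered directly).
    results = socket.getaddrinfo(
        hostname, 80, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each result is (family, type, proto, canonname, (address, port)).
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in results:
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses
```

Calling resolve_ipv4("example.com") on a machine with working DNS would return whatever addresses the domain currently has, which may well differ from the address used in this article.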

In this case, example.com resides on the server with the IP address 192.0.34.166, so the browser will open a connection to that address on port 80 (or port 443 if you're connecting via HTTPS) so that it can make a request.
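
The choice of port can be sketched in a few lines of Python. This is only an illustration under our own naming (connection_target and DEFAULT_PORTS are made up for the example): an explicit port in the URL is used if present, otherwise the default for the scheme.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes discussed in this article.
DEFAULT_PORTS = {"http": 80, "https": 443}


def connection_target(url: str) -> tuple[str, int]:
    """Return the (hostname, port) a browser would connect to for url."""
    parts = urlsplit(url)
    # An explicit port in the URL wins; otherwise fall back to the scheme default.
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port
```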

Before the remote server can send us anything, we first need to send it a request. This is a few lines of information sent to the server from the browser asking it for a particular page, as well as defining what version of browser is making the request, on which site the request should be processed and more. For example, a full HTTP request looks like:

GET /test.html HTTP/1.1
Host: example.com
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.8.1b2) Gecko/20060901 Firefox/2.0b2
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive

Here, the first two lines are the most important (the GET line and the Host header). The GET line tells the server to get the page /test.html using the HTTP protocol (version 1.1), while the Host header states that /test.html is on the example.com domain.
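
To show how little is strictly needed, here is a minimal Python sketch that builds such a request by hand. The function name build_request is our own, and the request is cut down to the two essential lines plus a Connection header; a real browser sends many more headers, as shown above.

```python
def build_request(host: str, path: str) -> bytes:
    """Build a minimal HTTP/1.1 GET request for path on host."""
    lines = [
        f"GET {path} HTTP/1.1",  # method, path and protocol version
        f"Host: {host}",         # which site on the server we want
        "Connection: close",     # ask the server to close the connection afterwards
    ]
    # Header lines end with CRLF; a blank line marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
```

The bytes returned by build_request("example.com", "/test.html") are what would be written to the connection once it has been opened.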

The reason for the Host header is that many sites can exist on one IP address using virtual hosts within the web server. When the browser makes a connection to the server, it does so to the IP address, not the domain name. So, the server needs a way to tell which site the request is for.

By sending the domain name as part of the request, the internet can save on the number of IP addresses it uses by combining many sites onto a single address.

The server will then look in its configuration, find out where the files for example.com are located, add /test.html onto the end and then return the response.
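
The virtual host lookup can be pictured with the following Python sketch. It is not how Apache is implemented; the table VIRTUAL_HOSTS, the directory names and the function locate_file are all hypothetical, and a real server would normally fall back to a default site rather than give up on an unknown Host.

```python
import posixpath
from typing import Optional

# Hypothetical virtual host table: Host header value -> document root.
VIRTUAL_HOSTS = {
    "example.com": "/var/www/example.com",
    "example.org": "/var/www/example.org",
}


def locate_file(host: str, path: str) -> Optional[str]:
    """Map a Host header and request path to a file path, or None if unknown."""
    # Host headers are case-insensitive and may carry a port ("example.com:80").
    name = host.lower().split(":", 1)[0]
    root = VIRTUAL_HOSTS.get(name)
    if root is None:
        return None
    # Normalise the path so "../" segments cannot climb out of the document root.
    safe_path = posixpath.normpath("/" + path.lstrip("/"))
    return root + safe_path
```

The same path on the same IP address ends up in a different directory depending solely on the Host header, which is exactly why the header is needed.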

However, the content the server returns isn't just the file. The response also has its own set of headers which describe what type of file it is, the date it was served, how big it is and (on some connections) how long the file should be cached for by a proxy server/browser. For example, the response from the server for /test.html on example.com looks like this:

HTTP/1.1 404 Not Found
Date: Fri, 01 Dec 2006 14:09:41 GMT
Server: Apache/2.0.54 (Fedora)
Content-Length: 284
Connection: close
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL /test.html was not found on this server.</p>
<hr>
<address>Apache/2.0.54 (Fedora) Server at example.com Port 80</address>
</body></html>

With the response returned, the browser can display the content and close the connection. The exchange itself was successful, even though in this case the server reported that the page could not be found.
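
To make the structure of the response concrete, here is a small Python sketch that splits a raw response into its status line, headers and body. The function name parse_response is our own, and the parsing is deliberately simplified (for example, it ignores repeated headers and chunked bodies).

```python
def parse_response(raw: bytes):
    """Split a raw HTTP response into (status code, reason, headers, body)."""
    # The blank line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    # The status line looks like: HTTP/1.1 404 Not Found
    parts = status_line.split(" ", 2)
    code = int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return code, reason, headers, body
```

Fed the response above, this would report the status code 404 with the reason "Not Found", which is how the browser knows the page is an error page even though content was returned.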

The Redirects

There are five main types of redirect available to a website, plus a sixth that works at the DNS level:

Meta Refresh
HTML code which tells the browser to move onto the next page after x seconds.

Frame Redirect
HTML code which loads the contents of another page inside a frame, making it appear that the page is from somewhere else.

Header Redirect
A Location header is returned by the server telling the browser to go to another page (generally there's no content with this page, it's just a return telling the browser to go somewhere else).

HTTP Redirect
Is similar to the header redirect, but is more definitive and includes HTTP response codes which tell the search engine/browser more information about the redirect.

Rewrite
Not a redirect in the same sense as the above (which force another request to somewhere else), a rewrite changes the URL on the server side, directing the request to another file/location before it's processed by the server.

DNS CNAME Redirect
Again, not quite the same as the first four, as this works below the HTTP protocol at the DNS level, but can be used in some circumstances and is a trap some people fall into, thinking that it will redirect a site.

The Meta Refresh

The meta refresh redirect works above the main HTTP protocol and requires an application that understands the HTML header through which this command is given. However, as near as makes no difference, 100% of browsers support this, and it's the simplest form of redirect:

<html><head>
<title>Test Page</title>
<meta http-equiv="refresh" content="0;url=http://example.com/redirected.html">
</head>
<body></body></html>

Here, 0 is the number of seconds to wait before refreshing the page and the ;url=... section is the new URL to go to when the browser refreshes the page.
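
As a sketch of how an application might read that content value, the following Python function splits it into the delay and the optional target. The name parse_refresh is our own, and real browsers are more forgiving about spacing and quoting than this simplified version.

```python
def parse_refresh(content: str):
    """Split a meta refresh content value into (delay in seconds, target URL or None)."""
    delay_part, _, rest = content.partition(";")
    delay = int(delay_part.strip())
    rest = rest.strip()
    # The target is optional; without it the browser reloads the current page.
    if rest.lower().startswith("url="):
        return delay, rest[4:].strip()
    return delay, None
```

For the example above, parse_refresh("0;url=http://example.com/redirected.html") gives a delay of 0 and the new URL, while parse_refresh("30") gives a delay of 30 and no target.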

The main downside is that the redirect isn't immediate - the browser must first render the page then wait x number of seconds, which you set, before moving on. Yet this in itself can be useful - you can set it to refresh the page every y seconds by omitting the ;url=... section (leaving just content="x" where x is a number), avoiding the need for JavaScript.

Also, both the source URL and the destination URL need to be on valid domains for the redirect to succeed. Just like the first request, the second request (for the page we've redirected to) will go through all the steps (Make DNS query » Receive IP address » Establish connection » Send request (with Host header of the new domain) » Receive data).

Frame Redirect

Like the meta refresh redirect, this works outside of the HTTP protocol, so the application must understand the HTML (which almost all do). Rather than moving the browser on to another page, the start page creates a single frame that fills the whole window and loads the destination URL inside it.

Like meta refresh, both domains must be valid web-sites in order for this to work (as a whole new request, connection, etc. will be made for the sub-page), although Plesk does have a short-cut allowing you to create a frame redirect without having to create a full website.

On the other hand, the main advantage of this is that the destination site can 'appear' to be from the source site as the destination site will always load inside the frame from the source site - the address in the URL bar doesn't change.

To create a frame page you just need something similar to the following (note that the frameset takes the place of the body element):

<html><head>
<title>Test Page</title>
</head>
<frameset>
<frame name="redirect" src="http://example.com/redirected.html">
</frameset>
</html>

Header Redirect

Also known as Standard Redirect in Plesk, rather than redirecting the request by refreshing the page to another location, or loading it up in a sub-page (frame), the server returns a Location header as part of the return to the browser which tells the browser not to load this page and go to this location instead.

The main upside of this is it's more immediate than a meta refresh as the browser will follow the link until it gets to a valid page before trying to display it (you won't see blank screens between pages). It's also the same option that's used in the HTTP redirects below.

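However, it's not supported directly in HTML and therefore has to be sent by the server, typically through a script. For example (in PHP):

<?php
  // Must be called before any output is sent to the browser
  header('Location: http://example.com/redirected.html');
  exit; // stop here so no page content follows the redirect
?>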

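would return the following headers from the server (PHP switches the status to 302 automatically when a Location header is sent, unless a 201 or 3xx status has already been set):

HTTP/1.1 302 Found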
Date: Fri, 01 Dec 2006 14:09:41 GMT
Server: Apache/2.0.54 (Fedora)
Connection: close
Location: http://example.com/redirected.html

Again, this will make a completely new request, so both URLs must be on valid sites.
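
For comparison, the same idea can be sketched in Python as a minimal WSGI application. This is an illustration only; the name redirect_app is our own and the target URL is simply the one used in the example above. Note that the Location header is sent together with a 3xx status, as browsers only follow a Location header on a redirect response.

```python
def redirect_app(environ, start_response):
    """Minimal WSGI application that redirects every request to one fixed URL."""
    # Target taken from the example in this article.
    target = "http://example.com/redirected.html"
    # A redirect needs a 3xx status as well as the Location header.
    start_response("302 Found", [("Location", target), ("Content-Length", "0")])
    # No page content is needed; the browser moves straight on to the target.
    return [b""]
```

To try it out locally, the application could be served with make_server from Python's wsgiref.simple_server module.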

HTTP Redirect

Still using the Location header from the header redirect option above, an HTTP redirect works by changing the return status code for the page (the number which tells the browser the status of the request). For example, 200 is OK (everything was successful and here's what you've requested), while 404 is a 'Page Not Found' error - the server couldn't find the page you're looking for.

The two codes here are 301 and 302.

301
The page has moved and can be found at the new location given. This is a permanent change and the old URL shouldn't be used any more.

302
The page has temporarily moved. Please use the new URL for the time being, although you can still use the old URL.

To set this up in Apache, you can use .htaccess files with the Redirect, RedirectTemp or RedirectPermanent directives. For example:

# Temporary Redirect (the two lines are equivalent - use one or the other)
Redirect 302 /test.html http://example.com/redirected.html
RedirectTemp /test.html http://example.com/redirected.html

# Permanent Redirect (the two lines are equivalent - use one or the other)
Redirect 301 /test.html http://example.com/redirected.html
RedirectPermanent /test.html http://example.com/redirected.html

Again, both sites must be valid as it will force a whole new request to the redirected URL.
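
What the browser does on receiving a 301 or 302 can be sketched as a simple loop. In the Python below, follow_redirects is our own name and fetch is a stand-in supplied by the caller (a function that takes a URL and returns the status code and headers), so the sketch makes no network requests of its own.

```python
from urllib.parse import urljoin


def follow_redirects(url, fetch, max_hops=5):
    """Follow 301/302 responses until a non-redirect status is returned.

    fetch is a caller-supplied function: fetch(url) -> (status, headers dict).
    Returns the final URL and the list of (status, url) hops taken.
    """
    hops = []
    for _ in range(max_hops):
        status, headers = fetch(url)
        if status not in (301, 302) or "Location" not in headers:
            return url, hops
        hops.append((status, url))
        # Each hop is a brand-new request; a relative Location is resolved
        # against the URL that was just requested.
        url = urljoin(url, headers["Location"])
    # Give up rather than loop forever if the redirects never end.
    raise RuntimeError("too many redirects")
```

The list of hops is where the difference between the two codes would matter: a client may remember a 301 and go straight to the new URL next time, whereas after a 302 it should keep asking the old URL.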

Rewrite

Working on the server side, the rewriting option in Apache allows you to rewrite the URL after it's been received by the server, but before it's been processed by the system. For example, you can tell the server that when anyone requests /test.html, it should rewrite the URL to /redirected.html and serve that instead.

The visitor/browser doesn't know about the rewrite, as no response is sent to the browser telling it to go somewhere else (although this can happen under certain conditions - see the guide on mod_rewrite). The change is internal to the server, and so is generally only used on the same site.

If you try to rewrite to another site/domain, the server will normally send out an HTTP redirect (temporary) instead, as it cannot process the redirected URL itself.
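
The idea of an internal rewrite can be sketched in Python as a list of pattern and replacement pairs applied to the requested path. This is not how mod_rewrite itself is implemented; the rules, the /articles/ and /article.php paths and the function name rewrite are all made up for the illustration.

```python
import re

# Hypothetical rewrite rules: (compiled pattern, replacement), tried in order.
REWRITE_RULES = [
    (re.compile(r"^/test\.html$"), "/redirected.html"),
    (re.compile(r"^/articles/(\d+)$"), r"/article.php?id=\1"),
]


def rewrite(path: str) -> str:
    """Return the internal path to serve for a requested path."""
    for pattern, replacement in REWRITE_RULES:
        new_path, count = pattern.subn(replacement, path)
        if count:
            # First matching rule wins; the browser never sees the new path.
            return new_path
    # No rule matched, so the path is served unchanged.
    return path
```

The browser still shows /test.html in the address bar, as the new path only ever exists inside the server.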

DNS CNAME Redirect

This is the odd one out in this group, as it's the only one that takes effect before any HTTP request is made, and in that respect it is not a full redirect.

By changing the DNS settings of one domain to point to those of another (i.e. using a CNAME record), you're telling the first domain to use the DNS settings of the second, which points it at the same server/IP address. However, there is a huge limitation with this option.

When the browser asks the OS for the IP address for the first domain, the OS cannot fully resolve it. So, it must pass the request onto a proper DNS server, which will complete the look up before returning the IP address back to the computer requesting it and hence back to the browser.

Therefore, when the OS sends out the request for the domain, all it receives back is an IP address. Nothing more. It won't know, unlike the DNS server, that the domain name points to another. So, when the request to the server is made, the Host value has the domain name from the original request, and not the domain name it was 'redirected' to.
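
The limitation can be seen in the following Python sketch, which uses a made-up table of DNS records in place of a real DNS server. The names RECORDS, resolve and request_for, and the domain www.example.net, are hypothetical; the IP address is the one used earlier in this article.

```python
# Hypothetical zone data: name -> (record type, value).
RECORDS = {
    "www.example.net": ("CNAME", "example.com"),
    "example.com": ("A", "192.0.34.166"),
}


def resolve(name: str, records=RECORDS, max_depth: int = 8) -> str:
    """Follow CNAME records until an A record is found and return its address."""
    for _ in range(max_depth):
        record_type, value = records[name]
        if record_type == "A":
            return value
        name = value  # CNAME: continue the lookup with the name it points to
    raise RuntimeError("CNAME chain too long")


def request_for(domain: str) -> tuple[str, str]:
    """Return (IP address to connect to, Host header sent) for a browser request."""
    # The browser only gets an IP address back, so the Host header keeps
    # the name it started with, not the name the CNAME pointed to.
    return resolve(domain), f"Host: {domain}"
```

Here request_for("www.example.net") connects to the same IP address as example.com but still sends "Host: www.example.net", so the server must be configured to answer for that name.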

If you have configured the master domain (i.e. the domain to which the old domain was pointed) with domain aliases (available in Plesk 8.0 and above) for the other domains, then this can work, but the server must know about both domains, with correct vhosts configured, or the redirect will fail.
