If you’ve been reading my past blog posts, then you’ll already be aware of both on-page and off-page optimisation. The technical side of your site, however, could still be holding you back from that all-important place atop the rankings.

There’s a lot to technical SEO – from how well Google’s spiders can crawl and index the site, to how fast the site loads. Optimising the URL could also be considered technical SEO, but I have already covered that in my previous post on further on-page optimisation.

Is your site being indexed?

Google indexes your site by sending out robots to crawl its pages. These robots, commonly referred to as spiders or bots, search through links and catalogue the content. A good linking structure is vital so the spiders (and the users too!) can find their way around your site and index it.
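
To give a rough feel for what "searching through links" means, here is a minimal sketch in Python of pulling the links out of a page, which is the first thing a crawler needs before it can move on to the next page. It is only an illustration and not how Google's spiders actually work; the names LinkCollector and extract_links are made up for this example, and sitename.com is a placeholder domain.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only anchor tags carry the links a crawler would follow.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links such as "/about" into full URLs.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return every link found in the given HTML, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


if __name__ == "__main__":
    page = '<a href="/about">About</a> <a href="/blog">Blog</a>'
    for link in extract_links(page, "http://sitename.com/"):
        print(link)
```

A page that no other page links to would never turn up in a list like this, which is why a good linking structure matters.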

Another way to make sure the spiders can find their way around is by setting up both an XML and HTML sitemap. These are lists of all the pages on your site which the robots can use to access every page you want them to index.

If your site doesn’t have one, you can generate your own XML sitemap by following this link and filling in the form. Once you have that, submit it to Google and they’ll crawl it.
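
If you would rather build the file yourself, an XML sitemap is simple enough to generate with a few lines of Python. The sketch below assumes you already have a list of the URLs you want indexed; build_sitemap is a name invented for this example, the sitename.com URLs are placeholders, and only the required loc element of the sitemaps.org format is written out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML sitemap (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    # Prepend the XML declaration by hand, since we asked for a plain string.
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Placeholder pages; swap in the real pages of your site.
    pages = ["http://www.sitename.com/", "http://www.sitename.com/blog"]
    print(build_sitemap(pages))
```

Save the output as sitemap.xml at the root of your site before submitting it.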

But even if you’ve got good navigation throughout your site and you’ve set up your sitemaps, the robots could still run into a barrier which stops them from crawling your entire site.

Spiders run into problems if your site is built on Flash, for instance, as they can’t index Flash like a normal webpage. While a site should never be built completely from Flash, Flash can be used to add some value to your pages – as long as you remember that none of the content will be indexed.

It’s similar for JavaScript. Spiders can’t be relied on to run the JavaScript on a site, so anything hidden behind it may stay hidden to Google. It’s worth reading this if you want to find out more about Google crawling JavaScript.

An easy way of checking if your site is being indexed properly is by searching for it in Google and looking at the cached page. Whatever you see there is what Google has indexed.

Status codes

An important part of technical SEO is status codes. You might have seen a few of these before while browsing the web – the most commonly seen being a 404, which means there is no page at the URL specified. Status codes are really important if you’re redirecting pages on your site.

For example, if you move a page from one URL to another you’ll need to set up a 301 redirect. A 301 is a code that tells your browser and Google’s spiders that a page has moved permanently, sending all traffic for the old URL to the new one. A 301 redirect also transfers the link equity from the old URL to the new one, making sure you’re not missing out on links that would otherwise go dead.
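
As a rough illustration of the decision a server makes, here is a small Python sketch that picks a status code for a requested path. It is a simplified model rather than a real web server: resolve, the redirect table and the page names are all hypothetical, and in practice you would normally set redirects up in your server or CMS configuration.

```python
from http import HTTPStatus


def resolve(path, known_pages, redirects):
    """Return (status, location) for a requested path.

    redirects maps old paths to new ones; known_pages is the set of
    paths that currently exist on the site.
    """
    if path in redirects:
        # The page has moved for good: answer 301 and point to the new URL.
        return HTTPStatus.MOVED_PERMANENTLY, redirects[path]
    if path in known_pages:
        return HTTPStatus.OK, None
    # Nothing lives here and nothing redirects from here.
    return HTTPStatus.NOT_FOUND, None


if __name__ == "__main__":
    pages = {"/new-page"}                   # hypothetical live pages
    moved = {"/old-page": "/new-page"}      # hypothetical redirect table
    for requested in ("/old-page", "/new-page", "/missing"):
        status, location = resolve(requested, pages, moved)
        print(requested, int(status), status.phrase, location)
```

Without the entry in the redirect table, a request for the old path would fall through to a 404 and any links pointing at it would go dead.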

There are plenty of other status codes besides these. Check out this link to the SEOmoz site – there’s an infographic on there that shows exactly what some of the most common status codes mean.

Canonicalization

When you set up your website and the domain, you will be able to visit your site by typing in either http://sitename.com or http://www.sitename.com. Google sees these as two separate sites, and choosing which one should be treated as the preferred version is called canonicalization.

This raises a few issues. Anything done on one of the pages is mirrored on the other – duplicated – and as any SEO will tell you, duplicated content is not cool. Links going to the non-www page don’t boost the page authority of the page with www either.

URL redirect

However, this is a great place for those 301 redirects. You can choose to redirect one of the pages to the other, passing on all the link equity and removing the duplicated page. If you decide to redirect to the www version, you can tell when the 301 is in place by typing the URL without the www and seeing if the address changes to the www version.
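
To make the rule concrete, here is a short Python sketch of the logic behind a non-www to www redirect: given a requested URL, it works out where the 301 should send the visitor, or returns None if the URL is already on the www version. The function name www_redirect_target is invented for this example, and it assumes you have chosen www as your preferred version.

```python
from urllib.parse import urlsplit, urlunsplit


def www_redirect_target(url):
    """Return the www version of a URL, or None if it already uses www."""
    parts = urlsplit(url)
    host = parts.netloc
    if host.lower().startswith("www."):
        # Already on the preferred version, so no redirect is needed.
        return None
    # Keep the scheme, path and query string; only the host changes.
    return urlunsplit(parts._replace(netloc="www." + host))


if __name__ == "__main__":
    for url in ("http://sitename.com/blog?page=2", "http://www.sitename.com/"):
        print(url, "->", www_redirect_target(url))
```

Keeping the path and query string intact matters, as it means every old non-www link lands on the matching www page rather than on the home page.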