Website Crawlability
Is your website crawlable?
The SEO community is totally confused as to the meaning of website crawlability in reference to preventing problems with Google.
Crawlability refers to the ability of a search engine to crawl through the entire text content of your website, easily navigating to every one of your webpages, without encountering an unexpected dead-end.
It has absolutely nothing to do with W3C Validation issues.
Always avoid unexpected dead-ends, both for humans using web browsers and for search engine bot visitors.
Crawlability is the number one factor for preventing Google problems (supported by Googler Matt Cutts’s rather inaccessible video [Transcription]).
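The definition above (every page reachable from the home page, with no dead-ends) can be checked mechanically once you have a list of which page links to which. The following is only a minimal sketch: the `unreachable_pages` function and the four-page `site` map are hypothetical names made up for illustration, and it assumes you have already collected your internal links by some other means.

```python
from collections import deque

def unreachable_pages(links, home):
    """Return pages that cannot be reached by following links from `home`.

    `links` maps each page URL to the list of internal URLs it links to.
    """
    seen = {home}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            # Only follow links to pages that actually exist on the site.
            if target in links and target not in seen:
                seen.add(target)
                queue.append(target)
    return sorted(set(links) - seen)

# Hypothetical four-page site; nothing links to "/orphan.html".
site = {
    "/": ["/about.html", "/articles.html"],
    "/about.html": ["/"],
    "/articles.html": ["/", "/about.html"],
    "/orphan.html": ["/"],
}
print(unreachable_pages(site, "/"))  # ['/orphan.html']
```

Any page this reports is one that a crawler starting at the home page would never find by following links alone.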
- Test Crawlability with a TEXT Browser – Starting from the home page of your website, a visitor should be able to locate every one of your web pages easily. Avoiding Google problems is all about website fundamentals. If you can access your entire site well enough to read its informational content using only an old, out-of-date, CSS-challenged web browser (such as Netscape v. 4.08 for Windows 95/98) in text-only mode, then you are going to be in pretty good shape search-engine-wise. Google itself provides a Cache Text option that shows explicitly what Google sees on each webpage it has cached. Alternatively, you can check one page at a time using the Poodle Predictor website. Even pressing Ctrl-A to select everything on one of your webpages and pasting it into a text editor can reveal interesting information about what search engines see on your webpage.
- Sitemaps – HTML sitemaps enable human visitors to reach orphan pages on your website. Accordingly, HTML sitemaps should be followed, but not indexed, by Google. Text and XML formatted sitemaps are for search engine bots, spiders, and crawlers. They artificially make uncrawlable websites more crawlable.
- Bad, or Broken, Links are NOT Crawlable – Every hyperlink on your entire website, whether internal or external, should be in working condition. Nor are links that Google chooses to ignore crawlable.
“Make sure your navigation links are in HTML, and not in Flash or Javascript. Search engines have trouble extracting links from anything other than HTML.” — X-Googler Vanessa Fox
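The text-browser test from the list above can be roughly approximated in a few lines of Python. This is a sketch under stated assumptions, not a reproduction of what Google or any text browser actually does: the `TextOnly` class, the `text_view` helper, and the sample markup are all hypothetical, and the only library used is the standard `html.parser` module. It keeps visible text and image alt text and drops script and style content.

```python
from html.parser import HTMLParser

class TextOnly(HTMLParser):
    """Collect visible text and image alt text, skipping scripts and styles."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.parts.append(alt)

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())

def text_view(html):
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)

# Hypothetical page: the second image has no alt text, so it vanishes.
sample = """<html><head><title>Demo</title><style>p{color:red}</style></head>
<body><script>document.write('menu')</script>
<p>Read the <a href="/about.html">about page</a></p>
<img src="logo.gif" alt="Company logo"><img src="spacer.gif"></body></html>"""
print(text_view(sample))
```

Anything that does not show up in the printed text (the script-generated menu, the image without alt text) is content you should not count on a search engine seeing.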
Website Crawlability Rating
Bad or broken links are an unexpected and unwanted roadblock to navigating the Web. Nobody traveling the Web likes to run into an unexpected dead-end, and neither does Google. Dead-ends on your site lead visitors nowhere. Every bad, or broken, hyperlink on your entire website reduces the crawlability rating of your website. If a hyperlink does NOT work, a visitor to your site cannot follow it to its destination. Google assigns each website a crawlability rating. Allow too many bad or broken hyperlinks on your website, and your site will be completely filtered out of Google’s listings or SERPs.
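Finding the bad or broken links themselves is easy to automate. The sketch below is illustrative only: `broken_links`, `site_links`, and `known_status` are hypothetical names, and the status lookup is a stub standing in for real HTTP requests, so that the example runs offline. It says nothing about how Google computes any rating; it merely lists the links that lead to a dead-end.

```python
def broken_links(links, status_of):
    """Return (source_page, target_url, status) for every link whose target fails.

    `links` maps a page to the URLs it links to; `status_of` is a callable
    that returns an HTTP status code for a URL (stubbed in the example below).
    """
    cache = {}  # check each distinct URL only once
    report = []
    for page, targets in sorted(links.items()):
        for url in targets:
            if url not in cache:
                cache[url] = status_of(url)
            if cache[url] >= 400:
                report.append((page, url, cache[url]))
    return report

# Hypothetical data: anything missing from known_status is treated as a 404.
known_status = {"/": 200, "/about.html": 200, "http://example.com/moved": 404}
site_links = {
    "/": ["/about.html", "http://example.com/moved"],
    "/about.html": ["/", "/old-page.html"],
}
report = broken_links(site_links, lambda url: known_status.get(url, 404))
for page, url, status in report:
    print(f"{page} -> {url} ({status})")
```

In real use, the stub would be replaced by a function that actually requests each URL and returns the response status.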
The links on this website are only a few months old, yet I have had to replace over a half-dozen of them already. Once your site starts voting for other websites, you are committing yourself to a never-ending maintenance headache. Websites are constantly changing their URLs. Webpages are here today, but gone tomorrow. Put off maintaining your outbound links long enough, and Google will filter out your site 100%.
Webpage Crawlability
The most important consideration for one of your webpages to be crawled is for it to have other links pointing to the page. That means designing your website so that:
- Other internal webpages on your site link to it, the more the better.
- Every page on your site should link to your home page as well as to other pages on your site.
- The more internal links a webpage has pointing to it, the more important Google considers it, and the higher its PageRank.
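The points in the list above can be audited with a simple count of internal links. This is a minimal sketch, not a PageRank calculation: `inbound_counts`, `missing_home_link`, and the `pages` map are hypothetical names, and the link map is assumed to have been gathered beforehand.

```python
def inbound_counts(links):
    """Count how many other pages on the site link to each page."""
    counts = {page: 0 for page in links}
    for page, targets in links.items():
        for target in set(targets):  # count each linking page once
            if target in counts and target != page:
                counts[target] += 1
    return counts

def missing_home_link(links, home):
    """List pages that do not link back to the home page."""
    return sorted(p for p, targets in links.items()
                  if p != home and home not in targets)

# Hypothetical site: "/c.html" has no inbound links and no link home.
pages = {
    "/": ["/a.html", "/b.html"],
    "/a.html": ["/", "/b.html"],
    "/b.html": ["/"],
    "/c.html": ["/a.html"],
}
print(inbound_counts(pages))
print(missing_home_link(pages, "/"))
```

A page with a count of zero is an orphan, and a page that appears in the second list leaves visitors with no direct route back to the home page.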
To be crawlable, a webpage has to be able to take a visitor someplace else. Visitors must be able to navigate either to another one of your webpages or to an external site. Bad or broken links reduce the crawlability of your webpages.
Beyond your site’s internal link structure, the other considerations for determining webpage crawlability are whether the page contains text that can be indexed, and whether the links on the page are good, bad, or being ignored by Google. Remember that Google has chosen to ignore graphic hyperlinks whose images lack alt text, some types of JavaScript, and Flash.
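Links of the kinds described above can be flagged automatically. The following sketch uses the standard `html.parser` module; the `LinkAudit` class and the sample markup are hypothetical, and the two rules it applies (image links with no alt text, `javascript:` hrefs) are only a rough reading of the paragraph above, not a statement of what Google actually ignores.

```python
from html.parser import HTMLParser

class LinkAudit(HTMLParser):
    """Flag links with no anchor text or alt text, and javascript: links."""

    def __init__(self):
        super().__init__()
        self.flagged = []
        self._href = None       # href of the <a> currently open, if any
        self._has_text = False  # saw anchor text or image alt text inside it

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self._href = attrs["href"] or ""
            self._has_text = False
        elif tag == "img" and self._href is not None:
            if (attrs.get("alt") or "").strip():
                self._has_text = True

    def handle_data(self, data):
        if self._href is not None and data.strip():
            self._has_text = True

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            if self._href.lower().startswith("javascript:"):
                self.flagged.append((self._href, "javascript link"))
            elif not self._has_text:
                self.flagged.append((self._href, "no anchor text or alt text"))
            self._href = None

# Hypothetical navigation block with two problem links.
page = """
<a href="/about.html">About us</a>
<a href="/products.html"><img src="products.gif"></a>
<a href="/contact.html"><img src="contact.gif" alt="Contact us"></a>
<a href="javascript:openMenu()">Menu</a>
"""
audit = LinkAudit()
audit.feed(page)
audit.close()
for href, reason in audit.flagged:
    print(f"{href}: {reason}")
```

Here the products link and the JavaScript menu link are flagged, while the plain text link and the image link with alt text pass.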
Tags: CSS, design, Google Gone Wrong, John H. Gohde, PHP, problems, search engine, SERPs, website