In my last blog post, I discussed the technical side of search engine optimization, looking at XML sitemaps and robots.txt files, and mentioned the errors that can negatively affect your SEO performance. To recap:
Why Do A Website Crawl?
- To make sure the website is crawlable
- Identify any issues that could be holding back your SEO efforts
We would usually do a web crawl when we get a new client on board to identify any issues their website may have.
What We Look At:
- The site’s HTML pages and architecture
- All the meta data and the current URLs
- Response/Status Codes (e.g. 200, 301, 302, 404, 501, etc.)
- Duplicate URLs
- Duplicate Page Titles, Meta Tags and h1 Tags
- Page Titles, Meta Tags and h1 tags that are missing or over-optimized
- Image sizes – any images over 100kb are not good
Possible Website Errors:
- 4xx Errors
- Server 5xx Errors
- Duplicate URLs
- Duplicate and Missing Page Titles
- Duplicate and Missing h1 tags
- 301/302 Redirects
- Missing Meta Descriptions
- Duplicate Meta Descriptions
- Images over 100kb
- Images Missing Alt Text
4xx Errors
The HTTP 404 Not Found error means the page you are trying to reach could not be found on the server. When someone visits a page that does not exist or has since been deleted, they see a 404 error page. 404 errors are not good for users or search engines. When a page has been removed, it needs a proper redirect to a relevant page. 404 errors typically occur when a page is moved to a new URL without a redirect, when a page is deleted, or when a link points to an incorrect URL.
Server 5xx Errors
The 500 Internal Server Error is a general HTTP status code that means something has gone wrong on the website’s server rather than with an individual page. 500 errors are more serious because they can affect the entire website. In this case, you would need to contact your hosting provider to remedy the issue.
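To make the status code categories concrete, here is a minimal sketch of how crawl results could be grouped into the buckets discussed above. The URLs and the `crawl` dictionary are made up for illustration; a real crawler would supply its own data in its own format.

```python
from collections import Counter


def classify_status(code: int) -> str:
    """Map an HTTP status code to a broad crawl-report category."""
    if 200 <= code < 300:
        return "ok"
    if 300 <= code < 400:
        return "redirect"
    if 400 <= code < 500:
        return "client_error"  # 4xx, e.g. 404 Not Found
    if 500 <= code < 600:
        return "server_error"  # 5xx, e.g. 500 Internal Server Error
    return "other"


def summarize(crawl: dict) -> Counter:
    """Count how many crawled URLs fall into each category."""
    return Counter(classify_status(code) for code in crawl.values())


# Hypothetical crawl output: URL -> status code returned by the server.
crawl = {
    "https://example.com/": 200,
    "https://example.com/old-page": 404,
    "https://example.com/promo": 301,
    "https://example.com/checkout": 500,
}

summary = summarize(crawl)
problem_urls = sorted(
    url for url, code in crawl.items()
    if classify_status(code) in ("client_error", "server_error")
)
```

Here `problem_urls` holds the pages that would need attention first, which is usually where we start when reviewing a crawl.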
Duplicate URLs
Duplicate URLs are a source of duplicate content, which can negatively impact rankings for your site because of the “Panda” algorithm, which looks for duplicate content.
Mostly, duplicate content issues are caused by URL parameters. For instance, visitors could reach the same page through several URLs that differ only in their query parameters.
If all of these URLs are indexed, they will be considered duplicate content. When you find duplicate URLs, it is important to use a canonical tag to tell Google which version of the page you want indexed.
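As a rough illustration of how URL parameters create duplicates, the sketch below groups URLs that point to the same page once tracking parameters are removed. The example URLs, the `IGNORED_PARAMS` list and the choice to ignore a trailing slash are all assumptions for this example; which parameters are safe to ignore depends on the site.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of parameters that do not change the page content.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid", "ref"}


def normalize(url: str) -> str:
    """Return a comparable form of the URL without tracking parameters."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )
    # Assumption: a trailing slash does not change which page is served.
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(kept), "")
    )


def group_duplicates(urls):
    """Group URLs that normalize to the same address; keep groups of 2+."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}


# Hypothetical URLs from a crawl.
urls = [
    "https://example.com/shoes",
    "https://example.com/shoes?utm_source=newsletter",
    "https://example.com/shoes/?ref=home",
    "https://example.com/hats",
]
duplicates = group_duplicates(urls)
```

Each key in `duplicates` is a candidate for the canonical URL, and the values are the variants that should point to it with a canonical tag.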
Duplicate and Missing Page Titles
A page title is one of the most crucial on-page ranking factors in the eyes of search engines, and it is visible in the SERPs. Duplicate page titles (page titles are also known as title tags) can hurt rankings because each title should be unique and target the specific keywords associated with the content on that page. When several pages share a title tag, you are essentially telling search engines that they are all about the same topic. To fix this, give every page a unique title tag: rename the duplicated titles and add titles to the pages that are missing them.
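Once a crawl has collected the title of every page, finding missing and duplicate titles is a simple grouping exercise. The sketch below assumes the crawl data is a dictionary of URL to title, with an empty string or `None` for pages without one; the sample pages are invented, and comparing titles case-insensitively is our own choice.

```python
from collections import defaultdict


def audit_titles(pages):
    """Return (urls_missing_a_title, {title: urls_sharing_it})."""
    missing = []
    by_title = defaultdict(list)
    for url, title in pages.items():
        cleaned = (title or "").strip()
        if not cleaned:
            missing.append(url)
        else:
            # Assumption: titles differing only in case count as duplicates.
            by_title[cleaned.lower()].append(url)
    duplicates = {t: urls for t, urls in by_title.items() if len(urls) > 1}
    return missing, duplicates


# Hypothetical crawl data: URL -> page title.
pages = {
    "/shoes": "Red Shoes | Example Store",
    "/shoes?page=2": "Red Shoes | Example Store",
    "/hats": "Hats | Example Store",
    "/contact": "",
    "/about": None,
}
missing_titles, duplicate_titles = audit_titles(pages)
```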
Duplicate and Missing h1s
The h1 tag, also known as the main header tag, is usually the title of a post or page and is shown as emphasized text. It is generally the largest text on the page. H1s are a key factor for on-page SEO. Missing h1s, or the same h1 repeated across many pages, can hurt rankings because each page should target a specific keyword associated with the content on that page.
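For a single page, the h1 check comes down to pulling every h1 out of the HTML and seeing whether there is none, one, or several. Here is a minimal sketch using Python's built-in `html.parser`; the class and function names are our own, and the sample HTML is made up.

```python
from html.parser import HTMLParser


class H1Collector(HTMLParser):
    """Collect the text content of every <h1> element on a page."""

    def __init__(self):
        super().__init__()
        self._in_h1 = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._in_h1 = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False

    def handle_data(self, data):
        # Text inside nested tags such as <em> still belongs to the h1.
        if self._in_h1:
            self.headings[-1] += data


def h1_texts(html):
    """Return the stripped text of each h1 found in the HTML string."""
    parser = H1Collector()
    parser.feed(html)
    parser.close()
    return [heading.strip() for heading in parser.headings]


# Hypothetical page markup.
page = "<html><body><h1>Red <em>Shoes</em></h1><p>Our range.</p></body></html>"
headings = h1_texts(page)
has_missing_h1 = len(headings) == 0
```

Running `h1_texts` over every crawled page and grouping the results by text gives the duplicate list, in the same way as for page titles.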
301/302 Redirects
There are two main types of 3xx redirects: a 301 and a 302. 301 redirects aren’t an issue. They are the most common way to tell search engines that a page has been permanently moved to a new location. For example, a PPC campaign landing page that is no longer in use might be redirected to the home page. That way, people who still have the old link don’t get a 404 (not found) error and stay engaged on the website. When you tell a search engine that a page has permanently moved to a new URL (301), it will transfer most of the old page’s authority to the new one. A 302 redirect, on the other hand, signifies a temporary redirect. For example, an item may be temporarily out of stock, and you want to redirect visitors to a similar item for the time being.
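In a crawl report it helps to separate permanent redirects from temporary ones, so that the 302s can be reviewed to confirm they really are temporary. The sketch below assumes the crawl data maps each redirecting URL to a `(status, target)` pair; that structure and the sample URLs are hypothetical.

```python
def split_redirects(redirects):
    """Split redirects into (permanent_301, temporary_302) dictionaries."""
    permanent = {}
    temporary = {}
    for url, (status, target) in redirects.items():
        if status == 301:
            permanent[url] = target
        elif status == 302:
            temporary[url] = target
    return permanent, temporary


# Hypothetical crawl data: URL -> (status code, redirect target).
redirects = {
    "/ppc-landing-2019": (301, "/"),
    "/blue-shoes": (302, "/navy-shoes"),
    "/old-blog": (301, "/blog"),
}
permanent, temporary = split_redirects(redirects)
```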
Duplicate and Missing Meta Descriptions
Meta descriptions are HTML elements that provide a brief but comprehensive summary of what the web page is about. Meta descriptions appear in the SERPs. While not a ranking factor, meta descriptions are still important to have from a click-through rate (CTR) perspective. Each one should be present, unique and include a strong call to action to help the click-through rate of a listing on a search engine results page.
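To check whether a page has a meta description at all, the description can be read straight from the page's HTML. This is a small sketch using Python's built-in `html.parser`; the class name, helper function and sample markup are our own for illustration.

```python
from html.parser import HTMLParser


class MetaDescriptionParser(HTMLParser):
    """Capture the content of the first <meta name="description"> tag."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.description is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "description":
            self.description = (attributes.get("content") or "").strip()


def meta_description(html):
    """Return the meta description text, or None if the tag is absent."""
    parser = MetaDescriptionParser()
    parser.feed(html)
    parser.close()
    return parser.description


# Hypothetical page markup.
page = '<head><meta name="description" content=" Shop red shoes online. "></head>'
description = meta_description(page)
is_missing = not description  # True for an absent tag or an empty one
```

Collecting `description` for every crawled URL and grouping identical values then shows which pages share a description.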
Images over 100kb
Images should not be over 100kb, as large images increase page load times, which negatively impacts organic rankings. Photoshop can be used to reduce image sizes.
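Flagging heavy images is a simple threshold check once the file sizes are known. In this sketch the 100kb limit is taken as 100 × 1024 bytes, which is our interpretation, and the image paths and sizes are invented.

```python
# Assumption: "100kb" is read as 100 * 1024 bytes.
MAX_IMAGE_BYTES = 100 * 1024


def oversized_images(image_sizes, limit=MAX_IMAGE_BYTES):
    """Return (url, size) pairs above the limit, largest first."""
    flagged = [(url, size) for url, size in image_sizes.items() if size > limit]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical crawl data: image URL -> file size in bytes.
image_sizes = {
    "/img/hero.jpg": 350_000,
    "/img/logo.png": 12_000,
    "/img/banner.jpg": 102_400,
    "/img/team.jpg": 150_000,
}
too_large = oversized_images(image_sizes)
```

Sorting the largest files first means the images with the biggest effect on load time get compressed first.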
Images Missing Alt Text
Image alt text, also known as alternative text, helps search engines understand what the image is about. It provides a text description of your image. When an image is added to a site, it has its file name. It should also have a unique alt attribute (often called an “alt tag”) associated with it. If you want the images on your website to be indexed well, they need to have alt text.
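Images without alt text can be found by scanning a page's `img` tags. The sketch below uses Python's built-in `html.parser`; the class and function names and the sample markup are hypothetical. Note that it also flags an empty `alt=""`, which is sometimes used on purpose for purely decorative images, so the results need a manual look.

```python
from html.parser import HTMLParser


class ImgAltChecker(HTMLParser):
    """Record the src of every <img> whose alt text is absent or blank."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = attributes.get("alt")
        # Assumption: a blank alt is reported too, for manual review.
        if alt is None or not alt.strip():
            self.missing_alt.append(attributes.get("src") or "")


def images_missing_alt(html):
    """Return the src values of images that have no usable alt text."""
    checker = ImgAltChecker()
    checker.feed(html)
    checker.close()
    return checker.missing_alt


# Hypothetical page markup.
page = (
    '<img src="shoes.jpg" alt="Red running shoes">'
    '<img src="banner.jpg">'
    '<img src="spacer.gif" alt="  ">'
)
without_alt = images_missing_alt(page)
```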
As you can see, many errors can occur on a website, and without a website crawl you will never know which errors your website has. Not only is it important to produce high-quality content, but it is also vital that you have an error-free website to help boost your on-page SEO efforts.
Essentially, you must first take care of the errors that have high priority. These include 4xx/5xx errors, Duplicate URLs, Duplicate and Missing Title tags, and Duplicate and Missing H1s. Once all the errors have been fixed and you have produced high-quality content, the next step is to work on off-page SEO. Thanks for reading my technical SEO guide on website crawl errors! Contact us today for a technical SEO audit on your website!