
TAMU Webmaster's Blog

Information and insight from the A&M Webmasters

SEO Report Phase 2 – Site Structure

August 6th, 2014 by Erick Beck

Many elements of a site’s structure affect how search engines interpret and rank its pages. Each of these should be taken into account from the first stages of site design so that you don’t end up with a poorly optimized site that has to be restructured after the fact.

Page download speed

Google has publicly announced that download speed is a factor in search ranking, and has further announced that slow-performing mobile sites would be penalized. Its reasoning is that a poorly performing site leads to a bad user experience and should therefore receive less promotion within the results page. Google has not said outright how it measures speed, but the SEO industry has set up several tests in order to make educated guesses about how the algorithm works. Those tests found that in practice speed is not weighted heavily enough to affect the top-ten rankings. It seems to be a factor only when speeds are extraordinarily slow and sites are otherwise ranked closely enough that speed can act as a tiebreaker. This should not be taken as free rein to build slow sites, though. Google is trusted because it gives its users what they want, and as performance becomes more of an issue, download speed can easily become more important within the algorithms.

Robot exclusion

Since the beginning of the web there has been a need to instruct automated website crawlers, often known as spiders, how to view your site. The robots.txt file, which must be located in the site’s root directory, is the accepted standard. It allows the site owner to declare which spiders may access the site and what parts of the site are excluded from crawling.

Search engines are the best example of an internet spider, and most legitimate search engines honor robots files. We can take advantage of this in terms of optimizing our site for searches by excluding pages or directories that we do not want indexed.
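To make the exclusion rules concrete, here is a minimal sketch of how a well-behaved crawler might read a robots.txt file, using Python's standard `urllib.robotparser` module. The file contents, the `/private/` and `/drafts/` paths, the `BadBot` name, and the `www.example.edu` host are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths and the "BadBot" name are
# illustrative assumptions, not rules from any real site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/

User-agent: BadBot
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text the way a polite crawler would."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# An ordinary spider may crawl public pages but not the excluded directory.
print(parser.can_fetch("Googlebot", "https://www.example.edu/about/"))
print(parser.can_fetch("Googlebot", "https://www.example.edu/private/notes.html"))

# The spider named in its own section is shut out of the whole site.
print(parser.can_fetch("BadBot", "https://www.example.edu/about/"))
```

Keep in mind that this only models spiders that choose to honor the file; nothing in robots.txt enforces the rules.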

A robots.txt file should not be the only method employed, though. If a page that is indexed contains a link to any of the excluded pages, that link can still show up in the search results. If the page header can be templated, or if only a small number of pages need protecting, a robots meta tag with a noindex value is actually preferable.
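As a rough illustration of the meta tag approach, the sketch below checks whether a page carries a robots meta tag asking not to be indexed. The class and function names and the sample page are hypothetical, and real search engines certainly do more than this. Note also that a spider has to be allowed to fetch the page in order to see the tag at all.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma separated, e.g. "noindex, follow".
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def is_noindex(html: str) -> bool:
    """Return True if the page asks spiders not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page whose templated header carries the tag.
DRAFT_PAGE = """<html><head>
<meta name="robots" content="noindex, follow">
<title>Draft page</title>
</head><body><p>Work in progress.</p></body></html>"""

print(is_noindex(DRAFT_PAGE))
print(is_noindex("<html><head><title>Public page</title></head></html>"))
```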

Orphan pages

Understand how search engines build their index: they crawl from one page to another by following links. Be careful, then, not to create orphan pages that have no links pointing to them. A crawler that only follows links will never find them, so they will not show up in the search index. Remember, too, that inbound links are an important part of determining a page’s rank.
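One way to look for orphans is to imitate the crawl yourself. The sketch below assumes you already have a map of which pages link to which (the `SITE_LINKS` data and all of its paths are invented for the example) and walks it from the home page, reporting any known page that can never be reached.

```python
from collections import deque


def find_orphans(links: dict[str, list[str]], start: str) -> set[str]:
    """Return known pages that cannot be reached by following links from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # Every key is a page that exists on the site; anything the walk
    # never reached has no path of links leading to it.
    return set(links) - seen


# Hypothetical site: each page maps to the pages it links to.
SITE_LINKS = {
    "/": ["/about/", "/news/"],
    "/about/": ["/"],
    "/news/": ["/news/2014/"],
    "/news/2014/": [],
    "/old-event/": ["/"],  # links out, but nothing links in
}

print(find_orphans(SITE_LINKS, "/"))
```

Here `/old-event/` links back to the home page, but because no page links to it, a link-following spider starting at `/` would never find it.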

Images vs. text

Another important characteristic of a spider is that it processes your page as text. It cannot see what is in your images or derive any meaning from them. Therefore, use text instead of images, Flash, or other technologies to convey important information; search engines can’t understand information that is conveyed only visually. For images that you do use, be sure to include alt attributes, which can at least add some sense of what you are trying to convey. Keep in mind, though, that keywords appearing as text outweigh those found in alt attributes. Using a text browser such as Lynx will give you a good idea of how search engines see your site.
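A quick audit for missing alt text can be scripted. The following is a small sketch, with hypothetical names and sample markup, that lists the images on a page that have no alt text. It also flags images with an empty alt value, which is sometimes intentional for purely decorative images, so treat the output as a list to review rather than a list of errors.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Records the src of every <img> that has no usable alt text."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        if not (attr.get("alt") or "").strip():
            self.missing.append(attr.get("src") or "(no src)")


def find_missing_alt(html: str) -> list[str]:
    finder = MissingAltFinder()
    finder.feed(html)
    return finder.missing


# Hypothetical page fragment with one described image and two without text.
PAGE = (
    '<p><img src="/img/logo.png" alt="Texas A&amp;M University logo">'
    '<img src="/img/banner.jpg">'
    '<img src="/img/spacer.gif" alt=""></p>'
)

print(find_missing_alt(PAGE))
```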


Sitemaps

Sitemap files are a must when creating or updating any site that you want indexed in the major search engines. These are not the old HTML sitemaps that are simply links to every page on your site. Instead they are specifically formatted XML files that list your pages, indicate which ones you consider most important, and give the time each was last updated.

This file can in many ways act as a shortcut for the search engines. It allows you to explicitly state the organizational framework of your site, specify which pages you consider to be most important, and let the search engine know when each page was last updated. This last feature can be particularly important if you make a change deep within your site that you want indexed quickly.
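For a sense of what such a file contains, here is a minimal sketch that builds one with Python's standard library. It uses the `loc`, `lastmod`, and `priority` elements from the sitemaps.org protocol; the page list, the dates, and the `www.example.edu` URLs are placeholders, and a real site would more likely generate this from its CMS.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages: list[dict[str, str]]) -> str:
    """Build sitemap XML from dicts with 'loc', 'lastmod', and 'priority' keys."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        ET.SubElement(url, "lastmod").text = page["lastmod"]    # e.g. YYYY-MM-DD
        ET.SubElement(url, "priority").text = page["priority"]  # 0.0 to 1.0
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages; the URLs and dates are placeholders.
PAGES = [
    {"loc": "https://www.example.edu/", "lastmod": "2014-08-06", "priority": "1.0"},
    {"loc": "https://www.example.edu/news/", "lastmod": "2014-08-01", "priority": "0.5"},
]

sitemap_xml = build_sitemap(PAGES)
print(sitemap_xml)
```

The resulting text would typically be saved as UTF-8 at the site root and submitted to the search engines or referenced from robots.txt.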


Broken links

Broken links are generally perceived as bad for a site’s search optimization. This can make site redesigns and even general site maintenance a problem when page URLs get removed or renamed. Keep in mind that even though your own site might have updated its links to the new structure, there are likely dozens if not hundreds of links to your former pages from sites that you don’t control, all of which will now produce 404 errors. When a page’s URL can’t be maintained, redirect it to the new page location. Further, use true 301 redirects at the server level rather than meta refresh tags within a stub page. While popular and easy, meta refresh tags act only after a preset period of time once a page has been hit, which can disrupt the crawling process.
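In practice a 301 is usually a short rule in the web server configuration. To show what the spider actually receives, the sketch below uses a tiny WSGI application instead; the `REDIRECTS` map and its paths are invented for the example. The point is that the old URL answers immediately with a 301 status and a `Location` header, with no stub page and no delay.

```python
# Hypothetical mapping from retired URLs to their replacements.
REDIRECTS = {
    "/admissions.html": "/admissions/",
    "/news/old-story.html": "/news/2014/story/",
}


def app(environ, start_response):
    """Minimal WSGI app: answer retired URLs with a permanent redirect."""
    path = environ.get("PATH_INFO", "/")
    target = REDIRECTS.get(path)
    if target is not None:
        # A true HTTP-level redirect, sent before any page content.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("404 Not Found", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"Not found\n"]


def demo(path):
    """Call the app directly and return (status line, Location header or None)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    b"".join(app({"PATH_INFO": path}, start_response))
    return captured["status"], captured["headers"].get("Location")


print(demo("/admissions.html"))
print(demo("/no-such-page.html"))
```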

Search engines and JavaScript

One common question is whether search engines can index content inside of JavaScript. The short answer is “sometimes.” It really depends on the technique used to embed the content. One popular use is tabbed panels that show different content. An SEO firm put these to the test using different off-the-shelf scripts. With some of the scripts, Google indexed the content and associated it with the parent page; with others, it crawled the content but indexed it as a completely separate page. So if you want to be sure, put the content in plain text.
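A simple way to check whether a piece of content is safe is to ask whether it exists in the page markup before any script runs. The sketch below is only an approximation of that idea, not a model of how any search engine works: it extracts the text that sits in the HTML itself and ignores everything inside script and style blocks. The tab markup and the wording in it are made up.

```python
from html.parser import HTMLParser


class StaticTextExtractor(HTMLParser):
    """Collects text present in the markup, skipping <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def static_text(html: str) -> str:
    extractor = StaticTextExtractor()
    extractor.feed(html)
    extractor.close()
    return " ".join(extractor.chunks)


# Hypothetical tabbed page: the first tab is plain markup, the second
# is filled in by a script after the page loads.
TABBED_PAGE = """
<div id="tabs">
  <div id="tab-1">Admissions deadlines are listed here.</div>
  <div id="tab-2"></div>
</div>
<script>
  document.getElementById("tab-2").innerHTML = "Tuition rates are listed here.";
</script>
"""

text = static_text(TABBED_PAGE)
print("Admissions deadlines" in text)
print("Tuition rates" in text)
```

Content that fails this kind of check is the content that depends on the search engine running your script correctly.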

