TAMU Webmaster's Blog

Information and insight from the A&M Webmasters

TAMU Search

November 2nd, 2007 by Erick Beck

After a full day of letting the Google spider run, we have some interesting observations. One – there is a lot of old, broken junk out there. Two – we can tell who is using robots.txt files, and we really appreciate you. Three – about 100 sites make up 80% of the web content online.

We’re going to continue refining our search filters next week. I really want to avoid an opt-in approach to what will be allowed in the search engine, even if that means micromanaging the search filters. The one thing slowing us down – until Google receives payment, we’re stuck with a license limiting us to 500,000 documents rather than 2 million. Until we get that in place we can’t go a whole lot further, since we have far more pages than that online.

Friday, November 2nd, 2007 Ongoing Projects, Search