
TAMU Webmaster's Blog

Information and insight from the A&M Webmasters

Search engine

July 20th, 2007 by tamuwebmaster

We’ve been dealing with search engines for quite a while. Back at CIS, we worked with the Systems folks when they brought in Inktomi/Ultraseek. When the university moved to Google’s university search engine offering, it both did and didn’t help. It would go after all of the pages and we were no longer bound by a license, but we lost the ability to index on demand and to “weight” the rankings so that the right sources always appeared at the top.

CIS had reopened the search engine discussion with Marketing & Communications before we came on board, and now we have taken it up ourselves. The landscape has changed, though, and we will be investigating some new options. Depending on the product/solution, we’d want to be able to provide the equivalent of a library “collection”: a group of links, and sets of links, that belong together. That way there might be an “engineering” collection or an “admissions” collection to better serve audiences who might not know exactly what they are looking for within a given set of terms.
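
One way to picture a collection is as a named list of URL prefixes, with a search optionally restricted to pages that fall under one of them. The sketch below is only an illustration under that assumption: the hosts are placeholders, the function names are hypothetical, and no particular search product is implied.

```python
from urllib.parse import urlparse

# Hypothetical collections: each name maps to URL prefixes (host + path)
# that belong together. The hosts are placeholders, not real campus sites.
COLLECTIONS = {
    "engineering": ["engineering.example.edu/", "www.example.edu/engineering/"],
    "admissions": ["admissions.example.edu/", "www.example.edu/admissions/"],
}


def collections_for(url):
    """Return the sorted names of every collection whose prefix matches the URL."""
    parts = urlparse(url)
    # Compare on "host/path" so the scheme (http vs https) does not matter.
    target = parts.netloc.lower() + (parts.path or "/")
    return sorted(
        name
        for name, prefixes in COLLECTIONS.items()
        if any(target.startswith(prefix) for prefix in prefixes)
    )


def filter_by_collection(urls, name):
    """Keep only the URLs that fall inside the named collection."""
    return [u for u in urls if name in collections_for(u)]
```

A page can belong to more than one collection here, which seems closer to how a library would shelve a cross-listed title than a strict one-to-one mapping would be.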

While we look for the right application/solution, we’re also keen on setting up some best practices for page coding, so that pages are tagged/coded the right way and we don’t end up fighting so much junk. There are still calendar applications out there that create a separate HTML page for every day from creation to the end of time, and those really don’t need to be in the search results.
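
One conventional way to keep pages like those out of a crawl is a robots.txt rule, provided the spider honors it. As a minimal sketch, Python’s standard `urllib.robotparser` can check which URLs a rule set would block. The `/calendar/day/` path and the example.edu host are assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the per-day calendar pages, allow the rest.
# The /calendar/day/ path is invented for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /calendar/day/
"""


def build_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def crawlable(parser, url, agent="*"):
    """Return True if the given user agent may fetch the URL under these rules."""
    return parser.can_fetch(agent, url)


parser = build_parser(ROBOTS_TXT)
for url in (
    "http://www.example.edu/calendar/",
    "http://www.example.edu/calendar/day/2007-07-20.html",
):
    print(url, crawlable(parser, url))
```

Running a list of known junk URLs through a check like this before publishing a rule change is a cheap way to confirm the rule blocks what it should and nothing more.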

To sum it all up, let me use an example. When someone types “science” into the Google CSE we have on WWW, we want to be able to weight pages such as the “College of Science” home page (9th on the list) higher than, say, “Color Science” (2nd). We’d also want to make sure that Political Science, for example (10th), titles its page correctly so that it is clear this is the home page for the PoliSci folks.

Poultry Science (1st and 4th) is also a good example, because we’d want to make sure that the home page shows up on the list before their “virtual library”.

Related refinements would also be nice, so that “[variable] science” options could be highlighted; that could help Biomedical Science, Food Science, and Construction Science (12th, 14th, 19th) “move up”. But then again, once we are able to control the returns, the weighting and the spidering, we can also make sure that “Index of/” and “File not found” (17th and 18th) are gone: page by page, directory by directory or, if it’s really bad, site by site.
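
The weighting and cleanup described in this example can be sketched in a few lines. This is only an illustration: the positions and labels come from the “science” results mentioned above (other positions omitted), while the boost value, the junk prefixes and the `rerank` function are hypothetical and not tied to any search product.

```python
# (original position, label) pairs from the "science" example in this post.
RESULTS = [
    (2, "Color Science"),
    (9, "College of Science"),
    (10, "Political Science"),
    (17, "Index of/"),
    (18, "File not found"),
]

# Hypothetical editorial settings. A negative boost moves a page up,
# because a lower adjusted score ranks higher.
BOOSTS = {"College of Science": -10}
JUNK_PREFIXES = ("Index of", "File not found")


def rerank(results, boosts=BOOSTS, junk=JUNK_PREFIXES):
    """Drop junk pages, then sort by original position plus any editorial boost."""
    kept = [(pos, title) for pos, title in results if not title.startswith(junk)]
    ranked = sorted(kept, key=lambda r: r[0] + boosts.get(r[1], 0))
    return [title for _, title in ranked]


print(rerank(RESULTS))
```

With these made-up settings the College of Science page moves ahead of Color Science, and the two junk entries drop out entirely.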

Friday, July 20th, 2007 Future Projects

2 Comments to Search engine

  1. Have you looked into Google Coop at all?

    I haven’t messed around with it much, but it seems to do a lot of what you’re after.

    Bill Erickson on July 20th, 2007

  2. Google’s Custom Search Engine is built on the Google Co-op platform. It’s fairly robust, but like Google Co-op you don’t have any control over what gets spidered, when it gets spidered, and how often it gets spidered. It does offer refinements but it is limited, especially across the whole campus.

    It does let you return results without ads (which is great for us), but we will be pursuing a hosted appliance or engine ourselves to give us even greater control over returns, features and search capabilities.

    chiv on July 20th, 2007
