Quick and easy custom search with Google

By Nick Holmes on July 14, 2008
Filed under Articles, Search

Some 18 months ago Google launched its Custom Search service (still in beta), which enables you to create a custom search engine (CSE) focussing on up to 2,000 specified URLs.

The rationale is that, however sophisticated its algorithms and however carefully you craft a search, Google will always return some results near the top that are of low relevance or unwanted, and will effectively bury others that are particularly relevant or wanted. A Google CSE lets you influence the results, either by restricting them to particular sites or parts of sites, or by emphasising results from those sites, which you have chosen as the most relevant to your audience.

Applications

Before looking at the nitty-gritty of how it works, let’s consider the circumstances in which a Google CSE might be useful.

Search your own site

Chances are Google indexes all the open-access content on your site. By pointing a Google CSE just at your site you instantly have a site search facility with which your users are familiar. Even if you currently have a site search facility, it’s worth experimenting with a Google CSE.

Search a single favourite site

How many large sites have you visited where the search engine provided is less than satisfactory or even worse than useless for your purposes? By setting up a Google CSE for the site you can customise the search experience to suit your exact needs.

Search multiple sites in a particular domain

On the face of it, this is the most compelling application for CSEs. You can combine the power of Google with your own expertise and judgment in a particular domain to deliver results that are most relevant to your target audience.

How to create a custom search engine

Setting up a basic Google CSE is simplicity itself.

  • Go to www.google.com/coop/cse/ and, if you do not already have a Google account, register.
  • Now click the button to Create a Custom Search Engine.
  • Enter a Name for your search engine and a Description and drop into the Sites to search box a list of the URLs of the sites you want to include or emphasise in the search.
  • Accept the Terms of service and click Next.
  • On the next page, click Finish and you’re done.

You now have a Google CSE accessible via a page on Google that will search just the sites you have listed.
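If your list of sites is long, it may be worth tidying it before pasting it into the Sites to search box. The following is only a sketch of one way to do that in Python: the function name prepare_site_list is my own invention, the 2,000 limit is simply the figure quoted above, and nothing here talks to Google.

```python
from urllib.parse import urlsplit

# Limit quoted in this article; treat it as an assumption, not a Google constant.
MAX_SITES = 2000


def prepare_site_list(raw_lines):
    """Return a de-duplicated list of site entries, in their original order."""
    seen = set()
    sites = []
    for line in raw_lines:
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        # Add a dummy "//" prefix when no scheme is given so urlsplit finds the host.
        parts = urlsplit(entry if "//" in entry else "//" + entry)
        # Drop the scheme, lower-case the host and trim any trailing slash.
        normalised = (parts.netloc.lower() + parts.path).rstrip("/")
        if normalised and normalised not in seen:
            seen.add(normalised)
            sites.append(normalised)
    if len(sites) > MAX_SITES:
        raise ValueError(f"{len(sites)} entries exceeds the limit of {MAX_SITES}")
    return sites


if __name__ == "__main__":
    raw = ["http://www.example.com/", "www.example.com", "", "https://Example.org/blog/"]
    print("\n".join(prepare_site_list(raw)))
```

The output, one entry per line, can then be pasted into the box as it stands.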

Improvements

To improve your CSE, go to the My search engines page and click the control panel link for your CSE.

Include or emphasise the sites?

Under preferences, you’ll note you can select to search only the included sites or to search the entire web but emphasise the included sites. The latter option will weight the sites you have listed so that they generally appear before other results. This is an important decision and it will depend on each particular application. Bear in mind that if you include only the sites you have listed, all other results – many of which may be relevant to your users – will not appear.
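To make the difference concrete, here is a toy sketch in Python. It is not Google's ranking method, which is not public: the function names are hypothetical, the "results" are just a list of URLs in their original order, and the emphasise option is simplified so that listed sites always come first rather than generally come first.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the lower-cased host name of a URL."""
    return urlsplit(url).netloc.lower()


def include_only(results, listed_hosts):
    """Keep only results from the listed sites; everything else is dropped."""
    return [url for url in results if host_of(url) in listed_hosts]


def emphasise(results, listed_hosts):
    """Move results from listed sites to the front and keep all the rest."""
    # sorted() is stable, so the original order is preserved within each group.
    return sorted(results, key=lambda url: host_of(url) not in listed_hosts)


if __name__ == "__main__":
    results = [
        "http://a.example/1",
        "http://b.example/2",
        "http://c.example/3",
        "http://b.example/4",
    ]
    listed = {"b.example"}
    print(include_only(results, listed))
    print(emphasise(results, listed))
```

With include_only the results from a.example and c.example disappear altogether; with emphasise they are still there, just further down.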

Whole sites, folders or patterns?

Click the Sites menu link and click on one of the sites you’ve listed. You’ll note you can include all pages whose address contains the URL or include just the specific page or URL pattern you have entered. It’s tempting to take the easy route and simply include whole sites. But consider whether it would be better instead to include several specific folders or to use wildcards to select specific subsets of pages.
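As a rough illustration of how the two options differ, the Python sketch below tests URLs against an entry either as "address contains" or as a pattern in which * stands for any run of characters. This is my own approximation for planning purposes; the exact pattern syntax Google accepts may differ, and the function names and example addresses are hypothetical.

```python
import re


def strip_scheme(url):
    """Remove a leading 'http://' or 'https://' if present."""
    return url.split("://", 1)[-1]


def matches_contains(url, entry):
    """'Address contains' mode: the entry may appear anywhere in the address."""
    return entry in strip_scheme(url)


def matches_pattern(url, pattern):
    """Pattern mode: the whole address must match, with '*' as a wildcard."""
    # Escape each literal piece and join the pieces with '.*' for each '*'.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.fullmatch(regex, strip_scheme(url)) is not None


if __name__ == "__main__":
    url = "http://www.example.com/cases/2008/smith.html"
    print(matches_contains(url, "example.com/cases"))
    print(matches_pattern(url, "www.example.com/cases/*"))
    print(matches_pattern(url, "www.example.com/cases/*.pdf"))
```

Running a sample of a site's addresses through something like this before committing to a pattern shows quickly whether a folder or wildcard entry picks up the pages you intend.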

Refinements?

By labelling your sites, you enable the user to refine their results by category. Your labels can be used either to include only sites in that category or to emphasise those sites over the others. The decision will usually depend on whether the categories are mutually exclusive or not.
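The Python sketch below shows why that matters. It assumes a hypothetical table of sites and labels (the names are made up) and a refine function of my own; it models the idea only and is not how Google stores or applies labels. Note that one site carries two labels, so the categories here are not mutually exclusive.

```python
from urllib.parse import urlsplit

# Hypothetical sites and labels, for illustration only.
SITE_LABELS = {
    "cases.example": {"cases"},
    "statutes.example": {"legislation"},
    "blog.example": {"commentary", "cases"},
}


def refine(results, label, only=True):
    """Refine results by label: filter when only=True, otherwise emphasise."""

    def has_label(url):
        host = urlsplit(url).netloc.lower()
        return label in SITE_LABELS.get(host, set())

    if only:
        return [url for url in results if has_label(url)]
    # Stable sort: labelled sites first, original order kept within each group.
    return sorted(results, key=lambda url: not has_label(url))


if __name__ == "__main__":
    results = [
        "http://statutes.example/a",
        "http://blog.example/b",
        "http://cases.example/c",
    ]
    print(refine(results, "cases"))
    print(refine(results, "cases", only=False))
```

Where a site sits in more than one category, as blog.example does here, emphasising is usually the safer choice because nothing is hidden from the user.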

My place or yours?

If you’re using the default Google-hosted version, just click the Look and feel menu link to customise the format of your search page. But your CSE will be far more effective if you host it on your own site. This involves a little more work.

Click the Code menu link and you’ll be given the Google iFrame code that will enable you to:

  • Place a search box for your CSE on any page on your site. Simply drop the search box code into the relevant place on the page.
  • Place the results on a designated page on your site. Set up a page template to host the results; drop the search box code into the top of the page and the search results code beneath.

Pesky ads?

If you accepted the default Standard Edition, which is free, Google ads will be displayed on the results pages. These will appear either above and below the results or to the right. Click the Code menu link to change this.

To suppress the Google ads you’ll need to pay Google $100 p.a. Click the Business Edition menu link to sign up.

Effectiveness

Do CSEs deliver useful search engines that improve the search experience for your users?

There are many circumstances in which CSEs will be effective in searching single sites or particular parts of sites because an existing alternative is lacking or inefficient. However, my experience of multi-site CSEs produced for particular domains is that, without exception, they all disappoint.

Examples of CSEs produced for the UK legal arena include:

  • My blawgle, searching UK law blogs, and a few others I developed covering cases, legislation and government sites.
  • Nearly Legal’s LawSearch, which includes selected statute and case law, government guidance, reports, commentary, other resources and help, and law blogs.
  • OUT-Law’s LawTrawlUK, which currently includes 95 major law firm sites and some significant public sector sites.

Most CSEs I’ve come across are fairly basic, including just a selected list of sites to be included in the results. But producing a CSE that does the business requires considerable thought and time spent implementing the advanced features: carefully and methodically selecting specific folders and/or file types using lists or wildcards, labelling the entries and weighting them.

For those reasons, when setting up the experimental CSEs on infolaw, I decided that creating CSEs with tightly defined scopes might be a fruitful path to follow. Within some I spent time pointing to specific folders and folder patterns rather than just the sites, and for some I added labels so that results could be refined. I did not get into weighting the results; that was time I wasn’t willing to spend initially. Reactions from others have been positive, but I have to say I do not use them much myself and I have yet to try to “sell” their benefits to others.

I have also tried a number of other CSEs designed for other domains. Genuine effort and expertise have gone into all of these, but they fail to engage. Either one doesn’t know sufficiently precisely the scope of one’s search or, if one does, one would prefer a different selection of sites, or the results feel unbalanced. And all the time one knows one will be missing some key results and those unexpected nuggets that a well-crafted global Google search would serve up. Narrowing the domain searched often takes away more than it gives.