Search

Catching up

Been away on protracted hols. Quite possible to have kept posting of course, but did not have the inclination. Had I done so, here are a few things I might have posted about:

Martindale-Hubbell Connected

In July Robert Ambrogi took an exclusive first look. It’s now out in public beta. Will this fly or crash?

The rise of Twitter for lawyers

Adrian Lurssen on JD Supra posted a list of 145 Lawyers (and Legal Professionals) to Follow on Twitter; the list has now grown to 250. Kevin O’Keefe has also been posting a lot about lawyers’ use of Twitter. So we’re now in the talking-it-up phase; as to how useful a tool it will turn out to be, the jury will be out for some time longer.

Information Overlord goes RSS crazy

Scott Vine posted an impressive list of links for UK central government departments, executive agencies and non-departmental public bodies with RSS feeds and he’s followed it up with similar lists for several other European countries. Must be some kind of masochist.

The FindLaw gaming Google game

There’s been plenty more on this. An indifferent article on law.com sums up, but Kevin O’Keefe continues to be the man on the case.

A recent post on LexBlog highlights the importance of knowing what you’re doing or what others are doing for you when you seek to boost your Google juice by purchasing links or engaging in “excessive” link exchanges. In his post FindLaw gaming Google? Kevin O’Keefe reviews what FindLaw are doing for lawyer customers for $1,000 per month and quotes Google’s webmaster guidelines which suggest that’s money not just down the drain, but backing the drain up:

Buying or selling links that pass Google PageRank is in violation of Google’s webmaster guidelines and can negatively impact a site’s ranking in search results. Not all paid links violate our guidelines. Buying and selling links is a normal part of the economy of the web when done for advertising purposes, and not for manipulation of search results. Links purchased for advertising should be designated as such.

Read the full post and much related discussion.

Later: Steve Matthews’ post on the topic looks at the story from a different angle. Were FindLaw just guilty of being arrogant?

Some 18 months ago Google launched its Custom Search service (still in beta) that enables you to create a custom search engine (CSE) focussing on anything up to 2,000 specified URLs.

The rationale is that, despite its undoubtedly sophisticated algorithms, even with a carefully crafted search, Google will always return results near the top that are of low relevance or unwanted and effectively hide some results that are particularly relevant or wanted. A Google CSE enables you to influence the results by including exclusively or emphasising results from particular sites or particular parts of sites that are determined by you to be most relevant to your audience.

Applications

Before looking at the nitty gritty of how it works, let’s consider the circumstances in which a Google CSE might be useful.

Search your own site

Chances are Google indexes all the open-access content on your site. By pointing a Google CSE just at your site you instantly have a site search facility with which your users are familiar. Even if you currently have a site search facility, it’s worth experimenting with a Google CSE.

Search a single favourite site

How many large sites have you visited where the search engine provided is less than satisfactory or even worse than useless for your purposes? By setting up a Google CSE for the site you can customise the search experience to suit your exact needs.

Search multiple sites in a particular domain

On the face of it, this is the most compelling application for CSEs. You can combine the power of Google with your own expertise and judgment in a particular domain to deliver results that are most relevant to your target audience.

How to create a custom search engine

Setting up a basic Google CSE is simplicity itself.

  • Go to www.google.com/coop/cse/ and, if you do not already have a Google account, register.
  • Now click the button to Create a Custom Search Engine.
  • Enter a Name for your search engine and a Description and drop into the Sites to search box a list of the URLs of the sites you want to include or emphasise in the search.
  • Accept the Terms of service and click Next.
  • On the next page, click Finish and you’re done.

You now have a Google CSE accessible via a page on Google that will search just the sites you have listed.

Improvements

To improve your CSE, go to the My search engines page and click the control panel link for your CSE.

Include or emphasise the sites?

Under preferences, you’ll note you can select to search only the included sites or to search the entire web but emphasise the included sites. The latter option will weight the sites you have listed so that they generally appear before other results. This is an important decision and it will depend on each particular application. Bear in mind that if you include only the sites you have listed, all other results – many of which may be relevant to your users – will not appear.
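The difference between the two options can be sketched in a few lines of Python. This is only an illustration, not how Google implements it: the result tuples, the relevance scores, the `boost` factor and the `.example` domains are all assumptions made up for the purpose.

```python
from urllib.parse import urlparse

def on_listed_site(url, listed_sites):
    """True if the URL's host is one of the listed sites or a subdomain of one."""
    host = urlparse(url).hostname or ""
    return any(host == site or host.endswith("." + site) for site in listed_sites)

def search_only(results, listed_sites):
    """'Search only included sites': everything else is dropped."""
    return [r for r in results if on_listed_site(r[0], listed_sites)]

def emphasise(results, listed_sites, boost=2.0):
    """'Search the entire web but emphasise included sites':
    nothing is dropped, but listed sites get a (hypothetical) score boost."""
    rescored = [
        (url, score * boost if on_listed_site(url, listed_sites) else score)
        for url, score in results
    ]
    return sorted(rescored, key=lambda r: r[1], reverse=True)

# Hypothetical results as (url, relevance score) pairs.
results = [
    ("http://www.general.example/a", 0.9),
    ("http://www.law.example/x", 0.6),
    ("http://news.other.example/b", 0.5),
]
listed = ["law.example"]

only = search_only(results, listed)
weighted = emphasise(results, listed)
```

With the first option the two unlisted results vanish altogether; with the second they are still there, but the listed site moves above them.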

Whole sites, folders or patterns?

Click the Sites menu link and click on one of the sites you’ve listed. You’ll note you can include all pages whose address contains the URL or include just the specific page or URL pattern you have entered. It’s tempting to take the easy route and simply include whole sites. But consider whether it would be better instead to include several specific folders or to use wildcards to select specific subsets of pages.
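To get a feel for what a set of folder and wildcard patterns will and won’t pick up, it can help to try them against a list of sample addresses first. The sketch below uses Python’s `fnmatch` module as a stand-in; it is an assumption that shell-style `*` matching is close enough to the CSE’s own pattern rules for this purpose, and the addresses are invented.

```python
from fnmatch import fnmatchcase

def matches_any(url, patterns):
    """True if the URL matches at least one shell-style pattern.
    Note: fnmatch also treats '?' and '[' as special characters."""
    return any(fnmatchcase(url, pattern) for pattern in patterns)

# Hypothetical patterns: one whole folder, plus PDFs only from another folder.
patterns = [
    "www.example.gov.uk/guidance/*",
    "www.example.gov.uk/reports/*.pdf",
]

# Hypothetical pages on the same site.
urls = [
    "www.example.gov.uk/guidance/housing.html",
    "www.example.gov.uk/reports/2008/annual.pdf",
    "www.example.gov.uk/reports/index.html",
    "www.example.gov.uk/press/today.html",
]

selected = [u for u in urls if matches_any(u, patterns)]
```

Here only the guidance page and the PDF report are selected; the reports index and the press page fall outside both patterns.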

Refinements?

By labelling your sites, you enable the user to refine their results by category. Your labels can be used either to include only sites in that category or to emphasise those sites over the others. The decision will usually depend on whether the categories are mutually exclusive or not.
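A quick way to see whether your categories overlap is to write the labels down as data and check. The following is a rough sketch only; the site names and labels are hypothetical and it has nothing to do with the format Google uses to store them.

```python
# Hypothetical labels attached to hypothetical sites; a site may carry several.
SITE_LABELS = {
    "cases.example": {"cases"},
    "statutes.example": {"legislation"},
    "lawblog.example": {"commentary", "cases"},
}

def sites_with_label(site_labels, label):
    """Return the sites a user would be refining to when picking one label."""
    return sorted(site for site, labels in site_labels.items() if label in labels)

def labels_are_exclusive(site_labels):
    """True when every site carries exactly one label."""
    return all(len(labels) == 1 for labels in site_labels.values())

cases_sites = sites_with_label(SITE_LABELS, "cases")
exclusive = labels_are_exclusive(SITE_LABELS)
```

In this example the blog carries two labels, so the categories are not mutually exclusive.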

My place or yours?

If you’re using the default Google-hosted version, just click the Look and feel menu link to customise the format of your search page. But your CSE will be far more effective if you host it on your own site. This involves a little more work.

Click the Code menu link and you’ll be given the Google iFrame code that will enable you to:

Place a search box for your CSE on any page on your site. Simply drop the search box code into the relevant place on any page on your site.

Place the results on a designated page on your site. Set up a page template to host the results; drop the search code into the top of the page and drop the search results code beneath.

Pesky ads?

If you accepted the default Standard Edition which is free, Google ads will be displayed on the results pages. These will appear either above and below the results or to the right. Click the Code menu link to change this.

To suppress the Google ads you’ll need to pay Google $100 p.a. Click the Business Edition menu link to sign up.

Effectiveness

Do CSEs deliver useful search engines that improve the search experience for your users?

There are many circumstances in which CSEs will be effective in searching single sites or particular parts of sites because an existing alternative is lacking or inefficient. However, my experience of multi-site CSEs produced for particular domains is that, without exception, they all disappoint.

Examples of CSEs produced for the UK legal arena include:

My blawgle, searching UK law blogs, and a few others I developed covering cases, legislation and government sites.

Nearly Legal’s LawSearch which includes selected statute and case law, government guidance, reports, commentary, other resources and help and law blogs.

OUT-Law’s LawTrawlUK which currently includes 95 major law firm sites and some significant public sector sites.

Most CSEs I’ve come across are fairly basic, including just a selected list of sites to be included in the results. But to produce a CSE that does the business does require considerable thought and time implementing the advanced features: carefully and methodically selecting specific folders and/or file types using lists or wild cards, labelling the entries and weighting them.

For those reasons, when setting up the experimental CSEs on infolaw, I decided that creating CSEs with tightly defined scopes might be a fruitful path to follow; within some I spent time pointing to specific folders and folder patterns rather than just the sites; and for some I added labels so that results could be refined. I did not get into weighting the results; that was time I wasn’t willing to spend initially. Reactions from others have been positive, but I have to say even I do not use them much myself and I have yet to try and “sell” their benefit to others.

I have also tried a number of other CSEs designed for other domains. Genuine effort and expertise have gone into all of these; but they fail to engage because either one doesn’t know sufficiently precisely the scope of one’s search, or if one does, one would prefer a different selection of sites, or the results feel unbalanced; and all the time one knows one will be missing some key results and those unexpected nuggets that a well-crafted global Google search would serve up. Narrowing the domain searched often takes away more than it gives.

When Google launched its Custom Search Engine service 18 months ago, I expected thousands of CSEs to pop up all over. That’s happened, but I’m not aware that any in the areas I monitor have made a mark. Why so?

In the UK legal arena I know of only a few CSEs:

  • I put together a number focussed on UK blawgs, cases, legislation and government sites.
  • Nearly Legal’s LawSearch includes selected statute and case law, government guidance, reports, commentary, other resources and help and law blogs.
  • More recently Struan Robertson at OUT-Law launched LawTrawlUK which currently includes 95 major law firm sites and some significant public sector sites.

I’ve also tried out a number of others in related fields and, without exception, they all disappoint. This is not to detract from the genuine effort that has gone into them. They fail to excite because either one doesn’t know sufficiently precisely the scope of one’s search, or if one does, one would prefer a different selection of sites, or the results feel unbalanced; and all the time one knows one will be missing some key results and those unexpected nuggets that a well-crafted global Google search would serve up. Narrowing the domain searched often takes away more than it gives.

For those reasons, when setting up my experimental CSEs, I figured that CSEs with tightly defined scopes might be a fruitful path to follow; within some I spent time pointing to specific folders and folder patterns rather than just the sites; and for some I added tags so that results could be refined. I did not get into weighting the results; that was time I wasn’t willing to spend initially.

Most CSEs I’ve come across are fairly basic, including just a selected list of sites to be included in the results. But to produce a CSE that does the business does require considerable thought and time implementing the advanced features: carefully and methodically selecting specific folders and/or file types using lists or wild cards, labelling the entries and weighting them.

A couple of big players have recently come out with new “legal” search engines (for the US market).

There is Westlaw’s WebPlus which, “through a combination, it seems, of editorial selection of sites or domains and an algorithm the engine offers to fetch you from the web a better selection of legally interesting results than a simple Google search might do” (per Slaw) and Law.com’s Quest which “draws results from Law.com sites and ALM publications. The broader search includes legal sites and blogs selected by Law.com staff” (per Robert Ambrogi).

To give them a whirl I tried various “mid-stream” searches, for example:

Westlaw WebPlus search for recorded music copyright
Law.com Quest search for recorded music copyright

WebPlus certainly has a nicer, more familiar-feeling interface. There are plenty of results that come from non-legal news etc sites. One assumes though that, by some deft algorithm, the legal sites are ranked higher all other things being equal. Quest seems to limit itself to “legal” sites, some of which are subscription only; the interface I find tacky. More thorough, in-depth study would be needed for a proper review. If any readers have used these search engines, let’s have your comments.

I concur with Bob Ambrogi that the drawback with these (and any other such custom search engine) is that, unless you know which sites and blogs they index, you are left uncertain of how to interpret the results. We’ve become used to Google etc searching everything, and to how it ranks results, so when someone says “Here’s a legal search engine that gives you more relevant results”, you go “Whoa! I really want to know what you are searching (and what you are not searching) and how you are ranking the results before I’ll swallow your line.”

Incidentally, I plugged recorded music copyright into my Google CSE, Blawgle, which was literally knocked up in minutes (plus a few more minutes from time to time to supplement the list of sites). What is it searching? All UK blawgs. How is it ranking results? Unadulterated Google. Strikes me that despite (or maybe because of) its limited scope it may be more immediately useful than WebPlus or Quest.

WebPlus also very prudishly told me it had no results for sex.

Navigational search

Navigational search – gaining access to a specific site or page by searching for the actual web address or a portion of it – is common, not just amongst the uninitiated (who you might say do it out of ignorance), but amongst the web savvy.

Jeremy Crane at Compete:

It’s actually astonishing how often people search for the complete web address and click on the corresponding search result to get to the site they are trying to navigate to. It makes me laugh every time I see my parents do this, but even more amazing is when the “web savvy” amongst us does this.

Jack Schofield at the Guardian:

As a “web savvy” person, I do it often, and Jeremy should know why. First, if I type into the search box instead of the address bar, it doesn’t matter if I make a typing mistake. Second, I might be guessing or have half-remembered the URL I want: it may look strange if I get it right, but often I don’t. Third, there are plenty of Web sites that are not very responsive, or include a lot of junk code. Rather than going to the site, I might actually want to look at it in Google’s cache first.

But thinking in terms of URLs for web access is so Web 1.0, don’t you think?

With Google refining its algos and providing special searches and with websites increasingly using URL rewriting, the need to use URLs for access and to use in-site searches will all but disappear.

For example, to access a Wikipedia article on a topic, I just type wikipedia <search term> into the Google search box. Most times a specific Wikipedia article is No. 1 on Google. Similarly to find a page on any other site, I type <domain> <search term>. This works even for low-ranked sites provided the domain is a sufficiently distinctive word; and only of course if the search term is likely to be in a page title. For special Google searches I type define <word>, books <search term> etc, or just a post code to find a location on Google Maps, etc.
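These shortcuts are nothing more than ordinary queries with a keyword in front, so they are easy to generate if you want to build them into bookmarks or a launcher. A minimal Python sketch, assuming the standard Google search address with its `q` parameter; the helper name `search_url` is my own invention.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://www.google.com/search"

def search_url(*terms):
    """Join the terms into one query and return the search address for it."""
    return SEARCH_URL + "?" + urlencode({"q": " ".join(terms)})

# <domain> <search term>: let Google find the page rather than typing its URL.
wiki = search_url("wikipedia", "copyright")

# Special searches are the same thing with a different leading keyword.
definition = search_url("define", "serendipity")
```

Whether the page you want actually comes out on top still depends on the conditions above: a distinctive domain name and a search term that is likely to appear in the page title.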

And in Firefox you don’t even need to use the search box as it will perform a search on words typed into the address bar (you can configure Firefox to use whichever search engine you prefer).

Larry Bodine thinks so:

Clients use Google to look up phone numbers and addresses, so law firms can cancel their yellow pages ads. When clients want to check out your firm, they are not going to call up to get your printed brochure, they will look you up online.

Kevin O’Keefe agrees but sees the directories playing a part in your visibility on Google:

As for directories such as FindLaw, Martindale-Hubbell, Super Lawyers, Avvo, and the like, the most important function they can play is getting the biographical information of your firm and its lawyers indexed at Google. The days of a lawyer directory portal site where Internet users go to look up lawyers are coming to an end.

However, I’m sure both Larry and Kevin would agree that what’s most important is for you directly to influence your visibility on the search engines – Google in particular – as that’s where most people search, unrestricted by the broad subject classifications used by the directories; you target the specific terms that will be most relevant and most effective in attracting hits. Gaining front page rankings on Google is not easy, but a significant improvement is achievable and need cost very little: the basic principles for improving the “keyword relevance” of your pages are straightforward, and, for example, the effectiveness of blogs in generating Google juice is well proven. When you’ve done what you can, you can then consider where to spend your “directory” budget – lawyer directories or Google Ads? I think Google wins again.

Adnonsense (3)

Google is in a bit of a bind. On the plus side it can be credited with:

  • opening up access to the web with Google search,
  • providing advertisers an effective channel for their web marketing through its AdWords scheme, and
  • giving legitimate publishers, large and small, the opportunity to generate income from serving up those ads through its AdSense scheme.

Ranged against this positive record is the fact that its AdSense scheme is responsible for phenomenal pollution of the web.

First, AdSense ads are everywhere.

Up to the plate has stepped Adblock Plus which effectively kills ads on web pages so you can experience the web as clean as it was 10 years ago. But are we also in a bit of a bind here? Will Adblock Plus also kill the internet economy? Nick Carr thinks:

There’s no evidence that Adblock Plus or similar products are about to go viral. In fact, there’s no evidence that the masses view online ads as a nuisance.

but if he’s wrong about that, he points out that:

Since nearly the entire internet economy relies on advertising of one form or another, the widespread use of ad blockers could well devastate many businesses, from giants like Google and Yahoo! to scores of tiny startups.

He rehearses the reasons for and against using Adblock Plus more fully in a later post, asking what would Jesus do? and pointing to Mark Evans who speaks for many web publishers when he calls Adblock Plus an evil predator.

If you believe in Web 2.0 and/or if you believe in the concept of free, Adblock is pure evil.

So the jury’s still out on that one.

However, the AdSense scheme is also responsible for a huge explosion of “made for AdSense” (MFA) sites – sites with no value of their own that post content optimised purely to drive ad traffic.

Initially most such sites were splogs (spam blogs) and the like – gibberish generated by machines usually from content scraped from others’ sites. But a new industry has grown of businesses that employ inexpensive “authors” to write original (to Google’s algorithmic eyes) but inevitably inane articles based on others’ content. This phenomenon is explored fully by Danny Bradbury in the Guardian article Word Farms of the Web.

Not only are these sites devoid of value; they also provide a poor return for the advertisers. From Ben Edelman, an expert investigator of spyware affiliate networks:

If I were Google, I wouldn’t have a difficult time deciding what to do here. This content is not useful. The world would be better off if these pages didn’t exist … The issue is where the money comes from – how it is that reasonably well-respected advertisers end up paying for this stuff?

Google can surely fix this – but will it?

SEO for dummies

Nearly Legal has a thing for Sally Field naked: she boosts his Google juice.

His recent rise in the rankings for the said search term was helped by the fact that on Tuesday Sally won the Best Lead Actress Emmy for her role in Brothers and Sisters where all those leading TV actors you’ve seen over the last ten years or more pretend now to be related. Not only this, but her acceptance speech was censored by Fox who removed the concluding observation that “if the mothers ruled the world, there would be no goddam wars in the first place”.

In the UK legal domain, you’re unlikely to generate any serious traffic if you stick to your subject. But stray into the popular realm by design or accident and you can see why there are so many sploggers and AdSense farmers out there. My own rather modest success in this domain was when I posted about new gambling and sports law blogs from Cecile Park Publishing. Almost instantly my Technorati rankings rocketed as some splogger referenced the post in more than 20 blogs. But, like crack, or any drug, the high is temporary, and you’re in for a downer when those links drop off the radar and you are restored to normality. And if you’re not after the AdSense bucks you’ve achieved nothing other than an entertaining diversion from your daily grind. But that’s enough reward sometimes.

SEO: quality is the key

Google Keeps Tweaking Its Search Engine in the New York Times gives a rare view inside one of the key departments at Googleplex. Amit Singhal, for some reason quaintly referred to as “Mr. Singhal” throughout, is the master of Google’s ranking algorithm, the complex program that calculates the relevance of a particular page to a particular query. He and his search quality team consider 100 or so reports per day on deficiencies and anomalies in Google results and make about a half-dozen changes a week to the page ranking formulae – everything from penalising particular types of spam sites that are ranking highly to promoting local traders who are not ranking high enough for very relevant local queries.

The article is short on analysis, but you clearly get the picture that this is a truly determined ongoing effort to improve search quality: first to review in a myriad of narrow contexts what is relevant and then through changes to the algorithm to promote the deserving, demote the undeserving and avoid upsetting the apple cart for the rest.

It follows that the only way to longer term success in the page rankings is to provide copious pages of quality, in-depth information relevant to all possible issues your prospective clients might want to address. In doing so you should also consider and apply basic optimisation rules, but the most important can be summarised on a few pages and implemented at little cost. Most other techniques employed by self-proclaimed SEO experts are simply costly shortcuts to short-term success, as Seth Godin comments:

In the SEO arms race, shortcuts have a shorter shelf-life than ever before. [Google's search quality team] is obsessed with them, and they outnumber whoever you might hire to beat the system. Organic success, on the other hand, is a clear path. If you want to be on the front page of matches for “White Plains Lawyer”, then the best choice is to build a series of pages (on your site, on social sites, etc.) that give people really useful information. Not just boilerplate information you stole from a legal website, but really useful stuff about you, the local courts, the forms people need … the things you’d want to find if you were doing that search.

Once you’ve done everything you can … once you’ve built a web of information and once you’ve given the ability to do this to your best clients and your partners and colleagues, then by all means apply the best SEO thinking in the world to your efforts. Hire the best consultants and use the resources you’ve got left to be sure you’re playing by the right rules.

(Hat tip: Kevin O’Keefe)
