September 2001

In too deep

A Page on the Web, published in the Solicitors Journal, September 2001

I have busied myself recently with the redesign and relaunch of the infolaw website. The main purpose of the site, since its inception in February 1995, has always been to provide ready access to free UK legal resources on the web – a portal in current parlance. The old site did so by means of a set of static web page indexes. The new site has considerably wider coverage. In addition to the general legal resources formerly on the site, it indexes primary law and other official documents, forms and precedents and other materials. These indexes are managed in a database, and intuitive methods of browsing and searching it are provided.

Coincidentally, the latest issue of Computers & Law (vol 12 issue 3, Aug/Sep 2001) dropped on my desk this week, containing two (as we will see, related) articles concerned with issues of direct import to the infolaw site and to UK legal portals in general.

Free access law services

Laurence Eastham argues in his editorial that although there are substantial free legal materials on the web:

  • the long-term commitment of the commercial publishers to free access services is suspect; and
  • many lawyers are ignorant of the ‘genuine’ free access services and there is therefore a need for an effective marketing drive to ‘sell’ their free gifts.

The commercial publishers’ free services are essentially marketing tools, designed to hook users in and drive them to purchase the publisher’s chargeable services. By contrast, the aim of the genuine free services is to facilitate access to the law, either out of public duty or out of charitable intent. There are, of course, grey areas in between occupied by, amongst others, the hobbyist, the exhibitionist etc, but the prime distinction is between commercial and non-commercial publishers.

The more commercial the publisher, the less useful the ‘free’ information ultimately is. The publisher has an interest in extracting value from the user, and the withdrawal of the free services is an ever-present likelihood. The publisher will also strongly assert its copyright and other publishing rights in the material, limiting the use that can be made of that information.

Deep linking

Another C&L article in the same issue, by Ian Jeffrey, concerns the significance for online publishers of the database right (incorporated from an EC Council Directive into the Copyright, Designs and Patents Act 1988).

Even a basic website may fall within the definition of a ‘database’ as it has a logical structure. Provided there has been a substantial investment in compiling and presenting the information, the database right will subsist and will be infringed if there is an unauthorised extraction or re-utilisation of all or a substantial part of it. Further, repeated extraction of small parts may amount to re-utilisation. The latter, in particular, has significant implications for the online portal.

Much of the value of the web lies in the ability to link disparate pages, not just on the ‘home’ site, but anywhere on the web. As the web grows and websites become more dense and complex, links to home pages become less and less useful for the researcher (as opposed to the online ‘shopper’). Any sufficiently focussed portal or other specialist site will therefore usually include direct links to pages within a website – this process being known as ‘deep’ linking. Potentially therefore, deep linking, if substantial, can infringe a linked site’s database right.

The more commercial the linked site and the more closely the linking site is seen to compete with it, the more likely it is that the linked site will assert its database right. Commercial publishers want you to enter via their home pages, as there you will get the corporate message, there you will be guided as the publisher intended, there you will see the adverts intended and there you will start to clock up that most precious of website statistics, the page view.

By contrast, deep linking benefits the user. For example, in providing a reference to the present two C&L articles, I could point you to the Society for Computers and Law’s site at www.scl.org. [However, you would then need to click on Publications, then SCL magazine, then click on Other issues: 2, then locate the appropriate links within the contents list and click them. I prefer therefore to refer you direct to the pages in question: Laurence Eastham's editorial and Ian Jeffrey's database right article.] You found that much more useful, didn’t you? But if I do that too often on a web page, I may be in trouble.

Search engines deep link all the time, since their crawlers do not just find and index home pages, but follow all links, down to a certain level, to individual pages in the non-protected areas of a site. Have you ever heard anyone complain that pages on their site appear in a major search engine? I suspect not! However, a publisher who sees a specialist portal as a competitor may well object since the value of the resulting visit will be diminished. If the objection is well founded, the linking site may well remove all the links and there will be no resulting visits and no value! Or the parties may agree some form of licence, the fee for which the linking site will somehow have to recoup.
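The crawling behaviour described above can be sketched as a breadth-first traversal with a depth limit. This is an illustration only: the link graph, the page paths and the `max_depth` parameter are all hypothetical, and a real crawler would fetch pages over HTTP and respect protected areas rather than read a dictionary.

```python
from collections import deque

# Hypothetical link graph: each page maps to the pages it links to.
SITE_LINKS = {
    "/": ["/publications", "/about"],
    "/publications": ["/publications/magazine"],
    "/publications/magazine": ["/publications/magazine/issue3"],
    "/publications/magazine/issue3": ["/publications/magazine/issue3/editorial"],
    "/about": [],
}


def crawl(start, links, max_depth):
    """Map each page reachable within `max_depth` link hops to its depth."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if depths[page] == max_depth:
            continue  # do not follow links beyond the depth limit
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


indexed = crawl("/", SITE_LINKS, max_depth=3)
# Every indexed page other than the home page is, in effect, a deep link.
deep_pages = [page for page, depth in indexed.items() if depth > 0]
```

With a limit of three hops the issue page is indexed but the editorial one level further down is not, which is the sense in which a crawler follows links only "down to a certain level".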

So, as we said at the start, the lunch is not free after all.

Note: Article amended December 2001 following relaunch of new SCL website!

First published in the Internet Newsletter for Lawyers, September 2001.

The public provision of law on the web leaves much to be desired:

  • it is not comprehensive
  • it is not “joined up”
  • it is not easily accessible.

This article takes stock of the current position and looks at ongoing initiatives to improve it. It refers in the main to primary law. However, similar issues apply also to other legal documents: official papers and reports, law forms and other authored legal resources on the web.

Law on the web is not comprehensive

HMSO’s site publishes a complete set of primary and delegated legislation of the four UK and national parliaments from 1988 onwards. While this is a substantial database (about 930 Acts and 29,000 SIs), there is nevertheless a large body of earlier statute law currently in force that is unavailable on the web.

The position with case law is even less satisfactory, as will be seen from the table below. Selected handed-down judgments started appearing on the Court Service site in 1996. The number of judgments being added has increased significantly, but coverage remains incomplete. The ongoing provision from other courts appears more comprehensive, though few go back earlier than 1996 – effectively the year the world discovered the web.

Provision of case law on the web

Publisher Coverage

Law on the web is not “joined up”

This is a favoured expression (of intent) in relation to government these days. How about joined up law? The law is not joined up, firstly, in the sense that statute law is not officially published in consolidated form. This, of course, has nothing to do with the web, but is simply an historical deficiency. But somehow the publication of the law in unconsolidated form on the web brings home how un-joined up it is. A Statute Law Database has been in development for many years and is intended to remedy this, providing a consolidated corpus of statute law able to reflect the law as it stood at any particular date. It currently contains the text of all Acts that were in force on 1 February 1991, and all Acts and printed Statutory Instruments passed since then. The Statutory Publications Office reports that there are now over 400 users of the SLD from other government departments. The SPO is evaluating the best way of making the data available to all the various bodies that will require access in the future. It estimates that the editorial work will be completed by Spring 2002 and that the information will be available to the general public at that time free of charge.

Another respect in which the law is not joined up is that it is published on many disparate sites, largely unconnected save for the odd set of “useful links”. This has implications as to the accessibility of the law as discussed below.

Law on the web is not accessible

To access all primary law on the web requires visiting a dozen unconnected sites. This in itself is inconvenient. But this is compounded by the fact that each site has a different structure and different methods of access. Some sites do not have a search facility and can only be browsed; others provide a structured search of key data fields (eg name, date, court etc); and others provide full text searching using various search engines. The average user is understandably completely at sea and cannot extract optimum value from what is intended in sum to be a valuable resource.

Of course a friendly means of accessing each database (via browsable indexes, a structured search template or other site search engine) should be provided by the publisher. However, this will be a local solution based on perceived requirements for accessing a particular set of documents. It is far more important that the documents should be readily identifiable and accessible in the first place so that others may develop access solutions (across these databases) best suited to their requirements.

For optimum accessibility what is required is that:

  • each document should have a unique standard neutral citation and a permanent web reference that can readily be inferred therefrom
  • each document should be annotated with key “metadata” (ie its properties, such as title, citation, date, subject) and browsable indexes and search facilities should use this.

As to the first requirement, Acts and SIs have long had a standard method of numbering (ie year and chapter/SI number) and HMSO now uses this systematically in its website file management.

However, as to cases, there was until recently no system at all. As from 11 January 2001 a form of neutral citation is now employed in both divisions of the Court of Appeal (EWCA Civ and EWCA Crim) and in the Administrative Court (EWHC Admin), judgments being uniquely numbered and cited in the following way:

[2000] EWCA Civ 1 (see Practice Direction (Judgments: Form and neutral citation), 11 January 2001, http://www.lawreports.co.uk/civjan0.3.htm).

This scheme is also adopted by the House of Lords (UKHL). There seems to me no good reason why this scheme should not be extended post haste to other courts and even applied retrospectively. And having implemented this scheme, why not use it in the website file management? (None of the relevant sites currently does.)
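To illustrate how a permanent web reference might be inferred from a neutral citation, here is a minimal sketch. The court identifiers are those named above; the directory layout and the function names `parse_citation` and `citation_to_path` are purely hypothetical and do not describe the file management of any of the sites discussed.

```python
import re

# Court identifiers named in the text. The path layout built below is a
# hypothetical convention, not one used by any of the sites discussed.
CITATION = re.compile(
    r"^\[(?P<year>\d{4})\]\s+(?P<court>UKHL|EWCA|EWHC)"
    r"(?:\s+(?P<division>Civ|Crim|Admin))?\s+(?P<number>\d+)$"
)


def parse_citation(text):
    """Split a neutral citation such as '[2000] EWCA Civ 1' into its parts."""
    match = CITATION.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised neutral citation: {text!r}")
    parts = match.groupdict()
    parts["year"] = int(parts["year"])
    parts["number"] = int(parts["number"])
    return parts


def citation_to_path(text):
    """Infer a predictable file path from a neutral citation."""
    parts = parse_citation(text)
    segments = [parts["court"].lower()]
    if parts["division"]:
        segments.append(parts["division"].lower())
    segments += [str(parts["year"]), f"{parts['number']}.html"]
    return "/" + "/".join(segments)
```

Under this assumed layout, `citation_to_path("[2000] EWCA Civ 1")` gives `/ewca/civ/2000/1.html`, so anyone who knows the citation can work out the address without consulting an index.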

The second requirement may appear more demanding, but is by no means a large task for each publisher given appropriate expert input. It is astonishing how many web pages, even from the official sites in question, sport a meaningless title (ie that which appears in the blue bar at the top of your browser): the title is the most fundamental form of reference for a web document, used and displayed by the search engines.

As to what scheme should be employed, there is a widely adopted metadata standard, known as Dublin Core or “DC” (see http://dublincore.org/). This is being adopted as the basis for a metadata scheme for websites “developed by organisations in the legal and advice sectors” under proposals published by the Community Legal Service. The CLS scheme is understandably geared to the needs of the general public: it is to be hoped it will be adaptable to the needs of lawyers. For further details see my article at http://www.infolaw.co.uk/ifl/articles/lwi0012.htm.
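As a rough illustration of what annotating a page with Dublin Core metadata can look like, the sketch below renders a set of properties as HTML `meta` tags using the common `DC.` name prefix. The function name `dublin_core_meta` and the sample record are invented for the example, mapping "citation" to the DC `identifier` element is my assumption, and nothing here represents the CLS scheme itself.

```python
from html import escape


def dublin_core_meta(properties):
    """Render Dublin Core element names and values as HTML meta tags."""
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for element, value in properties.items():
        lines.append(
            f'<meta name="DC.{element}" content="{escape(str(value), quote=True)}">'
        )
    return "\n".join(lines)


# Hypothetical record for illustration only; the values are invented.
record = {
    "title": "Example Act 2001",
    "identifier": "2001 c. 1",  # assumed home for the citation
    "date": "2001-01-01",
    "subject": "Example subject",
}
html_head = dublin_core_meta(record)
```

The resulting lines would sit in the `head` of the page alongside a meaningful `title`, where indexes and search tools can read them without parsing the body of the document.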

Making the law more accessible

The above summarises the current problems and some of the official initiatives under way that may (in time) remedy these. But what other solutions are available now?

The charitable initiative BAILII needs no introduction in this Newsletter, having been described in previous issues (see the January/February issue). As to comprehensiveness, BAILII is still in its set-up phase and lacks many publicly available materials, significantly UK statutory instruments. Legislation is from HMSO and so dates from 1988 only and is in unconsolidated form, though the full text of the Revised Northern Ireland Statutes since 1495 (sic) is available. Conversely, case report coverage is far wider: Smith Bernal have provided their CA and HC reports from 1996 to 1999 to supplement the handed-down cases provided by the Court Service. The current aim is to bring the BAILII databases into “fully-acceptable shape” by October 2001.

Quantity aside, joining up the law and making it more accessible is BAILII’s signal achievement. All materials are available in one place, cross-referenced and in one uniform searchable format.

Consolidating, annotating and commenting on legislation and case law has always been the bread and butter of the major commercial publishers. From copious printed volumes of Halsbury’s Statutes and the like, offerings have evolved into sophisticated online databases from Butterworths, Sweet & Maxwell/Westlaw, Justis, Lawtel, Smith Bernal etc. The big drawback is of course cost, with prices for services starting at several hundred pounds per user per annum.

Making sense of the UK legal web

At infolaw, we are doing things differently. We started from the premise that on the web a document need only be published once. That document can then be directly referenced by anyone. Further, if value is to be added, there is (in theory) no need to reprocess and republish the document (as does BAILII). Additional data can be extracted from and/or associated with the document and published “alongside” it, or the data and additional rules can be used to process the document “on the fly” and redisplay it with value added.

This is essentially what the major search engines do – and do extremely well. However, they are using fully automated processes to crawl and process any type of document or other web object on any subject. What a UK legal web search tool needs to do is focus on UK legal sites and others that are likely to be useful to the UK lawyer!

So we set out to catalogue what there is on the “UK legal web” and to design a search tool that is sensitive to the UK legal context. The infolaw catalogue now includes Acts, SIs, Cases, Forms and Precedents, and Other Documents such as Bills, Command Papers and Law Commission reports, as well as hundreds of key Legal Resources sites and public and national Organisations of use to the lawyer. In total it provides direct access to more than 36,000 documents and other resources.

The catalogue can be browsed by category or searched and there are also powerful “references to” features for following up references to documents found and forwarding search terms to other search engines.

We do not compile a full text index of all the documents. Of course this has disadvantages, but for the initial service it is a deliberate choice. Structuring the searching and results has the advantage that all results are relevant: there is no spurious relevance ranking and results are delivered latest first (for dated documents such as Acts, SIs and Cases). References can also be found, followed up and compared far more precisely than with a full text search engine. Having said that, we intend to provide a full text solution in due course.
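The difference between a structured search and a full text search can be sketched as follows. This is not the infolaw implementation: the `Entry` fields, the `search` function and the sample records are all assumptions made for illustration. The point is only that matching on catalogue fields returns every record that satisfies the query, ordered latest first, with no relevance score involved.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Entry:
    """One catalogue record; the field names are assumptions for this sketch."""
    category: str
    title: str
    dated: date
    url: str


def search(catalogue, category=None, title_contains=None):
    """Return entries matching every supplied field, latest first."""
    results = [
        entry for entry in catalogue
        if (category is None or entry.category == category)
        and (title_contains is None or title_contains.lower() in entry.title.lower())
    ]
    return sorted(results, key=lambda entry: entry.dated, reverse=True)


# Invented placeholder records, not real catalogue data.
CATALOGUE = [
    Entry("Acts", "Example Housing Act", date(1999, 7, 1), "http://example.org/a1"),
    Entry("Acts", "Example Housing (Amendment) Act", date(2001, 3, 1), "http://example.org/a2"),
    Entry("Cases", "Example v Housing Authority", date(2000, 5, 1), "http://example.org/c1"),
]

acts = search(CATALOGUE, category="Acts", title_contains="housing")
```

Here `acts` holds the two Acts with the 2001 entry first; the case is excluded because it fails the category test, not because it scored badly.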

The new infolaw service can be accessed at http://www.infolaw.co.uk.