RSS – who profits?

By Nick Holmes on June 4, 2008
9 comments
Filed under Feeds, Internet law

In response to my last post, Susan Cartier Liebel raises the question of the legalities of streaming others’ feeds without permission. She points to her post Shouldn’t You Have To Ask Permission If You Want To Take A Blog’s Feed For Your Profit? which has attracted considerable comment.

Of course your content is your copyright and others should not copy it without your permission. But a feed can be repurposed in many ways, and we need to look at what parts of the feed are being copied and who profits.

Copyright lawyers will have to fill me in on the latest case law on all of this, but I think in practice we have despatched the question whether links are legal (is the web legal?) with a resounding yes. As Sir Tim father-of-the-web-but-not-a-lawyer Berners-Lee has said:

There are some fundamental principles about links on which the Web is based. These principles allow the world of distributed hypertext to work. Lawyers, users and technology and content providers must all agree to respect these principles.

What of link+title? In principle there is copyright in a title, but it’s hard to see anyone any longer seeking to enforce copyright here.

But an RSS feed is an aggregation, so what of a bunch of links+titles? Here there is a stronger case for saying that this aggregation is protected by copyright, and if we’re talking about an aggregation of links+titles+descriptions or even +excerpts, that is clearly protected. So let’s talk about permission, express or implied.

I don’t believe there’s any implied permission for others to republish feeds. But in practice, why publish a feed if you don’t want it to be republished? It will be, and there’s little you can do to stop it. You can frame some stern T & Cs or apply a more friendly CC licence, but most, whether intentionally or by default, will take little notice.

Susan makes much of others taking your (blog) feed “for profit”. We are all miffed if we see others profiting from our work at our expense. But, with feed repurposing, in most cases we profit too, sufficiently that we do not see it as being at our expense.

  • Google indexes, caches and republishes parts of my website, my blog, my feeds without my permission. Google profits handsomely, but I profit too.
  • Other specialist search engines and directories – like Technorati and Blawg Search – also index and repurpose my content. If I’ve submitted my site to them, I’ve probably given them permission to do this, but in most cases my signing up only legitimates what they have been doing or would do anyway. (Susan, Technorati indexes your blog whether you’ve claimed it or not.) They profit, but I profit too.
  • Smaller fish might also republish my feeds, but in all cases short of their republishing my full text, I profit as much as or more than they do. All items link back to me. And I really am not going to lose sleep if they choose to wrap Google ads around it or seek to profit in other ways. (I do view sploggers etc as the scum of the earth, but I blame Google Adsense.)

So in practice, what we are all most concerned about is others claiming our real work – our full posts or articles – as their own; and there is a simple answer: if you want to protect your content, include only excerpts rather than full text in your feeds. Syndicate your metadata, not your data.
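To make the “syndicate your metadata, not your data” advice concrete, here is a minimal sketch of emitting an excerpt-only RSS item. The `post` dict and the `rss_item` helper are hypothetical illustrations for this post, not part of any real feed library:

```python
# Sketch: build an excerpt-only RSS <item>, assuming a hypothetical
# `post` dict with title, url and body keys. Only the first ~200
# characters of the body go into <description>; the full text stays home.
from xml.sax.saxutils import escape

def rss_item(post, excerpt_len=200):
    # Cut at the last whole word within the excerpt window, add an ellipsis.
    excerpt = post["body"][:excerpt_len].rsplit(" ", 1)[0] + "…"
    return (
        "<item>"
        f"<title>{escape(post['title'])}</title>"
        f"<link>{escape(post['url'])}</link>"
        f"<description>{escape(excerpt)}</description>"
        "</item>"
    )

post = {
    "title": "RSS – who profits?",
    "url": "http://example.com/rss-who-profits",
    "body": "Of course your content is your copyright " * 10,
}
print(rss_item(post))
```

The point of the sketch is simply that the full body never leaves the server; a scraper of this feed gets only the teaser and the link back.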

9 comments

Nick, I appreciate you joining the conversation on this. I strongly believe this is just the beginning of the discussion and legal minds will start to weigh in. I don’t include Google indexing in this discussion because, as was pointed out, it functions as a library. You are not going to find a blog unless you are searching.

The point remains that it is wiser to put a mechanism in place whereby people opt in for the benefit the aggregator purports to offer, as others have successfully done, than to just take.

Just because it is unknown territory doesn’t mean businesses who choose to make a living this way shouldn’t tread lightly…

by Susan Cartier Liebel on 4 June 2008 at 4:51 pm. #

Nick, this is a thorny one. I have been trying to work out dividing lines.

As you know, I have a page and now a unified feed of other people’s feeds (and in several cases, set up page scrape feeds where no RSS existed in the original). But I deliberately kept this to title/description only. I may be a not-for-profit site, but my sense is, unless expressly authorised, I won’t put up someone else’s content. I will point to it instead.

You do something similar with your feed pages and aggregators. As this does nothing but alert people to new content on other people’s sites/feeds, the copyright infringement is minimal to non-existent and the sole effect is to give traffic to others.

Technorati, Google and others may index my feed and site, but, unless they abuse their use of the cache, again this solely gives traffic to the original site/feed.

Where I am with Susan is non-consensual aggregation of full posts by commercial outfits. I do run a full feed – I can’t stand excerpted feeds myself so don’t put others through that – but that doesn’t mean that a for-profit site can simply replicate my full content without my permission (or at least, cannot do so without being on the same legal and ethical level as spam bloggers).

After all, I work on a free-as-in-beer basis for putting information out – hence a no-commercial-use-without-permission restriction. Some uses of the feed that are technical breaches of my copyright I may condone (and have condoned) as either harmless or in my interest. Others, like spam bloggers, I do pursue, out of annoyance and out of self-interest in Google indexing.

But why should someone make money from material that I have deliberately and purposely made available for free (at least without my permission to do so)?

by Contact on 5 June 2008 at 12:27 am. #

@Contact “Where I am with Susan is non-consensual aggregation of full posts by commercial outfits.”

@Me “what we are all most concerned about is others claiming our real work – our full posts or articles – as their own”

So we all agree on that.

My question is, if you deliver full text feeds, how can you prevent it? For the unscrupulous, you are actually encouraging it.

by Nick Holmes on 5 June 2008 at 6:03 am. #

Nick. On the side question of whether links are legal – and of course it shouldn’t really be a question – this is still not really resolved in case law (there isn’t really any in the UK, for example – the Shetland Times case, which settled, being the closest thing).

In Europe, the Newsbooster and Paperboy cases show the courts coming down on either side of the argument (in Newsbooster, the Danish court did rule that deep linking without permission was illegal).

Both these sites were news trawling sites, like Google News, which of course is having its own battle with the Belgian newspaper association Copiepresse over its extraction of and linking to Belgian newspapers without permission (the Court of First Instance ruled in favour of Copiepresse). BitTorrent sites are facing the same ‘linking’ problems too.
The issue in Europe also depends on whether your website could be classed as a database, as the EU Database Directive prohibits the ‘repeated and systematic extraction and/or re-utilisation of insubstantial parts of the contents of the database implying acts which conflict with normal use of the database or which unreasonably prejudice the legitimate interests of the maker of the database’.

This last issue obviously does tie back into the main question of the post on feed repurposing by others without your permission. There is certainly a good case for arguing that blogs are databases, so it would be interesting if someone tested the law on this issue – although in reality I think this is unlikely to happen.

by scott on 5 June 2008 at 11:05 am. #

@Scott Sure, by definition, every blog and every other CMS-driven website is a database, but per William Hill and Fixtures Marketing, only investment to seek out existing materials and collect them into a database will give rise to a database right; resources used for the creation of materials that make up the database will not be sufficient. So you have copyright in the content created, but I’d argue you don’t have a database right, as WordPress etc have made the investment, not you.

Good summary of the database right at http://www.out-law.com/page-5698

by Nick Holmes on 5 June 2008 at 3:27 pm. #

“Legal” or not, I find it very odd that people seek to restrict the on-syndication of RSS feeds. I can see some concern where people publish full content in RSS (not the purpose of the RSS standard; the key XML element is labeled “description” not “content”; it’s different with Atom, but there you could easily exclude the full content from an on-syndicated feed), but otherwise two things will happen with an RSS item: the reader will click on the link to go to the full text, or ignore it. Where is the harm in this? How is it different from a Google indexation?
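On the RSS-versus-Atom point above: Atom carries full text in a separate `content` element alongside the `summary`, so an on-syndicator could indeed drop the full text while keeping the excerpt. A rough sketch using Python’s standard library, with a made-up feed string (not a real feed):

```python
# Sketch: strip full-text <content> elements from an Atom feed before
# on-syndicating it, keeping <title> and <summary> intact.
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM)  # serialise with Atom as default namespace

def strip_content(atom_xml):
    root = ET.fromstring(atom_xml)
    for entry in root.findall(f"{{{ATOM}}}entry"):
        for content in entry.findall(f"{{{ATOM}}}content"):
            entry.remove(content)  # the summary stays; the article body goes
    return ET.tostring(root, encoding="unicode")

feed = f"""<feed xmlns="{ATOM}">
  <entry>
    <title>Example post</title>
    <summary>A short excerpt.</summary>
    <content type="html">The entire article body…</content>
  </entry>
</feed>"""

print(strip_content(feed))
```

Nothing comparable is possible with a plain RSS 2.0 feed, where the one `description` element is whatever the publisher chose to put in it – which is the commenter’s point.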

There is harm in adding additional copyright restrictions. I have developed a low-priced software application, Xenos, which enables users to easily create new RSS feeds/email newsletters by drag-and-dropping items from a number of RSS feeds. Using this, information managers won’t need to buy so much information from aggregators; further, the revenue from the RSS gets returned directly to the news producers, via page views and therefore advertising. This is a crucial change that needs to happen, as aggregators are experiencing double-digit revenue growth while newspapers are actually declining (according to Outsell, Inc.). So, if an information manager at a law firm takes one item out of a blog RSS feed and includes it in a daily newsletter to be circulated to the firm’s partners, does that constitute commercial usage and a copyright infringement? Yeah, nobody knows, but a number of them will be discouraged from trying. It just increases the sphere of ambiguity and fear.

As a final point if we are speaking about (oh, unlovely word) “monetization”, there is a heck of an easy way to “monetize” links from RSS feeds directly. Just embed the link ref in a cgi script call that brings up a timed redirect to an advertising page, before redirecting to the actual full text — you know, like the NY Times does on some article links. Only, of course, 99.9% of bloggers won’t have a clue how to do that, at least not until Blogger & co. work out they should include it as a feature.

Which is part of what is at the source of this issue. Technology not only substitutes for capital in the economy, it also affects the behaviour and effect of laws. Want to copyright headlines/titles? Pretty easy to leave out the last couple of words and insert an ellipsis, and not that much harder to provide a context-sensitive substitution of one word. Not even that hard to just scrape a blog site, and provide a summary using open source software based on the full content.
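The headline-trimming trick mentioned above really is that easy; a toy illustration (the function name and the number of words dropped are arbitrary choices, not anyone’s actual practice):

```python
# Toy sketch: drop the last couple of words from a headline and append
# an ellipsis, as described above. drop_words=2 is an arbitrary choice.
def truncate_title(title, drop_words=2):
    words = title.split()
    if len(words) <= drop_words:
        return title  # too short to trim meaningfully
    return " ".join(words[:-drop_words]) + " …"

print(truncate_title(
    "Shouldn't You Have To Ask Permission If You Want To Take A Blog's Feed"
))
```

A few lines like this are exactly why trying to enforce copyright in titles alone is a losing game: the workaround is cheaper than the enforcement.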

The thing is this: the internet is this big, steamy pond of technical evolutionary development. If you want to play in that ecosystem, you have to expect it is going to “play” back. It’s not an extension of the past practice that the law honours. It needs that kind of input, but what comes back is not going to be the predictable and the expected.

by scott lewis on 6 June 2008 at 8:55 am. #

Lawyers will come to realize there is more value in having their blog content syndicated than there is in having people view that content on the lawyer’s own blog or in an individual’s newsfeed.

And the syndicators of that content, who first need to aggregate it, need to bring in some revenue to offset hardware, software, and editorial expenses. Saying people should do this solely out of the goodness of their hearts, unless you’re Google, is shortsighted.

by Kevin OKeefe on 10 June 2008 at 4:19 pm. #

“My question is, if you deliver full text feeds, how can you prevent it?”

I don’t think you can — as others have intimated, the nature of links and RSS feeds means anyone can pick up what you’ve written and redeploy it as they wish. The fact that it’s easy doesn’t make it right, but it does make it virtually impossible to stop (cf. the record industry, or what’s left of it).

Susan and I have corresponded about this — I think the solution, eventually, will be one or more highly regarded law-focused aggregators, to which blawg feeds are contributed by consent and mutual benefit (deriving value from the universal currency of the link, and maybe even cash for the best content providers). These legitimate aggregators will be widely known and respected, such that most of the cheap, fly-by-night, content-scraping aggregators will fall by the wayside, because people won’t be paying attention to them.

How would these aggregators distinguish themselves, if everyone can access the same RSS content? In a number of potential ways, but most likely through their own original content, the astute choice of the best feeds, aggressive branding and a host of other innovative steps. Such an aggregator, of course, will look a little like major newspapers look now – or at least, as they will soon look on the web. Scott Karp (http://publishing2.com/2008/02/20/reinventing-journalism-on-the-web-links-as-news-links-as-reporting/) and Jeff Jarvis (http://www.buzzmachine.com/newspapers-in-2020/) have some intriguing and provocative thoughts on the future of newspapers that could usefully be applied here.

Bottom line, the misappropriation of RSS feeds today won’t last, for the same reason that a hundred yellow newspapers running the same story didn’t last at the dawn of that industry — eventually, the winners emerged as those who delivered the best content in the best format in ways audiences wanted to read. The likely result is that new law aggregators and the best legal content providers will come to mutual agreements that serve both them and their readers.

by Jordan on 10 June 2008 at 5:44 pm. #

If web sites don’t want their information accessible through RSS then they should just not provide the RSS/Atom functionality.

Aggregating RSS feeds, whether via an aggregator like Feedburner or software from Xenos, should be perfectly legitimate. The information is already freely available for access on the web. The links to the content are also freely available, otherwise how would we find them in the first place?

Putting linked content into a user-defined context should be applauded. Context-rich information is what gives content meaning.

These naysayers and legal hacks are still of the control-or-perish mentality of the print world. The web is different, folks – the web is biased towards the readers and the users of content. Get used to it.

But as I said before, if print-based thinking wants to pervade the web, then don’t offer RSS to your content and remove yourself from Google and all the other search engines.

by Brad on 27 October 2008 at 3:12 am. #