Peter Coleman: The wrong stick to hit generative AI theft

Jan 03, 2024 at 05:45 pm by admin

Media companies were divided on AI – especially the generative variety – as the first full year of living with ChatGPT came to an end.

The issue of ‘threat or opportunity’ depends on how you make your money and, to a large extent, on the jurisdictions in which you operate.

Most publishers have already taken advantage, in one way or another, of AI’s ability to generate content and to expand and target services, based on factual information.

There’s understandable concern about copyright, and the theft of original writing, but copyright generally protects only the arrangement of words and not the facts they communicate, leaving publishers in a bind over expensive investigative reporting.

And with the evaporation of advertising as a primary revenue source for news publishers, there’s concern over the ability of chatbots to bypass paywalls, extracting “protected” content one paragraph at a time. If the walls intended to guard “subscriber only” journalism can’t be made effective, the issue of what these bots then do with the proceeds of their theft would appear to be of critical importance.

Examples I’ve seen of “par-by-par” regurgitation of content appear pretty unimaginative, and well below the level of which AI is undoubtedly capable.

Which brings us to the question of technological solutions to a technological problem. It’s long been possible to identify a photograph used on the web without the owner’s permission. With enormously greater processing power now available, AI surely provides the means to identify “unique” strings of words and phrases, over which copyright might be claimed, as they recur at scale in the output of a bot such as ChatGPT.

Which is what casts doubt on the New York Times’ claim against the Microsoft-backed technology company and its attempt – as John Durie puts it in The Australian – to “exploit anti-digital platform sentiment in its landmark law suit”.

If, indeed, there is enough anti-digital platform sentiment to achieve the objectives they have in mind.

When Australia took on Big Tech during the drafting of its News Media Bargaining Code legislation, popular support was by no means unanimous, and it took prompting from media companies to push the government of the day over the line. Some of that reluctance has been evident recently with the response to the current treasury review of the legislation’s first year, which may include a requirement for less secrecy over publishers’ deals.

The NYT action, if it takes place, will be interesting but is unlikely to be definitive: the US is struggling to legalise collective negotiations with Big Tech, and many other countries lack the protection afforded by Australia’s bargaining code and, in any case, may have less stringent copyright laws. Again, the New York Times – which is calling for the “destruction” of all large language models trained disproportionately on its copyrighted work, as well as damages, restitution and costs – is a publisher with clout, but it is unlikely to succeed in putting the genie back in the bottle. Is the legislation it is invoking sufficient to protect its “massive investment” in original journalism?

Elsewhere of course, other publishers have taken a different approach.

Notably, German publisher Axel Springer, which owns Politico and Business Insider as well as domestic market leaders Bild and Die Welt, has announced a deal with OpenAI that licenses the ChatGPT developer both to train its LLMs on, and to create summaries of, Springer’s content, including that behind its paywalls.

A key area of interest is search, in which Microsoft never managed to gain traction with Bing, and from which Apple would like revenue of its own instead of accepting a fee from Google.

The likelihood is that – as with Australia’s bargaining code – priorities for each country will vary, according to perceptions and pressure.

Durie fears Australia will “simply piggyback on international development”, despite its early leadership. Presenting what may or may not also be his employer’s view, he says the government response… “is not the Christmas gift we hoped for”.

For his part, Dominic Ponsford, editor of UK trade publication Press Gazette, says “working with tech platforms and taking short-term money (as publishers previously did with Facebook) is not always the best long-term strategy”.

It would be good to think that a technological solution might emerge to protect journalism, but as chatbots get smarter and more creative, it’s unlikely that one can realistically be built on copyright law.


