John Juliano: These interesting times

Aug 29, 2012 at 12:15 am by Staff


‘‘In five years, a computer programme will win a Pulitzer Prize – and I’ll be damned if it’s not our technology,’’ Kris Hammond of Narrative Science told the ‘New York Times’, a comment referred to in GXpress Magazine in June.

Self-serving, yes, but can it be true, given that 2017 isn’t far away?

In the US, newspapers continue to look for technology to replace staff. It is what western culture has been doing since the beginning of the Industrial Revolution. Computers were to be the end of manual record keeping, and they were. Traction control and anti-lock brakes have taken over how we drive in rain and snow. We outsource our brains to our GPS systems.

If it is a repetitive task that can be explained, a machine can be developed to do the task. It is less abstract when it’s your job that is going to be automated.

When I was straight out of undergraduate school I implemented a membership system at the American Society of Civil Engineers. It took billing and much of the record keeping away from the squadron of women who kept membership records and sent magazine subscription bills. I told myself that rather than reducing jobs, I was allowing the organisation to do more with the staff they had. I never looked closely to see if I was right.

It’s easy to understand decision support platforms like Visual Revenue’s: heuristics-based decision support in which the heuristics can be explained.

But systems such as Narrative Science’s Quill product, which can write stories based on numerical facts – including financial and sports stats – are less easily understood, as they process large amounts of related information while looking for relationships.

The real test is whether the reader can tell if an article wasn’t written by a person. It’s early in the game and hard to judge. A review of an early phonograph said the sound was indistinguishable from a live orchestra, but today we find this unbelievable because of our experience with the technology. Will we soon be able to easily discern the differences between human-written and computer-generated content? Will our customers care?

I’ve moved away from Arthur C. Clarke’s statement that “any sufficiently advanced technology is indistinguishable from magic” to my own: If I can’t understand it, then either it’s not being explained properly or it is smoke and mirrors. The hype around text generation companies as they attack our industry isn’t quite either.

Narrative Science (NS) is a Big Data company. Have you been following Big Data? Big Data is the processing of the massive amounts of data our culture collects.

At lunch last fall the head of innovation for one of the world’s largest newspaper chains advised me that Big Data was the future and that is where I should look to make money. How, I asked. He shrugged.

NS says it places its efforts on financial data – where the money is – and has Forbes as a customer. Hammond is also quoted as saying that within 15 years 90 per cent of news will be written by computer. An interesting statistic, but one that can be self-fulfilling. My own company works to provide the increasing amount of content necessary for a modern news organisation’s electronic presence. A media company needs lots of content and it needs to turn over that content quickly to retain users.

Software can generate an unending stream of content by continually analysing numerical information and churning new slants, new anomalies, new interesting things to be pointed out. Computers can easily churn out seven times the amount of text generated by the working journalists of today, but will it be worth reading? Is a picture still worth a thousand words?

The going rate for a computer-generated story is about $10, less than even Patch.com pays for a story, but then, unlike the Patch reporter, the software relies on someone else to gather the numerical information that it crunches and puts into textual form for consumption. The strength is in numerical analysis, but is this journalism?

When I was learning about journalism, I was told that sports writers were the best writers in the industry. Each day they took a game, which was generally not much different than thousands of other games, and made it interesting. A sports writer develops a voice. When you name reporters, how many sports reporters can you name versus (please excuse me) hard news writers?

About the Pulitzer Prize: Generated from strictly numerical data? It’s been done before. The 1985 Pulitzer Prize won by the ‘Denver Post’ was based on number crunching. It exposed that only 200-300 stranger-to-victim kidnappings a year occur in the US, rather than the tens of thousands we’re led to believe from milk carton sides.

So what makes a good story? Human interest, quotes, investigative reporting that does not show up in gathered statistics.

Companies that generate stories from numbers – and there is more than one – do hard data analysis and then rather than display that content in a graph, generate text. Cool, but not reporting. It is Big Data Analysis.

Six months before the financial crash I met with a friend at a large bank who did financial modelling. I asked, ‘‘Does your model work? Does it accurately predict?’’ I was answered with the condescending scorn born of hubris: ‘‘You must not understand financial modelling to ask such a question.’’ I guess I didn’t understand, but then they didn’t seem to, either.

The danger and the opportunity are in the holes that such a product will inevitably have. Most of us prefer a narrative to charts. We want a story, and we want analysis: Tell me what the numbers mean. Should computer-generated stories become more and more dominant, we will rely less and less on our own analysis and become more and more like my banking friend who was so very certain of her computer model.

Text generation systems can take statistics from any game, gathered along the way by team professionals, attendees in the stands or proud parents and instantly generate a game synopsis.

Depending on who the intended reader is, the synopsis can play up successes and ignore failures, producing an entirely different synopsis for each team. It can always be a sunny day in the neighborhood or the sky can be falling. A team can be routed by a superior team or robbed of its victory: a computer-programmed editorial view. Or even a worldview that matches your view based upon your behavioural profile picked out of Big Data. A ‘‘readership of one’’ is a phrase coming into usage.
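To make the point concrete, here is a minimal sketch of how one box score might be spun two ways. It is not how Narrative Science or any other vendor actually works; the `GameStats` record, the `synopsis` function, the `ROUT_MARGIN` threshold, the phrasing templates and the team names are all invented for illustration.

```python
from dataclasses import dataclass

# Hypothetical threshold: a win or loss by this many points counts as a rout.
ROUT_MARGIN = 3


@dataclass
class GameStats:
    """A bare-bones box score (illustrative only)."""
    home: str
    away: str
    home_score: int
    away_score: int


def synopsis(stats: GameStats, reader_team: str) -> str:
    """Return a one-line recap slanted toward the reader's team."""
    if reader_team == stats.home:
        ours, theirs, opponent = stats.home_score, stats.away_score, stats.away
    elif reader_team == stats.away:
        ours, theirs, opponent = stats.away_score, stats.home_score, stats.home
    else:
        raise ValueError(f"{reader_team} did not play in this game")

    margin = ours - theirs
    score = f"{max(ours, theirs)}-{min(ours, theirs)}"

    # Same numbers, different story: the template depends only on
    # the margin as seen from the reader's side.
    if margin >= ROUT_MARGIN:
        return f"{reader_team} routed {opponent} {score}."
    if margin > 0:
        return f"{reader_team} held off {opponent} {score}."
    if margin == 0:
        return f"{reader_team} and {opponent} battle to a {score} draw."
    if margin > -ROUT_MARGIN:
        return f"{reader_team} robbed of victory in {score} heartbreaker against {opponent}."
    return f"Gritty {reader_team} effort falls short against superior {opponent}, {score}."


game = GameStats(home="Hawks", away="Owls", home_score=3, away_score=2)
print(synopsis(game, "Hawks"))  # Hawks held off Owls 3-2.
print(synopsis(game, "Owls"))   # Owls robbed of victory in 3-2 heartbreaker against Hawks.
```

Nothing in the numbers changes between the two lines of output. Only the template chosen for the reader does, which is the whole of the ‘‘editorial view’’ in this toy version.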

We trust our GPS in faraway places that we don’t know and evaluate the recommendations carefully in our local environs. When it comes to Big Data we’re convinced that we cannot assess the data on our own, and as products that do analysis and generate text move us farther and farther away from the raw numbers, we have less and less detailed insight.

Return to Visual Revenue’s decision support product. The strength of such a system is its objective analysis: No spin, no emotions, just the application of the heuristics.

Big Data and text generation systems move us away from this model. They tell us a narrative, a story, and will even apply whatever voice or spin the customer wants. This is the strength and weakness of a story: It is a point of view, versus cold graphs and stark numbers that are slanted only by excluding or including data.

It’s an interesting time.

Is a good reporter like Jean Harlow’s gold digger character at the end of the 1933 American film ‘Dinner at Eight’, who asks Marie Dressler, ‘‘Do you know that the guy says that machinery is going to take the place of every profession?’’

Dressler arches an eyebrow and replies, ‘‘Oh, my dear, that’s something you need never worry about.’’ The film then fades to black.

Newspaper systems industry veteran John Juliano writes regularly for GXpress Magazine. Contact him at john@jjcs.com
Sections: Newsmedia industry
