Jane Smith: Robot journalism

Imagine a news story written and published within three minutes of the event happening. That’s a real scenario described by Emily Bell in her W T Stead Lecture at the British Library last week. I was intrigued by her title, “Robot reporters,” and went to hear more about “Journalism in the Age of Automation and Big Data.”

Bell, who formerly ran Guardian Unlimited and is now director of the Tow Center for Digital Journalism at Columbia University in New York, argued that journalists need to understand the technologies that help create and distribute their stories. These days that means working alongside software programmers and engineers and understanding the algorithms behind services, such as Google News, that mine big databases and surface news stories. In particular, journalists need to know the biases of those algorithms, because they will have some. Such biases are much harder to uncover than those of human informants and writers, particularly when the code is commercially confidential, as it is with the Googles and Facebooks of the world.

Her example of the three-minute story was a report of an earthquake in Los Angeles that appeared in the Los Angeles Times. It was “written” by Quakebot, a program developed by a Los Angeles Times journalist. Quakebot scans automated earthquake alerts from the US Geological Survey and, if a quake is above a certain size, pulls the data into a pre-written template and puts the result in the Los Angeles Times database, where it is available for journalists to review and publish. As Bell said, the story was very precisely written:

“A shallow magnitude 4.7 earthquake was reported Monday morning five miles from Westwood, California, according to the U.S. Geological Survey. The temblor occurred at 6:25 a.m. Pacific time at a depth of 5.0 miles.”

“According to the USGS, the epicenter was six miles from Beverly Hills, California, seven miles from Universal City, California, seven miles from Santa Monica, California and 348 miles from Sacramento, California….This information comes from the USGS Earthquake Notification Service and this post was created by an algorithm written by the author.”
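The workflow Bell described (check the size of the quake, fill a pre-written template, leave the draft for a human to review) can be sketched in a few lines of Python. This is only an illustration, not Quakebot's actual code: the threshold value, the field names, and the function `draft_story` are all assumptions made for the example, and the template wording is simply modelled on the story quoted above.

```python
from typing import Optional

# Hypothetical threshold: the lecture only said "above a certain size".
MIN_MAGNITUDE = 3.0

# Pre-written template, modelled on the opening of the quoted story.
TEMPLATE = (
    "A {descriptor} magnitude {magnitude:.1f} earthquake was reported "
    "{day} {part_of_day} {distance} miles from {place}, according to the "
    "U.S. Geological Survey. The temblor occurred at {time} Pacific time "
    "at a depth of {depth:.1f} miles."
)


def draft_story(alert: dict) -> Optional[str]:
    """Return a draft story for review, or None if the quake is too small."""
    if alert["magnitude"] < MIN_MAGNITUDE:
        return None
    return TEMPLATE.format(**alert)


# Example alert, using the values from the quoted story.
# The field names are invented for this sketch.
alert = {
    "descriptor": "shallow",
    "magnitude": 4.7,
    "day": "Monday",
    "part_of_day": "morning",
    "distance": "five",
    "place": "Westwood, California",
    "time": "6:25 a.m.",
    "depth": 5.0,
}

draft = draft_story(alert)
print(draft)
```

Note that the sketch only produces a draft. As in the account Bell gave, the decision to publish stays with a journalist.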

What happened was that the journalist who wrote Quakebot and whose byline the story appeared under (and who is described as “journalist and programmer”) was woken by an earthquake, went to his computer, saw the story already written and waiting, and pressed the “publish” button. It all took about three minutes—far faster than any human journalist could have spotted the information, assimilated it, and written it up.

I went to the lecture wondering whether robot reporters would mean that the BMJ’s news team might one day be obsolete, but I soon dismissed that idea because, as Bell said, computers may be very accurate about facts, but humans are much better at opinions and explanations.

Instead I came away thinking that a more immediately realistic scenario relevant to the BMJ might be an RCTbot that automatically wrote up a clinical trial using the trial database as the source of information. Bell’s fear is that bots have hidden biases, but a decently written RCTbot might do a less biased job than human authors of RCTs often do. At the very least the bots could ensure that all the information that readers want about a trial, from absolute risks to side effect rates, is actually provided.
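To make the idea concrete, here is a minimal Python sketch of what the "ensure everything is reported" part of an RCTbot might look like. It is purely hypothetical: no such tool is described in the lecture, the field names and the function `summarise_trial` are invented, and the trial numbers are made up for illustration. The only real content is the arithmetic, where absolute risk is the number of events divided by the number of participants in each arm.

```python
# Fields the hypothetical bot insists on before it will write anything.
REQUIRED_FIELDS = (
    "n_treatment", "events_treatment",
    "n_control", "events_control",
    "adverse_treatment", "adverse_control",
)


def summarise_trial(trial: dict) -> str:
    """Write a one-paragraph results summary, refusing incomplete data."""
    missing = [field for field in REQUIRED_FIELDS if field not in trial]
    if missing:
        raise ValueError("Missing required fields: " + ", ".join(missing))

    # Absolute risk in each arm: events divided by participants.
    risk_t = trial["events_treatment"] / trial["n_treatment"]
    risk_c = trial["events_control"] / trial["n_control"]
    # Absolute risk reduction, expressed in percentage points.
    arr_points = (risk_c - risk_t) * 100

    # Side effect rates in each arm.
    adverse_t = trial["adverse_treatment"] / trial["n_treatment"]
    adverse_c = trial["adverse_control"] / trial["n_control"]

    return (
        f"The primary outcome occurred in {risk_t:.1%} of the treatment "
        f"group and {risk_c:.1%} of the control group, an absolute risk "
        f"reduction of {arr_points:.1f} percentage points. Adverse events "
        f"were reported in {adverse_t:.1%} of the treatment group and "
        f"{adverse_c:.1%} of the control group."
    )


# Invented numbers, for illustration only.
example_trial = {
    "n_treatment": 200, "events_treatment": 20,
    "n_control": 200, "events_control": 30,
    "adverse_treatment": 12, "adverse_control": 8,
}

summary = summarise_trial(example_trial)
print(summary)
```

The design choice worth noting is that the bot fails loudly when a field is missing rather than quietly omitting it, which is the property that would guarantee readers always see absolute risks and side effect rates.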

Jane Smith was a BMJ deputy editor until September 2012.
