21 Jan, 11 | by BMJ
Following on from last week’s discussion of information-seeking behaviour, today we’ll be exploring one way of transforming individual articles into portals to greater information: Utopia Documents.
What is Utopia Documents?
At its most basic level, Utopia Documents is a PDF reading tool that allows articles to be augmented with interactive content, and helps the reader explore data associated with a particular paper. It’s a desktop application for reading and exploring papers, and functions in many respects like a normal PDF reader. Its real potential becomes clear when configured with appropriate domain-specific ontologies and plugins. Once these are in place, the software transforms PDF versions of articles from static facsimiles of their printed counterparts into dynamic gateways to additional knowledge, linking both explicit and implicit information embedded in the articles to online resources, as well as providing access to auxiliary data and interactive visualisation and analysis tools. For a thorough demonstration of the software, take a look at the video below:
Why concentrate on PDFs and not HTML?
Given the huge investment in XML/HTML versions of articles, are we taking a step backwards by semantically tagging our PDFs? Professor Teresa K. Attwood, who led the bio-informatics component of the EPSRC/DTI-funded UTOPIA(d) project, argues that:
“Utopia Documents was developed in response to the realization that, in spite of the benefits of ‘enhanced HTML’ articles online, most papers are still read, and stored by researchers in personal archives, as PDF files. Several factors likely contribute to this reluctance to move entirely to reading articles online: PDFs can be ‘owned’ and stored locally, without concerns about web sites disappearing, papers being withdrawn or modified, or journal subscriptions expiring; as self-contained objects, PDFs are easy to read offline and share with peers (even if the legality of the latter may sometimes be dubious); and, centuries of typographic craft have led to convergence on journal formats that (on paper and in PDF) are familiar, broadly similar, aesthetically pleasing and easy to read.”
The authors have also dismissed reservations about the semantically limited nature of PDFs as a non-issue:
“We argue that PDFs are merely a mechanism for rendering words and figures, and are thus no more or less ‘semantic’ than the HTML used to generate web pages. Utopia Documents is hence an attempt to provide a semantic bridge that connects the benefits of both the static and the dynamic online incarnations of published texts.”
What are the main features of Utopia Documents?
“Utopia Documents links scientific research papers to the data and to the community. It enables publishers to enhance their publications with additional material, interactive graphs and models. It allows the reader to access a wealth of data resources directly from the paper they are viewing, make private notes and start public conversations. It does all this on normal PDFs, and never alters the original file. We are targeting the PDF, since they still have around 80% readership over online viewing.”
Explore article content
An integrated semantic search bar enables users to explore the biological content of an article from within a PDF reader. This offers readers the opportunity to investigate aspects of a scientific article further or clarify given terms.
Discover published metadata
If a publisher has invested in the appropriate domain-specific ontologies and plugins, Utopia Documents can provide access to additional context, from database entries to glossary definitions. All new articles in the Semantic Biochemical Journal, for example, include publisher-curated annotations of the most salient facts.
Comment on articles
The software allows readers to annotate their PDFs, either privately for personal reference or publicly as part of an online discussion.
Interact with live data
Utopia Documents allows users to interact directly with curated database entries. Within the familiar setting of a PDF reader, they can play with molecular structures, edit sequence and alignment data, and even plot curated tabular data.
Many scholars of research behaviour argue that for electronic journals to survive and thrive, they must be different from their print antecedents. Although it is certainly true that online journals must offer added functionality, it would be more appropriate to refer to the printed versions as competitors rather than predecessors. Designers and publishers must therefore fully exploit the electronic medium’s basic properties, with ‘interactivity’ as the primary characteristic of new technologies. Utopia Documents allows the user to search through an integrated search bar, play with molecular structures and annotate documents for online collaboration. While reading electronic journals is not the same as reading a print copy, it’s time to fully exploit the opportunity of these electronic documents by offering users advanced features and novel forms of functionality beyond what is possible in print.
Utopia Documents is free and can be downloaded here: http://getutopia.com/documents/