Electronic medical records are increasingly being augmented by a tsunami of digitally generated health data collected by individuals via social media, apps, and wearable monitors. Non-clinical factors such as diet, fitness and sleep, as well as patient preferences, are important to health maintenance and there is hope that the increasing data reservoir will yield significant medical advances. Recently there has been a huge effort to collect information from people on the spread and clinical course of covid-19 infection, and many data gathering apps and websites have been rapidly developed. [1-5] Now, more than ever, the question of data ownership has major relevance and impact. Often the query, “Who owns patient data?” is met with the confident and passionate answer, “The patient owns their data, and they should control its use”. But it is not that simple.
Let’s first consider why control of digital data differs from control of non-computerised information. Decades ago much of our daily lives, like food shopping, went unrecorded, and sensitive records, such as medical diagnoses, were held by a trusted healthcare practitioner in one place under lock and key. Digital data have a more permanent form than unrecorded words, and more accessibility than paper documents. Permanence coupled with accessibility means that data mountains are now available for mining and repackaging by numerous organisations and individuals to generate scientific insights as well as revenue.
So, who “owns” the valuable medical data? Some would argue that the concept of “ownership” exists in order to clarify rights and responsibilities. In relation to data this could mean the ability to utilise information (for example, to reproduce or sell it), the ability to control the flow of that data for use or restrict it to preserve privacy, and the responsibility to avoid harmful information release.
The boundaries of digital data rights and responsibilities are wide and fuzzy. Although individuals may be classified as data subjects, the information about them may be simultaneously “owned” by different individuals, organisations, and governments, as well as by society. As a comparatively simple example, the famous painting “Girl with a Pearl Earring” had a very clear data subject, but by recording the information in oils with his own interpretation, Johannes Vermeer owned and held intellectual property rights on the work of art he created. A more modern example is how our very personal purchasing habits are used by Amazon to create a suggested wish list for others: “People who bought this also bought…”
People may share health information on social media and with their healthcare team, who routinely record such data as well as test results and treatments to optimise care. These electronic medical records can be de-identified and included in research databases in line with local laws. Data companies facilitate access to medical databases for research studies published to advance public health. The media can then broadcast those scientific results. At every stage in this scenario, individuals, organisations, and society invest time and resources and have degrees of “ownership” in the resulting data.
The distributed and complex nature of data rights and responsibilities can lead to conflicts. People understandably want to keep their medical affairs private and may fear that their data could lead to discrimination. Additionally, health data may not solely be about a single individual; for example, genomic information also applies to family members. People want privacy and data protection and think this is achievable through total control of their data. However, true control by individuals is almost impossible, first because consent systems are intentionally and unintentionally driven by system design and may be influenced by the data collectors’ agenda, and second because being fully informed of complex and continuous data use is a monumental and probably impossible undertaking. Indeed, it is unfair and may be harmful for organisations and society to abdicate all responsibility for ethical data sharing by imposing that burden on individuals who are the main data subjects.
People may feel that their freely given medical data should not be used for commercial gain, but that is an oversimplification. Value is derived not only when individual records are combined en masse, but also through extensive application of costly technology and expertise to assemble and curate the data, and it is rarely economically viable to pay millions of people for their data. Additionally, the medical insights derived from such data have enormous public benefit. However, does some data collection, for example Track and Trace in the South Korean covid-19 app, achieve the right balance between public health gains and privacy intrusion?
Is the answer then to avoid the concept of individual data ownership altogether? All those involved in the health data cycle need to understand their rights and responsibilities and build models of trust to provide an environment that facilitates the responsible use of health data with robust identity protection, and collective and participatory ethical data sharing mechanisms. Rather than focussing on individual experiences, this requires a society-centred design approach. A responsible flow of data will benefit humanity through medical discovery, and potentially improve population health.
Who owns the data? We all do.
Alison Bourke is Scientific Director, Centre of Advanced Evidence Generation, IQVIA, and Past President of the International Society for Pharmacoepidemiology.
Georgina Bourke is a Designer and Strategist at Projects by IF, a technology studio that focuses on the ethical and practical uses of data and AI.
Competing interests: AB is a part time paid employee and shareholder of IQVIA. GB is a full time paid employee of Projects by IF.
This article reflects the personal views of the authors and should not be construed to represent IQVIA’s or Projects by IF’s views or policies.