The healthcare system generates approximately a zettabyte (a trillion gigabytes) of data each year. This includes classic data from sources such as EHRs, diagnostics and genetics, as well as data from newer sources such as gut biome sensors, wearable devices, environmental monitors and social media. Consequently, it’s now possible to quantify a person across three dimensions of human existence: biological, environmental and digital/social.


Big tech is leading the way in quantifying our existence, creating tools and technology to track and measure consumers’ weight, heart rate and other traditional health signifiers, as well as their social determinants of health—information on where and how people live, such as what (or if) they eat, their access to travel, how much they exercise and how they socialize. Social determinants account for roughly 60% of human health and well-being, while healthcare accounts for only 10%. In other words, having the data on where and how people live provides the power to influence people’s health status, so as tech companies race to collect and link this information, they’re reshaping how we view and deliver healthcare. Pharma companies—and healthcare stakeholders more broadly—should take note, paying attention to what’s coming, how healthcare delivery could change, and what it means for their vertical.



As almost anyone following business news these days knows, all of the biggest tech players have announced efforts or products to claim a piece of the exploding healthcare data pie, and they’re partnering with traditional stakeholders across the industry to plan their entry. For example, Apple has the FDA-cleared electrocardiogram feature on its Apple Watch; ResearchKit, a software framework for clinical trial apps; Health Records, an app that aggregates existing patient-entered data from its Health app with a user’s electronic medical record data; and a collaboration with Aetna on Attain, an iPhone/Apple Watch app that tracks and rewards users for healthy behaviors. Apple is using its iPhone and other technologies (and even a new Apple credit card) to systematically build a platform to quantify many aspects of human existence, and has implemented methods to ensure that the information can be readily shared and accessed if permission is granted.


Other companies such as Amazon and Google have created integrated suites of offerings (with Amazon acquiring Whole Foods and Google offering hardware like Nest, Google Home and Pixel). These tech companies will have enough data on consumers to gain a holistic view into their lives and to offer targeted solutions to their healthcare needs.


The healthcare industry has already found itself behind the eight ball, and it’s now grappling with the larger question about how data structure, ownership and access will work in the fully integrated healthcare data landscape of the future. Moreover, there are still considerable challenges to using the emerging healthcare data effectively. Some data is messy or is missing altogether, and AI and machine learning systems often lack the necessary training data sets. And, of course, interoperability issues abound. The fragmented nature of healthcare data has led to the formation of data aggregators that sell data to other stakeholders in data marketplaces, and these data marketplaces either aggregate data across different data types or focus almost exclusively on one data type.


For example, the HealthVerity Marketplace contains HIPAA-compliant, de-identified data on more than 300 million U.S. consumers, pulling together medical and prescription claims, lab results, EMRs and other data types from more than 30 data suppliers across the country. Meanwhile, Nebula Genomics offers a blockchain-based network that houses users’ genetic information.


Some of these data marketplaces are starting to partner with each other to form more comprehensive marketplaces, which is a step in the right direction, and other solutions are emerging. For example, Fast Healthcare Interoperability Resources (FHIR), a new web standard that enables healthcare information to be shared electronically, and SMART on FHIR, a set of open specifications that can be used to integrate health-related apps with EHRs and other healthcare IT systems, are breaking down interoperability barriers.
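To make the idea of a shared web standard concrete, here is a minimal sketch of what working with FHIR-style data can look like. It assumes a hypothetical server address (`fhir.example.org`) and made-up patient data; the helper names `search_url` and `display_name` are illustrative and not part of any FHIR library. It shows only the general pattern of resources exchanged as JSON and looked up through web-style URLs, not how any particular vendor implements it.

```python
import json
from urllib.parse import urlencode

# Hypothetical server address, used only for illustration.
FHIR_BASE = "https://fhir.example.org/r4"


def search_url(resource_type, **params):
    """Build a FHIR-style search URL of the form [base]/[type]?param=value."""
    query = urlencode(sorted(params.items()))
    url = f"{FHIR_BASE}/{resource_type}"
    return f"{url}?{query}" if query else url


# A made-up Patient resource in the JSON shape FHIR uses.
PATIENT_JSON = """
{
  "resourceType": "Patient",
  "id": "example-123",
  "name": [{"family": "Rivera", "given": ["Alex", "J."]}],
  "birthDate": "1980-04-12"
}
"""


def display_name(patient):
    """Return 'given names + family name' from the first name entry."""
    if patient.get("resourceType") != "Patient":
        raise ValueError("not a Patient resource")
    name = patient["name"][0]
    return " ".join(name.get("given", []) + [name["family"]])


patient = json.loads(PATIENT_JSON)

# Ask for this patient's heart-rate observations (LOINC code 8867-4).
url = search_url("Observation", patient=patient["id"], code="8867-4")

print(display_name(patient))
print(url)
```

Because every participating system describes a patient or an observation in the same structure, an app written against one EHR can, in principle, read data from another without custom translation, which is the interoperability gain described above.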


Big players like Intermountain Healthcare and Partners Healthcare are using SMART on FHIR to build and utilize apps that work seamlessly with their EHR systems—providing better access to data for them and their patients while expanding data collection. Regulators are on board: New rules from CMS and ONC require healthcare providers and insurers to implement open data-sharing technology that will ensure data movement across plans and expand patients’ access to data.


Furthermore, there have been advancements in structuring some of the unstructured data that better describe the “whole person.” For example, the American Medical Association has partnered with UnitedHealthcare to create 22 new ICD-10 codes for social determinants of health (such as food insecurity, access to transportation, and social connectedness) so that researchers can structure information about wellness.



To capitalize on healthcare’s data-driven evolution—and to keep pace with the change—pharma needs to keep an eye on four major trends.


1. New patient segmentation: Payers have a strong incentive to harness all of the data that they can get to ensure that their members are as healthy as possible. As a result, they lead the pack in applying machine learning, big data analytics and even natural language processing (from phone conversations) to segment people by risk.


For example, Anthem has created an integrated data warehouse that holds its claims data along with EHRs, lab results and other necessary data sets—allowing analysts to investigate members’ specific characteristics and determine their risks for emergency medical treatment or unstable health conditions, and creating the ability to segment and target members with offers of health coaching or additional services.


What will this new approach to patient segmentation mean for pharma’s clinical trial design and recruitment? How will it change the way that payers evaluate patient populations for access to drugs? Pharma companies are going to have to modify their clinical trial designs to match new evidence standards that link social determinants of health to outcomes. They’ll also need to consider new dimensions of patient segmentation along the lines of access to travel, food security, social engagement and the like. This will be necessary as payers consider this information in their population health analyses.


2. Care model change: Fueled by data, the healthcare delivery model is beginning to change, shifting the focus from treating sickness to maintaining and encouraging wellness. While there will always be specialty care, the majority of healthcare is moving towards localized health hubs offering social program-like services related to education, prevention and treatment in a retail setting.


For example, Cityblock, a company under the Alphabet umbrella, is partnering with health plans to reach people in neighborhoods with high poverty rates and other social challenges. They collect structured and unstructured data, and synthesize it into dashboards to enable community health practitioners to create personalized overall life wellness plans that address personal habits and social behaviors—ideally heading off health conditions before they start.


For pharma companies, the increasingly local and down-skilled care model indicates that it’s time to assess the new influence points for care decisions and to optimize their commercial models accordingly. This tech-enabled care delivery will rely on pathways and rules-based decisions. Pharma will need to identify where and how to influence these predetermined care decisions in a more business-to-business manner. For example, as payers move patients to post-acute sites of care, providers at those sites are gaining the power to switch treatments to generics or drive biosimilar adoption.


3. Evolving engagement dynamics: In the race for access to healthcare data, many former enemies are now “frenemies.” Everyone is a potential partner. For example, Pfizer buys cancer data from Flatiron, which is owned by Roche. Roche and Pfizer are competitors. Everyone is cool with this. Meanwhile, the YODA Project at Yale has direct competitors contributing their data to an open-access platform.


Data-oriented partnerships and alliances are becoming the norm, so pharma companies should consider adding more data-focused roles to their roster. In addition to the chief data officers and chief digital officers being recruited now, they’ll need data alliance/venture teams and data stewards, with savvy data monitoring and licensing experts leading the way on finding and securing valuable data relationships.


4. Using data as an R&D value generator: Increasingly, pharma companies are using new forms of data in new ways, in all parts of their business and throughout their product development cycle. For example, Daphne Koller, a computer science professor at Stanford University, founded Insitro, which is focused on reversing the death spiral of R&D productivity by leveraging machine learning, CRISPR and other techniques to make drug discovery more efficient. Leveraging technologies such as organs on a chip and stem cells, mutating these cells, and phenotyping, Insitro has created a bio-data factory to explicitly enable machine learning. The results have been promising—leading to a partnership with Gilead.


Data is now a strategic asset for pharma companies. As data sources paint a more complete picture of biological function, pharma companies can examine their small molecule libraries for new applications. In addition, they can design new biomarkers (including digital ones) to create product differentiation and, ideally, to enable earlier read-outs on clinical trials, which will bring products to market sooner and cut development costs. Richer and more complete data, handled in the right way, will enable better and more effective decisions from R&D all the way to commercial.


The race is on to acquire and aggregate data, clean data, structure unstructured data, find new forms of data and generate missing data. In the next decade, navigating the data marketplace will be as critical as navigating the reimbursement landscape was in the past decade. Commercial viability will depend on it.