The stakes are high. According to an Accenture report cited during the Strategic Portfolio Review, 83 percent of life sciences executives acknowledge they will miss their growth targets without significant digital and data-driven transformation. A speaker representing Parexel drove home the necessity of this transformation by illustrating the logarithmic decline in FDA-approved drugs relative to R&D spending. Drug discovery remains a complex, inefficient process, full of opportunities for improvement. The challenge now is to implement the foundational data, knowledge, and governance changes needed to revitalize the pipeline.
Breaking the Data Silos: The Operational Core of R&D
A central theme of the conference was the fundamental challenge of siloed, inconsistent data across the R&D landscape. This fragmentation continues to hinder efforts to demonstrate meaningful, scalable use cases for AI in drug discovery. An R&D IT leader from AstraZeneca illustrated this problem, and a path forward, through a case study on operationalizing data transformation with the development of their team’s CMC (Chemistry, Manufacturing, and Controls) Data Hub. Confronted with CMC regulatory documents that can exceed 800 pages, the team set out to integrate information from multiple legacy systems, including six LIMS, three ELNs, and a wide array of additional data sources, with the goal of automating CMC document generation.
By mirroring data from these fragmented systems into a unified data hub accessible via a Power BI interface, the team eliminated manual lookups and transcription checks. A key operational challenge was the inconsistent terminology used across products, teams, and external partners. Standardizing incoming data from external partners was critical because a large proportion of the oncology pipeline is supported by contract development and manufacturing organizations (CDMOs). The resulting agentic, AI-driven solution reduced data-transfer time from external manufacturing partners from four weeks to less than 24 hours. The team then optimized the data model to produce the required file formats, demonstrating a high-impact, practical application of automation.
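The terminology problem described above can be illustrated with a minimal sketch. The vocabulary, synonyms, and the `test_name` field below are hypothetical and are not taken from the AstraZeneca data hub; the point is only that partner-supplied labels are mapped to one canonical term, and anything unrecognized is set aside for human review rather than guessed.

```python
# Hypothetical controlled vocabulary: canonical term -> known partner variants.
CANONICAL_TERMS = {
    "assay": {"assay", "potency assay", "assay (hplc)"},
    "water_content": {"water content", "kf", "karl fischer"},
    "dissolution": {"dissolution", "dissolution rate"},
}

# Invert the vocabulary into a lookup from variant -> canonical term.
SYNONYM_LOOKUP = {
    variant: canonical
    for canonical, variants in CANONICAL_TERMS.items()
    for variant in variants
}


def normalize(label: str) -> str:
    """Lowercase a label and collapse runs of whitespace."""
    return " ".join(label.strip().lower().split())


def harmonize(records):
    """Split records into (mapped, unmapped) by their 'test_name' label.

    Mapped records get the canonical term; unmapped records are returned
    unchanged so a subject matter expert can review them.
    """
    mapped, unmapped = [], []
    for record in records:
        canonical = SYNONYM_LOOKUP.get(normalize(record["test_name"]))
        if canonical is None:
            unmapped.append(record)
        else:
            mapped.append({**record, "test_name": canonical})
    return mapped, unmapped
```

Calling `harmonize` on a partner file would return the records that can flow straight into the hub alongside a review queue of unknown labels, which is one simple way to keep a human check in the process.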
The Standardization Imperative
The challenge of disparate data goes beyond internal systems. One speaker noted that the SDTM data formatting standards established by CDISC are ill-suited to large-scale experimental outputs, such as the massive datasets generated by high-throughput sequencing in omics research. With omics data often reaching multiple terabytes, the simple tabular formats typically required for regulatory submissions are not feasible.
For machine learning and agentic AI to deliver meaningful value in drug discovery, both harmonization of data sources and reliable standardization are essential. Without these foundations, AI outputs cannot be trusted, slowing adoption across the industry. Many breakout and stream sessions at the conference focused on addressing obstacles on the path to achieving FAIR data, including member-identified challenges in semantic data, preclinical workflows, and clinical trial submissions.
The use cases shared by presenters and participants align with industry findings. A recent Benchling AI report cited during one presentation identified data integrity as the single greatest concern in deploying advanced digital tools in pharmaceutical R&D.
Building the Enterprise Knowledge Spine: Ontology and Meaning
The consensus across the conference was clear: data is not enough – it must be meaningful. Semantics and ontology work, previously considered niche and largely behind the scenes, have become essential for any organization seeking to apply automation and advanced analytics effectively.
Major technology providers – including Microsoft, Palantir, AWS, and Google Cloud – are all actively discussing ontologies, confirming their role as the vital infrastructure for knowledge representation. However, building and maintaining robust ontologies requires sustained investment, especially in complex research environments where terminology evolves rapidly.
To address the volume of required changes in complex research environments, organizations are increasingly exploring automation opportunities within ontology workflows. AI-enabled tools can now propose solutions to ontology requests, with a critical human check in the loop: subject matter experts must still review and approve all changes.
As with any LLM-driven system, the quality of these proposals depends entirely on the quality and consistency of the underlying data. In organizations where expertise is limited or terminology varies widely, AI tools will inevitably reflect those inconsistencies.
Despite the promise that AI and machine learning will revolutionize drug discovery processes and reduce time-to-market, the conference highlighted a more sobering reality: many implementations to date have exposed, rather than solved, fundamental weaknesses in data capture, processing, and use. In the rare instances where member organizations have made meaningful progress in AI-enabled digital transformation, two key strategies emerged as critical for managing internal variability and domain specialization:
- Grounding in Domain Documents: Using domain-specific documents to ensure that any proposed automation is rooted in established, trusted, and contextually accurate knowledge.
- Human Fine-Tuning: Maintaining expert human oversight to continuously refine and correct the evolving knowledge model.
Knowledge graphs, which map key terms and integrate hundreds of internal and external datasets into a massive billion-edge network, were repeatedly highlighted as a powerful solution to the data-modeling challenges facing pharma. This structure provides the foundation needed for sophisticated organizational decision-making and is rapidly becoming an indispensable asset for designing and deploying agentic AI.
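At toy scale, the idea of mapping key terms and merging datasets into one graph can be sketched as follows. This is an illustrative assumption, not any presenter's implementation: the synonym table, the triples, and the function names are all hypothetical, and a production knowledge graph would rely on a governed ontology and a dedicated graph store.

```python
from collections import defaultdict

# Hypothetical synonym table mapping surface terms to canonical node names.
CANONICAL_ID = {
    "her2": "ERBB2",
    "erbb2": "ERBB2",
    "herceptin": "trastuzumab",
    "trastuzumab": "trastuzumab",
}


def canonical(term: str) -> str:
    """Return the canonical node name, or the term itself if unknown."""
    return CANONICAL_ID.get(term.lower(), term)


def build_graph(*datasets):
    """Merge (subject, relation, object) triples from several datasets.

    Terms are canonicalized first, so the same entity named differently
    in two sources ends up as a single node.
    """
    graph = defaultdict(set)
    for triples in datasets:
        for subject, relation, obj in triples:
            graph[canonical(subject)].add((relation, canonical(obj)))
    return graph


def neighbors(graph, node, relation):
    """List the objects linked to `node` by `relation`, sorted by name."""
    edges = graph.get(canonical(node), ())
    return sorted(obj for rel, obj in edges if rel == relation)
```

Because both sources are canonicalized before merging, a query phrased with a brand name and one phrased with a generic name reach the same node, which is the property that makes such a graph useful as shared context for downstream tools.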
The FAIR Imperative and Data Readiness
While advanced digital tools rely on high-quality data, they can also play a critical role in managing it, ultimately helping organizations adhere to the FAIR principles (Findable, Accessible, Interoperable, and Reusable) and then maintain that data integrity through continuous updates and integration.
Conference presenters emphasized that data must be high-quality, appropriately scoped, and, above all, compliant with regulatory requirements. Data without strong governance can never be applied systematically or at scale. Several examples highlighted how digital capabilities can support these principles:
- Findable: Automated tagging and metadata generation to improve discoverability.
- Accessible: Automatic classification of data sensitivity to ensure appropriate access control.
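The two examples above can be sketched with simple rules. Everything here is a hypothetical assumption for illustration, including the keyword lists, the sensitivity patterns, and the `DatasetMetadata` fields; real deployments would more likely use trained classifiers and an enterprise-governed metadata schema.

```python
import re
from dataclasses import dataclass, field

# Hypothetical keyword rules for automated tagging (Findable).
TAG_RULES = {
    "omics": ("rna-seq", "sequencing", "proteomics"),
    "cmc": ("stability", "batch", "dissolution"),
    "clinical": ("patient", "adverse event", "enrollment"),
}

# Hypothetical patterns that mark a dataset as sensitive (Accessible).
SENSITIVE_PATTERNS = (
    re.compile(r"\bpatient[ _-]?id\b", re.IGNORECASE),
    re.compile(r"\bdate of birth\b", re.IGNORECASE),
)


@dataclass
class DatasetMetadata:
    name: str
    tags: list = field(default_factory=list)
    sensitivity: str = "internal"


def describe(name: str, description: str) -> DatasetMetadata:
    """Derive tags and a sensitivity label from a free-text description."""
    text = description.lower()
    tags = sorted(
        tag
        for tag, keywords in TAG_RULES.items()
        if any(keyword in text for keyword in keywords)
    )
    restricted = any(p.search(description) for p in SENSITIVE_PATTERNS)
    return DatasetMetadata(name, tags, "restricted" if restricted else "internal")
```

The sensitivity label produced this way could then drive access control, while the tags feed a searchable catalog.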
The overarching message was clear: organizations must move away from locally developed, team-specific data models. To achieve true consistency and interoperability, data models and business logic need to be defined and governed at the enterprise level. Only then can organizations fully leverage automation, analytics, and AI across the R&D landscape.
Sourcing Authentic Patient Experience and Engagement
Patient experience data is increasingly viewed as a vital source of insights. A speaker from Boehringer Ingelheim described a collaborative project using advanced digital processing to extract patient experience data from social media content. The raw data is anonymized and filtered through analytical tools to generate rich patient insights, reflecting the belief that unmoderated patient conversations are more authentic, particularly concerning adverse events. This approach is especially valuable in regions where cultural or legal barriers discourage official reporting. For example, in India, where suicidal ideation is criminalized, patients may be far less likely to disclose such experiences through formal mechanisms. The Pistoia Alliance’s Social Media & Real World Evidence project is working to develop regulatory guidelines to support the responsible use of these emerging tools.
Complementary to this are efforts to improve recruitment for clinical trials. The high failure rate in clinical trials remains both an operational and ethical challenge: nearly 80% of clinical trials fail to meet enrollment targets, and 20-40% of sites do not recruit a single patient. These statistics underscore the urgent need for new approaches to patient engagement and data sourcing.
Richard Coupe of Our Future Health presented a not-for-profit initiative aimed at addressing this gap by recruiting five million adults across the UK. The program collects clinical review data and blood samples and links them with NHS data and future wearables. This consent-driven model creates an unprecedented research resource designed to accelerate recruitment for early detection and phenotype-driven trials. The broader takeaway is that clinical trials represent a huge opportunity for large-scale data collection and standardization. If leveraged effectively, this could dramatically improve drug discovery timelines and strengthen the evidence base that underpins therapeutic development.
The Ultimate Hurdle: Organizational Change and Human Adoption
Throughout the technical discussions, particularly those recounting real-world AI implementations, a consistent thread of organizational and cultural challenges persisted. While computational models are advancing rapidly, human adoption is the true bottleneck.
A speaker from Roche’s Computation Sciences team presented a bold but challenging vision: designing business processes around advanced digital capabilities instead of exclusively around human workflows. To achieve this shift successfully, she recommended that organizations commit 20% of both budget and time to change management.
Dr. Nicole Mather of IBM Consulting reinforced the severity of the adoption barrier, stressing that “Human-digital collaboration requires new skills.” She also cautioned against falling into the “AI Debt Trap,” where siloed pilot projects fail to integrate across enterprise-wide systems. Moira from UCB, discussing her team’s work on extracting medical insights, highlighted a list of critical success factors that extend far beyond technology: Data Readiness, Human Oversight, Stakeholder Buy-In, Integrated Workflows, and Change Management. She emphasized that these elements are essential for ensuring that AI solutions can be trusted, adopted, and scaled across the organization.
Several profound tensions sit at the center of the industry’s digital transformation:
- The Person-Centered Gap: Despite the emphasis on data and efficiency, discussions consistently prioritized cost savings over end-user satisfaction. The challenge of designing and deploying truly person-centered technology at scale remains unsolved.
- Structural Resistance: Large, complex organizations, often shaped by acquisitions, legacy systems, and internal politics, are not designed for the holistic process redesign required by enterprise-level digital strategies. This structural inertia slows or even blocks meaningful transformation.
- Misaligned Incentives: Investment in new digital capabilities is frequently justified by project cost reductions and, implicitly, workforce reductions. This creates a disincentive to invest in the cultural change and upskilling required for successful adoption. The recurring sentiment that employees must “get on board with the technology or they’re gone” underscores a fundamental misalignment between the goals of technological advancement and the necessity of human empowerment.
The major call-to-action emerging from the conference was unmistakable: organizations must confront the barriers and enablers of digital transformation in the lab, establish clear recommendations for stakeholder engagement, and define core principles that support human adoption. Until organizations commit to solving the culture and change readiness challenge with the same rigor they apply to data infrastructure and advanced analytics, the promise of R&D transformation will remain limited. The community consensus is clear: the future of drug discovery depends on rigorous data governance and making change management part of the status quo.



