Optimizing Thyroid Cancer Research with Real-World Data and NLP: The NATI Study
Unlocking clinical insights through AI-powered data extraction from unstructured medical records.
success case
Key takeaways
NLP-Driven Data Extraction
Advanced Natural Language Processing (NLP) enabled the transformation of unstructured EHRs into structured, research-ready data.
Real-World Oncology Insights
4,482 patient records were enriched using NLP, complementing the 655 structured entries and significantly expanding the dataset.
Deep Clinical Characterization
364 variables were analyzed per patient, including diagnosis, treatments, staging, comorbidities, and genetic alterations.
Interoperability
A federated OMOP Common Data Model (OMOP-CDM) ensured interoperability across four hospitals in Spain.
context
Thyroid cancer is the most prevalent endocrine malignancy, representing 3.1% of global cancer cases. In Spain, national data is limited due to the lack of population-based registries, creating barriers for research and public health planning.
The NATI study (NATural language in ThyroId cancer) aimed to provide an updated, real-world view of how thyroid cancer is diagnosed and managed in clinical practice by analyzing patient data from four Spanish hospitals.
the challenge
Obtaining a comprehensive picture of thyroid cancer in Spain required access to large volumes of clinical information, much of which is buried in unstructured formats within EHRs. Manual data extraction is time-consuming and resource-intensive, making it unfeasible at scale.
The NATI study aimed to overcome this challenge by using AI and NLP technologies to automate data capture and generate a more complete dataset to support thyroid cancer research.
how iomed helped
IOMED implemented its AI-powered Data Space Platform to support individual patient characterization through the automated extraction of real-world data from electronic health records (EHRs). Working across four hospitals in Spain, IOMED used Natural Language Processing (NLP) to transform unstructured clinical narratives into high-quality, research-ready data.
This approach significantly enhanced data completeness and depth, revealing genetic alterations and demonstrating the scalability of NLP for large-scale oncology research. By unlocking valuable clinical content hidden in free-text medical records, IOMED’s technology played a key role in enabling a more accurate and granular view of thyroid cancer care across participating institutions.
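To make the idea of turning free-text notes into structured variables concrete, here is a minimal Python sketch. It is not IOMED's method: the patterns, the gene list, the `ExtractedRecord` fields, and the sample note are all hypothetical and chosen only for illustration. Production clinical NLP also has to handle negation, context, and mapping to standard terminologies, none of which this toy example attempts.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical patterns for illustration only. Simple keyword matching like
# this ignores negation ("no BRAF mutation") and clinical context.
STAGE_RE = re.compile(r"\bstage\s+(IV|III|II|I)\b", re.IGNORECASE)
MUTATION_RE = re.compile(r"\b(BRAF|RET|TERT)\b", re.IGNORECASE)


@dataclass
class ExtractedRecord:
    """Structured view of a single free-text note (illustrative fields)."""
    patient_id: str
    stage: Optional[str]
    mutations: List[str]


def extract_record(patient_id: str, note: str) -> ExtractedRecord:
    """Pull a stage and any listed gene names out of a free-text note."""
    stage_match = STAGE_RE.search(note)
    stage = stage_match.group(1).upper() if stage_match else None
    # De-duplicate gene mentions and return them in a stable order.
    mutations = sorted({m.upper() for m in MUTATION_RE.findall(note)})
    return ExtractedRecord(patient_id, stage, mutations)


# Invented sample note, not real patient data.
note = "Papillary thyroid carcinoma, stage II. BRAF V600E mutation detected."
record = extract_record("demo-001", note)
print(record)
```

Each note yields one structured record whose fields could then be mapped to a common data model so that records from different hospitals share the same shape.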
Results
5,137
Total patients with thyroid cancer included in the study
364
Variables analyzed per patient across clinical, diagnostic, and treatment domains
4,482
Patient records enriched using NLP from unstructured notes
impact 1
Hospital
Enabled deep patient characterization using existing EHR data, improving research readiness without adding clinical workload.
impact 2
Industry
Provided life sciences with scalable access to real-world insights for biomarker discovery and precision oncology strategies.
impact 3
Patients
Improved understanding of diagnostic and treatment journeys, supporting more targeted interventions and future care planning.
impact 4
Community
Demonstrated the power of AI and RWD to address data fragmentation and support national-level cancer research initiatives.


