Synthetic data and ‘avatar’ patients to accelerate clinical research in haematology

SYNTHEMA, a project involving Humanitas’ Cancer Centre and AI Centre, has won a €7 million grant from the European Commission under the Horizon 2020 programme to develop new data analysis systems for haematological diseases. SYNTHEMA aims to create an international hub where innovative artificial intelligence-based techniques for anonymising patients’ clinical and biological information can be developed and validated. It also aims to generate synthetic data in compliance with the GDPR (General Data Protection Regulation), to overcome the scarcity and fragmentation of the information available for research today.

The European maxi-consortium, coordinated by the Polytechnic University of Madrid, brings together 16 research institutes and universities. Among them is the IRCCS Istituto Clinico Humanitas, which is taking part through its Cancer Centre and AI Centre and will be responsible for the development and clinical validation activities of the project. The Italian participants also include the University of Bologna, the University of Padua and the consulting firm Datawizard.

The project, which will start in early 2023, will focus on rare haematological diseases such as sickle cell anaemia and acute myeloid leukaemia.

“Haematological diseases derive from quantitative or qualitative abnormalities of blood cells, lymphoid organs and coagulation factors,” explains Prof. Matteo Della Porta, Head of Leukaemia and Myelodysplasias at Humanitas, lecturer at Humanitas University and clinical coordinator of the SYNTHEMA project. Even though most of these diseases are rare, the overall number of patients affected worldwide is significant. When looking at all types of tumours (which include blood diseases such as acute myeloid leukaemia), in Italy around 20% of patients are affected by a rare form of neoplasia. In recent years, many collaborative research groups have sprung up at national and EU level to study rare haematological diseases, yet progress in clinical and translational research is often slowed by the relatively low number of affected patients and the lack of connection and communication between clinical and research centres.

The AI Centre at Humanitas has been working to close this gap by studying secure and anonymous systems for generating knowledge from patient data.

“The future is synthetic data,” explains Victor Savevski, Managing Director of Humanitas’ AI Centre. “To obtain it, we need algorithms that can collect information from the real patient to create a virtual copy that is different from the individual who generated that data, but contains all the clinical, genetic, biological and statistical properties of the original. A sort of ‘avatar’ of the individual, but with no threat to their privacy: the data is not copied one by one, as in a mirror, but from a group perspective, considering the relationships between the data. These virtual patients no longer have anything to do with the real patients: it is as if they have absorbed all their properties and relationships in a similar but not identical way.”
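As a rough illustration of the ‘group perspective’ described above, the following Python sketch learns only cohort-level statistics (means, spreads and one correlation) from a toy dataset and then samples new virtual records from them. Everything in it is an assumption made for illustration: the toy cohort is randomly generated rather than real patient data, the two variables (age and haemoglobin) and all function names are hypothetical, and the simple bivariate Gaussian model stands in for the far more sophisticated generative techniques a project like SYNTHEMA would use. It also makes no claim about formal privacy guarantees.

```python
import math
import random
import statistics


def fit_bivariate(records):
    """Estimate cohort-level statistics for two numeric variables."""
    xs = [r[0] for r in records]
    ys = [r[1] for r in records]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mx) * (y - my) for x, y in records) / (len(records) - 1)
    return mx, my, sx, sy, cov / (sx * sy)  # last value is the correlation


def sample_synthetic(params, n, rng):
    """Draw n virtual records from a bivariate Gaussian with the fitted statistics."""
    mx, my, sx, sy, rho = params
    out = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mx + sx * z1
        # Mixing z1 into y reproduces the correlation between the two variables.
        y = my + sy * (rho * z1 + math.sqrt(1 - rho ** 2) * z2)
        out.append((x, y))
    return out


rng = random.Random(0)

# Toy "real" cohort (made up): age in years, haemoglobin in g/dL.
real = []
for _ in range(2000):
    age = rng.gauss(65, 10)
    haemoglobin = 14 - 0.05 * (age - 65) + rng.gauss(0, 1)
    real.append((age, haemoglobin))

params = fit_bivariate(real)                    # only group-level statistics are kept
synthetic = sample_synthetic(params, 2000, rng)
synth_params = fit_bivariate(synthetic)

print(f"correlation in real cohort:      {params[4]:.2f}")
print(f"correlation in synthetic cohort: {synth_params[4]:.2f}")
print(f"records copied from real cohort: {len(set(real) & set(synthetic))}")
```

In this sketch the generator never sees an individual record at sampling time, only the fitted summary statistics, which is why the virtual records resemble the cohort without mirroring any one person.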

Synthetic data can speed up trials: studies that need to compare two different populations (e.g., to verify the efficacy of a drug) will need to enrol only one group of people instead of two, with obvious economic benefits as well.

Humanitas’ commitment to AI and digitalization

Humanitas Clinical Institute is the first hospital in Italy to have an integrated Artificial Intelligence Research Centre: the Humanitas AI Centre. The centre’s mission is to create a space for sharing knowledge and skills between engineers, doctors and data scientists in order to achieve even higher standards of care. This is done by creating intelligent algorithms capable of processing a large amount of clinical information, so as to find associations and define prediction models that can advance scientific research and innovation in areas such as predictive medicine, personalisation of treatments and diagnostic imaging.

Humanitas is committed to applying Artificial Intelligence in various fields: from the prevention of colon and rectal cancer (where it is working on a database of omics data in endoscopy) to diagnostic imaging and haematology. In 2021, for example, the GenoMed4All project was launched, funded by the European Commission under the Horizon 2020 programme, with the aim of creating a platform that facilitates the sharing and analysis of the enormous amount of genomic and clinical data on haematological diseases.


Humanitas is a highly specialised Hospital, Research and Teaching Center. Built around centers for the prevention and treatment of cancer, cardiovascular, neurological and orthopedic diseases, together with an Ophthalmic Center and a Fertility Center, Humanitas also operates a highly specialised Emergency Department.