Google has introduced DeepSomatic, a new artificial intelligence tool designed to identify the genetic mutations that drive cancer growth more accurately than existing methods. The research, published in Nature Biotechnology, marks another milestone in the use of AI to accelerate cancer research and precision medicine.

Pinpointing the Genetic Triggers of Cancer
Cancer begins when normal cell division goes awry, often due to genetic mutations that disrupt normal biological controls. Understanding which mutations are responsible is essential for doctors to select targeted treatments that stop tumour growth and prevent its spread.
Modern cancer care often involves sequencing tumour DNA from biopsies. However, separating genuine cancer-related mutations from background noise and sequencing errors remains a major challenge—especially when dealing with somatic variants, which arise during a person’s lifetime rather than being inherited.

Somatic mutations can result from environmental damage like UV exposure or from natural errors during cell replication. Because they often appear in only a fraction of tumour cells, distinguishing them from technical errors requires exceptional precision—something DeepSomatic is built to deliver.

How DeepSomatic Works
In typical clinical workflows, scientists sequence both tumour and normal cells from the same patient. DeepSomatic compares the two, spotting subtle differences that reveal which mutations are fuelling cancer.
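To make the tumour-versus-normal comparison concrete, here is a minimal Python sketch based on variant allele fraction (VAF), the share of reads at a site that carry a variant. It is a deliberately naive illustration, not DeepSomatic's method: the function names, the thresholds, and the read counts are all hypothetical.

```python
def variant_allele_fraction(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads covering a site that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must be between 0 and total_reads")
    return alt_reads / total_reads


def looks_somatic(tumour_vaf: float, normal_vaf: float,
                  min_tumour_vaf: float = 0.05,
                  max_normal_vaf: float = 0.01) -> bool:
    """Naive rule: variant is present in the tumour but absent from the normal.

    Both thresholds are illustrative assumptions, not published values.
    """
    return tumour_vaf >= min_tumour_vaf and normal_vaf <= max_normal_vaf


# Hypothetical site 1: variant seen in 6 of 60 tumour reads, 0 of 40 normal reads.
tumour_vaf = variant_allele_fraction(alt_reads=6, total_reads=60)
normal_vaf = variant_allele_fraction(alt_reads=0, total_reads=40)

# Hypothetical site 2: variant seen in about half the reads of both samples,
# which is what an inherited (germline) heterozygous variant tends to look like.
germline_tumour_vaf = variant_allele_fraction(alt_reads=31, total_reads=60)
germline_normal_vaf = variant_allele_fraction(alt_reads=21, total_reads=40)

print(looks_somatic(tumour_vaf, normal_vaf))
print(looks_somatic(germline_tumour_vaf, germline_normal_vaf))
```

A fixed threshold like this breaks down when a variant sits in very few reads, because sequencing errors can produce the same pattern. That gap is the motivation for a learned model.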
The AI converts DNA sequencing data into image-like representations, which are then analysed by convolutional neural networks, the same type of deep-learning architecture used in image recognition. This allows the model to distinguish between inherited genetic variants, sequencing artefacts, and true somatic mutations acquired by the tumour.
DeepSomatic can also operate in a “tumour-only” mode for cases where healthy tissue samples aren’t available, such as blood cancers like leukaemia. This flexibility makes it suitable for a wide range of clinical and research applications.
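The idea of turning sequencing reads into an image for a neural network can be sketched in a few lines of Python. The example below stacks reads into rows and writes three "channels" per position: the base, its quality score, and whether it differs from the reference. This is an assumption-laden toy, not DeepSomatic's actual encoding: the channel choice, the `encode_pileup` name, and the data are hypothetical, and reads are assumed to align without gaps.

```python
from typing import Dict, List, Tuple

# A read is (start position, bases, per-base quality scores).
Read = Tuple[int, str, List[int]]

BASE_CODES = {"A": 1, "C": 2, "G": 3, "T": 4}  # 0 means "no read covers this column"


def encode_pileup(reference: str, ref_start: int,
                  reads: List[Read]) -> Dict[str, List[List[int]]]:
    """Encode ungapped reads over a reference window as three 2-D grids.

    Each grid has one row per read and one column per reference position.
    """
    width = len(reference)
    channels: Dict[str, List[List[int]]] = {"base": [], "quality": [], "mismatch": []}
    for start, bases, quals in reads:
        base_row = [0] * width
        qual_row = [0] * width
        mismatch_row = [0] * width
        for offset, (base, qual) in enumerate(zip(bases, quals)):
            col = start + offset - ref_start
            if 0 <= col < width:  # ignore bases outside the window
                base_row[col] = BASE_CODES.get(base, 0)
                qual_row[col] = qual
                mismatch_row[col] = int(base != reference[col])
        channels["base"].append(base_row)
        channels["quality"].append(qual_row)
        channels["mismatch"].append(mismatch_row)
    return channels


# Hypothetical six-base window starting at position 100.
reference = "ACGTAC"
reads = [
    (100, "ACGTAC", [30, 30, 30, 30, 30, 30]),  # matches the reference
    (102, "GAAC", [30, 12, 30, 30]),            # low-quality mismatch at position 103
]
image = encode_pileup(reference, ref_start=100, reads=reads)
for row in image["mismatch"]:
    print(row)
```

In a real pipeline, grids like these would be stacked into a tensor and passed to a convolutional network. The sketch stops at the encoding step.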

Built on a Robust Foundation
To train DeepSomatic, Google partnered with the UC Santa Cruz Genomics Institute and the U.S. National Cancer Institute to create a gold-standard reference dataset called CASTLE. The team sequenced tumour and normal cells from breast and lung cancer samples using three major sequencing technologies. Combining these datasets produced a high-accuracy reference that filters out platform-specific errors.
This comprehensive training data helped the AI outperform other tools across sequencing methods. On Illumina data, DeepSomatic reached a 90% F1-score for detecting complex insertions and deletions, outperforming the next-best model by a significant margin. The advantage was even greater with Pacific Biosciences data, where DeepSomatic's score exceeded 80%, compared with less than 50% for competing tools.
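For readers unfamiliar with the metric, the F1-score is the harmonic mean of precision (how many reported variants are real) and recall (how many real variants are found). The short Python sketch below computes it from raw counts. The counts are invented for illustration and are not taken from the study.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """F1 = 2 * precision * recall / (precision + recall), written in terms of counts."""
    if min(true_positives, false_positives, false_negatives) < 0:
        raise ValueError("counts must be non-negative")
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        return 0.0  # nothing to find and nothing reported
    return 2 * true_positives / denominator


# Hypothetical caller: 90 real variants found, 10 false calls, 10 real variants missed.
balanced = f1_score(true_positives=90, false_positives=10, false_negatives=10)

# Hypothetical caller that reports few variants: very precise, but it misses most of them.
cautious = f1_score(true_positives=30, false_positives=0, false_negatives=70)

print(f"balanced caller F1: {balanced:.2f}")
print(f"cautious caller F1: {cautious:.2f}")
```

Because the harmonic mean punishes imbalance, a caller cannot reach a high F1-score by being only precise or only sensitive.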
The system also proved effective on more challenging samples, including tissue preserved as formalin-fixed paraffin-embedded (FFPE) blocks and data from whole-exome sequencing (WES). These findings suggest DeepSomatic can handle the lower-quality or older samples often used in retrospective studies.

Beyond the Training Data
One of the most promising aspects of DeepSomatic is its ability to generalise. When tested on glioblastoma, an aggressive form of brain cancer not included in its training data, the tool successfully identified known driver mutations. In collaboration with Children’s Mercy Hospital in Kansas City, DeepSomatic also analysed paediatric leukaemia samples, confirming known variants and uncovering ten new ones, all from tumour-only data.

A Step Toward Precision Oncology
By making both the DeepSomatic model and the CASTLE dataset openly available, Google aims to accelerate global cancer research and improve clinical decision-making. The tool’s ability to identify both known and novel mutations could help doctors tailor treatments more precisely and researchers uncover new therapeutic targets.
As AI continues to reshape genomics, DeepSomatic represents a significant step toward understanding what drives each individual tumour—and ultimately, how to stop it.