DATA REPOSITORY FOR PRECIOUS3GPT
The Architecture and Training of the Precious3GPT model
Omics tabular observations are transformed into structured input prompts to train the transformer block. Additional modalities (text and knowledge graph) are embedded through modality mapper units as extensions to the frozen transformer block.
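The transformation of a tabular observation into a structured prompt can be illustrated with a minimal sketch. The field names (`species`, `tissue`, `age`, `expression`), the tag syntax, and the `row_to_prompt` helper are all hypothetical and chosen only for illustration; they are not the actual Precious3GPT prompt format.

```python
def row_to_prompt(row, top_k=3):
    """Turn one tabular omics observation into a structured text prompt.

    Hypothetical layout: metadata fields wrapped in tags, followed by the
    top_k genes ranked by expression. Not the real Precious3GPT format.
    """
    meta_keys = ("species", "tissue", "age")  # hypothetical metadata columns
    expr = row["expression"]
    # Rank genes by descending expression; break ties alphabetically
    # so the output is deterministic.
    ranked = sorted(expr, key=lambda g: (-expr[g], g))[:top_k]
    meta = " ".join(f"<{k}>{row[k]}</{k}>" for k in meta_keys)
    return f"{meta} <genes>{' '.join(ranked)}</genes>"


# Made-up example observation (values are illustrative, not real data).
example = {
    "species": "human",
    "tissue": "lung",
    "age": 65,
    "expression": {"CDKN2A": 8.1, "TP53": 5.4, "IL6": 9.3, "ACTB": 2.0},
}
prompt = row_to_prompt(example)
print(prompt)
```

Representing expression as a ranked gene list rather than raw values is one common way to make tabular omics data digestible for a language model; whether Precious3GPT does exactly this is not stated here.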
Overview of datasets used to train Precious3GPT
Precious3GPT as a central player in a multi-agent system for biomedical research.
Autonomous agents (left column) may use web services (center) as tools to carry out the functions typical of a biomedical research group. In such a collective of AI agents, Precious3GPT serves as a fast and affordable way to screen compounds or run omics experiments to generate hypotheses, e.g. “maslinic acid is a promising senolytic”. Other agents act on this hypothesis, preparing a lab protocol and scheduling in vitro validation. The validation results are shared with agents that digest the findings into natural language for a human-assisted review. After the review, the results are integrated into the ever-growing database used for Precious3GPT fine-tuning. Finally, Precious3GPT is ready to generate new hypotheses, thus closing the loop.
Performance of Gene Ontology (GO) term classifiers trained on embeddings extracted from various LLMs
The Precious3GPT-based classifiers exhibited superior performance on 8 out of 18 high-level GO terms but underperformed in classifying the genes’ molecular functions.
ROC-AUC: area under the receiver operating characteristic curve;
Std: standard deviation.
The background set consisted of 22,509 human genes.
The green and red highlights denote the best- and worst-performing classifiers for each term.
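For readers unfamiliar with the metrics in this table, the sketch below computes ROC-AUC for a binary "gene has this GO term" classifier and summarizes it as mean and standard deviation over repeated evaluations. It assumes the Std column reflects variation across evaluation folds or runs, which is not stated here. The labels and scores are made-up toy values, not results from the table.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random positive example is
    scored higher than a random negative one (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy folds: (true labels, classifier scores). Illustrative values only.
folds = [
    ([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4]),
    ([1, 1, 0, 0], [0.8, 0.3, 0.5, 0.1]),
]
aucs = [roc_auc(y, s) for y, s in folds]
auc_mean, auc_std = mean(aucs), stdev(aucs)
print(f"ROC-AUC = {auc_mean:.3f} (Std {auc_std:.3f})")
```

A ROC-AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of genes with and without the term.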
We employed Precious3GPT to generate novel geroprotectors.
Out of 22 compounds screened at Insilico Medicine's Robotic Lab, 8 showed senomorphic or senolytic action in an IMR90 in vitro senescence model.
Precious3GPT supports several strategies for identifying such compounds, among which we selected the method that relies on generating differential gene lists for younger and older adults. Previous work that used machine learning to identify novel senolytics reported a 14% (3 out of 21) hit rate [1]. In comparison, 23% of the compounds proposed by Precious3GPT showed senomorphic activity and no cytotoxicity.
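The hit rates quoted above can be checked with simple arithmetic. The counts 3 of 21 and 8 of 22 come from the text; the count behind the 23% figure is not given, so the sketch only asks which whole number of compounds out of 22 would round to 23%.

```python
def hit_rate_percent(hits, screened):
    """Hit rate as a percentage rounded to the nearest integer."""
    return round(100 * hits / screened)


# Counts stated in the text.
prior_ml_screen = hit_rate_percent(3, 21)    # previously reported screen
any_activity = hit_rate_percent(8, 22)       # senomorphic or senolytic

# Inference, not a reported number: which counts out of 22 give 23%?
consistent_with_23 = [k for k in range(23) if hit_rate_percent(k, 22) == 23]

print(prior_ml_screen, any_activity, consistent_with_23)
```

Under this rounding, 5 of 22 is the only count consistent with 23%. That would leave the remaining active compounds among the 8 in another category such as senolytic, but that split is an inference rather than a figure reported here.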

XL-888 was recently identified as a senolytic in an in vitro model of lung fibrosis [2]. This senolytic activity is thought to be mediated by HSP90 inhibition, a well-documented mechanism shared by some other senolytics [3]. We independently identified XL-888 as a potential senolytic showing selective toxicity to senescent cells.
Other promising compounds selected by Precious3GPT for screening include maslinic acid, a natural terpenoid found in olives, which has previously been reported to alleviate aging-related disorders in muscle and cartilage [4].
Using Precious3GPT, we identified it as a senomorphic compound that prevents senescent cell formation, a mechanism of action through which many beneficial effects may be achieved in the elderly.

For the other identified senomorphics, only scarce evidence of geroprotective effects can be found. Clomethiazole, for instance, is a sedative reported to have neuroprotective effects in ischemia [5]. Dapsone was previously reported to exert an anti-inflammatory effect by inhibiting reactive oxygen species production [6]. One compound that demonstrated potential senolytic action in our in vitro screening, metronidazole, has been used since the 1950s to treat inflammatory gastrointestinal conditions, such as colitis, thanks to its cytotoxic effect on anaerobic bacteria [7]. Despite a long history of research, the metabolism of metronidazole by human cells is still not fully understood, and our findings suggest that this compound may be worth studying outside the antibiotic context [8].
1. Smer-Barreto V, Quintanilla A, Elliott RJR, et al: Discovery of senolytics using machine learning. Nat Commun 14:3445, 2023
2. Lee JY, Reyes NS, Ravishankar S, et al: An in vivo screening platform identifies senolytic compounds that target p16INK4a+ fibroblasts in lung fibrosis. J Clin Invest 134:e173371, 2024
3. Fuhrmann-Stroissnigg H, Ling YY, Zhao J, et al: Identification of HSP90 inhibitors as a novel class of senolytics. Nat Commun 8:422, 2017
4. Proshkina E, Plyusnin S, Babak T, et al: Terpenoids as Potential Geroprotectors. Antioxidants (Basel) 9:529, 2020
5. Sydserff SG, Cross AJ, Murray TK, et al: Clomethiazole is neuroprotective in models of global and focal cerebral ischemia when infused at doses producing clinically relevant plasma concentrations. Brain Res 862:59–62, 2000
6. Khalilzadeh M, Shayan M, Jourian S, et al: A comprehensive insight into the anti-inflammatory properties of dapsone. Naunyn Schmiedebergs Arch Pharmacol 395:1509–1523, 2022
7. Edwards DI: Nitroimidazole drugs--action and resistance mechanisms. I. Mechanisms of action. J Antimicrob Chemother 31:9–20, 1993
8. Dingsdag SA, Hunter N: Metronidazole: an update on metabolism, structure-cytotoxicity and resistance mechanisms. J Antimicrob Chemother 73:265–279, 2018