CAMBRIDGE, Mass., May 4, 2026 — Insilico Medicine (“Insilico”, HKEX: 3696), a global clinical-stage generative artificial intelligence (AI)-driven drug discovery company, today announced that its research paper, “When Single Answer Is Not Enough: Rethinking Single-Step Retrosynthesis Benchmarks for LLMs,” has been accepted for presentation at the 43rd International Conference on Machine Learning (ICML 2026) in Seoul, South Korea, from July 6–11, 2026, at the COEX Convention & Exhibition Center.
The research, led by Insilico’s specialized team at its Generative AI and Quantum Computing R&D Center in Abu Dhabi, challenges the “ground-truth dogma” of legacy retrosynthesis benchmarks. The paper introduces ChemCensor, a novel, chemistry-aware metric designed to move beyond rigid Top-K accuracy and toward expert-level evaluation that reflects how chemists actually reason, in terms of reaction centers and functional groups rather than a binary match against a single recorded answer.
“Real chemistry is inherently multi-solution, and our benchmarking should reflect that reality,” said Alex Zhavoronkov, PhD, founder and co-CEO of Insilico Medicine. “ChemCensor is a critical component of our broader MMAI Gym Suite, which we are building to create realistic, scalable training environments for Pharmaceutical Superintelligence.”
Key Scientific Contributions:
- Beyond Top-K Accuracy: The paper argues that single-answer evaluation on constrained benchmarks such as USPTO-50K is fundamentally insufficient for retrosynthesis and proposes the chemistry-aware ChemCensor metric as an alternative.
- CREED Dataset: Introduction of a Comprehensive Reactant Exhaustive Enumeration Dataset comprising 6.4 million ChemCensor-validated reactions.
- C3LM Model: The paper reports results from a January 2026 version of C3LM; however, Insilico’s latest iteration already surpasses state-of-the-art (SOTA) conventional models including LocalRetro, R-SMILES, and Retro-KNN.
- URSA-expert-2026: Release of an expert-annotated out-of-domain benchmark of 100 novel targets to ensure evaluation is free from data leakage.
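To illustrate the single-answer limitation described in the first contribution, here is a minimal Python sketch. It is not ChemCensor and does not reproduce the paper's method: the helper names (`normalize`, `top_k_hit`), the toy amide example, and the hand-written set of acceptable reactants are all assumptions made for illustration. It only shows how an exact-match Top-K check against one recorded answer can reject a prediction that a chemist might still consider valid.

```python
from typing import Sequence, Set


def normalize(reactants: str) -> str:
    """Order-insensitive form of a dot-separated reactant string.

    This is a plain string sort, not chemical canonicalization.
    """
    return ".".join(sorted(reactants.split(".")))


def top_k_hit(predictions: Sequence[str], accepted: Set[str], k: int) -> bool:
    """Return True if any of the first k predictions exactly matches an accepted answer."""
    accepted_norm = {normalize(a) for a in accepted}
    return any(normalize(p) in accepted_norm for p in predictions[:k])


# Hypothetical toy target: N-methylacetamide, CC(=O)NC.
# Single recorded answer: acetyl chloride + methylamine.
recorded = {"CC(=O)Cl.CN"}
# Hand-written set that also accepts acetic acid + methylamine (assumption).
alternatives = {"CC(=O)Cl.CN", "CC(=O)O.CN"}

# Hypothetical ranked model output.
predictions = ["CN.CC(=O)O", "CC(=O)OC.CN"]

single = top_k_hit(predictions, recorded, k=1)
multi = top_k_hit(predictions, alternatives, k=1)
print(single, multi)  # False True
```

In this toy case the top prediction counts as a miss under single-answer Top-1 scoring and as a hit once a second plausible disconnection is accepted, which is the gap a chemistry-aware metric is meant to close.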
Insilico remains committed to full transparency and reproducibility. A large set of supporting materials will be released on Zenodo, Hugging Face, and GitHub. Read the preprint of our ICML 2026 paper on ChemCensor: https://arxiv.org/abs/2602.03554