Prioritization Scores
Omics score values represent the probability that a given evidence group indicates an association between a given gene and a given disease. All scores range from 0 to 1, where 0 indicates no evidence and 1 indicates the highest degree of evidence.
NEW – Heterogeneous Graph Walk
NEW – Matrix Factorization
Interactome Community
Causal inference
Mutated/Disease Sub-modules
Network Neighbors
Heterogeneous Graph Walk (HeroWalk) is a guided random walk-based approach that is applied to a heterogeneous gene-disease graph.
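To give a feel for walk-based scoring on a graph that mixes gene and disease nodes, the sketch below runs a plain random walk with restart from a disease node and ranks genes by visiting probability. This is a generic stand-in technique, not the HeroWalk algorithm itself. The toy graph, the node names, the restart probability, and the function name `random_walk_with_restart` are all assumptions made for illustration.

```python
import numpy as np

def random_walk_with_restart(adj, seed_index, restart=0.3, iters=100):
    """Return visiting probabilities for a walk that restarts at seed_index."""
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # avoid division by zero for isolated nodes
    transition = adj / col_sums            # column-stochastic transition matrix
    start = np.zeros(adj.shape[0])
    start[seed_index] = 1.0
    prob = start.copy()
    for _ in range(iters):
        # Follow an edge with probability (1 - restart), else jump back to the seed.
        prob = (1 - restart) * (transition @ prob) + restart * start
    return prob

# Hypothetical heterogeneous graph: three gene nodes and one disease node.
nodes = ["gene_A", "gene_B", "gene_C", "disease_X"]
adjacency = np.array([
    [0, 1, 0, 1],   # gene_A - gene_B, gene_A - disease_X
    [1, 0, 1, 0],   # gene_B - gene_A, gene_B - gene_C
    [0, 1, 0, 0],   # gene_C - gene_B
    [1, 0, 0, 0],   # disease_X - gene_A
])
prob = random_walk_with_restart(adjacency, seed_index=nodes.index("disease_X"))
```

In this toy graph, genes closer to the disease node receive a higher visiting probability than genes further away.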
Matrix Factorization is a collaborative filtering algorithm widely used in recommender systems; here it is applied to a gene-disease interaction graph. The new scores strengthen the AI component and improve the performance of the target hypothesis generation engine, while preserving the general layout and backwards compatibility of the Target ID tool.
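As a rough illustration of how matrix factorization can score gene-disease pairs, the sketch below factorizes a tiny binary association matrix by gradient descent and reads the scores from the low-rank reconstruction. The matrix, the rank, the learning rate, the regularization strength, and the function name `factorize` are assumptions for illustration only; the model used in the tool is not described here.

```python
import numpy as np

def factorize(R, rank=2, steps=2000, lr=0.05, reg=0.01, seed=0):
    """Approximate R (genes x diseases) as G @ D.T by gradient descent."""
    rng = np.random.default_rng(seed)
    n_genes, n_diseases = R.shape
    G = rng.normal(scale=0.1, size=(n_genes, rank))      # gene factors
    D = rng.normal(scale=0.1, size=(n_diseases, rank))   # disease factors
    for _ in range(steps):
        err = R - G @ D.T                   # reconstruction error
        G_grad = -err @ D + reg * G         # gradient of regularized squared error
        D_grad = -err.T @ G + reg * D
        G -= lr * G_grad
        D -= lr * D_grad
    return G, D

# Hypothetical known associations: rows are genes, columns are diseases.
R = np.array([
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 1, 1],
], dtype=float)

G, D = factorize(R)
scores = np.clip(G @ D.T, 0.0, 1.0)   # clip so that scores stay in the 0 to 1 range
```

Pairs that were zero in the input but receive a non-trivial reconstructed value are the kind of candidates a recommender-style model would surface.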
Expression relies on the differential gene expression analysis defined by a collection of datasets of interest. Machine learning-based models are used to normalize gene expression data available across multiple samples from unrelated datasets.
The score is based on exploration of the protein-protein interactome. AI methods assess interactome communities to identify the community to which a given gene belongs. The higher the score, the richer that dense community is in known targets for a given disease, in genes related to the disease by GWAS, TWAS, OMIM and ClinVar evidence, and in differentially expressed genes defined by a collection of datasets of interest.
This score is based on the causal inference of transcription factors. It estimates the number of genes associated with disease progression/treatment that are controlled by a set of transcription factors similar to that of a given gene. It uses a manually curated regulatory network and known drug targets to predict potential disease-modifying transcription factors.
These scores are based on an AI model that compares differential expression profiles of interest. The LINCS cell line expression profiles are compared with gene expression profiles in cell lines overexpressing particular genes (knockin) or carrying a gene knockout (knockout), based on the LINCS database. The score characterizes a gene whose knockin/knockout affects the cancer cell line expression profile in a way similar to the disease-related transcriptomics dataset submitted by the user.
This score estimates target relevance for the process/disease phenotype. The score relies on a manually curated database and the data coming from external sources (OMIM, ClinVar, Open Targets).
This score is based on the AI weighting of the evidence inferred from a manually curated list of gene variants and TWAS/GWAS studies. A high score indicates a mutation burden that correlates with a given disease.
The score is based on a combination of pathway analysis approaches. First, iPanda is used to examine the implication of a given gene in pathway activation patterns across a collection of gene expression datasets of interest (activation and inhibition of each pathway are examined separately). Next, all the pathways from the library are merged into a single network, which is examined from the perspective of signal propagation by a number of methods. The final score indicates the implication of a given gene in the activation/inhibition of individual pathways, and its ability to affect multiple pathways via cross-talk.
The score explores the direct network neighbors interacting with a given gene. The higher the score, the larger the number of differentially expressed neighbor genes, as defined by a collection of datasets of interest.
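The quantity behind the Network Neighbors score can be sketched as a count of direct neighbors that are differentially expressed. The edge list, the gene set, and the function name `de_neighbor_count` below are hypothetical, and the way the raw count is mapped to the 0 to 1 range is not specified in this text, so it is not shown.

```python
def de_neighbor_count(gene, edges, de_genes):
    """Return (number of differentially expressed neighbors, total neighbors)."""
    neighbors = set()
    for a, b in edges:
        # Edges are treated as undirected interactions.
        if a == gene:
            neighbors.add(b)
        elif b == gene:
            neighbors.add(a)
    neighbors.discard(gene)  # ignore self-interactions
    return len(neighbors & de_genes), len(neighbors)

# Hypothetical interaction edges and differentially expressed gene set.
edges = [("TP53", "MDM2"), ("TP53", "ATM"), ("TP53", "EP300"), ("MDM2", "ATM")]
de_genes = {"MDM2", "ATM", "BRCA1"}

de_count, total = de_neighbor_count("TP53", edges, de_genes)
```

Here two of the three direct neighbors of the example gene are in the differentially expressed set.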
Combines evidence from the external databases (OMIM, Open Targets, ClinVar).
Text-based (NLP)
This group of scores is based on the analysis of text sources, including scientific publications, grants, patents, and clinical trials. Insilico monitors biomedical text data and uses advanced NLP and AI-based technology to analyze the contextualized mentions of entities such as genes, diseases, drugs and KOLs in a variety of data sources. The scores are calculated for each gene in a disease-agnostic manner (total mentions of the gene) and in the context of disease areas of interest. Attention spikes are calculated for target-disease associations only (no disease-agnostic scores).
Attention score
Attention Spike
Attention score measures the overall attention to the target over all time. It is based on the total mentions of a gene in the text data, including scientific publications, grants, patents, and clinical trials, across all time periods (both disease-agnostic and disease-specific).
Trend score estimates the growth of attention to a given gene during the last 5-year period (both disease-agnostic and disease-specific). It ranges from -1 to 1, with 1 corresponding to an infinite number of publications during the last year. Negative values correspond to a decline in publication activity. The exact trend value of a gene is available on the gene page.
Percentage score is defined as the target-indication publication percentage (relative to the total number of all target-disease cross-mentions).
A recurrent neural network was trained on retrospective publication trends to forecast the probability of an attention spike in the next two years. The score ranges between 0 and 1, where 0 corresponds to the lowest probability of a spike and 1 corresponds to the highest probability of a spike.
This score combines the evidence from the Attention and Trend scores and represents the weighted average of the trend and the log-Attention score. Higher values indicate both exponential attention growth and a high volume of research.
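The Percentage score definition can be illustrated with a short calculation: each target-disease pair receives its share of all target-disease cross-mentions. The publication counts and the function name `percentage_scores` are made up for the example, and reading "total number" as the sum over all pairs is an assumption.

```python
def percentage_scores(pair_counts):
    """Map each (gene, disease) pair to its share of all cross-mentions."""
    total = sum(pair_counts.values())
    if total == 0:
        # No cross-mentions at all: every pair gets a share of 0.
        return {pair: 0.0 for pair in pair_counts}
    return {pair: count / total for pair, count in pair_counts.items()}

# Hypothetical publication counts per target-disease pair.
pair_counts = {
    ("EGFR", "lung cancer"): 60,
    ("EGFR", "glioma"): 30,
    ("KRAS", "lung cancer"): 10,
}
shares = percentage_scores(pair_counts)
```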
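A weighted average of trend and log-attention could look roughly like the sketch below. The equal weights, the `log1p` transform, the normalization by the largest mention count, and the function name `combined_score` are all assumptions, since the exact weights and scaling are not given in this text.

```python
import math

def combined_score(trend, mentions, max_mentions, trend_weight=0.5):
    """Weighted average of a trend value and a normalized log-attention value."""
    if max_mentions <= 0:
        log_attention = 0.0
    else:
        # log1p keeps zero mentions at 0; dividing by the maximum scales to 0..1 (assumed).
        log_attention = math.log1p(mentions) / math.log1p(max_mentions)
    return trend_weight * trend + (1 - trend_weight) * log_attention

# Hypothetical inputs: strong growth and the highest mention count in the set.
top = combined_score(trend=1.0, mentions=1000, max_mentions=1000)
```

With this scaling, a gene scores high only when both its trend and its mention volume are high.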
Financial scores
Grant Funding
Funding per Publication
Grant Size
Total grant funding over all time. The distribution is skewed: the average entry with non-zero funding has $8,900,000 in funding, while the median is $1,700,000. 5% of entries have funding above $30,000,000.
The sum of grant funding relative to the publication count. Entries with 0 publications and >0 funding are assigned a value of 0. This score corresponds to the opinion of the scientific community on the fruitfulness of the research around a particular gene or gene-disease association.
This score also corresponds to the trust of the scientific community in a particular target or target-disease association.
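The rule for Funding per Publication, including the zero-publication case, can be written as a small function. The function name `funding_per_publication` and the example figures are hypothetical, and any later rescaling of the raw ratio is not shown.

```python
def funding_per_publication(total_funding, publication_count):
    """Grant funding divided by publication count; 0 when there are no publications."""
    if publication_count == 0:
        # Entries with funding but no publications are set to 0.
        return 0.0
    return total_funding / publication_count

# Hypothetical entries.
with_publications = funding_per_publication(3_000_000, 12)
without_publications = funding_per_publication(1_700_000, 0)
```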
Key Opinion Leaders (KOL)
Impact Factor
The average Hirsch index is calculated over the publications mentioning a given gene-disease association.
The average Impact Factor is calculated over the journals that published mentions of a given gene-disease association. The Impact Factor of each journal is weighted by the number of publications mentioning the gene-disease association that appeared in that journal.
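The publication-weighted average described for the Impact Factor score can be sketched as follows. The journal figures and the function name `weighted_impact_factor` are invented for the example.

```python
def weighted_impact_factor(journals):
    """journals: list of (impact_factor, n_publications) for one gene-disease pair."""
    total_publications = sum(n for _, n in journals)
    if total_publications == 0:
        return 0.0
    # Each journal contributes in proportion to its number of mentioning publications.
    return sum(impact * n for impact, n in journals) / total_publications

# Hypothetical: three publications in a journal with IF 10.0, one in a journal with IF 4.0.
average_if = weighted_impact_factor([(10.0, 3), (4.0, 1)])
```

An unweighted mean of the two journals would give 7.0; weighting by publication count pulls the value toward the journal with more mentions.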
This score represents the total number of publications in 10+ impact factor journals (top 3% of journals).