Heterogeneous Graph Walk (HeroWalk) is a guided random-walk approach applied to a heterogeneous graph. The model learns node representations and then finds gene nodes close to the reference disease node. First, walks are sampled along a predefined meta-path, i.e., a fixed sequence of node types in a walk, e.g., 'gene'-'disease'-'gene'. Node degree controls the probability of transition between nodes during sampling. The SkipGram model then learns a representation of each node from the resulting corpus of walks. Ranking all genes by the cosine similarity between their representations and that of the specific disease produces a ranked list of genes. The top genes in this list are predicted to be promising target hypotheses.
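The two steps around the SkipGram model, meta-path-guided walk sampling and cosine-similarity ranking, can be sketched as follows. This is a minimal illustration under stated assumptions, not the HeroWalk implementation: the toy graph, the node names, the helper functions, and the placeholder embeddings are all hypothetical, and the transition probability is assumed to be proportional to the degree of the candidate node (the description above does not say how degree enters). SkipGram training itself is not shown; in practice the embeddings would come from a word2vec-style model trained on the sampled walks.

```python
import math
import random

# Hypothetical toy graph; node names and edges are illustrative only.
NODE_TYPE = {"G1": "gene", "G2": "gene", "G3": "gene",
             "D1": "disease", "D2": "disease"}
EDGES = [("G1", "D1"), ("G2", "D1"), ("G2", "D2"), ("G3", "D2")]

# Placeholder embeddings standing in for SkipGram output (not trained here).
TOY_EMBEDDINGS = {"D1": [1.0, 0.0], "D2": [0.0, 1.0],
                  "G1": [0.9, 0.1], "G2": [0.5, 0.5], "G3": [0.0, 1.0]}


def build_adjacency(edges):
    """Build an undirected adjacency map from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def sample_walk(adj, node_type, start, pattern, length, rng):
    """Sample one walk whose node types cycle through `pattern`.

    The meta-path gene-disease-gene is written as the repeating
    pattern ("gene", "disease"), with `start` being a gene node.
    """
    walk = [start]
    for step in range(1, length):
        wanted = pattern[step % len(pattern)]
        candidates = sorted(n for n in adj[walk[-1]] if node_type[n] == wanted)
        if not candidates:
            break  # dead end: no neighbor of the required type
        # Assumption: transition probability proportional to candidate degree.
        weights = [len(adj[n]) for n in candidates]
        walk.append(rng.choices(candidates, weights=weights, k=1)[0])
    return walk


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_genes(embeddings, node_type, disease):
    """Rank all gene nodes by cosine similarity to the disease node."""
    genes = [n for n, t in node_type.items() if t == "gene"]
    return sorted(genes,
                  key=lambda g: cosine(embeddings[g], embeddings[disease]),
                  reverse=True)
```

Calling `sample_walk(build_adjacency(EDGES), NODE_TYPE, "G1", ("gene", "disease"), 5, random.Random(0))` yields one walk that alternates gene and disease nodes; many such walks form the corpus, and `rank_genes` is then applied to the learned embeddings.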
Matrix Factorization is a collaborative filtering algorithm widely used in recommender systems. The algorithm decomposes a sparse matrix derived from a gene-disease interaction graph into two lower-dimensional matrices that contain latent factors for genes and for diseases. It uses graph regularization based on a fast kNN search to account for intraclass similarity between nodes of the same type. Reconstructing the interaction matrix from the latent factors yields scores for unobserved interactions, from which the gene ranking is obtained.
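The decomposition and re-scoring idea can be illustrated with a small sketch. It is a simplified stand-in rather than the algorithm described above: it fits latent factors by plain stochastic gradient descent with L2 regularization and omits the kNN-based graph regularization entirely. The function names, the hyperparameters, and the toy `(gene, disease, value)` triples are assumptions made for illustration.

```python
import random

# Hypothetical observed entries: (gene index, disease index, value).
OBSERVED = [(0, 0, 1.0), (0, 1, 1.0), (1, 0, 1.0), (1, 1, 1.0),
            (2, 2, 1.0), (3, 2, 1.0), (0, 2, 0.0), (2, 0, 0.0), (3, 1, 0.0)]


def factorize(observed, n_genes, n_diseases, rank=2, lr=0.05, reg=0.01,
              epochs=500, seed=0):
    """Fit gene factors P and disease factors Q to the observed entries by SGD."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_genes)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_diseases)]
    for _ in range(epochs):
        for g, d, value in observed:
            err = value - score(P, Q, g, d)
            for k in range(rank):
                p, q = P[g][k], Q[d][k]
                # Gradient step on squared error with L2 shrinkage.
                P[g][k] += lr * (err * q - reg * p)
                Q[d][k] += lr * (err * p - reg * q)
    return P, Q


def score(P, Q, gene, disease):
    """Reconstructed interaction score: dot product of the latent factors."""
    return sum(p * q for p, q in zip(P[gene], Q[disease]))


def mse(P, Q, observed):
    """Mean squared reconstruction error on the observed entries."""
    return sum((v - score(P, Q, g, d)) ** 2 for g, d, v in observed) / len(observed)


def rank_unobserved(P, Q, observed, disease):
    """Rank genes with no observed entry for `disease` by reconstructed score."""
    seen = {g for g, d, _ in observed if d == disease}
    candidates = [g for g in range(len(P)) if g not in seen]
    return sorted(candidates, key=lambda g: score(P, Q, g, disease), reverse=True)
```

After `P, Q = factorize(OBSERVED, 4, 3)`, calling `rank_unobserved(P, Q, OBSERVED, disease)` orders the genes that have no recorded interaction with that disease. A graph-regularized variant would add a penalty pulling the factors of similar genes (or similar diseases) toward each other.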
Expression relies on differential gene expression analysis of a collection of datasets of interest. Machine learning-based models are used to normalize gene expression data from multiple samples across unrelated datasets.
The score is based on exploration of the protein-protein interactome. AI methods assess the interactome communities to which a given gene belongs. The higher the score, the richer the gene's dense community is in known targets for a given disease, in genes related to the disease by GWAS, TWAS, OMIM, and ClinVar evidence, and in differentially expressed genes defined by a collection of datasets of interest.
This score is based on causal inference of transcription factors. It estimates the number of genes associated with disease progression or treatment that are controlled by a set of transcription factors similar to the set controlling a given gene. It uses a manually curated regulatory network and known drug targets to predict potential disease-modifying transcription factors.
These scores are based on an AI model for comparing differential expression profiles of interest. The LINCS cell line expression profiles are compared with gene expression profiles in cell lines overexpressing particular genes (knockin) or carrying a gene knockout (knockout), taken from the LINCS database. The score characterizes a gene whose knockin or knockout changes a cancer cell line's expression profile in a way similar to the disease-related transcriptomics dataset submitted by the user.
This score estimates target relevance for the process/disease phenotype. The score relies on a manually curated database and the data coming from external sources (OMIM, ClinVar, Open Targets).
This score is based on AI weighting of the evidence inferred from a manually curated list of gene variants and from TWAS/GWAS studies. A high score indicates a mutation burden that correlates with a given disease.
The score is based on a combination of pathway analysis approaches. First, iPanda is used to examine the implication of a given gene in pathway activation patterns derived from a collection of gene expression datasets of interest (activation and inhibition of each pathway are examined separately). Next, all pathways from the library are merged into a single network, which is examined from the perspective of signal propagation by a number of methods. The final score indicates how strongly a given gene is implicated in the activation or inhibition of individual pathways, and its ability to affect multiple pathways via cross-talk.
The score explores the direct network neighbors that interact with a given gene. The higher the score, the more of these neighbors are differentially expressed genes, as defined by a collection of datasets of interest.
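The underlying idea of counting differentially expressed direct neighbors can be sketched in a few lines. This is only a hedged illustration of the raw count: the description above does not state how the count is normalized or weighted, and the network, gene labels, and function name below are hypothetical.

```python
# Hypothetical interaction network: gene -> set of direct neighbors.
INTERACTIONS = {
    "GENE_A": {"GENE_B", "GENE_C", "GENE_D"},
    "GENE_B": {"GENE_A"},
    "GENE_C": {"GENE_A", "GENE_D"},
    "GENE_D": {"GENE_A", "GENE_C"},
}

# Hypothetical set of differentially expressed genes from the datasets of interest.
DE_GENES = {"GENE_B", "GENE_D"}


def count_de_neighbors(interactions, de_genes, gene):
    """Count the direct neighbors of `gene` that are differentially expressed."""
    return sum(1 for neighbor in interactions.get(gene, ()) if neighbor in de_genes)
```

Here `count_de_neighbors(INTERACTIONS, DE_GENES, "GENE_A")` counts `GENE_B` and `GENE_D`, while a gene missing from the network simply gets a count of zero.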
Combines evidence from the external databases (OMIM, Open Targets, ClinVar).