Comparison Creation | PandaOmics

Comparison Creation

PandaOmics enables you to easily compare the experimental sample groups you defined earlier.

PandaOmics uses the Comparison concept to store the contrasting analysis between two experimental groups, which includes differential gene expression analysis, differential pathway activity predictions and more.

Once defined, a comparison takes several minutes to calculate.

Clicking on the Comparison of interest will take you to the corresponding details page.

Cross-Dataset Comparisons

Cross-dataset analysis allows combining several Datasets within a single experiment. This can increase statistical power and help detect more subtle changes in expression, especially when a single dataset contains only a few samples.

To create a cross-dataset comparison, open the "Datasets" tab in the Data manager and select three or more Sample groups. As with a single-dataset comparison, you should specify cases (samples for the condition under study) and controls (reference samples, usually healthy tissue). Optionally, you can specify auxiliary cases/controls: additional samples used for batch correction (see below) but not for the actual comparison.

Batch effect and its correction

Analyzing samples from different datasets together can expose a so-called batch effect: differences in the data caused by non-biological factors such as reagent lots, personnel or equipment model. An uncorrected batch effect leads to false findings in downstream analyses such as differential gene expression and pathway perturbation.
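
To make the idea concrete, here is a toy illustration in Python. The expression values, the dataset labels and the `center_by_batch` helper are all hypothetical, and per-dataset mean centering is a deliberately simplified stand-in, not the correction method PandaOmics actually applies.

```python
from statistics import mean

def center_by_batch(values, batches):
    """Subtract each batch's mean from its samples (location-only adjustment)."""
    batch_means = {
        b: mean(v for v, vb in zip(values, batches) if vb == b)
        for b in set(batches)
    }
    return [v - batch_means[b] for v, b in zip(values, batches)]

# Hypothetical expression of one gene. Dataset B reads about 5 units
# higher than dataset A for purely technical reasons.
values = [1.0, 1.2, 3.0, 3.2, 6.0, 6.2, 8.0, 8.2]
batches = ["A", "A", "A", "A", "B", "B", "B", "B"]
classes = ["control", "control", "case", "case"] * 2

gap_before = mean(values[4:]) - mean(values[:4])        # dataset gap, about 5
corrected = center_by_batch(values, batches)
gap_after = mean(corrected[4:]) - mean(corrected[:4])   # dataset gap, about 0

print(f"dataset gap before: {gap_before:.2f}, after: {gap_after:.2f}")
```

In this balanced toy example the case-versus-control difference within each dataset is preserved while the gap between datasets disappears. Simple centering like this can distort real data when the proportion of cases and controls differs between datasets, which is one reason dedicated methods are used in practice.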

PandaOmics uses the ComBat method to minimize the batch effect, but this procedure cannot guarantee that the technical variability is completely removed. Batch correction has two limitations: it works only for transcriptomics data (microarray, RNAseq), and only if at least one dataset selected for the analysis contains both Case and Control groups.
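
The two limitations can be read as a simple eligibility check. The sketch below is only an illustration of that rule: the `can_batch_correct` function and the dictionary layout are hypothetical and are not part of any PandaOmics API. It also assumes that every selected dataset must be transcriptomic, which is one reading of the rule above.

```python
# Assumed labels for the transcriptomics data types named in the text.
TRANSCRIPTOMICS_TYPES = {"microarray", "rnaseq"}

def can_batch_correct(datasets):
    """Return True if the selection satisfies both stated limitations.

    datasets: list of dicts with a 'data_type' string and a 'groups'
    collection containing 'case' and/or 'control' (hypothetical layout).
    """
    all_transcriptomic = all(
        d["data_type"] in TRANSCRIPTOMICS_TYPES for d in datasets
    )
    # At least one dataset must contain both a Case and a Control group.
    has_anchor = any({"case", "control"} <= set(d["groups"]) for d in datasets)
    return all_transcriptomic and has_anchor

eligible = can_batch_correct([
    {"data_type": "rnaseq", "groups": ["case", "control"]},
    {"data_type": "microarray", "groups": ["case"]},
])
not_eligible = can_batch_correct([
    {"data_type": "rnaseq", "groups": ["case"]},
    {"data_type": "rnaseq", "groups": ["control"]},
])
print(eligible, not_eligible)  # True False
```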

Batch correction result interpretation

You decide whether the batch correction has succeeded, and then either proceed with the comparison setup or select other Datasets or Case Groups. To help you with this decision, PandaOmics provides a visualization (PCA plot) and quantitative metrics.

Here are examples of successful and unsuccessful batch correction.
Marker color encodes the sample Dataset, while the shape encodes the sample type (Case or Control).

In the successful example, samples are initially (left plot) grouped according to datasets rather than classes: this is the batch effect that we want to remove. After batch correction (right plot), samples are grouped according to their class and the batch effect is mostly gone. It is safe to proceed with comparison creation.

In the unsuccessful example, the batch effect can be clearly seen: samples are clustered according to datasets rather than classes (left plot). After batch correction (right plot), samples from different datasets are mixed together, but Cases are not separated from Controls. In this case, creating the comparison and running the subsequent analysis will likely produce meaningless results.

In addition to the visualization, several quantitative metrics are calculated to estimate the overall performance of the batch correction procedure. PandaOmics uses these metrics to inform you whether the batch effect has likely been removed.
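
The metrics PandaOmics computes are not described here. As a purely illustrative sketch of what such a metric could look like, the Python snippet below scores how strongly points on a PCA plot separate by a given labeling. The `separation_score` function, the coordinates and the labels are all hypothetical.

```python
from math import dist
from statistics import mean

def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return tuple(mean(axis) for axis in zip(*points))

def separation_score(points, labels):
    """Mean distance between group centroids divided by the mean distance
    of samples to their own group centroid. Higher means the labeling
    separates the samples more clearly. Expects at least two groups."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    centroids = {label: centroid(pts) for label, pts in groups.items()}
    within = mean(dist(p, centroids[l]) for p, l in zip(points, labels))
    names = list(centroids)
    between = mean(
        dist(centroids[a], centroids[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    )
    return between / within if within > 0 else float("inf")

# Hypothetical PCA coordinates after a successful correction.
points = [(-2.0, 0.1), (-1.8, -0.2), (-2.1, 0.0), (-1.9, 0.2),
          (2.0, -0.1), (1.9, 0.2), (2.1, 0.1), (1.8, -0.2)]
datasets = ["A", "B", "A", "B", "A", "B", "A", "B"]
classes = ["control"] * 4 + ["case"] * 4

class_score = separation_score(points, classes)
dataset_score = separation_score(points, datasets)
print(f"by class: {class_score:.2f}, by dataset: {dataset_score:.2f}")
```

Under this toy score, a successful correction shows a high value when samples are labeled by class and a low value when they are labeled by dataset. The unsuccessful case described above would score low for both labelings.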

Auxiliary samples

Batch removal is more efficient if many samples are available for the procedure.

Imagine that, in addition to other data, you have a dataset with several groups (e.g. different diseases) and want to use only a single group in the comparison. By simply discarding the other groups you would lose valuable information about technical variability.
The solution is to specify the other groups as auxiliary cases. These samples will be used by the batch removal procedure and ignored in the actual comparison calculation.
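
The role of auxiliary samples can be sketched as two different sample selections. The Python snippet below is a hypothetical illustration only: the `split_samples` function, the role names and the sample IDs are assumptions and do not reflect how PandaOmics stores its data.

```python
# Roles that take part in the actual comparison (assumed names).
COMPARISON_ROLES = ("case", "control")

def split_samples(samples):
    """Return (ids used for batch correction, ids used for the comparison).

    samples: list of dicts with an 'id' and a 'role' that is one of
    'case', 'control', 'aux_case' or 'aux_control' (hypothetical layout).
    """
    for_batch_correction = [s["id"] for s in samples]  # every sample helps
    for_comparison = [
        s["id"] for s in samples if s["role"] in COMPARISON_ROLES
    ]
    return for_batch_correction, for_comparison

samples = [
    {"id": "S1", "role": "case"},
    {"id": "S2", "role": "control"},
    {"id": "S3", "role": "aux_case"},  # another disease from the same dataset
]
batch_ids, comparison_ids = split_samples(samples)
print(batch_ids)       # ['S1', 'S2', 'S3']
print(comparison_ids)  # ['S1', 'S2']
```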

Here are some use cases for cross-dataset comparisons.
In this example, Dataset 1 includes Case 1 and Control 2, and Dataset 2 includes Case 1 and Control 2 groups.

After you click the COMPARE button, the comparison is calculated in the background, either with or without batch correction. Once it finishes, you will see a popup notification. The resulting comparisons can be found on the Summary page of every Dataset included in the cross-dataset comparison. You can explore the gene expression or pathway perturbation results and include the comparisons in a Meta-analysis for subsequent drug target or compound identification.

