DGE_Table.pl - Creates a table containing the gene (miRNA) expression levels of different samples
# Minimal argument call specifying all required parameters. At least one sample has to be specified.
DGE_Table.pl --output table.txt Mapping_day0_1_i_Eland_against_mature_unambiguous.txt

# Maximum argument call specifying all possible parameters. Many different samples may be specified.
DGE_Table.pl --output table.txt --min_length 20 --max_length 25 --max_mm 2 --strand RF --normalisation quantile --format diflog2 Mapping_day0_1_i_Eland_against_mature_unambiguous.txt Mapping_day6_1_i_Eland_against_mature_unambiguous.txt Mapping_day12_1_i_Eland_against_mature_unambiguous.txt Mapping_day24_1_i_Eland_against_mature_unambiguous.txt
--output: The output file. Mandatory parameter.
--strand: Only reads mapping to the specified strand will be used for creating the expression tables. Possible values: R (reverse strand), F (forward strand), RF (both strands). default=RF
--min_length: The minimum length of reads. Shorter reads will not enter the expression tables. default=15
--max_length: The maximum length of reads. Longer reads will not enter the expression tables. default=100
--max_mm: The maximum number of mismatches. Reads having more mismatches will not enter the expression tables. default=2
--format: The formatting of the expression table. default=count. At the moment the following formats are supported:
With count, the expression level will be displayed in counts. If the normalisation off is used, these values represent the actually observed number of reads mapping to a certain gene (miRNA).
The expression level can also be displayed as the log2 of the counts.
With diflog2, the difference of each sample to the first sample will be displayed, using the following equation:
    x = log2(count / count_sample1)
This may be useful to identify differences to a sample which represents a background level of expression or the first time point in a time-course experiment (t0).
--normalisation: MIRO supports several normalisation methods for creating the expression tables. default=scalelinear. At the moment the following normalisation methods are supported:
With off, the samples are not normalised; the actually observed read counts will be displayed.
With scalelinear, the total expression levels will be linearly scaled to a constant level. The individual read counts will be adjusted accordingly. This is the most straightforward normalisation method.
With quantile, the quantile-normalisation method will be used.
The housekeep normalisation (examples: housekeep5, housekeep10, housekeep20) is a variant of the scalelinear method. Instead of using all genes (miRNAs) for calculating the normalisation factor, only the genes having medium expression levels will be used. The genes having the highest and the lowest expression levels will therefore be ignored (only for calculating the normalisation factor, not for the normalisation itself). The housekeep normalisation has to be called with the exact percentage of genes to be skipped, e.g. housekeep20 ignores the 20% highest and the 20% lowest expressed genes. The genes (miRNAs) are weighted by the log2 of the expression level.
Display the help pages.
This script creates a table (matrix) containing the gene (miRNA) expression levels of samples.
Several different normalisation methods may be used and several formatting methods are available.
The expression tables are computed from run_Mapping.pl output files.
One or more files of unambiguously mapped reads.
Each file is assumed to represent one sample, e.g. one tissue or one time point.
The files have to contain unambiguously mapped reads produced by the script run_Mapping or run_Multimapper.
For example:
    24688||Count=3   TACCCTGTAGATCCGAATTTGT    hsa-miR-10a MIMAT0000253 Homo sapiens miR-10a  1  0  F  1
    128318||Count=2  TACCCTGTAGATCCGAATTTGTG   hsa-miR-10a MIMAT0000253 Homo sapiens miR-10a  1  0  F  1
    150952||Count=1  TACCCTGTAGATCCTAATTTGTGT  hsa-miR-10a MIMAT0000253 Homo sapiens miR-10a  1  2  R  1
    212857||Count=1  TACCCTGTAGATCCAAATTTGT    hsa-miR-10a MIMAT0000253 Homo sapiens miR-10a  1  1  F  1
    317801||Count=1  TACCTTGTAGATCCGAATTTGTG   hsa-miR-10a MIMAT0000253 Homo sapiens miR-10a  1  1  F  1
    389805||Count=1  TACCCTGTATATCCGAATTTGTGG  hsa-miR-10a MIMAT0000253 Homo sapiens miR-10a  1  2  F  1
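A minimal Python sketch of how such lines could be read and then filtered by the length, mismatch and strand options. It rests on assumptions that this documentation does not state: the columns are taken to be tab-separated in the order read ID, sequence, reference, position, mismatches, strand, hits, and Count=N in the read ID is taken to be the number of identical reads. The names Read, parse_line and passes_filters are hypothetical.

```python
import re
from collections import namedtuple

Read = namedtuple("Read", "read_id count sequence reference mismatches strand")


def parse_line(line):
    """Split one mapping line into its fields (assumed column order, see above)."""
    fields = line.rstrip("\n").split("\t")
    read_id, sequence, reference = fields[0], fields[1], fields[2]
    mismatches, strand = int(fields[4]), fields[5]
    # Assumption: "||Count=N" at the end of the read ID gives the read multiplicity.
    match = re.search(r"\|\|Count=(\d+)$", read_id)
    count = int(match.group(1)) if match else 1
    return Read(read_id, count, sequence, reference, mismatches, strand)


def passes_filters(read, min_length=15, max_length=100, max_mm=2, strand="RF"):
    """Apply the documented defaults for read length, mismatches and strand."""
    return (min_length <= len(read.sequence) <= max_length
            and read.mismatches <= max_mm
            and read.strand in set(strand))


# Two of the example lines, written with explicit tabs.
reference = "hsa-miR-10a MIMAT0000253 Homo sapiens miR-10a"
lines = [
    "24688||Count=3\tTACCCTGTAGATCCGAATTTGT\t" + reference + "\t1\t0\tF\t1",
    "150952||Count=1\tTACCCTGTAGATCCTAATTTGTGT\t" + reference + "\t1\t2\tR\t1",
]

reads = [parse_line(line) for line in lines]
forward_only = [r for r in reads if passes_filters(r, strand="F")]
```

With the default settings both reads are kept; restricting the strand to F drops the second read, which maps to the reverse strand.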
The output is a table containing the expression levels for each gene (miRNA) and sample.
The genes (miRNAs) are given as rows and the samples as columns.
The actual values vary with the normalisation method and the formatting.
When --normalisation off and --format count are chosen, the values in the table represent the actually observed number of reads mapping to a certain gene (miRNA).
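The layout described above (genes as rows, samples as columns, raw read counts as values) can be illustrated with a short Python sketch. It assumes that the multiplicities of all reads mapping to the same gene are simply summed per sample; the function name count_table and the toy input are hypothetical and not taken from DGE_Table.pl.

```python
from collections import defaultdict


def count_table(samples):
    """Build a gene x sample matrix of summed read counts.

    `samples` maps sample name -> iterable of (gene, read_count) pairs.
    Returns the sample names (column order) and a dict gene -> list of counts.
    """
    names = list(samples)
    table = defaultdict(lambda: [0] * len(names))
    for column, name in enumerate(names):
        for gene, read_count in samples[name]:
            table[gene][column] += read_count
    return names, dict(table)


# Hypothetical toy input: two samples with a few mapped reads each.
samples = {
    "day0": [("hsa-miR-10a", 3), ("hsa-miR-10a", 2), ("hsa-miR-191", 5)],
    "day6": [("hsa-miR-10a", 1)],
}

names, table = count_table(samples)

# Print the matrix tab-separated, one gene per row.
print("\t".join(["gene"] + names))
for gene in sorted(table):
    print("\t".join([gene] + [str(c) for c in table[gene]]))
```

A gene that is not observed in a sample keeps the count 0 in that column.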
The expression level will be displayed in counts. For example:
                                                               day0     day3     day6    day12    day18    day24
    hsa-mir-191 MI0000465 Homo sapiens miR-191 stem-loop     16666.9   6826.7  26944.6   8692.7   8851.5  16534.4
    hsa-mir-29a MI0000087 Homo sapiens miR-29a stem-loop      1359.4   2704.2  24531.5   4004.6   3931.7  44904.4
    hsa-mir-142 MI0000458 Homo sapiens miR-142 stem-loop      6518.8   3720.0  14015.7   3319.0   3428.3  23975.9
    hsa-mir-10a MI0000266 Homo sapiens miR-10a stem-loop      2177.2   2980.4  18518.0   2138.8   1880.9  15114.3
    hsa-mir-146b MI0003129 Homo sapiens miR-146b stem-loop       4.4      6.2   3607.6    459.6   8479.3  20975.4
The expression level will be displayed as the log2 of the counts. For example:
                                                             day0   day3   day6  day12  day18  day24
    hsa-mir-191 MI0000465 Homo sapiens miR-191 stem-loop     14.0   12.7   14.7   13.1   13.1   14.0
    hsa-mir-29a MI0000087 Homo sapiens miR-29a stem-loop     10.4   11.4   14.6   12.0   11.9   15.5
    hsa-mir-142 MI0000458 Homo sapiens miR-142 stem-loop     12.7   11.9   13.8   11.7   11.7   14.5
    hsa-mir-10a MI0000266 Homo sapiens miR-10a stem-loop     11.1   11.5   14.2   11.1   10.9   13.9
    hsa-mir-146b MI0003129 Homo sapiens miR-146b stem-loop    2.1    2.6   11.8    8.8   13.0   14.4
For each sample, the difference to the first sample will be displayed using the following equation:
    x = log2(count / count_sample1)
This may be useful to identify differences to a sample which represents a background level of expression or the first time point in a time-course experiment (t0). For example:
                                                             day0   day3   day6  day12  day18  day24
    hsa-mir-191 MI0000465 Homo sapiens miR-191 stem-loop      0.0   -1.3    0.7   -0.9   -0.9   -0.0
    hsa-mir-29a MI0000087 Homo sapiens miR-29a stem-loop      0.0    1.0    4.2    1.6    1.5    5.0
    hsa-mir-142 MI0000458 Homo sapiens miR-142 stem-loop      0.0   -0.8    1.1   -1.0   -0.9    1.9
    hsa-mir-10a MI0000266 Homo sapiens miR-10a stem-loop      0.0    0.5    3.1   -0.0   -0.2    2.8
    hsa-mir-146b MI0003129 Homo sapiens miR-146b stem-loop    0.0    0.5    9.7    6.7   10.9   12.2
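The log2 and difference formats can be checked against the example values with a few lines of Python. This is a sketch only: the function names format_log2 and format_diflog2 are hypothetical, and all counts are assumed to be greater than zero, since the documentation does not say how zero counts are handled.

```python
import math


def format_log2(counts):
    """Return log2 of every count (counts must be positive)."""
    return [math.log2(c) for c in counts]


def format_diflog2(counts):
    """Return x = log2(count / count_sample1) for every sample.

    The first sample is the reference, so its own value is always 0.
    """
    reference = counts[0]
    return [math.log2(c / reference) for c in counts]


# Counts of hsa-mir-191 from the count example (day0 ... day24).
mir191 = [16666.9, 6826.7, 26944.6, 8692.7, 8851.5, 16534.4]

log2_row = [round(v, 1) for v in format_log2(mir191)]
diflog2_row = [round(v, 1) for v in format_diflog2(mir191)]
print(log2_row)
print(diflog2_row)
```

Rounded to one decimal, both rows agree with the hsa-mir-191 rows of the log2 and difference examples, including the slightly negative value for day24, which rounds to zero.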
Perl 5.8 or higher
R 2.7.0 or higher
Robert Kofler
Manuela Hummel
Lauro Sumoy
Heinz Himmelbauer
robert.kofler at crg.es