NAME

DGE_Lineplots.pl - Creates a lineplot illustrating differences (e.g. a time course) in gene (miRNA) expression between samples


SYNOPSIS

 # Minimal argument call specifying all required parameters. At least two different samples have to be specified
 # A lineplot will be drawn for the top 25 expressed miRNAs
 DGE_Lineplots.pl --output lineplot.pdf \
                   Mapping_day0_1_i_Eland_against_mature_unambiguous.txt \
                   Mapping_day6_1_i_Eland_against_mature_unambiguous.txt
 # Minimal argument call to create a lineplot for a specified list of miRNAs.
 # miRNAs have to be given by their IDs and separated by a comma (,)
 DGE_Lineplots.pl --output lineplot.pdf \
                   --ids "hsa-mir-133a-2 MI0000451 Homo sapiens miR-133a-2 stem-loop, hsa-mir-142 MI0000458 Homo sapiens miR-142 stem-loop" \
                   Mapping_day0_1_i_Eland_against_mature_unambiguous.txt \
                   Mapping_day6_1_i_Eland_against_mature_unambiguous.txt
 
 
 # Maximum argument call specifying all possible parameters. Many different samples may be specified
 DGE_Lineplots.pl --output lineplot.pdf --mode pdf --tempdir "/tmp" \
                   --min_length 20 --max_length 25 --max_mm 2 --strand RF \
                   --normalisation quantile --top_n 50 \
                   Mapping_day0_1_i_Eland_against_mature_unambiguous.txt \
                   Mapping_day6_1_i_Eland_against_mature_unambiguous.txt \
                   Mapping_day12_1_i_Eland_against_mature_unambiguous.txt \
                   Mapping_day24_1_i_Eland_against_mature_unambiguous.txt


OPTIONS

--output

The output file name; Mandatory parameter

--strand

Only reads mapping to the specified strand will be used for creating the lineplot. Possible values: R (reverse strand), F (forward strand), RF (both strands); default=RF

--min_length

The minimum length of reads. Shorter reads will not enter the lineplot. default=15

--max_length

The maximum length of reads. Longer reads will not enter the lineplot. default=100

--max_mm

The maximum number of mismatches. Reads having more mismatches will not enter the lineplot. default=2

--top_n

The number of top expressed genes (miRNAs) for which the lineplot should be drawn. default=25

--ids

A comma-separated list of gene (miRNA) IDs for which the lineplot should be created. This option always takes precedence over --top_n, i.e. when both options are specified, the lineplot will be drawn for the given list of gene (miRNA) IDs. Example: --ids "hsa-mir-142 MI0000458 Homo sapiens miR-142 stem-loop, hsa-mir-320a MI0000542 Homo sapiens miR-320a stem-loop"; default=undefined

--mode

The format of the output file, either PostScript or PDF [ps or pdf]; default=pdf

--tempdir

The temporary directory; default=/tmp

--normalisation

MIRO supports several normalisation methods for creating the lineplot; default=scalelinear. At the moment the following normalisation methods are supported:

off

Samples are not normalised; the actual observed read counts will be displayed

scalelinear

The total expression levels will be linearly scaled to a constant level. The individual read counts will be adjusted accordingly. This is the most straightforward normalisation method

quantile

The quantile-normalisation method

housekeep..

Examples: housekeep5, housekeep10, housekeep20;

This normalisation method is a derivative of the scalelinear method. Instead of using all genes (miRNAs) to calculate the normalisation factor, only the genes with medium expression levels are used; the genes with the highest and the lowest expression levels are ignored (only when calculating the normalisation factor, not during the normalisation itself). The housekeep normalisation has to be called with the exact percentage of genes to be skipped, e.g. housekeep20 ignores the 20% highest and the 20% lowest expressed genes. The genes (miRNAs) are weighted by the log2 of the expression level.
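
The idea behind the scalelinear and housekeep methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script's actual implementation: the function names, the target level of 1,000,000 and the ranking of genes by their summed counts across samples are all hypothetical, and the log2 weighting mentioned above is not reproduced because the text does not say how it enters the calculation.

```python
def scale_linear(samples, target=1_000_000):
    """Scale every sample so that its total read count equals `target`.

    `samples` maps a sample name to a dict {gene: read count}.
    The value of `target` is an assumption; the text only says "constant level".
    """
    scaled = {}
    for name, counts in samples.items():
        total = sum(counts.values())
        factor = target / total if total else 0.0
        scaled[name] = {gene: c * factor for gene, c in counts.items()}
    return scaled


def housekeep(samples, percent, target=1_000_000):
    """Like scale_linear, but the factor is computed from mid-expressed genes only.

    The `percent` highest and `percent` lowest expressed genes are skipped
    when the factor is calculated; all genes are still rescaled with it.
    Ranking by the summed count across samples is an assumption.
    """
    genes = sorted(set().union(*(c.keys() for c in samples.values())))
    ranked = sorted(genes, key=lambda g: sum(c.get(g, 0) for c in samples.values()))
    skip = int(len(ranked) * percent / 100)
    middle = ranked[skip:len(ranked) - skip]

    scaled = {}
    for name, counts in samples.items():
        reference_total = sum(counts.get(g, 0) for g in middle)
        factor = target / reference_total if reference_total else 0.0
        scaled[name] = {gene: c * factor for gene, c in counts.items()}
    return scaled
```

In this sketch, housekeep(samples, 20) corresponds to the housekeep20 setting: a few extremely abundant genes no longer dominate the normalisation factor, but their own counts are still rescaled.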

--help

Display the help pages.
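
The precedence of --ids over --top_n described above can be illustrated with a short Python sketch. The function names are hypothetical, and two details are assumptions rather than documented behaviour: that whitespace around each comma-separated ID is trimmed, and that the "top expressed" genes are ranked by a single total count per gene.

```python
def parse_ids(ids_option):
    """Split the value of --ids at commas and trim surrounding whitespace (assumed)."""
    return [part.strip() for part in ids_option.split(",") if part.strip()]


def select_genes(totals, ids_option=None, top_n=25):
    """Return the genes to plot.

    `totals` maps a gene ID to its total read count.
    If `ids_option` is given it wins and `top_n` is ignored.
    """
    if ids_option:
        return parse_ids(ids_option)
    # Highest count first; ties are broken alphabetically to keep the result stable.
    ranked = sorted(totals, key=lambda gene: (-totals[gene], gene))
    return ranked[:top_n]
```

Note that the IDs themselves contain spaces, which is why the list is split at commas only and has to be quoted on the command line.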


DESCRIPTION

General

This script creates a lineplot illustrating the expression level of the specified genes (miRNAs). A lineplot is created either for the top expressed genes (miRNAs) or for a given list of gene IDs. The lineplot is computed from run_Mapping.pl output files.

Input

A list of files containing unambiguously mapped reads. At least two different files have to be provided, where each file is assumed to represent one sample, e.g. one tissue or one time point. The files have to contain unambiguously mapped reads produced by the script run_Mapping or run_Multimapper.

For example:


 24688||Count=3         TACCCTGTAGATCCGAATTTGT          hsa-miR-10a MIMAT0000253 Homo sapiens miR-10a   1       0       F       1
 128318||Count=2        TACCCTGTAGATCCGAATTTGTG         hsa-miR-10a MIMAT0000253 Homo sapiens miR-10a   1       0       F       1
 150952||Count=1        TACCCTGTAGATCCTAATTTGTGT        hsa-miR-10a MIMAT0000253 Homo sapiens miR-10a   1       2       R       1
 212857||Count=1        TACCCTGTAGATCCAAATTTGT          hsa-miR-10a MIMAT0000253 Homo sapiens miR-10a   1       1       F       1
 317801||Count=1        TACCTTGTAGATCCGAATTTGTG         hsa-miR-10a MIMAT0000253 Homo sapiens miR-10a   1       1       F       1
 389805||Count=1        TACCCTGTATATCCGAATTTGTGG        hsa-miR-10a MIMAT0000253 Homo sapiens miR-10a   1       2       F       1
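
How such a file might be read and filtered by --min_length, --max_length, --max_mm and --strand can be sketched in Python. The column interpretation is an assumption inferred from the example rows, not taken from the script: fields are assumed to be tab-separated, the fifth column is taken as the number of mismatches, the sixth as the strand, and Count=N in the first column is taken as the number of identical reads. The function names are hypothetical.

```python
import re
from collections import Counter

# "24688||Count=3" -> 3 (assumed to be the number of identical reads)
COUNT_RE = re.compile(r"\|\|Count=(\d+)$")


def parse_line(line):
    """Parse one tab-separated line; the column meanings are assumptions."""
    fields = line.rstrip("\n").split("\t")
    read_id, seq, target, _col4, mismatches, strand, _col7 = fields[:7]
    match = COUNT_RE.search(read_id)
    return {
        "count": int(match.group(1)) if match else 1,
        "seq": seq,
        "target": target,
        "mismatches": int(mismatches),
        "strand": strand,
    }


def count_reads(lines, min_length=15, max_length=100, max_mm=2, strand="RF"):
    """Sum read counts per gene (miRNA), skipping reads that fail a filter."""
    totals = Counter()
    for line in lines:
        if not line.strip():
            continue
        rec = parse_line(line)
        if not (min_length <= len(rec["seq"]) <= max_length):
            continue
        if rec["mismatches"] > max_mm:
            continue
        if rec["strand"] not in strand:  # "RF" accepts both "R" and "F"
            continue
        totals[rec["target"]] += rec["count"]
    return totals
```

Calling count_reads once per input file would give one count table per sample, which is the kind of data a normalisation step and the lineplot itself would start from.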

Output

The output will be a lineplot (.pdf or .ps)


REQUIREMENTS

Perl 5.8 or higher

R 2.7.0 or higher


AUTHORS

Robert Kofler

Manuela Hummel

Lauro Sumoy

Heinz Himmelbauer


CONTACT

robert.kofler at crg.es