NAME

TrimSequencesTo.pl - Trims all nucleotide sequences at the 3'-end to a given length


SYNOPSIS

 # Minimal argument call, specifying all required parameters.
 TrimSequencesTo.pl --input fastafile.fa --output trimmedsequences.fa
 
 # Maximal argument call, specifying all possible parameters.
 TrimSequencesTo.pl --input fastafile.fa --output trimmedsequences.fa \
                    --trimto 32


OPTIONS

--input

The input file. Must be a multiple fasta file. Mandatory parameter.

--output

The output file. Mandatory parameter.

--trimto

The length, in nucleotides, to which the sequences should be trimmed. Default: 32.

--help

Display the help pages.
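
The options above could be parsed along the following lines. This is only a minimal sketch, assuming the core Getopt::Long module with its GetOptionsFromArray function (not available in the oldest Perl 5.8 releases); the helper name parse_options and its error messages are hypothetical and are not taken from the script itself.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parse the documented options from a list of arguments.
sub parse_options {
    my @args = @_;
    my %opt  = (trimto => 32);    # documented default for --trimto

    GetOptionsFromArray(
        \@args,
        'input=s'  => \$opt{input},
        'output=s' => \$opt{output},
        'trimto=i' => \$opt{trimto},
        'help'     => \$opt{help},
    ) or die "Invalid options\n";

    # With --help nothing else is required.
    return \%opt if $opt{help};

    # --input and --output are documented as mandatory.
    for my $required (qw(input output)) {
        die "Missing mandatory parameter --$required\n"
            unless defined $opt{$required};
    }

    # Assumption: a trim length below 1 is treated as an error.
    die "--trimto must be a positive integer\n" unless $opt{trimto} > 0;

    return \%opt;
}

my $opt = parse_options(qw(--input fastafile.fa --output trimmedsequences.fa));
print "trimto=$opt->{trimto}\n";    # prints "trimto=32"
```

Supplying the default before parsing means the minimal call from the SYNOPSIS needs no special handling when --trimto is absent.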


DESCRIPTION

General

The script trims each nucleotide sequence in a multiple fasta file at the 3'-end to a given length. Sequences that are already at or below that length are left unaltered. Solexa reads usually accumulate mismatches at their 3'-ends, which may impede mapping of the reads. The script may thus be useful, for example, to trim all no-matches of a mapping step to 22 nt and repeat the mapping with these trimmed no-matches.

Input

A multiple fasta file. For example:

 >43||Count=1
 GAAATTTAAGAAACAATTATAATCCAC
 >44||Count=1
 ATTCGCGTTCAGCTGAGGCAGAGTGATGGT
 >45||Count=2
 TCCCTGTGGTCTATTGTTTATGATTCGGCT
 >46||Count=1
 TCCCGGGGCGTCTAGTGGTTAGGGTTTGGCG
 >47||Count=3
 TTCCTGTTGTCTAGTGGTTAGG

Output

A multiple fasta file, containing the trimmed sequences. For example, the sequences shown above trimmed to 22 nt:

 >43||Count=1
 GAAATTTAAGAAACAATTATAA
 >44||Count=1
 ATTCGCGTTCAGCTGAGGCAGA
 >45||Count=2
 TCCCTGTGGTCTATTGTTTATG
 >46||Count=1
 TCCCGGGGCGTCTAGTGGTTAG
 >47||Count=3
 TTCCTGTTGTCTAGTGGTTAGG
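
The trimming step itself can be sketched in a few lines of Perl. This is an illustration only, not the script's actual code: the helper names trim_sequence and trim_fasta_lines are hypothetical, and the sketch assumes that each record holds its sequence on a single line, as in the examples above.

```perl
use strict;
use warnings;

# Hypothetical helper: keep the first $trimto bases (the 5' part) and
# drop the rest of the 3'-end; shorter sequences are returned unchanged.
sub trim_sequence {
    my ($seq, $trimto) = @_;
    return length($seq) > $trimto ? substr($seq, 0, $trimto) : $seq;
}

# Hypothetical helper: header lines (starting with '>') pass through
# untouched, every other line is treated as a sequence and trimmed.
# Assumption: one sequence line per record.
sub trim_fasta_lines {
    my ($trimto, @lines) = @_;
    return map { /^>/ ? $_ : trim_sequence($_, $trimto) } @lines;
}

my @records = (
    '>43||Count=1', 'GAAATTTAAGAAACAATTATAATCCAC',
    '>47||Count=3', 'TTCCTGTTGTCTAGTGGTTAGG',
);
print "$_\n" for trim_fasta_lines(22, @records);
```

Run on records 43 and 47 from the Input example with a length of 22, the sketch prints the same lines as the Output example: record 43 is cut to GAAATTTAAGAAACAATTATAA and record 47, already 22 nt long, is left as it is.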


REQUIREMENTS

Perl 5.8 or higher


AUTHORS

Robert Kofler

Heinz Himmelbauer


CONTACT

robert.kofler at crg.es