Genome Analyzer and HiSeq2000 Sample Preparation

The following procedures for Solexa library preparation are presently in use at the CRG Ultrasequencing Unit and are offered as a service. The sample preparation protocols are identical for sequencing on the Genome Analyzer IIx and on the HiSeq2000.

Please contact Rebecca Curley or Heinz Himmelbauer for further inquiries.

Genomic and Paired-end Genomic

This sample preparation protocol can be used to process genomic DNA, cDNA, or PCR amplicons, both for single reads and paired-end sequencing applications.

Briefly, the protocol consists of the following steps:

  1. Fragmentation of the DNA (Covaris or nebulization)
  2. End-repair
  3. Generation of dA overhangs
  4. Adapter ligation
  5. Size selection and removal of non-ligated adapters by agarose gel electrophoresis
  6. Amplification (10 or more commonly 12 PCR cycles)

Input Material
1-5 ug of DNA should be provided. Care should be taken to avoid contamination with DNA from EF cuvettes, sample loading buffer, common stocks of gel staining buffer, etc. Clean material should be used for all the steps of DNA purification. RNA is not a problem as it cannot be ligated to the adapters.

DNA from genomic clones (e.g. BACs, PACs, fosmids, cosmids) is usually contaminated with DNA from E. coli. In our experience, up to 50% of reads generated from Qiagen-purified BAC DNA map to E. coli. Care should be taken to avoid DNA shearing. We have been successful with conventional alkaline lysis preps (no kits), for which the E. coli background was < 20%.
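The effect of E. coli background on sequencing depth is simple arithmetic, sketched below in Python. The function name and the example read counts are illustrative assumptions; only the 50% and 20% background figures come from the text above.

```python
import math


def total_reads_needed(clone_reads_wanted: int, ecoli_fraction: float) -> int:
    """Total reads to sequence so that `clone_reads_wanted` reads remain
    after discarding the reads that map to E. coli.

    `ecoli_fraction` is the fraction of all reads mapping to E. coli
    (0.5 for the Qiagen-purified BAC DNA case mentioned in the text).
    """
    if not 0.0 <= ecoli_fraction < 1.0:
        raise ValueError("ecoli_fraction must be in the range [0, 1)")
    # Only (1 - ecoli_fraction) of the reads originate from the clone.
    return math.ceil(clone_reads_wanted / (1.0 - ecoli_fraction))


# Hypothetical target of one million clone-derived reads:
print(total_reads_needed(1_000_000, 0.5))  # 50% background doubles the requirement
print(total_reads_needed(1_000_000, 0.2))  # alkaline lysis prep, < 20% background
```

At 50% background, half of the sequencing output is spent on the host genome, which is why a cleaner prep pays off directly in usable reads.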

Size selection
During the size selection step, the insert size range should be kept as narrow as possible (50 nt difference is fine). Common insert size ranges are 300-350 nt or 400-450 nt (insert fragments excluding adapters).

Library amplification
We have successfully performed sequencing runs from templates that have not undergone amplification. Skipping PCR eliminates amplification biases such as the underrepresentation of AT-rich sequences. However, such libraries are tricky to prepare, yields are low, and at least 5 ug of starting material needs to be provided.

Genomic mate pair libraries

This sample preparation protocol can be used to process genomic DNA and DNA from genomic clones (BACs, PACs, fosmids, cosmids) for paired-end sequencing applications. The libraries will generate PE reads with a gap of 2-5 kb relative to the original sample.

Briefly, the protocol consists of the following steps:

  1. Fragmentation of the DNA with Hydroshear or Covaris
  2. End-repair
  3. Biotin labeling
  4. Size selection of fragments with length of interest (2-5 kb)
  5. DNA circularization and digestion of linear, non-circularized DNA
  6. Fragmentation of circularized DNA with Covaris
  7. Purification of biotinylated DNA
  8. End-repair
  9. Generation of dA overhangs
  10. Adapter ligation
  11. Amplification (18 cycles)
  12. Size selection of the PCR amplified library (350-650 bp)

Input Material
10 ug of double-stranded DNA, quantified by Picogreen, should be provided. Please get in touch with us if you cannot provide 10 ug of DNA from your sample. We also require a gel picture of the DNA. Partial degradation of the source DNA will lead to high background in the data, because nicks in DNA strands will be labelled with biotin, and such fragments will be recovered and sequenced as well.

Library insert sizes
The protocol allows paired end sequencing from fragments up to 5 kb.


ChIP-Seq

This sample preparation protocol can be used to process sheared genomic DNA (as obtained from sheared, immunoprecipitated chromatin), but also fragmented cDNA, PCR amplicons, or MNase digests.

Briefly, the protocol consists of the following steps:

  1. End-repair
  2. Generation of dA overhangs
  3. Adapter ligation
  4. Size selection and removal of non-ligated adapters by agarose gel electrophoresis
  5. Amplification (18 cycles)

Input Material
The recommended starting amount of material is 10 ng, although we have been successful with 6 ng in a maximum volume of 40 ul. To obtain this amount of material, one may have to pool several ChIP experiments. The quantification, as for genomic samples, has to be performed using a dsDNA dye (e.g. Picogreen). Nanodrop quantification is of no use (below detection limit).
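As a rough planning aid, the input requirement above can be turned into a pooling estimate. This is a minimal sketch: the function name and the per-experiment yield are hypothetical, and only the 10 ng, 6 ng, and 40 ul figures come from the text.

```python
import math

RECOMMENDED_INPUT_NG = 10.0  # recommended starting amount (from the text)
MINIMUM_INPUT_NG = 6.0       # lowest amount reported to have worked (from the text)
MAX_VOLUME_UL = 40.0         # maximum sample volume (from the text)


def chip_pooling_plan(yield_ng_per_chip: float,
                      target_ng: float = RECOMMENDED_INPUT_NG):
    """Return (number of ChIP experiments to pool,
    minimum concentration in ng/ul needed to fit the target into 40 ul)."""
    if yield_ng_per_chip <= 0:
        raise ValueError("yield_ng_per_chip must be positive")
    experiments = math.ceil(target_ng / yield_ng_per_chip)
    min_concentration = target_ng / MAX_VOLUME_UL
    return experiments, min_concentration


# Hypothetical yield of 3 ng per ChIP, measured with a dsDNA dye:
n, conc = chip_pooling_plan(3.0)
print(f"pool {n} ChIPs; pooled sample must be at least {conc} ng/ul")
```

If the pooled eluates exceed 40 ul, the sample has to be concentrated before submission to reach the minimum concentration.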

It is very important that you indicate to us the average size of DNA obtained during sonication, as there is a size selection step and we don't want to miss the immunoprecipitated DNA. In an ideal case, the sonicated DNA should contain most molecules in the 100-200 nt size range (we can process samples with longer inserts). Please provide a gel picture of the sonicated input DNA.

We advise that no blocking DNA (e.g. from herring sperm) is used during the ChIP process, as it will contaminate the sample. We recommend using Dynabeads or Diagenode agarose beads to avoid this problem. Immunoprecipitated material supplied to us should be RNA-free and purified using QIAGEN columns, eluting in 50 ul of EB buffer.

Size selection
The size range of fragments that we cut from the gel during step 4 of the protocol may depend on the protein whose binding you study. Open chromatin is easier to sonicate and will yield smaller fragments than compacted chromatin. Feedback from users in advance of the experiment may be helpful.

ChIP controls
Two different controls can be sequenced, i.e. input DNA, or DNA immunoprecipitated with pre-immune IgG. At present we cannot advise which control is better. Alignment of reads obtained from sequencing input DNA shows that read distribution is not random (i.e. peaks are observed), reflecting the different accessibility of chromatin for sonication (and perhaps also PCR amplification). IgG lots may differ with regard to stickiness, resulting in varying levels of background.

Small RNA

This sample preparation protocol can be used to process endogenous small RNA molecules (e.g. miRNA), as well as fragmented RNA. Information on strandedness is retained.

The protocol involves the ligation of RNA adapters.

Briefly, the protocol consists of the following steps:

  1. Size selection of small RNA by PAGE
  2. Ligation of 5' RNA adapter
  3. Size selection of ligation products by PAGE, removal of non-ligated adapter
  4. Ligation of 3' RNA adapter
  5. Size selection of ligation products by PAGE, removal of non-ligated adapter
  6. First strand synthesis by reverse transcription
  7. Amplification (15 cycles)

Input material
For miRNA processing, the initial amount required is 10 ug of total RNA per sample, in a maximum volume of 15 ul. The amount can be decreased to 5 ug, also in a maximum volume of 15 ul, when using RNA extraction kits that enrich small RNA molecules (QIAGEN and Ambion offer such products). We have no preference as to which protocol is used for RNA extraction; the sample preparation protocol works equally well with or without enrichment of small RNA as input. Total RNA must have been run on the Bioanalyzer to monitor its integrity. RIN should be as high as possible (> 7); we cannot guarantee good results with lower RIN values.
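The submission criteria for miRNA samples can be written down as a small checklist. The sketch below is only an illustration of the thresholds stated above (10 ug or 5 ug, 15 ul, RIN > 7); the function and parameter names are assumptions, and it is not a tool used by the unit.

```python
def check_small_rna_sample(amount_ug: float, volume_ul: float, rin: float,
                           small_rna_enriched: bool = False) -> list:
    """Return a list of problems with a miRNA sample; empty if it qualifies."""
    problems = []
    # 5 ug suffices only when a small-RNA enrichment kit was used.
    required_ug = 5.0 if small_rna_enriched else 10.0
    if amount_ug < required_ug:
        problems.append(f"amount {amount_ug} ug is below the required {required_ug} ug")
    if volume_ul > 15.0:
        problems.append(f"volume {volume_ul} ul exceeds the 15 ul maximum")
    if rin <= 7.0:
        problems.append(f"RIN {rin} is not above 7; good results are not guaranteed")
    return problems


# Hypothetical samples:
print(check_small_rna_sample(10.0, 15.0, 8.5))                          # qualifies
print(check_small_rna_sample(5.0, 15.0, 8.5))                           # too little RNA
print(check_small_rna_sample(5.0, 12.0, 8.5, small_rna_enriched=True))  # qualifies
```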

RNA size range
The small RNA protocol can also be used to prepare libraries from fragmented RNA up to a size of 50 nt, derived from longer RNA molecules. We advise performing rRNA depletion using commercially available kits such as RiboMinus from Invitrogen (depletion is presently not offered as a service). 100 ng of fragmented RNA per sample is required.

Modifications to the protocol
We have developed modifications to the small RNA protocol, to allow processing of RNA of a size up to 200 nt (Vivancos et al. 2010).


mRNA-Seq

With 5-10 ug of total RNA as starting material, this sample preparation protocol can be used to purify mRNA and to prepare cDNA ready to sequence on the Illumina platform. Information on strandedness is lost during sample preparation. The adapters are suitable for single read and paired-end sequencing.

Briefly, the protocol consists of the following steps:

  1. Purification of mRNA with oligo(dT) beads
  2. mRNA fragmentation
  3. First strand synthesis by random priming, followed by 2nd strand synthesis
  4. End repair
  5. Generation of dA overhangs
  6. Adapter ligation
  7. Size selection and removal of non-ligated adapters by agarose gel electrophoresis
  8. Amplification (15 cycles)
  9. Removal of PCR primers by Purelink

Input material
5-10 ug of total RNA is required. RIN should be checked and should be > 7 (see small RNA sample preparation). Alternatively, 100 ng of mRNA can be used (start at step 2 of protocol). For size selection, we generally retrieve cDNA 300 nt in size, and keep the gel slice containing 200 nt fragments as backup.
Reads generated from the input material will be distributed over the entire length of transcripts, allowing reconstruction of transcript sequences, transcript isoforms, as well as gene expression levels.

Indexing of Solexa samples

Indexing of DNA or cDNA samples
Adapters are suitable for single read and paired-end sequencing. Samples can be pooled after sample preparation and sequenced in a single lane. Samples can be indexed using two different approaches:

  1. 4-base index which is part of the sequence read, at the 5' end of reads (12 indices prepared by us, designed so that a single sequencing error does not convert one index into another one). Reads will be four bases shorter, due to the index.
  2. 6-base index placed within the 3' adapter which is read in a separate sequencing reaction (12 indices available from Illumina). We are currently setting up this approach.
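The design property mentioned for the 4-base indices (a single sequencing error never turns one index into another) is equivalent to requiring a pairwise Hamming distance of at least 2 between all indices. The following Python sketch checks this property; the index sequences are made up for illustration and are not the indices used by the unit.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("indices must have the same length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(indices) -> int:
    """Smallest Hamming distance between any two indices in the set."""
    return min(hamming(a, b) for a, b in combinations(indices, 2))


def single_error_safe(indices) -> bool:
    """True if one substitution error can never convert one index into another."""
    return min_pairwise_distance(indices) >= 2


# Hypothetical 4-base index sets:
good_set = ["ACGT", "CATG", "GTAC", "TGCA"]
bad_set = ["ACGT", "ACGA", "GTAC"]  # ACGT and ACGA differ at one position only

print(single_error_safe(good_set))
print(single_error_safe(bad_set))
```

A minimum distance of 2 means that an index with one error is recognizable as invalid, but it cannot be assigned to a sample unambiguously; correcting single errors would need a minimum distance of 3.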

Indexing of RNA samples
In case of indexing RNA, we ligate 5' adapters with a 6-base index to the RNA. 12 types of RNA adapters with different indices are available. Read length will be six bases shorter, due to the index.
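When the index is read as part of the sequence, as for the 5' indices described above, the first bases of each read identify the sample and the remainder is the insert. Below is a minimal sketch of that split, assuming exact index matching; the index-to-sample table and the read are invented for the example.

```python
def split_index(read: str, index_length: int, sample_by_index: dict):
    """Split a read into (sample name, insert sequence).

    The first `index_length` bases are the index (6 for the RNA adapters,
    4 for the in-read DNA indices). The sample name is None when the index
    is not in the table, e.g. because of a sequencing error.
    """
    index, insert = read[:index_length], read[index_length:]
    return sample_by_index.get(index), insert


# Hypothetical 6-base RNA indices and a hypothetical 36-base read:
samples = {"ACGTAC": "sample_1", "TGCATG": "sample_2"}
read = "ACGTAC" + "TGAGGTAGTAGGTTGTATAGTTTTAGGGTC"

sample, insert = split_index(read, 6, samples)
print(sample, len(read), len(insert))  # usable read length is six bases shorter
```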

Ultrasequencing Unit
Centre for Genomic Regulation (CRG)
Barcelona, Spain