Deep Sequencing Notes


These pages collect my notes on various aspects of deep sequencing, also known as sequencing by synthesis or short-fragment sequencing. My experience is based on an Illumina/Solexa Genome Analyzer II. The work was done for the deep sequencing facility in Basel (at ETH Zurich); these are extra-curricular notes that would otherwise remain unused.

We collected some notes on compiling the pipeline, versions 0.3 and 1.0.

One of the first questions in deep sequencing is whether the data are biased and which types of bias are known.

  • One of the surprising biases we learned about in Lausanne is that later cycles favor specific bases. This is shown in detail here.
  • Another important bias is temperature dependence.
  • In one of our experiments we essentially sequenced the adapter, which was clearly visible in the IVC (intensity versus cycle) plot.
  • One of the biggest problems with the software that ships with the Genome Analyzer is that it relies on a known genome to judge the quality of the reads. This process even modifies the quality values of the reads, so the resulting quality reports depend on the alignments. We present two examples here: the first quality report is based on the eland aligner (quality report), the second on the phage-align aligner. The output of the GA Pipeline is discussed in detail on the file formats page.
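A quick way to look for the cycle-dependent base bias mentioned above is to tabulate the base composition per cycle. The following is only a minimal sketch, assuming the reads are already available as plain strings; the function names `base_composition_per_cycle` and `base_fractions` are made up for this illustration and are not part of the Illumina pipeline.

```python
from collections import Counter


def base_composition_per_cycle(reads):
    """Return a list with one Counter per cycle (read position)."""
    counts = []
    for read in reads:
        for cycle, base in enumerate(read.upper()):
            # Grow the table when a read is longer than any seen so far.
            if cycle == len(counts):
                counts.append(Counter())
            counts[cycle][base] += 1
    return counts


def base_fractions(counts, alphabet="ACGT"):
    """Convert per-cycle counts into fractions; other symbols such as N are ignored."""
    fractions = []
    for cycle_counts in counts:
        total = sum(cycle_counts[base] for base in alphabet)
        fractions.append(
            {base: (cycle_counts[base] / total if total else 0.0) for base in alphabet}
        )
    return fractions


# Toy reads, purely illustrative (not real sequencing data).
reads = ["ACGT", "ACGA", "TCGA", "ACTA"]
fractions = base_fractions(base_composition_per_cycle(reads))
for cycle, f in enumerate(fractions, start=1):
    print(cycle, " ".join(f"{base}={f[base]:.2f}" for base in "ACGT"))
```

If the fractions drift systematically towards one or two bases in the later cycles, that is the kind of bias described in the first item of the list.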

    Illumina software

  • What to do with more than 16M tags? An important problem we had was that the eland aligner could not handle more than 16,000,000 tags. This document explains how to resolve this.
  • A race condition in the goat pipeline.
  • How to split lanes? Explains how to split the output of the analysis pipeline so that each customer gets their own quality report.
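One plausible workaround for the tag limit, not necessarily the one described in the linked document, is to cut the input into chunks that each stay below the limit, align every chunk separately and concatenate the results. The sketch below shows only the chunking step; `chunked` is a hypothetical helper and the chunk size used in the demonstration is deliberately tiny.

```python
from itertools import islice


def chunked(records, max_per_chunk):
    """Yield successive lists of at most max_per_chunk records."""
    if max_per_chunk <= 0:
        raise ValueError("max_per_chunk must be positive")
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, max_per_chunk))
        if not chunk:
            return
        yield chunk


# Demonstration with a tiny limit; for real data the limit would be
# chosen below the 16,000,000 tags the aligner could handle (assumption).
tags = ["ACGT", "TTGA", "CCGA", "GATC", "AATT", "GGCC", "TACG"]
chunks = list(chunked(tags, 3))
print([len(chunk) for chunk in chunks])
```

Because the function consumes an iterator, it also works on an open file handle, so the complete tag file never has to be held in memory at once.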

    Our own software

  • Eland2Wig - An eland2 to wig file converter
  • Eland2Exp - An eland2 to expression file converter. This converts your genome analyzer into a microarray scanner.
  • Sig22Seq - A tool to convert the *.sig2.txt files to filtered sequences.
  • ProbPerSf - A tool to trim the measured fragments to a length that can be trusted.
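To give an idea of what trimming a read to a trusted length can look like, here is a minimal sketch. It is not the ProbPerSf algorithm: it assumes that a per-base error probability is available for every read and simply keeps the longest prefix in which no base exceeds a threshold. The names `trusted_length` and `trim_read` and the default threshold of 0.05 are assumptions made for this illustration.

```python
def trusted_length(error_probs, max_error=0.05):
    """Length of the longest prefix whose bases all have error <= max_error."""
    for position, probability in enumerate(error_probs):
        if probability > max_error:
            return position
    return len(error_probs)


def trim_read(sequence, error_probs, max_error=0.05):
    """Cut a read down to its trusted prefix."""
    if len(sequence) != len(error_probs):
        raise ValueError("sequence and error_probs must have the same length")
    return sequence[:trusted_length(error_probs, max_error)]


# Illustrative values only: the fourth base is the first unreliable one.
print(trim_read("ACGTAC", [0.01, 0.01, 0.02, 0.20, 0.01, 0.30]))
```

A prefix rule is used because the quality of these reads generally degrades in later cycles, so a good base that follows a bad one is not rescued.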

    More Deep Sequencing notes

  • http://analysis.yellowcouch.org/