Genome Technology Center

Development on High-throughput Sequencing Technology: emPCR Titration and Barcode Design


Farbod Babrzadeh
Chunlin Wang
Elijah Wang
Roxana Jalili
Andrea Chan
Ronald W. Davis
Baback Gharizadeh

454 sequencing is a high-throughput DNA sequencing platform based on pyrosequencing. The current platform has an average read length of 250 bases and generates over 100 million bases in a single run. At the Stanford Genome Technology Center, we use this platform for a wide range of applications, from whole-genome sequencing to amplicon sequencing.

The DNA amplification process (emPCR) is limited by the quality of the DNA beads, that is, by the DNA fragments that have been annealed onto them. If the beads are amplified with too high a concentration of DNA fragments, many beads anneal to more than one distinct fragment and yield high error rates when sequenced. Conversely, at too low a concentration of DNA fragments, many beads capture no fragment to amplify and are wasted. While an expensive and time-consuming titration sequencing run is recommended to find the optimal ratio of DNA fragments to beads, our group determined that the bead recovery percentage after amplification can reflect the quality of the DNA beads.
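The trade-off between mixed beads and empty beads can be illustrated with a small calculation. The sketch below is not the group's titration method: it assumes, purely for illustration, that the number of fragments captured per bead follows a Poisson distribution whose mean is the fragment-to-bead ratio, and that the bead recovery percentage roughly tracks the fraction of beads carrying any DNA. The function names are hypothetical.

```python
import math

def bead_loading_fractions(fragments_per_bead):
    """Return (empty, single, mixed) bead fractions.

    Assumption: fragments per bead are Poisson distributed with
    mean equal to the fragment-to-bead ratio.
    """
    lam = fragments_per_bead
    empty = math.exp(-lam)          # beads that captured no fragment
    single = lam * math.exp(-lam)   # beads with exactly one fragment
    mixed = 1.0 - empty - single    # beads with two or more fragments
    return empty, single, mixed

def ratio_from_recovery(recovery_fraction):
    """Estimate the fragment-to-bead ratio from the recovered bead fraction.

    Assumption: recovery = 1 - exp(-ratio), i.e. every bead that
    carries DNA is recovered and no empty bead is.
    """
    if not 0.0 <= recovery_fraction < 1.0:
        raise ValueError("recovery_fraction must be in [0, 1)")
    return -math.log(1.0 - recovery_fraction)

if __name__ == "__main__":
    for lam in (0.1, 0.5, 1.0, 2.0):
        empty, single, mixed = bead_loading_fractions(lam)
        print(f"ratio={lam:.1f}  empty={empty:.2f}  "
              f"single={single:.2f}  mixed={mixed:.2f}")
```

Under these assumptions, a low ratio leaves most beads empty, while a high ratio shifts beads from the single-fragment class into the mixed class, which is the behavior described above.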


Barcode Design

With the advent of high-throughput DNA sequencing technologies, it is becoming common practice to pool multiple barcode-tagged nucleotide samples and sequence them simultaneously, which increases efficiency and decreases cost and labor.
There are two concerns in barcode design:

First, barcode sequences can be altered by barcode synthesis errors, PCR errors and sequencing errors. Second, because the read length of current ‘next-generation’ sequencing platforms is limited, shorter barcodes are desirable to save sequencing capacity for the targets of interest. For these reasons, we have developed an approach to design sets of barcodes that are of minimum length and tolerate sequencing errors.
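As an illustration of what error tolerance means here, the sketch below measures the number of differing positions (Hamming distance) between barcodes and assigns a read tag to a barcode only when the match is unambiguous. It is a minimal example rather than the group's software: the barcode set and function names are hypothetical, and it considers substitution errors only (insertions and deletions would need an edit-distance measure instead).

```python
from itertools import combinations

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))

def min_pairwise_distance(barcodes):
    """Smallest Hamming distance between any two barcodes in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))

def decode(read_tag, barcodes, max_errors):
    """Return the single barcode within max_errors of read_tag, else None."""
    hits = [b for b in barcodes if hamming(read_tag, b) <= max_errors]
    return hits[0] if len(hits) == 1 else None

if __name__ == "__main__":
    # Hypothetical 5-base barcode set, chosen only for illustration.
    barcodes = ["AAAAA", "CCCAA", "GGGGG", "ACGTT"]
    d = min_pairwise_distance(barcodes)
    # With minimum distance d, up to (d - 1) // 2 substitutions
    # can be corrected without ambiguity.
    tolerated = (d - 1) // 2
    print("minimum distance:", d, "correctable substitutions:", tolerated)
    print(decode("AAATA", barcodes, tolerated))  # one error from AAAAA
    print(decode("AACCA", barcodes, tolerated))  # too far from every barcode
```

A set whose barcodes all differ in at least d positions can detect up to d - 1 substitutions and correct up to (d - 1) // 2 of them, which is why the minimum pairwise distance is the quantity to control when shortening barcodes.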
To design the shortest barcodes for sample pooling, we look for the maximal number of barcodes of a given length in which every pair differs in at least a specified number of positions. We represent each candidate barcode as a vertex of a graph and connect two vertices whenever the corresponding barcodes differ in at least that many positions. With this representation, we transform the barcode-design problem into a clique-finding problem. However, in computational complexity theory, the maximum clique problem is NP-hard (its decision version is NP-complete). We therefore present a heuristic approach based on a genetic algorithm, which mimics evolution by natural selection, using the principles of selection, variation and inheritance to find exact or approximate solutions.
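The clique formulation and a genetic-algorithm heuristic can be sketched as follows. This is not the authors' implementation, whose encoding and operators are not described here. As an assumption for illustration, each individual is an ordering of all candidate barcodes, its fitness is the size of the clique obtained by scanning that ordering greedily, and variation is a single random swap. All names and parameter values are hypothetical.

```python
import itertools
import random

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def greedy_clique(order, min_dist):
    """Scan candidates in order, keeping each one that is at least
    min_dist away from every barcode kept so far (a clique in the graph)."""
    chosen = []
    for cand in order:
        if all(hamming(cand, c) >= min_dist for c in chosen):
            chosen.append(cand)
    return chosen

def design_barcodes(length, min_dist, pop_size=12, generations=20, seed=0):
    """Heuristic search for a large barcode set; pop_size should be even."""
    rng = random.Random(seed)
    candidates = ["".join(p) for p in itertools.product("ACGT", repeat=length)]

    def fitness(order):
        return len(greedy_clique(order, min_dist))

    # Each individual is a random ordering of all candidate barcodes.
    population = [rng.sample(candidates, len(candidates))
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the orderings that yield the largest cliques.
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        children = []
        for parent in survivors:
            child = parent[:]                        # inheritance
            i = rng.randrange(len(child))
            j = rng.randrange(len(child))
            child[i], child[j] = child[j], child[i]  # variation
            children.append(child)
        population = survivors + children
    return greedy_clique(max(population, key=fitness), min_dist)

if __name__ == "__main__":
    barcodes = design_barcodes(length=4, min_dist=3)
    print(len(barcodes), barcodes)
```

The result is always a valid barcode set, because the greedy scan only ever adds mutually compatible barcodes, but nothing guarantees that it is the largest possible set. This matches the heuristic nature of the approach described above.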
