Genomic Tools
Aiptasia Genome
The multi-lab effort led by the KAUST Reef Genomics lab to sequence the Aiptasia genome culminated in our open-access PNAS article.
An excellent genome browser with dedicated BLAST search is available from the KAUST Aiptasia genome browser.
All sequencing projects performed in the Pringle lab are deposited in SRA as they are generated. They can be freely downloaded for analysis.
Transcriptome of aposymbiotic CC7 Aiptasia
We assembled the transcriptome of a clonal population of adult, aposymbiotic (dinoflagellate-free) Aiptasia pallida from ~208 million reads, yielding 58,018 contigs. We demonstrated that many of these contigs represent full-length or near-full-length transcripts that encode proteins similar to those from a diverse array of pathways in other organisms, including various metabolic enzymes, cytoskeletal proteins, and neuropeptide precursors. These contigs were uploaded to the NCBI Transcriptome Shotgun Assembly (TSA) Sequence Database under accession numbers JV077153-JV134524.
Many alternatively spliced transcripts were present in our transcriptome. For RNA-Seq experiments, transcripts with highly redundant sequences should be removed before mapping. A FASTA file containing the longest transcript from each cluster is available as the "FOR MAPPING" version.
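To illustrate what keeping the longest transcript from each cluster involves, here is a minimal Python sketch. It is not the script used to build the "FOR MAPPING" file. The header naming scheme ("<cluster>_<isoform>", e.g. "comp1_seq2") and the function names are hypothetical, chosen only for the example; adapt cluster_id() to however your assembly labels its clusters.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def cluster_id(header):
    # ASSUMPTION: hypothetical naming scheme "<cluster>_<isoform>".
    # Everything before the last underscore is treated as the cluster name.
    return header.split()[0].rsplit("_", 1)[0]


def longest_per_cluster(records):
    """Keep the longest sequence seen for each cluster (first one wins ties)."""
    best = {}
    for header, seq in records:
        key = cluster_id(header)
        if key not in best or len(seq) > len(best[key][1]):
            best[key] = (header, seq)
    return list(best.values())


# Tiny in-memory example; in practice, pass an open FASTA file instead.
example = """>comp1_seq1
ATGC
>comp1_seq2
ATGCATGC
>comp2_seq1
GGG
""".splitlines()

representatives = longest_per_cluster(read_fasta(example))
for header, seq in representatives:
    print(">" + header)
    print(seq)
```

Mapping reads against one representative per cluster avoids splitting reads among near-identical isoforms, which is the reason for removing redundant transcripts before RNA-Seq quantification.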
Transcriptome of Symbiotic CC7 Aiptasia (with some algal sequences as well)
Transcripts sharing 99% identity over an alignment that covers at least 20% of both transcripts were clustered, and a single representative transcript was retained for each cluster (.GZ file).
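The clustering rule above can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline that produced the file: it assumes "99% identity" means at least 99%, that pairwise alignments have already been computed by some aligner, and that clusters are formed by single linkage (any qualifying pair joins two clusters). The tuple layout and function names are hypothetical.

```python
def should_cluster(identity, aligned_a, aligned_b, len_a, len_b,
                   min_identity=99.0, min_coverage=0.20):
    """True if the alignment meets the identity threshold and covers
    at least min_coverage of BOTH transcripts."""
    return (identity >= min_identity
            and aligned_a / len_a >= min_coverage
            and aligned_b / len_b >= min_coverage)


def cluster(lengths, hits):
    """Single-linkage clustering with a union-find structure.

    lengths: dict of transcript name -> transcript length (bp)
    hits:    iterable of (name_a, name_b, percent_identity,
             aligned_bp_on_a, aligned_bp_on_b)   # ASSUMED layout
    """
    parent = {name: name for name in lengths}

    def find(x):
        # Follow parent pointers to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, identity, aligned_a, aligned_b in hits:
        if should_cluster(identity, aligned_a, aligned_b,
                          lengths[a], lengths[b]):
            parent[find(a)] = find(b)

    groups = {}
    for name in lengths:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


lengths = {"t1": 1000, "t2": 800, "t3": 500, "t4": 2000}
hits = [
    ("t1", "t2", 99.5, 300, 300),  # passes: 30% of t1 and 37.5% of t2
    ("t2", "t3", 99.9, 50, 50),    # fails: covers only 6.25% of t2
    ("t3", "t4", 98.0, 500, 500),  # fails: identity below 99%
]
clusters = cluster(lengths, hits)
print(clusters)
```

Requiring the coverage threshold on both transcripts prevents a short fragment from pulling a much longer, mostly unrelated transcript into its cluster on the strength of a small shared region.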
Each representative transcript has been classified by its most probable species of origin, using TopSort (a machine-learning algorithm) and genomic read alignment (.txt file).
Fulcrum Read Collapser
"Leveraging the computer power you have into the results you need"
Our lab has developed a pipeline to collapse near-duplicate reads resulting from ultra-high-throughput sequencing (UHTS), specifically Illumina and 454. The pipeline uses the Parallel Python module to spread work across processors or networked computers, and uses MapReduce to allow processing of even larger datasets. Included in this archive are detailed manuals and several helper scripts. http://www.ncbi.nlm.nih.gov/pubmed/22419786
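For readers unfamiliar with the map/reduce pattern applied to read collapsing, the toy Python sketch below shows the general shape of the idea. It is not Fulcrum's algorithm or code: the prefix-based binning, the Hamming-distance threshold, and all function names are assumptions made for illustration, and the bins are processed serially here rather than with Parallel Python.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def map_to_bins(reads, prefix_len=4):
    """Map step: group reads by their first prefix_len bases."""
    bins = defaultdict(list)
    for read in reads:
        bins[read[:prefix_len]].append(read)
    return bins


def reduce_bin(reads, max_mismatches=1):
    """Reduce step: greedily fold each read into the first kept
    representative of the same length within max_mismatches."""
    kept = []  # each entry is [representative_sequence, count]
    for read in reads:
        for entry in kept:
            if (len(entry[0]) == len(read)
                    and hamming(entry[0], read) <= max_mismatches):
                entry[1] += 1
                break
        else:
            kept.append([read, 1])
    return [(seq, count) for seq, count in kept]


def collapse(reads, prefix_len=4, max_mismatches=1):
    """Run the map step once, then reduce every bin independently."""
    bins = map_to_bins(reads, prefix_len)
    collapsed = []
    for prefix in sorted(bins):
        collapsed.extend(reduce_bin(bins[prefix], max_mismatches))
    return collapsed


reads = ["ACGTACGT", "ACGTACGA", "ACGTTTTT", "GGGGCCCC", "GGGGCCCC"]
result = collapse(reads)
print(result)
```

Because each bin is reduced without reference to any other bin, the bins could be handed to separate processors or machines, which is what makes this pattern a natural fit for distributed execution.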
Updated, Oct. 2012: We have updated Fulcrum to be compatible with the new Illumina FASTQ output.
We have added a new option to linenumfast05.py: -n [int], the length of the longest read (default: 101 bp). If your reads are shorter, set -n accordingly AND change the variable MAX_LENGTH at the top of parallel05.py to match. Finally, if your data are in a format other than FASTQ 1.8+, change the OFFSET value in parallel05.py from 33 to 64.
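The OFFSET setting matters because FASTQ quality characters encode Phred scores as ASCII codes shifted by a fixed amount: 33 for Illumina 1.8+ (and Sanger) files, 64 for older Illumina 1.3 to 1.7 files. The short Python sketch below shows the decoding; decode_qualities is a hypothetical helper written for this example and is not part of parallel05.py.

```python
def decode_qualities(quality_string, offset=33):
    """Convert a FASTQ quality string to a list of Phred scores.

    offset=33 suits Illumina 1.8+ / Sanger encoding;
    offset=64 suits older Illumina 1.3-1.7 encoding.
    """
    scores = [ord(ch) - offset for ch in quality_string]
    if scores and min(scores) < 0:
        # A negative score means the offset is too large for this file.
        raise ValueError("negative Phred score: wrong offset for this data?")
    return scores


# The same Phred scores (40, 40, 20, 0) written in each encoding.
phred33 = decode_qualities("II5!", offset=33)
phred64 = decode_qualities("hhT@", offset=64)
print(phred33, phred64)
```

Using offset 64 on Phred+33 data typically produces negative scores, as the check above detects, while using offset 33 on Phred+64 data silently inflates every score by 31, so it is worth confirming the encoding of your files before running the pipeline.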