CeBiTec Colloquium

(unscheduled)

 date 

Thursday, February 10th 2011, 17 c.t.

 location 

G2-104, CeBiTec Building

 speaker 

Dr. Mario Caccamo

The Genome Analysis Centre, Norwich, UK

title 

Short Reads Assembly Strategies for Large Genomes

Next-generation sequencing (NGS) technologies are characterised by their capacity to generate data at very high rates. The sequence reads, however, are short. The availability of high-quality reference genomes for model organisms such as human, mouse and Arabidopsis thaliana has been central in establishing these technologies as the tool of choice for population genetics studies based on re-sequencing. The ability to generate de novo assemblies from short reads for large eukaryotic genomes, however, remains a challenge. Most current assembly tools struggle to deal with the massive datasets generated by these technologies.

Some recent assembly algorithms, such as SOAPdenovo and Cortex, have been designed to offer efficient alternatives for representing these datasets in main memory, but in general the results are assemblies with large numbers of contigs. Resources such as mate-paired reads are key to extending and linking contigs into scaffolds, but representing this vast amount of information in a computer’s main memory is generally prohibitive, and the alternatives result in poor performance. In this presentation we will explore the available assembly algorithms and review the different technologies and strategies for approaching the de novo assembly of large eukaryotic genomes.

 host 

Prof. Dr. Alfred Pühler