BACCardI -
A tool for the validation of genomic assemblies, assisting genome finishing and intergenome comparison.

Genomic assemblies generated from sequence information need to be validated by independent methods such as physical maps. The time-consuming task of building physical maps can be circumvented by virtual clone maps derived from read pair information of large insert libraries.
The graphical tool BACCardI is used for the construction of virtual clone maps from standard assembler output files or BLAST based sequence comparisons. This tool has been applied to numerous genome projects to solve various problems including (a) validation of whole genome shotgun assemblies, (b) support for contig ordering in the finishing phase of a genome project, and (c) intergenome comparison between related strains when only one of the strains has been sequenced and a large insert library is available for the other. The BACCardI software can seamlessly interact with various sequence assembly packages.
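
The idea of deriving a virtual clone from read pair information can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not BACCardI's implementation: the record types ReadPlacement and VirtualClone, the function build_virtual_clones and the example coordinates are all hypothetical, and the sketch assumes that the placement of each end read on a contig has already been extracted from assembler output or a BLAST comparison.

```python
from collections import defaultdict
from typing import NamedTuple


class ReadPlacement(NamedTuple):
    """Hypothetical record: where one end read of a clone lies on a contig."""
    clone_id: str  # name of the large insert clone the read came from
    contig: str    # contig the read was placed on
    start: int     # leftmost position of the read (0-based)
    end: int       # rightmost position of the read (exclusive)


class VirtualClone(NamedTuple):
    """Hypothetical record: the contig span enclosed by a pair of end reads."""
    clone_id: str
    contig: str
    start: int
    end: int


def build_virtual_clones(placements):
    """Pair the end reads of each clone and return the spans they enclose."""
    by_clone = defaultdict(list)
    for placement in placements:
        by_clone[placement.clone_id].append(placement)

    clones = []
    for clone_id, reads in sorted(by_clone.items()):
        # Only a clone with exactly two end reads on one contig yields a span.
        if len(reads) != 2 or reads[0].contig != reads[1].contig:
            continue
        start = min(read.start for read in reads)
        end = max(read.end for read in reads)
        clones.append(VirtualClone(clone_id, reads[0].contig, start, end))
    return clones


placements = [
    ReadPlacement("fos001", "contig1", 1_000, 1_600),
    ReadPlacement("fos001", "contig1", 40_400, 41_000),
    ReadPlacement("fos002", "contig1", 5_000, 5_500),
    ReadPlacement("fos002", "contig2", 200, 800),      # ends on two contigs
    ReadPlacement("fos003", "contig1", 9_000, 9_700),  # only one end placed
]
clones = build_virtual_clones(placements)
```

In this sketch, clones whose ends fall on different contigs are simply skipped, although such pairs are the kind of evidence that could support contig ordering in the finishing phase.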

Download

Download Solaris Version
Download Linux Version
Download README for installing BACCardI

Documentation

A short user's manual can be found here

People

Daniela Bartels, CeBiTec, Uni Bielefeld
Sebastian Kespohl, CeBiTec, Uni Bielefeld
Stefan Albaum, CeBiTec, Uni Bielefeld
Tanja Drüke, CeBiTec, Uni Bielefeld
Alexander Goesmann, CeBiTec, Uni Bielefeld
Olaf Kaiser, Fakultät für Biologie, Uni Bielefeld
Alfred Pühler, Fakultät für Biologie, Uni Bielefeld
Friedhelm Pfeiffer, MPI für Biochemie, Martinsried
Günter Raddatz, MPI für Entwicklungsbiologie, Tübingen
Jens Stoye, Technische Fakultät, Uni Bielefeld
Folker Meyer, CeBiTec, Uni Bielefeld (corresponding author)
Stephan C. Schuster, MPI für Entwicklungsbiologie, Tübingen

Applications

We have used the BACCardI tool in our genome assembly pipeline for several finished and ongoing genome projects:

Automatic generation of large insert size clone maps

Fosmid map Wolinella succinogenes

A virtual large insert clone map built with BACCardI. The circle displays a fosmid map of the 2.1 Mb Wolinella succinogenes genome (MPI für Entwicklungsbiologie, Tübingen). The inner circle of a large insert clone map represents the contigs of a given genome assembly; because this genome is already finished, there is only one contig. The second circle shows the coverage of the contigs by fosmids (or PCR products): green regions are covered by more than one fosmid, yellow regions by exactly one, and regions not covered by any fosmid would be shown in red. To validate the genome sequence completely, PCR products were generated for the regions without fosmid coverage. The third layer shows each large insert clone classified as "ok" as a green arc. The outer layers, which represent clones classified as "problematic", are not shown in this figure.
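
The colour coding of the coverage circle can be illustrated with a small sweep over clone boundaries. The following Python sketch is an assumption-laden illustration, not the code BACCardI uses: the function coverage_segments and the example coordinates are hypothetical, clone spans are taken as half-open intervals, and the contig is treated as linear rather than circular for simplicity.

```python
def colour_for(depth):
    """Map clone coverage depth to the colour scheme described in the text."""
    if depth == 0:
        return "red"
    if depth == 1:
        return "yellow"
    return "green"


def coverage_segments(contig_length, clones):
    """Return (start, end, colour) segments covering the whole contig.

    clones is a list of (start, end) half-open intervals on the contig.
    """
    events = []
    for start, end in clones:
        # Clamp each clone to the contig so that stray coordinates are ignored.
        start = max(0, min(start, contig_length))
        end = max(0, min(end, contig_length))
        if start < end:
            events.append((start, 1))   # a clone begins here
            events.append((end, -1))    # a clone ends here
    events.sort()

    segments = []

    def add(seg_start, seg_end, colour):
        # Merge with the previous segment if the colour did not change.
        if segments and segments[-1][2] == colour:
            segments[-1] = (segments[-1][0], seg_end, colour)
        else:
            segments.append((seg_start, seg_end, colour))

    depth = 0
    position = 0
    for point, delta in events:
        if point > position:
            add(position, point, colour_for(depth))
            position = point
        depth += delta
    if position < contig_length:
        add(position, contig_length, colour_for(depth))
    return segments


segments = coverage_segments(100, [(10, 50), (30, 70)])
```

With the two overlapping example clones, the stretch covered by both would be green, the stretches covered by only one would be yellow, and both uncovered ends of the contig would be red.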

Finding misassemblies in genome projects

ESMs in Sorangium cellulosum


A contig in the early finishing phase of the ongoing Sorangium cellulosum (Kompetenznetzwerk Bielefeld) genome project. Two misassembled regions are indicated by a lack of covering large insert clones (red in lane 3) and by fosmids that appear too long in the assembly. In the BACCardI linear contig view, the ends of fosmids that are too long are displayed as black triangles pointing towards the location of the other end. As a consequence, these triangles point at the misassembled region from both sides. Displaying them in the top line makes them easier to detect and highlights potential misassemblies.
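
How a clone might be flagged as too long, and how its two ends could be turned into markers that point at each other, is sketched below in Python. Everything here is hypothetical: the insert size window of 30 to 50 kb is an assumed value for a fosmid library rather than a BACCardI default, the label "too short" is an assumption that does not appear in the text above, and the names classify_clone and end_markers are invented for illustration.

```python
# Assumed insert size window for a fosmid library, in base pairs.
MIN_INSERT = 30_000
MAX_INSERT = 50_000


def classify_clone(start, end, min_insert=MIN_INSERT, max_insert=MAX_INSERT):
    """Label a clone by the distance between its end reads in the assembly."""
    span = end - start
    if span > max_insert:
        return "too long"
    if span < min_insert:
        return "too short"
    return "ok"


def end_markers(start, end):
    """Markers for a too-long clone: each end points towards its partner."""
    return [(start, ">"), (end, "<")]


clones = {
    "fos001": (1_000, 41_000),     # 40 kb span, within the window
    "fos002": (60_000, 185_000),   # 125 kb span
    "fos003": (70_000, 190_000),   # 120 kb span
}
labels = {name: classify_clone(s, e) for name, (s, e) in clones.items()}
markers = sorted(
    marker
    for name, (s, e) in clones.items()
    if labels[name] == "too long"
    for marker in end_markers(s, e)
)
```

In the example, the markers left of the suspect stretch point right and those right of it point left, which mirrors the triangles converging on a misassembled region from both sides.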

Cluster in Sorangium cellulosum


Misassembly in the finishing phase of the Sorangium cellulosum genome project. The red sector marks the region found by BACCardI's automated misassembly detection. Above the region, a cluster of fosmids that are too long appears, indicating that the data were assembled incorrectly.
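
One simple way to turn a cluster of too-long clones into a flagged region is to intersect the spans of clones that overlap one another. The Python sketch below illustrates that idea only; the text does not describe how BACCardI's automated detection works, so the function flag_suspect_regions, the threshold min_cluster and the example coordinates are all assumptions.

```python
def flag_suspect_regions(too_long_clones, min_cluster=3):
    """Return (start, end, clone_count) for clusters of overlapping clones.

    too_long_clones is a list of (start, end) spans of clones already
    classified as too long. The reported region is the stretch shared by
    every clone in a cluster.
    """
    regions = []
    region_start = region_end = None
    count = 0
    for start, end in sorted(too_long_clones):
        if count and start < region_end:
            # The clone overlaps the shared stretch: narrow the stretch to it.
            region_start = max(region_start, start)
            region_end = min(region_end, end)
            count += 1
        else:
            # No overlap: close the current cluster and start a new one.
            if count >= min_cluster:
                regions.append((region_start, region_end, count))
            region_start, region_end, count = start, end, 1
    if count >= min_cluster:
        regions.append((region_start, region_end, count))
    return regions


too_long = [
    (10_000, 120_000),
    (15_000, 118_000),
    (20_000, 125_000),
    (300_000, 420_000),  # isolated clone, too few to form a cluster
]
regions = flag_suspect_regions(too_long)
```

Requiring several clones per cluster is a design choice in this sketch: a single too-long clone may reflect a chimeric clone or a mislabelled read rather than an assembly error, whereas several independent clones agreeing on the same stretch are stronger evidence.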