GenDB
GenDB is a genome annotation system for prokaryotic genomes. The system has been developed as an extensible and user-friendly framework for both bioinformatics researchers and biologists to use in their genome projects. The GenDB annotation engine automatically identifies, classifies and annotates genes using a large collection of software tools. Many groups view this automatic annotation as a first step that must be followed by expert annotation of the genome.
GenDB offers user interfaces that support expert annotation by large, geographically dispersed teams of experts. Genes to be annotated can be categorized by functional class or gene location. Several functional classification schemes (also called naming schemes or ontologies) are supported: GO, TIGR roles, COG, Monica Riley, and MIPS. In addition to its use as a production genome annotation system, GenDB can be employed as a flexible framework for the large-scale evaluation of different annotation strategies.
The system is available as open source under the GNU General Public License (GPL). The modular system was developed using an object-oriented approach and relies on a relational database backend (e.g. MySQL). Through a well-defined application programming interface (API), the system can easily be linked to other systems.
The software is currently in use in more than a dozen microbial genome annotation projects. Several partners and collaborators have installed GenDB (e.g. the Max Planck computing center in Garching and the group of Rick Stevens at Argonne National Lab, Futures Lab).
SAMS
Every genome project generates thousands of ESTs or shotgun reads. Users are often interested in a first look at the DNA sequence content of the individual reads before they are assembled (in the case of shotgun reads) or clustered (in the case of ESTs). Several steps are necessary to provide the researcher with high-quality sequences as well as an overview of their content. For these purposes we have implemented additional extensions to GenDB within the SAMS system.
SAMS is a simple open source system that is easy to install and maintain. It provides the mechanisms to run a variety of tools on each read or EST and presents the results in a web form. The pipeline includes the processing of the raw sequence data (e.g. base calling, quality and vector clipping), the processing of ESTs using different tools (e.g. BLAST), and the clustering and assembly of the sequences. Finally, the system provides a web-based visualization of the results.
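The idea of running a chain of processing steps on each read can be illustrated with a minimal Python sketch. This is not SAMS code and does not reflect its actual API: the `Read` class, the function names, the quality threshold of 20, the vector prefix `GAATTC`, and the GC-content step (used here as a simple stand-in for an external analysis tool such as BLAST) are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Read:
    """Hypothetical container for one read/EST with per-base quality scores."""
    name: str
    bases: str
    quals: List[int]
    results: Dict[str, str] = field(default_factory=dict)


def quality_clip(read: Read, min_qual: int = 20) -> Read:
    """Trim bases below min_qual from both ends (threshold is an assumption)."""
    start, end = 0, len(read.bases)
    while start < end and read.quals[start] < min_qual:
        start += 1
    while end > start and read.quals[end - 1] < min_qual:
        end -= 1
    return Read(read.name, read.bases[start:end], read.quals[start:end],
                dict(read.results))


def vector_clip(read: Read, vector_prefix: str = "GAATTC") -> Read:
    """Remove an assumed vector sequence if the read starts with it."""
    if read.bases.startswith(vector_prefix):
        n = len(vector_prefix)
        return Read(read.name, read.bases[n:], read.quals[n:],
                    dict(read.results))
    return read


def annotate_gc(read: Read) -> Read:
    """Stand-in for an analysis tool: store the GC fraction as a result."""
    gc = sum(base in "GC" for base in read.bases)
    fraction = gc / len(read.bases) if read.bases else 0.0
    results = dict(read.results)
    results["gc_fraction"] = f"{fraction:.2f}"
    return Read(read.name, read.bases, read.quals, results)


def run_pipeline(read: Read, steps: List[Callable[[Read], Read]]) -> Read:
    """Apply each step in order, passing the output of one to the next."""
    for step in steps:
        read = step(read)
    return read


# Example: two low-quality bases at each end, then an assumed vector prefix.
raw = Read("read_1", "NNGAATTCACGTNN", [5, 5] + [30] * 10 + [5, 5])
processed = run_pipeline(raw, [quality_clip, vector_clip, annotate_gc])
```

Each step takes a read and returns a new one, so further tools can be added to the list without changing the other steps. In a real installation the per-read results would be stored and rendered in the web interface rather than kept in memory.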
Further information about SAMS can be found on the SAMS homepage.