The huge amounts of data acquired from PolyOmics technologies can only be handled with intensive bioinformatics support that provides adequate data management, efficient data analysis algorithms, and user-friendly software applications. Over the past years, we have developed a large-scale software suite to support researchers in genome and metagenome, transcriptome, proteome, and metabolome data analysis. Because all software applications follow a similar design pattern, experimental data and analysis results can be easily integrated to achieve higher-level evaluation and combined visualizations.
Forward-looking projects within the computational genomics group focus on the development of bioinformatics analysis workflows for rapid and parallel data interpretation, including hardware-accelerated software implementations. Conveyor is our novel workflow-processing engine for rapidly deploying new bioinformatics data analysis pipelines, using either the CeBiTec compute cluster for distributed computing or multi-core servers for local execution. With this workflow-based approach, analyses can be designed in an intuitive graphical designer application. A number of ready-to-use processing steps for bioinformatics analyses already exist, with a focus on sequence analysis and sequence annotation. Using the Conveyor2Go tool, an existing workflow can easily be converted into a standalone application.
The program SARUMAN is our first approach based on GPU programming; it boosts the performance of complete and exact short read mapping against reference genomes. It needs no server-class hardware and can be run on any desktop PC with an installed NVIDIA graphics adapter. As a result, various scientific applications can benefit from the parallel computing power provided by the graphics adapters found in many current PCs.
The collection of established software tools at hand provides the basis for efficient, parallel, and data-driven processing. It will be further improved and extended to build a comprehensive bioinformatics technology platform for genome-based systems biology.
CARMEN: Comparative Analysis and in silico Reconstruction of organism-specific MEtabolic Networks
New sequencing technologies provide ultra-fast access to novel microbial genome data. For their interpretation, an efficient bioinformatics pipeline that facilitates in silico reconstruction of metabolic networks is highly desirable. The software tool CARMEN performs in silico reconstruction of metabolic networks to interpret genome data in a functional context.
EDGAR: A Framework for comparative genomics
The introduction of next-generation sequencing approaches has caused a rapid increase in the number of completely sequenced genomes. As a result, it is now feasible to analyze large groups of related genomes in a comparative approach. A main task in comparative genomics is the identification of orthologous genes in different genomes and the classification of genes as core genes or singletons.
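The classification step can be illustrated with a minimal Python sketch. It is not EDGAR's implementation: it assumes that orthologous groups have already been computed by some other method, and the function name `classify_genes`, the genome labels, and the gene identifiers are all hypothetical. A group is treated as core if it has a member in every genome, as a singleton if it occurs in exactly one genome, and as dispensable otherwise.

```python
def classify_genes(ortholog_groups, genomes):
    """Split ortholog groups into core, dispensable, and singleton groups.

    ortholog_groups: iterable of sets of (genome, gene_id) pairs
                     (assumed to be precomputed; hypothetical input format).
    genomes:         iterable of all genome names under comparison.
    """
    all_genomes = set(genomes)
    core, dispensable, singletons = [], [], []
    for group in ortholog_groups:
        # Genomes that contribute at least one gene to this group.
        present = {genome for genome, _gene in group}
        if present == all_genomes:
            core.append(group)          # found in every genome
        elif len(present) == 1:
            singletons.append(group)    # found in exactly one genome
        else:
            dispensable.append(group)   # found in some, but not all, genomes
    return core, dispensable, singletons


# Hypothetical example with three genomes A, B, and C.
example_groups = [
    {("A", "a1"), ("B", "b1"), ("C", "c1")},  # core
    {("A", "a2"), ("B", "b2")},               # dispensable
    {("C", "c3")},                            # singleton
]
core, dispensable, singletons = classify_genes(example_groups, ["A", "B", "C"])
print(len(core), len(dispensable), len(singletons))
```

The core check runs first, so a comparison with only one genome reports every group as core. How paralogs within a single genome should be handled is left open in this sketch.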