The huge amounts of data acquired with PolyOmics technologies can only be handled with intensive bioinformatics support that provides adequate data management, efficient data analysis algorithms, and user-friendly software applications. Over the past years, we have developed a large-scale software suite that supports researchers in genome, metagenome, transcriptome, proteome, and metabolome data analysis. Because all of these applications follow a similar design pattern, experimental data and analysis results can be integrated easily for higher-level evaluation and combined visualization.
Forward-looking projects within the computational genomics group focus on developing bioinformatics analysis workflows for rapid, parallel data interpretation, including hardware-accelerated software implementations. Conveyor is our novel workflow-processing engine for rapidly deploying new bioinformatics data analysis pipelines, using either the CeBiTec compute cluster for distributed computing or multi-core servers for local execution. With this workflow-based approach, analyses are designed in an intuitive graphical designer application. A number of ready-to-use processing steps for bioinformatics analyses already exist, with a focus on sequence analysis and sequence annotation. With the Conveyor2Go tool, an existing workflow can easily be converted into a standalone application.
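The idea of composing an analysis from reusable processing steps can be illustrated with a minimal Python sketch. It does not use Conveyor's API: the step names (`read_fasta`, `min_length`, `gc_content`) and the `run_workflow` helper are hypothetical and only show how linked steps hand data to one another.

```python
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str]  # (sequence id, sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Source step: parse FASTA-formatted lines into (id, sequence) records."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def min_length(n: int):
    """Filter step factory: keep only sequences with at least n bases."""
    def step(records: Iterable[Record]) -> Iterator[Record]:
        return (r for r in records if len(r[1]) >= n)
    return step


def gc_content(records: Iterable[Record]):
    """Annotation step: yield (id, GC fraction) for every sequence."""
    for name, seq in records:
        s = seq.upper()
        fraction = (s.count("G") + s.count("C")) / len(s) if s else 0.0
        yield name, fraction


def run_workflow(source, *steps):
    """Feed the output of each step into the next one and collect the result."""
    data = source
    for step in steps:
        data = step(data)
    return list(data)


# Hypothetical input: two tiny sequences, one of which is too short.
fasta = [">seq1 example", "ACGTGC", ">seq2", "AT"]
result = run_workflow(read_fasta(fasta), min_length(4), gc_content)
```

Because every step consumes and produces an iterator, steps can be reordered or replaced without touching the others, which is the property a workflow designer relies on.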
The program SARUMAN is our first GPU-based approach; it boosts the performance of complete and exact short read mapping against reference genomes. It requires no server-class hardware and runs on any desktop PC with an NVIDIA graphics adapter installed. As a result, various scientific applications can benefit from the parallel computing power of the graphics adapters found in many current PCs.
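What "complete and exact" mapping means can be made concrete with a small CPU-only Python sketch. It is not SARUMAN's GPU implementation; it uses a generic filter-and-verify scheme based on the pigeonhole principle and considers mismatches only (no insertions or deletions). The function names `build_index` and `map_read` and the example sequences are assumptions made for illustration.

```python
from collections import defaultdict


def build_index(reference: str, k: int) -> dict:
    """Map every k-mer of the reference to the list of its start positions."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def map_read(read: str, reference: str, index: dict, k: int, max_mismatches: int):
    """Return every (start, mismatches) with at most max_mismatches mismatches.

    Pigeonhole filter: the read is cut into max_mismatches + 1 disjoint
    segments of length k. Any hit within the error threshold must contain
    at least one of these segments without error, so looking up the
    segments in the index cannot miss a valid position.
    """
    segments = max_mismatches + 1
    if len(read) < segments * k:
        raise ValueError("read too short for this k and error threshold")

    # Filter: collect candidate start positions from exact segment matches.
    candidates = set()
    for s in range(segments):
        offset = s * k
        for pos in index.get(read[offset:offset + k], ()):
            start = pos - offset
            if 0 <= start <= len(reference) - len(read):
                candidates.add(start)

    # Verify: count mismatches at every candidate position.
    hits = []
    for start in sorted(candidates):
        window = reference[start:start + len(read)]
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


# Made-up example data.
reference = "ACGTACGTTAGCACGTAGGT"
read = "ACGTAG"
k = 3
hits = map_read(read, reference, build_index(reference, k), k, max_mismatches=1)
```

The verification of each candidate is independent of all others, which is the kind of workload that parallelizes well on a graphics adapter.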
The collection of established software tools provides the basis for efficient, parallel, and data-driven processing. It will be further improved and extended to build a comprehensive bioinformatics technology platform for genome-based systems biology.
ReadXplorer - Visualization and Analysis of Mapped Sequences
Next-generation sequencing (NGS) technologies produce an ever-growing amount of sequence data. At the same time, sequencing costs are decreasing rapidly, even for complete mammalian genomes. Application scenarios for NGS include sequencing of unknown genomes and of closely related strains, re-sequencing of known genomes, and transcriptome sequencing (RNA-seq).
ReadXplorer is a freely available, comprehensive exploration and evaluation tool for NGS data. It extracts quantity and quality measures for each alignment and uses them to classify the mapped reads. This classification is then taken into account in the different data views and in all supported analysis functions. ReadXplorer is implemented in Java as a NetBeans rich client application. Its modular structure enables developers to create their own highly specialized software modules and plug them into ReadXplorer easily.
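How per-alignment measures can drive a read classification is sketched below. This is an illustration only, written in Python rather than Java for brevity: the `Alignment` type, the labels ("perfect", "best", "other", "unique", "multiple") and the rules behind them are assumptions and may differ from the categories ReadXplorer actually uses.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    read_id: str
    position: int
    mismatches: int  # quality measure of this alignment


def classify(alignments):
    """Label each alignment relative to the other alignments of its read.

    Hypothetical quality labels:
      "perfect" - no mismatches
      "best"    - fewest mismatches seen for this read, but more than zero
      "other"   - every remaining alignment
    The quantity label is "unique" if the read maps once, else "multiple".
    """
    by_read = defaultdict(list)
    for aln in alignments:
        by_read[aln.read_id].append(aln)

    labels = {}
    for group in by_read.values():
        fewest = min(a.mismatches for a in group)
        quantity = "unique" if len(group) == 1 else "multiple"
        for a in group:
            if a.mismatches == 0:
                quality = "perfect"
            elif a.mismatches == fewest:
                quality = "best"
            else:
                quality = "other"
            labels[a] = (quality, quantity)
    return labels


# Made-up alignments for three reads.
alignments = [
    Alignment("r1", 100, 0),
    Alignment("r2", 250, 2),
    Alignment("r2", 900, 3),
    Alignment("r3", 40, 1),
]
labels = classify(alignments)
```

A data view or analysis function could then, for example, restrict itself to alignments labeled "perfect" and "unique" when conservative results are needed.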
CARMEN - Comparative Analysis and in silico Reconstruction of organism-specific MEtabolic Networks
New sequencing technologies provide ultra-fast access to novel microbial genome data. For their interpretation, an efficient bioinformatics pipeline that facilitates in silico reconstruction of metabolic networks is highly desirable. The software tool CARMEN performs in silico reconstruction of metabolic networks to interpret genome data in a functional context.
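One common way to place genome data in a functional context is to match the EC numbers of annotated genes against the reactions of a reference pathway. The Python sketch below illustrates this idea under stated assumptions: it is not CARMEN's implementation, the `reconstruct` function and the gene identifiers are hypothetical, and the three-step excerpt of glycolysis serves only as a tiny reference pathway.

```python
def reconstruct(reference_pathway, annotations):
    """Assign annotated genes to reference reactions via shared EC numbers.

    reference_pathway: list of (reaction description, EC number)
    annotations:       dict mapping gene id -> list of EC numbers
    Returns (network, missing): reactions with supporting genes, and
    reactions for which no annotated gene was found.
    """
    ec_to_genes = {}
    for gene, ec_numbers in annotations.items():
        for ec in ec_numbers:
            ec_to_genes.setdefault(ec, []).append(gene)

    network, missing = {}, []
    for reaction, ec in reference_pathway:
        genes = sorted(ec_to_genes.get(ec, []))
        if genes:
            network[reaction] = genes
        else:
            missing.append(reaction)
    return network, missing


# Excerpt of glycolysis used as an illustrative reference pathway.
reference_pathway = [
    ("glucose -> glucose-6-phosphate", "2.7.1.1"),
    ("glucose-6-phosphate -> fructose-6-phosphate", "5.3.1.9"),
    ("fructose-6-phosphate -> fructose-1,6-bisphosphate", "2.7.1.11"),
]

# Hypothetical genome annotation: gene id -> EC numbers.
annotations = {
    "gene_0001": ["2.7.1.1"],
    "gene_0002": ["2.7.1.11"],
    "gene_0003": ["2.7.1.11", "1.1.1.1"],
}

network, missing = reconstruct(reference_pathway, annotations)
```

Reactions reported in `missing` point either to a genuine gap in the organism's metabolism or to an incomplete annotation, so they are natural candidates for manual inspection.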