Mining complex biodata
Technological developments increase the complexity of today's biodata collections almost daily. The high dimensionality, the large volume and the non-linear dependencies in the data create a demand for new methodologies and tools that enable researchers to find the gold nuggets in their mountains of biodata.
Previous and Current Research
The fields of life science and biotechnology are continuously shaped by the ongoing development of new technologies to analyze biological samples. Bioimaging is one example: owing to recent advances in automation, resolution and dimensionality, it plays an increasingly important role in life science research. Over the last ten years, the Biodata Mining Group has gained considerable experience in developing and evaluating algorithms for the automatic detection and quantification of objects in bioimages and medical images, using concepts from machine learning and neural networks. This approach has been shown to offer considerable advantages, such as flexibility and segmentation accuracy. The applications range from automatic tumor detection and classification in MRI of the female breast, through the detection and segmentation of cells in fluorescence micrographs or histopathological slides, to the segmentation and quantification of corals in underwater video data.
One field of particular interest is the analysis of multivariate bioimages, which have been proposed as a means to study molecular networks and interactions. Techniques that produce such images include high-content imaging (HCI), Multi-Epitope-Ligand-Cartography (MELC), the Toponome Imaging System (TIS), MALDI imaging (MI) and vibrational spectroscopy. The resulting data can be referred to as multivariate images, since a number of variables is associated with each pixel. Recently, the Biodata Mining Group developed a new online platform that enables researchers to analyze such high-dimensional data through the web: BioIMAX (BioImage Management, Analysis and eXploration). The approach combines principles from image processing, unsupervised machine learning, information visualization and new web technology.
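The notion of a multivariate image, in which each pixel carries several variables, can be illustrated with a minimal sketch. It assumes the image is stored as a NumPy array of shape (height, width, channels); the function name `pixels_as_vectors` and the random toy data are hypothetical and are not taken from BioIMAX or from any of the imaging techniques named above.

```python
import numpy as np


def pixels_as_vectors(image):
    """Flatten a (height, width, channels) image into one feature vector per pixel."""
    height, width, channels = image.shape
    # Row-major reshape: pixel (row, col) ends up at index row * width + col.
    return image.reshape(height * width, channels)


# Hypothetical toy data: a 4 x 5 image with 3 variables per pixel.
rng = np.random.default_rng(0)
image = rng.random((4, 5, 3))

vectors = pixels_as_vectors(image)
print(vectors.shape)  # (20, 3)
```

In this layout each row of `vectors` is a point in a feature space with one dimension per variable, which is the kind of representation that unsupervised learning and visualization methods typically take as input.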
In addition, the group has contributed significantly to other fields of bioinformatics, such as the analysis of next-generation sequencing data, including the classification of short sequence reads and the visualization of metagenome data.
Future Projects and Aims
The Biodata Mining Group aims to build new bridges between complex, high-volume data and its users. One important future goal is the development of new methodologies to analyze data from different sources, such as multimodal data or poly-omics data. As an example, consider a biomedical scenario in which a group of cancer patients is analyzed using a multitude of techniques, leading to a complex data set of clinical data, images, microarrays, histopathology and others. Our goal is to develop integrated approaches that analyze this data and make it understandable to users, so that they can develop new hypotheses.
In addition, we believe that an important building block of these bridges between data and users is the development of new techniques for visualizing and interacting with biodata. In developing these techniques, we keep a close eye on the World Wide Web and web technology, since the ongoing evolution of the web and its influence on how we interact with data continuously trigger new ideas and concepts for biodata mining.