Conveyor - A workflow engine for bioinformatics analyses (Dr. Burkhard Linke)

The various fields of the omics life sciences (genomics, transcriptomics, and proteomics, to name a few) have developed at a very high pace over the last decade. New laboratory technologies such as next-generation sequencing and high-throughput GC-MS produce tremendous amounts of data. At the same time, the cost of compute power has dropped to an all-time low, and compute power has become widely available.
Bioinformatics, the field that combines life sciences and computer science, has also seen unparalleled growth over the last decade, and new developments in the life sciences have enabled many new algorithms and applications such as short read mapping. Keeping pace with this development and with the flood of data from the laboratories requires agile and flexible software development approaches.

Unfortunately, conventional software development is too slow for this, even with scripting languages like Perl or Python. This is why workflow systems have been proposed. They break a complex task down into smaller problems, orchestrate solving the sub-problems and merging the solutions, and to some extent also handle data management.
Conveyor is a workflow engine that uses a novel approach to defining and deploying analytical workflows. It is based on a fully generic, object-oriented data model and allows a data-driven, interface-based design of workflows. The engine is extensible through plugins, allowing almost any functionality to be included in a workflow. The applicability of Conveyor to bioinformatics is demonstrated in this thesis by several use cases.
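
To make the idea of a data-driven, interface-based workflow with pluggable processing steps more concrete, the following is a minimal sketch in Python. It is not Conveyor's API or implementation. All names in it (Node, Sequence, register, build_pipeline, run, and the two example plugins) are hypothetical and chosen only for illustration. The sketch assumes that each step declares the type it consumes and the type it produces, and that these declarations are checked when the workflow is assembled.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Tuple, Type

# Hypothetical plugin registry: node types are looked up by name.
NODE_TYPES: Dict[str, Type["Node"]] = {}


def register(name: str):
    """Class decorator that adds a node type to the registry."""
    def decorator(cls):
        NODE_TYPES[name] = cls
        return cls
    return decorator


class Node:
    """Base class of a processing step with typed input and output."""
    input_type: type = object
    output_type: type = object

    def process(self, item):
        raise NotImplementedError


@dataclass(frozen=True)
class Sequence:
    """Illustrative data type passed between nodes."""
    name: str
    residues: str


@register("min_length")
class MinLength(Node):
    """Passes on only sequences of at least `minimum` residues."""
    input_type = Sequence
    output_type = Sequence

    def __init__(self, minimum: int):
        self.minimum = minimum

    def process(self, item: Sequence):
        # Returning None drops the item from the data flow.
        return item if len(item.residues) >= self.minimum else None


@register("gc_content")
class GCContent(Node):
    """Computes the fraction of G and C residues in a sequence."""
    input_type = Sequence
    output_type = float

    def process(self, item: Sequence) -> float:
        s = item.residues.upper()
        return (s.count("G") + s.count("C")) / len(s) if s else 0.0


def build_pipeline(steps: List[Tuple[str, dict]]) -> List[Node]:
    """Instantiate nodes by name and check that adjacent types match."""
    nodes = [NODE_TYPES[name](**kwargs) for name, kwargs in steps]
    for up, down in zip(nodes, nodes[1:]):
        if not issubclass(up.output_type, down.input_type):
            raise TypeError(
                f"{type(up).__name__} produces {up.output_type.__name__}, "
                f"but {type(down).__name__} expects {down.input_type.__name__}"
            )
    return nodes


def run(nodes: List[Node], items: Iterable) -> Iterator:
    """Push each item through the nodes in order; skip dropped items."""
    for item in items:
        for node in nodes:
            item = node.process(item)
            if item is None:
                break
        else:
            yield item
```

Under these assumptions, a workflow is described as data, for example build_pipeline([("min_length", {"minimum": 4}), ("gc_content", {})]), and new functionality is added by registering another Node subclass. Connecting steps in an order whose types do not match is rejected when the workflow is built, before any data is processed.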

The complete thesis is available online.
The most important details about Conveyor have been published as a full paper in the Oxford Bioinformatics Journal.