Workflows and Programmatically Accessible Tools
WPAT09
Course description:
The quantity and size of bioinformatics data are continually growing,
providing rich resources to researchers, but also presenting problems
of interoperability and data management. Workflow technologies offer a
solution to these problems as they enable the automated and systematic
use of distributed bioinformatics data and applications from the
scientist's desktop. This provides a fast and efficient methodology for
conducting large-scale experiments without the overhead of installing
and maintaining local resources. Additionally, data and metadata
management capabilities facilitate the support of the whole in silico
experiment life cycle.
Many workflow systems are built on cutting-edge computer science
technologies. For example, the use of distributed Web
Services by workflow management systems allows co-ordinated access
to supercomputing resources from standard desktop computers. Also, the
use of ontologies for describing workflow processes and for service
description and discovery makes possible automated
workflow composition and the collection of experimental provenance.
Workflows have already been used in many areas of biology, including
transcriptomics, proteomics, systems biology, data integration,
comparative genomics, sequence analysis and structural biology.
However, their use is not restricted to these areas. Any in silico
experiment involving multiple steps in its analysis and multiple data
sources and resources can potentially benefit from using workflows.
Taverna (a component of the myGrid Project)
is a workflow management system that allows users to develop and run
workflows by combining distributed and local analysis tools and data
resources. It has been downloaded over 62500 times and is used by over
350 institutions worldwide, by scientists in domains including, amongst
others, the life sciences, medicine and
astronomy. Many of the workflows that have already been developed are
available for reuse or repurposing from myExperiment,
a repository for publishing workflows and a social networking resource
for sharing expertise and experience. myExperiment also enables
workflow enactment through a web-interface, providing an alternative
mechanism for running workflows and sharing results between groups of
collaborators. Currently, myExperiment has 1400 users and stores over
560 workflows. This pool of workflows and know-how provides scientists
with a wealth of components to integrate into the design of new
experiments.
Aims and objectives:
This course aims to provide
attendees with a "hands-on" introduction to designing and building
workflows in Taverna. We will provide background on the workflow-based
systems available and examples of workflow projects. We will show
practical demonstrations of workflow construction and highlight
associated issues such as provenance, service discovery and workflow
reuse. The objectives of the course are to:
1) Understand and experience the steps involved in good workflow design and implementation
2) Design and build workflows using distributed and local analysis tools and data resources
3) Understand the major issues faced when designing and building workflows
4) Gain experience of where using workflows would be advantageous to your research
Target Audience:
This "tutorial-style" course will benefit anyone
(postgraduate students and researchers) wishing to explore new methods
of designing complex and/or repetitive in silico experiments in the
life sciences. It will also be of interest to those who are already
exploring workflow technology and have use cases in mind.
Course Fee: Euro 160.00
Details about the course and the application process are available at:
http://gtpb.igc.gulbenkian.pt/bicourses/WPAT09/


