Workshop Goals
From NSF Workflow Workshop
In the last two decades, workflows and process models have been used as a paradigm to organize and manage complex task structures. These workflows typically represent business processes. They are often executed by a variety of parties and involve elaborate partial products, as well as requirements and constraints that impose complex relationships among the tasks, often cutting across organizational boundaries. Workflow editors and project management tools abound, and several standards for workflow languages have been developed in recent years.
Recently, workflows have also emerged as a paradigm for conducting large-scale scientific analyses. The structure of a workflow specifies which analysis routines need to be executed, the data flow among them, and relevant execution details. These workflows often need to be executed in distributed environments, where data sources may reside in different physical locations and the steps may have execution requirements calling for high-end computing and memory resources at remote locations. Workflows help manage the coordinated execution of related tasks. They also provide a systematic way to capture scientific methodology and provide provenance information for their results. Scientists in many disciplines are approaching data volumes and gaining access to resource-sharing facilities that would enable a new stage in scientific discovery. Yet robust and flexible workflow creation, mapping, and execution remain largely open research problems. Scientific workflows present new challenges compared with business workflows and other kinds of process models. They typically use large data sets and computationally intensive tasks and require high-end and distributed computing technology. They are also often designed iteratively and interactively, since that is the nature of the scientific exploration and analysis process they reflect. On the other hand, they have simpler requirements in terms of their data flow structure and execution management.
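To make the notion of workflow structure concrete, the following is a minimal sketch, not a description of any particular workflow system. It assumes a workflow can be modeled as a set of named steps plus a map of data dependencies, runs the steps in dependency order, and records a simple provenance entry for each result. The names run_workflow, steps, deps, and the example steps "clean" and "mean" are hypothetical and chosen only for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(steps, deps, inputs):
    """Run a toy workflow and record provenance (illustrative sketch only).

    steps:  step name -> function taking its upstream results as keyword args
    deps:   step name -> list of upstream names (other steps or raw inputs)
    inputs: raw input name -> value
    """
    results = dict(inputs)
    provenance = []
    # static_order() yields every node after all of its predecessors.
    for name in TopologicalSorter(deps).static_order():
        if name in results:
            continue  # raw input, nothing to execute
        upstream = {d: results[d] for d in deps.get(name, [])}
        results[name] = steps[name](**upstream)
        # Provenance here is just: which step ran and which data it used.
        provenance.append({"step": name, "used": sorted(upstream)})
    return results, provenance


# Hypothetical two-step analysis: drop missing values, then average.
steps = {
    "clean": lambda raw: [x for x in raw if x is not None],
    "mean": lambda clean: sum(clean) / len(clean),
}
deps = {"clean": ["raw"], "mean": ["clean"]}

results, provenance = run_workflow(steps, deps, {"raw": [1.0, None, 3.0]})
print(results["mean"])
print(provenance)
```

A real scientific workflow system would additionally have to map steps onto distributed resources, move data between sites, and recover from failures, which are the open problems noted above; this sketch runs everything in a single process.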
The aim of this workshop is to bring together IT researchers and practitioners working on a variety of aspects of workflow management, as well as domain scientists who use workflows for day-to-day data analysis and simulation. Application scientists will be asked to describe requirements and desired new analyses and computations that are not possible with today’s technologies. IT researchers will be asked to identify problems in their specific areas of expertise. The workshop discussions will focus on four main topics:
1. Applications and requirements: What are the requirements of future applications? What new capabilities are needed to support emerging applications?
2. Dynamic workflows and user steering: What are the challenges in supporting dynamic workflows that need to evolve over time as execution data become available? What kinds of techniques can support incremental and dynamic workflow evolution due to user steering?
3. System-level management: What are the challenges in supporting large-scale workflows in a scalable and robust way? What changes are needed in existing software infrastructure? What new research needs to be done to develop better workflow management systems?
4. Data and workflow descriptions: How can workflow descriptions be improved to support usability and scalability? How should data produced as part of the workflows be described? What provenance information needs to be tracked to support scalable data and workflow discovery?
