Towards Shared Repositories of Computational Workflows
Scientific computing has entered a new era of scale and sharing with the arrival of cyberinfrastructure for computational experimentation. A key emerging concept is scientific workflows, which provide a declarative representation of scientific applications as complex compositions of software components and the dataflow among them. Workflow systems manage the execution of workflows on distributed resources, track the provenance of analysis products, and enable rapid reproducibility of results. Current cyberinfrastructure offers well-understood mechanisms for sharing data, instruments, and computing resources. This is not the case for sharing workflows, though there is an emerging movement in the scientific community toward sharing analysis processes.
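As a rough illustration of the workflow concept described above, the following minimal sketch represents a workflow declaratively as named steps with dataflow links, runs the steps in dependency order, and records a simple provenance trace. The step names, the component functions, and the record layout are all hypothetical and are not taken from any particular workflow system.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


# Hypothetical software components; in practice these would be analysis tools.
def load():
    return [3, 1, 2]


def normalize(values):
    largest = max(values)
    return [v / largest for v in values]


def summarize(values):
    return {"n": len(values), "mean": sum(values) / len(values)}


# Declarative description (an assumption for illustration): each step names
# its component and the upstream steps whose outputs it consumes.
WORKFLOW = {
    "load": {"component": load, "inputs": []},
    "normalize": {"component": normalize, "inputs": ["load"]},
    "summarize": {"component": summarize, "inputs": ["normalize"]},
}


def run(workflow):
    """Execute steps in dependency order and record what each one used."""
    graph = {name: step["inputs"] for name, step in workflow.items()}
    results, provenance = {}, []
    for name in TopologicalSorter(graph).static_order():
        step = workflow[name]
        args = [results[dep] for dep in step["inputs"]]
        results[name] = step["component"](*args)
        provenance.append(
            {"step": name, "used": list(step["inputs"]), "generated": results[name]}
        )
    return results, provenance


if __name__ == "__main__":
    results, provenance = run(WORKFLOW)
    print(results["summarize"])
    for record in provenance:
        print(record)
```

Because the description is plain data rather than a script, it could in principle be stored, compared, and shared separately from the components it refers to, which is the property that makes workflow sharing a distinct problem from sharing software.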
We are investigating computational mechanisms for sharing workflows as a key missing element of cyberinfrastructure for scientific research. We are exploring three major research topics. First, we are eliciting the new requirements that workflow sharing poses beyond those of current techniques for sharing software tools and libraries. Second, we want to understand how shared workflow catalogs should be designed. Existing data catalogs are a successful model, but software components require different representations and access functions. Finally, we are studying which sharing paradigms might be appropriate for scientific communities, exploring environments ranging from traditional server-based architectures to wikis to Web 2.0 social sites.
Recent Results
We are investigating several major research areas:
- Reuse through workflow provenance sharing
- Reproducibility through workflow sharing
- Design of shared workflow catalogs
- Paradigms for workflow sharing
- Incentives for workflow sharing
- Making expertise accessible through workflows
Publications
- "Making Data Analysis Expertise Broadly Accessible through Workflows". Matheus Hauder, Yolanda Gil, Ricky Sethi, Yan Liu, and Hyunjoon Jo. Proceedings of the Seventh IEEE International Conference on e-Science, Stockholm, Sweden, December 5-8, 2011. Available as a preprint.
- "Linked Data for Network Science". Paul Groth and Yolanda Gil. Proceedings of Workshop on Linked Science Data (LISD) of the International Semantic Web Conference, Bonn, Germany, 2011.
- "Retrieval of Semantic Workflows with Knowledge Intensive Similarity Metrics". Ralph Bergmann and Yolanda Gil. Proceedings of the Nineteenth International Conference on Case Based Reasoning (ICCBR), Greenwich, London, September 2011. Available as a preprint.
- "The Open Provenance Model Core Specification (v1.1)". Luc Moreau, Ben Clifford, Juliana Freire, Joe Futrelle, Yolanda Gil, Paul Groth, Natalia Kwasnikowska, Simon Miles, Paolo Missier, Jim Myers, Beth Plale, Yogesh Simmhan, Eric Stephan, and Jan Van den Bussche. To appear in Future Generation Computer Systems, 2011. Available as a preprint.
- "A Social Collaboration Argumentation System for Generating Multi-Faceted Answers in Question and Answer Communities". Ricky Sethi and Yolanda Gil. To appear in Proceedings of the AAAI Workshop on Computational Models of Natural Argument, San Francisco, CA, 2011. Available as a preprint.
- "LinkedDataLens: Linked Data as a Network of Networks". Paul Groth and Yolanda Gil. Proceedings of the ACM International Conference on Knowledge Capture (K-CAP), Banff, Alberta, Canada, 2011. Available as a preprint.
- "Provenance Requirements for the Next Version of RDF". Jun Zhao, Christian Bizer, Yolanda Gil, Paolo Missier, and Satya Sahoo. W3C Workshop on RDF Next Steps, Stanford, CA, June 2010. Available as a preprint.
- "Social Task Networks: Personal and Collaborative Task Formulation and Management in Social Networking Sites". Yolanda Gil, Paul Groth, and Varun Ratnakar. AAAI Fall Symposium Series on Proactive Assistant Agents, Arlington, VA, November 2010. Available as a preprint.
Points of Contact
Yolanda Gil (PI)
Students
- Christian Fritz (Post-doctoral student), University of Southern California
- Denny Vrandecic (Post-doctoral student), University of Southern California
- Daniel Garijo (PhD student), Polytechnic University of Madrid
Collaborators
- Ralph Bergmann, University of Trier (Germany)
- Pedro Gonzalez, Universidad Complutense de Madrid (Spain)
- Christopher Mason, Cornell University
- Joel Saltz, Emory University
- Paul Groth, Free University of Amsterdam (Netherlands)
- Luc Moreau, University of Southampton (UK)
- Simon Miles, King's College London (UK)
Funding
This work was supported by the National Science Foundation under grant number IIS-0948429, "Towards Shared Repositories of Computational Workflows", from September 2009 to August 2011.
