Research Projects

 


WRENCH: Workflow Management System Simulation Workbench
http://wrench-project.org

Capitalizing on recent advances in distributed application and platform simulation technology, WRENCH makes it possible to (1) quickly prototype workflows, WMS implementations, and decision-making algorithms; and (2) evaluate and compare alternative options scalably and accurately for arbitrary, and often hypothetical, experimental scenarios. This project will define a generic and foundational software architecture that is informed by current state-of-the-art WMS designs and planned future designs. The implementations of the components in this architecture, taken together, form a generic “scientific instrument” that can be used by workflow users, developers, and researchers. This scientific instrument will be instantiated for several real-world WMSs and used for a range of real-world workflow applications.

Funding Agency: NSF
Role: co-PI

 


Model Integration through Knowledge-Rich Data and Process Composition
http://mint-project.info

Major societal and environmental challenges require forecasting how natural processes and human activities affect one another. There are many areas of the globe where climate affects water resources and therefore food availability, with major economic and social implications. Today, such analyses require significant effort to integrate highly heterogeneous models from separate disciplines, including geosciences, agriculture, economics, and social sciences. Model integration requires resolving semantic, spatio-temporal, and execution mismatches, work that is largely done by hand today and may take more than two years. The Model INTegration (MINT) project will develop a modeling environment that will significantly reduce the time needed to develop new integrated models, while ensuring their utility and accuracy. Research topics to be addressed include: 1) New principle-based semiautomatic ontology generation tools for modeling variables, to ground analytic graphs to describe models and data; 2) A novel workflow compiler using abductive reasoning to hypothesize new models and data transformation steps; 3) A new data discovery and integration framework that finds new sources of data, learns to extract information from both online sources and remote sensing data, and transforms the data into the format required by the models; 4) A new methodology for spatio-temporal scale selection; 5) New knowledge-guided machine learning algorithms for model parameterization to improve accuracy; 6) A novel framework for multi-modal scalable workflow execution; and 7) Novel composable agroeconomic models.

Funding Agency: DARPA
Role: co-PI

 

In Situ Data Analytics for Next Generation Molecular Dynamics Workflows
http://analytics4md.org

Molecular dynamics simulations studying the classical time evolution of a molecular system at atomic resolution are widely recognized in the fields of chemistry, material sciences, molecular biology and drug design; these simulations are one of the most common simulations on supercomputers. Next-generation supercomputers will have dramatically higher performance than do current systems, generating more data that needs to be analyzed (i.e., in terms of number and length of molecular dynamics trajectories). The coordination of data generation and analysis cannot rely on manual, centralized approaches as it does now. This interdisciplinary project integrates research from various areas across programs such as computer science, structural molecular biosciences, and high performance computing to transform the centralized nature of the molecular dynamics analysis into a distributed approach that is predominantly performed in situ. Specifically, this effort combines machine learning and data analytics approaches, workflow management methods, and high performance computing techniques to analyze molecular dynamics data as it is generated, save to disk only what is really needed for future analysis, and annotate molecular dynamics trajectories to drive the next steps in increasingly complex simulations’ workflows.

Funding Agency: NSF
Role: co-PI

 


Pegasus Workflow Management System
http://pegasus.isi.edu

The Pegasus project encompasses a set of technologies that help workflow-based applications execute in a number of different environments, including desktops, campus clusters, grids, and now clouds. Scientific workflows allow users to easily express multi-step computations, for example, retrieving data from a database, reformatting the data, and running an analysis. Once an application is formalized as a workflow, the Pegasus Workflow Management Service can map it onto available compute resources and execute the steps in the appropriate order. Pegasus can easily handle workflows with several million computational tasks.
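As a rough illustration of what "executing the steps in appropriate order" means, the sketch below models the three-step example (retrieve, reformat, analyze) as a dependency graph and derives a valid execution order with Python's standard `graphlib` module. This is not the Pegasus API; the task names and the `execution_order` helper are hypothetical and chosen only for illustration.

```python
from graphlib import TopologicalSorter


def execution_order(dependencies):
    """Return task names in an order that respects every dependency.

    `dependencies` maps each task to the set of tasks it depends on.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical workflow mirroring the example in the text:
# retrieve data, reformat it, then run an analysis.
workflow = {
    "retrieve": set(),          # no prerequisites
    "reformat": {"retrieve"},   # needs the retrieved data
    "analyze": {"reformat"},    # needs the reformatted data
}

order = execution_order(workflow)
print(order)  # ['retrieve', 'reformat', 'analyze']
```

A real workflow management system additionally decides where each task runs and moves data between tasks; the sketch covers only the ordering aspect.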

Funding Agency: NSF
Role: Senior Personnel

 


RACE: Repository and Workflows for Accelerating Circuit Realization
http://race.crc.nd.edu/

RACE will enable researchers and design experts to expand the state of the art in ASIC design through novel cyberinfrastructure and workflow tools that accelerate every phase of discovery, creation, adoption, and use by linking and computing around a repository of user-generated data, including new tools, new IP blocks/libraries, new design flows, training modules, and an experience base documenting best practices to adopt (and pitfalls to avoid).

Funding Agency: DARPA
Role: Senior Personnel

 


Panorama 360: Performance Data Capture and Analysis for End-to-end Scientific Workflows
https://panorama360.github.io

The vision of Panorama 360 is to provide a resource for the collection, analysis, and sharing of performance data about end-to-end scientific workflows executing on DOE facilities. In Panorama 360, we will develop a repository and associated capabilities for data collection, ingestion, and analysis for a broad class of DOE applications that span experimental and simulation science workflows. In particular, the proposed work will focus on workflows that include experimental data generation at DOE facilities.

Funding Agency: DOE
Role: Senior Personnel

 


Boutiques: A cross-platform application repository for science gateways
http://boutiques.github.io

Boutiques is an application repository that allows automatic import and exchange of applications in science gateways. Compared to previous initiatives, our repository relies on Linux containers to solve the problem of application installation in a lightweight manner. In addition, it adopts a flexible application description format which is versatile enough to be used in various science gateways.

 
