Characterizing a High Throughput Computing Workload: The Compact Muon Solenoid (CMS) Experiment at LHC


Presentation held at ICCS 2015 Conference, 2015
Reykjavik, Iceland

High throughput computing (HTC) has aided the scientific community in the analysis of vast amounts of data and computational jobs in distributed environments. To manage these large workloads, several systems have been developed to efficiently allocate and provide access to distributed resources. Many of these systems rely on estimates of job characteristics (e.g., job runtime) to characterize the workload behavior, and in practice these estimates are hard to obtain. In this work, we perform an exploratory analysis of the CMS experiment workload using the statistical recursive partitioning method and conditional inference trees to identify patterns that characterize particular behaviors of the workload. We then propose an estimation process to predict job characteristics based on the collected data. Experimental results show that our process estimates job runtime with 75% accuracy on average, and produces nearly optimal predictions for disk and memory consumption.

 

Related Publication

  • R. Ferreira da Silva, M. Rynge, G. Juve, I. Sfiligoi, E. Deelman, J. Letts, F. Würthwein, and M. Livny, “Characterizing a High Throughput Computing Workload: The Compact Muon Solenoid (CMS) Experiment at LHC,” Procedia Computer Science, vol. 51, pp. 39–48, 2015.
    [Bibtex]
    @article{ferreiradasilva-iccs-2015,
    title = {Characterizing a High Throughput Computing Workload: The Compact Muon Solenoid ({CMS}) Experiment at {LHC}},
    author = {Ferreira da Silva, Rafael and Rynge, Mats and Juve, Gideon and Sfiligoi, Igor and Deelman, Ewa and Letts, James and W\"urthwein, Frank and Livny, Miron},
    journal = {Procedia Computer Science},
    year = {2015},
    volume = {51},
    pages = {39--48},
    note = {International Conference On Computational Science, {ICCS} 2015 Computational Science at the Gates of Nature},
    doi = {10.1016/j.procs.2015.05.190}
    }

 


A science-gateway workload archive to study pilot jobs, user activity, bag of tasks, task sub-steps, and workflow executions


Presentation held at Workshop on Grids, Clouds, and P2P Computing (CGWS), 2012
Rhodes Island, Greece – Euro-Par 2012

Abstract – Archives of distributed workloads acquired at the infrastructure level reputedly lack information about users and application-level middleware. Science gateways provide consistent access points to the infrastructure, and are therefore a valuable source of information for addressing this issue. In this paper, we describe a workload archive acquired at the science-gateway level, and we show its added value on several case studies related to user accounting, pilot jobs, fine-grained task analysis, bag of tasks, and workflows. Results show that science-gateway workload archives can detect workload wrapped in pilot jobs, improve user identification, give information on distributions of data transfer times, make bag-of-task detection accurate, and retrieve characteristics of workflow executions. Some limits are also identified.

 

Related Publication

  • R. Ferreira da Silva and T. Glatard, “A Science-Gateway Workload Archive to Study Pilot Jobs, User Activity, Bag of Tasks, Task Sub-steps, and Workflow Executions,” in Euro-Par 2012: Parallel Processing Workshops, I. Caragiannis, M. Alexander, R. Badia, M. Cannataro, A. Costan, M. Danelutto, F. Desprez, B. Krammer, J. Sahuquillo, S. Scott, and J. Weidendorfer, Eds., 2013, vol. 7640, pp. 79–88.
    [Bibtex]
    @incollection{ferreiradasilva-cgws-2013,
    year = {2013},
    booktitle = {Euro-Par 2012: Parallel Processing Workshops},
    volume = {7640},
    series = {Lecture Notes in Computer Science},
    editor = {Caragiannis, Ioannis and Alexander, Michael and Badia, Rosa Maria and Cannataro, Mario and Costan, Alexandru and Danelutto, Marco and Desprez, Fr\'ed\'eric and Krammer, Bettina and Sahuquillo, Julio and Scott, Stephen L. and Weidendorfer, Josef},
    doi = {10.1007/978-3-642-36949-0_10},
    title = {A Science-Gateway Workload Archive to Study Pilot Jobs, User Activity, Bag of Tasks, Task Sub-steps, and Workflow Executions},
    author = {Ferreira da Silva, Rafael and Glatard, Tristan},
    pages = {79--88}
    }

 
