Mohammadi, Somayeh and PourKarimi, Latif and Zschäbitz, Manuel and Aretz, Tristan and De Mecquenem, Ninon and Leser, Ulf and Reinert, Knut (2024) Optimizing Job/Task Granularity for Metagenomic Workflows in Heterogeneous Cluster Infrastructures. In: EDBT/ICDT 2024 Joint Conference: 8th International workshop on Data Analytics solutions for Real-LIfe APplications (DARLI-AP).
Full text not available from this repository.
Official URL: https://ceur-ws.org/Vol-3651/DARLI-AP-15.pdf
Abstract
Data analysis workflows are popular for sequencing activities in large-scale and complex scientific processes. Scheduling approaches attempt to find an appropriate assignment of workflow tasks to computing nodes in order to minimize the makespan in heterogeneous cluster infrastructures. A common feature of these approaches is that they assume the structure of the workflow is known in advance. However, for many workflows, a high degree of parallelization can be achieved by splitting the large input data of a single task into chunks and processing them independently. We call this problem task granularity: it involves finding an assignment of tasks to computing nodes while simultaneously optimizing the structure of a bag of tasks. Accordingly, this paper addresses the problem of task granularity for metagenomic workflows. To this end, we first formulated the problem as a mathematical model. We then solved the proposed model using a genetic algorithm. To overcome the challenge of not knowing the number of tasks in advance, we set the number of tasks to a multiple of the number of computing nodes. The procedure of increasing the number of tasks is performed interactively and evolutionarily. Experimental results showed that a desirable makespan value can be achieved after only a few such increases.
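The idea in the abstract, growing the number of tasks as a multiple of the number of nodes and watching the makespan, can be illustrated with a small sketch. This is not the paper's method: a simple greedy earliest-finish heuristic stands in for the genetic algorithm, and the function names (`greedy_makespan`, `grow_task_count`), the equal-sized chunks, the per-task `overhead`, and the stopping `tolerance` are all assumptions made for illustration.

```python
from typing import List, Tuple


def greedy_makespan(total_work: float, num_tasks: int,
                    speeds: List[float], overhead: float) -> float:
    """Split total_work into num_tasks equal chunks, place each chunk on the
    node that would finish it earliest, and return the resulting makespan.

    Assumption: chunks are equal-sized and each task pays a fixed overhead.
    """
    chunk = total_work / num_tasks
    finish = [0.0] * len(speeds)  # current finish time of every node
    for _ in range(num_tasks):
        # Node on which this chunk would complete first (ties: lowest index).
        best = min(range(len(speeds)),
                   key=lambda i: finish[i] + overhead + chunk / speeds[i])
        finish[best] += overhead + chunk / speeds[best]
    return max(finish)


def grow_task_count(total_work: float, speeds: List[float], overhead: float,
                    max_factor: int = 8,
                    tolerance: float = 0.01) -> List[Tuple[int, float]]:
    """Try num_tasks = factor * num_nodes for factor = 1, 2, ... and stop once
    the relative makespan improvement drops below tolerance.

    Returns the (factor, makespan) pairs that were evaluated.
    """
    history: List[Tuple[int, float]] = []
    best = float("inf")
    for factor in range(1, max_factor + 1):
        makespan = greedy_makespan(total_work, factor * len(speeds),
                                   speeds, overhead)
        history.append((factor, makespan))
        if best - makespan < tolerance * best:
            break  # no meaningful improvement over the previous step
        best = makespan
    return history


if __name__ == "__main__":
    # Hypothetical cluster: two nodes, the second twice as fast as the first.
    for factor, makespan in grow_task_count(12.0, [1.0, 2.0], overhead=0.0):
        print(f"factor={factor} tasks={factor * 2} makespan={makespan}")
```

In this toy setting, finer chunks let the faster node take a larger share of the work, so the makespan falls over the first few increases and then stops improving. A nonzero `overhead` would additionally penalize very fine granularity.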
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Subjects: | Mathematical and Computer Sciences > Computer Science |
Divisions: | Department of Mathematics and Computer Science > Institute of Computer Science > Algorithmic Bioinformatics Group |
ID Code: | 3144 |
Deposited By: | Anja Kasseckert |
Deposited On: | 18 Apr 2024 11:34 |
Last Modified: | 18 Apr 2024 11:34 |