Repository: Freie Universität Berlin, Math Department

Optimizing Job/Task Granularity for Metagenomic Workflows in Heterogeneous Cluster Infrastructures

Mohammadi, Somayeh and PourKarimi, Latif and Zschäbitz, Manuel and Aretz, Tristan and De Mecquenem, Ninon and Leser, Ulf and Reinert, Knut (2024) Optimizing Job/Task Granularity for Metagenomic Workflows in Heterogeneous Cluster Infrastructures. In: EDBT/ICDT 2024 Joint Conference: 8th International workshop on Data Analytics solutions for Real-LIfe APplications (DARLI-AP).

Full text not available from this repository.

Official URL: https://ceur-ws.org/Vol-3651/DARLI-AP-15.pdf

Abstract

Data analysis workflows are popular for sequencing activities in large-scale and complex scientific processes. Scheduling approaches attempt to find an appropriate assignment of workflow tasks to computing nodes in order to minimize the makespan in heterogeneous cluster infrastructures. A common feature of these approaches is that they assume the structure of the workflow is known in advance. However, for many workflows, a high degree of parallelization can be achieved by splitting the large input data of a single task into chunks and processing them independently. We call this problem task granularity: it involves finding an assignment of tasks to computing nodes while simultaneously optimizing the structure of a bag of tasks. Accordingly, this paper addresses the problem of task granularity for metagenomic workflows. To this end, we first formulated the problem as a mathematical model. We then solved the proposed model using a genetic algorithm. To overcome the challenge of not knowing the number of tasks, we set the number of tasks to a multiple of the number of computing nodes. The procedure of increasing the number of tasks is performed interactively and evolutionarily. Experimental results showed that a desirable makespan value can be achieved after only a few such increases.
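
The idea of growing the task count as a multiple of the node count can be illustrated with a small sketch. It is not the paper's mathematical model or its genetic algorithm: as a stand-in, it uses a simple greedy earliest-finish assignment. The function names, the equal-sized chunks, the per-task `overhead` term, the node speeds, and the stopping tolerance `tol` are all assumptions made for illustration only.

```python
def makespan(total_work, node_speeds, num_tasks, overhead):
    """Greedy estimate (an assumption, not the paper's GA): split total_work
    into num_tasks equal chunks and give each chunk to the node that would
    finish it earliest."""
    chunk = total_work / num_tasks
    finish = [0.0] * len(node_speeds)
    for _ in range(num_tasks):
        # Choose the node with the smallest finish time after taking this chunk.
        best = min(
            range(len(node_speeds)),
            key=lambda i: finish[i] + overhead + chunk / node_speeds[i],
        )
        finish[best] += overhead + chunk / node_speeds[best]
    return max(finish)


def grow_task_count(total_work, node_speeds, overhead, max_factor=10, tol=0.01):
    """Try task counts of factor * number_of_nodes for factor = 1, 2, ...
    and stop once the relative makespan improvement falls below tol."""
    n = len(node_speeds)
    best_factor = 1
    best_span = makespan(total_work, node_speeds, n, overhead)
    history = [(1, best_span)]
    for factor in range(2, max_factor + 1):
        span = makespan(total_work, node_speeds, factor * n, overhead)
        history.append((factor, span))
        if span < best_span * (1 - tol):
            best_factor, best_span = factor, span
        else:
            break  # no meaningful improvement from finer granularity
    return best_factor, best_span, history


if __name__ == "__main__":
    # Hypothetical cluster: three nodes with relative speeds 1, 2 and 3.
    factor, span, history = grow_task_count(120, [1, 2, 3], overhead=0.0)
    print(factor, span, history)
```

In this toy setting, finer chunks let the faster nodes take a larger share of the work, so the makespan drops until it reaches the bound given by total work divided by total speed. A non-zero `overhead` would penalize very fine granularity.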

Item Type: Conference or Workshop Item (Paper)
Subjects: Mathematical and Computer Sciences > Computer Science
Divisions: Department of Mathematics and Computer Science > Institute of Computer Science > Algorithmic Bioinformatics Group
ID Code: 3144
Deposited By: Anja Kasseckert
Deposited On: 18 Apr 2024 11:34
Last Modified: 18 Apr 2024 11:34
