Description: A helper tool for executing embarrassingly parallel computing tasks in ARC-based grids.
Abstract: Grid computing can be very effective when an analysis task can be split into numerous independent sub-tasks. Such tasks are generally referred to as embarrassingly parallel computing tasks. A typical example is running the same simulation several times with different parameter settings. Another common example is performing the same analysis on a large set of input files.
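As an illustration of how such a task splits into independent sub-tasks, the following minimal Python sketch expands a parameter grid into one sub-task per parameter combination. The function name `make_subtasks` and the parameters `temperature` and `seed` are hypothetical and chosen only for this example; they are not part of arcrunner or ARC.

```python
from itertools import product


def make_subtasks(param_grid):
    """Expand a parameter grid into one independent sub-task per combination."""
    names = sorted(param_grid)  # fixed order so the output is reproducible
    value_lists = [param_grid[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


# Hypothetical simulation parameters, for illustration only.
grid = {"temperature": [280, 300, 320], "seed": [1, 2]}
subtasks = make_subtasks(grid)

for index, params in enumerate(subtasks):
    # Each sub-task depends only on its own parameters, so each one
    # could be run as a separate grid job.
    print(index, params)
```

Because no sub-task needs the output of another, the sub-tasks can run in any order and on any cluster.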
Running embarrassingly parallel computing tasks in a grid environment is in principle straightforward: the user creates the grid job files, submits all the jobs to the grid and, once the jobs have finished, collects the results and merges them. However, this seemingly straightforward approach is not always the most efficient one.
A grid job manager tool called arcrunner can be used to run large embarrassingly parallel computing tasks easily and effectively in the ARC environment. You can use arcrunner on the CSC Hippu servers (hippu.csc.fi), or you can download it to your local Linux or Mac OS X computer.
Arcrunner automates several steps that are needed when embarrassingly parallel computing tasks are executed in ARC-based grid environments. It submits the grid jobs gradually, keeping the load on the grid environment at a suitable level. When a grid job has finished, the results are retrieved automatically or, if the job has failed, the job is automatically resubmitted to another cluster.
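The general pattern described above (gradual submission, automatic retrieval, resubmission to another cluster) can be sketched as a small Python simulation. This is not arcrunner's actual implementation and it does not call any ARC client commands. The names `run_jobs`, `run_once`, `max_active` and `max_attempts`, as well as the cluster names, are assumptions made for this example, and `run_once` is a stand-in for submitting a job and waiting for its outcome.

```python
import random
from collections import deque


def run_jobs(jobs, clusters, max_active, run_once, max_attempts=5):
    """Run jobs with throttled submission and automatic resubmission.

    run_once(job, cluster) is a hypothetical callback that returns the job's
    result or raises RuntimeError if the job failed on that cluster.
    """
    # Each queue entry is (job, attempts so far, cluster of the last attempt).
    pending = deque((job, 0, None) for job in jobs)
    results = {}
    failed = []
    while pending:
        # Submit gradually: at most max_active jobs are in the grid at once.
        active = []
        while pending and len(active) < max_active:
            job, attempts, last = pending.popleft()
            # Prefer a cluster other than the one that just failed.
            candidates = [c for c in clusters if c != last] or list(clusters)
            active.append((job, attempts + 1, random.choice(candidates)))
        # In this simulation every active job completes within one round.
        for job, attempts, cluster in active:
            try:
                results[job] = run_once(job, cluster)  # retrieve the result
            except RuntimeError:
                if attempts < max_attempts:
                    pending.append((job, attempts, cluster))  # resubmit
                else:
                    failed.append(job)
    return results, failed


def flaky_run(job, cluster):
    """Simulated execution: the hypothetical cluster 'bad' always fails."""
    if cluster == "bad":
        raise RuntimeError("job failed on " + cluster)
    return job * job


results, failed = run_jobs(range(10), ["bad", "good"], max_active=3,
                           run_once=flaky_run)
print(len(results), "jobs finished,", len(failed), "jobs given up")
```

In the sketch, a job that fails on one cluster is retried on a different one, so every job eventually succeeds here even though one of the two simulated clusters never works. A real job manager would poll job states asynchronously rather than processing the active jobs in rounds.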