GNU Parallel

Shell tool for executing jobs in parallel using one or more computers

Introduction

Parallel is a CLI tool for running jobs in parallel. It lets you process many repetitive tasks concurrently, which makes it very useful for CLI-based automation.

Installation

macOS

$ brew install parallel

Linux (Debian/Ubuntu)

$ sudo apt install parallel

Usage

Usage:

parallel [options] [command [arguments]] < list_of_arguments
parallel [options] [command [arguments]] (::: arguments|:::: argfile(s))...
cat ... | parallel --pipe [options] [command [arguments]]

-j n            Run n jobs in parallel
-k              Keep same order
-X              Multiple arguments with context replace
--colsep regexp Split input on regexp for positional replacements
{} {.} {/} {/.} {#} {%} {= perl code =} Replacement strings
{3} {3.} {3/} {3/.} {=3 perl code =}    Positional replacement strings
With --plus:    {} = {+/}/{/} = {.}.{+.} = {+/}/{/.}.{+.} = {..}.{+..} =
                {+/}/{/..}.{+..} = {...}.{+...} = {+/}/{/...}.{+...}

-S sshlogin     Example: foo@server.example.com
--slf ..        Use ~/.parallel/sshloginfile as the list of sshlogins
--trc {}.bar    Shorthand for --transfer --return {}.bar --cleanup
--onall         Run the given command with argument on all sshlogins
--nonall        Run the given command with no arguments on all sshlogins

--pipe          Split stdin (standard input) to multiple jobs.
--recend str    Record end separator for --pipe.
--recstart str  Record start separator for --pipe.

GNU Parallel can do much more. See 'man parallel' for details

Academic tradition requires you to cite works you base your article on.
If you use programs that use GNU Parallel to process data for an article in a
scientific publication, please cite:

  Tange, O. (2020, November 22). GNU Parallel 20201122 ('Biden').
  Zenodo. https://doi.org/10.5281/zenodo.4284075

This helps funding further development; AND IT WON'T COST YOU A CENT.
If you pay 10000 EUR you should feel free to use GNU Parallel without citing.

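Option summary

Commonly used options

  • -k : keep the output in the same order as the input
  • -j : number of jobs to run concurrently (e.g. -j 5 runs up to 5 jobs at a time)
  • -X : pass multiple arguments to a single command invocation (with context replace)
  • -S : run over ssh (e.g. -S root@192.168.0.1 runs the jobs on that ssh server)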

Parallel scanning

ParamSpider parallel scanning

$ cat domains | parallel -k -q python3 ~/tools/ParamSpider/paramspider.py -d {} -l high -e gif,jpg,jpeg,png,woff,txt,avi,mov,mpeg,webp

Smuggler parallel scanning with 5 jobs

$ cat urls | parallel -j 5 -k -q python3 ~/tools/smuggler/smuggler.py -u {}

References