Easy Parallel Processing with xargs
Hello, I'm Munou.
Sorry it's been so long since my last post, but I'm still alive.
I bought a used Ryzen 5 that, to my surprise, had bent pins, which caused problems such as only one memory stick being recognized, but I'm managing somehow. I also picked up another M-ATX AM4 motherboard cheaply, so I'm thinking of using that Ryzen chip to build a mini PC.
Convenient xargs
xargs - Wikipedia
xargs, a standard UNIX command, is very convenient.
At first I wasn't used to it, but the more I use it, the more versatile it turns out to be. It's a great command that often saves you from writing even a simple loop in a one-liner.
Use Case
For example, let's say you have a file, url-txt, listing the videos you want to save with yt-dlp, one URL per line.
url1
url2
url3
If you start every download at once, the CPU load will be severe, so if you want to run them two at a time, you can write it like this:
cat url-txt | xargs -I {} -P 2 yt-dlp {}
Surprisingly, that's all it takes to process them two at a time.
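If you want to see the effect of -P without actually downloading anything, here is a small sketch. It assumes a stand-in for yt-dlp that just sleeps for one second, and the four URLs are placeholders. Note that -P is not in the traditional POSIX xargs, though the GNU and BSD versions both support it.

```shell
# Stand-in for yt-dlp (an assumption for this demo): each "download" sleeps one second.
# With -P 2, four jobs should take about two seconds instead of four.
start=$(date +%s)
printf '%s\n' url1 url2 url3 url4 |
  xargs -I {} -P 2 sh -c 'sleep 1; echo "finished $1"' _ {}
end=$(date +%s)
elapsed=$((end - start))
echo "elapsed: ${elapsed}s"
```

The URL is handed to sh as a positional argument ($1) rather than pasted into the script text, so odd characters in a URL are not interpreted by the shell.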
Let's break it down
So, how does this work?
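The above is roughly equivalent to the following:
yt-dlp url1 & yt-dlp url2 & wait
yt-dlp url3
(Strictly speaking, xargs starts url3 as soon as either of the first two finishes, rather than waiting for both.)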
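Mechanically, with -I {}, xargs runs the given command once for each input line, replacing {} with that line, and -P 2 lets up to two of those commands run at the same time.
You can think of it as a loop that reads one line at a time and runs the command on it.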
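To check the one-command-per-line behavior safely, the sketch below uses echo in place of yt-dlp (so nothing is downloaded) and compares it with a hand-written while loop. The variable names are just for illustration.

```shell
# Placeholder input: three lines, as in the example above.
urls=$(printf '%s\n' url1 url2 url3)

# xargs version: echo runs once per line, with {} replaced by that line.
with_xargs=$(printf '%s\n' "$urls" | xargs -I {} echo "yt-dlp {}")

# The loop you would otherwise have to write by hand.
with_loop=$(printf '%s\n' "$urls" | while IFS= read -r url; do
  echo "yt-dlp $url"
done)

echo "$with_xargs"
# yt-dlp url1
# yt-dlp url2
# yt-dlp url3
```

Both variables end up holding the same three lines, which is why xargs -I can replace a simple loop.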
What about without options?
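Without options, xargs splits its input on whitespace and appends all the resulting items as arguments to a single invocation of the command (it only splits into several invocations if the command line would become too long).
That only works with commands that accept multiple arguments, so I feel it's safer to use the -I option.
Take the following case: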
cat url-txt | xargs yt-dlp
Here, the following command is what actually gets executed:
yt-dlp url1 url2 url3
So it seems safer to use -I whenever possible, though going without options can be fine when the command accepts multiple arguments.
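The difference is easy to see with a tiny helper that reports how many arguments it received. This is only a sketch: the helper (count_args) and the placeholder URLs are made up for the demo.

```shell
# Hypothetical helper: prints how many arguments it was given, and what they were.
count_args='echo "$# argument(s): $*"'

# Without options: one invocation receives every item.
batched=$(printf '%s\n' url1 url2 url3 | xargs sh -c "$count_args" _)
echo "$batched"
# 3 argument(s): url1 url2 url3

# With -I {}: one invocation per input line.
per_line=$(printf '%s\n' url1 url2 url3 | xargs -I {} sh -c "$count_args" _ {})
echo "$per_line"
# 1 argument(s): url1
# 1 argument(s): url2
# 1 argument(s): url3
```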
Use Case 2
xargs is also useful when you want to run grep or sed only on specific files located with the find command.
find . -name "testfile" | xargs -I {} grep "return 0" {}
This way, you can grep only specific files with a very short command.
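One caveat: xargs still interprets quotes and backslashes in each input line, so a path containing an apostrophe can trip it up. A common workaround, sketched below, is to let find and xargs exchange NUL-separated names with -print0 and -0 (supported by the GNU and BSD tools). The temporary directory and file contents are invented for the demo, and -H is added so grep prints the file name with each match.

```shell
# Throwaway demo tree (hypothetical) whose path contains an apostrophe and spaces.
demo=$(mktemp -d)
mkdir -p "$demo/it's a demo"
printf 'int main(void) {\n  return 0;\n}\n' > "$demo/it's a demo/testfile"

# -print0 and -0 pass file names separated by NUL bytes, so no quoting rules apply.
# Without -I, grep receives all the files in a single invocation.
found=$(find "$demo" -name "testfile" -print0 | xargs -0 grep -H "return 0")
echo "$found"

# Clean up the demo tree.
rm -rf "$demo"
```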
That's all for today.
Until next time.