It only took me 20 years
… to understand how xargs works.
Well, it’s pretty presumptuous of me to say that I understand how xargs works. More accurately, I have a good enough understanding of xargs to be able to use it without immediately running to Google or ChatGPT. Hopefully, dear reader, this will help.
Let’s say I wanna upload a bunch of files whose names match a pattern to S3 (strictly speaking a shell glob, not a regular expression). These are the files I wanted to upload:
❯ ls -al successes-*
-rw-r--r-- 1 mando staff 289743 Jan 16 17:53 successes-00.csv
-rw-r--r-- 1 mando staff 290386 Jan 16 17:52 successes-01.csv
-rw-r--r-- 1 mando staff 287589 Jan 16 17:53 successes-02.csv
-rw-r--r-- 1 mando staff 291275 Jan 16 17:53 successes-03.csv
-rw-r--r-- 1 mando staff 290009 Jan 16 17:53 successes-04.csv
-rw-r--r-- 1 mando staff 149875 Jan 16 17:53 successes-05.csv
Now you’re probably gonna tell me about the --include and --exclude flags to aws s3 cp, to which I’ll say “phooey!”.
translation: I tried it and couldn't make it work
Enter: xargs
Here’s the thing I never got that’s helped me better understand what’s going on: all xargs does is take whatever you send it on standard input and append it to the end of the command you tell it to call. That’s a terrible explanation but maybe an example will help:
❯ ls success* | xargs echo
successes-00.csv successes-01.csv successes-02.csv successes-03.csv successes-04.csv successes-05.csv successes.csv
All it’s doing is taking the output from ls success* and sending it to echo, turning the command into:
❯ echo successes-00.csv successes-01.csv successes-02.csv successes-03.csv successes-04.csv successes-05.csv successes.csv
successes-00.csv successes-01.csv successes-02.csv successes-03.csv successes-04.csv successes-05.csv successes.csv
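If you want to convince yourself that everything really lands on one command line, here’s a minimal sketch. The file names are made up and fed in with printf instead of ls so it runs anywhere, and the little sh -c snippet is just my stand-in for "a command that reports how many arguments it got":

```shell
# Hypothetical file names; printf stands in for ls so no real files are needed.
# sh -c '...' _ prints how many arguments this single invocation received.
appended=$(printf '%s\n' a.csv b.csv c.csv | xargs sh -c 'echo "$# args: $*"' _)
echo "$appended"
# prints: 3 args: a.csv b.csv c.csv
```

One invocation, three arguments tacked onto the end.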
Useful I suppose, but doesn’t do a lot for me at the moment. But in all the examples for xargs people be saying -I a lot - what does that do?
❯ ls success* | xargs -I echo
successes-00.csv
successes-01.csv
successes-02.csv
successes-03.csv
successes-04.csv
successes-05.csv
successes.csv
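AHA! Instead of taking the inputs and turning them into a list of arguments to the xargs command (echo in this case), it iterates over them one at a time and calls the command with each input in turn. (Fine print: -I expects a placeholder right after it, so here it’s actually eating the word echo as that placeholder and xargs is falling back to its default command, which also happens to be echo. Don’t count on this exact form behaving the same on every xargs.) If you toss in -p (for prompt) you’ll see it even clearer: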
❯ ls success* | xargs -p -I echo
/bin/echo successes-00.csv?...y
successes-00.csv
/bin/echo successes-01.csv?...y
successes-01.csv
/bin/echo successes-02.csv?...
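Here’s a small sketch of that one-at-a-time behavior without the prompting. It assumes made-up file names and gives -I an explicit {} placeholder (more on placeholders in a moment); the sh -c snippet is only there to report how many arguments each call received:

```shell
# Hypothetical file names; with -I, xargs runs the command once per input line.
# Each sh -c invocation reports its own argument count and first argument.
per_line=$(printf '%s\n' a.csv b.csv c.csv | xargs -I {} sh -c 'echo "$# arg: $1"' _ {})
echo "$per_line"
# prints:
# 1 arg: a.csv
# 1 arg: b.csv
# 1 arg: c.csv
```

Three lines in, three separate calls out, each with exactly one argument.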
So let’s try it with aws s3 cp:
❯ ls success* | xargs -p -I aws s3 cp ????????? s3://brainstorm-data/dev/mando/
Astute readers will see the problem: we gotta get the filename where the ????????? is, not at the end of the command like we did with echo.
But fear not! Our new best friend -I is here to help! We can pass some placeholder text to -I and then use that placeholder text in the command to substitute in the argument.
Again, that’s some terrible words but hopefully an example will help:
❯ ls success* | xargs -p -I {} aws s3 cp {} s3://brainstorm-data/dev/mando/
aws s3 cp successes-00.csv s3://brainstorm-data/dev/mando/?...y
upload: ./successes-00.csv to s3://brainstorm-data/dev/mando/successes-00.csv
aws s3 cp successes-01.csv s3://brainstorm-data/dev/mando/?...
When we call xargs like this:
xargs -p -I {} aws s3 cp {} s3://brainstorm-data/dev/mando/
It takes the input (in this case our filename) and replaces the {} with the input!
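If you’d rather see the substitution without touching S3 at all, one option is a dry run: stick echo in front of the real command so xargs just prints what it would have run. This is only a sketch; the file names are fed in with printf rather than ls, and the bucket path is the one from above:

```shell
# Dry run: echo in front of aws means nothing is uploaded, the command is only printed.
# {} is the placeholder given to -I; each input line replaces it in the command.
planned=$(printf '%s\n' successes-00.csv successes-01.csv |
  xargs -I {} echo aws s3 cp {} s3://brainstorm-data/dev/mando/)
echo "$planned"
# prints:
# aws s3 cp successes-00.csv s3://brainstorm-data/dev/mando/
# aws s3 cp successes-01.csv s3://brainstorm-data/dev/mando/
```

Once the printed commands look right, drop the echo and run it for real.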
Add in -p as a sanity check so you can review each command before your computer friend actually runs it, and now you’re xargsing.