… to understand how xargs works.

Well, it’s pretty presumptuous of me to say that I understand how xargs works. More accurately, I have a good enough understanding of xargs to be able to use it without immediately running to Google or ChatGPT. Hopefully, dear reader, this will help.

Let’s say I wanna upload a bunch of files that match a glob pattern (a shell wildcard, not quite a regular expression) to S3. These are the files I want to upload:

❯ ls -al successes-*
-rw-r--r-- 1 mando staff 289743 Jan 16 17:53 successes-00.csv
-rw-r--r-- 1 mando staff 290386 Jan 16 17:52 successes-01.csv
-rw-r--r-- 1 mando staff 287589 Jan 16 17:53 successes-02.csv
-rw-r--r-- 1 mando staff 291275 Jan 16 17:53 successes-03.csv
-rw-r--r-- 1 mando staff 290009 Jan 16 17:53 successes-04.csv
-rw-r--r-- 1 mando staff 149875 Jan 16 17:53 successes-05.csv

Now you’re probably gonna tell me about the --include and --exclude flags to aws s3 cp, to which I’ll say “phooey!”.

translation: I tried it and couldn't make it work

Enter: xargs

Here’s the thing I never got that’s helped me better understand what’s going on: all xargs does is take whatever you send it on standard input and append it to the end of the command you tell it to call. That’s a terrible explanation but maybe an example will help:

❯ ls success* | xargs echo
successes-00.csv successes-01.csv successes-02.csv successes-03.csv successes-04.csv successes-05.csv successes.csv

All it’s doing is taking the output from ls success* and appending it, as a list of arguments, to echo, turning the command into:

❯ echo successes-00.csv successes-01.csv successes-02.csv successes-03.csv successes-04.csv successes-05.csv successes.csv
successes-00.csv successes-01.csv successes-02.csv successes-03.csv successes-04.csv successes-05.csv successes.csv
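
To see the appending for yourself without touching real data, here’s a minimal sketch. The scratch directory and the three file names are made up for illustration, and the uploading: label is just a fixed argument I picked to show that xargs tacks its input on after whatever you already typed:

```shell
# Scratch directory with made-up file names (assumptions for illustration)
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch successes-00.csv successes-01.csv successes-02.csv

# No extra arguments: xargs appends every input item to one echo call
batched=$(ls success* | xargs echo)
echo "$batched"

# A fixed argument stays put and the input items land after it
labeled=$(ls success* | xargs echo uploading:)
echo "$labeled"
```

Both calls run echo exactly once; the only difference is what was already sitting on the command line before xargs added the file names.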

Useful, I suppose, but it doesn’t do a lot for me at the moment. But in all the examples for xargs, people be saying -I a lot. What does that do?

❯ ls success* | xargs -I echo
successes-00.csv
successes-01.csv
successes-02.csv
successes-03.csv
successes-04.csv
successes-05.csv
successes.csv

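AHA! Instead of taking the inputs and turning them into one long list of arguments for the command xargs runs (echo in this case), it iterates over them and calls the command once per line of input. (Strictly speaking, -I always expects a placeholder string right after it, so here it swallowed the word echo as the placeholder and xargs fell back to its default command, which on the xargs used here is also echo. More on placeholders in a minute.) If you toss in -p (for prompt) you’ll see it even more clearly: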

❯ ls success* | xargs -p -I echo
/bin/echo successes-00.csv?...y
successes-00.csv
/bin/echo successes-01.csv?...y
successes-01.csv
/bin/echo successes-02.csv?...
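
If you’d rather not answer a prompt for every file, -t is a non-interactive cousin of -p: it prints each command to standard error before running it, without asking. Here’s a small sketch, assuming a made-up scratch directory and file names, with an explicit {} placeholder after -I (the text that -I swaps each input line into) so that it should behave the same on GNU and BSD xargs:

```shell
# Scratch directory with made-up file names (assumptions for illustration)
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch successes-00.csv successes-01.csv successes-02.csv

# -I {} runs echo once per input line; -t traces each command on stderr
per_file=$(ls success* | xargs -t -I {} echo {} 2> trace.log)

echo "$per_file"   # one file name per line
cat trace.log      # one traced echo command per file
```

Three input lines, three traced commands: that’s the one-call-per-line behavior in action.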

So let’s try it with aws s3 cp:

❯ ls success* | xargs -p -I aws s3 cp ????????? s3://brainstorm-data/dev/mando/

Astute readers will see the problem: we gotta get the filename where the ????????? is, not at the end of the command like we did with echo.

But fear not! Our new best friend -I is here to help! We can pass some placeholder text to -I and then use that placeholder text in the command to substitute in the argument.

Again, that’s some terrible words but hopefully an example will help:

❯ ls success* | xargs -p -I {} aws s3 cp {} s3://brainstorm-data/dev/mando/
aws s3 cp successes-00.csv s3://brainstorm-data/dev/mando/?...y
upload: ./successes-00.csv to s3://brainstorm-data/dev/mando/successes-00.csv
aws s3 cp successes-01.csv s3://brainstorm-data/dev/mando/?...

When we call xargs like this:

xargs -p -I {} aws s3 cp {} s3://brainstorm-data/dev/mando/

It takes each line of input (in this case, one filename) and replaces the {} with it! Add in -p as a sanity check so you can review each command before your computer friend actually runs it, and now you’re xargsing.
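
One more trick in the same spirit as -p: stick echo in front of the real command to get a dry run that prints what would be executed without uploading anything. This is only a sketch; the scratch directory, the file names, and the s3://example-bucket/some/prefix/ destination are placeholders I made up:

```shell
# Scratch directory with made-up file names (assumptions for illustration)
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch successes-00.csv successes-01.csv successes-02.csv

# Hypothetical destination; swap in your own bucket and prefix
dest="s3://example-bucket/some/prefix/"

# echo turns each upload into a printed command, so nothing is sent to S3
dry_run=$(ls success* | xargs -I {} echo aws s3 cp {} "$dest")
echo "$dry_run"
```

Once the printed commands look right, drop the echo (and keep -p if you want one last confirmation per file).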