If you work with the AWS CLI, chances are you are going to want to use grep on the various list and get operations to make the results more readable. And if you want a way to mimic bulk operations, you can combine it with sed to generate the commands for you.
First, let me get a rant out of the way: why, AWS CLI? Why do you not standardize on get- or list-? Truly aggravating, but I guess it just means you need to keep the link to the CLI reference bookmarked for convenience.
grep
Moving on. When you run a list, get, or describe, you'll probably get more information than you need, and grep is an easy way to filter down to just the lines you want. Take as an example the output below from s3api list-objects --bucket "my-aws-library":
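The command returns a JSON document; abridged, and with made-up filenames for illustration, it looks roughly like this:

{
    "Contents": [
        {
            "Key": "a-tale-of-two-cities.epub",
            "LastModified": "2023-04-02T18:15:30+00:00",
            "Size": 310817,
            "StorageClass": "STANDARD"
        },
        {
            "Key": "moby-dick.epub",
            "LastModified": "2023-04-02T18:15:31+00:00",
            "Size": 1276201,
            "StorageClass": "STANDARD"
        }
    ]
}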
All I wanted was a list of filenames. Grep to the rescue! While there are numerous flags that modify the behavior of grep, there are only a few that I use regularly. These are what I think are the most useful grep flags:
- -o only print the part of each line that matches the regex pattern, rather than the whole line
- -c just count the lines that match the regex pattern (there's a quick example right after this list)
- -v invert the match; in other words, only return/count lines that don't match the regex pattern
- -E interpret the pattern as an extended regular expression; this is helpful if you're accustomed to the VSCode flavor of regex
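For example, -c gives you a quick object count, since each object contributes exactly one "Key" line in the default JSON output (the count will of course be whatever is actually in your bucket):

aws s3api list-objects --bucket "my-aws-library" | grep -c '"Key"'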
So, armed with these flags, let's adjust our list objects command like so:
aws s3api list-objects --bucket "my-aws-library" | grep -o -E '"Key": .+"'
Like using regex in VSCode, using .+" will do a greedy match of everything up until the final " on the line. This will produce a list with just the filenames:
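With the placeholder objects from above, that looks like:

"Key": "a-tale-of-two-cities.epub"
"Key": "moby-dick.epub"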
SED
Well, limiting the result to just a list of filenames is useful, but what I really want is to do stuff to those files. While the AWS console makes it fairly easy to move, delete, and copy files within an account, moving between accounts must be done through the CLI. There are some policies that need to be created first to ensure that the files can be copied, which are described here. But the final step from the instructions on that page is to copy using the CLI:
aws s3 cp s3://source-DOC-EXAMPLE-BUCKET/object.txt s3://destination-DOC-EXAMPLE-BUCKET/object.txt --acl bucket-owner-full-control
Well, I have a lot of objects in this bucket, and I don't want to write that many CLI commands. Now is the time to enlist sed.
sed enables regex pattern-based substitution. The format is sed 's/pattern/replacement/'. The s/ tells sed that this is a substitution, the second / separates the pattern from the replacement, and the final / tells it when the replacement is done. So, let's add on to our grepped object list like this:
aws s3api list-objects --bucket "my-aws-library" | grep -o -E '"Key": .+"' | sed 's/Key/Filename/'
Now our results look like this:
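Again using the placeholder objects:

"Filename": "a-tale-of-two-cities.epub"
"Filename": "moby-dick.epub"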
Okay, that was a pretty simple substitution, but it doesn't get us to the list of CLI commands we need. This is going to get more complicated, so I like to work in an editor and put the expressions on separate lines, just so I can see what I'm doing (I join the lines back up before pasting in bash).
aws s3api list-objects --bucket "my-aws-library" | grep -o -E '"Key": .+"' | sed 's/
[regex for the pattern to be replaced will go here]/
[regex for the replacement will go here]/'
I'm also going to take advantage of a few other sed features:
- In order to take advantage of extended regular expressions, we'll use the -E flag here, as well.
- Since I will be working on paths that contain /, I'm going to use a different substitution delimiter. Alternate delimiters are allowed with sed, including @ % | ; :. I'm going to use @, since it won't collide with any of my text.
- You can also use grouping and back-references in sed. To create a grouping, enclose a portion of the pattern in parentheses. To use a back-reference to a grouping in the replacement text, use the ordinal of the grouping escaped with a backslash, e.g. \1. (There's a quick illustration right after this list.)
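Here's a quick standalone illustration, separate from our pipeline, of both the alternate delimiter and a back-reference (the path is invented for the example):

echo "source/a-tale-of-two-cities.epub" | sed -E 's@source/(.+)@dest/\1@'

which prints dest/a-tale-of-two-cities.epub.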
First, we'll build the pattern to be replaced. We want everything between the second set of double-quotes, and we'll be reusing that so we put it in a group:
aws s3api list-objects --bucket "my-aws-library" | grep -o -E '"Key": .+"' | sed -E 's@
"Key": "(.+)"@
[regex for the replacement will go here]@'
Next, we'll copy in the AWS CLI command, replacing the key name with the back-reference to the grouping:
aws s3api list-objects --bucket "my-aws-library" | grep -o -E '"Key": .+"' | sed -E 's@
"Key": "(.+)"@
aws s3 cp s3://source-DOC-EXAMPLE-BUCKET/\1 s3://destination-DOC-EXAMPLE-BUCKET/\1 --acl bucket-owner-full-control@'
Now, when we run it we get this:
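Assuming the same placeholder filenames as before, the output is a ready-to-run list of copy commands:

aws s3 cp s3://source-DOC-EXAMPLE-BUCKET/a-tale-of-two-cities.epub s3://destination-DOC-EXAMPLE-BUCKET/a-tale-of-two-cities.epub --acl bucket-owner-full-control
aws s3 cp s3://source-DOC-EXAMPLE-BUCKET/moby-dick.epub s3://destination-DOC-EXAMPLE-BUCKET/moby-dick.epub --acl bucket-owner-full-control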
Slick! I love writing code to write code. Since I'm too lazy to even copy and paste, I'm going to dump it directly into a file:
aws s3api list-objects --bucket "my-aws-library" | grep -o -E '"Key": .+"' | sed -E 's@
"Key": "(.+)"@
aws s3 cp s3://source-DOC-EXAMPLE-BUCKET/\1 s3://destination-DOC-EXAMPLE-BUCKET/\1 --acl bucket-owner-full-control@'
> s3commands
Now, I can run all those copy commands in bash by running source s3commands. And scene.