Create an IAM User with CloudFormation
ACM.15 Creating Batch job administrators to kick off AWS Batch jobs
This is a continuation of my series of posts on Automating Cybersecurity Metrics.
The past few posts covered the reasoning behind some of the architectural choices up to this point for the batch job architecture we’re building to automate some security metrics. The last post took a look at the big picture and why security architecture is not a checklist.
We’ve decided to use an IAM user with virtual MFA to kick off batch jobs and an IAM role for the actions taken by a specific batch job. Let’s review our requirements for the user I’m calling our batch job administrator.
Batch Job User Requirements
- I’ll use a user for this purpose that is separate from my administrative user account, which has too many permissions.
- I cannot use an IAM role to kick off the jobs because I want to use MFA and you cannot use MFA with an AWS IAM Role alone. An MFA serial number is associated with a user.
- I don’t want to use AWS SSO (now AWS IAM Identity Center) because I don’t like the configuration that includes the URL to my login portal or the browser interaction. I’m not even sure that will work with what I am hoping to implement.
- I’m going to use STS and pass in credentials and an MFA token.
- I don’t want to use a Yubikey because I’d have to install the Yubico CLI on my laptop, so I’ll use virtual MFA for this user.
- This user does not need access to the AWS Console. This user is needed solely to obtain an MFA code to kick off a job and to assume a role.
- I’ll be leveraging an AWS Access Key and Secret Key associated with the IAM user but I don’t have anywhere secure to put those just yet so I’ll create them later.
- This user will not be allowed to modify its own password or MFA. An administrator will have to do that.
- I’m going to run my batch jobs in a separate, locked down account in my AWS organization. I’ll create the user in that account.
- The only permission my batch job administrator seems to need is the STS permission to assume the roles I create for my batch jobs in that account.
- I may want to assign the permissions to kick off batch jobs to other users in the future, so I’ll assign the permissions to a group, not a user. That is an AWS security best practice.
IAM Role Requirements
- I will create a separate role for each batch job with permissions limited to exactly what that batch job needs to do.
- We will limit access to resources to this role only, and only when MFA is present, for the reasons discussed in the last post.
- We’ll add permissions to this role later when we set up the resources it needs to use and access.
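The role requirements above could be sketched in CloudFormation roughly as follows. This is a minimal illustration, not the template from the repository: the RoleName parameter, the logical ID BatchJobRole, and the choice to trust the account root principal are all assumptions. The trust policy only allows the role to be assumed when MFA is present, and no permissions are attached yet.

```yaml
# Sketch only: parameter and logical names are illustrative assumptions.
Parameters:
  RoleName:
    Type: String
    Description: Name of the batch job role, following your naming convention

Resources:
  BatchJobRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: !Ref RoleName
      # Trust policy: principals in this account may assume the role,
      # but only when the request was authenticated with MFA.
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              AWS: !Sub 'arn:aws:iam::${AWS::AccountId}:root'
            Action: sts:AssumeRole
            Condition:
              Bool:
                aws:MultiFactorAuthPresent: 'true'
      # No Policies or ManagedPolicyArns yet; permissions come later.
```

Trusting the account root principal delegates the decision to IAM policies in the account, so a caller still needs an identity policy that allows sts:AssumeRole on this role.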
Dedicated AWS Account
I’ve already written about automating AWS account creation with a secure baseline using AWS Control Tower. Though I could add some additional information about securing this Lambda function, this will get you started if you want to automate creation of new accounts. Account creation could be moved to a batch job once we have our infrastructure set up.
Creating an AWS IAM User with CloudFormation
I’m going to create my batch job administrator user with AWS CloudFormation. If you’re not familiar with or don’t like CloudFormation, this post may help:
We’ll need to create the following IAM resources with our CloudFormation templates and script:
An IAM User (a batch job administrator)
This is the identity that will kick off batch jobs and has the associated MFA device used to generate MFA tokens.
An IAM Policy (for batch job administrators)
Note that this is not a “User Policy,” which is a policy written for and associated with a single user. We want to create a standalone policy in the account that we can associate with other users and groups in the future. This policy is only used to start batch jobs and does not include the permissions for the actions taken by batch jobs.
An IAM User Policy Document (permission to start a batch job)
Policy documents define the permissions that a policy grants to an identity and the resources to which those permissions apply.
An IAM Group (batch job administrators)
We’ll assign the permissions to a group and then add our IAM user to that group.
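To make the relationship between the group and the user concrete, here is a condensed sketch that puts both in one template. The repository splits them into separate files, so treat this as an illustration only; the logical IDs are assumptions, and the names batch_job_admins and BatchJobAdmin are the ones used elsewhere in this post.

```yaml
# Sketch only: a single-template illustration with assumed logical IDs.
Resources:
  BatchJobAdminsGroup:
    Type: AWS::IAM::Group
    Properties:
      GroupName: batch_job_admins

  BatchJobAdminUser:
    Type: AWS::IAM::User
    Properties:
      UserName: BatchJobAdmin
      # No LoginProfile is defined, so the user gets no console password.
      # !Ref on an IAM group returns the group name.
      Groups:
        - !Ref BatchJobAdminsGroup
```

Because these resources have custom names, CloudFormation requires the CAPABILITY_NAMED_IAM capability when the stack is deployed.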
Running the CloudFormation Script
Another question to consider besides just writing the CloudFormation script is where and how we will run this script. For the purposes of this proof-of-concept (POC) I am operating as if I am in a development environment, testing things out. I’m going to be testing my batch job locally and using IAM to deploy it with a manually executed script.
Specifically, we will use the AWS CLI cloudformation deploy command because it will create a new stack if one doesn’t exist or update an existing stack. That’s much better than the days when you had to figure out yourself which action needed to occur and call two separate commands.
I generally recommend an automated CI/CD pipeline to clients I speak to on consulting calls through IANS Research when moving code from Dev to QA to Production. To create this automated pipeline, we could use batch jobs in the future to run these scripts or Lambda functions such as the one I wrote to create a new AWS account. Your team might use a tool like Jenkins to run jobs that execute deployment scripts or leverage AWS Service Catalog. Creating a secure deployment pipeline is a topic for another day (or another book!)
GitHub Repo and Directory Structure
I created a new GitHub repository for this POC. Here is the structure for the files associated with this blog post:
- I expect that I’ll need to deploy and re-deploy batch job administrators at different times than the batch jobs themselves so I’ll put that code in a separate directory called batch_job_admins in my GitHub repo.
- I’ll add a deploy.sh in the directory containing the scripts to deploy my batch job admins.
- I generally put each resource in its own file so it’s easy to find the code when I want to edit it. I can also redeploy individual resources separately.
To start, the directories will look like this:
/batch_job_admins
    /cfn
        policy_batch_job_admins.yaml
        group_batch_job_admins.yaml
        user_batch_job_admins.yaml
    deploy.sh
CloudFormation Parameters
Note that I’m making use of CloudFormation Parameters in these templates. That way, they will be reusable because people can pass in whatever parameters work in their own environment. For example, perhaps you already have an IAM group named batch_job_admins which I use in my script. You can override the group name in deploy.sh with whatever name you want.
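As a rough sketch of how a parameter with a default might look in the group template (the parameter name GroupNameParam and the logical ID are assumptions, not necessarily what the repository uses):

```yaml
# Sketch only: parameter and logical names are illustrative assumptions.
Parameters:
  GroupNameParam:
    Type: String
    Default: batch_job_admins
    Description: Name of the IAM group for batch job administrators

Resources:
  BatchJobAdminsGroup:
    Type: AWS::IAM::Group
    Properties:
      # Uses the default unless the caller passes a different value.
      GroupName: !Ref GroupNameParam
```

With the AWS CLI deploy command, a different value can be supplied through the --parameter-overrides option, for example GroupNameParam=my_group_name.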
CloudFormation Outputs
The templates also use CloudFormation outputs. After a template executes it can store references to resources it created in outputs. We’ll need the outputs from one template to use in the next. For example, we’ll create an IAM Policy with the permission for our batch job admins. Then we’ll use the output from that template in our IAM group template to assign that policy to our new group.
ImportValue
We’re going to use the ImportValue function to reference the outputs from one stack in another. For example, the CloudFormation template that creates an AWS IAM group outputs the group reference. The policy CloudFormation template uses that output value to apply the new policy it’s creating to the specified group.
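Here is a hedged sketch of both sides of an export and import, shown as two separate template fragments in one listing. The export name groupbatchjobadmins is the one that appears later in this post; the logical IDs and the output key are assumptions. The importing side here is a user template adding the user to the group, but the same function works in any template that needs the exported value.

```yaml
# Sketch only. Fragment 1: the group template exports the group name.
Resources:
  BatchJobAdminsGroup:
    Type: AWS::IAM::Group
    Properties:
      GroupName: batch_job_admins

Outputs:
  GroupNameOutput:
    Value: !Ref BatchJobAdminsGroup
    Export:
      # Export names must be unique within an account and region.
      Name: groupbatchjobadmins
---
# Fragment 2 (a separate template file): import the exported value.
Resources:
  BatchJobAdminUser:
    Type: AWS::IAM::User
    Properties:
      UserName: BatchJobAdmin
      Groups:
        # Must match the export name exactly or the stack fails to deploy.
        - !ImportValue groupbatchjobadmins
```

The exporting stack has to be deployed first, and CloudFormation will not let you delete or change an export while another stack imports it.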
Pseudo Parameter — AWS::AccountId
We’ll be using a pseudo parameter (AWS::AccountId) that inserts the account ID of the account in which the script is executing into the code. That way we don’t have to hard code account numbers or put them in source control, and the script will work in any account where it gets executed.
Sub
We will use the Sub function to substitute ${AWS::AccountId} with the actual account ID of the account in which the script is executing. You’ll see the Sub function on this line of our policy CloudFormation template:
Resource: !Sub 'arn:aws:iam::${AWS::AccountId}:role/batch_*'
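For context, that line might sit inside a managed policy resource along these lines. This is a minimal sketch rather than the file from the repository: the logical ID and ManagedPolicyName are assumptions, and only the Resource line is taken from the template as quoted above.

```yaml
# Sketch only: logical ID and policy name are illustrative assumptions.
Resources:
  BatchJobAdminsPolicy:
    Type: AWS::IAM::ManagedPolicy
    Properties:
      ManagedPolicyName: BatchJobAdminsPolicy
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            # The only permission granted: assuming batch job roles.
            Action: sts:AssumeRole
            # ${AWS::AccountId} resolves to the deploying account's ID;
            # the trailing * matches any role name with this prefix.
            Resource: !Sub 'arn:aws:iam::${AWS::AccountId}:role/batch_*'
```

The prefix before the asterisk needs to match whatever naming convention you use for your batch job roles, since the match on role names is exact apart from the wildcard.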
CLI profile
The bash script allows you to pass in a CLI profile if you have one configured, otherwise it uses the default profile. You can configure multiple CLI profiles so you can run scripts with different credentials and in different accounts.
The code
You can find the code in this GitHub repo:
No warranties on any code and please don’t run it in a production environment without thoroughly testing it. As I mentioned before I already had someone insert a “special character” into some of my class code. Things happen. Do your due diligence.
Just in case you are unfamiliar with GitHub, first install git.
Then you can execute this command to obtain the code in this repo:
git clone https://github.com/tradichel/SecurityMetricsAutomation.git
These are the files related to this blog post in the GitHub repository:
NOTE: All the following code moved into the iam directory in a subsequent post and refactoring of code.
Group CloudFormation (group_batch_job_admins.yaml)
#group_batch_job_admins.yaml
https://github.com/tradichel/SecurityMetricsAutomation/blob/main/iam/batch_job_admins/cfn/group_batch_job_admins.yaml
Policy CloudFormation (policy_batch_job_admins.yaml)
#policy_batch_job_admins.yaml
https://github.com/tradichel/SecurityMetricsAutomation/blob/main/iam/batch_job_admins/cfn/policy_batch_job_admins.yaml
You will notice in the policy above that I have an asterisk (*) in the resource name for the roles that my user can assume. That’s because we are going to name all our batch job roles BatchRole[name of batch job]. The value of ${AWS::AccountId} will be replaced with the account ID of the account where the script is executed. That means the Resource will be an ARN for any role in that account that has a name starting with BatchRole. So any user in the group to which we are assigning this policy will be able to assume any role in the account that starts with BatchRole.
Threat Modeling our Policy
We still do have a couple of security concerns with this design. We need to make sure that no one can add a new role starting with BatchRole to assign permissions higher than they have themselves and execute a batch job to elevate their own privileges. It would be a good idea to separate the permissions between the people who can create and modify jobs from those who can start jobs. You’ll also want a secure deployment process as mentioned earlier.
We also need to make sure that someone who is not supposed to be allowed to start batch jobs does not get added to the Batch Job administrator group. I usually recommend that organizations have a separate IAM team so the people who give and use the permissions are not one and the same. This gets tricky when it comes to applications, and you can use permission boundaries to limit the permissions users can give to applications or use an AWS account.
Another concern would be that someone could start an EC2 instance or other compute resource, add an existing role to it, and use those permissions to do something they are not supposed to do. This again comes down to your IAM team, permissions boundaries, and possibly adding some service control policies to limit permissions. You can also raise alerts in response to sensitive or unexpected actions in an account.
User CloudFormation (user_batch_job_admin.yaml)
#user_batch_job_admin.yaml
https://github.com/tradichel/SecurityMetricsAutomation/blob/main/iam/batch_job_admins/cfn/user_batch_job_admin.yaml
Bash file to deploy our CloudFormation Templates
#deploy.sh
https://github.com/tradichel/SecurityMetricsAutomation/blob/main/iam/batch_job_admins/deploy.sh
Permissions to run the code
It’s usually not a good idea to put credentials on an EC2 instance. I will be using credentials later in this series, stored in AWS Secrets Manager. To run the code in this post, I used an EC2 instance with the necessary instance profile (IAM role) assigned to it.
If you’re adding credentials to EC2 instances please stop. Go create a role that your EC2 instance can use that has the required permissions. Add the role to your EC2 instance. Disable or delete your existing credentials and create new ones.
By the way, if you don’t have long-lived credentials on your laptop, malware that gains access via a phishing attack (one of the top sources of cloud security breaches) cannot steal them. Though I did explain how credentials on EC2 instances can be abused in this blog post, it’s harder, and I provided some recommendations to mitigate that risk in this post as well.
Execute the deployment script:
./deploy.sh
Watch the output of the script for any errors:
You can see I got an error at the end.
Navigate to CloudFormation in the AWS Console either to verify the stacks completed successfully or to check for any errors. You can see here I have an error and the stack rolled back. Click on the name of the stack.
Click on Events:
Here I can see that the name of the export I referenced is not correct. It doesn’t match the name in the original template where I created the export, so I’ll fix that. I happen to know that export is in my group template. The export name there is groupbatchjobadmins, not batchjobadminsgroup. I don’t want to change this file since this resource is already deployed so I’ll fix the user template instead.
I’ll change the export name referenced in the user template from batchjobadminsgroup to groupbatchjobadmins.
Before I redeploy I will need to delete the existing failed stack since it never deployed successfully. Return to your stack in CloudFormation and click Delete.
Verify the stack got deleted from your list and run the deploy.sh script again.
Once all the stacks complete successfully you should see them in CloudFormation:
You can navigate to AWS IAM and confirm the resources exist there as well.
Here’s my User Group:
When I click on it I can see my user in the group:
When I click on Permissions I can see the policy:
When I click on the policy and then JSON I can see the contents of the policy, and that I have one error:
The problem above is that resource ARNs should have an account ID in them and this sample ARN had a placeholder value where the account ID should go. I corrected that problem by using a pseudo parameter as explained above.
If I return to users and click on the BatchJobAdmin user and then security credentials, I can see that the user correctly has the console password disabled.
No MFA device is assigned here. You can go ahead and add that now. Click Manage and follow the steps to add your virtual MFA device.
Scroll down and you can see that this user has no AWS Access Key or Secret Key to perform programmatic actions, so essentially this user has no ability to do anything in our AWS account at the moment, which is what we want.
I’ll publish the code for the job role and job in the next post. Follow me for updates.
Teri Radichel
© 2nd Sight Lab 2022