After getting Vault's SSH backend set up in our environment, my attention turned to building a base AMI that all of our teams can use as a starting point for deploying their applications. The purpose of this AMI was to provide an image that already had the Vault SSH components installed, along with the other tools we use such as monitoring and log aggregation. Monitoring and log aggregation were fairly easy since we have Ansible roles for deploying them. The challenge came with how to configure the Vault SSH pieces.

The main challenge is that our pattern is to capture the host's IP address as a principal, and I can't do that in the AMI itself; it has to happen dynamically whenever an instance launches from the AMI. But how do I do that across all of the OS types that might be chosen (Ubuntu, Amazon Linux, Red Hat, etc.) and their various startup mechanisms?

I started with a really simple shell script that configures the host to use Vault's SSH backend:

#!/bin/bash
#*******************************************************************************
# Name: sshprincipals.sh
# Description: Sets up SSH Authorized Principals for AWS Hosts
#*******************************************************************************
VAULT_ADDR=https://vault.example.com

# Get the signing CA's public key from Vault (-k skips TLS verification)
curl -s -k -o /etc/ssh/trusted-user-ca-keys.pem ${VAULT_ADDR}/v1/ssh-client-signer/public_key

# Create the auth_principals directory if it doesn't exist
mkdir -p /etc/ssh/auth_principals

# Grab the instance's private IP from the EC2 metadata service
localip=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)

# Create the principals file for the local login user (ec2-user, ubuntu, etc.)
if grep -qiE "^ec2-user:" /etc/passwd; then
  echo "$localip" > /etc/ssh/auth_principals/ec2-user
  echo "sre-team" >> /etc/ssh/auth_principals/ec2-user
fi

if grep -qiE "^ubuntu:" /etc/passwd; then
  echo "$localip" > /etc/ssh/auth_principals/ubuntu
  echo "sre-team" >> /etc/ssh/auth_principals/ubuntu
fi

# Make sure sshd_config trusts the CA and looks up per-user principals
if ! grep -qiE "^TrustedUserCAKeys" /etc/ssh/sshd_config; then
  printf "\nTrustedUserCAKeys /etc/ssh/trusted-user-ca-keys.pem\n" >> /etc/ssh/sshd_config
fi

if ! grep -qiE "^AuthorizedPrincipalsFile" /etc/ssh/sshd_config; then
  printf "\nAuthorizedPrincipalsFile /etc/ssh/auth_principals/%%u\n" >> /etc/ssh/sshd_config
fi

# Restart sshd to pick up the new settings (the service is named ssh on Ubuntu)
service sshd restart 2>/dev/null || service ssh restart

The script pulls the signing CA's public key from the Vault server, configures the auth_principals file for the primary login user (ec2-user or ubuntu; you could add more), and then sets the necessary bits in the sshd_config file to support it.
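To make the end-to-end flow concrete, here is a rough sketch of what a login against a host configured this way looks like. The role name ssh-client-role and the IP 10.0.1.15 are placeholders; substitute whatever your signer role and target host actually are.

# Ask Vault to sign our public key, with the target host's private IP as
# the principal (ssh-client-role and 10.0.1.15 are placeholders)
vault write -field=signed_key ssh-client-signer/sign/ssh-client-role \
    public_key=@$HOME/.ssh/id_rsa.pub \
    valid_principals=10.0.1.15 > ~/.ssh/id_rsa-cert.pub

# sshd on the host matches the 10.0.1.15 principal in the certificate
# against /etc/ssh/auth_principals/ec2-user
ssh -i ~/.ssh/id_rsa-cert.pub -i ~/.ssh/id_rsa ec2-user@10.0.1.15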

So instead of trying to figure out what type of system the script was running on and which startup mechanism to use, I decided to dig a little deeper into cloud-init. With cloud-init I can run the above script on (and only on) an instance's initial launch. This is done by copying the script to /usr/local/bin and then creating a new file in the /etc/cloud/cloud.cfg.d directory called 20_ssh_principals.cfg with the following content:

runcmd:
  - /usr/local/bin/sshprincipals.sh
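If you're scripting the image build itself, those two bake-time steps amount to something like the following sketch (how you run it, whether from Packer, Ansible, or a plain shell session, is up to you):

# Install the script and the cloud-init drop-in while baking the AMI
install -m 0755 sshprincipals.sh /usr/local/bin/sshprincipals.sh

cat > /etc/cloud/cloud.cfg.d/20_ssh_principals.cfg <<'EOF'
runcmd:
  - /usr/local/bin/sshprincipals.sh
EOF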

Now, every time I launch a new instance that uses my base AMI, it will run the sshprincipals.sh script to configure my system to use the Vault SSH backend.
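A quick sanity check on a freshly launched instance confirms everything landed (the paths all come from the script above):

cat /etc/ssh/trusted-user-ca-keys.pem
cat /etc/ssh/auth_principals/ec2-user   # should show the instance's private IP and sre-team
grep -E "^(TrustedUserCAKeys|AuthorizedPrincipalsFile)" /etc/ssh/sshd_config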