Working with multiple AWS EKS instances

I’ve recently been working on a project that uses AWS EKS, Amazon’s managed Kubernetes service.

For various reasons too complicated to go into here, we’ve ended up with multiple clusters owned by different AWS accounts, so flipping back and forth between them has been a little trickier than normal.

Here are my notes on how to manage the AWS credentials and the kubectl config to access each cluster.

AWS CLI

The first task is to authorise the AWS CLI to act as the user in question. We do this by creating a user with the right permissions in the IAM console and then exporting the Access key ID and Secret access key values, usually as a CSV file. We then take these values and add them to the ~/.aws/credentials file.

[dev]
aws_access_key_id = AKXXXXXXXXXXXXXXXXXX
aws_secret_access_key = xyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxy

[test]
aws_access_key_id = AKYYYYYYYYYYYYYYYYYY
aws_secret_access_key = abababababababababababababababababababab

[prod]
aws_access_key_id = AKZZZZZZZZZZZZZZZZZZ
aws_secret_access_key = nmnmnmnmnmnmnmnmnmnmnmnmnmnmnmnmnmnmnmnm

We can pick which set of credentials the AWS CLI uses by adding the --profile option to the command line.

$ aws --profile dev sts get-caller-identity
{
    "UserId": "AIXXXXXXXXXXXXXXXXXXX",
    "Account": "111111111111",
    "Arn": "arn:aws:iam::111111111111:user/dev"
}

Instead of using the --profile option you can also set the AWS_PROFILE environment variable. Details of all the ways to switch profiles are in the docs here.

$ export AWS_PROFILE=test
$ aws sts get-caller-identity
{
    "UserId": "AIYYYYYYYYYYYYYYYYYYY",
    "Account": "222222222222",
    "Arn": "arn:aws:iam::222222222222:user/test"
}

Now that we can flip easily between different AWS accounts, we can export the EKS credentials with:

$ export AWS_PROFILE=prod
$ aws eks update-kubeconfig --name foo-bar --region us-east-1
Updated context arn:aws:eks:us-east-1:333333333333:cluster/foo-bar in /home/user/.kube/config

The user that created the cluster should also follow these instructions to make sure the new account is added to the cluster’s internal ACL.
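
In EKS this ACL is the aws-auth ConfigMap in the kube-system namespace. A minimal sketch of the sort of mapUsers entry the cluster creator would add looks like this (the ARN, username and group are placeholders, and any existing mapRoles section is left out):

apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapUsers: |
    - userarn: arn:aws:iam::111111111111:user/dev
      username: dev
      groups:
        - system:masters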

Kubectl

If we run the previous command with each profile, it will add the connection information for all 3 clusters to the ~/.kube/config file. We can list them with the following command:

$ kubectl config get-contexts
CURRENT   NAME                                                  CLUSTER                                               AUTHINFO                                              NAMESPACE
*         arn:aws:eks:us-east-1:111111111111:cluster/foo-bar   arn:aws:eks:us-east-1:111111111111:cluster/foo-bar   arn:aws:eks:us-east-1:111111111111:cluster/foo-bar   
          arn:aws:eks:us-east-1:222222222222:cluster/foo-bar   arn:aws:eks:us-east-1:222222222222:cluster/foo-bar   arn:aws:eks:us-east-1:222222222222:cluster/foo-bar   
          arn:aws:eks:us-east-1:333333333333:cluster/foo-bar   arn:aws:eks:us-east-1:333333333333:cluster/foo-bar   arn:aws:eks:us-east-1:333333333333:cluster/foo-bar 

The star is next to the currently active context. We can change the active context with this command:

$ kubectl config use-context arn:aws:eks:us-east-1:222222222222:cluster/foo-bar
Switched to context "arn:aws:eks:us-east-1:222222222222:cluster/foo-bar".

Putting it all together

To automate all this I’ve put together a collection of scripts that look like this:

export AWS_PROFILE=prod
aws eks update-kubeconfig --name foo-bar --region us-east-1
kubectl config use-context arn:aws:eks:us-east-1:333333333333:cluster/foo-bar

I then run these with the shell source ./setup-prod command (or its shortcut . ./setup-prod) instead of adding a shebang to the top and running them as normal scripts. This is because environment variables set in a normal script go out of scope when the script exits. Leaving the AWS_PROFILE variable in scope means that the AWS CLI will continue to use the correct account settings when it’s used later while working on this cluster.

Working with multiple EFS file systems in EKS

I’ve been building a system recently on AWS EKS and using EFS filesystems as volumes for persistent storage.

I initially had only one container that required any storage, but as I added a second I ran into the issue that there didn’t appear to be a way to bind an EFS volume to a specific PersistentVolumeClaim, and so no way to make sure the same volume was mounted into the same container each time.

A Pod requests a volume by referencing a PersistentVolumeClaim as follows:

apiVersion: v1
kind: Pod
metadata:
  name: efs-app
spec:
  containers:
  - name: app
    image: centos
    command: ["/bin/sh"]
    volumeMounts:
    - name: efs-volume
      mountPath: /data
  volumes:
  - name: efs-volume
    persistentVolumeClaim:
      claimName: efs-claim

The PersistentVolumeClaim would look like this:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-claim
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi

You can bind the EFS volume to a PersistentVolume as follows:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-persistent-volume
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-6eb2fc16

The volumeHandle points to the EFS file system you want to back the volume with.
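
The efs-sc StorageClass referenced by both the claim and the volume isn’t shown above; a minimal sketch, assuming static provisioning with the EFS CSI driver and that the name matches these manifests, would be:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com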

If there is only one PersistentVolume then there is no problem, as the PersistentVolumeClaim will grab the only one available. But if there is more than one, you can include the volumeName in the PersistentVolumeClaim description to bind the two together.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-claim
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
  volumeName: efs-persistent-volume

After a bit of poking around I found this Stack Overflow question which pointed me in the right direction.

Setting up mDNS CNAME entries for K8S Ingress Hostnames

As I hinted at the end of my last post, I’ve been looking for a way to take the hostnames set up for Kubernetes Ingress endpoints and turn them into mDNS CNAME entries.

When I’m building things I like to spin up a local copy where possible (e.g. microk8s on a Pi 4 for the Node-RED on Kubernetes project and the Docker Compose environment on another Pi 4 for the previous version). These setups run on my local network at home, and while I have my own DNS server set up and running, I also make extensive use of mDNS to be able to access the different services.

I’ve previously built little utilities to generate mDNS CNAME entries for both Nginx and Traefik reverse proxies using Environment Variables or Labels in a Docker environment, so I was keen to see if I could build the same for Kubernetes’ Ingress proxy.

Watching for new Ingress endpoints

The kubernetes-client node module supports watching certain endpoints, so it can be used to get notifications when an Ingress endpoint is created or destroyed.

// client is an already configured kubernetes-client Client instance
const JSONStream = require('json-stream')
const stream = client.apis.extensions.v1beta1.namespaces("default").ingresses.getStream({qs:{ watch: true}})
const jsonStream = new JSONStream()
stream.pipe(jsonStream)
jsonStream.on('data', async obj => {
  if (obj.type == "ADDED") {
    for (const x in obj.object.spec.rules) {
      let hostname = obj.object.spec.rules[x].host
      ...
    }
  } else if (obj.type == "DELETED") {
    for (const x in obj.object.spec.rules) {
      let hostname = obj.object.spec.rules[x].host
      ...
    }
  }
})

Creating the CNAME

For the previous versions I used a Python library called mdns-publish to set up the CNAME entries. It works by sending DBUS messages to the Avahi daemon, which actually answers the mDNS requests on the network. For this version I decided to try and send those DBUS messages directly from the app watching for changes in K8s.

The dbus-next node module allows working directly with the DBUS interfaces that Avahi exposes.

const dbus = require('dbus-next');
const bus = dbus.systemBus()
bus.getProxyObject('org.freedesktop.Avahi', '/')
.then( async obj => {
	const server = obj.getInterface('org.freedesktop.Avahi.Server')
	const entryGroupPath = await server.EntryGroupNew()
	const entryGroup = await bus.getProxyObject('org.freedesktop.Avahi',  entryGroupPath)
	const entryGroupInt = entryGroup.getInterface('org.freedesktop.Avahi.EntryGroup')
	// -1 means AVAHI_IF_UNSPEC / AVAHI_PROTO_UNSPEC (all interfaces, all protocols)
	var interface = -1
	var protocol = -1
	var flags = 0
	// host, hostname and encodeFQDN come from the surrounding application
	var name = host
	// 0x01 is DNS class IN, 0x05 is the CNAME record type
	var clazz = 0x01
	var type = 0x05
	var ttl = 60
	var rdata = encodeFQDN(hostname)
	entryGroupInt.AddRecord(interface, protocol, flags, name, clazz, type, ttl, rdata)
	entryGroupInt.Commit()
})

After adding a signal handler to clean up when the app gets killed, we are pretty much good to go.

process.on('SIGTERM', cleanup)
process.on('SIGINT', cleanup)
function cleanup() {
	// Remove all the published CNAME entries before exiting
	for (const k of Object.keys(cnames)) {
		cnames[k].Reset()
		cnames[k].Free()
	}
	bus.disconnect()
	process.exit(0)
}

Running

Once it’s all put together it runs as follows:

$ node index.js /home/ubuntu/.kube/config ubuntu.local

The first argument is the path to the kubectl config file and the second is the hostname the CNAME should point to.

If the Ingress controller is running on ubuntu.local then the Ingress YAML would look like this:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: manager-ingress
spec:
  rules:
  - host: "manager.ubuntu.local"
    http:
      paths:
      - pathType: Prefix
        path: "/"
        backend:
          service:
            name: manager
            port:
              number: 3000 

I’ve tested this with my local microk8s install and it is working pretty well (even on my folks’ really sketchy wifi). The code is all up here.

Multi Tenant Node-RED with Kubernetes

Having built a working example of Multi Tenant Node-RED using Docker I thought I’d have a look at how to do the same with Kubernetes as a Christmas project.

I started by installing the 64bit build of Ubuntu Server on a fresh Pi 4 with 8GB RAM and then using snapd to install microk8s. I had initially wanted to use the 64bit version of Raspberry Pi OS, but despite microk8s claiming to work on any OS that supports snapd, I found that containerd just kept crashing on Raspberry Pi OS.

Once installed, I enabled the dns and ingress plugins; this got me a minimal viable single-node Kubernetes setup working.

I also had to stand up a private docker registry to hold the containers I’ll be using. That was just a case of running docker run -d -p 5000:5000 --name registry registry on a local machine, e.g. private.example.com. This also means adding the URL for this to microk8s as described here.

Since Kubernetes is another container environment I can reuse most of the parts I previously created. The only bit that really needs to change is the Manager application as this has to interact with the environment to stand up and tear down containers.

Architecture

As before, the central components are a MongoDB database and a management web app that stands up and tears down instances. The MongoDB instance holds all the flows and authentication details for each instance. I’ve deployed the database and web app as a single pod and exposed them both as services. The combined Deployment looks like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: node-red-multi-tenant
  labels:
    app: nr-mt
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nr-mt
  template:
    metadata:
      labels:
        app: nr-mt
    spec:
      containers:
      - name: node-red-manager
        image: private.example.com/k8s-manager
        ports:
        - containerPort: 3000
        volumeMounts:
        - name: secret
          mountPath: /usr/src/app/config
        env:
        - name: MONGO_URL
          value: mongodb://mongo/nodered
        - name: ROOT_DOMAIN
          value: example.com
      - name: mongodb
        image: mongo
        ports:
        - containerPort: 27017
        volumeMounts:
        - name: mongo-data
          mountPath: /data/db
      - name: registry
        image: verdaccio/verdaccio
        ports:
        - containerPort: 4873
        volumeMounts:
        - name: registry-data
          mountPath: /verdaccio/storage
        - name: registry-conf
          mountPath: /verdaccio/conf
      volumes:
      - name: secret
        secret:
          secretName: kube-config
      - name: mongo-data
        hostPath:
          path: /opt/mongo-data
          type: Directory
      - name: registry-data
        hostPath:
          path: /opt/registry-data
          type: Directory
      - name: registry-conf
        secret:
          secretName: registry-conf

This Deployment descriptor basically does all the heavy lifting. It sets up the management app, MongoDB and the private NPM registry.
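
The Service manifests that expose these aren’t shown here, but a minimal sketch (service names and ports are assumptions based on the Deployment above and the MONGO_URL value) might look like this:

apiVersion: v1
kind: Service
metadata:
  name: manager
spec:
  selector:
    app: nr-mt
  ports:
  - port: 3000
---
apiVersion: v1
kind: Service
metadata:
  name: mongo
spec:
  selector:
    app: nr-mt
  ports:
  - port: 27017

The Verdaccio registry would be exposed the same way on port 4873.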

It also binds 2 sets of secrets. The first holds the authentication details to interact with the Kubernetes API (the ~/.kube/config file) and the settings.js for the management app. The second is the config for the Verdaccio NPM registry.
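
As a rough sketch, the first of those secrets might be declared like this (the key names and contents are placeholders; each key becomes a file under the container’s mountPath):

apiVersion: v1
kind: Secret
metadata:
  name: kube-config
type: Opaque
stringData:
  config: |
    # contents of ~/.kube/config go here
  settings.js: |
    // management app settings.js goes here

The registry-conf secret would hold the Verdaccio config.yaml in the same way.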

I’m using the HostPath volume provider to store the MongoDB and Verdaccio registry data on the filesystem of the Pi, but for a production deployment I’d probably use the NFS provider or a cloud storage option like AWS S3.

Manager

This is mainly the same as the docker version, but I had to swap out dockerode for kubernetes-client.

This library exposes the full Kubernetes API, allowing the creation/modification/destruction of all entities.

Standing up a new instance is a little more complicated as it’s now a multi-step process (a sketch of the resulting resources follows the list).

  1. Create a Pod with the custom-node-red container
  2. Create a Service based on that pod
  3. Expose that service via the Ingress addon
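
As a rough illustration, the resources created for an instance called “r1” might look something like this (names, labels and the Node-RED port are assumptions based on the description above; the real manifests are generated in code by the manager):

apiVersion: v1
kind: Pod
metadata:
  name: r1
  labels:
    app: r1
spec:
  containers:
  - name: node-red
    image: private.example.com:5000/custom-node-red
    ports:
    - containerPort: 1880
---
apiVersion: v1
kind: Service
metadata:
  name: r1
spec:
  selector:
    app: r1
  ports:
  - port: 1880
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: r1-ingress
spec:
  rules:
  - host: "r1.example.com"
    http:
      paths:
      - pathType: Prefix
        path: "/"
        backend:
          service:
            name: r1
            port:
              number: 1880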

I also removed the Start/Stop buttons since stopping pods is not really a thing in Kubernetes.

All the code for this version of the app is on github here.

Catalogue

In the Docker-Compose version the custom node `catalogue.json` file was hosted by the management application and had to be manually updated each time a new or updated node was pushed to the repository. For this version I’ve stood up a separate container.

This container runs a small web app that has 2 endpoints.

  • /catalogue.json – which returns the current version of the catalogue
  • /update – which is triggered by the notify function of the Verdaccio private npm registry

The registry has this snippet added to the end of its config.yml:

notify:
  method: POST
  headers: [{'Content-Type': 'application/json'}]
  endpoint: http://catalogue/update
  content: '{"name": "{{name}}", "versions": "{{versions}}", "dist-tags": "{{dist-tags}}"}'

The code for this container can be found here.

Deploying

First clone the project from github

$ git clone https://github.com/hardillb/multi-tenant-node-red-k8s.git

Then run the setup.sh script, passing in the base domain for instances and the host:port combination for the local container registry.

$ ./setup.sh example.com private.example.com:5000

This will update some of the container locations in the deployment and build the secrets needed to access the Kubernetes API (it reads the content of ~/.kube/config).

With all the configuration files updated, the containers need building and pushing to the local container registry.

$ docker build ./manager -t private.example.com:5000/k8s-manager
$ docker push private.example.com:5000/k8s-manager
$ docker build ./catalogue -t private.example.com:5000/catalogue
$ docker push private.example.com:5000/catalogue
$ docker build ./custom-node-red -t private.example.com:5000/custom-node-red
$ docker push private.example.com:5000/custom-node-red

Finally trigger the actual deployment with kubectl

$ kubectl apply -f ./deployment

Once up and running, the management app should be available on http://manager.example.com, the private npm registry on http://registry.example.com and an instance called “r1” would be on http://r1.example.com.

A wildcard DNS entry needs to be set up to point all *.example.com hosts to the Kubernetes cluster’s Ingress IP addresses.

As usual the whole solution can be found on github here.

What’s Next

I need to work out how to set up Avahi CNAME entries for each deployment, as I had them working with both nginx and traefik, so I can run it all nicely on my LAN without having to mess with /etc/hosts or the local DNS. This should be possible by using a watch call on the Kubernetes Ingress endpoint.

I also need to backport the new catalogue handling to the docker-compose version.

And finally I want to have a look at generating a Helm chart for all this, to remove the need for the setup.sh script to modify the deployment YAML files.

p.s. If anybody is looking for somebody to do this sort of thing for them, drop me a line.