Creating Digital Ocean Droplets

Digital Ocean is a cloud hosting provider that supports running VMs and Kubernetes clusters. Alongside these they host a marketplace where users can pick pre-built images for the VMs (which Digital Ocean calls Droplets) and application stacks for deploying to the Kubernetes clusters.

Over the last week I’ve spent an hour or two on each of a few days building a Digital Ocean Droplet image to stand up an instance of FlowForge.

Luckily Digital Ocean makes the recipes used to build their standard images available on GitHub here, which gives great examples to build on.

Packer

The recipes all use a tool called Packer from HashiCorp to assemble the images.

Packer can be used with a number of different environments, but it has native support for building Digital Ocean Droplet snapshots. To use this you need to generate a Digital Ocean API Token; details of how to generate a token can be found here.

Once you have a token you can start to build the template.json file that will be passed to packer.

{
  "variables": {
    "do_api_token": "{{env `DIGITALOCEAN_API_TOKEN`}}",
    "image_name": "flowforge-1-3-0-snapshot-{{timestamp}}",
    "apt_packages": "apt-transport-https ca-certificates curl jq linux-image-extra-virtual software-properties-common ",
    "application_name": "FlowForge",
    "application_version": "v1.3.0",
    "docker_compose_version": "v2.12.0"
  },
  "sensitive-variables": [
    "do_api_token"
  ],
  "builders": [
    {
      "type": "digitalocean",
      "api_token": "{{user `do_api_token`}}",
      "image": "ubuntu-22-04-x64",
      "region": "lon1",
      "size": "s-1vcpu-1gb",
      "ssh_username": "root",
      "snapshot_name": "{{user `image_name`}}"
    }
  ],
...

Here we have a section at the top that declares a bunch of variables we will use later, and then a builder definition which says to use Digital Ocean as the environment and sets the base image, the Droplet size and the region to run the build in.

Next we have the bits that do the real work of building the image, the provisioners section.

For this task they fall into two categories (a sketch follows the list):

  1. Scripts to run
  2. Files to copy
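
A minimal provisioners section combining both categories might look something like this; the script and file names here are placeholders for illustration, not the contents of the actual recipe:

  "provisioners": [
    {
      "type": "file",
      "source": "files/flowforge-setup.sh",
      "destination": "/root/flowforge-setup.sh"
    },
    {
      "type": "shell",
      "environment_vars": [
        "application_version={{user `application_version`}}"
      ],
      "scripts": [
        "scripts/010-docker.sh",
        "scripts/020-flowforge.sh"
      ]
    }
  ]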

I used the Digital Ocean Docker droplet as a starting point as I’m installing the FlowForge Docker driver. This includes scripts to ensure that the Ubuntu 22.04 image is up to date and then installs both Docker and Docker Compose. This meant the only tasks needed were to install the FlowForge package and include a script to do the initial configuration (setting the domain to host the FlowForge Projects).

The setup script gets added to the end of the root user’s .bashrc file so it gets run when the user logs into the Droplet with SSH for the first time. The script removes itself from this file so it only runs once.

It also pulls all the needed containers and finally calls docker compose up to start the stack.
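
I won’t reproduce the full setup script here, but the run-once hook boils down to something like the following sketch (the paths and file names are assumptions for illustration):

#!/bin/bash
# flowforge-setup.sh - run from /root/.bashrc on first login

# remove the hook from .bashrc so this only ever runs once
sed -i '\|/root/flowforge-setup.sh|d' /root/.bashrc

# initial configuration, e.g. asking for the domain to host projects on
read -p "Domain to host FlowForge projects: " DOMAIN
# ... write $DOMAIN into the FlowForge configuration ...

# pre-pull the containers and start the stack
docker compose -f /opt/flowforge/docker-compose.yml pull
docker compose -f /opt/flowforge/docker-compose.yml up -d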

Once the template.json was complete it was a case of having packer run it:

$ DIGITALOCEAN_API_TOKEN=######## packer build template.json

Once built, the snapshot shows up under the Snapshots tab of the Images section in the Digital Ocean dashboard. From here it can be published to the Digital Ocean Marketplace once you have signed up as a vendor.

Digital Ocean Snapshots

Trying it out

You can find the FlowForge entry in the Digital Ocean Marketplace here. You will need a domain to host the FlowForge app and any projects you end up running.

It will run in the $6 a month Droplet size, but you may want to bump it to one of the larger offerings depending on how many projects you end up running.

The full packer project can be found here.

Raspberry Pi USB Gadget Creator Update

A few years ago I wrote a script that would take a standard Raspberry Pi OS image and boot it up in a VM to make all the required modifications so a Pi Zero or Pi 4 would work as a USB Ethernet device connected to a host computer.

One of my uses for these images is an offline Certificate Authority that runs on a Raspberry Pi Zero (note not a Zero W, so it’s totally offline), so that it is only powered on and accessible when plugged into one of the USB ports on my laptop. This has worked well, but recently I upgraded my laptop to Ubuntu 22.04 and this upgraded OpenSSL to 3.0.x, which also includes a bunch of security related changes to do with the default cipher suites enabled and the minimum key sizes it considers secure.

When I came to issue some new certificates for my VPN setup this caused some problems, as the PKCS12 files generated by my offline CA could not be opened on the new laptop without adding a bunch of extra options to the commands to re-enable the obsolete configuration. To remedy this it was time to upgrade the CA image to the latest version of Raspberry Pi OS.
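
For example, OpenSSL 3.x moved the old algorithms into a legacy provider that has to be explicitly enabled, so reading the old files ends up needing something like this (the file name is a placeholder):

$ openssl pkcs12 -in vpn-cert.p12 -nodes -legacy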

The script worked pretty well until Raspberry Pi OS stopped enabling a default user called pi with the password raspberry. This was a sensible change as many Raspberry Pi machines had ended up connected directly to the Internet without the password being changed.

Since there is no default user anymore you have to supply a userconf.txt file in the /boot partition on the SD card that is used to create the user on first boot. The problem is that the /boot partition is bundled into the image, so it is not easily accessible until the image is written to a card.

We can fix this using a loopback mount on Linux; we just need to know the offset into the image file at which the partition starts. We can find that by running fdisk -l against the image:

$ fdisk -l 2022-09-22-raspios-bullseye-armhf-lite.img 
Disk 2022-09-22-raspios-bullseye-armhf-lite.img: 1.75 GiB, 1874853888 bytes, 3661824 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xb1214a26

Device                                      Boot  Start     End Sectors  Size Id Type
2022-09-22-raspios-bullseye-armhf-lite.img1        8192  532479  524288  256M  c W95 FAT32 (LBA)
2022-09-22-raspios-bullseye-armhf-lite.img2      532480 3661823 3129344  1.5G 83 Linux

We can see that the first partition is the one we want and the offset is at 8192 sectors (each sector is 512 bytes, as shown in the header). Using awk we can then extract and calculate the right value to pass to mount:

$ OFFSET=$(fdisk -l 2022-09-22-raspios-bullseye-armhf-lite.img | awk '/^[^ ]*1/{ print $2*512 }')
$ sudo mount -o loop,offset=$OFFSET 2022-09-22-raspios-bullseye-armhf-lite.img boot

Now we have the partition mounted we can go about generating the userconf.txt file. This is done using openssl as described in the Raspberry Pi Org docs:

$ openssl passwd -6
Password:
Verifying - Password:
$6$k6CgL8YerZDjNAeD$N0HnHZGUPufxjewErapVEalmZll/1KgPlD0ybBXubaAvp7CZEOZBw8FDFIQ2jIyUevanKvGHmmc33YyXaZnf./

The output is appended to the username (e.g. pi:) and placed in the /boot/userconf.txt file, then we can unmount the partition and boot the system to install the required parts for the USB gadget.

I’ve put the whole thing in a shell script, which also exports the password as an environment variable so it can be picked up by the expect script that actually does the install.

#!/bin/bash

# find the byte offset of the first (boot) partition in the image
OFFSET=$(fdisk -l "$1" | awk '/^[^ ]*1/{ print $2*512 }')
mkdir boot
sudo mount -o loop,offset=$OFFSET "$1" boot

read -rsp "Please enter password for pi user: " PASSWORD
echo
# hash the password and write the userconf.txt entry
PASS=$(echo "$PASSWORD" | openssl passwd -6 -stdin)
echo "pi:$PASS" > userconf.txt

sudo cp userconf.txt boot/userconf.txt
sudo sync

sudo umount boot
rmdir boot

# the expect script run by create-image picks the password up from the environment
export PASSWORD
./create-image "$1"

I’ve updated the git repo with the new script.

Kubernetes Mutating Web Hooks to Configure Ingress

I’m working on a project that dynamically creates Pods/Services/Ingress objects using the Kubernetes API.

This all works pretty well, until we have to support a bunch of different Ingress Controllers. Currently the code supports two different options:

  1. Using the nginx ingress controller with it set as the default IngressClass
  2. Running on AWS EKS with an ALB Ingress Controller.

It does this by having an if block and a settings flag that says it’s running on AWS, where it then injects a bunch of annotations into the Ingress object:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/group.name: flowforge
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}, {"HTTP":80}]'
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
  name: graceful-whiskered-tern-1370
  namespace: flowforge
spec:
...

While this works it doesn’t scale well as we add support for more types of Ingress Controller that require different annotations to configure them, e.g. to use cert-manager to request LetsEncrypt certificates for HTTPS.

Luckily Kubernetes provides a mechanism for modifying objects as they are being created via something called a MutatingAdmissionWebhook. This is an HTTPS endpoint hosted inside the cluster that is passed the object at specific lifecycle events and is allowed to modify that object before it is instantiated by the control plane.

There are a few projects that implement this pattern and allow you to declare rules to be applied to objects, such as KubeMod and Patch Operator from RedHat. I may end up using one of these for the production solution, but this didn’t sound too complex, so I thought I would first have a go at implementing a Webhook myself just to help understand how they work.

Here are the Kubernetes docs on creating Webhooks.

So the first task was to write a simple web app to host the Webhook. I decided to use express.js as that is what I’m most familiar with to get started.

By default the Webhook is a POST to the /mutate route; the body is a JSON object with the AdmissionReview, which has the object being created in the request.object field.

Modifications to the original object need to be sent back as a base64 encoded JSONPatch, wrapped in a response object like this:

{
  apiVersion: admissionReview.apiVersion,
  kind: admissionReview.kind,
  response: {
    uid: admissionReview.request.uid,
    allowed: true,
    patchType: 'JSONPatch',
    patch: Buffer.from(patchString).toString('base64')
  }
}
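
Putting that together, a minimal express.js app for the hook might look something like the sketch below. This is an illustration rather than the actual code from the project: the patch contents, file names and port are all assumptions, and the certificate files come from the CA script later in the post.

const fs = require('fs');
const https = require('https');
const express = require('express');

const app = express();
app.use(express.json());

// the JSONPatch to apply, see the full annotation example below
const patch = [
  {
    op: 'add',
    path: '/metadata/annotations/alb.ingress.kubernetes.io~1scheme',
    value: 'internet-facing'
  }
];

app.post('/mutate', (req, res) => {
  const admissionReview = req.body;
  res.json({
    apiVersion: admissionReview.apiVersion,
    kind: admissionReview.kind,
    response: {
      uid: admissionReview.request.uid,
      allowed: true,
      patchType: 'JSONPatch',
      patch: Buffer.from(JSON.stringify(patch)).toString('base64')
    }
  });
});

// Webhooks must be served over HTTPS
https.createServer({
  key: fs.readFileSync('key.pem'),
  cert: fs.readFileSync('ingress.pem')
}, app).listen(8443);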

The JSONPatch to add the AWS ALB annotations mentioned earlier looks like:

[
  {
    "op":"add",
    "path":"/metadata/annotations/alb.ingress.kubernetes.io~1scheme",
    "value":"internet-facing"
  },
  {
    "op":"add",
    "path":"/metadata/annotations/alb.ingress.kubernetes.io~1target-type",
    "value":"ip"
  },
  {
    "op":"add",
    "path":"/metadata/annotations/alb.ingress.kubernetes.io~1group.name",
    "value":"flowforge"
  },
  {
    "op":"add",
    "path":"/metadata/annotations/alb.ingress.kubernetes.io~1listen-ports",
    "value":"[{\"HTTPS\":443}, {\"HTTP\":80}]"
  }
]

A basic hook that adds the AWS ALB annotations fits in 50 lines of code (and some of that is for HTTPS, which we will get to in a moment).

Webhooks need to be called via HTTPS, which means we need to create a server certificate for the HTTP server. Normally we could use something like LetsEncrypt to generate a certificate, but that will only issue certificates for host names that are publicly resolvable, and since we will be accessing this as a Kubernetes Service its hostname will be something like service-name.namespace. Luckily we can create our own Certificate Authority and issue certificates that match any name we need, because as part of the configuration we can upload our own CA root certificate.

The following script creates a new CA, then uses it to sign a certificate for a service called ingress-mutator and adds all the relevant SAN entries that are needed.

#!/bin/bash
cd ca
rm newcerts/* ca.key ca.crt index index.* key.pem req.pem
touch index

# create the new CA key and self-signed root certificate
openssl genrsa -out ca.key 4096
openssl req -new -x509 -key ca.key -out ca.crt -subj "/C=GB/ST=Gloucestershire/O=Hardill Technologies Ltd./OU=K8s CA/CN=CA"

# create a signing request for the service, with SAN entries for
# every name the service can be reached by inside the cluster
openssl req -new -subj "/C=GB/CN=ingress-mutator" \
    -addext "subjectAltName = DNS.1:ingress-mutator, DNS.2:ingress-mutator.default, DNS.3:ingress-mutator.default.svc, DNS.4:ingress-mutator.default.svc.cluster.local"  \
    -addext "basicConstraints = CA:FALSE" \
    -addext "keyUsage = nonRepudiation, digitalSignature, keyEncipherment" \
    -addext "extendedKeyUsage = serverAuth" \
    -newkey rsa:4096 -keyout key.pem -out req.pem \
    -nodes

# sign the request with the new CA
openssl ca -config ./sign.conf -in req.pem -out ingress.pem -batch

If I was building more than one Webhook I could break out the last 2 lines to generate and sign multiple different certificates.

Now we have the code and the key/certificate pair we can bundle them up in a Docker container that can be pushed to a suitable container registry, and then create the deployment YAML files needed to make all this work.
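
A minimal Dockerfile for this might look like the following sketch (the file names match the illustration above and are assumptions, not the project’s actual layout):

FROM node:18-slim
WORKDIR /usr/src/app

COPY package*.json ./
RUN npm install --omit=dev

# the webhook app plus the key/certificate pair generated above
COPY app.js key.pem ingress.pem ./

EXPOSE 8443
CMD ["node", "app.js"]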

The Pod and Service definitions are all pretty basic, but we also need a MutatingWebhookConfiguration. As well as identifying which service hosts the Webhook it also includes the filter that decides which new objects should be passed to the Webhook to be modified.

apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: ingress-annotation.hardill.me.uk
webhooks:
- name: ingress-annotation.hardill.me.uk
  namespaceSelector:
    matchExpressions:
    - key: kubernetes.io/metadata.name
      operator: In
      values: 
        - flowforge
  rules:
  - apiGroups:   [ "networking.k8s.io" ]
    apiVersions: [ "v1" ]
    resources:   [ "ingresses" ]
    operations:  [ "CREATE" ]
    scope:       Namespaced
  clientConfig:
    service:
      namespace: default
      name: ingress-mutator
      path: /mutate
    caBundle: < BASE64 encoded CA bundle>
  admissionReviewVersions: ["v1"]
  sideEffects: None
  timeoutSeconds: 5
  reinvocationPolicy: "Never"

The rules section says to match all new Ingress objects and the namespaceSelector says to only apply it to objects from the flowforge namespace, to stop us stamping on anything else that might be creating new objects on the cluster.

The caBundle value is the output of cat ca.crt | base64 -w0.

This all worked as expected when deployed. So the next step is to remove the hard coded JSONPatch and make it apply a configurable set of different options based on the target environment.

The code for this is all on GitHub here.

Working with Wild Card Domain Names while Developing

Old Internet Proverb:

It’s not DNS
It can’t be DNS
It was DNS

A recent project uses hostname based routing and generates hostnames on a supplied domain.

In production this is not too tricky (assuming you know how to configure your chosen DNS server), you just set up a wild card DNS entry to point to the host running the ingress reverse proxy.
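
For example, with bind this is a single wildcard record in the zone file (the IP address here is a documentation placeholder):

*.example.com.    IN    A    203.0.113.10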

For development things are a little trickier, especially if you are a bit of a road warrior/digital nomad wandering from place to place and connecting your laptop to lots of different WiFi networks.

For a single hostname you can usually just stick an entry in your OS’s /etc/hosts file that will map the name to a given IP address, but you can’t do that for a whole domain. Also the project I’m working on is container based, so we also have to be careful not to use 127.0.0.1 as the address for the ingress host, because inside a container that will resolve to the container’s local IP stack, rather than the loopback interface of the host machine.

The solution is to set up a DNS server that will respond with a single IP address for any host in the given domain. Luckily there is a package that can do this for us called dnsmasq.

You can use the address=/domain.name/ip.add.re.ss configuration option to map as many domains as you would like to single IP addresses (as well as adding individual hostnames as normal).

For the example setups listed I’m going to use the domain example.com but it will work for anything.

On Linux I’m going to use the IP address 172.17.0.1 because this is the default address bound to the docker0 bridge interface. This is one of the addresses that ports mapped to containers get forwarded from, so will work for accessing the ingress proxy.

Ubuntu

Modern versions of Ubuntu use something called systemd-resolved, which manages which DNS server to use (depending on how many interfaces you have connected to how many different networks); it also runs a caching system to try and reduce the number of lookups that get done.

While this is very useful it does make doing what we want to do here a little trickier, but after a lot of messing around and testing the following works on Ubuntu 22.04, and I’m pretty sure it will work on 20.04 as well.

sudo apt-get install dnsmasq
echo "bind-interfaces" | sudo tee -a /etc/dnsmasq.conf
echo "no-resolv" | sudo tee -a /etc/dnsmasq.conf
echo "conf-dir=/etc/dnsmasq.d" | sudo tee -a /etc/dnsmasq.conf
echo "address=/example.com/172.17.0.1" | sudo tee /etc/dnsmasq.d/02-flowforge.conf
sudo service dnsmasq restart
echo "DNS=127.0.0.1" | sudo tee -a /etc/systemd/resolved.conf
echo "Domains=~example.com" | sudo tee -a /etc/systemd/resolved.conf
sudo service systemd-resolved restart

This does the following steps:

  • Installs dnsmasq
  • Configures dnsmasq to not try and use the system DNS information to forward requests
  • Configures dnsmasq to look in /etc/dnsmasq.d for extra config files
  • Adds a config file with the mapping of example.com to 172.17.0.1
  • Tells systemd-resolved to send all requests for example.com to dnsmasq listening on 127.0.0.1

MacOS

MacOS can work in basically the same way as I’ve just described for Ubuntu. The only differences are that dnsmasq gets installed via Homebrew, and we need to assign a phantom IP address to the loopback adapter because Docker on MacOS doesn’t have the same docker0 bridge interface we can use.

sudo ifconfig lo0 alias 10.128.0.1
brew install dnsmasq
echo "conf-dir=/opt/homebrew/etc/dnsmasq.d" >> /opt/homebrew/etc/dnsmasq.conf
echo "address=/example.com/10.128.0.1" > /opt/homebrew/etc/dnsmasq.d/ff.conf
sudo brew services start dnsmasq
dscacheutil -flushcache
sudo mkdir -p /etc/resolver
sudo tee /etc/resolver/example.com > /dev/null <<EOF
nameserver 127.0.0.1
domain example.com
search_order 1
EOF

These commands do the following:

  • Set 10.128.0.1 as an address bound to the lo0 loopback interface
  • Install dnsmasq
  • Tell dnsmasq to look in the /opt/homebrew/etc/dnsmasq.d directory for extra config files
  • Add a mapping from example.com to IP address 10.128.0.1
  • Set dnsmasq to run as a service
  • Tell the MacOS name system to send all queries for example.com to dnsmasq running on 127.0.0.1

Other options

There is another option which is good in a pinch for testing this sort of setup.

The domain sslip.io has been set up to reply with the IP address found in the hostname, e.g.

  • 127.0.0.1.sslip.io resolves to 127.0.0.1
  • www.127-0-0-1.sslip.io resolves to 127.0.0.1
  • www.--1.sslip.io resolves to ::1
  • fe80--9a6e-9aca-57cc-eea3.sslip.io resolves to fe80::9a6e:9aca:57cc:eea3

You can find more details about sslip.io here, and it includes a link to the source code so you can build and run your own if needed.

Joining the Herd

With the news about what’s happening on the BirdSite… I thought it might be time to explore what the state of the art alternative is these days.

As an aside, this isn’t the first time I’ve looked at Twitter alternatives; back in the very early days of Twitter I built and ran a project called BlueTwit inside IBM. This was a ground-up clone that was there to see how a micro blogging platform (how quaint a term…) would work in a large organisation. It had a Firefox plugin client and supported 250+ characters long before Twitter. The whole thing was written in Java, ran on a modest AMD64 box under my desk for many years and was even played with by the IBM Research team before similar functionality ended up in IBM/Lotus Connections. (Someday I should do a proper write-up about it…)

Anyway, back to the here and now… I have a propensity for wanting to know how things work under the covers (which is why I run my own web, DNS, mail, SIP, finger and gopher servers). So I thought I’d have a go at running my own Mastodon server for a little while to see how it all fits together.

A little digging shows that a Mastodon instance isn’t just one thing, it needs the following:

  • Mastodon Web interface
  • Mastodon Streaming interface
  • Mastodon Job server (sidekiq)
  • A PostgreSQL instance (for persistence)
  • A Redis Instance
  • An Elasticsearch instance (optional)

Given this list of parts trying to run it in a containerised environment made sense. I have both a Docker Compose setup and a Kubernetes setup at home for testing FlowForge on while I’m working, so that wouldn’t be a major problem (though I understand I’m the outlier here). I decided to go with Kubernetes, because that cluster is a bit bigger and I like a challenge.

Deploying to Kubernetes

A bit of Googling turned up that while there isn’t a published Helm chart, there is one included in the project. So I cloned the project to my laptop.

git clone https://github.com/mastodon/mastodon.git

Configuration

I then started to create a local-values.yml file to contain my instance’s specific configuration details. To get a feel for what values I’d need I started by looking in the chart/values.yml file to see what the defaults are and what I could override.

I also started to read the Mastodon server install guide as it had explanations of what each option means.

The first choice was what to call the instance. I went with a suggestion from @andypiper for the server name, but I’ll have the users hosted at my root domain.

This means that the server is called bluetoot.hardill.me.uk and the instance is called hardill.me.uk, so users will be identified for example as @ben@hardill.me.uk. These are configured as the web_domain and local_domain settings respectively.

Next up was setting up the details of my mail server so that things like alerts and password resets can be sent. That was all pretty self explanatory.

The first tricky bit was setting up some secrets for the instance. There are secret keys for authentication tokens and a seed for the OTP setup. The documentation says to use rake secret, but that implies you already have a Ruby environment set up. I don’t work with Ruby so this wasn’t available. A bit more searching suggested that OpenSSL could be used:

openssl rand -hex 64

The next set of secrets are the VAPID public and private keys. Here the documentation again has a rake command, but this time it appears to be a Mastodon specific Gem. To get round this I decided to pull the pre-built Docker image and see if I could run the command in the container.

docker pull tootsuite/mastodon
docker run -it --rm --entrypoint /bin/bash tootsuite/mastodon:latest
mastodon@ab417e0a893a:~$ RAILS_ENV=production bundle exec rake mastodon:webpush:generate_vapid_key
VAPID_PRIVATE_KEY=MdgBNkR98ctXtk3xSTbs7-KJBCcykvvw_q1aFGNfMgY=
VAPID_PUBLIC_KEY=BB-g5Lgund3gYi3UhGGn7Z1Yj06gy4DqdozXQXYxeCDJjpEUW9TXYau7Ifv9xK_676MgUE4JSOSh4XSsroBoHmo=

(These keys are purely for demonstration and have not been used)

I made the decision to skip the Elasticsearch instance to start with, just to try and limit how many resources I need to provide.

There are a couple of other bits I tweaked to make things work with my environment (forcing the bitnami charts to run on the AMD64 nodes rather than the ARM64 ones), but the local-values.yml ended up looking like:

mastodon:
  createAdmin:
    enabled: true
    username: fat_controller
    email: mastodon@example.com
  local_domain: example.com
  web_domain: bluetoot.example.com
  persistence:
    assets:
      accessMode: ReadWriteMany
      resources:
        requests:
          storage: 10Gi
    system:
      accessMode: ReadWriteMany
      resources:
        requests:
          storage: 10Gi
  smtp:
    auth_method: plain
    delivery_method: smtp
    enable_starttls_auto: true
    port: 587
    server: mail.example.com
    login: user
    password: password
    from_address: mastodon@example.com
    domain: example.com

  secrets:
    secret_key_base: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    otp_secret: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    vapid:
      private_key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
      public_key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
elasticsearch:
  enabled: false

ingress:
  annotations:
  tls:
  hosts:
  - host: bluetoot.example.com
    paths:
    - path: '/'

postgresql:
  auth:
    password: xxxxxxx
  primary:
    nodeSelector:
      beta.kubernetes.io/arch: amd64

redis:
  password: xxxxxxx
  master:
    nodeSelector:
      beta.kubernetes.io/arch: amd64
  replica:
    nodeSelector:
      beta.kubernetes.io/arch: amd64

Deploying

The first step is to run helm:

helm upgrade --install --namespace mastodon --create-namespace mastodon ../mastodon/chart -f ./local-values.yml

The first time I ran this it failed to start most of the Mastodon pods with a bunch of errors around the PostgreSQL user password. I tracked it down to a recent Pull Request, so I raised a new issue and a matching PR to fix things. This got merged really quickly, which was very nice to see.

Once I’d modified the helm chart templates a little it deployed cleanly.

Proxying

Next up was setting up my Internet facing web proxy. The Ingress controller on my Kubernetes cluster is not directly exposed to the internet so I needed to add an extra layer of proxying.

First up was to set up a new host entry and renew my LetsEncrypt certificate with the new SAN entry.

location / {
    proxy_pass https://kube-three.local;
    proxy_ssl_verify off;
    proxy_ssl_server_name on;
    proxy_redirect off;
    proxy_read_timeout 900;
    proxy_set_header Host $http_host; # host:$server_port;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-NginX-Proxy true;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "Upgrade";
    proxy_http_version 1.1;
}

It was important to make sure that the proxy target was over https, otherwise the Mastodon web app would trigger a redirect loop. The proxy_ssl_verify off; is because I’m using the default https certificate on the Kubernetes ingress controller, so it would fail the hostname check.

The other bit that needed doing was to add a proxy on the hardill.me.uk server for the .well-known/webfinger path to make sure user discovery works properly.

location /.well-known/webfinger {
    return 301 https://bluetoot.example.com$request_uri;
}

First Login

Now all the proxying and https setup is complete I can point my browser at https://bluetoot.hardill.me.uk and I get the initial sign-in/sign-up screen.

As part of the configuration I created an admin user (fat_controller) but didn’t have a way to log in as I didn’t know what the generated password was. I tried following the lost password flow but couldn’t get it to work, so I followed the documentation about using the admin CLI to do a password reset. I did this by using kubectl to get a shell in the Mastodon web app pod in the cluster.

kubectl -n mastodon exec -it mastodon-web-5dd6764d4-mvwnl /bin/bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
mastodon@mastodon-web-5dd6764d4-mvwnl:~$ RAILS_ENV=production bin/tootctl accounts modify fat_controller --reset-password
OK
New password: fc20b54eddb312bf9462abfec77a27d8

I could then log in as the administrator and update things like the contact email address, and change the default user creation to require approval for new users.

What next?

I’m not sure if I’m going to keep this instance yet. At the moment it’s a testing environment and I’m not sure yet how much I’m going to actually use Mastodon. A lot will depend on what Mr Musk decides to do with his new play thing and if/when the communities I follow move.

But I will probably play with writing a few bots for some of the home automation things I have on the go, e.g. as I write this I’m waiting for a WiFi CO2 monitor to be delivered. Having a bot that Toots the level in my office sounds like a good first project.

Taking a RISC

As a result of the Chip Crisis, getting hold of Raspberry Pi SBCs has been tricky for the last year or so. If you manage to find a retailer that has some in stock before they get cleaned out by the scalpers, they come bundled with PSU, pre-installed SD card, case and HDMI adapters (I understand why, it makes them more attractive to the original target audience).

At the same time there has been a growing number of new SBCs based on the RISC-V architecture. RISC-V is an open CPU architecture that is looking to compete with the closed ARM architecture, which is found just about everywhere (phones, new Macs, watches, set top boxes, …).

So it seemed like a good time to see what boards were available and have a play. I had a bit of a search and found a Sipeed Nezha RV64 Development Board on Amazon. It has the same basic layout as a Pi, all the same ports (plus some extras) and a Debian Linux variant to run on it; the only thing lacking is RAM at 1GB, but that should be enough to get started.

Single Board Computer

Having ordered one, the delivery date was for mid October to early November, but it turned up on the 3rd weekend in September.

Unboxing

We get a reasonably nice box which contains the following:

  • Sipeed Nezha 64bit 1GB Development Board
  • 32GB SD card preloaded with Debian Linux
  • 2 USB-A to USB-C cables
  • USB to Serial break out cable
  • American style USB Power supply
  • 4 Stand off feet

I don’t think I’ll be making use of the USB PSU; unbranded Chinese PSUs like this have a bad reputation and I don’t fancy digging out my UK to US step down transformer. Luckily when the postman delivered this he also delivered one of the new Google Chromecast HDs, which came with a suitable PSU that I don’t need as the TV has a USB socket I can use with the Chromecast.

The USB-A to USB-C cables also look a little strange: the USB-A end has a purple insert, but doesn’t look to have the extra USB-3 tracks, and the data tracks look really thin. I’ll give them a go to power the board, but will probably use one of my existing cables with the USB-C OTG port on the board when I get round to seeing if I can drive it as a USB Gadget like I’ve been doing with Raspberry Pi 4 and Pi Zero boards.

I can’t find a matching case for this board online; while it’s mainly laid out like a Pi, the ports are not quite the same, so most Pi cases would need modifying. I may see if I can find somebody with a 3D printer to run one up for me. In the meantime I’ve screwed in the stand off feet to hold it up off the desk.

Powering up

As the board only has one USB-A socket I stuck a wireless keyboard/mouse dongle in to get started, along with a HDMI cable, ethernet and power.

When I plugged the power in I got absolutely nothing out to the display, but the power LED lit up and the link and data lights on the ethernet socket started to flash. I had a quick look in the DHCP table on my router and found a new device with a hostname of sipeed. I fired up SSH to connect as the sipeed user and entered the password printed on the inside of the box.

This got me logged into a bash shell. I found a shell script in the home directory called test_lcd.sh which, when I examined it, hinted that I might be able to switch between a built in LCD driver and the HDMI output. Running the script as root with the argument hdmi kicked the hooked up display into life.
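
That is, something along the lines of:

$ sudo ./test_lcd.sh hdmi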

X Windows running on half a screen

It appears (after a little searching) that the default X config is for a tablet sized screen in portrait mode, which is why it’s only using the left third of the screen.

I’ll try and find some time to set up the display properly later, but for now we’ve proved nothing is broken.

Next I tried checking for any OS updates with sudo apt-get update, but it threw errors to do with missing GPG keys:

root@sipeed:~# apt-get update
Get:1 http://ftp.ports.debian.org/debian-ports sid InRelease [69.6 kB]
Err:1 http://ftp.ports.debian.org/debian-ports sid InRelease
  The following signatures couldn't be verified because the public key is not available: NO_PUBKEY E852514F5DF312F6
Fetched 69.6 kB in 2s (34.0 kB/s)
Reading package lists... Done
W: An error occurred during the signature verification. The repository is not updated and the previous index files will be used. GPG error: http://ftp.ports.debian.org/debian-ports sid InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY E852514F5DF312F6
W: Failed to fetch http://ftp.ports.debian.org/debian-ports/dists/sid/InRelease  The following signatures couldn't be verified because the public key is not available: NO_PUBKEY E852514F5DF312F6
W: Some index files failed to download. They have been ignored, or old ones used instead.

I’m pretty sure I can import them, just need to find the right key server to pull them from.

(From Reddit)

$ busybox wget -O- https://www.ports.debian.org/archive_2022.key | apt-key add -

Once I get the updates installed I’ll see if I can get NodeJS installed and give Node-RED and FlowForge a go.

That’s a basic sniff test done. I’ll post some more when I’ve had time to poke around further.

I’ve found some Fedora images for this board here that I’ll probably boot up next, then give Ubuntu a go from here.

Full NGINX Logging Example

In a previous post I talked about how to set up NGINX to log both the request and response headers and bodies to help with debugging. It’s a lot easier to set up an HTTP reverse proxy than to sort out all the private keys when trying to capture HTTPS with Wireshark.

I needed to use that again recently and it took me a little while to remember exactly how to put it all together again, so this is just a really short follow up post with a full minimal example of the nginx.conf file needed.

worker_processes 1;
load_module modules/ndk_http_module.so;
load_module modules/ngx_http_lua_module.so;
pcre_jit on;

events {
  worker_connections 1024;
}

error_log /dev/stderr;

http {
  log_format log_req_resp '$remote_addr - $remote_user [$time_local] '
  '"$request" $status $body_bytes_sent '
  '"$http_referer" "$http_user_agent" $request_time req_body:"$request_body" resp_body:"$resp_body" '
  'req_headers:"$req_header" resp_headers:"$resp_header"';

  server {
    listen 80;
    access_log /dev/stdout log_req_resp;

    root /var/www/html;

    lua_need_request_body on;

    set $resp_body "";
    body_filter_by_lua '
      local resp_body = string.sub(ngx.arg[1], 1, 1000)
      ngx.ctx.buffered = (ngx.ctx.buffered or "") .. resp_body
      if ngx.arg[2] then
        ngx.var.resp_body = ngx.ctx.buffered
      end
    ';

    set $req_header "";
    set $resp_header "";
    header_filter_by_lua '
      local h = ngx.req.get_headers()
      for k, v in pairs(h) do
        if (type(v) == "table") then
          ngx.var.req_header = ngx.var.req_header .. k.."="..table.concat(v,",").." "
        else
          ngx.var.req_header = ngx.var.req_header .. k.."="..v.." "
        end
      end
      local rh = ngx.resp.get_headers()
      for k, v in pairs(rh) do
        ngx.var.resp_header = ngx.var.resp_header .. k.."="..v.." "
      end
      ';

  }
}

It also has a small improvement to allow for duplicate HTTP headers in the request (which is in spec). It will now concatenate the values into a comma separated list.

This is intended to be used with the following Docker container (the default nginx container does not have the lua module installed):

FROM debian:latest

RUN apt-get update && apt-get install -y libnginx-mod-http-lua libnginx-mod-http-ndk

CMD ["nginx", "-g", "daemon off;"]

Build and run it as follows:

docker build -t custom-nginx .
docker run --rm -v `pwd`/nginx.conf:/etc/nginx/nginx.conf:ro -p 80:80 custom-nginx

Contributing to Upstream

Over the last couple of weeks I’ve run into a few little bugs/niggles in Open Software components we are using at FlowForge. As a result I have been opening Pull Requests and Feature Requests against those projects to get them fixed.

mosquitto

First up was a really small change for the Mosquitto MQTT broker when using MQTT over WebSockets. In FlowForge v0.8.0 we added support to use MQTT to communicate between different projects in the same team and also as a way for devices to send state back to the core Forge App. To support this we have bundled the Mosquitto MQTT broker as part of the deployment.

When running on Docker or Kubernetes we use MQTT over WebSockets and the MQTT broker is exposed via the same HTTP reverse proxy as the Node-RED projects. I noticed that in the logs all the connections were coming from the IP address of the HTTP reverse proxy, not the actual clients. This makes sense because as far as mosquitto is concerned this is the source of the connection. To work around this the proxy usually adds a HTTP Header with the original client’s IP address as follows:

X-Forwarded-For: 192.168.1.100

Web applications normally have a flag you can set to tell them to trust the proxy to add the correct value and substitute this IP address in any logging. Mosquitto uses the libwebsockets library to handle the set up of the WebSocket connection, and this library supports exposing this HTTP Header when a new connection is created.

I submitted a Pull Request (#2616) to allow mosquitto to make use of this feature, which adds the following code in src/websockets.c here.

...
    if (lws_hdr_copy(wsi, ip_addr_buff, sizeof(ip_addr_buff), WSI_TOKEN_X_FORWARDED_FOR) > 0) {
        mosq->address = mosquitto__strdup(ip_addr_buff);
    } else {
        easy_address(lws_get_socket_fd(wsi), mosq);
    }
...

This will use the HTTP header value if it exists, or fall back to the remote socket address if not.

Roger Light reviewed and merged this for me pretty much straight away and then released mosquitto v2.0.15, which was amazing.

mosquitto-go-auth

To secure our mosquitto broker we make use of the mosquitto-go-auth plugin, which allows us to dynamically create users and ACL entries as we add/remove projects or users from the system. To make life easy this project publishes a Docker container with mosquitto, libwebsockets and the plugin all pre-built and set up to work together.

I had earlier run into a problem with the container not always working well when using MQTT over WebSockets connections. This turned out to be down to the libwebsockets instance in the container not being compiled with a required flag.

To fix this I submitted a Pull Request (#241) that:

  • Updated the version of libwebsockets built into the container
  • Added the missing compile time flag when building libwebsockets
  • Bumped the mosquitto version to the just released v2.0.15

Again the project maintainer was really responsive and got the Pull Request turned round and released in a couple of days.

This means the latest version of the plugin container works properly with MQTT over WebSocket connections and will log the correct IP address of the connecting clients.

grafana

As I mentioned earlier, I spotted the problem with the client IP addresses while looking at the logs from the MQTT broker of our Kubernetes deployment. To gather the logs we make use of Grafana Loki. Loki gathers the logs from all the different components and these can then be interrogated from the Grafana front end.

To deploy Grafana and Loki I made use of the Helm charts provided by the project. This means that all the required parts can be installed and configured with a single command and a file containing the specific customisations required.

One of the customisations is setting up a Kubernetes Ingress entry for the Grafana frontend to make it accessible outside the cluster. The documentation says that you can pass a list of hostnames using the grafana.ingress.hosts key. If you just set this then most things work properly until you start to build dashboards based on the logs, at which point you get some strange behaviour where you get redirected to http://localhost, which doesn’t work. This is because to get the redirects to work correctly you also need to pass the hostname in the grafana."grafana.ini".server.domain setting.
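
In values file terms that means the same hostname has to appear in two places, something like this sketch (the hostname is a placeholder):

grafana:
  ingress:
    enabled: true
    hosts:
      - grafana.example.com
  "grafana.ini":
    server:
      domain: grafana.example.com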

In order to get this to work cleanly you have to know that you need to pass the same hostname in two different places. To try and make this a little simpler for the default basic case I have submitted a PR (#1689) that will take the first entry in the grafana.ingress.hosts list and use it for the value of grafana."grafana.ini".server.domain if there isn’t one provided.

Many thanks to Marcus Noble for helping me with the Go Templating needed for this.

This PR is currently undergoing review and hopefully will be merged soon.

eksctl

And finally, this isn’t a code contribution, but I opened a feature request against WeaveWorks’ eksctl command. This is a tool to create and manage Kubernetes clusters on AWS.

A couple of weeks ago we received a notice from AWS that one of the EC2 machines that makes up the FlowForge Cloud instance was due to be restarted as part of some EC2 planned maintenance.

We had a couple of options as to how to approach this:

  1. Just leave it alone; the nodegroup would automatically replace the node when it was shut down, but this would lead to downtime while the new EC2 instance was spun up.
  2. Add an extra node to the impacted nodegroup and migrate the running workload to the new node once it was fully up and running.

We obviously went with option 2, but this left us with an extra node in the group after the maintenance. eksctl has options to allow the group size to be reduced, but it doesn’t come with a way to say which node should be removed, so there was no way to be sure it would be the (new) node with no pods scheduled that was removed.

I asked a question on AWS’s re:Post forum as to how to remove a specific node and got details of how to do this with the awscli tool. While this works it would be really nice to be able to do it all from eksctl as a single point of contact.
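
For reference, the awscli approach boils down to draining the workload off the node and then terminating that specific EC2 instance while shrinking the group, something like the following (the node name and instance id are placeholders):

kubectl drain ip-192-168-1-100.eu-west-1.compute.internal --ignore-daemonsets
aws autoscaling terminate-instance-in-auto-scaling-group \
    --instance-id i-0123456789abcdef0 \
    --should-decrement-desired-capacity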

I’ve raised a feature request (#5629) asking if this can be added and it’s currently awaiting review and hopefully design and planning.

Home MicroK8s Cluster

I started to write about my home test environment for FlowForge a while ago; having just had to rebuild my K8s cluster due to a node failure I thought I should come back to this and document how I set it up (as much for next time as to share).

Cluster Nodes

I’m using the following nodes (the same three machines described in the Building a Kubernetes Test Environment post below):

  • 2 Raspberry Pi 4s with 8gb of RAM each
  • 1 Intel Celeron based mini PC with 8gb of RAM

Base OS

I’m working with Ubuntu 20.04 as this is the default OS of choice for MicroK8s and it’s available for both x86_64 and ARMv8 for the Raspberry Pi 4.

Installing MicroK8s

$ sudo snap install microk8s --classic --channel=1.24

Once deployed on all 3 nodes we need to pick one of the nodes as the manager. In this case I’m using the Intel Celeron machine as the master and will run the following:

$ microk8s add-node
From the node you wish to join to this cluster, run the following:
microk8s join 192.168.1.59:25000/52bfa563603b3018770f88cadf606920/0e6fa3fb9ed3

Use the '--worker' flag to join a node as a worker not running the control plane, eg:
microk8s join 192.168.1.59:25000/52bfa563603b3018770f88cadf606920/0e6fa3fb9ed3 --worker

If the node you are adding is not reachable through the default interface you can use one of the following:
microk8s join 192.168.1.59:25000/52bfa563603b3018770f88cadf606920/0e6fa3fb9ed3

And then on the other 2 nodes run the following

$ microk8s join 192.168.1.59:25000/52bfa563603b3018770f88cadf606920/0e6fa3fb9ed3 --worker

You can verify the nodes are joined to the cluster with:

$ microk8s.kubectl get nodes
NAME         STATUS   ROLES    AGE    VERSION
kube-two     Ready    <none>   137m   v1.24.0-2+f76e51e86eadea
kube-one     Ready    <none>   138m   v1.24.0-2+f76e51e86eadea
kube-three   Ready    <none>   140m   v1.24.0-2+59bbb3530b6769

Once the nodes are added to the cluster we need to enable a bunch of plugins, on the master node run:

$ microk8s enable dns:192.168.1.xx ingress helm helm3

dns:192.168.1.xx overrides the default of using Google’s 8.8.8.8 DNS server to resolve names outside the cluster. This is important because I want it to point to my local DNS server, as I have set *.flowforge.loc and *.k8s.loc to point to the cluster IP addresses for Ingress.

Install kubectl and helm

By default MicroK8s ships with a bunch of tools baked in; these include kubectl and helm, which can be accessed as microk8s.kubectl and microk8s.helm respectively.

kubectl

Instructions for installing standalone kubectl can be found here. Once installed you can generate the config by running the following on the master node:

$ microk8s config > ~/.kube/config

This can be copied to other machines from which you want to be able to administer the cluster.

helm

Instructions for installing standalone helm can be found here.

This will make use of the same ~/.kube/config credentials file as kubectl.

NFS Persistent Storage

In order to have a consistent persistent storage pool across all 3 nodes I’m using an NFS share from my NAS. This is controlled using the nfs-subdir-external-provisioner, which creates a new directory on the NFS share for each volume created.

All the nodes need to have the NFS client tools installed; this can be achieved with:

$ sudo apt-get install nfs-common

This is deployed using helm:

$ helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
$ helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
    --set nfs.server=192.168.1.7 \
    --set nfs.path=/volume1/kube

To set this as the default StorageClass run the following:

kubectl patch storageclass standard -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

Conclusion

That is enough for the basic Kubernetes cluster setup, there are some FlowForge specific bits that are needed (e.g. tagging nodes) but I’ll leave that for the FlowForge on Kubernetes install docs (which I have to finish writing before the next release).

Building a Kubernetes Test Environment

Over the last couple of weekends I’ve been noodling around with my home lab set up to build a full local environment to test out FlowForge with both the Kubernetes Container and Docker Drivers.

The other reason to put all this together is to help work out the right way to put together a proper CI pipeline to build, automatically test and deploy to our staging environment.

Components

NPM Registry

This is somewhere to push the various FlowForge NodeJS modules so they can then be installed while building the container images for the FlowForge App and the Project Container Stacks.

This is a private registry so that I can push pre-release builds without them slipping out into the public domain, but also so I can delete releases and reuse version numbers, which is not allowed on the public NPM registry.

I’m using the Verdaccio registry as I’ve used this in the past to host custom Node-RED nodes (which it will probably end up doing again in this set up as things move forwards). This runs as a Docker container and I use my Nginx instance to reverse proxy for it.

As well as hosting my private builds it can proxy for the public npmjs.org registry, which speeds up local builds.
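
Standing Verdaccio up is a one liner with Docker; a minimal sketch (4873 is Verdaccio’s default port, the volume path is an assumption):

docker run -d --name verdaccio \
    -p 4873:4873 \
    -v /opt/verdaccio/storage:/verdaccio/storage \
    verdaccio/verdaccio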

Docker Container Registry

This is somewhere to push the Docker containers that represent both the FlowForge app itself and the containers that represent the Project Stacks.

Docker ships a container image that will run a registry.
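
Again a sketch of running it, with an assumed local path for the image store:

docker run -d --name registry \
    -p 5000:5000 \
    -v /opt/registry/data:/var/lib/registry \
    registry:2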

As well as the registry I’m also running a second container with this web UI project to help keep track of what I’ve pushed to the registry; it also allows me to delete tags, which is useful when testing.

Again my internet facing Nginx instance is proxying for both of these (on the same virtual host, since their routes do not clash, and it makes CORS easier since the UI is all browser side JavaScript).

Helm Chart Repository

This isn’t really needed, as you can generate all the required files with the helm command and host the results on any web server, but this lets me test the whole stack end to end.

I’m using a package called ChartMuseum, which will automatically generate the index.yaml manifest file when charts are uploaded via its simple HTTP API.
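
Uploading a chart is then just an HTTP POST of the packaged chart; a sketch assuming ChartMuseum is proxied at a placeholder hostname and the chart lives in an assumed path:

# package the chart, producing e.g. flowforge-0.1.0.tgz
helm package ./helm/flowforge

# upload it to ChartMuseum
curl --data-binary "@flowforge-0.1.0.tgz" https://charts.example.com/api/charts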

Nginx Proxy

All of the previous components have been stood up as virtual hosts on my public Nginx instance so that they can get HTTPS certificates from LetsEncrypt. This makes things a lot easier because both Docker and Kubernetes basically require container registries to be secure by default.

While it is possible to add exceptions for specific registries, these days it’s just easier to do it “properly” up front.
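
For reference, the Docker exception is the insecure-registries entry in /etc/docker/daemon.json, something like this (the hostname is a placeholder):

{
  "insecure-registries": [ "registry.example.com:5000" ]
}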

MicroK8s Cluster

And finally I need a Kubernetes cluster to run all this on. In this case I have a 3 node cluster made up of

  • 2 Raspberry Pi 4s with 8gb of RAM each
  • 1 Intel Celeron based mini PC with 8gb of RAM

All 3 of these are running 64bit Ubuntu 20.04 and MicroK8s. The Intel machine is needed because the de facto standard PostgreSQL Helm Chart only has amd64 based containers at the moment, so it won’t run on the Raspberry Pi based nodes.

The cluster uses the NFS Persistent Volume provisioner to store volumes on my local NAS so they are available to all the nodes.

Usage

I’ll write some more detailed posts about how I’ve configured each of these components and then how I’m using them.

As well as testing the full Helm chart builds, I can also use this to run the FlowForge app locally and have the Kubernetes Container driver running locally on my development machine and have it create Projects in the Kubernetes cluster.