Contributing to Upstream

Over the last couple of weeks I’ve run into a few little bugs/niggles in Open Software components we are using at FlowForge. As a result I have been opening Pull Requests and Feature Requests against those projects to get them fixed.

mosquitto

First up was a really small change for the Mosquitto MQTT broker when using MQTT over WebSockets. In FlowForge v0.8.0 we added support to use MQTT to communicate between different projects in the same team and also as a way for devices to send state back to the core Forge App. To support this we have bundled the Mosquitto MQTT broker as part of the deployment.

When running on Docker or Kubernetes we use MQTT over WebSockets and the MQTT broker is exposed via the same HTTP reverse proxy as the Node-RED projects. I noticed that in the logs all the connections were coming from the IP address of the HTTP reverse proxy, not the actual clients. This makes sense because as far as mosquitto is concerned this is the source of the connection. To work around this the proxy usually adds a HTTP Header with the original clients IP address as follows:

X-Forwarded-For: 192.168.1.100

Web applications normally have a flag you can set to tell them to trust the proxy to add the correct value and substitute this IP address in any logging. Mosquitto uses the libwebsocket library to handle the set up of the WebSocket connection and this library supports exposing this HTTP Header when a new connection is created.

I submitted a Pull Request (#2616) to allow mosquitto to make use of this feature. Which adds the following code in src/websockets.c here.

...
    if (lws_hdr_copy(wsi, ip_addr_buff, sizeof(ip_addr_buff), WSI_TOKEN_X_FORWARDED_FOR) > 0) {
        mosq->address = mosquitto__strdup(ip_addr_buff);
    } else {
        easy_address(lws_get_socket_fd(wsi), mosq);
    }
...

This will use the HTTP header value if it exists, or fallback to the remote socket address if not.

Roger Light reviewed and merged this for me pretty much straight away and then released mosquitto v2.0.15, which was amazing.

mosquitto-go-auth

To secure our mosquitto broker we make use of the mosquitto-go-auth plugin which allows us to dynamically create users and ACL entries as we add/remove projects or users from the system. To make life easy this project publishes a Docker container with mosquitto, libwebsocket and the plugin all pre-built and setup to work together.

I had earlier run into a problem with the container not always working well when using MQTT over WebSockets connections. This turned out to be down to the libwebsockets instance in the container not being compiled with a required flag.

To fix this I submitted a Pull Request (#241) that

  • Updated the version of libwebsockets built into the container
  • Added the missing compile time flag when building libwebsockets
  • Bumped the mosquitto version to the just released v2.0.15

Again the project maintainer was really responsive and got the Pull Request turned round and release in a couple of days.

This means the latest version of the plugin container works properly with MQTT over WebSocket connections and will log the correct IP address of the connecting clients.

grafana

As I mentioned earlier I spotted the problem with the client IP addresses while looking at the logs from the MQTT broker of our Kubernetes deployment. To gather the logs we make use of Grafana Loki. Loki gathers the logs from all the different components and these can then be interrogated from the Gafana front end.

To deploy Grafana and Loki I made use of the Helm charts provided by the project. This means that all the required parts can be installed and configured with a single command and a file containing the specific customisations required.

One of the customisations is the setting up a Kubernetes Ingress entry for the Grafana frontend to make it accessible outside the cluster. The documentation says that you can pass a list of hostnames using the grafana.ingress.hosts key. If you just set this then most things work properly until you start to build dashboards based on the logs at which point you get some strange behaviour where you get redirected to http://localhost which doesn’t work. This is because to get the redirects to work correctly you also need to pass the hostname in the grafana."grafana.ini".server.domain setting.

In order to get this to work cleanly you have to know that you need to pass the same hostname in two different places. To try and make this a little simpler for the default basic case I have submitted a PR (#1689) that will take the first enty in the grafana.ingress.hosts list and use it for the value of grafana."grafana.ini".server.domain if there isn’t one provided.

Many thanks to Marcus Noble for helping me with the Go Templating needed for this.

This PR is currently undergoing review and hopefully will be merged soon.

eksctl

And finally this isn’t a code contribution, but I opened a feature request against WeaveWorks’ eksctl command. This is a tool create and managed Kubernetes clusters on AWS systems.

A couple of weeks ago we received a notice from AWS that one of the EC2 machines that makes up the FlowForge Cloud instance was due to be restarted as part of some EC2 planned maintenance.

We had a couple of options as to how to approach this

  1. Just leave it alone, the nodegroup would automatically replace the node when it was shutdown, but this would lead to downtime as the new EC2 instance was spun up.
  2. Add an extra node to the impacted nodegroup and migrate the running workload to the new node once it was fully up and running.

We obviously went with option 2, but this left us with an extra node in the group after maintenance. eksctl has options to allow the group size to be reduced but it doesn’t come with a way to say which node should be removed, so there was no way to be sure it would be the (new) node with no pods scheduled that is removed.

I asked a question on AWS’s re:Post forum as to how to remove a specific node and got details of how to do this with the awscli tool. While this works it would be really nice to be able to do it all from eksctl as a single point of contact.

I’ve raised a feature request (#5629) asking if this can be added and it’s currently waiting review and hopefully design and planning.

Home MicroK8s Cluster

I started to write about my home test environment for FlowForge a while ago, having just had to rebuild my K8s cluster due to a node failure I thought I should come back to this and document how I set it up (as much for next time as to share).

Cluster Nodes

I’m using the following nodes

Base OS

I’m working with Ubuntu 20.04 as this is the default OS of choice for MicroK8s and it’s available for both x86_64 and Arm8 for the Raspberry Pi 4.

Installing MicroK8s

$ sudo snap install microk8s --classic --channel=1.24

Once deployed on all 3 nodes, then we need to pick one of the nodes as the manager. In this case I’m using the Intel Celeron machine as the master and will run the following:

$ microk8s add-node
From the node you wish to join to this cluster, run the following:
microk8s join 192.168.1.59:25000/52bfa563603b3018770f88cadf606920/0e6fa3fb9ed3

Use the '--worker' flag to join a node as a worker not running the control plane, eg:
microk8s join 192.168.1.59:25000/52bfa563603b3018770f88cadf606920/0e6fa3fb9ed3 --worker

If the node you are adding is not reachable through the default interface you can use one of the following:
microk8s join 192.168.1.59:25000/52bfa563603b3018770f88cadf606920/0e6fa3fb9ed3

And then on the other 2 nodes run the following

$ microk8s join 192.168.1.59:25000/52bfa563603b3018770f88cadf606920/0e6fa3fb9ed3 --worker

You can verify the nodes are joined to the cluster with:

$ microk8s.kubectl get nodes
NAME         STATUS   ROLES    AGE    VERSION
kube-two     Ready    <none>   137m   v1.24.0-2+f76e51e86eadea
kube-one     Ready    <none>   138m   v1.24.0-2+f76e51e86eadea
kube-three   Ready    <none>   140m   v1.24.0-2+59bbb3530b6769

Once the nodes are added to the cluster we need to enable a bunch of plugins, on the master node run:

$ microk8s enable dns:192.168.1.xx ingress helm helm3

dns:192.168.1.xx overrides the default of using Google’s 8.8.8.8 DNS server to resolve names outside the cluster. This is important because I want it to point to my local DNS as I have set *.flowforge.loc and *.k8s.loc to point to the cluster IP addresses for Ingress.

Install kubectl and helm

By default Microk8s ships with a bunch of tools baked in, these include kubectl and helm that can be accessed as microk8s.kubectl and microk8s.helm respectively.

kubectl

Instructions for installing standalone kubectl can be found here. Once installed you can generate the config by running the following on the master node:

$ microk8s config > ~/.kube/config

This can be copied to other machines that you want to be able to administrate the cluster.

helm

Instructions for installing helm can be found standalone here.

This will make use of the same ~/.kube/config credentials file as kubectl.

NFS Persistent Storage

In order to have a consistent persistence storage pool across all 3 nodes I’m using a NFS share from my NAS. This is controlled using the nfs-subdir-external-provisioner. This creates a new directory on the NFS share for each volume created.

All the nodes need to have all the NFS client tools installed, this can be achieved with:

$ sudo apt-get install nfs-common

This is deployed using helm

$ helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
$ helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
    --set nfs.server=192.168.1.7 \
    --set nfs.path=/volume1/kube

To set this as the default StorageClass run the following:

kubectl patch storageclass standard -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

Conclusion

That is enough for the basic Kubernetes cluster setup, there are some FlowForge specific bits that are needed (e.g. tagging nodes) but I’ll leave that for the FlowForge on Kubernetes install docs (which I have to finish writing before the next release).

Building a Kubernetes Test Environment

Over the last couple of weekends I’ve been noodling around with my home lab set up to build a full local environment to test out FlowForge with both the Kubernetes Container and Docker Drivers.

The other reason to put all this together is to help to work the right way to put together a proper CI pipeline to build, automatically test and deploy to our staging environment.

Components

NPM Registry

This is somewhere to push the various FlowForge NodeJS modules so they can then be installed while building the container images for the FlowForge App and the Project Container Stacks.

This is a private registry so that I can push pre-release builds without them slipping out in to the public domain, but also so I can delete releases and reuse version numbers which is not allowed on the public NPM registry.

I’m using the Verdaccio registry as I’ve used this in the past to host custom Node-RED nodes (which it will probably end up doing again in this set up as things move forwards). This runs as Docker container and I use my Nginx instance to reverse proxy for it.

As well as hosting my private builds it can proxy for the public npmjs.org regisry which speeds up local builds.

Docker Container Registry

This is somewhere to push the Docker containers that represent both the FlowForge app it’s self and the containers that represent the Project Stacks.

Docker ship a container image available that will run a registry.

As well as the registry I’m also running second container with this web UI project to help keep track of what I’ve pushed to the registry and also allows me to delete tags which is useful when testing

Again my internet facing Nginx instance is proxying for both of these (on the same virtual host since their routes do not clash and it makes CORS easier since the UI is all browser side JavaScript)

Helm Chart Repository

This isn’t really needed, as you can generate all the required files with the helm command and host the results on any Web server, but this lets me test the whole stack end to end.

I’m using a package called ChartMuseum which will automatically generate index.yaml manifest file when charts are uploaded via it’s simple UI.

Nginx Proxy

All of the previous components have been stood up as virtual hosts on my public Nginx instance so that they can get HTTPS certificates from LetsEncrypt. This is makes things a lot easier because both Docker and Kubernetes basically require the container registry be secure by default.

While it is possible to add exceptions for specific registries, these days it’s just easier to do it “properly” up front.

MicroK8s Cluster

And finally I need a Kubernetes cluster to run all this on. In this case I have a 3 node cluster made up of

  • 2 Raspberry Pi 4s with 8gb of RAM each
  • 1 Intel Celeron based mini PC with 8gb of RAM

All 3 of these are running 64bit Ubuntu 20.04 and MicroK8s. The Intel machine is needed at the moment because the de facto standard PostrgresSQL Helm Chat only have amd64 based containers at the moment so won’t run on the Raspberry Pi based nodes.

The cluster uses the NFS Persistent Volume provisioner to store volumes on my local NAS so they are available to all the nodes.

Usage

I’ll write some more detailed posts about how I’ve configured each of these components and then how I’m using them.

As well as testing the full Helm chart builds, I can also use this to run the FlowForge app locally and have the Kubernetes Container driver running locally on my development machine and have it create Projects in the Kubernetes cluster.

FlowForge v0.1.0

So it’s finally time to talk a bit more about what I’ve been up to for the last few months since joining FlowForge Inc.

The FlowForge platform is a way to manage multiple instances of Node-RED at scale and to control user access to those instances.

The platform comes with 3 different backend drivers

  • LocalFS
  • Docker Compose
  • Kubernetes

LocalFS

This is the driver to use for evaluating the platform or as a home user that doesn’t want to install all the overhead that is required for the other 2 drivers. I starts Projects (Node-RED instances) as separate processes on the same machine and runs each one on a separate port. It keeps state in a local SQLite database.

Docker Compose

This version is a little more complicated, it uses the Docker runtime to start containers for the FlowForge runtime, a PostgreSQL database and Nginx reverse proxy. Each Project lives in it’s own container and is accessed by a unique hostname prepended to a supplied hostname. This can still run on a single machine (or multiple if Docker Swarm mode is used)

Kubernetes

This is the whole shebang, similar to Docker Compose the FlowForge platform all runs in containers and the Projects end up in their own containers. But the Kubernetes platform provides more ways to manage the resources behind the containers and to scale to even bigger deployments.

Release

Today we have released version 0.1.0 and made all the GitHub projects public.

The initial release is primarily focused on getting the core FlowForge platform out there for feedback and we’ve tried to make the LocalFS install experience as smooth as possible. There are example installers for the Docker and Kubernetes drivers but the documentation around these will improve very soon.

You can read the official release announcement here which has a link to the installer and also includes a walk through video.