Contributing to Upstream

Over the last couple of weeks I’ve run into a few little bugs/niggles in Open Software components we are using at FlowForge. As a result I have been opening Pull Requests and Feature Requests against those projects to get them fixed.

mosquitto

First up was a really small change for the Mosquitto MQTT broker when using MQTT over WebSockets. In FlowForge v0.8.0 we added support to use MQTT to communicate between different projects in the same team and also as a way for devices to send state back to the core Forge App. To support this we have bundled the Mosquitto MQTT broker as part of the deployment.

When running on Docker or Kubernetes we use MQTT over WebSockets and the MQTT broker is exposed via the same HTTP reverse proxy as the Node-RED projects. I noticed that in the logs all the connections were coming from the IP address of the HTTP reverse proxy, not the actual clients. This makes sense because as far as mosquitto is concerned this is the source of the connection. To work around this the proxy usually adds a HTTP Header with the original clients IP address as follows:

X-Forwarded-For: 192.168.1.100

Web applications normally have a flag you can set to tell them to trust the proxy to add the correct value and substitute this IP address in any logging. Mosquitto uses the libwebsocket library to handle the set up of the WebSocket connection and this library supports exposing this HTTP Header when a new connection is created.

I submitted a Pull Request (#2616) to allow mosquitto to make use of this feature. Which adds the following code in src/websockets.c here.

...
    if (lws_hdr_copy(wsi, ip_addr_buff, sizeof(ip_addr_buff), WSI_TOKEN_X_FORWARDED_FOR) > 0) {
        mosq->address = mosquitto__strdup(ip_addr_buff);
    } else {
        easy_address(lws_get_socket_fd(wsi), mosq);
    }
...

This will use the HTTP header value if it exists, or fallback to the remote socket address if not.

Roger Light reviewed and merged this for me pretty much straight away and then released mosquitto v2.0.15, which was amazing.

mosquitto-go-auth

To secure our mosquitto broker we make use of the mosquitto-go-auth plugin which allows us to dynamically create users and ACL entries as we add/remove projects or users from the system. To make life easy this project publishes a Docker container with mosquitto, libwebsocket and the plugin all pre-built and setup to work together.

I had earlier run into a problem with the container not always working well when using MQTT over WebSockets connections. This turned out to be down to the libwebsockets instance in the container not being compiled with a required flag.

To fix this I submitted a Pull Request (#241) that

  • Updated the version of libwebsockets built into the container
  • Added the missing compile time flag when building libwebsockets
  • Bumped the mosquitto version to the just released v2.0.15

Again the project maintainer was really responsive and got the Pull Request turned round and release in a couple of days.

This means the latest version of the plugin container works properly with MQTT over WebSocket connections and will log the correct IP address of the connecting clients.

grafana

As I mentioned earlier I spotted the problem with the client IP addresses while looking at the logs from the MQTT broker of our Kubernetes deployment. To gather the logs we make use of Grafana Loki. Loki gathers the logs from all the different components and these can then be interrogated from the Gafana front end.

To deploy Grafana and Loki I made use of the Helm charts provided by the project. This means that all the required parts can be installed and configured with a single command and a file containing the specific customisations required.

One of the customisations is the setting up a Kubernetes Ingress entry for the Grafana frontend to make it accessible outside the cluster. The documentation says that you can pass a list of hostnames using the grafana.ingress.hosts key. If you just set this then most things work properly until you start to build dashboards based on the logs at which point you get some strange behaviour where you get redirected to http://localhost which doesn’t work. This is because to get the redirects to work correctly you also need to pass the hostname in the grafana."grafana.ini".server.domain setting.

In order to get this to work cleanly you have to know that you need to pass the same hostname in two different places. To try and make this a little simpler for the default basic case I have submitted a PR (#1689) that will take the first enty in the grafana.ingress.hosts list and use it for the value of grafana."grafana.ini".server.domain if there isn’t one provided.

Many thanks to Marcus Noble for helping me with the Go Templating needed for this.

This PR is currently undergoing review and hopefully will be merged soon.

eksctl

And finally this isn’t a code contribution, but I opened a feature request against WeaveWorks’ eksctl command. This is a tool create and managed Kubernetes clusters on AWS systems.

A couple of weeks ago we received a notice from AWS that one of the EC2 machines that makes up the FlowForge Cloud instance was due to be restarted as part of some EC2 planned maintenance.

We had a couple of options as to how to approach this

  1. Just leave it alone, the nodegroup would automatically replace the node when it was shutdown, but this would lead to downtime as the new EC2 instance was spun up.
  2. Add an extra node to the impacted nodegroup and migrate the running workload to the new node once it was fully up and running.

We obviously went with option 2, but this left us with an extra node in the group after maintenance. eksctl has options to allow the group size to be reduced but it doesn’t come with a way to say which node should be removed, so there was no way to be sure it would be the (new) node with no pods scheduled that is removed.

I asked a question on AWS’s re:Post forum as to how to remove a specific node and got details of how to do this with the awscli tool. While this works it would be really nice to be able to do it all from eksctl as a single point of contact.

I’ve raised a feature request (#5629) asking if this can be added and it’s currently waiting review and hopefully design and planning.

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.