Contributing to Upstream

Over the last couple of weeks I’ve run into a few little bugs/niggles in Open Source components we are using at FlowForge. As a result I have been opening Pull Requests and Feature Requests against those projects to get them fixed.

mosquitto

First up was a really small change for the Mosquitto MQTT broker when using MQTT over WebSockets. In FlowForge v0.8.0 we added support to use MQTT to communicate between different projects in the same team and also as a way for devices to send state back to the core Forge App. To support this we have bundled the Mosquitto MQTT broker as part of the deployment.

When running on Docker or Kubernetes we use MQTT over WebSockets and the MQTT broker is exposed via the same HTTP reverse proxy as the Node-RED projects. I noticed that in the logs all the connections were coming from the IP address of the HTTP reverse proxy, not the actual clients. This makes sense because, as far as mosquitto is concerned, the proxy is the source of the connection. To work around this the proxy usually adds an HTTP header with the original client’s IP address as follows:

X-Forwarded-For: 192.168.1.100

Web applications normally have a flag you can set to tell them to trust the proxy to add the correct value and substitute this IP address in any logging. Mosquitto uses the libwebsockets library to handle the setup of the WebSocket connection, and this library supports exposing this HTTP header when a new connection is created.

I submitted a Pull Request (#2616) to allow mosquitto to make use of this feature. It adds the following code to src/websockets.c:

...
    if (lws_hdr_copy(wsi, ip_addr_buff, sizeof(ip_addr_buff), WSI_TOKEN_X_FORWARDED_FOR) > 0) {
        mosq->address = mosquitto__strdup(ip_addr_buff);
    } else {
        easy_address(lws_get_socket_fd(wsi), mosq);
    }
...

This will use the HTTP header value if it exists, or fall back to the remote socket address if not.
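To make the fallback logic a bit more concrete, here is a minimal standalone sketch of the same idea. It is not the code from the Pull Request: pick_client_address is a hypothetical helper of my own, and taking only the first entry of a comma-separated X-Forwarded-For value (which is what a chain of proxies typically produces) is an assumption that goes beyond what the snippet above does.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of mosquitto or libwebsockets.
 * Picks the address to log for a new connection:
 *   - the first entry of the X-Forwarded-For value, if there is one
 *   - otherwise the peer address of the socket
 * Returns 0 on success, -1 if no address is available or out is too small.
 */
static int pick_client_address(const char *xff, const char *socket_addr,
                               char *out, size_t out_len)
{
    if (out == NULL || out_len == 0) {
        return -1;
    }

    if (xff != NULL) {
        /* Skip any leading whitespace in the header value. */
        while (*xff != '\0' && isspace((unsigned char)*xff)) {
            xff++;
        }
        /* The first entry ends at the first comma or at the end of the string. */
        size_t len = strcspn(xff, ",");
        /* Trim trailing whitespace from that entry. */
        while (len > 0 && isspace((unsigned char)xff[len - 1])) {
            len--;
        }
        if (len > 0) {
            if (len >= out_len) {
                return -1;
            }
            memcpy(out, xff, len);
            out[len] = '\0';
            return 0;
        }
    }

    /* No usable header value, so fall back to the socket address. */
    if (socket_addr == NULL) {
        return -1;
    }
    size_t len = strlen(socket_addr);
    if (len >= out_len) {
        return -1;
    }
    memcpy(out, socket_addr, len + 1);
    return 0;
}
```

Calling it with the header value "192.168.1.100" and a socket address of "10.0.0.1" copies "192.168.1.100" into the output buffer; passing NULL or an empty string for the header gives "10.0.0.1" instead. The header is only worth trusting when the broker can only be reached through the proxy, since a client connecting directly can put anything it likes in there.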

Roger Light reviewed and merged this for me pretty much straight away and then released mosquitto v2.0.15, which was amazing.

mosquitto-go-auth

To secure our mosquitto broker we make use of the mosquitto-go-auth plugin, which allows us to dynamically create users and ACL entries as we add/remove projects or users from the system. To make life easy this project publishes a Docker container with mosquitto, libwebsockets and the plugin all pre-built and set up to work together.

I had earlier run into a problem with the container not always working well when using MQTT over WebSockets connections. This turned out to be down to the libwebsockets instance in the container not being compiled with a required flag.

To fix this I submitted a Pull Request (#241) that

  • Updated the version of libwebsockets built into the container
  • Added the missing compile time flag when building libwebsockets
  • Bumped the mosquitto version to the just released v2.0.15

Again the project maintainer was really responsive and got the Pull Request turned round and released in a couple of days.

This means the latest version of the plugin container works properly with MQTT over WebSocket connections and will log the correct IP address of the connecting clients.

grafana

As I mentioned earlier I spotted the problem with the client IP addresses while looking at the logs from the MQTT broker of our Kubernetes deployment. To gather the logs we make use of Grafana Loki. Loki gathers the logs from all the different components and these can then be interrogated from the Grafana front end.

To deploy Grafana and Loki I made use of the Helm charts provided by the project. This means that all the required parts can be installed and configured with a single command and a file containing the specific customisations required.

One of the customisations is setting up a Kubernetes Ingress entry for the Grafana frontend to make it accessible outside the cluster. The documentation says that you can pass a list of hostnames using the grafana.ingress.hosts key. If you just set this then most things work properly until you start to build dashboards based on the logs, at which point you get some strange behaviour where you are redirected to http://localhost, which doesn’t work. This is because, to get the redirects to work correctly, you also need to pass the hostname in the grafana."grafana.ini".server.domain setting.

In order to get this to work cleanly you have to know that you need to pass the same hostname in two different places. To try and make this a little simpler for the default basic case I have submitted a PR (#1689) that will take the first entry in the grafana.ingress.hosts list and use it for the value of grafana."grafana.ini".server.domain if there isn’t one provided.
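For reference, a values file that sets the hostname in both places might look something like the sketch below. The two keys are the ones named above; grafana.example.com is a placeholder hostname, and the ingress.enabled switch is an assumption about how the Ingress gets turned on, so check it against the chart version you are using.

```yaml
grafana:
  ingress:
    enabled: true              # assumed switch for creating the Ingress
    hosts:
      - grafana.example.com    # placeholder hostname
  grafana.ini:
    server:
      domain: grafana.example.com   # must match the Ingress hostname
```

With the change in the PR, the server.domain entry could be left out in this simple case and the first hostname in the list would be used instead.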

Many thanks to Marcus Noble for helping me with the Go Templating needed for this.

This PR is currently undergoing review and hopefully will be merged soon.

eksctl

And finally, this isn’t a code contribution, but I opened a feature request against WeaveWorks’ eksctl command. This is a tool to create and manage Kubernetes clusters on AWS.

A couple of weeks ago we received a notice from AWS that one of the EC2 machines that makes up the FlowForge Cloud instance was due to be restarted as part of some EC2 planned maintenance.

We had a couple of options as to how to approach this

  1. Just leave it alone. The nodegroup would automatically replace the node when it was shut down, but this would lead to downtime while the new EC2 instance was spun up.
  2. Add an extra node to the impacted nodegroup and migrate the running workload to the new node once it was fully up and running.

We obviously went with option 2, but this left us with an extra node in the group after maintenance. eksctl has options to allow the group size to be reduced but it doesn’t come with a way to say which node should be removed, so there was no way to be sure it would be the (new) node with no pods scheduled that is removed.

I asked a question on AWS’s re:Post forum as to how to remove a specific node and got details of how to do this with the awscli tool. While this works it would be really nice to be able to do it all from eksctl as a single point of contact.

I’ve raised a feature request (#5629) asking if this can be added and it’s currently awaiting review and, hopefully, design and planning.

Open Source Rewards

This is a bit of a rambling piece around some things that have been rattling round in my brain for a while. I’ve written it out mainly to just get it out of my head.

There has been a lot of noise around Open Source projects and cloud companies making huge profits when running these as services, to the extent that some projects have even changed their license to prevent this, to force the cloud providers to publish all the supporting code that allows the projects to be run at scale, or to include new features under non-Open Source licenses.

Most of the cases that have been making the news have been around projects backed by an organisation that also sells support and hosted versions of the project, e.g. Elasticsearch. While I’m sympathetic to the arguments of the projects I’m not sure the license changes work, and the cloud companies do tend to commit developers to the projects (OK, usually with an aim of getting the features they want implemented, but they do fix bugs as well). Steve O’Grady from Redmonk has a good piece about this here.

I’m less interested in the big firms fighting over this sort of thing, I’m more interested in the other end of the scale, the little “guys/gals”.

There have also been cases of single developers who have built core components that underpin huge amounts of Open Source software. This is especially clear in the NodeJS world, where hundreds of tiny npm modules come together to build thousands of more complex modules that then make up everybody’s applications.

A lot of Open Source developers do it for the love, or to practice their skills on things they enjoy. But when a project becomes popular the expectations start to stack up. There are far too many stories of entitled users expecting the same levels of support/service that they would get from a large enterprise when paying for a support contract.

The problem here is when the developer at the bottom of the stack gets fed up with everybody that depends on their module raising bugs and not contributing fixes, or just gets bored and walks away. We end up with what happened to the event-stream module. In this case the dev got bored and handed ownership over to somebody else (the first person who asked), and that somebody later injected a bunch of code that would steal cryptocurrency private keys.

So what are the options to allow a lone developer to work on their projects and get the support they deserve?

Get employed by somebody that depends on your code

If your project is of value to a medium/large organisation it can be in their interests to bring support for that project in house. I’ve seen this happen with projects I’ve been involved with (if only on the periphery) and it can work really well.

The thing that can be tricky is balancing the needs of the community that has grown up around a project and the developer’s new “master”, who may well have their own ideas about what direction the project should take.

I’ve also seen work projects be open sourced and their developers continuing to drive them and get paid to do so.

Set up your own business around the project

This is sort of the ultimate version of the previous one.

It can be done by selling support/services around a project, but it can also make some of the problems I mentioned earlier worse, as some will expect even more now that they are paying for it.

Paypal donations or Github sponsorship/Patreon/Ko-Fi

Adding a link on the project’s About page to a Paypal account or a Patreon/Github sponsorship page can let people show their appreciation for a developer’s work.

Paypal links work well for one-off payments, whereas the Patreon/Github/Ko-Fi sponsorship model is a little bit more of a commitment but can be a good way to cover ongoing costs without needing to charge for a service directly. With a little work the developer can make use of the APIs these platforms provide to offer bespoke content/services to users who choose to donate.

I have included a Paypal link on the about page of some of my projects. I have set the default amount to £5 with the suggestion that I will probably use it to buy myself a beer from time to time.

I have also recently signed up to the Github sponsorship programme to see how it works. Github lets you set different monthly amounts between $1 and $20,000; at this time I only have one level, set to $1.

Adverts/Affiliate links in projects

If you are building a mobile app or run a website then there is always the option of including adverts in the page/app. With this approach the user of the project doesn’t have to do anything apart from put up with some hopefully relevant adverts.

There is a balance that has to be struck with this, as too many adverts, or adverts for irrelevant things, can annoy users. I do occasionally post Amazon affiliate links in blog posts and I keep track of how much I’ve earned on the about page.

This is not just a valid model for open source projects. Many (most) mobile games have adopted this sort of model, even if it is just as a starting tier before either allowing users to pay a fee to remove the ads or to buy in-game content.

Amazon Wishlists

A slightly different approach is to publish a link to something like an Amazon wishlist. This allows users to buy developers a gift as a token of appreciation. The list allows the developer to get things they actually want and to select a range of items at different price points.

Back when Amazon was closer to its roots as an online book store (and people still read books to learn new things) it was a great way to get a book about a new subject to start a new project.

Other random thoughts

For another very interesting take on some of this please watch this video from Tom Scott for the Royal Institution about Science Communication in the world of Youtube and social media. It has a section in the middle about Parasocial Relationships which is really interesting (as is the rest of the video) in this context.

Conclusion

I don’t really have one at the moment. As I said at the start, this is a bit of a stream of consciousness post.

I do think that there isn’t a one-size-fits-all model, nor are the options I’ve listed above all of them; they were just the ones that came to mind as I typed.

If I come up with anything meaningful, I’ll do a follow-up post. Also, if somebody wants to sponsor me $20,000 a month on Github to come up with something, drop me a line ;-).