Proxying for Multiple Node-RED instances

So far we have worked out how to set up Node-RED to store flows in a database, use authentication to prevent unauthorised access and how to start multiple containerised instances under Docker.

In this post I will cover how to expose those multiple instances so their users can access them.

The easiest way to do this is to stick something like Nginx or Traefik in front of the Docker containers and have it act as a reverse proxy. There are two ways we can set this up (both sketched below):

  • Virtual Host based proxying – where each instance has its own hostname e.g. http://r1.example.com
  • Path based proxying – where each instance has a root path on the same hostname e.g. http://example.com/r1
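
To give a feel for the difference, here is a rough sketch of what the two approaches look like in hand-written nginx config, assuming a Node-RED container reachable as r1 on port 1880 (names are illustrative, and a real config would also need the WebSocket upgrade headers):

server {
  # Virtual host based: each instance gets its own server_name
  listen 80;
  server_name r1.example.com;
  location / {
    proxy_pass http://r1:1880;
  }
}

server {
  # Path based: each instance hangs off a path on a shared hostname
  listen 80;
  server_name example.com;
  location /r1/ {
    proxy_pass http://r1:1880/;
  }
}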

In this case I’m going to use the first option, virtual hosts, because Node-RED currently uses browser local storage to hold the Admin API security token and this is scoped to the host the editor is loaded from. That means you can only access one instance at a time if you use path based proxying. This is on the Node-RED backlog to be fixed soon.

To do that I’m going to use the nginx-proxy container. This container runs in the same Docker environment as the Node-RED instances and monitors the Docker daemon for containers starting and stopping. When it sees a new container start up it automatically creates the right entry in the nginx configuration files and triggers nginx to reload the config.
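
For completeness, the nginx-proxy container itself needs access to the Docker socket and to publish port 80; starting it looks roughly like this (based on the project’s README, adjust to taste):

$ docker run -d -p 80:80 -v /var/run/docker.sock:/tmp/docker.sock:ro --name nginx-proxy jwilder/nginx-proxy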

To make this work I needed to add an extra environment variable to the command used to start the Node-RED containers

$ docker run -d --rm -e VIRTUAL_HOST=r1.example.com -e MONGO_URL=mongodb://mongodb/nodered -e APP_NAME=r1 --name r1 --link mongodb custom-node-red

I added the VIRTUAL_HOST environment variable which contains the hostname to use for this container. This means I can access this specific instance of Node-RED on http://r1.example.com.

Traefik can be run in a similar way using labels instead of environment variables.
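
For example, with Traefik v2 already running against the same Docker socket, the rough equivalent of the VIRTUAL_HOST variable would be a couple of labels (router/service names here are illustrative):

$ docker run -d --rm \
    --label 'traefik.http.routers.r1.rule=Host(`r1.example.com`)' \
    --label 'traefik.http.services.r1.loadbalancer.server.port=1880' \
    -e MONGO_URL=mongodb://mongodb/nodered -e APP_NAME=r1 \
    --name r1 --link mongodb custom-node-red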

To make it all work smoothly I’ve added a wildcard domain entry to my local DNS that maps anything matching *.example.com to the docker-pi.local machine that the containers are running on.
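
If the local DNS happens to be dnsmasq, a single line is enough for this (assuming 192.168.1.100 is the address of the docker-pi.local machine; both are placeholders here):

# /etc/dnsmasq.conf - answer for example.com and everything under it
address=/example.com/192.168.1.100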

Security

If I was going to run this exposed to the internet I’d probably want to enable HTTPS. To do this there are again 2 options:

  • Use a separate certificate for each Virtual Host
  • Use a wildcard certificate that matches the wildcard DNS entry

I would probably go with the second option here as it is just one certificate to manage, and even LetsEncrypt will issue wildcard certificates these days if you can prove control of the DNS.
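
Issuing a wildcard certificate needs the DNS-01 challenge; with certbot that looks something like the following (the manual plugin prompts for a TXT record to be created, or a DNS provider plugin can automate it):

$ certbot certonly --manual --preferred-challenges dns -d '*.example.com' -d example.com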

For the first option there is a companion docker container for nginx-proxy that will use LetsEncrypt to issue certificates for each Virtual Host as it starts. It’s called letsencrypt-nginx-proxy-companion and you can read how to use it in the nginx-proxy README.md.
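
Roughly, the companion runs alongside nginx-proxy sharing its volumes and the Docker socket, and each proxied container gets a LETSENCRYPT_HOST variable; something along these lines (see the companion’s README for the full set of volumes it needs):

$ docker run -d --name letsencrypt-companion --volumes-from nginx-proxy -v /var/run/docker.sock:/var/run/docker.sock:ro jrcs/letsencrypt-nginx-proxy-companion
$ docker run -d --rm -e VIRTUAL_HOST=r1.example.com -e LETSENCRYPT_HOST=r1.example.com -e MONGO_URL=mongodb://mongodb/nodered -e APP_NAME=r1 --name r1 --link mongodb custom-node-red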

Limitations

Exposing Node-RED via a HTTP proxy does have one drawback: only HTTP requests can directly reach the instances.

While this helps to offer a little more security, it also means that you will not be able to use the TCP-in or UDP-in nodes in server mode, which would allow arbitrary network connections into the instance. You will still be able to connect out from the instances to other hosts, as Docker provides NAT routing from containers to the outside world.

Sidebar

I’m testing all this on a Raspberry Pi 4 running the beta of 64bit Raspberry Pi OS. I need this to get the official MongoDB container to work, as they only formally support 64bit. As a result I had to modify and rebuild the nginx-proxy container because it only ships with support for the AMD64 architecture. I had to build ARM64 versions of the forego and docker-gen binaries and manually copy these into the container.
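
Both forego and docker-gen are Go programs, so cross compiling them is mostly a case of setting the target architecture; roughly something like this (a sketch only, the exact build steps for each project may differ):

# run in a checkout of the forego or docker-gen source tree
$ GOOS=linux GOARCH=arm64 go build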

There is an outstanding pull-request open against the project to use a multi-stage build, which will build target-specific binaries of forego and docker-gen and fix this.

Hostname Based Proxying with MQTT

An interesting question came up on Stack Overflow recently for which I suggested a hypothetical answer about how to do hostname based proxying for MQTT.

In this post I’ll explore how to actually implement that hypothetical solution.

History

HTTP added the ability to do hostname based proxying when it introduced the Host header in HTTP v1.1. This meant that a single IP address could be used for many sites and the server would decide which content to serve based on this header. Front end reverse proxies (e.g. nginx) can use the same header to decide which backend server to forward the traffic to.

This works well until we need to encrypt the traffic to the HTTP server using SSL/TLS, because the headers are then encrypted too. The solution is the SNI field in the TLS handshake, which tells the server which hostname the client is trying to connect to. The front end proxy can then either use this information to find the right local copy of the certificate/key for that site, if it’s terminating the encryption at the frontend, or it can forward the whole connection directly to the correct backend server.
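
You can see SNI in action with openssl’s s_client; the -servername flag is what sets the SNI value in the handshake, and piping the output through x509 shows which certificate the server presented for that name (hostnames here are just placeholders):

$ openssl s_client -connect proxy.example.com:443 -servername test1.example.com </dev/null | openssl x509 -noout -subject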

MQTT

Since the SNI field is in the initial TLS handshake and has nothing to do with the underlying protocol, it can be used for any protocol, in this case MQTT. This means we can set up a frontend proxy that uses SNI to pick the correct backend server to connect to.

Here is an nginx configuration file that proxies for two different MQTT brokers based on the hostname the client uses to connect. It does the TLS termination at the proxy before forwarding the unencrypted traffic to the backend.

user  nginx;
worker_processes  auto;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;

events {
    worker_connections  1024;
}

stream {
  # Pick the backend broker based on the SNI hostname in the TLS handshake
  map $ssl_server_name $targetBackend {
    test1.example.com  192.168.1.1:1883;
    test2.example.com  192.168.1.2:1883;
  }

  # Pick the matching certificate chain and key for each hostname
  map $ssl_server_name $targetCert {
    test1.example.com /certs/test1-chain.crt;
    test2.example.com /certs/test2-chain.crt;
  }

  map $ssl_server_name $targetCertKey {
    test1.example.com /certs/test1.key;
    test2.example.com /certs/test2.key;
  }

  server {
    # Terminate TLS here and forward the decrypted stream to the chosen broker
    listen 1883         ssl;
    ssl_protocols       TLSv1.2;
    ssl_certificate     $targetCert;
    ssl_certificate_key $targetCertKey;

    proxy_connect_timeout 1s;
    proxy_pass $targetBackend;
  }
}

Assuming the DNS entries for test1.example.com and test2.example.com both point to the host running nginx, we can test this with the mosquitto_sub command as follows:

$ mosquitto_sub -v -h test1.example.com -t test --cafile ./ca-certs.pem

This will be proxied to the broker running on 192.168.1.1, whereas

$ mosquitto_sub -v -h test2.example.com -t test --cafile ./ca-certs.pem

will be proxied to the broker on 192.168.1.2.
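
The other approach mentioned above, forwarding the whole TLS connection to the backend without terminating it at the proxy, can be done with nginx’s ssl_preread module. A minimal sketch, assuming the backend brokers listen for TLS themselves on 8883 and hold their own certificates:

stream {
  # route on the SNI name without decrypting anything
  map $ssl_preread_server_name $targetBackend {
    test1.example.com  192.168.1.1:8883;
    test2.example.com  192.168.1.2:8883;
  }

  server {
    listen 8883;
    ssl_preread on;
    proxy_pass $targetBackend;
  }
}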

Caveats

The main drawback with this approach is that it requires all the clients to connect using TLS, but this is not a huge problem as nearly all devices are capable of it now, and for any internet facing service it should be the default anyway.

Acknowledgment

How to do this was mainly informed by the following Gist

nginx-proxy-avahi-helper

I’ve been playing with the jwilder/nginx-proxy docker container recently. It automatically generates reverse proxy entries for other containers.

It does this by monitoring when new containers are started and then inspecting the environment variables attached to the new container. If the container has the VIRTUAL_HOST variable it uses its value as the virtual host for nginx to proxy access to the container.

Normally for this type of setup you would set up a wildcard DNS entry that points to the docker host, so DNS lookups for any machine in that domain will return the same IP address.

If all the virtual hosts are in the example.com domain, e.g. foo.example.com and bar.example.com, you would set up the *.example.com DNS entry to point to the docker host's IP address.

When working on my home LAN I normally use mDNS to access local machines, so there is nowhere to set up the wildcard DNS entry. Instead I have built a container that adds CNAME mDNS entries for the docker host, one for each of the virtual hosts.

In my case Docker is running on a machine called docker-pi.local and I’m using that as the root domain, e.g. manager.docker-pi.local or registry.docker-pi.local.

The container uses the docker-gen application, which uses templates to generate configuration files based on the running containers. In this case it generates a list of virtual hosts and writes them to a file.
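
A minimal docker-gen template for this can be just a couple of lines; something like the following (using the groupByMulti helper that the nginx-proxy template also uses) writes one VIRTUAL_HOST value per line:

{{ range $host, $containers := groupByMulti $ "Env.VIRTUAL_HOST" "," }}{{ $host }}
{{ end }}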

The file is read by a small Python application that connects to D-Bus to talk to the avahi daemon on the docker host and configures the CNAMEs.

Both Docker and D-Bus use Unix domain sockets as their transport, so you have to mount the sockets from the host into the container.

$ docker run -d -v /run/dbus/system_bus_socket:/run/dbus/system_bus_socket -v /var/run/docker.sock:/tmp/docker.sock --name avahi-cname hardillb/nginx-proxy-avahi-helper

I’ve put the code on GitHub here and the container is on Docker Hub here.

Creating Fail2Ban rules

I recently installed some log charting software to get a view of the NGINX logs for the instance that serves up this site. While looking at some of the output I found some strange results, so I went to have a closer look at the actual logs.

Chart showing hits/visitor to site

There were a whole bunch of entries with a similar pattern, something like the following:

94.130.58.174 - - [15/Jul/2020:11:43:39 +0100] "CONNECT rate-market.ru:443 HTTP/1.1" 400 173 "-" "-"

CONNECT is an HTTP verb that is used by HTTP proxy clients to tell the proxy server to set up a tunnel to the requested machine. This is needed because if the client is trying to connect to an HTTP server that uses SSL/TLS then it needs to do the SSL/TLS handshake directly with that server to ensure it has the correct certificates.
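
For reference, the raw request such a client sends looks something like this; a real proxy would open a TCP connection to the target and then blindly relay bytes in both directions:

CONNECT rate-market.ru:443 HTTP/1.1
Host: rate-market.ru:443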

Since my instance of NGINX is not configured to be an HTTP proxy server it is rightly rejecting these requests, as can be seen by the 400 (Bad Request) after the request string. Even though the clients making these requests are all getting the 400 error, they continue to send the same request over and over again.

I wanted to filter these bogus requests out for 2 reasons:

  • They clog the logs up, making it harder to see real problems
  • They take up resources to process

The solution is to use a tool called fail2ban, which watches log files for signs of abuse and then sets up the correct firewall rules to prevent the remote clients from being able to connect.

I already run fail2ban monitoring the SSH logs on all my machines that face the internet, as there are plenty of scripts out there trying standard username/password combinations, attempting to log into any machine listening on port 22.

Fail2ban comes with a bunch of log filters to work with well known applications, but the ones bundled on Raspbian (Raspberry Pi OS) at this time didn’t include one that would match this particular pattern. Writing filters is not that hard, but it does require writing regular expressions, which is a bit of a dark art. The trick is to get a regex that only matches what we want.

The one I came up with is this:

^<HOST> - - \[.*\] \"CONNECT .* HTTP/1\.[0-1]\" 400

Let’s break that down into its specific parts:

  • ^ this matches the start of the log line
  • <HOST> fail2ban has some special tags that match an IP address or Host name
  • - - this matches the 2 dashes in the log
  • \[.*\] this matches the date/timestamp in the square brackets, I could have matched the date more precisely, but we don’t need to in this case. We have to escape the [ as it has meaning in regex as we will see later.
  • \"CONNECT this matches the start of the request string and limits it to just CONNECT requests
  • .* this matches the host:port combination that is being requested
  • HTTP/1\.[0-1]\" here we match both HTTP/1.0 and HTTP/1.1. We need to escape the . and the square brackets let us specify a range of values
  • 400 and finally the 400 error code

Now we have the regex we can test it with the fail2ban-regex command to make sure it finds what we expect.

# fail2ban-regex /var/log/nginx/access.log '^<HOST> - - \[.*\] \"CONNECT .* HTTP/1\.[0-1]\" 400' 

Running tests
=============

Use   failregex line : ^<HOST> - - \[.*\] \"CONNECT .* HTTP/1\.[0-1]\" 400
Use         log file : /var/log/nginx/access.log
Use         encoding : UTF-8


Results
=======

Failregex: 12819 total
|-  #) [# of hits] regular expression
|   1) [12819] ^<HOST> - - \[.*\] \"CONNECT .* HTTP/1\.[0-1]\" 400
`-

Ignoreregex: 0 total

Date template hits:
|- [# of hits] date format
|  [526165] Day(?P<_sep>[-/])MON(?P=_sep)ExYear[ :]?24hour:Minute:Second(?:\.Microseconds)?(?: Zone offset)?
`-

Lines: 526165 lines, 0 ignored, 12819 matched, 513346 missed
[processed in 97.73 sec]

Missed line(s): too many to print.  Use --print-all-missed to print all 513346 lines

We can see that the regex matched 12819 lines out of 526165 which is in the right ball park. Now we have the right regex we can build the fail2ban config files.

First up is the filter file that will live in /etc/fail2ban/filter.d/nginx-connect.conf

# Fail2Ban filter to match HTTP Proxy attempts
#

[INCLUDES]

[Definition]

failregex = ^<HOST> - - \[.*\] \"CONNECT .* HTTP/1\.[0-1]\" 400

ignoreregex = 

Next is the jail file in /etc/fail2ban/jail.d/nginx-connect.conf. This tells fail2ban what to do if it finds log lines that match the filter, and where to find the logs.

[nginx-connect]
port	= http,https
enabled	= true
logpath = %(nginx_access_log)s
action	= %(action_mwl)s

The port entry is which ports to block in the firewall; enabled and logpath should be obvious (it’s using a pre-configured macro for the path). And finally the action: there are a number of pre-configured actions that range from just logging the fact that there were hits all the way through to this one, action_mwl, which adds a firewall rule to block the <HOST>, then emails me with information about the IP address and a selection of the matching log lines.

I’ll probably reduce the action to one that just adds the firewall rule after a day or two, when I have a feel for how much of this sort of traffic I’m getting. The info about each IP address normally includes a lookup of the owner, and that usually has an email address to report problems to. It’s not often that reporting things actually stops them happening, but it’s worth giving it a go at the start to see what happens.

Over the course of the afternoon since it was deployed the filter has blocked 148 IP addresses.

My default rules are a 12 hour ban if the same IP address triggers more than 3 times in 10 minutes, with a follow up 2 week ban if you trip that more than 3 times in 3 days.
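
For reference, those limits map onto fail2ban settings something like this (a sketch; recent fail2ban versions accept the m/h/d/w time suffixes, older ones want plain seconds). The recidive jail is the standard bundled one that watches fail2ban's own log for repeat offenders:

[DEFAULT]
maxretry = 3
findtime = 10m
bantime  = 12h

[recidive]
enabled  = true
maxretry = 3
findtime = 3d
bantime  = 2w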

Next up is to create a filter to match the GET https://www.instagram.com requests that are also polluting the logs.
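
A starting point for that filter would be something along these lines, matching requests for absolute URLs; it is untested and the response code would need checking against the real log lines before relying on it:

^<HOST> - - \[.*\] \"GET https?://.* HTTP/1\.[0-1]\"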

Logging request & response body and headers with nginx

I’ve been working on a problem to do with oAuth token refresh with the Amazon Alexa team recently, and one of the things they have asked for is a log of the entire token exchange stage.

Normally I’d do this with something like Wireshark, but as the server is running on an Amazon EC2 instance I didn’t have easy access to somewhere to tap the network, so I decided to look for another way.

The actual oAuth code is all in NodeJS + Express, but the whole thing is fronted by nginx. You can get nginx to log the incoming request body relatively simply, as there is a $request_body variable that can be included in the logs, but there is no equivalent $resp_body.

To solve this I turned to Google and it turned up this answer on Server Fault, which introduced me to the embedded Lua engine in nginx. I’ve been playing with Lua for some things at work recently, so I’ve managed to get my head around the basics.

The important bit of the answer is:

lua_need_request_body on;

set $resp_body "";
body_filter_by_lua '
  -- accumulate (up to the first 1000 bytes of) each response body chunk
  local resp_body = string.sub(ngx.arg[1], 1, 1000)
  ngx.ctx.buffered = (ngx.ctx.buffered or "") .. resp_body
  -- ngx.arg[2] is true on the final chunk, so copy the buffer into the log variable
  if ngx.arg[2] then
     ngx.var.resp_body = ngx.ctx.buffered
  end
';

I also wanted the request and response headers logging so a little bit more lua got me those as well:

set $req_header "";
set $resp_header "";
header_filter_by_lua ' 
  local h = ngx.req.get_headers()
  for k, v in pairs(h) do
      ngx.var.req_header = ngx.var.req_header .. k.."="..v.." "
  end
  local rh = ngx.resp.get_headers()
  for k, v in pairs(rh) do
      ngx.var.resp_header = ngx.var.resp_header .. k.."="..v.." "
  end
';

This, combined with a custom log format string, gets me everything I need.

log_format log_req_resp '$remote_addr - $remote_user [$time_local] '
'"$request" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent" '
'$request_time req_header:"$req_header" req_body:"$request_body" '
'resp_header:"$resp_header" resp_body:"$resp_body"';
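
Then it’s just a case of pointing an access_log directive at that format in the relevant server or location block, e.g. (the log path here is illustrative):

access_log /var/log/nginx/oauth-debug.log log_req_resp;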