After deploying the TTN Gateway in my loft last year, I’ve finally got round to deploying a sensor to make use of it.
As the great lockdown of 2020 kicked off I stuck in a quick order with Pimoroni for an Adafruit Feather 32u4 with an onboard LoRa radio (in hindsight I probably should have got the M0 based board).
I already had a DHT22 temperature & humidity sensor and the required resistor, so it was just a case of soldering on an antenna of the right length and adding the headers so I could hook up the 3 wires needed to talk to the DHT22.
There is an example app included with the TinyLoRa library that reads from the sensor and sends a message to TTN every 30 seconds.
(There are full instructions on how to set everything up on the Adafruit site here)
Now that the messages are being published to a TTN Application I needed a way to consume them. TTN have a set of Node-RED nodes which make this pretty simple, though installing them did lead to a weekend adventure of having to upgrade my entire home automation system. The nodes wouldn’t build/install on Node-RED v0.16.0, NodeJS v6 and Raspbian Jessie, so it was time to upgrade to Node-RED v1.0.4, NodeJS v12 and Raspbian Buster.
The flow receives messages via MQTT direct from TTN, separates out the temperature and humidity values, then runs them through smooth nodes to generate a rolling average of the last 20 values. This is needed because the DHT22 sensor is pretty noisy.
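As a rough illustration of what a rolling average does to noisy readings, here is a minimal Python sketch. It is not the smooth node's own code; the class name and the sample readings are made up, and the window is shrunk from 20 to 3 so the effect is easy to see.

```python
from collections import deque


class RollingMean:
    """Mean of the most recent `size` readings (illustrative only)."""

    def __init__(self, size=20):
        # A deque with maxlen drops the oldest reading automatically.
        self.window = deque(maxlen=size)

    def add(self, value):
        self.window.append(value)
        return sum(self.window) / len(self.window)


smoother = RollingMean(size=3)       # small window, just for the demo
readings = [20.0, 20.0, 26.0, 20.0]  # made-up values with one noisy spike
smoothed = [smoother.add(r) for r in readings]
print(smoothed)  # [20.0, 20.0, 22.0, 22.0]
```

The spike of 26.0 is flattened to 22.0, at the cost of the average lagging behind real changes for a few readings.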
The flow outputs these smoothed values to 2 charts on a Node-RED Dashboard, which show the trend over the last 12 hours.
It also outputs to 2 MQTT topics that are mapped to my public facing broker, which means they can be used to resurrect the temperature chart on my home page. When I get an outdoor version of the sensor (hopefully solar/battery powered) up and running, the chart will have both lines that my old weather station used to produce.
The last thing the flow does is to update the temperature reading for my virtual thermostat in the Google Assistant HomeGraph. This uses the RBE node to make sure that it only sends updates when the value actually changes.
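The "report by exception" idea behind the RBE node can be sketched in a few lines of Python. This is only an illustration of the technique under simple assumptions (exact equality, a single stream of values), not the node's implementation; the function name and readings are hypothetical.

```python
def report_by_exception(values):
    """Yield a value only when it differs from the last one passed on."""
    last = object()  # sentinel that never equals a real reading
    for value in values:
        if value != last:
            last = value
            yield value


readings = [21.5, 21.5, 21.5, 21.6, 21.6, 21.5]  # made-up temperatures
updates = list(report_by_exception(readings))
print(updates)  # [21.5, 21.6, 21.5]
```

Six readings turn into three updates, which is the point: the HomeGraph only hears about the thermostat when something has actually changed.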
The next step is to build a couple more and find a suitable battery and enclosure so I can stick one in the back garden.
I’ve been listening to a podcast called “Brad & Will Made a Tech Pod”. It’s a fun geeky hour that comes out every Sunday.
In their last episode (44-alt-barney-dinosaur-die-die-die) they went back to their first contact with the Internet and this led to a discussion of “The Gopher Space”, which is basically what the Internet was before “The Web” took over.
In my case my first trip online was to a dial up educational BBS on a BBC Micro while at junior school. I dread to think how big the school phone bill must have been, and how often I cut the school Secretary off mid call by flicking the switch that hooked the line up to the acoustic coupler modem.
By the time I got to secondary school we had a 486 PC at home and I remember convincing my parents to let me buy a modem and posting 12 forward dated £12.00 cheques to Demon Internet for a year’s worth of dial up access. At this point the Mosaic graphical Web browser had just turned up so I missed out on digging too deep into the Gopher Space.
But all the talk got me wondering how hard it would be to set up a gopher server to have a play with. I already run my own webserver to host this blog so why not see if I can host my posts on gopher as well.
A little bit of digging and I found pygopherd, which comes ready packaged on Raspbian/Raspberry Pi OS and just needs the servername setting in /etc/pygopherd/pygopherd.conf and an entry in the port forwarding rules on the router to get up and running.
I can now use the command gopher gopher.hardill.me.uk to start exploring. This is a terminal based tool which gives an authentic experience, but there are plugins for Firefox and Chrome if you want to play on the desktop.
A gopher site is sort of like a very stripped back webpage. It’s text only with no real formatting, but it can contain links to other pages and to binary files that can be downloaded but not viewed inline. For example, my “homepage” is in a file called gophermap (a bit like index.html) in /var/gopher:
Ben's Place - Gopher
Just a place to make notes about things I've
been playing with
1Blog /blog
1Brad & Will Made a Tech Pod /podcast
The lines starting with a 1 each point to another gophermap, with the titles Blog and Brad & Will Made a Tech Pod, in /var/gopher/blog and /var/gopher/podcast respectively. If a line had started with 0 it would have pointed to a text file. A full list of the prefixes can be found here.
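To make the line format concrete, here is a small Python sketch that splits gophermap text into entries. It assumes the usual layout where the first character is the item type and a tab separates the title from the selector, and it only knows the two prefixes mentioned above. The "About" entry in the sample is made up for illustration.

```python
ITEM_TYPES = {"0": "text file", "1": "menu"}  # only the two types discussed


def parse_gophermap(text):
    """Split gophermap text into (kind, title, selector) tuples."""
    entries = []
    for line in text.splitlines():
        if "\t" not in line:
            # No tab means plain informational text, shown as-is.
            entries.append(("info", line, None))
            continue
        head, selector = line.split("\t")[:2]
        kind = ITEM_TYPES.get(head[:1], "other")
        entries.append((kind, head[1:], selector))
    return entries


sample = (
    "Ben's Place - Gopher\n"
    "1Brad & Will Made a Tech Pod\t/podcast\n"
    "0About\t/about.txt"  # hypothetical text-file entry
)
entries = parse_gophermap(sample)
for entry in entries:
    print(entry)
```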
The plan is to generate text versions of all the posts on the blog. I’ll probably write a script that takes the ATOM feed and does the conversion, but in the meantime there was a small challenge thrown down in the podcast to host the whole podcast over gopher.
I was also looking for an excuse to start playing with Go, which just seemed apt.
A quick search found a library to do the RSS download and parsing, so the whole thing only took about 50 lines. You can find the code on GitHub here.
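The real tool is the Go program linked above. Purely to show the shape of the conversion, here is a hedged Python sketch that turns an RSS feed into gophermap lines. Everything in it is an assumption: the feed is a tiny made-up one embedded in the script, the episodes are presumed to sit in a /podcast directory under their original file names, and item type 9 (binary file) is used for the audio.

```python
import posixpath
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# A tiny made-up feed; a real run would download the podcast's RSS instead.
FEED = """<rss version="2.0"><channel>
<title>Example Pod</title>
<item><title>Episode 1</title>
<enclosure url="https://example.com/audio/ep1.mp3" type="audio/mpeg"/></item>
<item><title>Episode 2</title>
<enclosure url="https://example.com/audio/ep2.mp3" type="audio/mpeg"/></item>
</channel></rss>"""


def feed_to_gophermap(xml_text, base="/podcast"):
    """Build gophermap text: a title line, then one entry per episode."""
    channel = ET.fromstring(xml_text).find("channel")
    lines = [channel.findtext("title", default="")]
    for item in channel.findall("item"):
        enclosure = item.find("enclosure")
        if enclosure is None:
            continue  # skip items with no audio attached
        filename = posixpath.basename(urlparse(enclosure.get("url")).path)
        # Type 9 = binary file; a tab separates the title from the selector.
        lines.append(f"9{item.findtext('title')}\t{base}/{filename}")
    return "\n".join(lines)


gophermap = feed_to_gophermap(FEED)
print(gophermap)
```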
I recently installed some log charting software to get a view of the NGINX logs for the instance that serves up this site. While looking at some of the output I found some strange results, so I went to have a closer look at the actual logs.
There were a whole bunch of entries with a similar pattern, something like the following:
CONNECT is an HTTP verb that is used by HTTP Proxy clients to tell the proxy server to set up a tunnel to the requested machine. This is needed because if the client is trying to connect to an HTTP server that uses SSL/TLS, it needs to be able to do the SSL/TLS handshake directly with that server to ensure it has the correct certificates.
Since my instance of NGINX is not configured to be an HTTP Proxy server it is rightly rejecting these requests, as can be seen from the 400 (Bad Request) after the request string. Even though the clients making these requests are all getting the 400 error, they continue to send the same request over and over again.
I wanted to filter these bogus requests out for 2 reasons:
They clog the logs up, making it harder to see real problems
They take up resources to process
The solution is to use a tool called fail2ban, which watches log files for signs of abuse and then sets up the correct firewall rules to prevent the remote clients from being able to connect.
I already run fail2ban monitoring the SSH logs on all my machines that face the internet, as there are plenty of scripts out there trying standard username/password combinations to log into any machine listening on port 22.
Fail2ban comes with a bunch of log filters that work with a range of applications, but the ones bundled on Raspbian (Raspberry Pi OS) at this time didn’t include one that would match this particular pattern. Writing filters is not that hard, but it does require writing regular expressions, which is a bit of a dark art. The trick is to get a regex that only matches what we want. The one I ended up with is ^<HOST> - - \[.*\] \"CONNECT .* HTTP/1\.[0-1]\" 400, which breaks down as follows:
<HOST> fail2ban has some special tags that match an IP address or Host name
- - this matches the 2 dashes in the log
\[.*\] this matches the date/timestamp in the square brackets, I could have matched the date more precisely, but we don’t need to in this case. We have to escape the [ as it has meaning in regex as we will see later.
\"CONNECT this matches the start of the request string and limits it to just CONNECT requests
.* this matches the host:port combination that is being requested
HTTP/1\.[0-1]\" here we match both HTTP/1.0 and HTTP/1.1. We need to escape the . and the square brackets let us specify a range of values
400 and finally the 400 error code
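The pieces above can be tried out in Python before going anywhere near fail2ban. This is only a sketch: fail2ban expands <HOST> into its own, much stricter pattern, so a simple named group stands in for it here, and the two log lines are hypothetical ones using documentation-only IP addresses.

```python
import re

FAILREGEX = r'^<HOST> - - \[.*\] \"CONNECT .* HTTP/1\.[0-1]\" 400'
# Stand-in for fail2ban's <HOST> expansion: just grab the first field.
pattern = re.compile(FAILREGEX.replace("<HOST>", r"(?P<host>\S+)"))

# Hypothetical access log lines, not taken from the real logs.
lines = [
    '203.0.113.5 - - [01/Jun/2020:10:00:00 +0100] '
    '"CONNECT example.com:443 HTTP/1.1" 400 173 "-" "-"',
    '198.51.100.7 - - [01/Jun/2020:10:00:01 +0100] '
    '"GET /index.html HTTP/1.1" 200 612 "-" "-"',
]
offenders = [m.group("host") for m in map(pattern.match, lines) if m]
print(offenders)  # ['203.0.113.5']
```

Only the CONNECT request that got a 400 is picked up; the ordinary GET is left alone.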
Now we have the regex we can test it with the fail2ban-regex command to make sure it finds what we expect.
# fail2ban-regex /var/log/nginx/access.log '^<HOST> - - \[.*\] \"CONNECT .* HTTP/1\.[0-1]\" 400'
Use failregex line : ^<HOST> - - \[.*\] \"CONNECT .* HTTP/1\.[0-1]\" 400
Use log file : /var/log/nginx/access.log
Use encoding : UTF-8
Failregex: 12819 total
|- #) [# of hits] regular expression
| 1)  ^<HOST> - - \[.*\] \"CONNECT .* HTTP/1\.[0-1]\" 400
Ignoreregex: 0 total
Date template hits:
|- [# of hits] date format
|  Day(?P<_sep>[-/])MON(?P=_sep)ExYear[ :]?24hour:Minute:Second(?:\.Microseconds)?(?: Zone offset)?
Lines: 526165 lines, 0 ignored, 12819 matched, 513346 missed
[processed in 97.73 sec]
Missed line(s): too many to print. Use --print-all-missed to print all 513346 lines
We can see that the regex matched 12819 lines out of 526165, which is in the right ballpark. Now we have the right regex we can build the fail2ban config files.
First up is the filter file that will live in /etc/fail2ban/filter.d/nginx-connect.conf
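As a hedged sketch of what the two pieces of config might look like: the filter section below just wraps the regex tested above, while the jail section is an assumption pieced together from the settings discussed next. The jail file location, the http,https port list and the nginx_access_log path macro are guesses, not copied from the real setup.

```ini
# /etc/fail2ban/filter.d/nginx-connect.conf
[Definition]
failregex = ^<HOST> - - \[.*\] \"CONNECT .* HTTP/1\.[0-1]\" 400
ignoreregex =

# Jail definition, e.g. in /etc/fail2ban/jail.d/nginx-connect.conf
# (file name and values are assumptions)
[nginx-connect]
enabled  = true
port     = http,https
filter   = nginx-connect
logpath  = %(nginx_access_log)s
action   = %(action_mwl)s
```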
In the jail definition that goes with the filter, the port entry is which ports to block in the firewall, and enabled & logpath should be obvious (it’s using a pre-configured macro for the path). Finally there is the action. There are a number of pre-configured actions that range from just logging the fact that there were hits in the log all the way through to this one, action_mwl, which adds a firewall rule to block the <HOST>, then emails me with information about the IP address and a selection of the matching log lines.
I’ll probably reduce the action to one that just adds the firewall rule after a day or two, when I have a feel for how much of this sort of traffic I’m getting. But the info about each IP address normally includes a lookup of the owner, and that usually has an email address to report problems. It’s not often that reporting things actually stops it happening, but it’s worth giving it a go at the start to see what happens.
Over the course of the afternoon since it was deployed the filter has blocked 148 IP addresses.
My default rules are a 12 hour ban if the same IP address triggers more than 3 times in 10 minutes, followed by a 2 week ban if it trips that more than 3 times in 3 days.
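The first of those rules is a sliding window count per IP address. Here is a minimal Python sketch of the idea. It mirrors the wording above ("more than 3 times in 10 minutes") rather than fail2ban's internals, and the hit data uses made-up timestamps and documentation-only addresses.

```python
from collections import defaultdict, deque

FIND_TIME = 10 * 60  # seconds: the 10 minute window
MAX_HITS = 3         # hits tolerated inside the window before a ban


def find_bans(hits):
    """hits: iterable of (timestamp_seconds, ip) in time order."""
    recent = defaultdict(deque)  # per-IP timestamps still inside the window
    banned = []
    for when, ip in hits:
        window = recent[ip]
        window.append(when)
        # Forget hits that have aged out of the window.
        while when - window[0] > FIND_TIME:
            window.popleft()
        if len(window) > MAX_HITS and ip not in banned:
            banned.append(ip)
    return banned


hits = [
    (0, "203.0.113.5"), (0, "198.51.100.7"),
    (60, "203.0.113.5"), (120, "203.0.113.5"), (180, "203.0.113.5"),
    (700, "198.51.100.7"), (1400, "198.51.100.7"), (2100, "198.51.100.7"),
]
bans = find_bans(hits)
print(bans)  # ['203.0.113.5']
```

The first address makes 4 requests in 3 minutes and is banned; the second makes the same number spread over 35 minutes and never has more than one hit inside any window.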
Next up is to create a filter to match the GET https://www.instagram.com requests that are also polluting the logs.