Introduction

After putting it off for a long time, I finally decided to set up a personal website for myself (the one you’re on now). I tried to create a Linode account, but it was flagged for malicious activity. Luckily, I had a Raspberry Pi lying around, so I decided to self-host on my home network. I thought I would share my process for others looking to do the same.

Why Host Locally

The most common approach when setting up a personal website is to use a website generator and then a paid platform to host your server. For those wanting more control, setting up a website and hosting it on a VPS (Virtual Private Server) is common. However, if you want even more control and a fun project, self-hosting a web server on your home network is a great way to get familiar with some networking concepts in practice. For this tutorial, I will be using a Raspberry Pi to host my server, but any spare computer you have lying around (ideally running Linux) will work.

Network Address Translation (NAT)

As you may know, the internet runs on the Internet Protocol (IP). This means that each internet-connected device must have an IP address at which it can be reached, so that all data sent to it can be routed to the correct recipient. Ideally, each device would have its own unique IP address. However, as IP dates back to 1974, when the Internet was somewhat smaller, IPv4 was designed with only a 32-bit address, which provides 4,294,967,296 unique addresses (No, I will not be addressing IPv6 here).
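To make the 32-bit limit concrete, here is a small sketch using Python's standard ipaddress module. The address 192.168.0.1 is only an illustrative example; any IPv4 address would work the same way.

```python
import ipaddress

# An IPv4 address is a 32-bit integer, usually written as four 8-bit numbers.
ADDRESS_BITS = 32
total_addresses = 2 ** ADDRESS_BITS

# Example address (illustrative only).
addr = ipaddress.IPv4Address("192.168.0.1")
as_integer = int(addr)

# Rebuild the same integer by hand from the four octets.
octets = [int(part) for part in str(addr).split(".")]
by_hand = (octets[0] << 24) | (octets[1] << 16) | (octets[2] << 8) | octets[3]

print(total_addresses)        # size of the whole IPv4 address space
print(as_integer == by_hand)  # both views describe the same 32-bit number
print(addr.is_private)        # True for addresses reserved for local networks
```

The is_private check is a handy way to tell a local address from a publicly routable one.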

The world has significantly more internet-connected devices than that, so some workarounds have been created. The most important workaround for this article is Network Address Translation (NAT). With NAT, your home router has a unique external IP address, and it assigns each device that connects to it an IP address which is only reachable on your local network. When your device sends a packet using its assigned (local) address as the return address, the router swaps that address for its own external address and forwards the packet to the target. When the reply comes back, the router changes the destination IP on the packet back to your device’s local address and sends it on to your device. For more info, check out the wiki.
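The translation step can be sketched as a toy model in Python. This is not how any particular router implements NAT: the class name ToyNat, the starting port 40000 and all addresses (taken from the ranges reserved for documentation) are assumptions for illustration, and a real router would reuse one mapping per connection rather than create one per packet.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNat:
    """Toy model of a home router doing NAT (hypothetical, for illustration)."""

    external_ip: str
    next_port: int = 40000  # arbitrary starting port for outgoing mappings
    # Maps external port -> (local ip, local port)
    table: dict = field(default_factory=dict)

    def outbound(self, src_ip, src_port, dst_ip, dst_port):
        """Rewrite the source of an outgoing packet and remember the mapping."""
        ext_port = self.next_port
        self.next_port += 1
        self.table[ext_port] = (src_ip, src_port)
        return (self.external_ip, ext_port, dst_ip, dst_port)

    def inbound(self, src_ip, src_port, dst_port):
        """Rewrite the destination of a reply, or drop it if no mapping exists."""
        if dst_port not in self.table:
            return None  # unsolicited packet: the router cannot tell who it is for
        local_ip, local_port = self.table[dst_port]
        return (src_ip, src_port, local_ip, local_port)


nat = ToyNat(external_ip="203.0.113.7")
sent = nat.outbound("192.168.0.42", 51000, "198.51.100.20", 443)
reply = nat.inbound("198.51.100.20", 443, sent[1])
unsolicited = nat.inbound("198.51.100.9", 12345, 80)

print(sent)         # packet as seen by the outside world
print(reply)        # reply translated back to the local device
print(unsolicited)  # None: no mapping exists for port 80
```

The last call shows why a web server behind NAT is unreachable by default: nothing inside the network asked for that connection, so the table has no entry for it.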

While NAT is very cool and extends the life of IPv4 massively, it poses a problem for our purpose, as your device doesn’t have an externally facing IP address at which a random client can request your website. And if a client tries to connect to the router directly, the router doesn’t know which device to forward the request to. To solve this, we can set up port forwarding. This is an option in your router’s settings page, which you can reach by entering the router’s local address in your browser (in my case, http://192.168.0.1).

You’ll want to set both ports 80 and 443, for HTTP and HTTPS respectively, to be forwarded to your Raspberry Pi’s IP. You can get the RPi’s IP either by looking at the connected devices in your router settings or by running the following on your RPi (if ifconfig is not installed, ip addr shows the same information):

ifconfig

When you are setting up port forwarding, you will also want to find your device in your router’s connected devices list and give it a static IP in the DHCP settings. Otherwise, your RPi may be handed a different IP after a reboot, and you would have to redo the port forwarding.

Getting a Domain

Now that you have the network settings on your local router set up, you’ll want to get a domain for your new site. These are generally cheap, especially if you’re willing to go with a less common domain extension. I chose to use Hostinger for this, but any domain provider should work.

Once you have your domain, you’ll want to set up an A record on your domain provider’s website with the external IP address of your router.
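DNS changes can take a while to become visible, so it can be useful to check what the domain currently resolves to. Below is a minimal sketch using Python's socket module. The function name a_record_matches and the injectable resolve parameter are my own choices for illustration, and the IP addresses are placeholders.

```python
import socket


def a_record_matches(domain, expected_ip, resolve=socket.gethostbyname):
    """Return True if domain currently resolves to expected_ip (IPv4 only)."""
    try:
        return resolve(domain) == expected_ip
    except OSError:
        # socket.gaierror is a subclass of OSError: the name does not resolve (yet).
        return False


# Usage (placeholder values, replace with your domain and router's external IP):
# print(a_record_matches("example.com", "203.0.113.7"))
```

The resolve parameter only exists so the function can be tried out without touching real DNS.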

Nginx

Nginx is a suitable platform for hosting simple websites. Although it is often deployed as a reverse proxy, it works well for serving static websites and is easy to set up. To get started on your Raspberry Pi, first install Nginx.

sudo apt update
sudo apt upgrade
sudo apt install nginx

Once you have it installed, you’ll want to create a directory at /var/www/example.com (for example with sudo mkdir -p /var/www/example.com), replacing example.com here and everywhere following with your domain name.

You’ll then want to create a configuration file for your site with:

sudo nano /etc/nginx/sites-available/example.com

And copy the following into it:

server {
    listen 80;
    server_name example.com www.example.com;

    root /var/www/example.com;
    index index.html;

    location / {
        try_files $uri $uri/ =404;
    }
}
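To give a feel for what the try_files line does, here is a rough Python model of how a request path is mapped to a file. This is a simplified sketch and not nginx's actual logic: it ignores redirects, permissions and path traversal, and the function name resolve_request is hypothetical.

```python
from pathlib import Path


def resolve_request(root, uri, index="index.html"):
    """Rough model of `try_files $uri $uri/ =404` combined with `index`.

    Returns (status, path). Simplified: real nginx does considerably more.
    """
    target = Path(root) / uri.lstrip("/")
    if target.is_file():
        # $uri: the path names an existing file, so serve it directly.
        return 200, target
    if target.is_dir():
        # $uri/: the path names a directory, so look for its index file.
        index_file = target / index
        if index_file.is_file():
            return 200, index_file
        return 403, None  # directory without an index (listing is off by default)
    # =404: neither a file nor a directory matched.
    return 404, None


# Usage (assuming the directory from this tutorial exists):
# print(resolve_request("/var/www/example.com", "/"))
```

So a request for / ends up serving /var/www/example.com/index.html, and anything that does not exist on disk gets a 404.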

We’re now close to having everything up and running, so we can create our first content. Create your website files in the directory you made earlier.

sudo nano /var/www/example.com/index.html

And add some content to test that it works:

<!DOCTYPE html>
<html>
<head>
    <title>Welcome to my blog!</title>
</head>
<body>
    <h1>Hello, World!</h1>
    <p>This is a test page for your new website hosted on Raspberry Pi.</p>
</body>
</html>

The final few steps are to create a symbolic link to enable your site:

sudo ln -s /etc/nginx/sites-available/example.com /etc/nginx/sites-enabled/

Check the Nginx configuration for errors with:

sudo nginx -t

And finally, if the test is successful, you can restart Nginx to apply the changes.

sudo systemctl restart nginx

Now go ahead and test your site. You can do so either by entering http://[External IP address] in your browser or by visiting your domain name if you’ve set it up.
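If the page does not load, it helps to first check whether the port is reachable at all. The sketch below uses only Python's socket module. The function name port_is_open is my own, and the address in the usage comment is a placeholder for your Pi's local IP or your router's external IP.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Usage (placeholder address):
# for port in (80, 443):
#     print(port, port_is_open("192.168.0.240", port))
```

Running it against the Pi's local address tests Nginx itself, while running it from outside your network against the external address also tests the port forwarding.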

NAT Loopback

One side effect of self-hosting behind NAT with port forwarding is that not all routers support so-called NAT hairpinning, or NAT loopback. If yours doesn’t, you won’t be able to reach your router’s external IP from within your local network. In that case, you might need to access your website via a VPN or on your phone using mobile data.

SSL Certificates

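To obtain valid SSL certificates, you can use Certbot. If you don’t have it yet, on Raspberry Pi OS it can be installed with sudo apt install certbot python3-certbot-nginx. Then run: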

sudo certbot --nginx -d example.com -d www.example.com

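Certbot’s --nginx option will normally update your configuration file for you. If you prefer to edit it by hand, it should end up looking something like this.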

server {
    listen 80;
    server_name example.com www.example.com;

    # Redirect HTTP to HTTPS
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl;
    server_name example.com www.example.com;

    ssl_certificate /etc/letsencrypt/live/example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;

    root /var/www/example.com;
    index index.html;

    location / {
        try_files $uri $uri/ =404;
    }
}
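Certificates expire, so it can be reassuring to check how long the one being served is still valid. Here is a minimal sketch using Python's ssl module. The function names are my own, and example.com in the usage comment stands in for your domain.

```python
import socket
import ssl
import time

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after, now=None):
    """Days left before a certificate's notAfter date.

    not_after uses the format returned by getpeercert(),
    e.g. 'Jan  5 09:34:43 2018 GMT'.
    """
    if now is None:
        now = time.time()
    return (ssl.cert_time_to_seconds(not_after) - now) / SECONDS_PER_DAY


def fetch_not_after(hostname, port=443, timeout=5.0):
    """Connect over TLS and return the served certificate's notAfter string."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]


# Usage (replace example.com with your domain):
# print(days_until_expiry(fetch_not_after("example.com")))
```

A negative result means the certificate has already expired.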

Website Content

Now, your website should work. Congrats! Now comes the important part: make it your own. To create the HTML and CSS files for your website, there are a few approaches you can take. The simplest is to just do it from scratch by yourself, a.k.a. have ChatGPT create a base that you can customize. This is a good way to make a simple website, depending on how familiar you are with HTML. You could also find a template online to modify for your purposes.

I went another route and used Hugo to generate my site. This static website generator creates the HTML and CSS files from the Markdown files you write. I chose this as I’m used to writing my notes in Markdown in Obsidian anyway. I found the learning curve a little steep, as the documentation isn’t the best if you are not too familiar with frontend design, but a good quickstart guide can be found here.

I won’t go into too much detail on Hugo as I am far from an expert. However, if you are curious about my setup, feel free to send me an email.

Monitoring

Once I had my website set up, I decided to do a few experiments to ensure it would run smoothly.

Temperature

As I knew overheating is an issue with certain Raspberry Pi models, I wanted to ensure this wouldn’t be a problem for me. Therefore, I created a simple script to log the temperature of the Raspberry Pi.

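#!/bin/bash
# Extract the temperature reading reported by lm-sensors, e.g. "+50.1"
temperature=$(sensors | grep -Po '\+([0-9]+\.[0-9])')
# Timestamp each entry so the readings can be lined up later
datetime=$(date '+%Y/%m/%d %H:%M:%S')
echo "${datetime} Temp: ${temperature}" >> temps.log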

Then I set it up with a cron job to run every 5 minutes. Here’s a good tutorial.

This showed me that the temperature remains stable at around 50°C, so I do not feel the need to set up housing with active cooling for my Raspberry Pi.
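To get a quick summary out of a log like this, something along the following lines should work. It is a sketch that assumes one reading per line in the format written by the script above; the sample data is made up, and the function name summarize_temps is my own.

```python
import re
from statistics import mean

# Matches lines such as "2024/05/01 12:00:00 Temp: +50.1"
LINE_PATTERN = re.compile(r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) Temp: \+(\d+\.\d)")


def summarize_temps(log_text):
    """Return (min, mean, max) of the logged temperatures, or None if empty."""
    temps = [
        float(match.group(2))
        for line in log_text.splitlines()
        if (match := LINE_PATTERN.match(line))
    ]
    if not temps:
        return None
    return min(temps), round(mean(temps), 1), max(temps)


# Made-up sample data in the format written by the logging script.
sample = """2024/05/01 12:00:00 Temp: +49.8
2024/05/01 12:05:00 Temp: +50.2
2024/05/01 12:10:00 Temp: +51.0
"""
summary = summarize_temps(sample)
print(summary)
```

To run it on the real log, read temps.log into a string and pass it to summarize_temps.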

Traffic

Next, I wanted to monitor the traffic coming to my site and visualize it, which was a more involved process.

The access logs for Nginx servers are under /var/log/nginx/access.log. The logs show every request sent to the server, including requests for the favicon and robots.txt. I was primarily curious to see how often I got requests for the main page, articles and CV page, so I wrote a regular expression in Python to filter out only these lines and then extract both the IP of the client and the access time. This was a rushed script, so the code isn’t the nicest, but I’ll include it in case it can be of use to anyone.

import re

def get_access_times(log):
    # pattern to get all accesses to pages, no accesses to favicon etc.
    page_access_pattern = r"GET (/ |/cv|/posts)"
    # timestamp as written by nginx, e.g. 10/Oct/2024:13:55:36
    time_pattern = r"\d{2}/[a-zA-Z]{3}/\d{4}:\d{2}:\d{2}:\d{2}"
    accesses = [line for line in log.split('\n') if re.search(page_access_pattern, line)]
    times = [re.findall(time_pattern, access)[0] for access in accesses]
    return times

def get_IPs(log):
    # pattern to get all accesses to pages, no accesses to favicon etc.
    page_access_pattern = r"GET (/ |/cv|/posts)"
    # IPv4 address at the start of the line (IPv6 clients are not handled)
    IP_pattern = r"^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"
    accesses = [line for line in log.split('\n') if re.search(page_access_pattern, line)]
    IPs = [re.findall(IP_pattern, access)[0] for access in accesses]
    return IPs
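As an alternative sketch, the IP, timestamp and path can be pulled out in a single pass with one regular expression. This assumes nginx's default "combined" log format; the sample lines and the names LOG_PATTERN, PAGE_PATTERN and parse_page_hits are made up for illustration.

```python
import re
from collections import Counter

# Captures the client IP, timestamp and requested path from one log line.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\] ]+) [^\]]+\] "GET (?P<path>\S+) '
)
# Pages of interest: the main page, /cv... and /posts...
PAGE_PATTERN = re.compile(r"^(/|/cv\S*|/posts\S*)$")


def parse_page_hits(log):
    """Return a list of (ip, time, path) tuples for page requests only."""
    hits = []
    for line in log.splitlines():
        match = LOG_PATTERN.match(line)
        if match and PAGE_PATTERN.match(match["path"]):
            hits.append((match["ip"], match["time"], match["path"]))
    return hits


# Made-up sample lines for illustration.
sample_log = (
    '203.0.113.5 - - [10/Oct/2024:13:55:36 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/8.0"\n'
    '203.0.113.5 - - [10/Oct/2024:13:55:37 +0000] "GET /favicon.ico HTTP/1.1" 404 153 "-" "curl/8.0"\n'
    '198.51.100.9 - - [10/Oct/2024:14:01:02 +0000] "GET /posts/nat/ HTTP/1.1" 200 2048 "-" "Mozilla/5.0"\n'
)
hits = parse_page_hits(sample_log)
hits_per_ip = Counter(ip for ip, _, _ in hits)
print(hits)
print(hits_per_ip)
```

Keeping the IP and time together in one tuple makes it easier to answer questions such as how many distinct visitors there were in a given hour.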

Next, once I had a nice list of access times, I bucketed them by counting how many accesses I get in every 5-minute interval and plotted these intervals for the last day as follows.

#!/usr/bin/env python3
import re
import matplotlib.pyplot as plt
from datetime import datetime, timedelta
from collections import defaultdict

with open('/var/log/nginx/access.log') as file:
    log = file.read()

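# get_access_times() and get_IPs() are the helper functions shown earlier;
# they need to be defined in this file before this point.
date_times = get_access_times(log)
IPs = get_IPs(log)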

# Convert the date string into datetime objects
datetime_data = [datetime.strptime(dt, '%d/%b/%Y:%H:%M:%S') for dt in date_times]

bucketed_access = defaultdict(int)

interval = timedelta(minutes=5)

start_time = min(datetime_data)
end_time = max(datetime_data)

for access_time in datetime_data:
    bucket_start = start_time + ((access_time-start_time) // interval)*interval
    bucketed_access[bucket_start] += 1

bucket_times = sorted(bucketed_access.keys())
access_counts = [bucketed_access[bucket] for bucket in bucket_times]

plt.plot(bucket_times, access_counts)
plt.xlabel('Time')
plt.ylabel('Number of Accesses')
plt.title('Accesses per 5-Minute interval')
plt.xticks(rotation=45)
plt.savefig('access_plot.png')
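The script above buckets everything found in the log file. If you want the plot to cover strictly the last 24 hours, and to show quiet periods as zeros instead of skipping them, the bucketing could be sketched like this. The function name bucket_last_day and the sample times are assumptions for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def bucket_last_day(access_times, now, interval=timedelta(minutes=5)):
    """Count accesses per interval over the 24 hours before `now`.

    Returns a list of (bucket_start, count) pairs, including empty buckets.
    """
    window_start = now - timedelta(days=1)
    counts = Counter()
    for access_time in access_times:
        if window_start <= access_time < now:
            # Floor division of two timedeltas gives the bucket index.
            index = (access_time - window_start) // interval
            counts[index] += 1
    n_buckets = timedelta(days=1) // interval
    return [(window_start + i * interval, counts[i]) for i in range(n_buckets)]


# Made-up access times for illustration.
now = datetime(2024, 10, 11, 12, 0, 0)
times = [
    datetime(2024, 10, 10, 11, 0, 0),   # older than 24 hours: ignored
    datetime(2024, 10, 10, 12, 1, 0),   # falls in the first bucket
    datetime(2024, 10, 10, 12, 4, 59),  # falls in the first bucket
    datetime(2024, 10, 11, 11, 57, 0),  # falls in the last bucket
]
buckets = bucket_last_day(times, now)
print(len(buckets), buckets[0], buckets[-1])
```

The two lists needed for plt.plot can then be obtained with zip(*buckets).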

Now, I could run the script whenever I want to see the traffic for the last day and then view the figure by transferring the file over with scp. Instead, I decided to set up a second web server on a port other than 80 or 443 (which are used for HTTP and HTTPS, respectively) to display the data in an easy-to-access manner. I did this with a new configuration file. In the example below, I have changed the port I actually use to 1234.

server {
    listen 1234;
    server_name 192.168.0.240;

    root /var/www/monitor_site;
    index index.html;

    location / {
        try_files $uri $uri/ =404;
    }
}

I then set up a cron job to run the Python script every 5 minutes, and a simple HTML page in the root directory from the configuration above to serve the figure it creates.

It’s not pretty, but it works.
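For anyone wanting a starting point for such a page, here is one possible sketch that generates it from Python, so the same cron job could refresh it alongside the plot. This is not the page I actually use: the function names, the page text and the timestamp line are all assumptions, and the plot would need to be saved into the same directory.

```python
from datetime import datetime
from pathlib import Path


def render_monitor_page(image_name, generated_at):
    """Return a minimal HTML page that shows the traffic plot."""
    stamp = generated_at.strftime("%Y-%m-%d %H:%M")
    return f"""<!DOCTYPE html>
<html>
<head>
    <title>Traffic monitor</title>
</head>
<body>
    <h1>Accesses per 5-minute interval</h1>
    <p>Last updated: {stamp}</p>
    <img src="{image_name}" alt="Access plot">
</body>
</html>
"""


def write_monitor_page(site_root, image_name="access_plot.png"):
    """Write index.html into site_root (e.g. /var/www/monitor_site)."""
    page = render_monitor_page(image_name, datetime.now())
    Path(site_root, "index.html").write_text(page)


# Usage (needs write permission for the directory):
# write_monitor_page("/var/www/monitor_site")
```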