Self-hosted website-analytics using Plausible and Traefik

Introduction

As a website operator, you have a legitimate interest in collecting statistics about your visitors, and even more so about returning ones. With cookies, a unique identifier is set to track all of a user's actions. This is a very effective method for collecting extensive data, especially combined with solutions like Google Analytics. Google is an expert at tracking users, and its software is so popular that more than 27 million websites currently use it and share their data with Google [1].

People deserve privacy, and we have seen many hacks exposing valuable personal data to the World Wide Web [2] [3] [4].
As an administrator, it is my responsibility to keep that data safe. However, the number of common vulnerabilities and exposures is growing at such a rate that keeping your software secure and up to date becomes a tedious task. This challenge raises the question of whether it is worth collecting any data at all. Think about it: if you don't have any personal data, there is nothing to protect.

This is where Plausible comes into play. Plausible is open-source software for website statistics, and it is GDPR-compliant to the point that you don’t even need a popup asking your visitors for consent. But how is this possible?
Easy! It simply does not collect any personal data to create its statistics [5]. Of course, the tool is not as detailed as Google Analytics, and you will still require cookies for other use cases like online shops. But Plausible is perfect for small projects that want usable website analytics without compromising their users’ privacy. I am delighted with the software, and if you are interested in installing it yourself, read on.

Installing Plausible

Requirements

To my knowledge, self-hosting Plausible is “only” possible via Docker containers. Additionally, to avoid issues, Plausible should only be accessed via HTTPS. Therefore, you will need a working machine with docker-compose and a reverse proxy with SSL/TLS support. This tutorial covers neither the setup of Docker nor that of Traefik, my reverse proxy of choice for Docker.

Files

version: '3.7'
services:
#Main Container
  plausible:
    image: plausible/analytics:latest
    container_name: plausible
    hostname: plausible
    restart: unless-stopped
    command: sh -c "sleep 10 && /entrypoint.sh db createdb && /entrypoint.sh db migrate && /entrypoint.sh db init-admin && /entrypoint.sh run"
    env_file:
      - plausible.env
    networks:
      - plausible
    volumes:
      - ./plausible/geoip/GeoLite2-Country.mmdb:/plausible/geoip/GeoLite2-Country.mmdb
    labels:
      - traefik.enable=true
      - traefik.http.routers.plausible.entrypoints=https
      - traefik.http.routers.plausible.rule=Host(`data.contoso.com`)
      - traefik.http.routers.plausible.tls.certresolver=contoso
      - traefik.http.routers.plausible.middlewares=secure_headers@file
      - traefik.http.routers.plausible.service=plausible
      - traefik.http.services.plausible.loadbalancer.server.port=8000
#Event database (ClickHouse), stores the tracked events
  plausible-events:
    image: yandex/clickhouse-server:latest
    container_name: plausible-events
    hostname: plausible-events
    restart: unless-stopped
    networks:
      - plausible
    volumes:
      - ./plausible/events/data:/var/lib/clickhouse
      - ./plausible/events/clickhouse-config.xml:/etc/clickhouse-server/config.d/logging.xml:ro
      - ./plausible/events/clickhouse-user-config.xml:/etc/clickhouse-server/users.d/logging.xml:ro
    ulimits:
      nofile:
        soft: 262144
        hard: 262144
#Database
  plausible-db:
    image: postgres:latest
    container_name: plausible-db
    hostname: plausible-db
    restart: unless-stopped
    networks:
      - plausible
    volumes:
      - ./plausible/db:/var/lib/postgresql/data
    env_file:
      - plausible.env
#Database to match IP addresses to countries
  plausible-geoip:
    image: maxmindinc/geoipupdate:latest
    container_name: plausible-geoip
    hostname: plausible-geoip
    restart: unless-stopped
    networks:
      - plausible
    env_file:
      - ./plausible.env
    volumes:
      - ./plausible/geoip:/usr/share/GeoIP
      - ./plausible/GeoIP.conf:/etc/GeoIP.conf
networks:
  plausible:

docker-compose.yml

###Plausible
DATABASE_URL=postgres://postgres:SgnaMq2nRCn4rU3cMKRiNR669sVHC@plausible-db:5432/plausible
CLICKHOUSE_DATABASE_URL=http://plausible-events:8123/plausible
GEOLITE2_COUNTRY_DB=/plausible/geoip/GeoLite2-Country.mmdb
SECRET_KEY_BASE=H63TDy9JhFhA5fz7Fo85923M7K6ZgH63TDy9JhFhA5fz7Fo85923M7K6ZgH63TDy9JhFhA5fz7Fo85923M7K6Z==
#Admin Settings
ADMIN_USER_EMAIL=admin@contoso.com
ADMIN_USER_NAME=admin
ADMIN_USER_PWD=4SucCRKKip9YLeNW8nbKbbCwDq2td
BASE_URL=https://data.contoso.com
#Mail-Settings
MAILER_EMAIL=plausible@contoso.com
SMTP_HOST_ADDR=mail.contoso.com
SMTP_HOST_PORT=587
SMTP_USER_NAME=plausible@contoso.com
SMTP_USER_PWD=jhWb4pF5P2zVB96C2XsAFZRR2GH7p
SMTP_HOST_SSL_ENABLED=yes
SMTP_RETRIES=2
#For automatic Google Search Console import
GOOGLE_CLIENT_ID=703417029534-6fpn539ge1i8f5ugrd0hdupsk1870t9p.apps.googleusercontent.com
GOOGLE_CLIENT_SECRET=A4282gPbpv2q83VDEF7PMxSf
###Postgres
POSTGRES_PASSWORD=zaeMeVhKhaRnYao7BT9s4R258YEdg
###GeoIP
GEOIPUPDATE_ACCOUNT_ID=195780
GEOIPUPDATE_LICENSE_KEY=Z3TqfgH1AbjA0ZkCN
GEOIPUPDATE_EDITION_IDS=GeoLite2-Country

plausible.env

Instructions:

  1. Copy the docker-compose.yml and the plausible.env with the above contents into a directory of your choice.

  2. To avoid many different issues, create a geoip folder, an empty GeoIP.conf and an empty GeoLite2-Country.mmdb as shown:
    mkdir -p plausible/geoip && touch plausible/GeoIP.conf && touch plausible/geoip/GeoLite2-Country.mmdb

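  3. Run the following command to create your SECRET_KEY_BASE and use its output to replace the value in the plausible.env file. The tr call removes the line break that openssl inserts into the output.
    openssl rand -base64 64 | tr -d '\n'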

  4. Adjust your ADMIN_USER_* credentials and BASE_URL to your liking.

  5. Choose a new Postgres database password, set it as POSTGRES_PASSWORD and also use it to replace the existing password in the DATABASE_URL variable.

  6. Optional: Enter your own SMTP server details in the SMTP_* variables to enable password-reset emails.

  7. Optional: To get reliable IP-address-to-country translation, sign up for a MaxMind GeoLite2 account. Once your account has been created, visit the license key page and check "no" to generate a new GeoLite2 license key. Replace the values of the GEOIPUPDATE_ACCOUNT_ID and GEOIPUPDATE_LICENSE_KEY variables with your own. If you do not wish to use this service, remove all the GeoIP variables.

    Traefik-specific settings:

    In the docker-compose.yml the Traefik labels for the Plausible container need multiple adjustments:

  8. In the Host rule, replace data.contoso.com with the hostname of your BASE_URL.

  9. Change the certresolver from contoso to the one you have set up with your Traefik instance.

  10. Adjust the middlewares line (secure_headers@file) to the middlewares you prefer, or remove it.

  11. Don't forget to add the plausible network to the Traefik container, in the same way as lines 12-13 of the docker-compose.yml do for the Plausible container.
    If you are using a different reverse proxy, remove all labels from the Plausible container and make sure to forward all traffic for BASE_URL (port 443) to plausible:8000. For non-Docker reverse proxies, you need to add a port mapping to the Plausible container (ports: - XXXX:8000) and then point your reverse proxy to the Docker host.

  12. Finally, create the containers by running docker-compose up -d. If no errors occur, open your BASE_URL in your browser. You should now see the Plausible welcome page.

  13. To validate your account, press "Request activation code" and a code will be emailed to ADMIN_USER_EMAIL. The verification can, however, be bypassed with this command (the database name plausible matches the one at the end of DATABASE_URL):
    docker exec plausible-db psql -U postgres -d plausible -c "UPDATE users SET email_verified = true;"
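
Steps 3 and 5 can also be scripted. What follows is only a minimal sketch: the helper names set_env_var and apply_secrets are made up for this example, and the DATABASE_URL layout (user postgres, host plausible-db, database plausible) is simply copied from the plausible.env above.

```shell
#!/bin/sh
# Sketch only: the helper names are made up for this example.

# set_env_var FILE KEY VALUE
# Replaces the line "KEY=..." in FILE, or appends it if KEY is missing.
set_env_var() {
  file=$1
  tmp=$(mktemp) || return 1
  # Pass key and value through the environment so that awk does not
  # interpret characters such as "&" or "\" inside the value.
  ENV_KEY=$2 ENV_VALUE=$3 awk '
    BEGIN { prefix = ENVIRON["ENV_KEY"] "=" }
    index($0, prefix) == 1 { print prefix ENVIRON["ENV_VALUE"]; found = 1; next }
    { print }
    END { if (!found) print prefix ENVIRON["ENV_VALUE"] }
  ' "$file" > "$tmp" && mv "$tmp" "$file"
}

# apply_secrets FILE SECRET_KEY_BASE DB_PASSWORD
# Writes the secret key and keeps POSTGRES_PASSWORD and DATABASE_URL in sync.
apply_secrets() {
  set_env_var "$1" SECRET_KEY_BASE "$2" &&
  set_env_var "$1" POSTGRES_PASSWORD "$3" &&
  set_env_var "$1" DATABASE_URL "postgres://postgres:$3@plausible-db:5432/plausible"
}

# Example call (a hex password avoids characters that would need URL escaping):
# apply_secrets plausible.env "$(openssl rand -base64 64 | tr -d '\n')" "$(openssl rand -hex 24)"
```

Run this before the first docker-compose up -d. As far as I know, the Postgres image only reads POSTGRES_PASSWORD when it initializes an empty data directory, so changing it later in the file alone will not change the password of an existing database.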

Congratulations, you have now successfully installed Plausible 👏🏼🥳.

Please note that it would be better to create three different .env files (plausible.env, postgres.env, geoip.env), but for readability I passed all containers the same environment file. In case you want to separate them, use the "###" marks as a guide. Also, using Docker Secrets is recommended, but I have to be honest - as of right now I don't know how to use them.
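
If you want to try the split, here is a rough sketch of how it could be done. It assumes that every section of the combined file starts with a "###Name" line, and the function name split_env is made up for this example.

```shell
#!/bin/sh
# Sketch only: split_env is a made-up helper name.

# split_env SOURCE_FILE OUTPUT_DIR
# Writes every section that starts with a "###Name" line to OUTPUT_DIR/name.env.
# Lines before the first "###" line are skipped.
split_env() {
  awk -v outdir="$2" '
    /^###/ {
      # "###GeoIP" becomes "geoip"; anything but letters and digits is dropped.
      name = tolower(substr($0, 4))
      gsub(/[^a-z0-9]/, "", name)
      file = outdir "/" name ".env"
      next
    }
    # awk truncates each file when it is first opened and appends afterwards.
    file != "" { print > file }
  ' "$1"
}

# Example call; use a separate output directory so that the source file
# is not overwritten while it is still being read:
# mkdir -p env && split_env plausible.env env
```

After the split, the env_file entries in the docker-compose.yml have to point to the matching file for each container. DATABASE_URL stays in the Plausible section and still contains the Postgres password, so the split does not remove that duplication.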

Bypassing AdBlocks

uBlock Origin's EasyPrivacy list blocks URLs and script files whose names contain the strings “plausible”, “statistics” or “analytics”. As a result, users are not tracked if you have chosen a poor name. When following this tutorial, I used “data” as a subdomain and script name, which right now does not trigger any AdBlock. I recommend following my example, or putting some good thought into the naming scheme.
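
To quickly check a candidate subdomain or script name, a small helper like the following may be useful. It is only a sketch: the function name is_risky_name is made up, and it tests just the three strings mentioned above, not the complete EasyPrivacy list.

```shell
#!/bin/sh
# Sketch only: checks the three strings named in this post,
# not the full EasyPrivacy list.

# is_risky_name NAME
# Succeeds (exit status 0) if NAME contains one of the known trigger strings.
is_risky_name() {
  # Compare case-insensitively by lowercasing the input first.
  name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case $name in
    *plausible*|*statistics*|*analytics*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example call:
# is_risky_name "analytics.contoso.com" && echo "pick another name"
```

A name that passes this check can still be blocked by other rules or lists, so treat it as a first filter only.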

Adding Plausible to your Website

Add the second or third line of the following snippet to the <head> section of your site, which for a Ghost blog can be found in the /yourThemeFolder/default.hbs file:

<head>
  <script async defer src="{{asset "js/stats.js"}}"></script> <!-- for Ghost -->
  <script async defer src="/replacePath/stats.js"></script> <!-- for generic websites. ADJUST THE PATH! -->
</head>

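Next, create the stats.js script in the path you defined above. For Ghost, that would be /yourThemeFolder/assets/js. Make sure to replace the placeholder “YOUR_URL” in the script with the URL of your website. Keep in mind that the script posts its events to https://YOUR_URL/api/event, so requests to that address have to end up at your Plausible instance.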

!function(n,i){"use strict";var e,o=n.location,s=n.document,t=s.querySelector('[src*="'+i+'"]'),l=t&&t.getAttribute("data-domain"),c=n.localStorage.plausible_ignore;function p(e){console.warn("Ignoring Event: "+e)}function a(e,t){if(/^localhost$|^127(?:\.[0-9]+){0,2}\.[0-9]+$|^(?:0*\:)*?:?0*1$/.test(o.hostname)||"file:"===o.protocol)return p("localhost");if(!(n.phantom||n._phantom||n.__nightmare||n.navigator.webdriver||n.Cypress)){if("true"==c)return p("localStorage flag");var a={};a.n=e,a.u=o.href,a.d=l,a.r=s.referrer||null,a.w=n.innerWidth,t&&t.meta&&(a.m=JSON.stringify(t.meta)),t&&t.props&&(a.p=JSON.stringify(t.props));var r=new XMLHttpRequest;r.open("POST",i+"/api/event",!0),r.setRequestHeader("Content-Type","text/plain"),r.send(JSON.stringify(a)),r.onreadystatechange=function(){4==r.readyState&&t&&t.callback&&t.callback()}}}function r(){e!==o.pathname&&(e=o.pathname,a("pageview"))}function u(e){for(var t=e.target,a="auxclick"==e.type&&2==e.which,r="click"==e.type;t&&(void 0===t.tagName||"a"!=t.tagName.toLowerCase()||!t.href);)t=t.parentNode;t&&t.href&&t.host&&t.host!==o.host&&((a||r)&&plausible("Outbound Link: Click",{props:{url:t.href}}),t.target&&!t.target.match(/^_(self|parent|top)$/i)||e.ctrlKey||e.metaKey||e.shiftKey||!r||(setTimeout(function(){o.href=t.href},150),e.preventDefault()))}try{var h,f=n.history;f.pushState&&(h=f.pushState,f.pushState=function(){h.apply(this,arguments),r()},n.addEventListener("popstate",r)),s.addEventListener("click",u),s.addEventListener("auxclick",u);var d=n.plausible&&n.plausible.q||[];n.plausible=a;for(var g=0;g<d.length;g++)a.apply(this,d[g]);"prerender"===s.visibilityState?s.addEventListener("visibilitychange",function(){e||"visible"!==s.visibilityState||r()}):r()}catch(e){console.error(e),(new Image).src=i+"/api/error?message="+encodeURIComponent(e.message)}}(window,"https://YOUR_URL");

stats.js
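
If you prefer not to edit the minified script by hand, the placeholder can be replaced from the shell. This is a small sketch; the function name set_script_url is made up, and it assumes the placeholder appears in the file exactly as https://YOUR_URL.

```shell
#!/bin/sh
# Sketch only: set_script_url is a made-up helper name.

# set_script_url FILE URL
# Replaces every occurrence of https://YOUR_URL in FILE with URL.
set_script_url() {
  file=$1
  # Escape the characters that are special in a sed replacement.
  escaped=$(printf '%s' "$2" | sed 's/[&|\\]/\\&/g')
  tmp=$(mktemp) || return 1
  sed "s|https://YOUR_URL|${escaped}|g" "$file" > "$tmp" && mv "$tmp" "$file"
}

# Example call with the example domain used in this post:
# set_script_url stats.js "https://data.contoso.com"
```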

Further Resources

I tried my best to make this tutorial helpful, and I certainly had fun writing it! Regardless, if you encounter any issues, make sure to check out the official Plausible documentation.


That is all, see you maybe next time 🥰.

Fabio Sauna

Just an everyday, normal Systems Engineer living for the European spirit and maintaining a semi-professional homelab.
Heidelberg, Germany