Setting up Posthog on AWS with Elasticache, Postgres, EKS and Clickhouse

Parikshit Singh
15 min read · Oct 13, 2022


Self-hosting PostHog on AWS EKS (Elastic Kubernetes Service) comes with the benefit of reduced cost compared to deploying on PostHog Cloud, and of course more control and visibility over the infrastructure for debugging or scaling later.

It is often difficult to set it up without prior knowledge of availability zone placement, which services are required, what configuration the initial cluster needs, and so on.

So, in this article we are going to set up PostHog version “1.40.0” (the latest version at the time of writing) and cover/set up the following:

  1. Installing tools for setting up AWS EKS cluster
  2. Setting up new AWS EKS cluster
  3. Setting up Elasticache (Redis) instance
  4. Setting up RDS (Postgres) instance
  5. Setting up helm chart configuration
  6. Setting up clickhouse instance
  7. Installing the helm chart

Installing tools for setting up AWS EKS cluster

We’ll begin by installing the necessary tools to speed up the process of configuration, setup and visualisation.

  1. Open a terminal and install the AWS CLI tool. This will help us add our AWS account details to our local system. Once it is installed, execute aws configure and enter your AWS access key ID, secret key and region name. Keep the output format at its default (None).
  2. Install the eksctl CLI tool, a command line tool for working with EKS clusters that automates many individual tasks like creating, deleting or updating a cluster. This one is specific to AWS EKS.
  3. Install kubectl, a command line tool for working with Kubernetes clusters. It provides information on a running cluster, for example the number of pods and their configurations.
  4. Install K9s, a terminal-based UI to interact with your Kubernetes cluster.
  5. Install Helm, a package manager for Kubernetes that helps you manage Kubernetes applications. Its “Helm charts” let us define, install and upgrade any Kubernetes application rather than relying on kubectl to configure our cluster manually. We’ll keep kubectl around anyway because it makes some other tasks easier. (See the snippet after this list for one way to install everything in one go.)
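If you happen to be on macOS with Homebrew (an assumption; each tool’s documentation covers other platforms and package managers), a minimal sketch of installing all five tools and wiring up your credentials looks like this:

# Install the five tools (Homebrew formula names; they may differ elsewhere)
brew install awscli eksctl kubectl k9s helm

# Store your AWS access key ID, secret key and default region locally
aws configure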

Setting up new AWS EKS cluster

Let’s now create a new cluster using the eksctl CLI tool which we installed in the previous steps. Execute the following in your terminal.
eksctl create cluster --name my-cluster --region region-code
Don’t forget to change the name to “posthog” and the region to whatever region you are going to create your cluster in. For this article, I am assuming that we have kept it ap-southeast-1, which makes our command:
eksctl create cluster --name posthog --region ap-southeast-1
Grab a cup of coffee ☕. It might take some time 🙂

While creating a new cluster you might get an error saying “The maximum number of internet gateways has been reached” or “The maximum number of VPC(s) has been reached”. This is because there is a limit on the number of VPCs that can exist in parallel. You should request a VPC quota increase from AWS or delete some unused existing VPCs.
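If you hit that error, a quick way to see what is already consuming the quota is the AWS CLI (a sketch; the default quota is typically 5 VPCs per region):

# List the VPCs and internet gateways currently in the region
aws ec2 describe-vpcs --query "Vpcs[].VpcId" --output text
aws ec2 describe-internet-gateways --query "InternetGateways[].InternetGatewayId" --output text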

Setting up Elasticache (Redis) instance

Basic Settings

  1. Go to AWS Elasticache.
  2. Choose “Configure and create a new cluster”.
  3. Keep cluster mode off. PostHog is not known to work well with Redis Cluster, as it does not handle requests whose keys do not hash to the same slot well.
  4. We’ll be naming our ElastiCache instance “posthog”. You can name it whatever you want. Add a relatable description; I generally keep the PostHog version in the description along with the related EKS cluster name.
  5. Keep the location as AWS Cloud. We can disable Multi-AZ and Auto-Failover for now; if required, we can set them up later.
  6. In cluster settings, keep the Redis engine version at the latest. I am keeping it at 6.2, which is the latest at the time of writing.
  7. Keep the port untouched and select default.redis<version_you_selected>.x as the parameter group. Keep the number of replicas at 0 for the initial setup. Redis is not going to be used intensively by PostHog anyway; it is used to cache some events which are read at a high frequency by ClickHouse.
  8. We can keep the node type at “cache.t3.small” for the time being and scale it up as per requirement.
  9. In subnet group settings, we’ll create a new subnet group for our PostHog instance. Select “Create a new subnet group” and enter a relevant name, for example “posthog-redis-subnet”. The description can be “Redis subnet for posthog version 1.40.0”. Now we need to find the VPC ID. For that, go to VPCs; you will find “eksctl-posthog-cluster/VPC” there, which was created by AWS CloudFormation. Copy its VPC ID and select it in the VPC ID dropdown in the subnet settings. (See the CLI snippet after this list for a quicker way to look it up.)
  10. Once you choose that VPC in the subnet group settings, all the subnets related to that VPC will be shown automatically, because these subnets were created when the VPC was built by CloudFormation.
  11. For availability zone we’ll select only one zone: a, b or c. The reason for this will be explained later in this article. I am assuming we have selected ap-southeast-1a. Now click “Next”.
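As a shortcut for step 9, you can look up the cluster VPC ID from the terminal instead of the console. A sketch, assuming eksctl tagged the VPC “eksctl-posthog-cluster/VPC” as described above (the tag value may differ on your setup):

# Find the VPC created by eksctl/CloudFormation for the posthog cluster
aws ec2 describe-vpcs \
  --filters "Name=tag:Name,Values=eksctl-posthog-cluster/VPC" \
  --query "Vpcs[0].VpcId" --output text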

Advanced Settings

Disable encryption at rest and encryption in transit.

Now, for security groups, we’ll select three security groups:

  1. Cluster shared node security group: For communication between all nodes in the cluster.
  2. Control plane security group: For communication between the control plane and worker node-groups.
  3. Cluster security group: EKS created security group applied to ENI that is attached to EKS Control Plane master nodes, as well as any managed workloads.

To select the security groups in advance configuration, click on manage in “Selected security groups” section and search your cluster’s name in the search box. The three security groups mentioned above will be shown. Select all three.

Optionally set up the backup, maintenance and log settings and click “Next”. If you are not sure what these settings should be, keep them untouched and move forward.

Review your changes and create the Redis / ElastiCache instance, then wait for AWS to get your instance up and running. 🚀

Once the instance becomes available, go to your PostHog instance and note the “Primary endpoint” from the cluster details. We’ll need this later.
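You can also fetch the primary endpoint from the CLI. A sketch, assuming the replication group ID is “posthog” as configured above:

# Print the Redis primary endpoint address (note it down for externalRedis.host)
aws elasticache describe-replication-groups \
  --replication-group-id posthog \
  --query "ReplicationGroups[0].NodeGroups[0].PrimaryEndpoint.Address" --output text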

Setting up RDS (Postgres) instance

Go to AWS RDS dashboard and in “Databases”, click on “Create database”.

Choose a database creation method

In “Choose a database creation method” choose “Standard create” to have the flexibility of providing all the necessary configuration manually.

Engine options

Choose PostgreSQL from Engine options.

For engine version, any version should work. At the time I am writing this, version 12.8-R1 works just fine.

Templates

In templates, choose “Production”.

Availability and durability

In availability and durability we can keep it at “Single DB instance”.

Settings

Provide an identifier for the db instance. This will be our instance name. We can keep it “posthog” if there are no other RDS instances with the same name.

Let’s keep the username “postgres” and provide a password. Write it somewhere safe to access later; we’ll need it in our Helm chart configuration.

Instance configuration

In Instance configuration, choose Burstable classes, which include t2 and t3 instances. This is a good choice when you want to go for smaller instances. We’ll keep it at db.t3.large for now.

If you don’t see burstable class in instance configuration options, you might want to tweak the engine version we have set previously.

Storage settings

  1. For storage settings, we’ll choose General Purpose SSD (gp2) storage because this volume class provides a balance of price and performance, and I would recommend it for a PostHog-type workload. PostHog uses the RDS instance only as a secondary storage system; ClickHouse is responsible for storing and fetching the main events data, which becomes huge over time.
  2. We can keep the allocated storage at 50GB initially and change it as we scale, for which we should enable the storage autoscaling option (there should be a checkbox for this in the storage settings).
  3. Keep the maximum storage threshold at 1000GB.

Connectivity Settings

  1. We don’t need to set up a connection with an EC2 compute resource. If we want to access this database, we’ll do it from inside the EKS cluster itself.
  2. Choose the IPv4 network type.
  3. Choose your posthog VPC from the VPC dropdown. In our case it will be something like eksctl-<your EKS cluster name>-cluster, i.e. eksctl-posthog-cluster.
  4. For DB subnet group, we can choose “Create a new DB subnet group”.
  5. Allow public access to the database, which ensures that it is accessible via its master username and password across all VPCs. We’ll limit this access later via security groups.
  6. For “VPC security group (firewall)”, select “Choose existing” so that we can choose from existing security groups.
  7. Choose all the security groups shown there that have posthog in their names. We can keep the default one selected as well.
  8. For availability zone we can go with the same zone we selected while creating the ElastiCache instance, i.e. ap-southeast-1a. Why we are doing this will be explained later.
  9. Keep the additional configuration as it is.

Database authentication

Choose “Password authentication”.

Monitoring

We can keep this untouched. (By default, Performance Insights are enabled at the time of writing.)

Additional Configuration

  1. For the initial database name, specify a string of up to 64 alphanumeric characters; this is the name of the database that Amazon RDS creates when it creates the DB instance. If you do not specify a database name, Amazon RDS does not create one. We’ll keep it as “posthog”.
  2. We’ll keep the DB parameter group as it is.
  3. For backup settings, we should ideally enable automated backups. Let’s keep the retention period at 1 day and choose any window in the day with a duration of 0.5 hours. We can keep the encryption and backup replication settings as they are. Enable both types of logs in “Log Exports”. In maintenance, we can enable “Auto minor version upgrade” and choose a maintenance window of one hour. After these settings, lastly, we can enable deletion protection.
  4. Don’t forget to check out the estimated monthly costs at the end. 💰

Before we review our changes, make sure you keep the following noted somewhere. We’ll need them later while setting up the Helm configuration.

  1. DB instance name
  2. Master user name
  3. Master password
  4. Initial database name

Review all the settings and click “Create database”. We have done a lot of labour already; while the database creation is in progress, go grab another cup of coffee.

Once the creation is completed please note the endpoint of your instance as well. It can be found in “Connectivity and security” > “Endpoint & port” on AWS RDS management console.
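The endpoint is also available from the CLI. A sketch, assuming the DB instance identifier is “posthog”:

# Print the Postgres endpoint address (note it down for the helm configuration)
aws rds describe-db-instances \
  --db-instance-identifier posthog \
  --query "DBInstances[0].Endpoint.Address" --output text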

Now it’s time to connect the resources on AWS EKS with our local system for monitoring and configuration.

Since we have already configured AWS credentials on our local system, we need to connect our system with the remote EKS cluster. To achieve that, we’ll execute the following command in a new terminal window.

aws eks update-kubeconfig --name posthog --region ap-southeast-1

The above command will update the kubeconfig on your local system. Now you’ll need to save it, for which the following command will help.

eksctl utils write-kubeconfig --cluster=posthog

Congrats! You have connected to your remote PostHog cluster. Execute k9s in the terminal to see the running pods (currently there should be none), and explore k9s and the various commands available in kubectl.
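As a quick sanity check of the connection without k9s, a couple of basic kubectl commands will do:

# Should list the worker nodes eksctl created for the posthog cluster
kubectl get nodes

# Should show only system pods (kube-system) at this point
kubectl get pods --all-namespaces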

Now it’s time to set up our PostHog Helm configuration, which will let us run PostHog pods on the remote cluster.

Keep PostHog’s “Deploying on AWS” guide and ALL_VALUES.md handy while working on the Helm configuration.

Setting up helm chart configuration

Open a new terminal and create a file values.yml in your home path.

Paste the following scratch configuration into this YAML file.

cloud: "aws"

ingress:
hostname: <your hostname here>
nginx:
enabled: true

cert-manager:
enabled: true

env:
- name: CLICKHOUSE_REPLICATION
value: "TRUE"

sentryDSN: <If you are using sentry for logging, add the DSN here>

email:
user: <user>
password: <password>
port: 587
host: smtp.sendgrid.net
from_email: <your contact email address>

redis:
enabled: false

externalRedis:
host: <redis host>
port: 6379

postgresql:
enabled: false

externalPostgresql:
postgresqlHost: <Your Psql host>
postgresqlPort: 5432
postgresqlUsername: <Psql master username>
postgresqlPassword: <Psql master password>
postgresqlDatabase: <master DB name>

zookeeper:
metrics:
enabled: true

clickhouse:
persistence:
size: 1000Gi
layout:
shardsCount: 1
nodeSelector:
clickhouse: "true"
tolerations:
- key: "dedicated"
value: "clickhouse"
operator: "Equal"
effect: "NoSchedule"

kafka:
enabled: true
persistence:
size: 50Gi
logRetentionBytes: _22_000_000_000
logRetentionHours: 4

# Enable horizontal autoscaling for serivces
pgbouncer:
hpa:
enabled: true
minpods: 2
maxpods: 4

web:
hpa:
enabled: true
minpods: 2
maxpods: 5
internalMetrics:
capture: true

worker:
hpa:
enabled: true
minpods: 2
maxpods: 5

plugins:
hpa:
enabled: true
minpods: 2
maxpods: 5

events:
hpa:
enabled: true
minpods: 2
maxpods: 5

installCustomStorageClass: true

Let’s fill in the placeholders together.

  1. ingress.hostname: This is the URL used to address your PostHog installation. We’ll need to set up DNS after installation; for now, we can keep it as analytics.<your website hostname>.com.
  2. sentryDSN: Sentry is an open-source error tracking tool with full stack traces and asynchronous context. In case you are using this tool, you can add its DSN here.
  3. email.user: Add the SMTP service username or email.
  4. email.password: Add the SMTP password for the username or email entered in the previous step.
  5. email.from_email: This defines the outbound email sender to use. Enter your company’s contact email here.
  6. externalRedis.host: Remember the primary endpoint we noted while setting up the ElastiCache instance? Add it here. Don’t forget to remove the port 6379 from its end, as the port is already set in another property, externalRedis.port.
  7. externalPostgresql.postgresqlHost: Remember the details we noted while setting up the Postgres RDS instance? Add the endpoint (host) from there.
  8. externalPostgresql.postgresqlUsername: Enter the master username you noted.
  9. externalPostgresql.postgresqlPassword: Enter the master password you noted.
  10. externalPostgresql.postgresqlDatabase: Enter the initial database name you noted. (Once all the Postgres placeholders are filled in, see the connectivity check after this list.)
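Before installing, it is worth verifying the Postgres values by connecting from inside the cluster, since that is where the pods will connect from. A hedged sketch using a throwaway psql pod; the placeholders are the values you noted earlier:

# Spin up a temporary postgres client pod, run one query, then clean up
kubectl run pg-check --rm -it --restart=Never \
  --image=postgres:12 \
  --env="PGPASSWORD=<Psql master password>" \
  -- psql -h <Your Psql host> -U postgres -d posthog -c "SELECT version();"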

Now, before installing this Helm chart, we need to understand that we have some tolerations and taints for the ClickHouse database (see the configuration above, specifically clickhouse.nodeSelector and clickhouse.tolerations). Let’s first understand these terms.

  1. Labels: Labels are key/value pairs that are attached to objects, such as pods. Labels are intended to specify identifying attributes of objects that are meaningful and relevant to users, but do not directly imply semantics to the core system. Labels can be used to organize and to select subsets of objects. They can be attached to objects at creation time and subsequently added and modified at any time. Each object can have a set of key/value labels defined, and each key must be unique for a given object.
  2. Selectors: Labels do not provide uniqueness; in general, we expect many objects to carry the same label(s). Label selectors are used to identify a set of objects, and the label selector is the core grouping primitive in Kubernetes. You can constrain a Pod so that it can only run on a particular set of node(s). Generally such constraints are unnecessary, as the scheduler will automatically do a reasonable placement (for example, spreading your Pods across nodes so as to not place Pods on a node with insufficient free resources). However, there are some circumstances where you may want to control which node a Pod deploys to, for example to ensure that a Pod ends up on a node with an SSD attached to it, or to co-locate Pods from two different services that communicate a lot in the same availability zone.
  3. Taints: Node affinity is a property of Pods that attracts them to a set of nodes (either as a preference or a hard requirement). Taints are the opposite: they allow a node to repel a set of pods.
  4. Tolerations: Tolerations are applied to pods. Tolerations allow the scheduler to schedule pods with matching taints. Tolerations allow scheduling but don’t guarantee scheduling: the scheduler also evaluates other parameters as part of its function. Taints and tolerations work together to ensure that pods are not scheduled onto inappropriate nodes: one or more taints are applied to a node, marking that the node should not accept any pods that do not tolerate the taints.

What we want is to run the ClickHouse pod(s) on a separate instance, because we might need to upgrade that instance later and it will be the most resource-intensive instance in our whole cluster. With tolerations, node selectors and taints we can allow ClickHouse pods to run on a separate instance while avoiding scheduling other pods (Kafka, the events pods, pgbouncer, web, worker, etc.) on this isolated ClickHouse instance/node.
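For illustration only, the kubectl equivalents of what we are about to configure through the EKS console look roughly like this (we will not run these by hand; the node group we create next applies them for us):

# Label a node so clickhouse.nodeSelector (clickhouse: "true") matches it
kubectl label nodes <node-name> clickhouse=true

# Taint the node so pods without the matching toleration are repelled
kubectl taint nodes <node-name> dedicated=clickhouse:NoSchedule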

Let’s now create that separate instance for clickhouse.

Setting up clickhouse instance

Go to your EKS cluster on AWS. Inside the “Compute” section, click on “Add node group”.

Node group configuration

  1. Name: clickhouse
  2. IAM role: Choose the one having your cluster name in it.

Launch template

  1. Enable “Use launch template”.
  2. Select a launch template having your cluster’s name in it.
  3. Keep the version at 1.

Kubernetes Labels

Let’s add a label with key “clickhouse” and value “true”, as mentioned in the Helm configuration in the previous section.

Kubernetes Taints

Let’s add one taint which will repel other pods from scheduling on this instance.

Let’s keep the “key” as “dedicated”, the “value” as “clickhouse” and the “Effect” as “NoSchedule”.

Once this is done, click “Next”.

Node group compute configuration

Keep the AMI type and capacity type as they are. Select the desired instance type; I am assuming we have selected “r5.2xlarge”. Yes, such a large instance is needed for ClickHouse; as we mentioned earlier, it is the most resource-intensive part of the cluster. Keep everything else untouched in this section.

Node group scaling configuration

Set desired, min and max size to 1.

Node group update configuration

Keep it as it is and click “Next”.

Node group network configuration

We’ll select only those subnets which are in zone “a”. To find such subnets, go to VPCs in a new tab and copy the VPC ID corresponding to your cluster’s VPC. Now go to Subnets, paste the VPC ID in the search bar and hit enter.

Note the Subnet IDs of the subnets in availability zone “ap-southeast-1a”, because we have set the availability zone for both Redis and Postgres to 1a. We need to select these subnet IDs in the dropdown (the CLI snippet below does the same lookup). Now hit “Next” and review your changes.
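The same subnet lookup can be done from the terminal. A sketch, with the VPC ID as a placeholder:

# List subnet IDs of the cluster VPC that sit in ap-southeast-1a
aws ec2 describe-subnets \
  --filters "Name=vpc-id,Values=<your cluster VPC ID>" \
            "Name=availability-zone,Values=ap-southeast-1a" \
  --query "Subnets[].SubnetId" --output text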

Click on “Create” and wait for the node group to be created. You can check the status in the “Compute” section of your cluster, against the new clickhouse node group that we have just created.
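Once the node group is active, you can confirm the label and taint landed correctly:

# The new node should appear with our label...
kubectl get nodes -l clickhouse=true

# ...and carry the dedicated=clickhouse:NoSchedule taint
kubectl describe nodes -l clickhouse=true | grep Taints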

Reason behind keeping single availability zone everywhere

We have kept the availability zone as ap-southeast-1a everywhere because of node affinity. When we install the Helm chart, Kubernetes pins pods that use persistent volumes to the availability zone where those volumes were created (EBS volumes are zonal). If in future you delete a node group and try to schedule those pods onto a differently configured node group, some pods will fail to schedule because they have affinity for one availability zone and you are trying to schedule them in another (in case you have not kept a single availability zone everywhere). This zone mismatch creates a lot of issues, which is why it is better to keep everything in a single availability zone.
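After installation you can see this pinning for yourself: each persistent volume carries a zone label, and its pod can only schedule in that zone. (Depending on your Kubernetes version, the label may be the older failure-domain.beta.kubernetes.io/zone instead.)

# Show each persistent volume with the availability zone it is bound to
kubectl get pv -L topology.kubernetes.io/zone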

Installing the helm chart

We can finally proceed with installing the Helm chart, using all the overrides we created in values.yml in the home path in the previous section.

Execute the following commands to add the PostHog chart repository and update it.

helm repo add posthog https://posthog.github.io/charts-clickhouse/
helm repo update

Once this is done, you can check the version of PostHog we are using via the following command.

helm search repo posthog

Finally, time to install the chart. Run the command given below to install the chart in the “posthog” namespace with the release name “posthog”.

helm upgrade --install -f values.yml --timeout 30m --create-namespace --namespace posthog posthog posthog/posthog --wait --wait-for-jobs --debug

This will install the custom helm chart configuration in your cluster from values.yml.
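While the install runs (or if --wait times out), you can watch the pods come up in another terminal:

# Watch pods in the posthog namespace until everything is Running
kubectl get pods --namespace posthog --watch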

Later if you change the configuration in values.yml you can execute the following command to update the changes in your cluster.

helm upgrade -f values.yml --timeout 7m --namespace posthog posthog posthog/posthog --atomic --wait --wait-for-jobs --debug

Execute the following shell script to get the location of PostHog’s dashboard.

POSTHOG_IP=$(kubectl get --namespace posthog ingress posthog -o jsonpath="{.status.loadBalancer.ingress[0].ip}" 2> /dev/null)
POSTHOG_HOSTNAME=$(kubectl get --namespace posthog ingress posthog -o jsonpath="{.status.loadBalancer.ingress[0].hostname}" 2> /dev/null)
if [ -n "$POSTHOG_IP" ]; then
  POSTHOG_INSTALLATION=$POSTHOG_IP
fi
if [ -n "$POSTHOG_HOSTNAME" ]; then
  POSTHOG_INSTALLATION=$POSTHOG_HOSTNAME
fi
if [ -n "$POSTHOG_INSTALLATION" ]; then
  echo -e "\n----\nYour PostHog installation is available at: http://${POSTHOG_INSTALLATION}\n----\n"
else
  echo -e "\n----\nUnable to find the address of your PostHog installation\n----\n"
fi

It will give you an ELB URL where your application is hosted. Go to AWS Route53 and add a CNAME record in one of your hosted zones, mapping the ingress.hostname you mentioned in your values.yml configuration to this ELB URL.
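If you prefer the CLI over the Route53 console, a hedged sketch of the CNAME upsert (the hosted zone ID and names are placeholders for your own values):

aws route53 change-resource-record-sets \
  --hosted-zone-id <your hosted zone ID> \
  --change-batch '{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "analytics.<your website hostname>.com",
        "Type": "CNAME",
        "TTL": 300,
        "ResourceRecords": [{"Value": "<your ELB URL>"}]
      }
    }]
  }'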

Congrats! We have finally set up our first PostHog instance on AWS with ElastiCache, Postgres, EKS and ClickHouse.
