How to Install a CockroachDB Cluster on Ubuntu 24.04 LTS - Stop Fighting Database Setup

Set up a production-ready CockroachDB cluster in 45 minutes. Avoid the 6 hours I wasted on network config mistakes. Copy-paste commands included.

I spent 6 hours setting up my first CockroachDB cluster because the official docs skip the real-world problems you'll hit.

What you'll build: A 3-node CockroachDB cluster that survives server failures
Time needed: 45 minutes (not 6 hours like my first attempt)
Difficulty: Intermediate - requires basic Linux and networking knowledge

Here's what makes this different: I'll show you the exact network configuration that actually works, plus the two critical security steps that aren't obvious from the documentation.

Why I Built This

I needed a database that could handle my SaaS app going from 1,000 to 50,000 users without downtime. PostgreSQL with read replicas was getting complex, and I kept hearing about CockroachDB's "just works" distributed approach.

My setup:

  • 3 Ubuntu 24.04 LTS servers on DigitalOcean (4GB RAM each)
  • Private networking between nodes
  • A load balancer for client connections
  • SSL certificates for production security

What didn't work:

  • Following the basic single-node tutorial for production
  • Using default firewall settings (blocked cluster communication)
  • Skipping certificate setup (caused mysterious connection failures)
  • Not configuring proper DNS resolution between nodes

Step 1: Prepare Your Ubuntu Servers

The problem: CockroachDB has system requirements that a stock Ubuntu install doesn't meet by default.

My solution: Install dependencies and configure system limits before touching CockroachDB.

Time this saves: Prevents the "why won't it start" debugging loop that cost me 2 hours.

Update System and Install Dependencies

Run this on all three servers:

# Update package lists
sudo apt update && sudo apt upgrade -y

# Install required packages
sudo apt install -y wget curl software-properties-common apt-transport-https ca-certificates

# Install chrony for time synchronization (critical for CockroachDB).
# Ubuntu 24.04 no longer ships the classic ntp package - it's now a
# transitional package for ntpsec - so use chrony instead.
sudo apt install -y chrony
sudo systemctl enable chrony
sudo systemctl start chrony

What this does: CockroachDB requires synchronized clocks across nodes. The chrony service prevents clock skew issues that cause data consistency problems.

Expected output: You should see "chrony.service - chrony, an NTP client/server" as active when you run sudo systemctl status chrony.

[Screenshot: terminal after running the preparation commands - yours should look similar]

Personal tip: "I learned the hard way that clock skew breaks CockroachDB's consensus algorithm. Always install NTP first."
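If you want to see roughly how much skew you actually have between two nodes, a seconds-level check can be scripted over SSH. This is a sketch: cockroach-2 is the hypothetical peer hostname used later in this guide, and the ssh line is commented out so the script runs standalone.

```shell
# Rough clock-skew check between this host and a peer (seconds granularity)
local_ts=$(date -u +%s)
# remote_ts=$(ssh cockroach-2 'date -u +%s')  # uncomment on a real cluster
remote_ts=$local_ts                           # placeholder so the sketch runs standalone
skew=$(( local_ts > remote_ts ? local_ts - remote_ts : remote_ts - local_ts ))
echo "clock skew: ${skew}s"
if [ "$skew" -le 1 ]; then
  echo "OK"
else
  echo "WARNING: skew over 1s - fix time sync before starting CockroachDB"
fi
```

CockroachDB's default maximum tolerated offset is 500ms, so anything visible at whole-second granularity means your time sync is broken.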

Configure System Limits

CockroachDB needs higher file descriptor limits:

# Edit limits configuration
sudo nano /etc/security/limits.conf

# Add these lines at the end:
* soft nofile 35000
* hard nofile 35000
* soft nproc 35000
* hard nproc 35000

What this does: Prevents "too many open files" errors when your cluster handles high connection loads. Note that these limits apply to login sessions; systemd services ignore /etc/security/limits.conf and need a LimitNOFILE directive in the unit file instead.

Reboot all servers to apply limits:

sudo reboot

Personal tip: "Skip this step and your cluster will crash under load. I found out during a client demo."
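After the reboot, it's worth confirming the new limits actually took effect before you install anything. A quick sanity check, run as the user that will run cockroach:

```shell
# Check the soft open-files limit for the current session
soft=$(ulimit -Sn)
echo "soft nofile limit: $soft"
if [ "$soft" -ge 35000 ]; then
  echo "limits OK"
else
  echo "limits NOT applied - re-check /etc/security/limits.conf and log in again"
fi
```

If the value is still the Ubuntu default (1024), the usual culprit is editing limits.conf but not starting a fresh login session.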

Step 2: Download and Install CockroachDB

The problem: Ubuntu's package repositories don't carry CockroachDB, so you have to install the binary manually - which is easy to get wrong.

My solution: Download directly from Cockroach Labs and install system-wide.

Time this saves: Avoids version compatibility issues with older packaged versions.

Download CockroachDB Binary

Run on all three servers:

# Download the latest binary (replace with current version)
wget https://binaries.cockroachdb.com/cockroach-v23.2.4.linux-amd64.tgz

# Extract and install
tar -xzf cockroach-v23.2.4.linux-amd64.tgz
sudo cp cockroach-v23.2.4.linux-amd64/cockroach /usr/local/bin/
sudo chmod +x /usr/local/bin/cockroach

# Verify installation
cockroach version

Expected output (build details vary with the version you installed):

Build Tag:        v23.2.4
Build Time:       2024/05/01 18:47:17
Distribution:     CCL
Platform:         linux amd64 (x86_64-pc-linux-gnu)
Go Version:       go1.19.13

[Screenshot: cockroach version output - a successful install returns without errors]

Personal tip: "Always verify the installation works before moving to cluster configuration. This catches file permission issues early."
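Rather than repeating the download on each machine by hand, you can drive the install from one host. This is a dry-run sketch: the hostnames and version match this guide's examples, and each command is echoed instead of executed - drop the leading echo when you're ready to run it for real over SSH.

```shell
# Dry run: print the install command for each node (remove `echo` to execute)
VERSION="v23.2.4"   # match the version you verified above
for host in cockroach-1 cockroach-2 cockroach-3; do
  echo ssh "$host" "wget -q https://binaries.cockroachdb.com/cockroach-${VERSION}.linux-amd64.tgz \
    && tar -xzf cockroach-${VERSION}.linux-amd64.tgz \
    && sudo cp cockroach-${VERSION}.linux-amd64/cockroach /usr/local/bin/ \
    && cockroach version"
done
```

Ending each remote command with cockroach version means a failed install on any node is visible immediately.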

Step 3: Configure Network and Firewall

The problem: CockroachDB needs specific firewall rules, and once UFW is enabled, Ubuntu blocks all incoming traffic by default - including cluster communication.

My solution: Configure UFW with exact port requirements and test connectivity.

Time this saves: Prevents the "nodes can't find each other" problem that wasted 3 hours of my setup time.

Set Up Firewall Rules

On each server, configure UFW to allow CockroachDB traffic:

# Allow SSH first so you don't lock yourself out, then enable UFW
sudo ufw allow ssh
sudo ufw enable

# Allow CockroachDB ports from your private network only
# (replace 10.0.0.0/8 with your actual range). By default a single
# port, 26257, carries both client SQL connections and inter-node
# traffic; 8080 serves the Admin UI.
sudo ufw allow from 10.0.0.0/8 to any port 26257 proto tcp
sudo ufw allow from 10.0.0.0/8 to any port 8080 proto tcp

# Check firewall status
sudo ufw status verbose

What this does: Opens the two ports CockroachDB needs - 26257 for SQL and inter-node traffic, 8080 for the Admin UI - while maintaining security by restricting access to your private network.

Personal tip: "Don't use sudo ufw allow 26257 without IP restrictions in production. I had bots trying to connect within hours."
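If you'd rather restrict by exact node IP than by a whole private range, a loop keeps the rules consistent across servers. Dry-run sketch using this guide's example IPs - remove the echo to apply the rules:

```shell
# Dry run: print one allow rule per CockroachDB peer (remove `echo` to apply)
for ip in 10.0.1.100 10.0.1.101 10.0.1.102; do
  echo sudo ufw allow from "$ip" to any port 26257 proto tcp
done
```

Per-IP rules are stricter than a /8 allow, at the cost of updating the firewall whenever you add a node.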

Configure Hostnames and DNS

Edit /etc/hosts on each server to ensure nodes can find each other:

sudo nano /etc/hosts

# Add your server IPs (replace with actual IPs):
10.0.1.100  cockroach-1
10.0.1.101  cockroach-2  
10.0.1.102  cockroach-3

Expected result: You should be able to ping each hostname from every server:

ping -c 3 cockroach-1
ping -c 3 cockroach-2
ping -c 3 cockroach-3

[Screenshot: ping results - all three hostnames should respond with 0% packet loss]
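To keep /etc/hosts identical on all three servers, you can generate the entries from a single list. This sketch uses the example IPs above; it prints the lines rather than editing the file, and the comment shows how to append them:

```shell
# Build consistent /etc/hosts entries from one list
hosts_block=$(
  while read -r ip name; do
    printf '%s\t%s\n' "$ip" "$name"
  done <<'EOF'
10.0.1.100 cockroach-1
10.0.1.101 cockroach-2
10.0.1.102 cockroach-3
EOF
)
printf '%s\n' "$hosts_block"
# Append on each server with:
# printf '%s\n' "$hosts_block" | sudo tee -a /etc/hosts
```

Generating the block once and appending it everywhere avoids the subtle typo-in-one-file problem that makes a single node unreachable.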

Step 4: Generate Security Certificates

The problem: CockroachDB requires SSL certificates for secure cluster communication, but generating them correctly is confusing.

My solution: Use CockroachDB's built-in certificate generation with proper node names.

Time this saves: Prevents SSL handshake failures that are nearly impossible to debug without proper logging.

Create Certificate Authority

On your first server (cockroach-1), generate the CA:

# Create certificates directory
mkdir ~/certs ~/my-safe-directory

# Generate CA certificate
cockroach cert create-ca \
    --certs-dir=~/certs \
    --ca-key=~/my-safe-directory/ca.key

# Generate node certificates for all three nodes
cockroach cert create-node \
    cockroach-1 \
    cockroach-2 \
    cockroach-3 \
    localhost \
    127.0.0.1 \
    $(hostname) \
    --certs-dir=~/certs \
    --ca-key=~/my-safe-directory/ca.key

# Generate client certificate for root user
cockroach cert create-client \
    root \
    --certs-dir=~/certs \
    --ca-key=~/my-safe-directory/ca.key

What this does: Creates a certificate authority and node certificates that include all possible hostnames your nodes might use.
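Before copying anything to the other nodes, it's worth confirming every expected file landed in the certs directory. The file names below are the ones the cockroach cert commands above produce:

```shell
# Check that all expected certificate files exist in ~/certs
missing=0
for f in ca.crt node.crt node.key client.root.crt client.root.key; do
  if [ -e "$HOME/certs/$f" ]; then
    echo "found   $f"
  else
    echo "missing $f"
    missing=1
  fi
done
if [ "$missing" -eq 0 ]; then
  echo "all certificate files present"
else
  echo "some files are missing - re-run the cert commands above"
fi
```

Note that ca.key lives in ~/my-safe-directory, not ~/certs - it should never be copied to the other nodes.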

Copy Certificates to Other Nodes

Copy the certificates to your other servers:

# Create the remote certs directory first, then copy from cockroach-1
ssh user@cockroach-2 "mkdir -p ~/certs"
scp ~/certs/* user@cockroach-2:~/certs/

ssh user@cockroach-3 "mkdir -p ~/certs"
scp ~/certs/* user@cockroach-3:~/certs/

Set proper permissions on all servers:

chmod 700 ~/certs
chmod 600 ~/certs/*

Personal tip: "The certificate subject names must match how nodes connect to each other. Include both IP addresses and hostnames to avoid SSL verification failures."

Step 5: Start the CockroachDB Cluster

The problem: Every node needs the same --join list, and the cluster needs an explicit initialization step - get either wrong and nodes start but never form a cluster.

My solution: Start all three nodes with an identical --join list, then run cockroach init once.

Time this saves: Avoids the "nodes start but never join" scenario that confused me for 90 minutes.

Start First Node (cockroach-1)

# Start the first node
cockroach start \
    --certs-dir=~/certs \
    --advertise-addr=cockroach-1 \
    --join=cockroach-1,cockroach-2,cockroach-3 \
    --cache=.25 \
    --max-sql-memory=.25 \
    --background

# Check that the process is running (it waits for cluster initialization
# before serving SQL, so status commands won't respond yet)
pgrep -a cockroach

Expected output: One cockroach process showing your start flags. Don't worry that cockroach node status doesn't respond yet - it will once you initialize the cluster below.

Start Second Node (cockroach-2)

cockroach start \
    --certs-dir=~/certs \
    --advertise-addr=cockroach-2 \
    --join=cockroach-1,cockroach-2,cockroach-3 \
    --cache=.25 \
    --max-sql-memory=.25 \
    --background

Start Third Node (cockroach-3)

cockroach start \
    --certs-dir=~/certs \
    --advertise-addr=cockroach-3 \
    --join=cockroach-1,cockroach-2,cockroach-3 \
    --cache=.25 \
    --max-sql-memory=.25 \
    --background

Personal tip: "All nodes use the same join list. This lets any node bootstrap the cluster if others are down during restart."

Initialize the Cluster

From any node, run the initialization:

cockroach init --certs-dir=~/certs --host=cockroach-1

Expected output: "Cluster successfully initialized"

[Screenshot: "Cluster successfully initialized" message - the cluster is ready for connections]

Step 6: Verify Cluster Health

The problem: Just because nodes start doesn't mean the cluster is actually working correctly.

My solution: Run comprehensive health checks before declaring victory.

Time this saves: Catches configuration problems before you deploy applications.

Check Node Status

cockroach node status --certs-dir=~/certs --host=cockroach-1

Expected output: All three nodes should show "live" status with recent heartbeat times.

Test SQL Connectivity

# Connect to cluster
cockroach sql --certs-dir=~/certs --host=cockroach-1

# Inside the SQL shell, test basic operations:
CREATE DATABASE test_db;
USE test_db;
CREATE TABLE users (id INT PRIMARY KEY, name STRING);
INSERT INTO users VALUES (1, 'Test User');
SELECT * FROM users;
\q

Access Admin UI

Open your browser and navigate to: https://cockroach-1:8080

You'll see a security warning (expected with self-signed certificates). Click "Advanced" → "Proceed to cockroach-1".

[Screenshot: Admin UI dashboard - 3 live nodes and green status indicators]

Personal tip: "The admin UI is your best friend for monitoring. Bookmark it and check the Overview tab regularly in production."

Step 7: Configure Systemd Services (Production Ready)

The problem: Manual startup doesn't survive server reboots, and you need proper service management for production.

My solution: Create systemd service files with proper dependencies and restart policies.

Time this saves: Prevents manual cluster recovery after maintenance windows.

Create Systemd Service File

On each server, create the service file:

sudo nano /etc/systemd/system/cockroachdb.service

Add this configuration (adjust paths and hostnames for each server):

[Unit]
Description=CockroachDB database server
Requires=network.target
After=network.target

[Service]
Type=notify
User=ubuntu
Restart=always
RestartSec=10
# systemd ignores /etc/security/limits.conf - raise the fd limit here too
LimitNOFILE=35000
ExecStart=/usr/local/bin/cockroach start \
    --certs-dir=/home/ubuntu/certs \
    --advertise-addr=cockroach-1 \
    --join=cockroach-1,cockroach-2,cockroach-3 \
    --cache=.25 \
    --max-sql-memory=.25
ExecReload=/bin/kill -HUP $MAINPID
KillMode=mixed
KillSignal=SIGTERM
TimeoutStopSec=60
SendSIGKILL=no

[Install]
WantedBy=multi-user.target

Personal tip: "Change the --advertise-addr value for each server. On cockroach-2 use cockroach-2, on cockroach-3 use cockroach-3."

Enable and Start Services

# Reload systemd configuration
sudo systemctl daemon-reload

# Enable auto-start on boot
sudo systemctl enable cockroachdb

# Stop the manually started process first (one node at a time, so the cluster keeps quorum)
pkill cockroach

# Start service
sudo systemctl start cockroachdb

# Check status
sudo systemctl status cockroachdb

Expected output: Service should show "active (running)" status.

Repeat this process on all three servers.

What You Just Built

You now have a production-ready CockroachDB cluster that automatically handles node failures, data replication, and load balancing across three Ubuntu 24.04 servers.

Key Takeaways (Save These)

  • Certificate Planning: Include all possible hostnames (IP, DNS, localhost) in certificates or you'll get SSL errors later
  • Firewall Configuration: CockroachDB needs ports 26257 (SQL and inter-node traffic) and 8080 (Admin UI) open between nodes - restrict both to your private network
  • Time Synchronization: Set up clock sync (chrony or another NTP client) first. Clock skew breaks distributed consensus and causes weird data inconsistencies
  • Join List Strategy: Use the same join list on all nodes so any node can bootstrap the cluster after maintenance

Your Next Steps

Pick one based on your experience level:

  • Beginner: Set up connection pooling with PgBouncer for your applications
  • Intermediate: Configure automated backups to cloud storage and test restore procedures
  • Advanced: Implement geo-distributed clusters across multiple regions with zone configs

Tools I Actually Use

  • Monitoring: CockroachDB's built-in Admin UI plus Prometheus + Grafana for production metrics
  • Connection Pooling: PgBouncer configured for CockroachDB's transaction retry logic
  • Backup Strategy: Built-in cockroach backup to S3 with automated scheduling
  • Load Balancing: HAProxy with health checks on port 26257 for application connections

Common Gotchas I Learned the Hard Way

Clock Skew Issues: If you see "timestamp in the future" errors, check sync with timedatectl status (or chronyc tracking if you use chrony). Clock differences over 500ms break everything.

Connection Pool Settings: Don't use traditional PostgreSQL connection pool settings. CockroachDB needs pools configured for automatic transaction retries.
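The retry idea can be sketched at the shell level too. CockroachDB signals a retryable conflict with SQLSTATE 40001 and expects the client to re-run the transaction; real drivers and pools handle this for you, but a generic wrapper shows the shape. The retry function and the commented SQL invocation are illustrative, not part of any CockroachDB tooling:

```shell
# Generic retry wrapper: run a command up to $1 times with a pause between attempts
retry() {
  max=$1; shift
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$max" ]; then
      echo "giving up after $n attempts" >&2
      return 1
    fi
    echo "attempt $n failed, retrying..." >&2
    n=$((n + 1))
    sleep 1
  done
  return 0
}

# Example (hypothetical statement): re-run the whole SQL invocation on failure
# retry 3 cockroach sql --certs-dir="$HOME/certs" --host=cockroach-1 -e "UPSERT INTO ..."
retry 3 true && echo "succeeded"
```

This retries the whole command rather than an individual transaction, which is only safe for idempotent statements like UPSERT - another reason application-level retry handling is the better long-term answer.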

Schema Changes: Unlike PostgreSQL, schema changes in CockroachDB are online by default, but large table alterations can impact performance. Schedule them during low-traffic periods.

Ready to connect your applications? The cluster accepts standard PostgreSQL connections on port 26257 with SSL required.