Streaming Replication
Following steps are described how to setup the streaming replication between a master and slave.
Master
Master information
Example db:
- host: somerset.internal.softwareheritage.org
- port: 5434
- db: softwareheritage-indexer
Replication user needed
Make sure a replication user exists on the master server
su - postgres createuser -p 5434 replicator --replication psql -p 5434 template1 alter user replicator with password '<precious-secret>';
Note: To retrieve the password.
swhpass ls infra/somerset/postgres/5434/replicator
Postgres configuration adaptation
pghba.conf
File: /etc/postgresql/10/indexer/pghba.conf
# TYPE DATABASE USER CIDR-ADDRESS METHOD host replication replicator 192.168.200.0/21 md5 # Azure
postgresql.conf
File: /etc/postgresql/10/indexer/postgresql.conf
wal_level = replica # or higher, logical is fine wal_keep_segments = 500 # in logfile segments, 16MB each; 0 disables wal_compression = on # enable compression of full-page writes
Note:
- walcompression is an optimization which reduces I/O and network
traffic at the expense of CPU time. It is not required.
- walkeepsegments needs to be set to a reasonable value in order to
keep enough wal data between the initial database backup and its subsequent regular startup on the slave. Slave startup can fail at first, so make sure to have at least a couple hours of slack in case things go wrong.
Restart db
Make the db aware of those changes:
# to determine the main pid of the concerned db sudo systemctl status postgres@10-indexer.service # Actually make the db aware # kill -HUP <pid-of-db> # or sudo systemctl restart postgres@10-indexer.service
Slave
Package Dependencies
Install client for connection part, and server for replication one.
# client part and server sudo apt install postgresql-client-10 postgresl-10
Postgres setup
Make sure the connection is ok.
As postgres user:
su - postgres touch .pgpass cat > .pgpass <<EOF somerset.internal.softwareheritage.org:5434:replication:replicator:<precious-secret> EOF chmod 0600 .pgpass # should permit connection
Create replica cluster
This is a necessary and temporary step to permit to initialize configuration files that would need editing down the line.
# drop installed main cluster as this won't be used sudo pg_dropcluster --stop 10 main # prepare directories in data volume sudo mkdir /srv/softwareheritage/postgres sudo chown postgres:postgres /srv/softwareheritage/postgres # create the cluster configuration sudo -u postgres env LC_ALL=C.UTF-8 \ pg_createcluster 10 replica \ -D /srv/softwareheritage/postgres/10/replica \ -p 5434
Make sure the slave can connect to the master
Edit the local configuration according to the master. The slave must have the at least the same boundaries for some configuration options:
file: /etc/postgresql/10/replica/postgresql.conf
max_connections = 128 # same as master somerset.internal.softwareheritage.org:5434:softwareheritage-indexer max_worker_processes = 10 # same as master somerset.internal.softwareheritage.org:5434:softwareheritage-indexer
Note: You will be notified in logs when trying to start the db that some setup are not correctly set. Some form of:
ardumont@dbreplica1 $ sudo journalctl -xef -u postgresql@10-replica.service ... Jun 15 08:13:24 dbreplica1 postgresql@10-replica[654]: 2018-06-15 08:13:24.425 UTC [664] FATAL: hot standby is not possible because max_worker_processes = 8 is a lower setting than on the ma ster server (its value was 10) ...
Retrieve backup
From the replica node, as postgres user:
postgres@dbreplica1 $ pg_basebackup --progress --write-recovery-conf \ -h somerset.internal.softwareheritage.org -p 5434 -U replicator \ -D /srv/softwareheritage/postgres/10/somerset-swh-indexer.backup
Note that this step could fail if your .pgpass is not setup correctly.
Install backup as db
Move the backup files to the regular pgdata location and start the slave server:
postgres@dbreplica1 $ cd /srv/softwareheritage/postgres/10 # Remove 'temporary' cluster postgres@dbreplica1 $ rm -rf replica # Rename the backup as replica postgres@dbreplica1 $ mv somerset-swh-indexer.backup replica ardumont@dbreplica1 $ sudo systemctl status postgresql@10-replica.service
Start cluser
Now that everything is ready, start the cluster. This will start by redoing the necessary steps to synchronize the dbs.
sudo systemctl start postgresql@10-replica.service