|
|
Line 1: |
Line 1: |
− | Following steps are described how to setup the streaming replication
| + | Move away to [https://intranet.softwareheritage.org/index.php?title=Streaming_Replication intranet]. |
− | between a master and slave.
| |
− | | |
− | = Master =
| |
− | == Master information ==
| |
− | Example db:
| |
− | | |
− | * host: somerset.internal.softwareheritage.org
| |
− | | |
− | * port: 5434
| |
− | | |
− | * db: softwareheritage-indexer
| |
− | | |
− | == Replication user needed ==
| |
− | Make sure a replication user exists on the master server
| |
− | | |
− | su - postgres
| |
− | createuser -p 5434 replicator --replication
| |
− | psql -p 5434 template1
| |
− | alter user replicator with password '<precious-secret>';
| |
− | | |
− | Note: To retrieve the password.
| |
− | | |
− | swhpass ls infra/somerset/postgres/5434/replicator
| |
− | | |
− | == Postgres configuration adaptation ==
| |
− | === pg_hba.conf ===
| |
− | File: /etc/postgresql/10/indexer/pg_hba.conf
| |
− | | |
− | # TYPE DATABASE USER CIDR-ADDRESS METHOD
| |
− | host replication replicator 192.168.200.0/21 md5 # Azure
| |
− | | |
− | === postgresql.conf ===
| |
− | File: /etc/postgresql/10/indexer/postgresql.conf
| |
− | | |
− | wal_level = replica # or higher, logical is fine
| |
− | wal_keep_segments = 500 # in logfile segments, 16MB each; 0 disables
| |
− | wal_compression = on # enable compression of full-page writes
| |
− | | |
− | Note:
| |
− | | |
− | * wal_compression is an optimization which reduces I/O and network
| |
− | traffic at the expense of CPU time. It is not required.
| |
− | | |
− | * wal_keep_segments needs to be set to a reasonable value in order to
| |
− | keep enough wal data between the initial database backup and its
| |
− | subsequent regular startup on the slave. Slave startup can fail at
| |
− | first, so make sure to have at least a couple hours of slack in case
| |
− | things go wrong.
| |
− | | |
− | == Restart db ==
| |
− | Make the db aware of those changes:
| |
− | | |
− | # to determine the main pid of the concerned db
| |
− | sudo systemctl status postgres@10-indexer.service
| |
− | # Actually make the db aware
| |
− | # kill -HUP <pid-of-db>
| |
− | # or
| |
− | sudo systemctl restart postgres@10-indexer.service
| |
− | | |
− | = Slave =
| |
− | == Package Dependencies ==
| |
− | Install client for connection part, and server for replication one.
| |
− | | |
− | # client part and server
| |
− | sudo apt install postgresql-client-10 postgresl-10
| |
− | | |
− | == Postgres setup ==
| |
− | Make sure the connection is ok.
| |
− | | |
− | As postgres user:
| |
− | | |
− | su - postgres
| |
− | touch .pgpass
| |
− | cat > .pgpass <<EOF
| |
− | somerset.internal.softwareheritage.org:5434:replication:replicator:<precious-secret>
| |
− | EOF
| |
− | chmod 0600 .pgpass
| |
− | # should permit connection
| |
− | | |
− | == Create replica cluster ==
| |
− | This is a necessary and temporary step to permit to initialize
| |
− | configuration files that would need editing down the line.
| |
− | | |
− | # drop installed main cluster as this won't be used
| |
− | sudo pg_dropcluster --stop 10 main
| |
− | # prepare directories in data volume
| |
− | sudo mkdir /srv/softwareheritage/postgres
| |
− | sudo chown postgres:postgres /srv/softwareheritage/postgres
| |
− | # create the cluster configuration
| |
− | sudo -u postgres env LC_ALL=C.UTF-8 \
| |
− | pg_createcluster 10 replica \
| |
− | -D /srv/softwareheritage/postgres/10/replica \
| |
− | -p 5434
| |
− | | |
− | == Make sure the slave can connect to the master ==
| |
− | Edit the local configuration according to the master. The slave must
| |
− | have the at least the same boundaries for some configuration options:
| |
− | | |
− | file: /etc/postgresql/10/replica/postgresql.conf
| |
− | | |
− | max_connections = 128 # same as master somerset.internal.softwareheritage.org:5434:softwareheritage-indexer
| |
− | max_worker_processes = 10 # same as master somerset.internal.softwareheritage.org:5434:softwareheritage-indexer
| |
− | | |
− | Note: You will be notified in logs when trying to start the db that
| |
− | some setup are not correctly set. Some form of:
| |
− | | |
− | ardumont@dbreplica1 $ sudo journalctl -xef -u postgresql@10-replica.service
| |
− | ...
| |
− | Jun 15 08:13:24 dbreplica1 postgresql@10-replica[654]: 2018-06-15 08:13:24.425 UTC [664] FATAL: hot standby is not possible because max_worker_processes = 8 is a lower setting than on the ma
| |
− | ster server (its value was 10)
| |
− | ...
| |
− | | |
− | == Retrieve backup ==
| |
− | From the replica node, as postgres user:
| |
− | | |
− | postgres@dbreplica1 $ pg_basebackup --progress --write-recovery-conf \
| |
− | -h somerset.internal.softwareheritage.org -p 5434 -U replicator \
| |
− | -D /srv/softwareheritage/postgres/10/somerset-swh-indexer.backup
| |
− | | |
− | Note that this step could fail if your .pgpass is not setup correctly.
| |
− | | |
− | == Install backup as db ==
| |
− | Move the backup files to the regular pgdata location and start the
| |
− | slave server:
| |
− | | |
− | postgres@dbreplica1 $ cd /srv/softwareheritage/postgres/10
| |
− | # Remove 'temporary' cluster
| |
− | postgres@dbreplica1 $ rm -rf replica
| |
− | # Rename the backup as replica
| |
− | postgres@dbreplica1 $ mv somerset-swh-indexer.backup replica
| |
− | ardumont@dbreplica1 $ sudo systemctl status postgresql@10-replica.service
| |
− | | |
− | == Start cluser ==
| |
− | Now that everything is ready, start the cluster. This will start by
| |
− | redoing the necessary steps to synchronize the dbs.
| |
− | | |
− | sudo systemctl start postgresql@10-replica.service
| |