Difference between revisions of "Streaming Replication"

From Software Heritage Wiki
Jump to: navigation, search
m (Fix _, sub issues)
(Move page to the intranet (was wrongly created here, no credentials leaked or anything anyway))
Line 1: Line 1:
Following steps are described how to setup the streaming replication
Move away to [https://intranet.softwareheritage.org/index.php?title=Streaming_Replication intranet].
between a master and slave.
= Master =
== Master information ==
Example db:
* host: somerset.internal.softwareheritage.org
* port: 5434
* db: softwareheritage-indexer
== Replication user needed ==
Make sure a replication user exists on the master server
    su - postgres
    createuser -p 5434 replicator --replication
    psql -p 5434 template1
    alter user replicator with password '<precious-secret>';
Note: To retrieve the password.
    swhpass ls infra/somerset/postgres/5434/replicator
== Postgres configuration adaptation ==
=== pg_hba.conf ===
File: /etc/postgresql/10/indexer/pg_hba.conf
    # TYPE  DATABASE        USER            CIDR-ADDRESS            METHOD
    host    replication    replicator        md5 # Azure
=== postgresql.conf ===
File: /etc/postgresql/10/indexer/postgresql.conf
    wal_level = replica      # or higher, logical is fine
    wal_keep_segments = 500  # in logfile segments, 16MB each; 0 disables
    wal_compression = on    # enable compression of full-page writes
* wal_compression is an optimization which reduces I/O and network
  traffic at the expense of CPU time. It is not required.
* wal_keep_segments needs to be set to a reasonable value in order to
  keep enough wal data between the initial database backup and its
  subsequent regular startup on the slave. Slave startup can fail at
  first, so make sure to have at least a couple hours of slack in case
  things go wrong.
== Restart db ==
Make the db aware of those changes:
    # to determine the main pid of the concerned db
    sudo systemctl status postgres@10-indexer.service
    # Actually make the db aware
    # kill -HUP <pid-of-db>
    # or
    sudo systemctl restart postgres@10-indexer.service
= Slave =
== Package Dependencies ==
Install client for connection part, and server for replication one.
    # client part and server
    sudo apt install postgresql-client-10 postgresl-10
== Postgres setup ==
Make sure the connection is ok.
As postgres user:
    su - postgres
    touch .pgpass
    cat > .pgpass <<EOF
    chmod 0600 .pgpass
    # should permit connection
== Create replica cluster ==
This is a necessary and temporary step to permit to initialize
configuration files that would need editing down the line.
    # drop installed main cluster as this won't be used
    sudo pg_dropcluster --stop 10 main
    # prepare directories in data volume
    sudo mkdir /srv/softwareheritage/postgres
    sudo chown postgres:postgres /srv/softwareheritage/postgres
      # create the cluster configuration
    sudo -u postgres env LC_ALL=C.UTF-8 \
        pg_createcluster 10 replica \
            -D /srv/softwareheritage/postgres/10/replica \
            -p 5434
== Make sure the slave can connect to the master ==
Edit the local configuration according to the master.  The slave must
have the at least the same boundaries for some configuration options:
file: /etc/postgresql/10/replica/postgresql.conf
    max_connections = 128      # same as master somerset.internal.softwareheritage.org:5434:softwareheritage-indexer
    max_worker_processes = 10  # same as master somerset.internal.softwareheritage.org:5434:softwareheritage-indexer
Note: You will be notified in logs when trying to start the db that
some setup are not correctly set.  Some form of:
    ardumont@dbreplica1 $ sudo journalctl -xef -u postgresql@10-replica.service
    Jun 15 08:13:24 dbreplica1 postgresql@10-replica[654]: 2018-06-15 08:13:24.425 UTC [664] FATAL:  hot standby is not possible because max_worker_processes = 8 is a lower setting than on the ma
    ster server (its value was 10)
== Retrieve backup ==
From the replica node, as postgres user:
    postgres@dbreplica1 $ pg_basebackup --progress --write-recovery-conf \
            -h somerset.internal.softwareheritage.org -p 5434 -U replicator \
            -D /srv/softwareheritage/postgres/10/somerset-swh-indexer.backup
Note that this step could fail if your .pgpass is not setup correctly.
== Install backup as db ==
Move the backup files to the regular pgdata location and start the
slave server:
    postgres@dbreplica1 $ cd /srv/softwareheritage/postgres/10
    # Remove 'temporary' cluster
    postgres@dbreplica1 $ rm -rf replica
    # Rename the backup as replica
    postgres@dbreplica1 $ mv somerset-swh-indexer.backup replica
    ardumont@dbreplica1 $ sudo systemctl status postgresql@10-replica.service
== Start cluser ==
Now that everything is ready, start the cluster.  This will start by
redoing the necessary steps to synchronize the dbs.
    sudo systemctl start postgresql@10-replica.service

Revision as of 09:38, 15 June 2018

Move away to intranet.