Difference between revisions of "New machine setup"
Revision as of 09:02, 19 January 2018
Setting up a new Software Heritage desktop machine
Debian install
- Stable
- root w/temporary password; no regular user (after setting up root password, cancel twice and jump forward to clock settings)
- full disk with LVM; reduce home LV to leave half of the disk free
- Standard system utilities, ssh server, no desktop environment (puppet will install that)
Base system setup (from console)
- Login as root
- Enable password root access in ssh (/etc/ssh/sshd_config, PermitRootLogin yes)
- Write down IP configuration and add the machine to the Gandi DNS
- Test SSH login as root from your workstation
- Stay at your desk :)
Full system setup (from your desk)
- SSH login as root
- Edit sources.list to add testing
- apt-get update, dist-upgrade, autoremove --purge
- While you wait, create Vpn certificates for the new machine
- add the machine to the puppet configuration, in the swh_desktop role
- apt-get install puppet openvpn
- configure openvpn per Vpn
- add pergamon IP address to /etc/resolv.conf
- add louvre.softwareheritage.org to /etc/hosts
- configure puppet
- systemctl disable puppet
- server=pergamon.internal.softwareheritage.org in /etc/puppet/puppet.conf
- puppet agent --enable
- puppet agent -t
- run puppet on pergamon to update munin server config
- set proper root password, add it to password store
- reboot
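The "configure puppet" step above can be scripted. The following is a minimal sketch, assuming a puppet.conf in the usual INI layout; the function name `set_puppet_server` and the "leave an existing server= line alone" behaviour are illustrative choices, not part of the documented procedure.

```shell
#!/bin/sh
# Sketch: make sure "server=<master>" is set in the [main] section of a puppet.conf.
# set_puppet_server is a hypothetical helper name.
set_puppet_server() {
    conf="$1"
    server="$2"
    # Leave the file untouched if a server= line already exists.
    if grep -q '^[[:space:]]*server[[:space:]]*=' "$conf"; then
        return 0
    fi
    if grep -q '^\[main\]' "$conf"; then
        # Insert the setting right after the [main] header.
        awk -v s="$server" '{ print } /^\[main\]/ { print "server=" s }' "$conf" > "$conf.tmp" \
            && mv "$conf.tmp" "$conf"
    else
        # No [main] section yet: append one.
        printf '[main]\nserver=%s\n' "$server" >> "$conf"
    fi
}
```

It could then be called as `set_puppet_server /etc/puppet/puppet.conf pergamon.internal.softwareheritage.org`, followed by `puppet agent --enable` and `puppet agent -t` as listed above.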
Setting up a new Virtual Machine (manual process)
Naming scheme: machine_name.<zone>.<hoster>.internal.softwareheritage.org.
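As an illustration of the naming scheme, here is a small sketch that derives the short hostname, the fully qualified name, and the /etc/hosts line used in the hostname step further down. The helper names (`vm_short_name`, `vm_fqdn`, `hosts_entry`) are hypothetical, and the IP address in the usage note is a documentation placeholder.

```shell
#!/bin/sh
# Sketch only: vm_short_name, vm_fqdn and hosts_entry are hypothetical helper names.

# Short hostname, as written to /etc/hostname: machine.zone.hoster
vm_short_name() {
    printf '%s.%s.%s\n' "$1" "$2" "$3"
}

# Fully qualified name under the internal domain.
vm_fqdn() {
    printf '%s.internal.softwareheritage.org\n' "$(vm_short_name "$1" "$2" "$3")"
}

# /etc/hosts line: <ip> <fqdn> <short name>
hosts_entry() {
    ip="$1"
    shift
    printf '%s %s %s\n' "$ip" "$(vm_fqdn "$@")" "$(vm_short_name "$@")"
}
```

For example, `hosts_entry 192.0.2.10 worker01 euwest azure` prints `192.0.2.10 worker01.euwest.azure.internal.softwareheritage.org worker01.euwest.azure`.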
- Provision the virtual machine from a Debian image (Provisioning script example for azure)
- Sets a temporary admin user with an ssh key (the real setup will be installed through puppet later)
- Avoid public IPs if you don't need them
- Example:
./create-vm.sh worker01
- After provisioning, the VM should have been assigned a private IP address; note it down for the DNS step
- Add the machine to the internal DNS: commit and push the change in swh-site, then deploy the latest puppet recipes on the puppet master (pergamon) and apply them to the internal DNS server (also pergamon)
ssh pergamon.internal.softwareheritage.org
sudo /usr/local/bin/deploy.sh
sudo puppet agent --test
- Connect to the machine with the temp admin user
ssh -i <private-key-matching-the-public-key-used-during-provisioning> <user>@<new-vm>
- Update machine to the latest
apt-get update
apt-get dist-upgrade
apt-get autoremove --purge
- Set a root password (generate it on your machine with `xkcdpass` and add it to the password store)
on your machine:
# generate password (for example)
xkcdpass --numwords=5 --delimiter=' ' --min=5 --max=6 --valid-chars='[a-z]'
# insert into swh's password store
cd /path/to/swh/credentials
pass git pull --rebase
pass insert infra/<machine-name>/root
pass git push
on the vm:
mkpasswd
- Allow root ssh password login (edit /etc/ssh/sshd_config and set the following options to yes)
PermitRootLogin yes
PasswordAuthentication yes
- Restart sshd service
systemctl restart sshd.service
- In another shell, check that the ssh connection as root works.
ssh root@<new-vm>
- If the connection works, close the first connection (the one opened with the temporary user).
- As root, remove the temporary user (foo in this example)
deluser foo
rm -rf /home/foo
- Set the hostname to the appropriate one:
- Edit /etc/hostname: machine.zone.hoster (e.g. worker01.euwest.azure)
- Edit /etc/hosts: add the line `<ip> machine.zone.hoster.internal.softwareheritage.org machine.zone.hoster`
- reboot to get new hostname
reboot
- connect as root again to the machine
ssh root@<new-vm>
- install and setup puppet
apt-get install puppet
systemctl disable puppet
- Edit /etc/puppet/puppet.conf and add the following line in the [main] section
server=pergamon.internal.softwareheritage.org
- run puppet agent:
puppet agent --enable
# Add a fact about its location (for example, replace <vm_location> with "azure_euwest")
mkdir -p /etc/facter/facts.d/
echo "location=<vm_location>" > /etc/facter/facts.d/location.txt
# check that everything is ok (if an existing hostname is reused, puppet may complain about certificate errors and ask for further actions; do as instructed)
puppet agent --test --noop
# when everything is fine, actually apply the manifest
puppet agent --test
- On the puppet master host (pergamon):
- run puppet to update munin server config
- reboot to check new services
- update clustershell configuration on louvre
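The sshd_config edit from the steps above can also be done non-interactively. This is a minimal sketch, assuming GNU sed (as shipped with Debian); the function name `enable_root_password_login` is hypothetical. Note that options appended at the end of the file would fall inside a trailing `Match` block if the file has one, so check the result before restarting sshd.

```shell
#!/bin/sh
# Sketch: set PermitRootLogin and PasswordAuthentication to "yes" in an sshd_config.
# enable_root_password_login is a hypothetical helper name; requires GNU sed (-i, -E).
enable_root_password_login() {
    conf="${1:-/etc/ssh/sshd_config}"
    for opt in PermitRootLogin PasswordAuthentication; do
        if grep -Eq "^[#[:space:]]*${opt}[[:space:]]" "$conf"; then
            # Rewrite existing (possibly commented-out) lines for this option.
            sed -i -E "s/^[#[:space:]]*${opt}[[:space:]].*/${opt} yes/" "$conf"
        else
            # Option absent: append it.
            printf '%s yes\n' "$opt" >> "$conf"
        fi
    done
}
```

After running `enable_root_password_login` as root on the VM, restart the service with `systemctl restart sshd.service` and test the root login from another shell, as described above.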
Troubleshoot
Recreating machine with the same exact configuration
When a machine is scratched and recreated with the same name, the old certificate (keyed on the machine's fqdn) must first be cleaned up on the puppet master:
puppet cert clean <fqdn>
Duplicate resource found error
After a mistake (a wrong hostname setup, for example), stale data can remain on the puppet master (in puppetdb).
The puppet agent then complains about duplicate resources, for example:
A duplicate resource was found while collecting exported resources
This means stale data exists on the master (in puppetdb). The following command cleans it up:
puppet node deactivate <wrong-fqdn>