Hetzner Root-Server with dual Hardware RAID-1 and encrypted LVM on Debian 9

Introduction

This guide explains how to install a secure, headless Hetzner Root-Server that we want to use as our new PostgreSQL Backup-Server. It covers some special pitfalls along the way, like LUKS on Hardware-RAIDs and SSH-based remote decryption of more than one encrypted volume at boot time.

But some back story first.

Recently we started to rethink our Backup strategy for our PostgreSQL Database-Cluster.
We used to have a 2-node Cluster with DRBD and Corosync and a simple twice-a-day pg_dump-based logical backup until last year. That was ok for some time but the cluster setup was somewhat aged and we decided to switch to a new Patroni-based approach with 3 physical servers and a fourth virtual node for a 4-node cluster. We also switched from synchronous DRBD to PostgreSQL's synchronous streaming replication method with one synchronous standby and an asynchronous one.

We still had the twice-a-day logical backup and that was not sufficient anymore, so I started to look for a better alternative. I wanted something that saves our data much more frequently and also does regular, automated restores to a Backup-PostgreSQL instance, so we can be sure not to have "Schrödinger's Backups".
There is one fundamental thing with backups: you can only be sure they are working when you regularly restore them.

After a lot of research, I decided to use pgBarman and their first Scenario for our backup strategy because it works out of the box with our current cluster, without the need to modify its configuration any further or to restart the cluster.

The pgBarman config is not the topic of this post. Instead, it covers the underlying server and the pitfalls I found while setting it up, which I want to share in case others get trapped as well.

Disclaimer

I don't take responsibility for any damage that my tutorial may cause to your system. I've tested it twice on my system and it worked well both times, but the commands may be different on your system. Please keep that in mind while following my remarks. The commands should be largely identical on Debian derivatives and on older or newer versions of them.

Considerations for the server

I wrote some background story for this post, so you can understand why I decided the way I did. Here are the raw server specs first:

  • Hetzner Serverbörse Root-Server "SB48"
  • Intel Xeon 4C/8T E3-1275V2
  • 32GB DDR3 ECC Reg. RAM
  • LSI MegaRAID SAS 9260-4i Hardware-RAID-Controller
  • 2x 300GB SAS 15k Disks
  • 2x 3TB SATA 7.2k Disks
  • Redundant PSUs

Since we need to have our Backups saved in a geo-redundant way, I decided to go with a Hetzner Root-Server. These servers are located in ISO27001-certified datacenters in Nuremberg or Falkenstein.

How do we secure the server?

I wanted to encrypt everything that can be encrypted, so I decided to go with full-disk encryption using LUKS for everything except the boot partition. We need the unencrypted boot partition in order to at least be able to boot to dropbear with busybox for remote decryption of the server at boot time.

There are some tutorials available to encrypt those servers, but most of them use only a software RAID and/or a single encrypted device with or without an LVM on top of it. We will have two encrypted devices on two hardware RAIDs using Debian 9 "Stretch" and two LVM Volume Groups. And we want to be able to decrypt both LUKS devices at boot time, which is normally not possible due to a bug in the cryptroot-unlock script included in the Debian 9 cryptsetup package. But more on that later in this post.

Let's start!

Configuring the Hardware-RAID

Boot into the Rescue System

If you buy a server from Hetzner's Serverbörse, your server will automatically boot up to the Rescue-System, so you can continue straight on. If you have another server, first boot to the Rescue System in order to follow the steps.

Removing the existing RAIDs

In case there is already a RAID configuration on your Hardware-RAID controller, you can easily remove it with the command-line tool called megacli. I recommend doing this step because this way you can control exactly how your RAIDs are built and configured.

With the following commands you can first check what physical disks are attached and how (important for the RAID config later) and which RAID setups are currently present.

~$ megacli -PDList -a0 # Lists all physical disks on the first adapter (controller)
Adapter #0

Enclosure Device ID: 252
Slot Number: 0
...
Enclosure Device ID: 252
Slot Number: 1
...
Enclosure Device ID: 252
Slot Number: 2
...
Enclosure Device ID: 252
Slot Number: 3
...

~$ megacli -LDInfo -Lall -a0 # Lists all logical devices configured (e.g. RAID configurations)

Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :
RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
Size                : 278.875 GB
...
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 1 (Target Id: 1)
Name                :
RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
Size                : 2.728 TB
...

If there are any configurations, you can first save them to a config file, in case you make a mistake and want to restore the config afterwards, and then wipe them with the following commands.

megacli -CfgSave -f raidcfg.txt -a0 # Saves the current config to raidcfg.txt
megacli -CfgClr -a0 # Removes all available RAID configurations

Build up the new RAID config

Since I have two identical disks of each type, my preferred config is two RAID-1 (mirroring) arrays, one per disk type. In fact you can choose whatever you want here and also use a RAID-0 and a RAID-1 or mix them otherwise. I want to have the System installed on the 300GB SAS RAID-1, including the later PostgreSQL Restore-Server instance, and have the PostgreSQL-Backup itself stored on the slower but larger 3TB RAID-1.

Check that all of our physical disks are in a "good state"

For an LSI controller you can only use disks that are in a "good state", so we might need to change that on our current disks if they aren't already.

Check the status first with

~$ megacli -PDList -a0 | grep "Firmware state"
Firmware state: Unconfigured(good), Spun Up
Firmware state: Unconfigured(good), Spun Up
Firmware state: Unconfigured(good), Spun Up
Firmware state: Unconfigured(good), Spun Up

If the output for a disk does not look like the one shown above, you probably have to change its state. For example, if disk number 3 above is not in the right state, you can change it with

megacli -PDMakeGood -PhysDrv[252:2] -a0

The number before the colon is the Enclosure ID and the second is the slot number. You can find both with the -PDList command above. The slot number count starts at zero, so disk 3 is actually slot number 2.
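
To avoid counting slots by hand, the output of -PDList can be turned into <enclosure id>:<slot number> pairs directly. The following is only a small sketch: the function name pd_not_good is made up for this example, and it assumes that the "Firmware state" line of each disk appears after its "Enclosure Device ID" and "Slot Number" lines. It lists every disk whose state is anything other than Unconfigured(good).

```shell
#!/bin/sh
# Hypothetical helper (not part of megacli): reads `megacli -PDList -a0`
# output on stdin and prints enclosure:slot for every disk whose firmware
# state is not "Unconfigured(good)".
pd_not_good() {
    awk -F': *' '
        /^Enclosure Device ID/ { encl = $2 }
        /^Slot Number/         { slot = $2 }
        /^Firmware state/ && $2 !~ /^Unconfigured\(good\)/ { print encl ":" slot }
    '
}

# Usage (in the Rescue System):
#   megacli -PDList -a0 | pd_not_good
```

Each printed pair can be used as-is inside the square brackets of -PhysDrv. Keep in mind that disks which are already part of a RAID are listed as well, because their state is not Unconfigured(good) either.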

Configuring the new RAIDs

Now we're ready to actually create our new RAIDs. For my project I decided to create two RAID-1s as explained above, so here are the commands. The explanation is below them.

megacli -CfgLdAdd -r1 [252:0,252:1] WB RA Direct CachedBadBBU -a0
megacli -CfgLdAdd -r1 [252:2,252:3] WB RA Direct CachedBadBBU -a0

At first we tell the controller that we want to add a new logical device configuration with -CfgLdAdd and that it should be a RAID-1 config with -r1. Possible other options are -r0, -r5 or -r10 depending on your controller and available disks of course.

The next part within the square brackets tells megacli which physical disks you want to use in your RAID config. Since we checked our physical disk setup above with the -PDList command, I know that both 300GB SAS disks are on slots 0 and 1 and the two 3TB disks are on 2 and 3, so we define them with <enclosure id>:<slot number> separated by a comma.

After that we configure some more options like WB for Write Back so that the controller can send a transfer completion signal as soon as the data has been written to the cache to speed up writes.
RA for Read-Ahead which means that the controller can read data ahead of the current requested data in case this data will be requested afterwards to speed up reads.
Direct means that reads are not buffered in cache memory. Most file systems and many applications have their own cache and do not require caching data at the RAID controller level.
CachedBadBBU means that it's ok to use the controller cache even though no BBU (Battery Backup Unit) is available on the controller. This is ok for us since our server is attached with two PSUs and the datacenter utilizes a UPS as well.
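
If you want to double-check the syntax before touching the controller, you can let a small function assemble the command and only print it. This is just a sketch: the function name cfgldadd_cmd is made up, it hardcodes the options explained above (WB RA Direct CachedBadBBU) and the first adapter (-a0), and it does not run megacli itself.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the -CfgLdAdd command for a RAID level
# and a list of enclosure:slot pairs. It only prints, it changes nothing.
cfgldadd_cmd() {
    level=$1
    shift
    # Join the remaining arguments with commas (subshell keeps IFS local)
    disks=$(IFS=,; printf '%s' "$*")
    printf 'megacli -CfgLdAdd -r%s [%s] WB RA Direct CachedBadBBU -a0\n' "$level" "$disks"
}

# Usage:
#   cfgldadd_cmd 1 252:0 252:1
#   cfgldadd_cmd 1 252:2 252:3
```

The two usage lines print exactly the two commands shown above, so you can compare them before executing anything.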

After executing the commands for creating the RAID configs we can see the progress of the RAID rebuild with:

~$ megacli -PDRbld -ShowProg -PhysDrv [252:0,252:1,252:2,252:3] -a0

Device(Encl-252 Slot-0) is not in rebuild process
Device(Encl-252 Slot-1) is not in rebuild process
Device(Encl-252 Slot-2) is not in rebuild process
Device(Encl-252 Slot-3) is not in rebuild process

When you get this response above, the rebuild is already done.
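
If you want to script the waiting, you can count the lines that report "is not in rebuild process" and compare that with the number of disks you asked about. This is a minimal sketch that relies only on the message shown above; the function name rebuild_done is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: reads `megacli -PDRbld -ShowProg ...` output on stdin
# and succeeds only if exactly $1 disks report that they are not rebuilding.
rebuild_done() {
    expected=$1
    # grep -c exits non-zero when the count is 0, so ignore its exit status
    idle=$(grep -c 'is not in rebuild process' || true)
    [ "$idle" -eq "$expected" ]
}

# Usage with our four disks:
#   until megacli -PDRbld -ShowProg -PhysDrv [252:0,252:1,252:2,252:3] -a0 | rebuild_done 4; do
#       sleep 60
#   done
```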

Mark one RAID config as bootable

Since we want to boot from the first RAID-1 we have to mark it as bootable so the system knows where to check for a bootloader.

You can see if there is a bootable RAID-config with the first command and set this flag for our RAID with the second.

megacli -AdpBootDrive -get -a0 # Get bootable RAIDs
megacli -AdpBootDrive -set -L0 -a0 # Set our first RAID config as bootable

Let's install our system now

Now that we have the hardware part configured, let's continue with the Debian installation. For that we use Hetzner's installimage command in an automated way, so we don't have to manually edit the config which can be error-prone if you forget something.

For our system, we will separate the /boot partition and install / (root), /tmp, /var/log, /var/lib/postgresql and swap on separate logical volumes on the LVM. We also won't configure any software RAID (-r no). We will also use the SSH keys you have probably saved in the Hetzner Robot. If not, remove the -K /root/.ssh/robot_user_keys part.
We also tell installimage to format all disks automatically with -f yes and to install the grub bootloader with -b grub. The -a part tells installimage to do everything without user interaction. The only things you need to change are -n myhostname.com, which should be the FQDN of your new server, and -s de if you want a server language other than German.

installimage -a -n myhostname.com \
-b grub \
-r no \
-i /root/.oldroot/nfs/images/Debian-94-stretch-64-minimal.tar.gz \
-p /boot:ext4:512M,lvm:vg0:all \
-v vg0:swap:swap:swap:8G,vg0:root:/:ext4:16G,vg0:var-log:/var/log:ext4:8G,vg0:tmp:/tmp:ext4:8G,vg0:var-lib-postgresql:/var/lib/postgresql:ext4:150G \
-f yes -s de -K /root/.ssh/robot_user_keys

We will delete and recreate the LVM and all logical volumes again later, but we have to create them now as well. This way installimage writes the matching entries to the fstab and we don't have to re-add them later, because we will back up and restore the installed system in the next steps.
This detour is necessary because installimage is unable to encrypt anything before setting up the LVM.
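
Before you change the sizes in the -v option, it is worth checking that the logical volumes still fit on the RAID. The following sketch sums up the sizes from such a string. The function name lv_total_gib is made up for this example, and it assumes that every size is given in G like in the command above.

```shell
#!/bin/sh
# Hypothetical helper: sums the sizes of all logical volumes in an
# installimage -v string (format vg:name:mountpoint:fs:size, sizes in G).
lv_total_gib() {
    printf '%s\n' "$1" | tr ',' '\n' | awk -F: '
        { size = $5; sub(/G$/, "", size); total += size }
        END { print total }
    '
}

# Usage with the layout from this post:
#   lv_total_gib "vg0:swap:swap:swap:8G,vg0:root:/:ext4:16G,vg0:var-log:/var/log:ext4:8G,vg0:tmp:/tmp:ext4:8G,vg0:var-lib-postgresql:/var/lib/postgresql:ext4:150G"
```

For our layout this prints 190, which leaves some room on the roughly 278 GB of the first RAID-1 after the 512M /boot partition.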

When the installation is ready, reboot to your shiny new system.

Install needed tools for headless decryption at boot

After you have rebooted your system, update all of the installed packages and then also install all the tools needed for headless boot-time decryption.

apt update && apt-get -y upgrade
apt -y install busybox dropbear dropbear-initramfs

You will see the following warning, which is ok because we will fix that in a moment.

dropbear: WARNING: Invalid authorized_keys file, remote unlocking of cryptroot via SSH won't work!

Now reenable the Rescue-System and boot into it again!

Adding the LUKS devices for both RAIDs

Now we will add two LUKS devices for our full-disk-encryption (except boot) on both RAIDs. For that, we will first make a full backup of our newly installed Debian.

mkdir /rootbackup
mount /dev/vg0/root /mnt
mount /dev/vg0/var-log /mnt/var/log
mount /dev/vg0/tmp /mnt/tmp
mount /dev/vg0/var-lib-postgresql /mnt/var/lib/postgresql
rsync -a /mnt/ /rootbackup/

The rsync will take about 20-30 seconds. Now unmount your system and disable the LVM for removal.

umount /mnt/var/lib/postgresql && umount /mnt/tmp && umount /mnt/var/log && umount /mnt
vgchange -a n vg0 # Otherwise recreation of LUKS on sda2 will fail later

In the next steps, we will identify the current LVM partition (should be sda2), delete it and recreate it so we can add the LUKS device on top of it.

parted /dev/sda
print free # check which partition contains the LVM. Mind the "Start" value of that partition which should be 538MB
rm 2 # Remove the partition
mkpart primary 538MB -1s # Change the "Start" value of sda2 if it differs
quit

Adding the first LUKS device for our main system

First, generate a very strong password for your LUKS device, either externally or on the server using pwgen 64 1.

Now create the first LUKS device with:

cryptsetup --cipher aes-xts-plain64 --key-size 512 --hash sha256 --iter-time 6000 luksFormat /dev/sda2

Type YES in uppercase letters and then type in the password. We use the newer and safer AES-XTS cipher and a key size of 512 bits which is effectively AES-256 because AES-XTS splits the key in half. With the default of 256 bits we would effectively have AES-128. We also increased the default iter-time from 2 to 6 seconds to make it harder for an attacker to find the password. It makes the initial decryption at boot 4 seconds slower but that's ok for us.

You can check your first LUKS device now with cryptsetup luksDump /dev/sda2.
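
The key-size arithmetic from above can also be read off the luksDump output. The following is only a sketch: the function name effective_aes_bits is made up, and it assumes the LUKS1 header fields "Cipher mode:" and "MK bits:" as printed by the cryptsetup version in Debian 9. It halves the master key size when the cipher mode is XTS, because XTS splits the key into two halves.

```shell
#!/bin/sh
# Hypothetical helper: reads `cryptsetup luksDump <device>` output on stdin
# and prints the effective AES key size in bits.
effective_aes_bits() {
    awk '
        /^Cipher mode:/ { mode = $3 }
        /^MK bits:/     { bits = $3 }
        END {
            # XTS uses one half of the key for encryption, the other as tweak key
            if (mode ~ /^xts/) bits = bits / 2
            print bits
        }
    '
}

# Usage:
#   cryptsetup luksDump /dev/sda2 | effective_aes_bits
```

With --key-size 512 and aes-xts-plain64 this should print 256.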

Adding the second LUKS device

We will store our PostgreSQL Backups on the 3TB RAID-1, so let us create a LUKS device and an LVM on it as well.

parted /dev/sdb
mklabel gpt # For disks larger than 2TB! Otherwise mklabel msdos.
mkpart primary 2048s 100%

You may need to type Yes again to overwrite a potentially existing disk label when running mklabel. That's ok.
You may wonder why I started the partition at sector 2048 in the mkpart command. That's because the GPT partition table takes up about 17.4KB at the beginning of the disk. If you try to create the partition from 17.5KB to the end - which would work by the way - parted will complain that the partition is not properly aligned with the disk boundaries, and this may affect performance. Since we want performance, we create the partition perfectly aligned. If you're interested in the details about disk boundaries, read more about that here.
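
The numbers behind this are easy to check. A GPT with the usual 128 partition entries occupies the first 34 sectors of the disk, and sector 2048 is exactly 1 MiB when the sectors are 512 bytes in size. The sketch below assumes 512-byte sectors and a 1 MiB alignment boundary as defaults; both function names are made up for this example.

```shell
#!/bin/sh
# Hypothetical helpers for the alignment arithmetic.

# Bytes used by protective MBR + GPT header + 128 partition entries
# (34 sectors); the sector size defaults to 512 bytes.
gpt_header_bytes() {
    echo $(( 34 * ${1:-512} ))
}

# Succeeds if a partition starting at sector $1 begins on a multiple of
# the alignment boundary (default: 1 MiB).
is_aligned() {
    start_sector=$1
    sector_bytes=${2:-512}
    align_bytes=${3:-1048576}
    [ $(( start_sector * sector_bytes % align_bytes )) -eq 0 ]
}

# Usage:
#   gpt_header_bytes                       # 17408 bytes, about 17.4KB
#   is_aligned 2048 && echo "aligned"
#   is_aligned 34   || echo "not aligned"
```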

Now let's run cryptsetup again to create the second LUKS device.

cryptsetup --cipher aes-xts-plain64 --key-size 512 --hash sha256 --iter-time 6000 luksFormat /dev/sdb1

You can use another password here or the same. That's up to you.

Restore our saved Debian installation

Now we will restore our backed-up Debian installation to the new LUKS device on sda2. Beforehand, we have to recreate the LVM so it looks like the one we created at installation time.

cryptsetup luksOpen /dev/sda2 cryptroot # Open LUKS device
pvcreate /dev/mapper/cryptroot # Create physical LVM volume on LUKS
vgcreate vg0 /dev/mapper/cryptroot # Create our volume group again
# Create our logical volumes now
lvcreate -L 8G -n swap vg0
lvcreate -L 16G -n root vg0
lvcreate -L 8G -n var-log vg0
lvcreate -L 8G -n tmp vg0
lvcreate -L 150G -n var-lib-postgresql vg0
# Format the volumes and create swap again
mkfs.ext4 /dev/vg0/root
mkfs.ext4 /dev/vg0/var-log
mkfs.ext4 /dev/vg0/tmp
mkfs.ext4 /dev/vg0/var-lib-postgresql
mkswap /dev/vg0/swap

Adding the second LUKS to the party

Now let's add the second LUKS here as well, create the LVM for it and format it.

cryptsetup luksOpen /dev/sdb1 cryptmount
pvcreate /dev/mapper/cryptmount
vgcreate vg1 /dev/mapper/cryptmount
lvcreate -l 100%FREE -n var-lib-barman vg1
mkfs.ext4 /dev/vg1/var-lib-barman

Restore system and chroot into it for configuration

Now we can remount our newly created partitions and copy back our system, which will then reside on the LUKS device. After that we will also configure the second LUKS device in the system and the headless boot-time decryption part.

mount /dev/vg0/root /mnt
mkdir -p /mnt/var/log
mount /dev/vg0/var-log /mnt/var/log
mkdir /mnt/tmp
mount /dev/vg0/tmp /mnt/tmp
mkdir -p /mnt/var/lib/postgresql
mount /dev/vg0/var-lib-postgresql /mnt/var/lib/postgresql
rsync -a /rootbackup/ /mnt/

Now we will also add our boot partition and chroot into the system to configure it.

mount /dev/sda1 /mnt/boot
mount --bind /dev /mnt/dev
mount --bind /sys /mnt/sys
mount --bind /proc /mnt/proc
chroot /mnt

Now we're chrooted into our installed system and we will continue with adding the second RAID and the configuration of the boot-time decryption. Let's start by adding the second RAID.

mkdir /var/lib/barman # Create new mountpoint
mount /dev/vg1/var-lib-barman /var/lib/barman
# Open /etc/fstab and add the new mount point at the end
vim /etc/fstab
/dev/vg1/var-lib-barman  /var/lib/barman  ext4  defaults 0 0

Configuring LUKS for boot-time decryption

Open up /etc/crypttab and add both LUKS devices there for boot-time decryption. The file should look like this:

# <target name>	<source device>		<key file>	<options>
cryptroot /dev/sda2 none luks,initramfs
cryptmount /dev/sdb1 none luks,initramfs

Keep in mind that every entry needs a distinct name, followed by the location of the LUKS device. The luks option says that these are LUKS devices and not plain dm-crypt ones. The initramfs option makes sure the device is set up in the initramfs stage, so the system only continues booting after all those devices are unlocked.
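
To see at a glance which entries will be handled at boot time, you can list all targets that carry the initramfs option. This is a small sketch with a made-up function name (initramfs_targets). It assumes the four-column layout shown above and skips comment lines.

```shell
#!/bin/sh
# Hypothetical helper: prints the target name of every crypttab entry
# that has "initramfs" in its options column.
initramfs_targets() {
    awk '
        $1 ~ /^#/ || NF < 4 { next }          # skip comments and short lines
        {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "initramfs") { print $1; break }
        }
    ' "${1:-/etc/crypttab}"
}

# Usage (inside the chroot):
#   initramfs_targets                 # reads /etc/crypttab
#   initramfs_targets /path/to/file   # or any other file
```

With the file from above it should print cryptroot and cryptmount, one per line.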

Now let's create /etc/rc.local and add some content. This will ensure proper network configuration after unlocking and booting the main system.

#!/bin/sh
# Content of /etc/rc.local (make it executable with: chmod +x /etc/rc.local)
/sbin/ifdown --force eth0
/sbin/ifup --force eth0
exit 0

Fixing the cryptroot-unlock pitfall at boot-time

For headless boot-time decryption you basically have two methods available for unlocking your system.
One is to pipe the password directly into a FIFO, like this: echo passwd > /lib/cryptsetup/passfifo. This only works for a single LUKS device.
The other is cryptroot-unlock. This is a shell script which normally reads our crypttab and should unlock all LUKS devices listed there that carry the initramfs option, but due to a bug this doesn't work properly in Debian Stretch.

But we can simply use a newer fixed version of this script and replace the default one with that. Simply run this command below to download the fixed script and directly replace the old one.

curl https://salsa.debian.org/mjg59-guest/cryptsetup/raw/e0ad47dc25281372c01798dce41a8786f052057c/debian/initramfs/cryptroot-unlock -o /usr/share/cryptsetup/initramfs/bin/cryptroot-unlock

Adding your RSA public key to dropbear for remote ssh login

Now we will add your RSA public key for SSH to dropbear-initramfs so you're able to log in via SSH with your private key to unlock your server. Since dropbear in Debian 9 is a bit older, it only supports RSA keys and no ED25519 keys yet.
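
If you keep several key types in one authorized_keys file, you can filter out everything that is not an RSA key before handing the file to dropbear. This is only a sketch: the function name rsa_only is made up, and it simply keeps every line that contains the key type ssh-rsa as a separate word.

```shell
#!/bin/sh
# Hypothetical helper: reads authorized_keys lines on stdin and prints only
# those that contain an RSA public key (key type "ssh-rsa").
rsa_only() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i == "ssh-rsa") { print; next }
    }'
}

# Usage (inside the chroot):
#   rsa_only < /root/.ssh/authorized_keys > /etc/dropbear-initramfs/authorized_keys
```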

# On your local machine: print your public key and copy its content
cat ~/.ssh/id_rsa.pub # For macOS you can add "|pbcopy" to directly copy it to your clipboard

# On the server
vim /etc/dropbear-initramfs/authorized_keys
# Paste the key here and save the file
# You can add as many keys for multiple persons/machines as you like

Updating the initramfs for the next boot

Now we're almost done and we can update the initramfs for the next boot and update grub as well as install it on the first RAID again.

update-initramfs -u -k all
update-grub
grub-install /dev/sda

Cleanup and reboot to our encrypted system for the first time

Let us clean up our environment and reboot to our shiny new system right away.

exit # Exit Chroot environment
umount /mnt/var/lib/barman/
umount /mnt/var/lib/postgresql
umount /mnt/var/log
umount /mnt/tmp
umount /mnt/boot
umount /mnt/dev
umount /mnt/sys
umount /mnt/proc
umount /mnt
sync
shutdown -r now

Log in to dropbear and unlock your system

After the reboot, log in to your system again. Now you can simply run cryptroot-unlock. It will ask you for the password for each entry from /etc/crypttab and will continue to boot to the main system afterwards. It looks like this.

~ # cryptroot-unlock
Please unlock disk cryptroot (/dev/sda2):
cryptsetup: cryptroot set up successfully
Please unlock disk cryptmount (/dev/sdb1):
cryptsetup: cryptmount set up successfully
# ...continues with booting here!

Bonus - Disable USB devices

Because we want to strengthen our server against physical attacks and we will probably never need USB on this machine - unless you need KVM support from Hetzner! - we can block the USB storage driver with an install rule in a modprobe configuration file. Note that the line below only covers the usb-storage module and not every USB driver.

vim /etc/modprobe.d/blacklist-usb.conf
# Add the following line there
install usb-storage /bin/false

Reboot the machine afterwards so that this change takes effect. When you now try to load this module, the following error should occur. USB storage devices are now disabled. Of course you can always revert that change as root or from the rescue system if needed.

~ # modprobe usb-storage
modprobe: ERROR: ../libkmod/libkmod-module.c:977 command_do() Error running install command for usb_storage
modprobe: ERROR: could not insert 'usb_storage': Operation not permitted
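
If you don't want to trigger the error just to verify the setting, you can also look for the install rule in the configuration files. The following is a minimal sketch with a made-up function name (module_blocked). It only recognizes the "install <module> /bin/false" form used above and treats dashes and underscores in module names as equal, like modprobe does.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if one of the given modprobe config files
# contains "install <module> /bin/false" for the module named in $1.
module_blocked() {
    mod=$(printf '%s\n' "$1" | sed 's/-/_/g')
    shift
    awk -v mod="$mod" '
        $1 == "install" && $3 == "/bin/false" {
            name = $2
            gsub(/-/, "_", name)      # usb-storage and usb_storage are the same
            if (name == mod) found = 1
        }
        END { exit found ? 0 : 1 }
    ' "$@"
}

# Usage:
#   module_blocked usb-storage /etc/modprobe.d/*.conf && echo "usb-storage is blocked"
```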

Bonus #2 - Get megacli for your system

Since you only have megacli available in the Rescue system, you can also get it quickly for your main system afterwards. In the E-Mail you got from Hetzner there is also a login included for their Download-Area. There you can download the MegaCLI package to your temp folder, unpack it and install it for your daily use. Simply follow the steps below. Replace the <user>:<password> part in the URL from the E-Mail!

mkdir /tmp/megacli
cd /tmp/megacli
wget https://<user>:<password>@download.hetzner.de/tools/LSI/tools/MegaCLI/8.07.14_MegaCLI.zip
apt-get install unzip rpm2cpio cpio # needed to unpack linux rpm package in debian
unzip 8.07.14_MegaCLI.zip
rpm2cpio Linux/MegaCli-8.07.14-1.noarch.rpm | cpio -idmv # Unpacks the rpm package
mv opt/MegaRAID/MegaCli/MegaCli64 /usr/local/bin/megacli

Now you can use the megacli command from all over the system.

Conclusion

I hope you find this tutorial useful, because I also had some trouble finding a tutorial for my special use case and stumbled over some bugs along the way.

Acknowledgements

I also want to acknowledge the work of others where I found some ideas or help for this tutorial. I'll simply link their work here for that purpose.