gokeepasslib - Reading a Keepass 2 file with Go

One can certainly argue about the security of Keepass. I for one am currently using it to store my passwords, mainly because I do not necessarily trust any web service to handle my passwords. At least with Keepass I can make sure that my file only goes to places where I think it should go.

(Un)Fortunately I have a couple of different devices running different operating systems. I have a Macbook at work, a Windows PC at home and a Linux notebook for my private coding projects.

If you know Keepass 2, you know that it is written in C# on .NET. Running it on an OS not called Windows is somewhat annoying. You can get it to work using Mono on OS X, but I had lots of issues with that approach (it kept crashing...).
In order to still be able to access my Keepass file, I am using keepass-node. It works well for reading files. Saving files is another matter: it broke my file, making it unreadable for the Windows version of Keepass 2. It is far from perfect, but it is sufficient.

The thing that annoys me about keepass-node, though, is that I always have to open it in a browser, and a lot of stuff has to be installed to get it running in the first place. It would be great if it only required a single binary; a CLI would be fine for me.
With that in mind I figured: why not build it in Go?

I started to look around for a proper specification of the file format, but either I did not look hard enough or there just isn't any. Thus: use the source! - so I did.
I looked at keepass.io, the library used by keepass-node, as I knew that it works and I was not eager to look at the C# source on SourceForge.

File format

An encrypted keepass file consists of 3 parts:

  • A signature part with general information about the file
  • A header part containing all kinds of information necessary for decrypting the file
  • The actual (encrypted) content

Signature

The signature consists of 12 bytes, 4 bytes each for a BaseSignature, a VersionSignature and a FileVersion (in that order). The BaseSignature should match 0x03, 0xd9, 0xa2, 0x9a and the VersionSignature should match 0x67, 0xfb, 0x4b, 0xb5.
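
A minimal sketch of this check in Go could look like the following. The function and variable names are made up for illustration, and reading FileVersion as a little endian uint32 is an assumption, since only its size is given above.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
)

// Signature bytes in file order, as listed in the text.
var (
	baseSignature    = []byte{0x03, 0xd9, 0xa2, 0x9a}
	versionSignature = []byte{0x67, 0xfb, 0x4b, 0xb5}
)

// checkSignature validates the first 12 bytes of a file and returns the
// FileVersion field. Decoding FileVersion as a little endian uint32 is an
// assumption of this sketch.
func checkSignature(data []byte) (uint32, error) {
	if len(data) < 12 {
		return 0, errors.New("file too short for a signature")
	}
	if !bytes.Equal(data[0:4], baseSignature) {
		return 0, errors.New("base signature does not match")
	}
	if !bytes.Equal(data[4:8], versionSignature) {
		return 0, errors.New("version signature does not match")
	}
	return binary.LittleEndian.Uint32(data[8:12]), nil
}
```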

There are 10 (possible) headers in a Keepass file:

  1. Comment
  2. CipherID
  3. CompressionFlags
  4. MasterSeed
  5. TransformSeed
  6. TransformRounds
  7. EncryptionIV
  8. ProtectedStreamKey
  9. StreamStartBytes
  10. InnerRandomStreamID

Each header consists of 1 byte containing the ID (same value as in the list above), a 2 byte little endian uint16 containing the length of the header content, and then that many bytes of actual content.
Mostly this content can just be taken directly as binary.
The CompressionFlags header was parsed as a little endian uint32 in keepass.io, so I figured I would do the same - interestingly I did not have to use that value though.
The TransformRounds header consists of 8 bytes. You can parse it as a little endian uint64.
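
The ID/length/content layout described above can be read with a few lines of Go. This is only a sketch: the type and function names are hypothetical, and how the end of the header section is detected is not covered here.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// header is one ID/length/content triple.
type header struct {
	ID   byte
	Data []byte
}

// readHeader parses a single header from the start of buf and returns it
// together with the number of bytes it occupied.
func readHeader(buf []byte) (header, int, error) {
	if len(buf) < 3 {
		return header{}, 0, errors.New("truncated header")
	}
	length := int(binary.LittleEndian.Uint16(buf[1:3]))
	if len(buf) < 3+length {
		return header{}, 0, errors.New("truncated header content")
	}
	return header{ID: buf[0], Data: buf[3 : 3+length]}, 3 + length, nil
}

// transformRounds decodes the 8 byte content of a TransformRounds header
// as a little endian uint64.
func transformRounds(h header) (uint64, error) {
	if len(h.Data) != 8 {
		return 0, errors.New("TransformRounds must be 8 bytes")
	}
	return binary.LittleEndian.Uint64(h.Data), nil
}
```

Calling readHeader repeatedly and advancing the buffer by the returned byte count walks through all headers.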

Encrypted Data

The data itself is AES-256-CBC encrypted. For decryption a key and an IV are required. The IV is the content of the EncryptionIV header. The key itself has to be derived first.
The Windows version of Keepass 2 provides 3 ways to secure your file: a password, a key file and the Windows user account. For portability the Windows user account is not all that useful, thus it will be ignored.
The key file is either directly regarded as binary key or as an XML file containing <data></data> which encapsulates the key.
This content or the password provided will be hashed twice with SHA-256.
Then the two 16 byte blocks of this 32 byte hash have to be transformed: each block is encrypted with AES-256-ECB, TransformRounds times in a row.
Since Go does not directly support ECB (meaning there is no function simply providing it), a bit of background knowledge helps. ECB uses the same block cipher that CBC uses, with the difference that CBC requires an initialization vector and feeds the ciphertext of each block into the encryption of the next one. Using an all-zero initialization vector and encrypting only a single block therefore emulates the behaviour of ECB. Note that ECB is NOT suited for encrypting larger amounts of data, as identical plaintext blocks produce identical ciphertext blocks, which makes it easy to attack. When reading an existing format, though, there is no choice but to use it.
The key derivation works by using the TransformSeed as the encryption key, an empty (all zeros) IV and the individual 16 byte blocks of the hash as plaintext.
The result of these transformation rounds is hashed once with SHA-256, then appended to the MasterSeed, and the combination is hashed again using SHA-256. The result is the key which is required to decrypt the data.

As already mentioned, the content is encrypted with AES in CBC mode using a 256 bit key, so simply decrypting it with our derived key and the EncryptionIV works.
The resulting content contains bytes for IDs and hashes to allow integrity checks (the hashes are computed over the plaintext, i.e. hash-then-encrypt).
The StreamStartBytes header can be compared with the same number of bytes at the start of the decrypted data to make sure the decryption worked.

After the StreamStartBytes, the first 4 bytes of the decrypted data are the index of the block (so theoretically the blocks could be stored in random order and would have to be sorted now - fortunately it worked ignoring this). The next 32 bytes are a hash of the content. The length of the content in bytes is defined by the next 4 bytes as a little endian uint32, followed by the content itself.
There can be multiple such blocks until the end of the file.
The SHA-256 of each block's content should match its 32 byte hash value. If all hashes match, the contents can be appended to each other.

The result of this is a gzip-ed block of bytes. Simply unzipping it is sufficient.
The unzipped data is now simple XML and can be parsed as such. Fortunately the XML encoding package in Go works just as well as the JSON encoding package for unmarshalling data into a struct.
An example of the decrypted XML can be found here.

Unfortunately the passwords inside the XML are still protected using Salsa20 as a stream cipher - with a custom implementation. Since a stream cipher produces one continuous keystream, it only works by unlocking ALL protected password entries in the correct order. Basically this order is: top to bottom in the file (depth first search) on anything that has an entry with a password. This includes entries for the password history.

Conclusion

In hindsight I have to say that Go was a great choice for this. I am doing a lot of Ruby at work, which probably would not have been as nice for this task. Also, I would be back to having dependencies (Ruby) that need to be installed before being able to use it.

Furthermore it was great that everything can be (easily) done using the standard library. It took me a while to figure out how to do ECB though as I am not that proficient in anything related to crypto.

I released the code for it on GitHub. Check it out and open an issue if you notice any major mistakes I made in handling crypto, or send me a pull request.
For now it is a rather simple port of keepass.io and does not take clearing memory into account.
I reached my goal (being able to read a Keepass file) with this, and thus I am not sure how much work I will put into it in the near future, especially since I am exploring alternatives to Keepass.

In an earlier post I wrote about how the golang challenge is a good way to learn new things around Go in a small task. My participation in the first challenge did help me with this.

Addition

In case you are wondering why not all headers were mentioned in this post: not all of them are actually necessary. CipherID for example does not do anything at the moment, as there is only one option.