The Go Challenge - a fun way to learn and improve

There are many ways to learn a language and to improve your skills in it.
Some gophers came up with the Go Challenge, a coding contest in which you have to solve an exercise using Go and send in your solution. All solutions will be reviewed and the best can win some nice prices. Although the prices are interesting, they are not what motivated me to participate.

I am working as a Ruby developer thus practising Go is more of a hobby. Finding suitable projects that do not die half way through due to the lack of time and/or motivation is not always that easy (when spending most of the day on writing code, just sitting back in the evening and weekends is actually pretty nice). Finding a project that fits a specific language is even harder.

The first Go Challenge provided a perfect little project for me. It was easily solvable within a couple of hours (depending on the individual skill set, of course) and I got the satisfaction of having a finished project. Something I do not have to spend any more time on once it is done. Any software developer will probably agree: That rarely happens, you are never done, there is always something else that can be added, improved or updated.

In hindsight it took me actually pretty long to solve it - as always when you are done and understood how it works.
From an engineering point of view the task was actually pretty simple: load binary data from a file, interpret it and load it into an object of type Pattern. In order to check that the data was loaded correctly the object also required a way to represent it as a string for human interpretation (in Go generally done by implementing the Stringer interface).
Sounds easy? Well, the coding part actually was not really hard - at least not in getting a working solution - optimization is another story. I for one spent most of my time trying to figure out the binary format.
In Ruby there is a Gem for almost everything. You never have to think all that often about what the data looks like in memory and at least for me, mostly working with microservices and JSON, a binary format would not be the first thing that comes to mind when thinking about saving data to a file.
Fortunately the challenge has a nicely written scenario which explains most of the background required to get a feeling about how the data is structured. It actually lists all the elements that have to be found in the data.
Furthermore some hints are given as to where to get started if you have not worked with defining your own decoding and encoding methods. Aside from the encoding package in general with its BinaryUnmarshaler interface what helped me most was probably looking at an actual implementation of the interface in the time package.

A few test cases and example files were provided so you could check whether your code works on a limited set of data. I really liked this because it showed me when I was done. Done as in: passing the test cases while requiring some refactoring.

How to interpret the data

Still: how do you turn

53 50 4c 49 43 45 00 00 00 00 00 00 00 c5 30 2e 38 30 38 2d 61 6c
70 68 61 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 f0 42 00 00 00 00 04 6b 69 63 6b 01 00 00 00 01 00 00
00 01 00 00 00 01 00 00 00 01 00 00 00 05 73 6e 61 72 65 00 00 00
00 01 00 00 00 00 00 00 00 01 00 00 00 02 00 00 00 04 63 6c 61 70
00 00 00 00 01 00 01 00 00 00 00 00 00 00 00 00 03 00 00 00 07 68
68 2d 6f 70 65 6e 00 00 01 00 00 00 01 00 01 00 01 00 00 00 01 00
04 00 00 00 08 68 68 2d 63 6c 6f 73 65 01 00 00 00 01 00 00 00 00
00 00 00 01 00 00 01 05 00 00 00 07 63 6f 77 62 65 6c 6c 00 00 00
00 00 00 00 00 00 00 01 00 00 00 00 00

into

Saved with HW Version: 0.808-alpha
Tempo: 120
(0) kick     |x---|x---|x---|x---|
(1) snare    |----|x---|----|x---|
(2) clap     |----|x-x-|----|----|
(3) hh-open  |--x-|--x-|x-x-|--x-|
(4) hh-close |x---|x---|----|x--x|
(5) cowbell  |----|----|--x-|----|

(replaced tabs after track names with spaces for better readability)

The key is obviously understanding the data. It is actually suggested in the challenge to use a hex editor which will at least visualize the strings hidden in the data. I am going to use hex values for the individual bytes as a representation in this post.

The first 6 bytes (53 50 4c 49 43 45) are actually just the string SPLICE in binary. The author of the challenge, Matt Aimonetti, is CTO and co-founder of Splice - I probably would have placed something like that as well.
These 6 bytes do not contain any data that has to be unmarshaled but it does allow for a basic check on whether just random byte data was provided.

The next 8 bytes (00 00 00 00 00 00 00 c5) are a BigEndian.Uint64 which limits the length of the total data set (necessary for the fifth example file which has some additional data in the end). In this case: 197 (bytes).
Using all 8 bytes is something I only realized after I sent in my solution (about 5 minutes later...). In the examples using the last of those bytes is actually sufficient - it does limit the amount of tracks greatly though. It did not come to mind at first since I rarely work with binary (as already mentioned) and I already used LittleEndian.Uint32 in other cases and neither that nor the 64bit version worked. I also figured: why use little and big endian values? A friend later commented that it was probably related to efficiency since depending on the specific usage such a combination might be better than sticking to just one. (He even mentioned a deflate implementation he knew as an example.)

The 32 bytes following (30 2e 38 30 38 2d 61 6c 70 68 61 00 00 ...) are the version string including some padding for possible longer versions. In this case: 0.808-alpha.

The 4 bytes (00 00 f0 42) after that are a Float32 which you can read using LittleEndian.Uint32 and math.Float32frombits.

Everything afterwards is a track. Tracks can have different lengths in data requirement since the length depends on the name of the track.

Track interpretation

I am going to use the first track as an example.

00 00 00 00 04 6b 69 63 6b 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00

Each track starts with 4 bytes (00 00 00 00) representing the track number. In this case obviously this is 0 no matter how you read it. For the second track (01 00 00 00) it would be 1 read as a LittleEndian.Uint32.

The byte (04) after that is the length for the track name. A simple Uint8 - with the value 4.

With the length indicator we now know how many bytes have to be read for the name: 4 (6b 69 63 6b) which represent kick in binary.

And the final 16 bytes are just 0 and 1 values which stand for whether an instrument should be played at that point or not. A 1 is an x and a 0 a - in the representation.

The tracks are directly appended and you should keep reading tracks until you reach the end or the value found in bytes 7 to 14 - whichever comes first. If the file is shorter than said value though, data might be missing.

Some implementation details

I implemented the UnmarshalBinary function to load the data into the object. So that the DecodeFile function only had to open the file, read the data into a byte slice and pass it to the function.

func DecodeFile(path string) (*Pattern, error) {
	p := &Pattern{}
	var file *os.File

	file, err := os.Open(path)
	if err != nil {
		return nil, err
	}

	r := bufio.NewReader(file)

	var data []byte
	if data, err = ioutil.ReadAll(r); err != nil {
		return nil, err
	}
	if err := p.UnmarshalBinary(data); err != nil {
		return nil, err
	}

	return p, nil
}

A more efficient solution in terms of memory and cpu usage might have been to not read everything into a slice at once but only read as needed.

The UnmarshalBinary function then split up the data as described above and unmarshaled the data into the fields of the Pattern:

type Pattern struct {
	Version string
	Tempo   float32
	Tracks  []Track
}

type Track struct {
	ID    int32
	Name  string
	Steps []byte
}

Once the data is properly unmarshaled all that is left todo is to call the String function and compare it to the expected value.

All solutions have been released on Github and can be found here. My solution in particular can be found here.

I am fairly certain that my solution is not even close to being the most efficient and the most well-structured in regards to code placement.
One of the evaluators, Dominik Honnef, wrote a blog post about his experience reviewing the solutions and some mistakes that were made regularly. I will not go into those here other than saying: while reading it I certainly came across a few things that could have been better.

Conclusion

All in all I can say: I enjoyed the challenge.

  • I learned a few new things about Go
  • It gave me some more motiviation to read the standard library extensively to at least know where to look for specific implementation examples
  • It showed me that I probably should play around more with binary data
  • Having the other solutions to compare to is a nice way to get some ideas for specific improvements

I am certainly looking forward to the results of this one and the start of the next Go Challenge which is supposed to start in April.