Reproducible builds are (not) sufficient

Reproducible builds are a set of practices under which a distributor of FOSS software is encouraged to publish the build flags, environment details, hardware state and everything else needed to consistently recreate, bit for bit, an identical copy of the distributed software from the publicly available source.

At the time of writing, 94% of the 24821 packages in the Debian main repository were claimed to be reproducible. This raises a question: does the mere fact that a shipped build is reproducible provide at least some extra reliability by itself, or is it simply a clever trick that substitutes trust in a distribution maintainer with exactly the same amount of trust in an assumed, depersonalized third-party auditor who is supposed to verify the maintainer's work?

Let's simulate some thought process on this topic.

Why does it matter

The infrastructure built around free software distribution naturally caters to high standards of transparency and reliability; ideally, every user should be reasonably sure that the software they run will never do anything unexpected without being explicitly allowed to. To achieve this level of trustworthiness, projects like Debian have developed requirements for the software they include in their main repositories and ship by default.

First of all, the entire source code of the software should be publicly available, to make an independent audit of its behavior as easy as possible. This requirement guarantees that it is at least theoretically possible to reveal any unwanted hidden instruction before it is actually executed by a real physical CPU.

Moreover, it must be legal to distribute both the original and modified versions of the program, so that the distribution author can legally remove the unwanted behavior from the code without waiting for the original author's explicit permission (or make other modifications: fix security vulnerabilities, add features, etc., to improve the end-user experience).

These rules are good per se, but none of them can guarantee that the distribution maintainer is unable to slip some unwanted instruction into the final binary right before producing the shipped package. In this scenario, whether we like it or not, the user is obliged to trust either the original author or the distribution maintainer, without being able to fully examine the entire chain of delivery. To fix this issue, the concept of reproducible builds was brought onto the stage.

A little remarqué

Similarly to the assertion that "no one reads the source code, therefore open source is pointless", one could say something like: "since bitwise reproduction of an entire distribution on a consumer-grade machine is a far-from-trivial task, and nobody besides qualified specialists is able to get their head around the idea, package reproducibility doesn't provide any additional layer of trustworthiness, because one way or another the consumer still has to trust some kind of qualified professional".

However, to be almost 100% certain that everything is done right, it isn't necessary to always verify every single corner.

They did the math

At the time of writing, the Debian stable branch comes with 24821 packages available in the default repositories. According to statistics gathered by the popularity-contest package, during this week Debian stable was launched at least once on at least 62909 machines. The anonymous survey is opt-in and must be explicitly enabled during installation, so the actual number of users is likely much higher.

For the sake of simplicity, let's assume the user doesn't run any of the 5.2% of packages marked unreproducible, and restricts themselves to the remaining 23347. Let's also assume exactly one of them contains a backdoor, or some other kind of unexpected code, so the hash sum of the shipped binary doesn't match the one the end user gets when reproducing the package on a personal machine.

According to a simplified probability formula for the Bernoulli trial case, the probability that one person will stumble upon the incorrectly built package while checking 3 randomly chosen packages is about 3/23347, or roughly 0.013%.

Not that promising. Let's assume, then, that packages were verified this way (3 packages per person) by 100 independent people instead. The new probability follows from the classical probability multiplication rule: it is one minus the probability that all 100 of them miss the bad package, 1 - (1 - 1.285e-4)^100, or roughly 1.28%.

Here the number in scientific notation is the more precise value that was earlier rounded to 0.013%. Here's the relation between the number of people who independently verified 3 packages each and the probability of stumbling upon the error, visualized:

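# Probability that one person checking 3 packages misses the bad one
res <- 0.9998714928250161
people <- seq(1, 62909)
# Probability that at least one of n independent checkers finds it
df <- data.frame(people = people, probability = 1 - res^people)
plot(df, type = "l")
grid(10, 10, lwd = 1)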

As can be concluded from the above, at around 30,000 involved users we can be fairly confident that a binary signature mismatch, if there is one, will be revealed by someone.
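The numbers above can be cross-checked outside R. Below is a minimal Python sketch, assuming the single-person probability is exactly k/N (k distinct packages drawn at random from N, one of them bad). The function names are made up for illustration, and the constant may differ in the last digits from the one hard-coded in the R snippet.

```python
import math

N_PACKAGES = 23347   # reproducible packages, exactly one assumed to be bad
PER_PERSON = 3       # packages each person verifies


def p_single(k=PER_PERSON, n=N_PACKAGES):
    """Chance that one person hits the bad package among k distinct picks."""
    return k / n


def p_crowd(people, k=PER_PERSON, n=N_PACKAGES):
    """Chance that at least one of `people` independent checkers hits it."""
    return 1.0 - (1.0 - p_single(k, n)) ** people


def people_needed(target, k=PER_PERSON, n=N_PACKAGES):
    """Smallest crowd size whose detection probability reaches `target`."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_single(k, n)))


if __name__ == "__main__":
    for crowd in (1, 100, 1000, 30000, 62909):
        print(f"{crowd:>6} people: {p_crowd(crowd):.4%}")
    print("people needed for 95%:", people_needed(0.95))
```

Running the script prints the detection probability for a few crowd sizes, which makes it easy to see where the curve flattens out.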

But what if...

But what if every person bothered to check more than 3 packages? It is possible to compute a complete sheet of probabilities for every combination of the variable factors. Here's how to make a table of the probability of finding the broken package as a function of the number of packages verified by a single person:

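import math

N = 23347  # reproducible packages, exactly one of which is assumed bad
for i in range(3, N):
    # number of ways to pick i packages out of N (math.comb needs Python 3.8+)
    a = math.comb(N, i)
    # number of those picks that include the bad package
    b = math.comb(N - 1, i - 1)
    # exact integer ratio; simplifies to i / N
    print(b / a)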

As expected, the output graph looks like a straight line from zero to 100% probability, because the ratio simplifies to i/23347.

Wouldn't it be cool, then, to combine the graphs mentioned above into one beautiful and complete visualization? The probability that X independent people all miss the broken package equals the probability of missing it by oneself raised to the power X, and the probability that at least one of them finds it is one minus that, so it is easy to create a combined table.

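library(readr)
# Single column X1: the single-person probabilities, one per row
txt <- read_csv("~/path/to/file.txt", col_names = FALSE)
# Column i: probability that at least one of i people finds the bad package
for (i in seq(2, 62909)) { txt[[i]] <- 1 - (1 - txt$X1)^i }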

And draw beautiful 3D visualization.

persp(as.matrix(txt))

This, however, requires an enormous amount of RAM and is mostly pointless, because every value beyond some point, say [1000,1000], will in practice be indistinguishable from 1. Let's just throw part of the table away. The following code:

mx <- head(txt[, 1], 1000)  # keep only the first 1000 rows of the single-person column
for (i in seq(2, 1000)) { mx[[i]] <- 1 - (1 - mx$X1)^i }

calculates only the values from point [1,1] to [1000,1000]. Now we are ready to visualize.

library(rgl)
# Plot the truncated table (mx), not the full one
persp3d(as.matrix(mx), col = "skyblue")

As the plot suggests, the probability of finding the broken package in a heap of about 23000 becomes considerably high only when roughly 800 people are ready to verify about 800 packages each. A plot limited to the region around [100,100] looks as follows.

Which means that a small, distributed group of independent people cannot easily achieve bulletproof reliability of the system, even if all the information needed to reproduce binary builds with bitwise precision is publicly available.
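To read individual points off that surface without holding the whole table in memory, the same formula can be evaluated on demand. A small Python sketch, again assuming one bad package among 23347 and independent checkers who each pick distinct packages uniformly at random; `detection_probability` and `coarse_table` are illustrative names, not part of the scripts above.

```python
N_PACKAGES = 23347  # reproducible packages, exactly one assumed to be bad


def detection_probability(people, packages_each, n=N_PACKAGES):
    """Chance that at least one of `people` checkers, each verifying
    `packages_each` distinct random packages, hits the single bad one."""
    miss_one = 1.0 - packages_each / n   # one person misses it
    return 1.0 - miss_one ** people      # not everybody misses it


def coarse_table(steps, n=N_PACKAGES):
    """Rows: packages per person; columns: number of people."""
    return [[detection_probability(p, k, n) for p in steps] for k in steps]


if __name__ == "__main__":
    steps = [1, 10, 100, 200, 400, 800, 1000]
    print("pkgs/people " + " ".join(f"{p:>7}" for p in steps))
    for k, row in zip(steps, coarse_table(steps)):
        print(f"{k:>11} " + " ".join(f"{v:7.3f}" for v in row))
```

Evaluating only a handful of grid points keeps memory use trivial, at the cost of losing the continuous picture that the 3D plot gives.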

So how to verify by hand whether a package is reproducible

I'll go through the entire process. It's up to you to decide whether it's hard or not. First of all, we of course need a clean Debian installation in a VM or on real hardware, the packages build-essential, fakeroot and devscripts, and a bit of patience. It's also very handy to download the full official structured list of packages, even though the recommended way to check a package's status is to use a simple Python script.

Let's start with something stupidly simple, so no package-specific bugs will interfere. Again, we are talking about a VERY sensitive subject here, where absolute bitwise precision matters. I chose my favorite piece of software ever, the tmux terminal multiplexer: it's lightweight and has very high code quality. I've just made sure it is marked as reproducible by the Debian project.

{
        "architecture": "amd64",
        "build_date": "2017-06-16 00:13",
        "package": "tmux",
        "status": "reproducible",
        "suite": "stretch",
        "version": "2.3-4"
}
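Looking a package up in the downloaded list can be done with a few lines of Python. This is only a sketch: it assumes the list is a JSON array of records shaped like the one above, and both the function names and the file name in the usage note are placeholders rather than anything the Debian project ships.

```python
import json


def load_records(path):
    """Load the package list; assumed to be a JSON array of objects."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def find_status(records, package, suite=None, architecture=None):
    """Return the records matching a package name, optionally narrowed
    down by suite and architecture."""
    matches = []
    for rec in records:
        if rec.get("package") != package:
            continue
        if suite is not None and rec.get("suite") != suite:
            continue
        if architecture is not None and rec.get("architecture") != architecture:
            continue
        matches.append(rec)
    return matches
```

For example, `find_status(load_records("reproducible.json"), "tmux", suite="stretch", architecture="amd64")` would return the matching records, where `reproducible.json` stands for wherever the list was saved.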

Step 1. Obtain the source. Navigate to an empty directory and run apt source tmux. Easy-peasy.

Step 2. Obtain the .buildinfo file with the dependencies, environment variables and everything else needed to reproduce exactly the same build. Navigate to the continuous integration system page and search for the needed package. Here it is. Notice the checksums at the beginning. For anyone unfamiliar with this stuff: they are essential for checking package integrity.

Checksums-Md5:
 70686e3b722bc94d48910d6a93914fa6 982704 tmux-dbgsym_2.3-4_amd64.deb
 bdc7e6789cd735bf0bc843f775a69b71 264862 tmux_2.3-4_amd64.deb
Checksums-Sha1:
 f0b0b76faae4862c71e1dc5e6b0ed6c799cde561 982704 tmux-dbgsym_2.3-4_amd64.deb
 bcadd30228602c50f49e84c255c7167b592251cf 264862 tmux_2.3-4_amd64.deb
Checksums-Sha256:
 53d956fa9fbf4a49fb1cfe1d3c49c405a4ed7cfd00a1412b32be0e8a4af37c0f 982704 tmux-dbgsym_2.3-4_amd64.deb
 9166d818afedc4e571d0cbe5bab66c940837cd9c238d1de6d5369c64a692e707 264862 tmux_2.3-4_amd64.deb

Step 3. Install the missing build dependencies using sudo apt build-dep tmux. Without these packages the build will likely fail because of missing header files.

The following NEW packages will be installed:
  libevent-core-2.0-5 libevent-dev libevent-extra-2.0-5 libevent-openssl-2.0-5
  libevent-pthreads-2.0-5 libncurses5-dev libtinfo-dev libutempter-dev
  libutempter0 pkg-config
0 upgraded, 10 newly installed, 0 to remove and 0 not upgraded.
Need to get 874 kB of archives.
After this operation, 3,756 kB of additional disk space will be used.
Do you want to continue? [Y/n]

Step 4. Carefully check everything mentioned in the .buildinfo file and make sure it is set up exactly the same way on your installation. For example, in the case of tmux we need to verify 4 environment variables. Let's print what we've got on the host system:

$ printenv DEB_BUILD_OPTIONS LANG LC_ALL SOURCE_DATE_EPOCH
en_US.UTF-8

It doesn't match. We need to set these variables. This can be done using the export command (don't worry, every change mentioned here is temporary and lasts only for the current shell session, so nothing persists if we mess up).

$ export DEB_BUILD_OPTIONS="buildinfo=+all parallel=15" LANG="C" LC_ALL="C" SOURCE_DATE_EPOCH="1477219712"

Check again:

$ printenv DEB_BUILD_OPTIONS LANG LC_ALL SOURCE_DATE_EPOCH
buildinfo=+all parallel=15
C
C
1477219712

Exactly! Seems like we're ready to build a package.
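The same comparison can be automated instead of eyeballed. A minimal Python sketch follows; the expected values are the ones from this walkthrough, `env_mismatches` is an illustrative helper name, and the script has to be started from the same shell in which the variables were exported.

```python
import os

# Values copied from the walkthrough above; for another package they
# would come from that package's .buildinfo file.
EXPECTED_ENV = {
    "DEB_BUILD_OPTIONS": "buildinfo=+all parallel=15",
    "LANG": "C",
    "LC_ALL": "C",
    "SOURCE_DATE_EPOCH": "1477219712",
}


def env_mismatches(expected, actual=None):
    """Map each variable whose value differs to an (expected, actual) pair;
    unset variables show up with actual=None."""
    if actual is None:
        actual = os.environ
    return {
        name: (want, actual.get(name))
        for name, want in expected.items()
        if actual.get(name) != want
    }


if __name__ == "__main__":
    problems = env_mismatches(EXPECTED_ENV)
    for name, (want, got) in problems.items():
        print(f"{name}: expected {want!r}, got {got!r}")
    if not problems:
        print("environment matches")
```

Reporting unset variables separately from wrong ones matters here, because printenv silently skips variables that are not set.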

Step 5. Building. Navigate to the directory with a name formatted like "packagename-packageversion", in my case tmux-2.3, and run debuild -i -us -uc -b as mentioned here. Wait for the compilation to finish, and notice the line at the end of the output which tells you where to find the compiled package. Like this:

dpkg-deb: building package 'tmux' in '../tmux_2.3-4_amd64.deb'.

Step 6. Verify. Run sha256sum ../tmux_2.3-4_amd64.deb and compare the result to the one mentioned in the .buildinfo file. If you've got the same hash, congratulations, the package is reproducible! Here's what I've got:

Which means the package is reproducible. Or not, if the hashes do not match.
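The final comparison can also be scripted. The Python sketch below assumes the .buildinfo file lists its SHA-256 checksums in the layout shown in step 2 (an indented "hash size filename" line per file under a Checksums-Sha256 header); the function names are made up for illustration.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large packages need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sha256_entries(buildinfo_text):
    """Collect {filename: hash} from the Checksums-Sha256 section."""
    entries = {}
    in_section = False
    for line in buildinfo_text.splitlines():
        if line.startswith("Checksums-Sha256:"):
            in_section = True
        elif in_section and line.startswith((" ", "\t")):
            parts = line.split()
            if len(parts) == 3:        # hash, size, file name
                entries[parts[2]] = parts[0]
        else:
            in_section = False         # any other field ends the section
    return entries


def is_reproduced(deb_path, deb_name, buildinfo_text):
    """True if the locally built file hashes to the published value."""
    expected = sha256_entries(buildinfo_text).get(deb_name)
    return expected is not None and sha256_of_file(deb_path) == expected
```

With the .buildinfo contents read into a string, `is_reproduced("../tmux_2.3-4_amd64.deb", "tmux_2.3-4_amd64.deb", text)` gives a yes-or-no answer instead of a visual comparison of two long hex strings.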
