On Reproducibility

a Hacktoberfest Special

About Me

GitHub: purefunctor

Affiliations:

  • Core Team Member for PureScript
  • OSSPH Staff Member

Currently building:

  • language server for PureScript
  • AI-based audio plugins at work

Started contributing to OSS in 2021, and started learning programming in 2018

Goals

The goals of this talk are as follows:

  1. To introduce the concept of reproducibility
  2. To explain how it manifests in different contexts
  3. To present the challenges against reproducibility
  4. To advise on how to approach these challenges

Scope

This talk will focus mostly on general concepts, including technical ones, but not necessarily anything language-specific.

Feel free to ask questions in our Q&A about certain languages and I shall answer to the best of my knowledge.

Likewise, if it's your first time hearing about things in this talk, I also suggest checking them out.

Reproducibility

Reproducibility can be defined as the consistency of results across repetitions of some process.

In the context of software, reproducibility can be applied to the output and behaviour of a program.

Reproducible programs can still be random, as long as the randomness itself is the expected behaviour.
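As a rough illustration of "expected" randomness, here is a minimal Python sketch. The function name, the dice scenario, and the seed values are assumptions made up for this example. It shows that a random process driven by a fixed seed produces the same results on every run.

```python
import random


def simulate_dice_rolls(seed: int, count: int = 5) -> list[int]:
    """Roll a six-sided die `count` times using a generator seeded with `seed`."""
    # A private generator keeps the example independent of global random state.
    rng = random.Random(seed)
    return [rng.randint(1, 6) for _ in range(count)]


# Two runs with the same seed repeat the same "random" sequence.
first_run = simulate_dice_rolls(seed=2023)
second_run = simulate_dice_rolls(seed=2023)
print(first_run == second_run)
```

The output still looks random to an observer, but anyone who knows the seed can repeat it exactly.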

Expecting the Unexpected

Reproducibility also applies to unexpected behaviour such as bugs or runtime errors.

Being able to reproduce unexpected behaviour allows for analysis to be performed and fixes to be delivered.

Ever heard of this line?

  • "It works on my machine ¯\_(ツ)_/¯"

In Different Contexts

Up next, I'll be talking about reproducibility in different contexts, specifically, reproducibility in:

  1. a learning environment
  2. software development
  3. machine learning

In a Learning Environment

In a learning environment, reproducibility is often overlooked because more focus is given to writing code.

During collaboration, differences between learners' devices can cause code to not run at all.

Likewise, this problem may also manifest during evaluation by the instructor, due to the difference between the instructor's device and the learner's.

In a Learning Environment

Ideally, deploying standardized machines would lessen friction in collaboration, but that is a challenge in itself, especially in remote contexts.

Realistically, taking the time to establish a setup that works for the most general case is the most effective strategy.

In Software Development

In (general) software development, reproducibility is often applied in terms of build artefacts or reproducible builds. 1

Open source software offers a greater degree of reproducibility, as the source code is readily available and binary artefacts can be rebuilt in a deterministic manner.

This notion of reproducibility also extends to a program's runtime dependencies, since they become part of the final build.
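To make the idea of a deterministic build artefact more concrete, here is a small Python sketch. It is not how any particular build system works. The file names, contents, and helper names are invented for illustration. It packs files into a tar archive with normalised metadata (sorted order, fixed timestamp, fixed owner) so that the same inputs always give the same bytes and therefore the same hash.

```python
import hashlib
import io
import tarfile


def build_archive(files: dict[str, bytes]) -> bytes:
    """Pack `files` into an uncompressed tar archive with normalised metadata."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name in sorted(files):  # fixed ordering, independent of input order
            info = tarfile.TarInfo(name=name)
            info.size = len(files[name])
            info.mtime = 0  # fixed timestamp instead of the current time
            info.uid = info.gid = 0  # no trace of the building user
            info.uname = info.gname = ""
            info.mode = 0o644
            archive.addfile(info, io.BytesIO(files[name]))
    return buffer.getvalue()


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of `data`."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical project sources, given in two different orders.
sources = {"main.py": b"print('hello')\n", "README": b"demo\n"}
reordered = {"README": b"demo\n", "main.py": b"print('hello')\n"}

print(digest(build_archive(sources)) == digest(build_archive(reordered)))
```

If either file changes by even one byte, the digest changes, which is what lets independent parties check that a binary really came from the published source.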

In Software Development

Tools such as package managers allow software components to be installed from a trusted source. For instance, apt can pull trusted binaries from a package repository such as Ubuntu's. Other package managers, like Portage, prefer building from source, with binaries provided for convenience.

At the language level, build systems can make use of a "lockfile" to keep track of and enforce dependency versions. Examples include Cargo.lock, package-lock.json, and poetry.lock, among many others. Lockfiles may also contain other useful metadata, such as the content hash of each dependency, for added security against certain types of supply chain attacks.
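The content hash check can be sketched in a few lines of Python. This is only an illustration of the idea and not the format or algorithm of any real lockfile. The package name, version, payloads, and the `LockEntry` type are all hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class LockEntry:
    """A hypothetical lockfile entry: a pinned version and a content hash."""
    version: str
    sha256: str


def verify_download(name: str, version: str, payload: bytes,
                    lockfile: dict[str, LockEntry]) -> bool:
    """Accept a download only if its version and content hash match the lockfile."""
    entry = lockfile.get(name)
    if entry is None or entry.version != version:
        return False
    return hashlib.sha256(payload).hexdigest() == entry.sha256


# The hash is recorded once, when the dependency is first resolved.
trusted_payload = b"contents of example-lib 1.2.0"
lockfile = {
    "example-lib": LockEntry("1.2.0", hashlib.sha256(trusted_payload).hexdigest()),
}

print(verify_download("example-lib", "1.2.0", trusted_payload, lockfile))
print(verify_download("example-lib", "1.2.0", b"tampered contents", lockfile))
```

A payload that was swapped out after the lockfile was written fails the check even though its name and version look correct.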

In Software Development

Version control systems like Git track changes in the code base, allowing for bugs to be tracked down one commit at a time.
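Tracking a bug down through history can be sketched as a search over commits. The Python example below uses a binary search, similar in spirit to what `git bisect` automates, rather than literally checking one commit at a time. The commit names and the `is_bad` check are made up for illustration. It assumes the history is ordered oldest first, that the newest commit is bad, and that once the bug appears it stays.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str],
                     is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which `is_bad` is true.

    Assumes `commits` is ordered oldest first, the last commit is bad,
    and every commit after the first bad one is also bad.
    """
    low, high = 0, len(commits) - 1
    while low < high:
        mid = (low + high) // 2
        if is_bad(commits[mid]):
            high = mid  # the bug was introduced at `mid` or earlier
        else:
            low = mid + 1  # the bug was introduced after `mid`
    return commits[low]


# Hypothetical history in which the bug first appears in commit "d4".
history = ["a1", "b2", "c3", "d4", "e5", "f6"]
culprit = first_bad_commit(
    history, lambda commit: history.index(commit) >= history.index("d4")
)
print(culprit)  # d4
```

In practice, `is_bad` would check out the commit and run a test that reproduces the bug, which is why a reliable reproduction matters so much.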

Novel approaches like Nix take reproducibility to the next level by building an entire ecosystem around it 2, that being Nix, Nixpkgs, and NixOS.

Containerization tools like Docker and Podman make it easy to isolate the runtime of programs from the machine they're being developed and run on.

In Software Development

CI/CD platforms can also provide hardened build environments that make sure that software being built on them works as expected and is built without modification.

Rich documentation also helps by recording any prerequisites that need to be met in order to reproduce a build.

There are also issue/ticket templates that can help guide developers into writing high-quality bug reports.

Modern tools are continuously being developed to address reproducibility, as it's in our best interest to do so.

In Machine Learning

In machine learning, reproducibility can be applied to the quality or output of a model being consistent across repeated training.

A model's reproducibility is not binary: whether or not the code is available does not, on its own, determine whether the results can be repeated.

For instance, the paper and code may be readily available, but not the dataset that was used for the results.

On the other hand, the paper, detailing the methodology, as well as the dataset, may be available, but not the code that implements it.

In Machine Learning

The availability of pre-trained models is another factor, but ML reproducibility is also a hardware problem, especially for LLMs.

GPT-4 has a rumored 1.76 trillion parameters which would take a datacenter's worth of GPUs to train. 3

Even lightweight models like Mistral-7B recommend at least 24 GB of VRAM for good throughput. 4

Cloud platforms like GCP, Azure, and AWS help alleviate the problem by providing access to enterprise GPUs.

In Machine Learning

As machine learning is a field of software development, it inherits the code and infrastructure problems from the latter.

Reproducibility is a multi-faceted problem in machine learning, ranging from the availability of research, code, and data to hardware constraints due to model complexity.

Reproducibility in Practice

A few tips to start building with reproducibility in mind:

  1. When learning a new programming language, take the time to learn project management practices for that language.
  2. Make use of version control systems like Git, and platforms like GitHub to track changes and collaborate.
  3. Document the important bits when setting up a project, like any dependencies that need to be installed or config files that need to be written.
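One way to help with the documentation tip is to record what the working setup actually looks like. The Python sketch below is only a suggestion, and the function name and the choice of fields are assumptions. It collects the interpreter version, the platform, and the installed packages into a report that could be kept alongside a project's setup notes or pasted into a bug report.

```python
import json
import platform
from importlib import metadata


def environment_snapshot() -> dict:
    """Collect basic facts about the machine and the Python environment."""
    packages = sorted(
        f"{dist.metadata['Name']}=={dist.version}"
        for dist in metadata.distributions()
    )
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": packages,
    }


snapshot = environment_snapshot()
report = json.dumps(snapshot, indent=2)
print(report)
```

Comparing two such reports is often a quick way to spot why code runs on one machine but not on another.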

Reproducibility in Practice

  1. Set up automated builds and tests through CI/CD, to provide a baseline for how you expect software to run in an independent environment.
  2. For ML-adjacent projects, keep track of parameters used for experiments. Setting the seed for randomness is also applicable.
  3. For non-code assets, make sure that they're hosted in a secure, readily available source.
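The tip on tracking experiment parameters can be sketched as follows. This Python example is a toy and not a real training loop. The `ExperimentConfig` fields, their values, and the fake "loss" calculation are all assumptions for illustration. It stores the parameters, including the seed, next to the results, then reruns the experiment from the stored record.

```python
import json
import random
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """Hypothetical experiment parameters worth recording."""
    seed: int
    learning_rate: float
    epochs: int


def run_experiment(config: ExperimentConfig) -> dict:
    """Stand-in for training: produce a noisy 'loss' per epoch from the seed."""
    rng = random.Random(config.seed)
    # learning_rate is recorded but unused by this toy stand-in.
    losses = [
        round(1.0 / (epoch + 1) + rng.uniform(0, 0.01), 6)
        for epoch in range(config.epochs)
    ]
    return {"config": asdict(config), "losses": losses}


config = ExperimentConfig(seed=42, learning_rate=0.001, epochs=3)
record = json.dumps(run_experiment(config), sort_keys=True)

# Later, rebuild the configuration from the stored record and rerun.
restored = ExperimentConfig(**json.loads(record)["config"])
rerun = json.dumps(run_experiment(restored), sort_keys=True)
print(record == rerun)
```

Real ML frameworks usually have their own random number generators that need seeding as well, and some hardware operations may still be non-deterministic, so this should be read as the general idea and not a guarantee.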

Wrapping Up

Reproducibility is a hard problem, but it's also one that we should familiarize ourselves with.

Slides

Audio Plugins

Q&A

Time for questions!