Resources

This page collects external resources that can help you learn more about kontextfrei.

Blog posts

Introducing kontextfrei

Conference talks

kontextfrei: A new approach to testable Spark applications, Scalar Conf Warsaw 2017, April 08, 2017

Slides are available on Speaker Deck

Abstract

Apache Spark has become the de facto standard for writing big data processing pipelines. While the business logic of Spark applications is often at least as complex as what we have been dealing with in a pre-big-data world, enabling developers to write comprehensive, fast unit test suites has not been a priority in the design of Spark. The main problem is that you cannot test your code without at least running a local SparkContext. These tests are not really unit tests, and they are too slow to support a test-driven development approach.

In this talk, I will introduce the kontextfrei library, which aims to liberate you from the chains of the SparkContext. I will show how it helps restore the fast feedback loop we take for granted. In addition, I will explain how kontextfrei is implemented, discuss some of the design decisions made, and look at alternative approaches and current limitations.
