Kubernetes Blog

Friday, March 22, 2019

Kubernetes End-to-end Testing for Everyone

Author: Patrick Ohly (Intel)

More and more components that used to be part of Kubernetes are now being developed outside of Kubernetes. For example, storage drivers used to be compiled into Kubernetes binaries, then were moved into stand-alone Flexvolume binaries on the host, and now are delivered as Container Storage Interface (CSI) drivers that get deployed in pods inside the Kubernetes cluster itself.

This poses a challenge for developers who work on such components: how can end-to-end (E2E) testing on a Kubernetes cluster be done for such external components? The E2E framework that is used for testing Kubernetes itself has all the necessary functionality. However, trying to use it outside of Kubernetes was difficult and only possible by carefully selecting the right versions of a large number of dependencies. E2E testing has become a lot simpler in Kubernetes 1.13.

This blog post summarizes the changes that went into Kubernetes 1.13. For CSI driver developers, it also covers the ongoing effort to make the storage tests available for testing third-party CSI drivers. How to use them is shown based on two Intel CSI drivers, OIM and PMEM-CSI.

Testing those drivers was the main motivation behind most of these enhancements.

E2E overview

E2E testing consists of several phases:

  • Implementing a test suite. This is the main focus of this blog post. The Kubernetes E2E framework is written in Go. It relies on Ginkgo for managing tests and Gomega for assertions. These tools support “behavior driven development”, which describes expected behavior in “specs”. In this blog post, “test” is used to reference an individual Ginkgo.It spec. Tests interact with the Kubernetes cluster using client-go.
  • Bringing up a test cluster. Tools like kubetest can help here.
  • Running an E2E test suite against that cluster. Ginkgo test suites can be run with the ginkgo tool or as a normal Go test with go test. Without any parameters, a Kubernetes E2E test suite will connect to the default cluster based on environment variables like KUBECONFIG, exactly like kubectl. Kubetest also knows how to run the Kubernetes E2E suite.

E2E framework enhancements in Kubernetes 1.13

All of the following enhancements follow the same basic pattern: they make the E2E framework more useful and easier to use outside of Kubernetes, without changing the behavior of the original Kubernetes e2e.test binary.

Splitting out provider support

The main reason why using the E2E framework from Kubernetes <= 1.12 was difficult was its dependency on provider-specific SDKs, which pulled in a large number of packages. Just getting it compiled was non-trivial.

Many of these packages are only needed for certain tests. For example, a test that mounts a pre-provisioned volume must first provision such a volume the same way an administrator would, by talking directly to a specific storage backend via some non-Kubernetes API.

There is an effort to remove cloud provider-specific tests from core Kubernetes. The approach taken in PR #68483 can be seen as an incremental step towards that goal: instead of ripping out the code immediately and breaking all tests that depend on it, all cloud provider-specific code was moved into optional packages under test/e2e/framework/providers. The E2E framework then accesses it via an interface that gets implemented separately by each vendor package.

The author of an E2E test suite decides which of these packages get imported into the test suite. The vendor support is then activated via the --provider command line flag. The Kubernetes e2e.test binary in 1.13 and 1.14 still contains support for the same providers as in 1.12. It is also okay to include no packages, which means that only the generic providers will be available:

  • “skeleton”: cluster is accessed via the Kubernetes API and nothing else
  • “local”: like “skeleton”, but in addition the scripts in kubernetes/kubernetes/cluster can retrieve logs via ssh after a test suite is run
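The pattern described above, optional provider packages that plug into the framework through an interface and get selected with --provider, can be sketched with the standard library alone. This is only an illustration of the idea: the Provider interface, RegisterProvider, and SetupProvider below are hypothetical names chosen for this sketch and are not copied from the Kubernetes framework.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"sort"
)

// Provider is a hypothetical stand-in for the interface through which a
// framework reaches vendor-specific functionality.
type Provider interface {
	Name() string
	// CreateVolume provisions a volume outside of the Kubernetes API.
	CreateVolume() (string, error)
}

// Factory creates a Provider on demand.
type Factory func() (Provider, error)

var factories = map[string]Factory{}

// RegisterProvider is meant to be called from the init function of each
// provider package. Only imported packages end up in the registry.
func RegisterProvider(name string, f Factory) {
	if _, exists := factories[name]; exists {
		panic("provider registered twice: " + name)
	}
	factories[name] = f
}

// SetupProvider looks up the provider that was selected on the command line.
func SetupProvider(name string) (Provider, error) {
	f, ok := factories[name]
	if !ok {
		known := make([]string, 0, len(factories))
		for n := range factories {
			known = append(known, n)
		}
		sort.Strings(known)
		return nil, fmt.Errorf("unknown provider %q, compiled-in providers: %v", name, known)
	}
	return f()
}

// skeleton models a generic provider that only uses the Kubernetes API.
type skeleton struct{}

func (skeleton) Name() string { return "skeleton" }

func (skeleton) CreateVolume() (string, error) {
	return "", fmt.Errorf("provider %q cannot create volumes", "skeleton")
}

func init() {
	RegisterProvider("skeleton", func() (Provider, error) { return skeleton{}, nil })
}

func main() {
	name := flag.String("provider", "skeleton", "name of the provider to activate")
	flag.Parse()
	p, err := SetupProvider(*name)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("active provider:", p.Name())
}
```

Because registration happens in init functions, a test suite that does not import a vendor package never links that vendor's SDK, which is the point of the split.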

External files

Tests may have to read additional files at runtime, like .yaml manifests. But the Kubernetes e2e.test binary is supposed to be usable entirely stand-alone, because that simplifies shipping and running it. The solution in the Kubernetes build system is to link all files under test/e2e/testing-manifests into the binary with go-bindata. The E2E framework used to have a hard dependency on the output of go-bindata; now bindata support is optional. When a file is accessed via the testfiles package, it is looked up in the following sources:

  • relative to the directory specified with the --repo-root parameter
  • zero or more bindata chunks
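A lookup across several sources can be sketched as follows. This is a simplified model under stated assumptions, not the testfiles package itself: FileSource, RootFileSource, EmbeddedFileSource, and Read are names made up for this example, and the in-memory map merely stands in for a bindata chunk.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// FileSource is a hypothetical interface for one place where test files can
// come from. Returning nil data and a nil error means "not found here".
type FileSource interface {
	ReadTestFile(path string) ([]byte, error)
}

// RootFileSource looks up files relative to a directory, for example the one
// given with a --repo-root parameter.
type RootFileSource struct{ Root string }

func (r RootFileSource) ReadTestFile(path string) ([]byte, error) {
	data, err := os.ReadFile(filepath.Join(r.Root, path))
	if os.IsNotExist(err) {
		return nil, nil
	}
	return data, err
}

// EmbeddedFileSource models a chunk of files that was linked into the binary.
type EmbeddedFileSource map[string][]byte

func (e EmbeddedFileSource) ReadTestFile(path string) ([]byte, error) {
	return e[path], nil
}

// Read asks each source in order and returns the first hit.
func Read(sources []FileSource, path string) ([]byte, error) {
	for _, s := range sources {
		data, err := s.ReadTestFile(path)
		if err != nil {
			return nil, err
		}
		if data != nil {
			return data, nil
		}
	}
	return nil, fmt.Errorf("test file %q not found in %d source(s)", path, len(sources))
}

func main() {
	sources := []FileSource{
		RootFileSource{Root: "."},
		EmbeddedFileSource{"testing-manifests/pod.yaml": []byte("kind: Pod\n")},
	}
	data, err := Read(sources, "testing-manifests/pod.yaml")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s", data)
}
```

With this shape, a test suite that ships without embedded files simply registers no embedded source and relies on the directory lookup alone.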

Test parameters

The e2e.test binary takes additional parameters which control test execution. In 2016, an effort was started to replace all E2E command line parameters with a Viper configuration file. But that effort stalled, which left developers without clear guidance on how they should handle test-specific parameters.

The approach in v1.12 was to add all flags to the central test/e2e/framework/test_context.go, which does not work for tests developed independently from the framework. Since PR #69105, the recommendation has been that each test uses the normal flag package to define its parameters in its own source code. Flag names must be hierarchical with dots separating different levels, for example my.test.parameter, and must be unique. Uniqueness is enforced by the flag package, which panics when a flag is registered a second time. The new config package simplifies the definition of multiple options, which are stored in a single struct.

To summarize, this is how parameters are handled now:

  • The init code in test packages defines tests and parameters. The actual parameter values are not available yet, so test definitions cannot use them.
  • The init code of the test suite parses parameters and (optionally) the configuration file.
  • The tests run and now can use parameter values.

However, it was recently pointed out that it is desirable, and previously was possible, to not expose test settings as command line flags and to set them only via a configuration file. There is an open bug and a pending PR about this.

Viper support has been enhanced. Like the provider support, it is completely optional. It gets pulled into an e2e.test binary by importing the viperconfig package and calling it after parsing the normal command line flags. This has been implemented so that all variables which can be set via command line flags are also set when the flag appears in a Viper config file. For example, the Kubernetes v1.13 e2e.test binary accepts --viper-config=/tmp/my-config.yaml, and that file sets the my.test.parameter flag to value when it has this content:

    my:
      test:
        parameter: value

In older Kubernetes releases, that option could only load a file from the current directory, the suffix had to be left out, and only a few parameters actually could be set this way. Beware that one limitation of Viper still exists: it works by matching config file entries against known flags, without warning about unknown config file entries and thus leaving typos undetected. A better config file parser for Kubernetes is still work in progress.

Creating items from .yaml manifests

In Kubernetes 1.12, there was some support for loading individual items from a .yaml file, but then creating that item had to be done by hand-written code. Now the framework has new methods for loading a .yaml file that has multiple items, patching those items (for example, setting the namespace created for the current test), and creating them. This is currently used to deploy CSI drivers anew for each test from exactly the same .yaml files that are also used for deployment via kubectl. If the CSI driver supports running under different names, then tests are completely independent and can run in parallel.

However, redeploying a driver slows down test execution and it does not cover concurrent operations against the driver. A more realistic test scenario is to deploy a driver once when bringing up the test cluster, then run all tests against that deployment. Eventually the Kubernetes E2E testing will move to that model, once it is clearer how test cluster bringup can be extended such that it also includes installing additional entities like CSI drivers.

Upcoming enhancements in Kubernetes 1.14

Reusing storage tests

Being able to use the framework outside of Kubernetes enables building a custom test suite. But a test suite without tests is still useless. Several of the existing tests, in particular for storage, can also be applied to out-of-tree components. Thanks to the work done by Masaki Kimura, storage tests in Kubernetes 1.13 are defined such that they can be instantiated multiple times for different drivers.

But history has a habit of repeating itself. As with providers, the package defining these tests also pulled in driver definitions for all in-tree storage backends, which in turn pulled in more additional packages than were needed. This has been fixed for the upcoming Kubernetes 1.14.

Skipping unsupported tests

Some of the storage tests depend on features of the cluster (like running on a host that supports XFS) or of the driver (like supporting block volumes). These conditions are checked while the test runs, leading to skipped tests when they are not satisfied. The good thing is that this records an explanation why the test did not run.

Starting a test is slow, in particular when it must first deploy the CSI driver, but also in other scenarios. Creating the namespace for a test has been measured at 5 seconds on a fast cluster, and it produces a lot of noisy test output. It would have been possible to address that by skipping the definition of unsupported tests, but then reporting why a test isn’t even part of the test suite becomes tricky. This approach has been dropped in favor of reorganizing the storage test suite such that it first checks conditions before doing the more expensive test setup steps.
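The reorganization can be pictured as "check cheap preconditions first, pay for setup only afterwards". The following sketch shows that ordering for a test pattern that is defined once and instantiated per driver. All types and names here (Driver, Capability, TestPattern, run) are hypothetical and much simpler than the real storage test suite.

```go
package main

import "fmt"

// Capability names a hypothetical optional feature of a driver or cluster.
type Capability string

const (
	CapBlock Capability = "block"
	CapXFS   Capability = "xfs"
)

// Driver describes a storage driver under test.
type Driver struct {
	Name         string
	Capabilities map[Capability]bool
}

// TestPattern is a test that is defined once and run for several drivers.
type TestPattern struct {
	Name     string
	Requires []Capability
}

// Result records whether a test ran and, if not, why it was skipped.
type Result struct {
	Test    string
	Skipped bool
	Reason  string
}

// skipReason returns a non-empty explanation when the driver lacks a
// capability that the test pattern needs.
func skipReason(d Driver, p TestPattern) string {
	for _, c := range p.Requires {
		if !d.Capabilities[c] {
			return fmt.Sprintf("driver %q does not support %q", d.Name, c)
		}
	}
	return ""
}

// run checks the preconditions before calling the expensive setup function,
// so a skipped test never pays for namespace creation or driver deployment.
func run(d Driver, p TestPattern, setup func()) Result {
	r := Result{Test: d.Name + ": " + p.Name}
	if reason := skipReason(d, p); reason != "" {
		r.Skipped, r.Reason = true, reason
		return r
	}
	setup()
	return r
}

func main() {
	drivers := []Driver{
		{Name: "driver-a", Capabilities: map[Capability]bool{CapXFS: true}},
		{Name: "driver-b", Capabilities: map[Capability]bool{CapXFS: true, CapBlock: true}},
	}
	patterns := []TestPattern{
		{Name: "xfs filesystem", Requires: []Capability{CapXFS}},
		{Name: "raw block volume", Requires: []Capability{CapBlock}},
	}
	setups := 0
	for _, d := range drivers {
		for _, p := range patterns {
			r := run(d, p, func() { setups++ })
			if r.Skipped {
				fmt.Printf("SKIP %s (%s)\n", r.Test, r.Reason)
			} else {
				fmt.Printf("RUN  %s\n", r.Test)
			}
		}
	}
	fmt.Println("expensive setups performed:", setups)
}
```

The skipped test still shows up in the output together with its reason, which is the property that would be lost if unsupported tests were never defined.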

More readable test definitions

The same PR also rewrites the tests to operate like conventional Ginkgo tests, with test cases and their local variables in a single function.

Testing external drivers

Building a custom E2E test suite is still quite a bit of work. The e2e.test binary that will get distributed in the Kubernetes 1.14 test archive will have the ability to test already installed storage drivers without rebuilding the test suite. See this README for further instructions.

E2E test suite HOWTO

Test suite initialization

The first step is to set up the necessary boilerplate code that defines the test suite. In Kubernetes E2E, this is done in the e2e.go and e2e_test.go files. It could also be done in a single e2e_test.go file. Kubernetes imports all of the various providers, in-tree tests, Viper configuration support, and bindata file lookup in e2e_test.go. e2e.go controls the actual execution, including some cluster preparations and metrics collection.

A simpler starting point is the e2e_[test].go files from PMEM-CSI. They don’t use any providers, Viper, or bindata, and import just the storage tests.

Like PMEM-CSI, OIM drops all of the extra features, but is a bit more complex because it integrates a custom cluster startup directly into the test suite, which was useful in this case because some additional components have to run on the host side. By running them directly in the E2E binary, interactive debugging with dlv becomes easier.

Both CSI drivers follow the Kubernetes example and use the test/e2e directory for their test suites, but any other directory and other file names would also work.

Adding E2E storage tests

Tests are defined by packages that get imported into a test suite. The only thing specific to E2E tests is that they instantiate a framework.Framework pointer (usually called f) with framework.NewDefaultFramework. This variable gets initialized anew in a BeforeEach for each test and freed in an AfterEach. It has an f.ClientSet and an f.Namespace at runtime (and only at runtime!) which can be used by a test.

The PMEM-CSI storage test imports the Kubernetes storage test suite and sets up one instance of the provisioning tests for a PMEM-CSI driver which must be already installed in the test cluster. The storage test suite changes the storage class to run tests with different filesystem types. Because of this requirement, the storage class is created from a .yaml file.

Explaining all the various utility methods available in the framework is out of scope for this blog post. Reading existing tests and the source code of the framework is a good way to get started.


Vendoring

Vendoring Kubernetes code is still not trivial, even after eliminating many of the unnecessary dependencies. k8s.io/kubernetes is not meant to be included in other projects and does not define its dependencies in a way that is understood by tools like dep. The other k8s.io packages are meant to be included, but don’t follow semantic versioning yet or don’t tag any releases (k8s.io/kube-openapi, k8s.io/utils).

PMEM-CSI uses dep. Its Gopkg.toml file is a good starting point. It enables pruning (not enabled in dep by default) and locks certain projects onto versions that are compatible with the Kubernetes version that is used. When dep doesn’t pick a compatible version, checking Kubernetes’ Godeps.json helps to determine which revision might be the right one.

Compiling and running the test suite

go test ./test/e2e -args -help is the fastest way to test that the test suite compiles.

Once it does compile and a cluster has been set up, the command go test -timeout=0 -v ./test/e2e -ginkgo.v runs all tests. In order to run tests in parallel, use the ginkgo -p ./test/e2e command instead.

Getting involved

The Kubernetes E2E framework is owned by the testing-commons sub-project in SIG-testing. See that page for contact information.

There are various tasks that could be worked on, including but not limited to:

  • Moving test/e2e/framework into a staging repo and restructuring it so that it is more modular (#74352).
  • Simplifying e2e.go by moving more of its code into test/e2e/framework (#74353).
  • Removing provider-specific code from the Kubernetes E2E test suite (#70194).

Special thanks to the reviewers of this article:



Authors: Josh Berkus (Red Hat), Yang Li (The Plant), Puja Abbassi (Giant Swarm), XiangPeng Zhao (ZTE)

Attendees of the New Contributor Summit at KubeCon Shanghai. Photo by Jerry Zhang

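Recently, at the first KubeCon in China, we completed our first New Contributor Summit in China. It was very exciting to see all the Chinese and Asian developers (plus a few folks from around the world) interested in becoming contributors. Over the course of a day-long session, they learned how, why, and where to contribute to Kubernetes, created pull requests, attended a panel of contributors, and signed their CLAs.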

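This was our second New Contributor Workshop (NCW), building on the earlier workshop in Copenhagen that was created and led by members of SIG Contributor Experience. Because of the audience, this event was held in both Chinese and English, making full use of the first-rate simultaneous interpretation services sponsored by the CNCF. Likewise, the NCW team consisted of community members who speak English and members who speak Chinese: Yang Li, XiangPeng Zhao, Puja Abbassi, Noah Abrahams, Tim Pepper, Zach Corleissen, Sen Lu, and Josh Berkus. In addition to presenting and helping students, the bilingual members of the team translated all of the slides into Chinese. Fifty-one students attended.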

Noah Abrahams explains Kubernetes communication channels. Photo by Jerry Zhang

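The NCW took participants through the stages of contributing to Kubernetes, starting with deciding where to contribute, followed by an introduction to the SIG system and our repository structure. We also had “guest speakers” from the documentation and test infrastructure areas, who covered contributing in those areas. Finally, we closed the workshop with hands-on exercises in filing issues and in submitting and approving PRs.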

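These hands-on exercises use a repository called the contributor playground, created by SIG Contributor Experience as a place for new contributors to try out various actions in a Kubernetes repository. It has modified Prow and Tide automation and uses Owners files similar to those in the real repositories. This lets students learn the mechanics of contributing to our repositories without disrupting normal development.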

Yang Li talks about how to get your PRs through review. Photo by Josh Berkus

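Both the “Great Firewall” and the language barrier make contributing to Kubernetes from China difficult. In addition, open source business models are not yet mature in China, so employees have limited time to work on open source projects.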

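Chinese engineers are eager to participate in the development of Kubernetes, but many of them do not know where to start because Kubernetes is such a large project. With this workshop, we hope to help those who want to contribute, whether they wish to fix bugs they have encountered, improve or localize documentation, or need to work with Kubernetes in their jobs. We are glad to see that more and more Chinese contributors have joined the community over the past few years, and we hope to see even more in the future.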

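“I have been participating in the Kubernetes community for about three years,” said XiangPeng Zhao. “In the community, I have noticed that more and more developers in China are showing interest in contributing to Kubernetes. However, it is not easy to start contributing to such a project. I have tried my best to help those I met in the community, but I think some new contributors may still leave the community because they do not know where to get help when they run into trouble. Fortunately, the community initiated the NCW at KubeCon Copenhagen and held a second one at KubeCon Shanghai. I was very happy to be invited by Josh Berkus to help organize this workshop. During the workshop, I met friends from the community in person, mentored attendees in the exercises, and so on. All of this was a memorable experience for me. Even as a contributor with years of experience, I learned a lot. I wish I had attended such a workshop when I started contributing to Kubernetes a few years ago.”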

Panel of contributors. Photo by Jerry Zhang

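The workshop ended with a panel of current contributors, featuring Lucas Käldström, Janet Kuo, Da Ma, Pengfei Ni, Zefeng Wang, and Chao Xu. The panel aimed to give both new and current contributors a look behind the scenes at the day-to-day work of some of the most active contributors and maintainers, whether from China or from around the world. The panelists discussed where to begin a contributor’s journey and how to interact with reviewers and maintainers. They also explored the main issues of contributing from China and gave attendees a preview of exciting features to expect in upcoming Kubernetes releases.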

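After the workshop, XiangPeng Zhao chatted with some attendees on WeChat and Twitter about their experiences. They were very glad to have attended the NCW and offered some suggestions for improving the workshop. One attendee, Mohammad, said: “I had a great time at the workshop and learned the whole process of contributing to k8s.” Another attendee, Jie Jia, said: “The workshop was wonderful. It systematically explained how to contribute to Kubernetes. Even attendees who knew nothing about it beforehand could understand the process. Those who were already contributors could also learn something new. In addition, I was able to make new friends from China and abroad at the workshop. It was awesome!”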

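SIG Contributor Experience will continue to run New Contributor Workshops at upcoming KubeCons, including Seattle and Barcelona, before returning to Shanghai in June 2019. If you could not attend this year, register for one at a future KubeCon. And if you meet an NCW attendee, be sure to welcome them to the community.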