Date
May 18, 2021
Reading Time
10 Min.

What ancient cultures have in common with software testing

Software Engineering
Testing

By

Andreas Siegel

Photo by Nour Wageh on Unsplash

We are all familiar with pyramids. The best known are probably the pyramids that Egyptian pharaohs had built as burial sites and as a reflection of the hierarchy of their social structure. However, pyramids were also built independently and entirely autonomously by cultures living far apart from one another. Today we know pyramids from all over the world: Egypt, Latin America, China, Greece, Rome, ... What they have in common is that they are mainly buildings with a religious and/or ceremonial character, for example as part of a funerary cult.

But what exactly does this have to do with software and the testing of software? Is it also a construct with a ceremonial or religious character? The answer to this question probably depends very much on who you ask ... Is it even a death cult? Hopefully not – our software should live!

And a completely different pyramid makes an important contribution to this: the test pyramid.

In the mythological interpretation, a step pyramid was a huge stone staircase to heaven, the "house of eternity" for a king, which allowed him to attain immortality. Immortality sounds good, perhaps that's what we want for our software too. However, perhaps we shouldn't approach the subject like Cheops, the pharaoh who, according to legend, forced all his subjects to help build his pyramid.

Our motivation to help build the (test) pyramid can perhaps be found in the benefits that people in Latin America derived from their pyramids: The pyramids there were the substructures of busy plateaus on which houses, palaces and temples were built. These structures were thus protected from flooding during often heavy rainfall. This protection is potentially vital.

The same can be said for tests in software development – our test pyramids. Admittedly, despite their importance, in many places they look more like burial mounds – the structures from which the Egyptian pyramids eventually evolved, architecturally and structurally. So if the tests in your project still look more like a hill than a pyramid, don't worry – you're not the first (and probably not the last) to be in this situation. And there is hope: many of the pyramids in Latin America were even built over several times.

Pyramid level 1: Unit tests

Photo by Xavi Cabrera on Unsplash

So let's start our consideration of the software test pyramid with small hills, of which we can build many without much effort in order to give the rest of the structure a broad base. Let's start with the unit tests. As the smallest elements of our test pyramid, they may even be its building blocks. Unit tests check a single, individual piece of code. This could be a component or function, for example. The basic rule for this piece of code is that it is the smallest testable unit for which there are only a few input values and usually a single result.

Everything that is not in the scope of this single "unit" is mocked, stubbed or replaced by fake objects. The aim is to isolate the code to be tested so that clear and unambiguous input and output values are guaranteed. Ideally, such tests are written before the code to be tested even exists. Robert C. Martin ("Uncle Bob") has established three laws for such test-driven development:

  • You must not write production code until you have written a failing unit test for it.
  • You must not write more of a unit test than is sufficient to fail – and failing to compile counts as failing.
  • Only write as much production code as is necessary for the previously failing test to run successfully.

That works quickly. And you quickly end up with a lot of tests that are created in parallel with development and can tell us at any time whether a change has broken something. It is therefore important to concentrate on a single aspect, a single concept, or even just a single assertion in each unit test. If a unit test fails, it should be immediately clear what exactly did not work and what broke. Errors that can be discovered quickly and easily this way would be much more difficult to uncover in later test phases.
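
A minimal sketch of such a unit test – here with JUnit 5 and a hypothetical PriceCalculator class that, in true test-driven fashion, may not even exist yet – could look like this:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

// PriceCalculator is a hypothetical, isolated unit with few inputs and a single result.
// Following the three laws, this test is written before the production code exists.
class PriceCalculatorTest {

    @Test
    void appliesTenPercentDiscountAboveThreshold() {
        PriceCalculator calculator = new PriceCalculator();

        // A single aspect, a single assertion: if this test fails,
        // it is immediately clear which behavior is broken.
        assertEquals(90.0, calculator.priceAfterDiscount(100.0), 0.001);
    }
}
```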

These tests are therefore an essential part of the build process and should be executed frequently and regularly in order to obtain feedback quickly and continuously.

Logically, we as developers write the unit tests ourselves, for ourselves and for other developers. If we had to wait for someone (a tester) to write unit tests for us first, the advantage of speed would be lost. This would be quite absurd, especially considering that a single line of production code may well require significantly more test code. At the same time, this makes it clear that it is advisable to concentrate on testing the parts of the code that actually influence the behavior of the software as a whole. For example, it may not make much sense to test every getter and every setter of an object. The added value of testing is much greater if we concentrate on covering all paths through loops and decisions. And we should use realistic data as input values to avoid surprises later on.

If we take all this into account, we also have another advantage that should not be neglected: our tests document the behavior we expect. We – and anyone who might join the project later and be confronted with our code and the associated tests – can thus better understand what works how (and why).

However, all of this only relates to the components tested individually in the unit tests. We will not find every bug and every error. And we probably won't be able to identify any problems that only arise during integration, i.e. when interacting with other components. That's what the next level of our test pyramid is for.

Pyramid level 2: Module tests & integration tests

Photo by Bonneval Sebastien on Unsplash

We move a little further up our pyramid. We consider the smallest testable units to be sufficiently tested and now turn our attention to their interaction. Up to this point, with the unit tests – which ideally were created in parallel with the implementation – we have dealt primarily with internal structures and functionality. In other words, with the unit tests we were working in the area of white box testing.

If we turn our attention to the modules (sometimes also referred to as components), knowledge of such internal aspects can be helpful, but is also a question of judgment and test strategy when it comes to the actual design of the tests. We are gradually moving into the area of black box testing. We have already dealt with these two approaches to software testing in more detail some time ago:

With module tests, we can consider several small units as belonging together and test them together as a group. For example, this could be a more complex functionality in a service of an application. Other parts of the application are replaced by mocks or stubs that specify clearly defined behavior for all aspects of the application that are not currently relevant, without being directly dependent on their actual implementation (and immediately affected by changes). For Spring Boot applications, for example, it is possible to test individual layers of the application separately. This means that only part of the application is actually initialized and executed for the execution of the tests, which has a positive effect on the runtime of the tests. This is important because we still want to receive feedback quickly and often and execute the module tests automatically as part of the build process.
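
As a rough sketch of what such a layer-focused test could look like in a Spring Boot application – the GreetingController and GreetingService used here are purely illustrative assumptions – a test slice like @WebMvcTest starts only the web layer and replaces the underlying service with a mock:

```java
import static org.mockito.Mockito.when;
import static org.springframework.test.web.servlet.request.MockMvcRequestBuilders.get;
import static org.springframework.test.web.servlet.result.MockMvcResultMatchers.content;
import static org.springframework.test.web.servlet.result.MockMvcResultMatchers.status;

import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.autoconfigure.web.servlet.WebMvcTest;
import org.springframework.boot.test.mock.mockito.MockBean;
import org.springframework.test.web.servlet.MockMvc;

// @WebMvcTest only initializes the web layer, not the whole application context,
// which keeps the runtime of the tests short.
@WebMvcTest(GreetingController.class)
class GreetingControllerTest {

    @Autowired
    private MockMvc mockMvc;

    // The service behind the controller is replaced by a mock
    // with clearly defined behavior for the aspects that are not under test here.
    @MockBean
    private GreetingService greetingService;

    @Test
    void returnsGreetingForKnownUser() throws Exception {
        when(greetingService.greetingFor("Ada")).thenReturn("Hello, Ada!");

        mockMvc.perform(get("/greetings/Ada"))
                .andExpect(status().isOk())
                .andExpect(content().string("Hello, Ada!"));
    }
}
```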

On the other hand, it is also conceivable to view the entire application as a module (of a larger system) and test it as a whole, as a black box. In this case, we would only use mocks and stubs for external dependencies that are not directly part of our application. This could be, for example, the interaction with an authentication server or identity provider. For test execution, the application can be started locally and called "from outside" with libraries such as REST-assured. This way, the tests also pass through all the security mechanisms intended for the application, whereas other module test approaches hook in at a later point and therefore skip or bypass the first stages of request handling.
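
A minimal sketch of such a black-box test with REST-assured might look like the following; the base URL, the /api/orders endpoint and the token handling are assumptions for illustration only:

```java
import static io.restassured.RestAssured.given;

import org.junit.jupiter.api.Test;

// Black-box test against a locally running instance of the whole application.
class ApplicationBlackBoxTest {

    private static final String BASE_URL = "http://localhost:8080";

    @Test
    void rejectsRequestsWithoutAuthentication() {
        // Because the whole application is running, the request passes through
        // all security mechanisms instead of bypassing them.
        given()
            .baseUri(BASE_URL)
        .when()
            .get("/api/orders")
        .then()
            .statusCode(401);
    }

    @Test
    void returnsOrdersForAuthenticatedUser() {
        given()
            .baseUri(BASE_URL)
            .auth().oauth2("<access token for a test user>")
        .when()
            .get("/api/orders")
        .then()
            .statusCode(200)
            .contentType("application/json");
    }
}
```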

Whether we look at more complex functionality within an application or at an application as a whole, the focus is on detecting errors that arise during the integration and interaction of the various small units of our software. We want to make sure that everything is properly connected and works together.

Of course, we cannot cover every conceivable and inconceivable scenario. Instead, we assume that the majority of potential error cases have already been taken into account in the unit tests. We are therefore now focusing on the "happy path" and obvious "corner cases". These could be bugs that have occurred in the past, for example, which are now explicitly included in the tests.

The two approaches to module tests (or integration tests) clearly show that the concept of integration tests is very broad, which is also reflected in various definitions of the International Software Testing Qualifications Board (ISTQB):

integration testing: Testing performed to expose defects in the interfaces and in the interactions between integrated components or systems. See also component integration testing, system integration testing.

component integration testing: Testing performed to expose defects in the interfaces and interaction between integrated components.

system integration testing: Testing the integration of systems and packages; testing interfaces to external organizations (e.g. Electronic Data Interchange, Internet).

This often causes confusion, and it is not really possible to draw a clear line between module tests, component tests, integration tests and system integration tests – the boundaries are fluid. Martin Fowler makes a simple distinction between "narrow integration tests" and "broad integration tests", which differ as follows:


Narrow integration tests (analogous to a component/module (integration) test):

  • only cover the part of a service that interacts with another service

  • other services are replaced by "doubles" (mocks or stubs), either locally or remotely

  • relatively many tests with a comparatively narrow scope, which is often not very different from that of a unit test and is usually executed with the same framework

Broad integration tests (analogous to a system integration test):

  • require live versions of all services in a real test environment with network access

  • cover functionality and paths across all services, not just the one responsible for the interaction

Without really realizing it, we have already climbed to the next level of the test pyramid.

Pyramid level 3: API smoke tests & end-to-end tests

Photo by Bonneval Sebastien on Unsplash

At the transition from integration tests, we also find system integration tests with the characteristics just described. With these tests, we differentiate between API smoke tests and end-to-end tests. What both have in common is that they test an actual running and complete live version of the software as a black box—just in different ways and with different focuses.

If the system includes graphical user interfaces, these are also included in the tests. This is referred to as end-to-end testing. UI testing frameworks such as Protractor, Selenium, or Puppeteer are used to automate web browsers. We take on the role of a user and click through the application via the interface, enter data, switch pages, and check whether what is displayed meets our expectations. However, this is not based on the visible graphical UI itself but on the markup in the DOM. For instance, a UI end-to-end test expects to find a button with a specific label and properties, "click" it, and then verify the result of the interaction. Which page do I land on? Do I see the data I expect?
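
A sketch of such a UI end-to-end test with Selenium WebDriver could look like the following; the URL, element IDs and expected texts are illustrative assumptions:

```java
import static org.junit.jupiter.api.Assertions.assertTrue;

import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

// End-to-end test that drives a real browser against a live version of the software.
class LoginEndToEndTest {

    private WebDriver driver;

    @BeforeEach
    void openBrowser() {
        driver = new ChromeDriver();
    }

    @AfterEach
    void closeBrowser() {
        driver.quit();
    }

    @Test
    void userCanLogInViaTheUi() {
        driver.get("https://test.example.com/login");

        // The test works against the markup in the DOM, not against pixels:
        // it locates elements, "types" and "clicks" like a user would.
        driver.findElement(By.id("username")).sendKeys("demo-user");
        driver.findElement(By.id("password")).sendKeys("secret");
        driver.findElement(By.id("login-button")).click();

        // Which page do I land on? Do I see the data I expect?
        assertTrue(driver.getCurrentUrl().endsWith("/dashboard"));
        assertTrue(driver.findElement(By.tagName("h1")).getText().contains("Welcome"));
    }
}
```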

The focus of UI end-to-end tests is therefore on inputs, outputs, and (user) interactions via the UI components of our software. Since these tests are executed on a live version of the software, the API requests triggered by the UI are, of course, also involved—but they are not the primary focus.

Backend system integration tests, on the other hand, focus specifically on the API used by the UI components, which we now test directly. For this, we can again use tools like Postman or REST-assured. If we run system integration tests focused on APIs in development or test environments, they can also include requests that modify data to cover the full range of possible API interactions.
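
A sketch of such a backend system integration test with REST-assured against a test environment – URL, endpoint and payload are assumptions for illustration – could include a data-modifying request and read the result back:

```java
import static io.restassured.RestAssured.given;
import static org.hamcrest.Matchers.notNullValue;

import org.junit.jupiter.api.Test;

// Backend system integration test run against a development or test environment.
class OrderApiSystemIntegrationTest {

    private static final String TEST_ENVIRONMENT = "https://test.example.com";

    @Test
    void createsAndReadsBackAnOrder() {
        // In a test environment, data-modifying requests are allowed,
        // so we can cover the full range of possible API interactions.
        String orderId =
            given()
                .baseUri(TEST_ENVIRONMENT)
                .contentType("application/json")
                .body("{\"product\": \"test-product\", \"quantity\": 1}")
            .when()
                .post("/api/orders")
            .then()
                .statusCode(201)
                .body("id", notNullValue())
                .extract().path("id");

        given()
            .baseUri(TEST_ENVIRONMENT)
        .when()
            .get("/api/orders/" + orderId)
        .then()
            .statusCode(200);
    }
}
```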

However, modifying data is a no-go when running API smoke tests against production systems. The goal here is strictly to verify a deployment through read-only access and ensure that our software system is available in a given environment and responds to requests as expected. In these tests, real external systems and dependencies are also part of the test.
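
In contrast to the example above, an API smoke test against production is limited to read-only requests. A sketch with REST-assured, assuming a Spring Boot Actuator health endpoint and an illustrative production URL, might look like this:

```java
import static io.restassured.RestAssured.given;
import static org.hamcrest.Matchers.equalTo;

import org.junit.jupiter.api.Test;

// Read-only smoke test: verifies a deployment without modifying any data.
class ApiSmokeTest {

    private static final String BASE_URL = "https://app.example.com";

    @Test
    void healthEndpointReportsUp() {
        given()
            .baseUri(BASE_URL)
        .when()
            .get("/actuator/health")
        .then()
            .statusCode(200)
            .body("status", equalTo("UP"));
    }

    @Test
    void importantReadOnlyEndpointIsReachable() {
        // Real external systems and dependencies are part of this call chain.
        given()
            .baseUri(BASE_URL)
        .when()
            .get("/api/products")
        .then()
            .statusCode(200);
    }
}
```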

End-to-end tests, backend system integration tests, and API smoke tests together cover maybe 10 percent of the overall system—because the specific business logic has already been tested elsewhere, in the lower levels of the test pyramid. We assume that the individual components function as intended or that issues will be identified earlier. Our two types of system integration tests are designed to ensure that requirements are met, the system is configured correctly, and everything works together as expected.

These tests can—and should—be automated, but not necessarily for every single build. Due to their broader scope, they are potentially slower and more resource-intensive. Running them too frequently may not be efficient. Ideally, they should be managed through separate build pipelines or jobs so their execution can be triggered manually when needed. It’s also good practice to run them automatically as part of a deployment. The tests at level 3 of our test pyramid represent the final stage of testing that we developers handle ourselves and automate. For the final level—the acceptance tests—we hand off responsibility to our customers.

Pyramid level 4: Acceptance test

Photo by Alphacolor on Unsplash

Developing software is not an end in itself. The goal is to meet the customer’s requirements and generate value for them. We cannot test whether our software achieves this on our own. For that, we need the customer and user to accept it. Acceptance tests are crucial for customer satisfaction. Ultimately, the customer decides whether the “rocky road” on our “stone stairway to heaven” was worth it—the customer is king here too.

Therefore, the acceptance test in software projects is typically one of the last phases before the software is put into production. The test begins once all known requirements have been implemented and the most serious issues have been resolved. The software is then deployed under realistic conditions and tested by domain experts.

Our customers or end users check the features from their perspective, focusing on the required business functions and the system’s proper operation, including its stability. Testing can sometimes even be a legal requirement for our customers.

We assume that if the software works under realistic testing conditions, it will perform just as well in the real world.

It is entirely possible that errors and problems may still arise during the acceptance test that we didn't catch in the earlier testing phases. However, this does not include cosmetic issues, crashes, or simple spelling mistakes. It's more about aspects of user-friendliness—particularly intuitive usability and expected behavior. Does the software require extensive training? Are there any features a user might expect that are missing but weren't previously requested? Often, these are small details that simply weren't considered, partly because, at some point, everyone involved in the project is so deeply immersed in the subject matter that such details go unnoticed.

To clearly document the results of acceptance tests and ensure they can be repeated if needed, these tests are based on scenarios that were initially considered when gathering the requirements in the form of use cases:

Once the customer requirements have been analyzed, test scenarios are compiled, a test plan is created based on these scenarios, and test cases are defined. The acceptance test can then be carried out, and the results recorded.

If the customer requirements are met, our software has successfully passed through the test pyramid and is one step closer to success. If not, we are all one insight richer: we’ve identified problems and challenges to address, and we can proceed to the next iteration.

Graphic by katemangostar on Freepik.com

To be continued...