Dealing with the “Refactoring or Test” dilemma

September 13, 2023

Abstract

This article covers the following topics:

  • Automated test code creates dependencies on the main application code.
  • In general, it is good design to eliminate unnecessary dependencies.
  • On the other hand, End-to-End (E2E) testing creates strong dependencies on the GUI.
  • It is important to avoid creating unnecessary dependencies on the GUI in test preparation, etc.
  • To reduce unnecessary dependencies, it is better to test at a lower test level, for example by moving from the business process level down to the user story level.
  • The lower the test level, the greater the dependence on the testability implementation – testable design and API for testing.
  • If it is difficult to implement at a lower test level, start with a higher test level and implement the lower test levels in turn.

Introduction

Two developers in a bar


I was out for a drink the other night with a friend of mine who is a software engineer.

When two engineers sit at a bar together, the conversation is usually about technical debt. The topic that day was end-to-end (E2E) testing, a type of testing that is conducted in an environment similar to production and that tests a business process from start to finish from the user’s perspective.

The friend shared the following concerns about E2E automated testing.

“Because of the side effect of a test run, another test case – including the test itself – can become broken. For example, running test A changes the data in the database, causing other tests that reference the same data, or even a re-run of test A, to fail.

In addition, the step of preparing test data before the test may be unstable, causing the test to fail before the actual point to be tested. For example, due to data generated from a previous test run, unexpected data may exist at the test preparation step, causing the creation of new data to fail and preventing the test from proceeding to completion. This situation can also occur when the GUI is delayed, or the data is not displayed as expected due to pagination.”

I thought those problems came from a lack of independence and stability in the tests, so I advised the following:

  • Generate unique data on every run. This improves the tests’ independence, so that the side effects of one test do not affect another.
  • Provide a separate programmable interface, such as a REST API, to prepare the test data. This also helps in generating unique test data as described above.
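
As a rough illustration of the first point, the sketch below builds test data with a suffix that differs on every call. The helper names `uniqueSuffix` and `buildTestUser` and the field names are hypothetical and chosen only for this example.

```javascript
// Hypothetical helper: returns a suffix that differs on every call.
// It combines a timestamp, a per-process counter, and random characters.
let sequence = 0;
function uniqueSuffix() {
  sequence += 1;
  const random = Math.random().toString(36).slice(2, 8);
  return `${Date.now()}-${sequence}-${random}`;
}

// Hypothetical test-data builder: each call yields a user that an
// earlier call in this process cannot have produced.
function buildTestUser() {
  const suffix = uniqueSuffix();
  return {
    name: `test-user-${suffix}`,
    email: `test-user-${suffix}@example.com`,
  };
}

const first = buildTestUser();
const second = buildTestUser();
console.log(first.name !== second.name); // true: the counter always differs
```

Because the data never collides with leftovers from a previous run, a re-run of the same test does not trip over its own side effects.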

He went on to ask another question.

“If we prepare an API for test data, can we guarantee that that API will behave the same as the real UI?”

I thought the problem was not with the testing but with the application design. So, I gave the following answer.

“Can’t we guarantee that the API and UI internally behave in the same way by the system design? For example, why not call the same class when requested by the UI and when requested by the API and keep the logic in the UI and API controllers to a minimum? This would positively impact the readability and maintainability of the application.”
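
To make that suggestion a little more concrete, here is a minimal sketch of what I had in mind. The class `UserService` and the two controller functions are hypothetical names, and the in-memory `Map` stands in for a real database.

```javascript
// Hypothetical domain class shared by both entry points.
class UserService {
  constructor() {
    this.users = new Map(); // in-memory stand-in for a database
  }

  // Validation lives here, so the UI and the API cannot drift apart.
  register(rawName) {
    const name = String(rawName).trim();
    if (name === "") throw new Error("name is required");
    if (this.users.has(name)) throw new Error(`user already exists: ${name}`);
    const user = { id: this.users.size + 1, name };
    this.users.set(name, user);
    return user;
  }
}

// UI controller: only maps form input to the service and renders the result.
function handleSignupForm(service, form) {
  const user = service.register(form.nameField);
  return `<p>Welcome, ${user.name}</p>`;
}

// API controller: only maps the JSON body to the service and wraps the result.
function handleCreateUserRequest(service, body) {
  const user = service.register(body.name);
  return { status: 201, body: user };
}

const service = new UserService();
console.log(handleSignupForm(service, { nameField: "alice" })); // <p>Welcome, alice</p>
console.log(handleCreateUserRequest(service, { name: "bob" }).status); // 201
```

Since both controllers delegate to the same `register` method, a rule such as the duplicate check applies whether the user was created from the screen or from the API.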

He responded as follows.

“That would mean that refactoring, including the internal structure of the UI, would be necessary, wouldn’t it? However, automated testing is necessary for rewriting internal structures without changing external behavior. That’s weird, because it means that we can’t get good automated tests without rewriting the internal implementation, but rewriting the implementation requires automated tests. This is a deadlock.

Besides, E2E testing is about testing business processes, and it seems odd that we get APIs that don’t appear in the actual business processes, even if they are for testing.”

Two problems

During the conversation at the bar, I was given two problems.

  1. If we have to modify the internals of the test target to create an automated test that works stably in the first place, we will never achieve the ideal test: a chicken-or-egg problem.
  2. If we use APIs instead of GUIs to set up preconditions and postconditions for the sake of stability, we will not meet the requirement of E2E testing, which is to test business processes.

The first problem can be rephrased as “Why do we need a special implementation for automated testing in the first place?”. To obtain an ideal (stable and independent) automated test, we have to modify the application, which creates the chicken-or-egg problem. In this article, we will refer to this special implementation as the testability implementation and explain its necessity and importance.

The second problem can be straightforwardly rephrased as “What are the requirements for E2E testing?”. In the introduction, it was defined as “testing business processes,” but this article will dig deeper into this definition and explore what E2E testing should test and how.

The importance of testability implementation

Let’s begin by reviewing the nature and characteristics of automated testing. Then, let’s explore the answer to the first problem.

Test code creates dependencies between the application and itself

The following example shows a very simple implementation of the FizzBuzz problem and its test code.

Note that the test code actually uses the fizzBuzz function.

    const assert = require("assert") // Node.js built-in assertion module

    function fizzBuzz(num) {
      if (num % 15 === 0) return "FizzBuzz"
      if (num % 5 === 0) return "Buzz"
      if (num % 3 === 0) return "Fizz"
      return num
    }

    assert(fizzBuzz(3) === "Fizz")
    assert(fizzBuzz(5) === "Buzz")
    assert(fizzBuzz(15) === "FizzBuzz")
    assert(fizzBuzz(1) === 1)
    assert(fizzBuzz(2) === 2)

For example, suppose we intentionally introduce a bug into the fizzBuzz function. If the return num at the end is removed, fizzBuzz(1) and fizzBuzz(2) return undefined, and the test fails because the function no longer returns a value for numbers that are not multiples of 3 or 5.

    function fizzBuzz(num) {
      if (num % 15 === 0) return "FizzBuzz"
      if (num % 5 === 0) return "Buzz"
      if (num % 3 === 0) return "Fizz"
      // return num   <- removed to introduce the bug
    }

    assert(fizzBuzz(3) === "Fizz")
    assert(fizzBuzz(5) === "Buzz")
    assert(fizzBuzz(15) === "FizzBuzz")
    assert(fizzBuzz(1) === 1) // will fail
    assert(fizzBuzz(2) === 2) // will fail

Similarly, what happens if we change the order of the if statements? In this case, fizzBuzz(15) returns “Buzz”, because if (num % 5 === 0) is evaluated first and 15 is also a multiple of 5.

    function fizzBuzz(num) {
      if (num % 5 === 0) return "Buzz"
      if (num % 3 === 0) return "Fizz"
      if (num % 15 === 0) return "FizzBuzz" // never reached
      return num
    }

    assert(fizzBuzz(3) === "Fizz")
    assert(fizzBuzz(5) === "Buzz")
    assert(fizzBuzz(15) === "FizzBuzz") // will fail
    assert(fizzBuzz(1) === 1)
    assert(fizzBuzz(2) === 2)

Now, as mentioned earlier, each assert actually calls the fizzBuzz function. What this means is that assert creates a dependency on fizzBuzz.

The five assert statements represent the specification of the fizzBuzz function. They describe which values it returns for certain inputs. In other words, test code is code that checks conformance to the specification, and it does so by depending on the application interface.

The two are inseparable as far as dependencies are concerned. In other words, it is the test code that makes the dependency between the specification and the implementation inseparable. Since the test code depends on the interface of the implementation, the quality of the test code is largely dependent on the quality of the application. Quality here refers to the quality of the test code itself, such as the independence and stability of the test code.

This quality characteristic of an application that determines the quality of the test code is called testability, and we refer to its implementation as the testability implementation in this article. This refers to the need for internal changes or special implementations for testing that appeared in the introduction.

Why testability implementation is needed

In E2E testing, the GUI is often used for every operation, including creating and cleaning up test data, as in the examples from the introduction. However, as mentioned earlier, test code creates a dependency on the application interface, and in E2E testing that interface is the GUI. Performing every operation through the GUI therefore leads to unnecessary dependencies between the tests and the implementation.

To illustrate, consider a test on an e-commerce site that confirms the purchase of an item in the cart. To keep the test case independent, every run has to register a new product, inventory, user, and so on, so the test requires the following operations.

Test case: Confirm the purchase in the shopping cart

  • Create an item
  • Register an inventory for the item
  • Create a new user
  • Add item into the shopping cart
  • Register user’s shipment information
  • Register user’s payment information
  • Confirm purchase

Of these steps, which one do you really want to test in this test case? The name of the test case is “Confirm the purchase in the shopping cart”. In other words, the final “Confirm purchase” operation is the main purpose of the test, and the rest of the steps are preparation of test data.

As explained in the previous section “Test code creates dependencies between the application and itself”, test code checks for deviations from the specification by creating dependencies on the implementation. However, using the GUI for preparation creates dependencies on parts of the application that are unrelated to the part you originally want to test.

As a general application design best practice, unnecessary dependencies between components should be avoided as much as possible. The same is true for test code and implementation.

Requirements for E2E testing

Testing business process

However, there is a major discrepancy here.

According to the ISTQB [^1] glossary, E2E testing is defined as follows:

📝 A test type in which business processes are tested from start to finish under production-like circumstances.

[^1]: International Software Testing Qualifications Board, a non-profit organization that certifies test engineer qualifications. In Japan, JSTQB offers equivalent qualification tests and mutual recognition.

In other words, the basic premise of E2E testing is that the test should execute a business process from start to finish. The definition of a business process is not listed in the ISTQB glossary, but Wikipedia describes it as follows:

A business process, business method or business function is a collection of related, structured activities or tasks performed by people or equipment in which a specific sequence produces a service or product (serves a particular business goal) for a particular customer or customers.

Taking the e-commerce site test mentioned earlier as an example, the following can be considered a business process. The key point is that multiple actors and systems interact with each other to achieve their objectives. Here, the business process describes how the e-commerce site (or more precisely, the operator of that site) and the end user operate the system according to their respective objectives, and what value is created. A test of the business process verifies that this value is actually delivered.

Business process: From item registration to shipment

The e-commerce site registers the products and prepares the inventory. The end-user purchases the product, and the e-commerce site ships it.

Testing user stories

On the other hand, is GUI testing really limited to business process testing? For example, user stories are documents that describe value from the user’s perspective, but they often describe only a single feature, whereas a business process spans multiple features. Here is an example user story for our e-commerce site.

As an end user, I can confirm the purchase of items in the shopping cart.

In other words, if we call traditional E2E testing business process E2E testing, we can consider a new test level called user story E2E testing.

Figure: the relationship between each test level and its test basis.

  • Business process E2E tests: business processes
  • User story E2E tests: user stories
  • Integration tests: components
  • Unit tests: modules

As mentioned earlier, a business process is a collection of related activities. Thus, almost all operations, including data preparation, will be performed through the GUI. On the other hand, a user story is a document that describes a small part of a business process from the user’s perspective. The prerequisites for testing that are not mentioned in the user story should be prepared through a programmable interface, such as an API, to make the tests more independent and stable.
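
The difference can be sketched with the shopping cart test case from earlier. The code below is only an illustration: `createDrivers` and both test functions are hypothetical, and the drivers merely record which interface each step would use instead of operating a real browser or API.

```javascript
// Hypothetical drivers: each call only records which interface a step used.
function createDrivers() {
  const calls = [];
  const record = (via) => (step) => {
    calls.push(`${via}: ${step}`);
  };
  return { gui: record("GUI"), api: record("API"), calls };
}

// Business process style: every step, including preparation, uses the GUI.
function confirmPurchaseViaGuiOnly({ gui }) {
  gui("create item");
  gui("register inventory");
  gui("create user");
  gui("add item to cart");
  gui("register shipment information");
  gui("register payment information");
  gui("confirm purchase");
}

// User story style: preparation goes through the API, and only the
// behavior named in the user story is exercised through the GUI.
function confirmPurchaseWithApiSetup({ gui, api }) {
  api("create item");
  api("register inventory");
  api("create user");
  api("add item to cart");
  api("register shipment information");
  api("register payment information");
  gui("confirm purchase");
}

// Counts how many steps of a test depend on the GUI.
function countGuiSteps(drivers) {
  return drivers.calls.filter((call) => call.startsWith("GUI")).length;
}

const guiOnly = createDrivers();
confirmPurchaseViaGuiOnly(guiOnly);
const withApiSetup = createDrivers();
confirmPurchaseWithApiSetup(withApiSetup);

console.log(countGuiSteps(guiOnly)); // 7
console.log(countGuiSteps(withApiSetup)); // 1
```

In the second version, a change to the item registration screen or the payment form can no longer break the purchase confirmation test, because the test does not touch those screens.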

Refactoring or Test

Test levels and the testability implementation

The aforementioned test levels are summarized in the table above.

Each test level uses one or more interfaces as test targets. For example, business process E2E testing and user story E2E testing both use a UI. An integration test or unit test would use interfaces such as components or methods in the system. This means that the test code depends on these interfaces to check if they work as expected.

The rightmost column Dependency on Testability Implementation means to what extent the test level is affected by the testability implementation. For example, business process E2E requires little or no testability implementation because, in principle, everything is executed using the UI. On the other hand, this also means that a very limited number of testability implementations are available, so there will be a great deal to consider on the test design side to maintain test stability and independence.

Conversely, unit tests, for example, are highly dependent on the testability implementation. This can be easily understood by imagining a function that is impossible to test because of its design. For example, suppose you write a unit test for a function that looks at the time and returns either AM or PM. If the function reads the OS time internally, it is hard to write the expected value in the test code, because the correct answer depends on when the test runs.

    function meridian() {
      // Reads the OS clock directly, so the result depends on when the test runs
      if (new Date().getHours() < 12) {
        return 'am'
      } else {
        return 'pm'
      }
    }

A simple way to solve this is to pass the current hour as an argument:

    const assert = require("assert") // Node.js built-in assertion module

    function meridian(hour) {
      if (hour < 12) {
        return 'am'
      } else {
        return 'pm'
      }
    }

    assert(meridian(11) === 'am')
    assert(meridian(12) === 'pm')

In the above example, the testability implementation consisted of making the implementation itself testable. The lower the test level, the more often this is the case.
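
A possible variation, shown here only as a sketch and not as the one correct design, is to make the clock an optional argument. Production code can then keep calling the function without arguments, while tests pass a fixed `Date`. The parameter name `now` is my own choice for this example.

```javascript
// Sketch: the clock is an optional argument, so callers may omit it
// while tests pass a fixed Date instead of relying on the OS time.
function meridian(now = new Date()) {
  return now.getHours() < 12 ? "am" : "pm";
}

// Tests pin the time (local time zone) instead of depending on the OS clock.
const beforeNoon = new Date(2023, 8, 13, 11, 59); // 13 Sep 2023, 11:59
const atNoon = new Date(2023, 8, 13, 12, 0); // 13 Sep 2023, 12:00
console.log(meridian(beforeNoon)); // am
console.log(meridian(atNoon)); // pm
```

The trade-off is that the default argument still reads the OS time, so the untested path is reduced to a single expression rather than removed entirely.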

What about user story E2E testing? In theory, user story E2E testing can be operated entirely from the GUI, similar to business process E2E testing. However, as explained in the section “Why testability implementation is needed”, using a GUI for steps such as preparation has disadvantages in terms of the quality of the test code itself, such as stability and independence. Therefore, I recommend using programmable interfaces for preparation.

Step down slightly from high test levels

Let’s reflect on the two problems posed in the introduction.

  1. If we have to modify the internals of the test target to create an automated test that works stably in the first place, we will never get to the ideal test: a chicken-or-egg problem.
  2. If we use APIs instead of GUIs to set up preconditions and postconditions for the sake of stability, we will not meet the requirement of E2E testing, which is to test business processes.

For (2), we tried to solve the problem by extending the definition of E2E testing a bit and considering a slightly smaller scope test, E2E testing at the user story level.

What about (1)? We are already close to an answer to this as well. The point is that implementation and testing are inseparable, and that there are different levels of testing.

In the table above, we have shown that there are several levels of testing and that the lower the test level, the greater the dependence on the testability implementation. That dependence is the price of test stability: the lower the test level, the more stable and faster the tests generally run.

In fact, the “ideal test” in (1) simply meant “a test at a lower test level”. So if the application has little or no testability implementation, we can start at a higher test level and gradually move down to lower test levels. The higher-level tests then act as the safety net for the refactoring that adds testability, and we can avoid saying “I can’t write tests because of low testability”.

Conclusion

As outlined in the abstract, this article explained the following points:

  • Automated test code creates dependencies on the main application code.
  • In general, it is good design to eliminate unnecessary dependencies.
  • On the other hand, End-to-End (E2E) testing creates strong dependencies on the GUI.
  • It is important to avoid creating unnecessary dependencies on the GUI in test preparation, etc.
  • To reduce unnecessary dependencies, it is better to test at a lower test level, for example by moving from the business process level down to the user story level.
  • The lower the test level, the greater the dependence on the testability implementation – testable design and API for testing.
  • If it is difficult to implement at a lower test level, start with a higher test level and implement the lower test levels in turn.

Autify provides a product that helps to create the types of E2E tests that this article introduced. If you are interested in trying out Autify, please contact us for a trial or to request a demo via the following links. Thank you for reading!
