This is the transcript of the session I gave at Spring OnlineTestConf 2020.
Hello everyone! In today’s presentation titled “How can we improve the testability of applications?”, I will be talking about the testability within E2E testing.
Before we begin, let me briefly introduce myself.
My name is Takuya Suemura. I have been working as a web application developer and also as a software tester for several years.
I have also been a contributor to an open-source E2E testing tool called CodeceptJS for a while. I will be talking about CodeceptJS later in this presentation.
Today I work at a startup called Autify. Autify is an AI-based E2E testing platform for web applications. Since its official launch in October 2019, it has been used by over 100 companies. I am in charge of technical support as well as developing browser automation.
Autify has many features to help with continuous E2E testing. For example, you can record a test scenario just by clicking elements or typing values, much like Selenium IDE.
The key feature is called “Self Healing”. It’s AI-powered automatic scenario maintenance. If you’re using test automation tools like Selenium, you need to update your test code whenever your application changes. Autify makes this easier with the power of AI. If the target element isn’t there during test execution, Autify tracks the change and automatically switches to the new element.
Autify solves many of the pain points of E2E testing, in particular “execution time” and “compatibility”. It supports both parallel execution in lightweight Docker containers and cross-browser compatibility testing on real machines.
As the tagline “Testing Automation for Agile and Remote Teams” indicates, we are aiming for a product that supports a development style where everyone is involved in test automation regardless of their skill set.
Today, I am going to talk about testability. First, I’ll briefly explain the concept of testability. Next, I’d like to talk about how we can safely add tests to code with low testability. Finally, I’ll talk about how to increase testability at the E2E level.
I like both testing and software development. When I’m programming, I focus on details. I think about if the logic is complete, if it affects other components, if there’s a lot of complexity, if it unnecessarily locks the database.
When writing code, I’m focused on the code in front of me, so my perspective becomes narrow.
The big picture, such as if the user will like the implementation, is not on my mind. And when there’s a bug after a release, it’s usually because I failed to look at the big picture.
This is why I like testing; it gives me confidence in the code that I’ve written. I write unit tests for minor concerns while coding.
For major concerns that affect the user, I test with the actual UI. Or I can ask someone for their opinion, or write automated E2E tests.
Sometimes, though, I end up interacting with the UI to test something that is a ‘minor concern’.
This happens, for example, when a function or class isn’t properly broken down, or when it can only be tested in a fully built state due to runtime environment issues.
In those cases, the experience is not good at all. It’s time-consuming, I have to constantly pay attention to the session and test data, and there is a huge number of combinations.
This is when I always think about testability.
When testing feels like this, it means the system has low testability.
When you have to interact with the UI for a minor concern, it means that there’s no operating point to check it, or that there is only a limited number of ways to check it. If you want to test a single if statement, is there any reason you need to register data through the real UI? By narrowing the focus, using a more compact API instead of the UI, and using stubs instead of a real database, you can easily run the desired test.
One of the things that I like about testability is that it is one of the quality characteristics. Everyone thinks that quality will improve if you perform more tests.
However, this is incorrect. Testing is a means to obtain information about the quality of the product, and we need to obtain a lot of information quickly. If the test time increases linearly as the product grows, the quality of the software will decline. A linear increase might still be OK. Most software gets more and more complex, so the test time would grow exponentially!
Anyway, testability is a part of quality. I believe that increasing testability means reducing test time and laying the groundwork for more tests. Instead of blindly conducting many tests, I like making things easier to test. That’s testability. That’s quality.
As I mentioned earlier, I work in a team that makes a platform for E2E testing. Have you tried automating E2E tests? It’s a lot of work!
You have to prepare test data, make test scenarios that do not affect other tests, execute in parallel to shorten execution time, deal with the browser if it’s a web application, deal with the mobile device if it’s a mobile application…
There are a lot of tasks.
However, I think the most tedious part is that everything needs to be done from the UI.
It’s easy to make it testable in unit testing.
For example, by passing a date object as an argument, you can easily perform a test that involves a date. In an E2E test, however, this is difficult because you can’t manipulate the server time!
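As a minimal sketch of the unit-level case, the function below takes the current time as an argument instead of reading the system clock. The campaign period and the name `isCampaignActive` are made up for illustration.

```javascript
// Hypothetical example: a campaign that runs during April 2020 (UTC).
const CAMPAIGN_START = new Date("2020-04-01T00:00:00Z");
const CAMPAIGN_END = new Date("2020-05-01T00:00:00Z");

// `now` is passed in rather than read from the clock inside the function,
// so a test can choose any moment it likes.
function isCampaignActive(now = new Date()) {
  return now >= CAMPAIGN_START && now < CAMPAIGN_END;
}

// A test simply passes the dates it cares about.
console.log(isCampaignActive(new Date("2020-04-15T12:00:00Z"))); // inside the period
console.log(isCampaignActive(new Date("2020-05-01T00:00:00Z"))); // the moment it ends
```

In production the default argument supplies the real time, so callers do not have to change.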
To give another example, in unit testing, when a function does too many things, you can split it into smaller functions and make it easier to test. But you can’t do that with E2E tests. The UI is already split into individual parts for usability, so you can’t divide them any further. It would not make sense to make your application harder to use just for the sake of testing it!
In other words, there are only a few testability characteristics that can be used in E2E testing. We can’t expect testability at the code level at all, and testability at the system level is too tightly coupled to the features so it’s difficult to use.
By the way… do you know what we call an application that has low testability and struggles under a large number of E2E tests?
It’s called the ice cream cone.
The ice cream cone is an anti-pattern that goes against the best practice of the test pyramid. In contrast to the original test pyramid, the ice cream cone has only a few unit and integration tests and far too many E2E tests. In most cases, those E2E tests are done manually.
This figure, the ice cream cone, is very well known, so I’m sure you are all aware of it. And every now and then we do come across products that look just like this ice cream.
I don’t mean to complain about any of this. The key point is whether this ice cream is creating value.
I love ice cream!
Let’s take another look at the ice cream cone. Why do we need so much manual testing?
The most important reason is that the ice cream is the value of the product.
Nobody will test something unnecessary. If a test is necessary, it means there is value for someone.
In a time when things change very rapidly, adding new flavors, that is, new functions, is a top priority… Even if the code is dirty! Also, to protect the parts that matter most to users, we must perform a lot of E2E testing.
So I think performing a lot of E2E testing is a good thing in itself.
But why does this happen?
For one, it’s because the people conducting the tests don’t fully understand testing…
I experienced this a long time ago. No one on our team knew much about unit testing tools like xUnit, nor did we perform manual unit testing… We didn’t know anything other than system testing, and we thought the only operating point for testing was the UI.
Developers, product managers, and of course customers: no one knew unit testing even existed. When we said ‘testing,’ we meant acceptance testing. A developer used to say ‘we’ve fully completed the monkey test’ and release with full confidence. Can you believe it?!
Another reason is that we focus on making ice cream early on in the project. In theory, it seems OK to start with “dirty code that works” as long as you write the test code. You can clean up that dirty code later while checking that it still satisfies the requirements. Yet there are so many cases where only manual testing is performed, and often only from the UI!
This technical debt probably won’t be repaid later, because there’s no simple way to verify that the requirements are still met. By a simple way, I mean automated testing!
However, E2E testing cannot realistically cover everything. As I explained earlier, only a limited number of testability characteristics can be used in E2E testing. E2E testing is slow, unstable, and difficult to maintain. So there are a lot of technical constraints that we must clear if we want to perform many tests.
So here is my proposal. Let’s focus on the items that really need to be tested E2E. Just because a test is performed at the E2E level doesn’t mean it is focused on value for the customer, and we should put less emphasis on the ones that aren’t.
To use the ice cream analogy, I mean we should have a bigger cone.
Let’s take a very simple registration form as an example. Among these items, the only one that has to be tested in E2E is number 5. E2E testing may not be necessary for any of the other items.
When code-level testability is poor and manual testing from the UI is at the heart of the test, various concerns are mixed in one layer. Let’s break down those concerns, and make tests simpler!
First, I’ll show you a simple way to test without adding new control points. Many applications use a client/server model. We do the visual work on the client-side, and we do work related to domain logic on the server-side. The client and server communicate with each other.
Where communication occurs, it may be used as a control point. If we can break the link between front-end and back-end and treat them separately, testing would be simple.
Many developers believe that UI testing can only be done with E2E or it won’t make sense. In a sense, this is correct. The UI often only makes sense if it is linked to the back-end. Also, ultimately, it’s often necessary to perform similar testing with E2E.
However, it is still important to test the UI alone. E2E testing is difficult because you have to prepare test data and manage the state. I don’t want to think that a test fails because of the effect of another test.
Manipulating the UI often has side effects… That is, an update operation to the database occurs. Would you start by making test data every time, to perform a UI test?
Still, testing the UI separately from the back-end is difficult. You have to write a lot of back-end mocks, and the mocks must track back-end changes.
Here’s one realistic compromise. The test is done in E2E, while the back-end requests use a mock.
By using tools such as Mock-Server and Polly.JS, it is possible to replace the response from a specific back-end endpoint with a mock.
At this point, be careful when selecting the library. A library for “complete” E2E testing, like Selenium, is slow and can only perform operations from a user’s perspective. Use lightweight libraries such as Puppeteer and Cypress as much as possible.
If you don’t want to use different notations for UI and E2E tests, it’s a good idea to use a wrapper around each library. CodeceptJS is an excellent library for operating various automation drivers with a single API. CodeceptJS also has a good plugin for using Polly.JS, so you can perform E2E testing with mocks just like that.
What are the benefits of using a mock? Let’s take an example for the case ‘when part of the back-end server does not return a response.’
Here’s a small example. This is Google’s search form, as you know. Let’s use a mock to test that “the UI should not break even if the suggest API doesn’t respond”!
Take a look at the line with I.mockRequest. This is the code for mocking. It overrides the request to a specific API and always returns a 404 response. If you check Chrome’s developer tools, you can see that a 404 is returned. You can confirm that the UI does not break even if the suggestion API does not work.
Although mocking is also useful for testing normal cases, I often use it to check error cases, as I just explained.
You can mock responses not only from your own services but also from external services. For example, when I was involved in an e-commerce project, our site’s front-end once completely stopped working when the external zip code search API used in that project went down. By mocking the external API and returning an arbitrary response, cases like this can easily be tested.
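The idea of checking error cases with a mocked back-end can be sketched without any particular tool. The example below does not use Polly.JS or I.mockRequest; it swaps in hand-written stand-ins for the network call, and the function name `loadSuggestions` and the `/suggest` path are assumptions made up for this illustration.

```javascript
// Hypothetical front-end logic: ask a suggest API and degrade gracefully on failure.
// `fetchImpl` is injected so that a test can replace the real network call.
async function loadSuggestions(query, fetchImpl) {
  try {
    const response = await fetchImpl("/suggest?q=" + encodeURIComponent(query));
    if (!response.ok) {
      return []; // e.g. 404 or 500: show no suggestions, keep the form usable
    }
    return await response.json();
  } catch (error) {
    return []; // the API did not respond at all
  }
}

// Hand-written mocks standing in for the back-end.
const healthyApi = async () => ({
  ok: true,
  status: 200,
  json: async () => ["testability", "test pyramid"],
});
const notFoundApi = async () => ({ ok: false, status: 404, json: async () => ({}) });
const unreachableApi = async () => {
  throw new Error("connection refused");
};

(async () => {
  console.log(await loadSuggestions("test", healthyApi)); // the two suggestions
  console.log(await loadSuggestions("test", notFoundApi)); // empty list, no crash
  console.log(await loadSuggestions("test", unreachableApi)); // empty list, no crash
})();
```

A tool such as Polly.JS does the same substitution at the browser’s network layer, so the application code does not need an injected parameter.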
Once you’ve successfully separated your UI tests, the next step is the back-end. Let’s test the API! The API tests mentioned here include not only those provided for end-users but also those used internally.
If your project uses a web application framework such as Rails or Laravel, it has functions for API testing, so you can use those. With Laravel, it’s FeatureTest, and with Rails, it’s RequestSpec. By using the test library built into the framework, you can skip processes such as login and focus on the input and output of the API. These tests are also easy to manage.
If this is not possible, you can test the API by using tools such as Postman, Karate, and Tavern. They are all well known, so many of you may have heard of them or used them. They are all great tools. Karate is an all-in-one tool: it can be used to create back-end test doubles, and by combining it with a load testing tool called Gatling, you can also perform load testing. Tavern focuses on writing simple automated API tests in YAML. Postman can be used for automated testing, but I think it’s more suitable for manual testing.
By gradually working your way through from the top like this, it will be possible to automate tests that have previously been difficult to automate.
Just to make sure there is no misunderstanding, I’m not saying that all E2E tests should be replaced like this! What I mean is that you should check minor concerns at lower layers, and focus on major concerns with E2E testing, such as use cases and integration with external systems. If your existing tests already focus on these, they don’t need to change.
Lowering the test level does not necessarily mean that the system requires special changes. It’s possible to use existing interfaces to add useful automated tests. Now that we’ve added automated tests for those interfaces, we are ready for refactoring… Now, let’s add new interfaces and increase the integration and unit tests in the layers below. Coming back to the ice cream analogy, imagine turning the cone into a cup.
If your application doesn’t follow an architecture such as MVC, MVVM, or Clean Architecture, consider adopting one. These are good ways to separate development and testing concerns. MVVM, for example, is a good way to separate domain concerns from UI concerns. The UI sometimes becomes too complex to test. Let’s separate the UI logic, presentation logic, and business logic, and then testing becomes easier!
All tests had to be done from UI when it was only Model and View, but by sandwiching ViewModel between them, you can test the presentation logic with ViewModel.
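To make this concrete, here is a minimal sketch of a ViewModel for a sign-up form. The class name, fields, and validation rules are hypothetical. The point is only that the presentation logic lives in a plain object that can be tested without rendering anything.

```javascript
// Hypothetical ViewModel for a sign-up form. It knows nothing about the DOM.
class SignUpViewModel {
  constructor() {
    this.username = "";
    this.password = "";
  }

  // Presentation logic: which message should the View show next to each field?
  get usernameError() {
    if (this.username.length === 0) return "Username is required";
    if (!/^[a-z0-9_]+$/i.test(this.username)) return "Use letters, digits, or underscores";
    return null;
  }

  get passwordError() {
    return this.password.length >= 8 ? null : "Password must be at least 8 characters";
  }

  // Presentation logic: should the submit button be enabled?
  get canSubmit() {
    return this.usernameError === null && this.passwordError === null;
  }
}

// The View would bind to these properties; a test can read them directly.
const form = new SignUpViewModel();
form.username = "takuya suemura"; // contains a space
form.password = "short";
console.log(form.usernameError, "/", form.passwordError, "/", form.canSubmit);

form.username = "takuyasuemura";
form.password = "long-enough-password";
console.log(form.canSubmit);
```

With this split, only the wiring between the View and the ViewModel is left for a test through the UI.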
Next, let’s look at testing UI components. What is UI component testing? For example, if you are using a UI framework such as React or Vue, components are tested without connecting to the back-end. All back-end responses are mocked, and the tests run on a lightweight browser implementation such as JSDOM rather than a real browser.
By enriching the lower layers, you will be able to test more of the upper layers.
You can focus on important tests by finding and removing trivial bugs in the lower layers. That is also one aspect of testability.
Next, I would like to talk about how to test more of the ice cream, that is, how to make E2E testing itself easier. As I explained before, the biggest reason that E2E testing is so hard is that you have to do everything through the UI. So let’s start by thinking about how to operate the UI more easily.
One of the most difficult things about E2E testing is creating test data. I recommend automating routine operations to create test data or preparing an API for testing. There are several ways to automate routine operations to create test data. For example, some record and playback browser extensions can be used for this purpose, such as WildFire and iMacros.
Sometimes you may want to reuse your automation code during a manual test. CodeceptJS, which I introduced earlier, can support those purposes.
In addition to general commands such as click and input text, you can also execute high-level commands that you define. A high-level command is a set of steps you often use in automated tests. If you enter one from the interactive shell, CodeceptJS runs those steps for you.
In this example, high-level commands such as loginAs and addAllItemToCart are already defined. These commands consist of ordinary clicks and inputs.
The commands defined here can be used in both automated and manual tests, so commands created for manual testing can be reused later for automated testing. When you want to check the behavior of cases that are not covered by the automated tests, you can conveniently do the first half automatically and the other half manually, without having to operate everything from scratch.
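The way high-level commands are built from ordinary steps can be sketched in plain JavaScript. The object below is only a stand-in for the CodeceptJS `I` object: it records steps instead of driving a browser, and the URLs, field labels, and button texts are assumptions for illustration.

```javascript
// A recording stand-in for the CodeceptJS actor `I` (an assumption of this sketch).
function createRecordingActor() {
  const steps = [];
  return {
    steps,
    // Ordinary, low-level steps.
    amOnPage(url) { steps.push(`open ${url}`); },
    fillField(label, value) { steps.push(`fill "${label}" with "${value}"`); },
    click(text) { steps.push(`click "${text}"`); },

    // High-level commands: nothing more than sequences of ordinary steps.
    loginAs(user) {
      this.amOnPage("/login");
      this.fillField("Email", user.email);
      this.fillField("Password", user.password);
      this.click("Log in");
    },
    addAllItemToCart(itemNames) {
      for (const name of itemNames) {
        this.amOnPage(`/items/${encodeURIComponent(name)}`);
        this.click("Add to cart");
      }
    },
  };
}

// The same two commands could be typed in an interactive session or called from a test.
const I = createRecordingActor();
I.loginAs({ email: "tester@example.com", password: "secret" });
I.addAllItemToCart(["coffee", "tea"]);
console.log(I.steps.join("\n"));
```

Because the high-level commands contain no assertions, they are equally usable as setup for a manual check and as the first half of an automated scenario.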
Preparing an API for creating test data is also convenient. Registering a large number of new users for testing can be extremely tedious! If you have a script that automatically registers users with the required pattern, you can easily do this process.
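A script like that can be quite small. The sketch below only builds the user records; sending them to a registration endpoint is left out because that part depends on your application. The field names, the `plan` attribute, and the naming pattern are all hypothetical.

```javascript
// Build `count` test users that follow a predictable naming pattern.
// All option names and defaults are assumptions for this sketch.
function buildTestUsers(count, { prefix = "testuser", domain = "example.com", plan = "free" } = {}) {
  return Array.from({ length: count }, (_, index) => {
    const serial = String(index + 1).padStart(3, "0"); // 001, 002, ...
    return {
      username: `${prefix}${serial}`,
      email: `${prefix}${serial}@${domain}`,
      plan,
    };
  });
}

// Three paying users for a billing test.
const users = buildTestUsers(3, { prefix: "billing", plan: "premium" });
console.log(users);
```

A predictable pattern also makes cleanup easier, since everything that matches the prefix can be deleted after the test run.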
Even if there is an API, it’s troublesome to call the REST APIs of various back-end servers every time you perform a test! My recommended tool is n8n. Have you ever used process automation tools like IFTTT or Zapier? n8n is an open-source alternative to them. The advantage of n8n is that you can use the Execute Command node to run various commands on the server. You can even perform complex processing that IFTTT and Zapier cannot do.
Now, the UI is not the only point of contact between the system and the user… Typically, email and SMS are too. Most of the time, both are used in parts that are critical for the user. Make sure to use these services proactively to test important transactions such as membership registrations and purchase confirmations.
In particular, it’s very important not to test with a real mail server in a test environment. This is to prevent accidents where emails are sent to real users.
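One simple way to keep a real mail server out of the test environment is to put mail sending behind a small interface and swap in a fake that only stores messages. The sketch below is an illustration under that assumption; `InMemoryMailer`, `registerMember`, and the URL are invented names, not part of any tool mentioned in this talk.

```javascript
// A fake mailer for test environments: it keeps messages in memory and sends nothing.
class InMemoryMailer {
  constructor() {
    this.outbox = [];
  }

  send(message) {
    this.outbox.push(message);
  }

  // Return the most recent message addressed to `address`, or null if there is none.
  lastMessageTo(address) {
    const matches = this.outbox.filter((message) => message.to === address);
    return matches.length > 0 ? matches[matches.length - 1] : null;
  }
}

// Hypothetical application code: registration sends a confirmation email.
function registerMember(email, token, mailer) {
  mailer.send({
    to: email,
    subject: "Please confirm your registration",
    body: `Open https://shop.example/confirm?token=${token} to finish signing up.`,
  });
}

// The test can now inspect the email as another point of contact with the user.
const mailer = new InMemoryMailer();
registerMember("alice@example.com", "abc123", mailer);
const mail = mailer.lastMessageTo("alice@example.com");
console.log(mail.subject);
console.log(mail.body.includes("token=abc123"));
```

Since nothing leaves the process, there is no way for a test run to reach a real user’s inbox.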
Next, I would like to talk about Automatability, especially in E2E.
Automatability is very important. The original test pyramid suggests implementing only a few E2E tests. The reason is that the automatability of tests at the system level was low, and there was no technology to support it.
However, there are now many tools that improve and support automatability, so I would like to introduce them to you.
One of the important factors when talking about testability in E2E testing is the locator. A locator is a key for identifying the element to be tested. In web applications, for example, IDs, classes, accessibility IDs, and so on are used.
Until now, locators have not been very useful in the context of E2E. After all, if you specify an element with something that the user cannot see, such as an ID or class, what is the test really checking? If the test breaks just because an ID changed, does it give users any benefit?
CodeceptJS, which I’ve talked about many times already, has a feature called semantic locators. Using this feature, you can write tests like this: I.fillField('username', 'takuyasuemura'); I.click('Sign up for Github');
Writing tests like this has several benefits. The first is that locating elements is decoupled from the developer’s concerns. Developers no longer have to worry about whether changing an element’s attributes will break the E2E test code.
Another benefit is that the test can detect when the user unintentionally loses the means to find an element. To give a specific example, if the default display language changed from English to Japanese, most people would not be able to find the Submit button.
By searching for an element using the text that is displayed, the test finds elements in a way that closely matches how a user thinks.
When a user searches for an element, they don’t just search by words. Humans are good at understanding structure, so we search for the element based on our understanding of the UI’s structure.
Coming back to CodeceptJS, my favorite syntax is ‘within.’ Within provides us with a means to search ‘an element within an element.’
And since you can use the wording when searching for the parent element, you can implement an operation to ‘click A in the modal dialog with the character string B’ simply and semantically. This is great!
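As a rough illustration of searching for ‘an element within an element’, the sketch below walks a small tree that stands in for the page. It is not the CodeceptJS implementation; `findWithin`, the tree shape, and the ids are assumptions made for this example.

```javascript
// A simplified page: the same "Delete" label appears in the list and in a dialog.
const screen = {
  tag: "body",
  children: [
    { tag: "button", id: "delete-row", text: "Delete" },
    {
      tag: "div",
      role: "dialog",
      children: [
        { tag: "h2", text: "Are you sure?" },
        { tag: "button", id: "cancel", text: "Cancel" },
        { tag: "button", id: "confirm-delete", text: "Delete" },
      ],
    },
  ],
};

// Depth-first traversal over a node and all of its descendants.
function* walk(node) {
  yield node;
  for (const child of node.children || []) {
    yield* walk(child);
  }
}

// Find a button labelled `targetText` inside the dialog that shows `dialogText`.
function findWithin(root, dialogText, targetText) {
  for (const container of walk(root)) {
    if (container.role !== "dialog") continue;
    const nodes = [...walk(container)];
    if (!nodes.some((node) => node.text === dialogText)) continue;
    const target = nodes.find((node) => node.tag === "button" && node.text === targetText);
    if (target) return target;
  }
  return null;
}

console.log(findWithin(screen, "Are you sure?", "Delete").id);
```

Scoping the search to the dialog is what separates the two "Delete" buttons, which is how a person reading the screen would tell them apart as well.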
A mechanism called Fallback Locator which was recently introduced in Selenium IDE is also useful. This mechanism gets multiple locators that represent the elements that were recorded when creating the scenario and searches for an element that matches one of them at runtime. It identifies elements in multiple ways without relying on a single locator, increasing the robustness of automated testing.
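The fallback idea itself is easy to sketch: keep several recorded locators and try them in order until one matches. The code below illustrates that idea only and is not Selenium IDE’s implementation; the element shape and the locator format are assumptions.

```javascript
// Locators recorded for one element when the scenario was created (made-up values).
const recordedLocators = [
  { by: "id", value: "submit-btn" },
  { by: "name", value: "commit" },
  { by: "text", value: "Sign up" },
];

// Try each locator in order and report which one finally matched.
function locateWithFallback(elements, locators) {
  for (const { by, value } of locators) {
    const element = elements.find((el) => el[by] === value);
    if (element) {
      return { element, usedLocator: by };
    }
  }
  return null; // none of the recorded locators matched
}

// Since recording, a developer renamed the id. The name and text are unchanged.
const currentPage = [
  { tag: "a", id: "login-link", text: "Log in" },
  { tag: "button", id: "signup-button", name: "commit", text: "Sign up" },
];

const result = locateWithFallback(currentPage, recordedLocators);
console.log(result.usedLocator, result.element.id);
```

Reporting which locator was used is a deliberate choice here: a test that passed only through a fallback is a hint that the scenario should be re-recorded soon.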
Next, I would like to talk about the effect that AI-based test tools have on E2E-level automatability.
As I said at the beginning, I am involved in the development of an AI-based E2E test tool at a company called Autify. To briefly explain how Autify acquires elements, it uses an algorithm that combines the Semantic Locator and Fallback Locator ideas that I just talked about. Autify records the various characteristics of elements when a scenario is recorded, calculates the degree of agreement for each candidate, and selects the one with the highest agreement. This allows you to stabilize the test without having to add special markers to the website you are testing.
Finally, I would like to talk about what can be gained by increasing testability.
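To give a feel for ‘select the one with the highest agreement’, here is a deliberately naive sketch. It is not Autify’s actual algorithm, which is not described in this talk; the attribute set and the equal weighting of every attribute are assumptions made purely for illustration.

```javascript
// Fraction of recorded characteristics that a candidate element still matches.
function agreement(recorded, candidate) {
  const keys = Object.keys(recorded);
  const matching = keys.filter((key) => recorded[key] === candidate[key]).length;
  return matching / keys.length;
}

// Pick the candidate with the highest agreement score.
function pickBestMatch(recorded, candidates) {
  let best = { element: null, score: -1 };
  for (const candidate of candidates) {
    const score = agreement(recorded, candidate);
    if (score > best.score) {
      best = { element: candidate, score };
    }
  }
  return best;
}

// Characteristics captured at recording time (made-up values).
const recordedElement = { tag: "button", id: "signup", text: "Sign up", className: "btn primary" };

// The page at run time: the id changed, everything else about the button is the same.
const candidates = [
  { tag: "a", id: "login", text: "Log in", className: "link" },
  { tag: "button", id: "register", text: "Sign up", className: "btn primary" },
];

const best = pickBestMatch(recordedElement, candidates);
console.log(best.element.id, best.score);
```

A real tool would presumably weight characteristics differently and reject matches below some threshold, but the shape of the decision is the same.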
I don’t like the picture of the original test pyramid. It is ideal, but it’s a very developer-centric way of thinking, and it doesn’t show any value to the user. So when I first saw the ice cream cone pattern, I thought it was a very good analogy! Regardless of the way they do it, this team is doing what they need to do.
However, they could perform a more balanced test if they took a slightly more sophisticated approach. And by increasing the testability of the entire system, more tests can be done… It doesn’t necessarily mean doing less with E2E!
My colleague says that “E2E testing is slow and expensive, but it gives us confidence.” I thought it was a very good way to think. Of course, it’s important to find bugs in E2E that can only happen when everything is combined, but I think it’s more important that we can be confident in our products.
Let’s test whether ‘our product will give a better experience to our customers.’ That is what the ice cream really is.
Testing is not something we do just to find bugs. If we want to explore the value that a product gives to users but all we ever find is defects, we will never have the time to think about functions that have a higher value.
This is what I really wanted to talk about today. The higher the internal quality of the system and the quality of the code, the more E2E testing you can do… Manually and automatically! This may be a stepping stone not only for finding bugs but also for exploring new features or discovering use cases that you had never imagined.
Here is a summary of what I talked about today.
A cup can hold more ice cream than a cone. If you have a large cup, you can put more ice cream in it.
I hope you can bring many flavors to the world. Thank you for listening!