The Evolution of Browser Automation: Christian Bromann [Testμ 2022]

Have you been curious about browser automation? Christian Bromann, Founding Engineer at Stateful Inc., is here to share pearls of wisdom on the topic, with Manoj Kumar, VP of Developer Relations, hosting the session.

He started by explaining that there are plenty of misconceptions about how browser automation works. According to him, when an automation problem comes up, you should know exactly where the error lies so that you can fix your app or your test script and have the result reported back in the testing framework.

He took us down memory lane to explain what has happened over the past 18 years and how massively automation testing frameworks have changed.

He started with how Jason Huggins created Selenium. Then he spoke about how Simon Stewart, Software Engineer at Apple, created WebDriver. Selenium originally ran inside the browser, which caused issues with cross-origin policies and with automating anything around the browser. Hence Simon and Jason collaborated to combine the two projects into Selenium WebDriver.

Maintaining a separate driver for each browser is not easy. Hence the W3C started to specify the WebDriver protocol. Soon, WebdriverIO was released to address the limitations of Selenium and WebDriver. Protractor and the WebDriverJS bindings were also released.

Christian spoke about how there has always been a lack of testing capabilities. This limitation paved the way for more tools like Cypress, Puppeteer, and Playwright. He noted that in 2018, WebDriver became a W3C Recommendation. He also explained that the WebDriver protocol has been in development for quite a while and that its latest version is called WebDriver BiDi.

Types of browser automation tools

There are two types of tools for browser automation, according to Christian Bromann. Here are the snippets he shared:

Conventional tools: Tools like Selenium, Nightwatch, WebdriverIO.

  • Built based on W3C WebDriver specifications.

  • Supports cross browser testing automation.

  • Offers browser and mobile support.

  • Limited automation capabilities.

  • Open-governed open-source projects with a long history and large communities.

  • All the conventional tools are open-sourced.

Non-standard tools: Tools like Cypress and Puppeteer.

  • Custom tools with non-standardized automation features, built based on browser APIs or JavaScript emulation.

  • Limited cross browser support.

  • More automation capabilities and developer focus.

  • Company-backed and company-governed open-source projects.

For example,

  • Cypress.io is governed by Cypress.

  • TestCafe is governed by DevExpress.

  • Playwright is governed by Microsoft.

  • Puppeteer is governed by Google.

Christian also spoke elaborately on browser automation strategies we need to remember.

According to Christian, Cypress.io uses web and browser APIs, Puppeteer depends on browser APIs, Playwright on custom or modified browser APIs, Selenium on the WebDriver protocol, and TestCafe on web APIs.

“While all these tools perform the same task, they are quite different if you look into them in detail,” he revealed.

Ways to automate a browser

Christian divides the browser automation techniques into three:

  • Web APIs.

  • Browser APIs.

  • WebDriver Protocol.

He compared and contrasted the three of them:

Web APIs:

  • 1st generation of automating browsers.

  • Provides complete control of the execution environment.

  • Automation commands are mostly emulated.

  • It has limitations such as no switching windows and no cross-origin or iframe support.

Browser APIs:

  • 2nd generation of automating browsers.

  • Available in browser engines like Chromium, Gecko, or WebKit.

  • Accessible in only Chromium and Firefox.

  • Designed for debugging purposes.

WebDriver Protocol:

  • 3rd generation of automating browsers.

  • Official web standard developed at the W3C by all browser vendors.

  • Thoroughly tested as a part of the web platform test suite.

  • Limited capabilities designed to automate from the user’s point of view.

Christian spoke about how web APIs have been around since 2004 and are not as supportive of browser automation as their successors, even though tools like Cypress and TestCafe use them.

Similarly, browser APIs work differently from one browser version to another, with no backward compatibility. For example, you would have to update Puppeteer every time there is a new Chrome release.

It’s also challenging to automate three browsers simultaneously since each of them speaks a different language. All such factors led to the birth of the WebDriver protocol, which ensures a consistent automation experience. It’s like having ten assistants to get your automation job done.

With this approach, as per Christian, you can just as easily automate a browser factory, a mobile factory, or an iOS factory.

“Let’s say you run a browser factory. You need to have an assistant that understands the inner workings of a browser, be it Chrome or Safari. If you have a mobile factory, you have an assistant that understands iOS or Android. So, the idea of having this translator for executing a trivial command like a click on a button in a complex user agent like a browser works pretty well now,” he said.
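To make the translator idea concrete, here is a minimal Python sketch of the HTTP requests a client sends to a driver under the W3C WebDriver protocol. The endpoint paths and body shapes follow the specification; the `WebDriverRequest` class, the helper function names, and the ids `session-1` and `element-1` are illustrative assumptions, not something shown in the talk or taken from a particular client library. Nothing is sent over the network; the sketch only builds the requests.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class WebDriverRequest:
    """One HTTP request a client would send to a WebDriver-compatible driver."""
    method: str
    path: str
    body: dict


def new_session(browser_name: str) -> WebDriverRequest:
    # W3C "New Session": capabilities are negotiated through alwaysMatch.
    return WebDriverRequest(
        "POST",
        "/session",
        {"capabilities": {"alwaysMatch": {"browserName": browser_name}}},
    )


def find_element(session_id: str, css_selector: str) -> WebDriverRequest:
    # W3C "Find Element" using the "css selector" location strategy.
    return WebDriverRequest(
        "POST",
        f"/session/{session_id}/element",
        {"using": "css selector", "value": css_selector},
    )


def click_element(session_id: str, element_id: str) -> WebDriverRequest:
    # W3C "Element Click": the request body is an empty JSON object.
    return WebDriverRequest(
        "POST",
        f"/session/{session_id}/element/{element_id}/click",
        {},
    )


if __name__ == "__main__":
    # Placeholder ids; a real driver returns its own session and element ids.
    for request in (
        new_session("chrome"),
        find_element("session-1", "button[type=submit]"),
        click_element("session-1", "element-1"),
    ):
        print(request.method, request.path, json.dumps(request.body))
```

The same three requests work against any driver that implements the standard, which is the consistency the factory analogy describes: the client speaks one language and each driver translates it for its own browser.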

With WebDriver BiDi, Christian explained, you can send a number of commands at the same time to multiple factories. He also explained the important features that set it apart.

Christian said that you can also monkey patch specific browser and web APIs to have more control over the application, and that the protocol is backward compatible. He is optimistic that this new protocol can change many things in the future. He explained how you could send over 10,000 commands at once, whereas the current protocols limit how many commands you can send.
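The bidirectional part can be illustrated with a small Python sketch. WebDriver BiDi exchanges JSON messages over a WebSocket: commands carry an `id`, a `method`, and `params`, and the browser may push events at any time, in any order relative to command responses. The `BiDiMessages` class below is a hypothetical helper written for this recap, not part of any library; it performs no network I/O, the message shapes follow the current specification draft as far as I understand it, and the event payload is simplified.

```python
import itertools
import json


class BiDiMessages:
    """Builds BiDi commands and sorts incoming messages (hypothetical helper)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.pending = {}  # command id -> method still waiting for a response
        self.results = {}  # command id -> result payload of a successful command
        self.errors = {}   # command id -> error code of a failed command
        self.events = []   # (method, params) tuples pushed by the browser

    def command(self, method: str, params: dict) -> str:
        # Every command gets a unique id so its response can be matched later.
        command_id = next(self._ids)
        self.pending[command_id] = method
        return json.dumps({"id": command_id, "method": method, "params": params})

    def receive(self, raw: str) -> None:
        message = json.loads(raw)
        kind = message.get("type")
        if kind == "event":
            # Events carry no id; the browser sends them unprompted.
            self.events.append((message["method"], message["params"]))
        elif kind == "success":
            self.pending.pop(message["id"], None)
            self.results[message["id"]] = message["result"]
        elif kind == "error":
            self.pending.pop(message["id"], None)
            self.errors[message["id"]] = message["error"]


if __name__ == "__main__":
    client = BiDiMessages()
    # Two commands are queued without waiting for the first response.
    print(client.command("session.subscribe", {"events": ["log.entryAdded"]}))
    print(client.command("browsingContext.navigate",
                         {"context": "ctx-1", "url": "https://example.com"}))
    # Simulated incoming traffic: an event arrives before the second response.
    client.receive(json.dumps({"type": "success", "id": 1, "result": {}}))
    client.receive(json.dumps({"type": "event", "method": "log.entryAdded",
                               "params": {"level": "error", "text": "boom"}}))
    client.receive(json.dumps({"type": "success", "id": 2, "result": {}}))
    print(client.events, client.pending)
```

Because responses are matched by `id`, a client does not have to wait for one command to finish before sending the next, which is what makes sending many commands at once practical.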

He then said that he wishes to see the WebDriver protocol succeed but also roots for tools like Cypress and Playwright, since they take automation to the next level and provide a fantastic user experience.

This is what he believes will happen in the upcoming days:

He shared his excitement about different web standards, for instance, the web authentication standard, which allows you to create virtual authenticators in WebDriver. He ended the conversation with a quote from Maya and James, who have worked on the Firefox browser.

This is what they have said: “Automation solutions based on a proprietary protocol will always be limited in the range of browsers they can support. The success of the web is built on multi-vendor standards. Hence, it’s important that test tools build on standards and that tests work across all browsers and devices, and the web works with that.”

In the end, our host, Manoj Kumar, put a few questions from the attendees to him. He answered them with a willingness to help the community:

A lot of enterprises have their in-house grid setup. You must have seen a lot of implementations. What should those companies or test architects do?

Christian: Assuming that they have Docker instances or Kubernetes instances, I think if they use Selenium Grid to manage the browsers and devices, they are well set up because Selenium Grid will be able to support the new BiDi protocol. It can become a problem only for cloud vendors, when they send many socket messages across the globe.

How important is multi-language support when picking a browser automation tool?

Christian: I believe the tool needs to suit the team that uses it, especially if you have a pool of QA engineers with knowledge of different languages. If they are well versed only in PHP and Java, then they should use a framework that is built in Java. We see more and more JavaScript frameworks popping up. That’s when we deploy the whole stack where we write a backend in Java. We don’t even need the backend that much anymore. It all depends on what your team needs.

Is there or will there be any time soon the possibility of having “extensions” (like Chrome extensions) in these tools available to be part of the browser context when running automated tests?

Christian: Yes, you can already have a Chrome extension installed while running tests on it, even though I’m not sure about the other browsers. There is indeed a need in the industry.
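For Chrome, one way this can look at the protocol level is sketched below in Python. ChromeDriver accepts an `extensions` list under the vendor capability `goog:chromeOptions`, where each entry is a base64-encoded packed extension (`.crx`). The function name and the placeholder bytes are assumptions made for illustration; a real run would read the bytes from an actual `.crx` file.

```python
import base64


def chrome_capabilities_with_extension(crx_bytes: bytes) -> dict:
    """Build a New Session body that asks ChromeDriver to install one extension."""
    # ChromeDriver expects each packed extension as a base64 string.
    encoded = base64.b64encode(crx_bytes).decode("ascii")
    return {
        "capabilities": {
            "alwaysMatch": {
                "browserName": "chrome",
                "goog:chromeOptions": {"extensions": [encoded]},
            }
        }
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for the contents of a real .crx file.
    body = chrome_capabilities_with_extension(b"placeholder-crx-bytes")
    print(body["capabilities"]["alwaysMatch"]["goog:chromeOptions"].keys())
```

Other browsers expose their own vendor-specific mechanisms, which matches Christian's caveat that support outside Chrome is less certain.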

Which one should you prefer over the other: W3C JSON protocol or tools developed to utilize DevTools of browsers?

Christian: It depends on your automation needs, and based on that, you can pick the automation engine that works best for you. If you still run on the JSON Wire Protocol, you need to update since you might not be able to run tests on the browser anymore. The reason is that the protocol has been deprecated for a long time. You can choose depending on the preferences of your front-end developers and other team members.

We cordially thank Christian for this amazing speech! Hope you found the insights shared over here useful!

Happy testing!