Performance/Synthetic testing/Run a test

There are two ways to run performance tests using QTE hardware:

  • You can monitor the performance of a page over time (and add alerts to fire if the performance changes).
  • You can run one-off tests to measure the difference in performance between two different implementations.

If you are unsure how you should run your test, please ask on the #talk-to-qte Slack channel.

Monitor the performance over time

There are five test servers that run performance tests continuously. To add your page or user journey for testing, you commit it to the Git repository. The test servers will then pick up the new page/journey and run the tests on the next iteration. The metrics from the tests are sent to Graphite and you can see the data in our synthetic test dashboard in Grafana. Use the monitoring when you want alerts on performance regressions.

When you test a single page, you can run the test directly against the Wikipedia servers or through WebPageReplay. If you are focusing on front end performance, WebPageReplay is preferred since it eliminates instability in network traffic. Our WebPageReplay setup also enables Mann Whitney U tests that find regressions with statistical significance.

All tests for monitoring live in https://gerrit.wikimedia.org/r/performance/synthetic-monitoring-tests. The tests are grouped in folders: desktop tests (testing Wikipedia on desktop), emulated mobile tests (Chrome on desktop in emulated mobile mode), first view tests (accessing Wikipedia with an empty browser cache), user journeys (accessing multiple pages and interacting with them) and WebPageReplay tests.

Add a page/URL to test

You can choose to measure the performance with direct tests or through the WebPageReplay proxy. Folders named firstView contain direct tests and folders with Replay in the name use WebPageReplay. Find the right folder and text file and add your page/URL at the end of the file.

1. Clone the Git repository: git clone ssh://USERNAME@gerrit.wikimedia.org:29418/performance/synthetic-monitoring-tests.git

2. Add the page to test and commit the change. Add your page to the matching text file in https://gerrit.wikimedia.org/r/plugins/gitiles/performance/synthetic-monitoring-tests/+/refs/heads/master/tests/desktopFirstView/ . For example, if you add a test for en.wiki, use enwiki.txt (see the example entry after these steps).

3. Push your change to Gerrit.

4. Add phedenskog as reviewer and feel free to ping me on Slack.

5. When the change is reviewed and submitted, the test will run the next time on the server and you will be able to see the metrics in Grafana.
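
The text files are lists of the pages to test. As an illustration (check the existing entries in the file for the exact format), adding the English Wikipedia article on Barack Obama to enwiki.txt would mean appending a line like:

  https://en.wikipedia.org/wiki/Barack_Obama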

Add a user journey

At the moment we only run user journeys as direct tests against Wikipedia (to run user journeys against the replay proxy we need to invest some time to verify that it works, see T372525).

  • Test your user journey script locally on your machine to make sure it works. The easiest way is to install sitespeed.io globally on your machine and then run the test. If you have a user scenario in tests/desktopUserJourney/myTest.cjs you can run it like: sitespeed.io tests/desktopUserJourney/myTest.cjs --multi without adding the configuration file; that way the data will not be sent to Graphite. If you have some specific configuration for your test, make sure you add that locally too, so you can test your script.
  • Make sure your script is defensive: check that elements exist before you try to interact with them and create log messages if the elements do not exist. That makes it much easier to debug when tests start to fail (see the sketch below).
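
Here is a minimal sketch of a defensive user journey script using the sitespeed.io scripting API; the URLs and the #searchInput selector are illustrative, not taken from the repository:

  module.exports = async function (context, commands) {
    // Measure the first page of the journey.
    await commands.measure.start('https://en.wikipedia.org/wiki/Main_Page');

    // Defensive check: verify that the element exists before interacting
    // with it, and log a message if it does not, so a failing test is
    // easy to debug.
    const hasSearch = await commands.js.run(
      'return document.querySelector("#searchInput") !== null;'
    );
    if (!hasSearch) {
      context.log.error('searchInput not found on Main_Page, aborting journey');
      return;
    }

    // Measure the next step of the journey (illustrative page).
    return commands.measure.start('https://en.wikipedia.org/wiki/Barack_Obama');
  };

You can run a script like this locally with sitespeed.io myTest.cjs --multi as described above.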

Create a new script file in https://gerrit.wikimedia.org/r/plugins/gitiles/performance/synthetic-monitoring-tests/+/refs/heads/master/tests/desktopUserJourneys/ and create a configuration file for the test, matching the name of the test, in https://gerrit.wikimedia.org/r/plugins/gitiles/performance/synthetic-monitoring-tests/+/refs/heads/master/config/desktopUserJourneys/
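
The existing files in the config folder define the exact shape, but sitespeed.io generally accepts a JSON configuration file where the keys mirror the command line options. A minimal hypothetical example:

  {
    "browsertime": {
      "iterations": 3
    }
  }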

Run one-off tests

You can run one-off tests using https://wikiperformance.wmcloud.org. You can run tests either through the GUI (the web page) or through the API if you have sitespeed.io installed locally on your machine. The server will send the test request to a bare metal server running in Frankfurt that has been set up to have as little variance between runs as possible.

To run using the API you need an API key; you can get one by asking in the #talk-to-qte Slack channel.

Using WebPageReplay

You can run one-off tests using WebPageReplay. That is recommended to get as stable metrics as possible, and it is especially useful if you compare a page that bypasses our frontend Varnish layers with a page that does not. This way both pages will have the same time to first byte and you can concentrate on the frontend metrics.

You run WebPageReplay by adding --webpagereplay in the Command line args tab in the GUI.

Add command line arguments

[Screenshot: adding command line arguments in the Command line args tab of the GUI]

Using Mann Whitney U

You can also use the Mann Whitney U test to find statistically significant changes. To do that you need to run two tests: one baseline test, and then a test ("current test") that will be compared with the baseline. The default configuration uses the "greater" alternative hypothesis for Mann Whitney U. This means that we test whether the current test is significantly greater than the baseline test (is slower or has higher metrics).

This also means that you will not find cases where the baseline test metrics are greater than the current test, so it is important to choose the right page as the baseline page. You can change the alternative hypothesis using --compare.alternative and choose two-sided or less; you can read more about that in the documentation.
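
For example, to test for a significant change in either direction, the Command line args for the current test could look like this (the id is illustrative):

  --webpagereplay --compare.id main_page --compare.alternative two-sided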

To run your test, start by collecting the baseline test. It is important to give your test an id (--compare.id); that id will be used when you run the next test, and the ids need to match so the tests can be compared.

Add these parameters to the Command line args field: --webpagereplay --compare.saveBaseline --compare.id main_page and make sure to change the compare.id to an id that makes sense for you. Also change the number of iterations to 21. That way we are sure Mann Whitney U gets enough data to be able to find differences in the metrics.

Run the test and wait for it to finish.

The next step is to run the test that will be compared with the baseline. Make sure to run 21 iterations again and change the id to match the id that you used in the last step. Add the following parameters to the Command line args field: --webpagereplay --compare.id main_page

Run the test and wait for it to finish. Then, in your test result, click the Compare tab to see the metrics compared between the two tests.

Add a label to your test

You can (and should) add a label to your test to make it easier to find. Click the "Extras" tab in the GUI and add your label there. Say, for example, that you add a label named dark_mode_tests. You can then find all tests with that label by adding label:dark_mode_tests in the search field.

Finding tests

You can use the search functionality to find your tests. All tests that run through the GUI and the API are searchable for 30 days. If you need more help, click the Help button in the GUI to get more information on how you can find your tests.