Performance Testing: Using Apache JMeter with A/B Testing on Google App Engine

Nitin Agarwal
Dec 14, 2020

Microservices are rapidly gaining popularity. As we keep adding brand-new services that plug into an already running system, scalability/performance testing becomes really important. Below is a list of things that can go wrong and that performance testing can help you identify.

  1. Identify whether the database can scale along with the service.
  2. Identify whether there is a need to add a caching layer.
  3. Identify whether another service that your current service depends on scales properly, or whether it becomes a bottleneck.
  4. Identify whether your service takes too long to warm up, leading to poor performance and more instances being started than are actually needed.
  5. Decide how many idle instances you want running in order to handle spikes in traffic.
  6. Decide on the best configuration for your use case, by trying different configurations and analyzing the results.
  7. Identify memory-leak issues.
  8. Identify the maximum traffic your service can handle.

Another important thing to keep in mind while doing performance testing: be clear about which aspect of your service you want to test, as this will help you run a targeted test and analyze the results properly.

Problem Statement

Our use case was that we were adding a new service that would plug into our already running system, and we had the objectives below.

  1. Our microservice runs on Google App Engine in Python 3. We wanted to find the best configuration for our use case (mostly I/O-bound tasks) so that we could lower the cost.

Google App Engine is a PaaS (Platform as a Service) that scales up and down automatically (even down to zero instances), and you pay only for the instances you are running.

It offers several configuration options, such as machine size, number of concurrent requests, min/max pending latency, and min/max idle instances.

2. Verify that our service scales well for a sudden spike in traffic.

3. Verify that there are no issues with our database (MongoDB) when our service scales up.

4. Load test our service with ~1000 rps and check whether the high load affects the overall latency of our REST APIs.

5. Check the impact of varying the MongoDB connection pool size.

6. Decide on the min/max idle instance count for our service to handle traffic spikes.

Solution

To solve our use case, we used the following tools:

  1. The traffic splitting feature in App Engine
  2. Google Stackdriver logs ingestion into BigQuery
  3. Apache JMeter for sending concurrent requests

Traffic Splitting

GAE has the concept of versions, where you can deploy different code/configuration of your service to different versions. For example:

  • version 1 can have config1, and
  • version 2 can have config2

Now you can use the traffic splitting functionality to send 50% of the traffic to each version. You can split traffic on the basis of IP address or cookie. Use the cookie-based split when testing with JMeter, since all requests from a single JMeter machine share one IP address, and an IP-based split would route them all to the same version.

gcloud app services set-traffic {service_name} --splits {version_1}=.5,{version_2}=.5  --split-by cookie  --project={project_id}
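When splitting by cookie, App Engine pins a client to a version via the GOOGAPPUID cookie (an integer between 0 and 999). Below is a minimal sketch of a client pinning itself to one version by sending that cookie explicitly; the URL is a made-up placeholder:

```python
import urllib.request

def pinned_request(url, googappuid=123):
    """Build a request that sticks to one App Engine version when
    traffic is split by cookie. GOOGAPPUID must be in [0, 999]."""
    req = urllib.request.Request(url)
    req.add_header("Cookie", f"GOOGAPPUID={googappuid}")
    return req

# Hypothetical service URL; replace with your own.
req = pinned_request("https://rating-helper-dot-my-project.appspot.com/health")
print(req.get_header("Cookie"))  # GOOGAPPUID=123
```

In JMeter, adding an HTTP Cookie Manager to the test plan has the same effect: each thread keeps the GOOGAPPUID it receives on its first response, so all of that thread's traffic stays on one version.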

Why A/B testing gives more conclusive results

We could also have tried the different configurations one after the other, instead of doing a traffic split, and analyzed the results.

During our experiments, we found that there was always some variation in the results, even for the same configuration, if the experiment was run at a different time.

An A/B test helps you conclusively decide which configuration is better.

Logs Router to BigQuery

We use Google Cloud Logging, which gives us grouped logs per API request as well as stats such as latency, HTTP status code, and cost of the requests.

Google Stackdriver also offers automatic syncing of logs to BigQuery. We used these stats in BigQuery to analyze the performance of the different configurations we tried.
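As a sketch, the kind of per-version aggregation we ran in BigQuery can be expressed with a query like the one built below. The dataset/table and field names are assumptions for illustration: the actual schema depends on how your log sink exports App Engine request logs.

```python
def latency_by_version_sql(table="my_project.gae_logs.request_log"):
    """Compare request volume and p95 latency across App Engine versions.
    The table and field names here are illustrative, not a guaranteed schema."""
    return f"""
SELECT
  protoPayload.versionId AS version,
  COUNT(*) AS requests,
  APPROX_QUANTILES(protoPayload.latency, 100)[OFFSET(95)] AS p95_latency
FROM `{table}`
GROUP BY version
ORDER BY version
""".strip()

print(latency_by_version_sql())
```

Running one such query per metric (latency, error rate, instance count) for the two versions in the split is what let us compare configurations side by side.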

Apache JMeter

Apache JMeter is an easy-to-use tool for performance testing.

It has lots of features; I will cover the ones that helped us test our service. Please refer to the JMeter User's Manual for a more detailed overview.

When you use JMeter, you create a test plan. We will discuss a few of the important elements of a test plan.

Thread Group

The thread group element controls the number of threads JMeter will use to execute your test. You have the below options

Quick Tip: In case you want to generate load in excess of 500 users/threads, it is recommended to use multiple JMeter instances.

Number of Threads

You define the number of threads, which are basically unique users, required for your test.

Ramp Up Period

When you are testing your service for a real-world spike in traffic, you should increase the load steadily instead of sending sudden peak traffic (for example, a service handling 100 requests per second suddenly receiving 1000 requests the very next second).

The results of such a test, where you send a sudden spike of traffic, will be misleading. Real traffic scales gradually.

Loop Count

Loop count indicates how many times each thread will execute before it stops. It can have a value like 1, 1000, or even infinite.

Let’s look at an example to understand how this works in real life.

  • Number of threads: 100
  • Ramp-up period: 50 seconds
  • Loop count: 1000

This means 2 new threads will be spawned every second, so at the end of 10 seconds you will have 20 threads, and at the end of 50 seconds all 100 threads will be running. Each of these threads will execute the API call 1000 times, as specified in the loop count. Consider thread 1: as soon as it finishes its 1st API call it starts the 2nd, and after 1000 calls in total the thread stops. It is possible that some threads finish their loop count much earlier than others, so towards the end only a few threads are still running.
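The arithmetic above is easy to sanity-check with a few lines of plain Python (no JMeter involved):

```python
def thread_group_stats(threads, ramp_up_s, loop_count):
    """Derive the spawn rate and total request count of a JMeter thread group."""
    spawn_rate = threads / ramp_up_s       # new threads started per second
    total_requests = threads * loop_count  # each thread runs the sampler loop_count times
    return spawn_rate, total_requests

rate, total = thread_group_stats(threads=100, ramp_up_s=50, loop_count=1000)
print(rate)            # 2.0 threads started per second
print(int(rate * 10))  # 20 threads alive after 10 seconds
print(total)           # 100000 API calls over the whole test
```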

Thread Group also provides a scheduler, using which you can configure duration, startup delay, etc.

Samplers

Samplers tell JMeter to send requests to a server and wait for a response.

HTTP Request

This sampler lets you send an HTTP/HTTPS request to a web server.

[Screenshot: the HTTP Request sampler used to send requests to our REST API]

Use Shutdown instead of Stop to end a test: Shutdown stops threads gracefully, waiting for each running thread to complete its request.

Listeners

Listeners provide access to the information JMeter gathers about the test cases while JMeter runs. Several listeners come bundled with JMeter.

I used the below listeners

  • View Results in Table
  • Summary Report
  • Response Time Graph
  • View Results Tree

[Screenshot: the listeners used in our test plan]

Listeners can use a lot of memory if there are a lot of samples.

Configuration Elements

A configuration element works closely with a Sampler. Although it does not send requests, it can add to or modify requests.

CSV Data Set Config

In our case we wanted to test one user-facing API, and we wanted to load test it with a random set of user_ids instead of a single user_id. JMeter offers a very simple mechanism for this: the CSV Data Set Config.
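A small script like the one below can generate the CSV that the CSV Data Set Config reads; the user_id format here is made up for illustration:

```python
import csv
import random

def write_user_ids_csv(path, n=1000, seed=42):
    """Write n pseudo-random user_ids, one per line, for JMeter to cycle through."""
    rng = random.Random(seed)  # fixed seed so reruns produce the same file
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for _ in range(n):
            writer.writerow([f"user_{rng.randint(1, 10**6)}"])

write_user_ids_csv("user_ids.csv", n=100)
```

In the CSV Data Set Config, point Filename at this file, set Variable Names to user_id, and reference the value in the sampler as ${user_id}.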

Complete Workflow

The below diagram indicates the complete workflow we used for performance testing our service.

[Diagram: the complete workflow for testing the different configuration settings]

Results

Always run your test at least twice before drawing conclusions from the results.

The configuration we settled on for our service (app.yaml):

runtime: python38
entrypoint: gunicorn -b :$PORT -w 4 --threads 20 main:app
service: rating-helper
instance_class: F4
automatic_scaling:
  min_idle_instances: 1
  max_idle_instances: 3

After doing this exercise, our new service was able to scale in case of a sudden traffic spike. The configuration we chose was the most cost-effective, and the production deployment of our new service went very smoothly.
