Paul May

Paul May Editor at Arachnys

Posted 37.01.18

How we solved load testing & scalability issues with Kong

To ensure that our clients cannot abuse our services, and to maintain visibility of usage, we needed a way to authenticate API requests whilst also applying throttling or rate limiting.

We realized that we needed to bundle all of these common functions and auxiliary features into one reusable layer, allowing each service to be confined to a single core business purpose, à la the single responsibility principle. We needed to consolidate, and to prevent the proliferation of unnecessary process layers. We needed a service that would be extensible, available, stateless, and scalable. We needed Kong.

Kong works by forwarding requests like a standard Nginx load balancer, but its plugin-oriented architecture reduces both friction in service development and response time.

Previously, we had to optimize these auxiliary features on a per-service basis. We have not had to change any of our existing services to integrate Kong, nor do we have to update their source to work with a new format (such as protobuf over an RPC protocol). Instead, we can apply a request transformer for a specific API and simply pass the authentication token in the headers or query parameters.
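As a rough illustration of passing the token in either place, a client might do something like the following. This is only a sketch: the gateway URL and the `q` parameter are hypothetical, and the `apikey` name is an assumption (it is the default key name of Kong's key-auth plugin, but it is configurable). The requests are built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical values for illustration, not taken from our deployment.
GATEWAY_URL = "https://gateway.example.com/search"
KEY_NAME = "apikey"  # assumed key name; configurable on the gateway


def request_with_header(token: str, query: str) -> Request:
    """Build a request that carries the token in a header."""
    url = f"{GATEWAY_URL}?{urlencode({'q': query})}"
    return Request(url, headers={KEY_NAME: token})


def request_with_query_param(token: str, query: str) -> Request:
    """Build a request that carries the token in the query string."""
    url = f"{GATEWAY_URL}?{urlencode({'q': query, KEY_NAME: token})}"
    return Request(url)
```

Either way, the service behind the gateway never needs to know how the token was supplied.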

This allows us to keep shaving off the layers we previously added, leaving simple, easy-to-scale, easy-to-deploy microservices. Indeed, some of these layers depend on a database, and removing them frees our microservices from it and lets them achieve statelessness.

Large-scale monitoring requires a high volume of requests

We realized that we needed to achieve 115 queries per second on our existing Pythia infrastructure (a wrapper around ElasticSearch) to support large-scale monitoring for our bigger clients. ElasticSearch is our search engine, containing Arachnys news and sanctions data. Letting external users query ElasticSearch directly would be problematic: tracking and authentication would not be available. Rather than having clients communicate with ElasticSearch directly via its DSL, we expose a simple, high-level API that our clients can use and whose usage we can monitor.

Initial load testing

We ran some tests to measure latency (the time from request to response) and throughput (the number of requests completed in a given time; our target was 115 queries per second).
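To make the two measurements concrete, here is a minimal sketch of how a run could be summarized. The `Result` record and the `summarize` helper are hypothetical names for illustration; they are not part of Vegeta or of our tooling.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Result:
    latency_s: float  # time from request to response, in seconds
    ok: bool          # True if the request was fulfilled


def summarize(results: list[Result], duration_s: float) -> dict[str, float]:
    """Summarize one load-test run that lasted `duration_s` seconds."""
    if not results or duration_s <= 0:
        raise ValueError("need at least one result and a positive duration")
    ok_latencies = [r.latency_s for r in results if r.ok]
    return {
        # Only fulfilled requests count towards throughput.
        "throughput_qps": len(ok_latencies) / duration_s,
        "success_ratio": len(ok_latencies) / len(results),
        "median_latency_s": median(ok_latencies) if ok_latencies else float("nan"),
    }
```

Counting only fulfilled requests matters here: a run can send 115 requests per second and still have a very low throughput if most of them fail.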

The initial load testing yielded some very undesirable results. Using Vegeta for load testing, coupled with a unique query generator written by our developer Ian, we found that we were fulfilling fewer than 5% of the requests over a period greater than 30 seconds. Most of the errors came from database connection limits and server timeouts. The bottleneck was very clear: we had high-latency responses, and each request opened a connection that lasted until a response was returned or the client disconnected.
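A back-of-the-envelope way to see why connection limits bite is Little's law: the average number of connections held open is roughly the request rate multiplied by the average time each connection is held. The sketch below applies it; the 2-second latency is an assumed figure for illustration, not a measurement from our tests.

```python
import math


def connections_in_flight(requests_per_second: float, mean_latency_s: float) -> int:
    """Estimate open connections with Little's law (rate x time held)."""
    return math.ceil(requests_per_second * mean_latency_s)


# At the 115 qps target, with an assumed 2 s average response time,
# roughly 230 connections would be open at any moment.
estimate = connections_in_flight(115, 2.0)
```

Under that assumption, any connection limit below the estimate turns into refused connections or timeouts, which matches the kind of errors we saw.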




We needed to reduce the friction between the requester and the provider.

Service architecture responds to laws of motion:

“An object at rest will remain at rest unless acted on by an unbalanced force. An object in motion continues in motion with the same speed and in the same direction unless acted upon by an unbalanced force”

First, we considered introducing pgbouncer, which keeps established connections ready to be reused (cutting latency), but it would not have solved our problem and would have been another dependency to maintain and keep alive. So we eliminated the database altogether, to further reduce the points of failure and the friction.

Stateless Pythia

Without the database, the bottleneck becomes the number of requests our Pythia dyno can process in a given time. When we scale the dynos, the bottleneck becomes the number of requests ElasticSearch can process in a given time. We have opened the floodgates. Removing the database has consequences, however: we have now sacrificed logging and fine-grained authentication.


Kong is a scalable, open source API layer. It is built on OpenResty, which is backed by the ‘battle-tested’ NGINX, with a focus on high performance:

“OpenResty® is a full-fledged web platform that integrates the standard Nginx core, LuaJIT, many carefully written Lua libraries, lots of high quality 3rd-party Nginx modules, and most of their external dependencies. It is designed to help developers easily build scalable web applications, web services, and dynamic web gateways”

Put simply, it is a configurable load-balancer / proxy for web services. It can be used to provide additional layers, such as authentication, logging or rate limiting, for stateless microservices. This allows us to focus on building services that abide by the single responsibility principle rather than worrying about access control.
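To give a feel for what a rate-limiting layer does on behalf of the services behind it, here is a minimal fixed-window sketch. It is an illustration of the general idea under our own assumptions, not Kong's implementation; the class and parameter names are hypothetical.

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """Allow at most `limit` requests per consumer in each time window."""

    def __init__(self, limit: int, window_s: float = 1.0, clock=time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        # (consumer, window index) -> requests seen; old windows are never
        # evicted in this sketch, which a real implementation would need to do.
        self.counts = defaultdict(int)

    def allow(self, consumer: str) -> bool:
        window = int(self.clock() // self.window_s)
        key = (consumer, window)
        if self.counts[key] >= self.limit:
            return False  # over the limit for this window: reject
        self.counts[key] += 1
        return True
```

Because the counting lives in the gateway, each service behind it can stay unaware of who is calling and how often.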


Plug n’ Play

Kong has allowed us to overcome our fiddly scalability issues related to client-specific services. Rather than letting each service grow increasingly fat, with ever-greater management overhead from dependencies, users, and configuration, we can optimize the common layers in one go. This also reduces friction in service development.

In the future, we can continuously shave off the layers we have previously added to obtain simple, scalable microservices.

What alternatives are there?

We considered alternatives before arriving at Kong: Traefik, Linkerd, and Caddy. We concluded that Traefik simply did not offer the quality of documentation we needed for a smooth implementation. It also had few plugins available and no authentication. Linkerd is used for internal routing, and its documentation was also poor. Caddy is relatively new, and it is likewise not designed as an API gateway.

As an alternative to Kong, we could just deploy nginx in front of our services to act as a gateway / authentication layer. But all of this requires additional configuration in Ansible, and that can get quite fiddly.


Whilst we were not quite able to fulfil 100% of requests under current conditions (additional work will be done to make 100% possible), we were able to drastically increase our throughput under ideal and reachable conditions. In the future, we’ll be managing all of our users and clients in a single database by synchronizing their access to APIs over to Kong. We’ll also be making our existing services thinner by removing any authentication and logging, and updating all services to rely on Kong’s upstream headers to verify access.
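A thinned-down service could then identify the caller from the headers the gateway adds, along the lines of the sketch below. The function name is hypothetical; `X-Consumer-ID` and `X-Anonymous-Consumer` are headers Kong's authentication plugins set on the upstream request. This only makes sense if the service is reachable solely through the gateway, since otherwise a client could set these headers itself.

```python
from typing import Mapping, Optional


def consumer_id_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Return the consumer ID set by the gateway, or None if access should be refused."""
    # Header names are case-insensitive, so normalize before looking them up.
    normalized = {name.lower(): value for name, value in headers.items()}
    # Requests let through as "anonymous" were not authenticated.
    if normalized.get("x-anonymous-consumer", "").lower() == "true":
        return None
    return normalized.get("x-consumer-id") or None
```

The service itself then needs no user table and no credential checks, only this lookup.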

The current trend centers around ‘serverless’ computing / architecture: instead of hosting a long-running server application, we would host a function and execute it on demand. This is known as ‘function-as-a-service’ (FaaS), the most popular provider being AWS Lambda. It takes away all concerns over infrastructure management and focuses on a single business function. The main entry point is now the function itself.

With Kong, it is easy for developers to focus on single purpose functions that can later be connected to the wider API ecosystem.

Topics: / Product
