Building a Serverless API Usage Tracker on AWS

Tracking API calls from ~100 microservices to capture user behavior and feature usage patterns, collect customer data for billing, identify trending features, and fix bottlenecks in the system.

In this post I would like to share our architecture for the API usage tracker.

API Usage Tracker

Amazon EventBridge event buses offer out-of-the-box integration with multiple AWS services.

Supported EventBridge Targets: https://docs.aws.amazon.com/eventbridge/latest/userguide/eventbridge-targets.html

The event bus in us-east-1 (N. Virginia) supports 10K PutEvents calls per second. A bus can have at most 300 rules, and a rule can be associated with up to 5 targets. When an event matches a rule, it is pushed from the bus to the targets associated with that rule.

Service Quotas: https://docs.aws.amazon.com/eventbridge/latest/userguide/cloudwatch-limits-eventbridge.html

Our Architecture

1. AOP: We created a simple Java AOP annotation that is placed on each method that needs to be tracked. Our annotation supports Spring Boot, Dropwizard, and other custom web frameworks developed in-house. We define an aspect that intercepts the tracked method and does the following:

a. Captures the HTTP request and SecurityContext data

b. Gathers all the required information (user info, customer data, API descriptions) from the HTTP call

c. Creates a Usage Event

d. Pushes the Usage Event to the EventBridge event bus
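The steps above can be sketched roughly as follows. The original aspect is a Java AOP annotation; this sketch swaps in a Python decorator to show the same shape, so it is an analogy rather than the actual implementation. The names `track_usage`, `UsageEvent`, the request fields, the `usage-bus` bus name, and the injected `publish` callable are all hypothetical. Swallowing publish failures so that tracking never breaks the API call is also an assumption, not something the post states.

```python
import json
import logging
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from functools import wraps
from typing import Any, Callable, Dict, List

# Values taken from the rule pattern described in this post.
EVENT_SOURCE = "my-applications"
EVENT_DETAIL_TYPE = "api-usage-event"


@dataclass
class UsageEvent:
    """Hypothetical usage event; the real field set is not given in the post."""
    api_name: str
    description: str
    user_id: str
    customer_id: str
    http_method: str
    path: str
    timestamp: str


def to_put_events_entry(event: UsageEvent, bus_name: str) -> Dict[str, Any]:
    """Shape the usage event as one entry of an EventBridge PutEvents request."""
    return {
        "EventBusName": bus_name,
        "Source": EVENT_SOURCE,
        "DetailType": EVENT_DETAIL_TYPE,
        "Detail": json.dumps(asdict(event)),
    }


def track_usage(api_name: str, description: str,
                publish: Callable[[Dict[str, Any]], None],
                bus_name: str = "usage-bus"):
    """Decorator standing in for the AOP annotation (Python analog)."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(request: Dict[str, Any], *args, **kwargs):
            # Steps a-c: capture request data and build the usage event.
            event = UsageEvent(
                api_name=api_name,
                description=description,
                user_id=request.get("user_id", "unknown"),
                customer_id=request.get("customer_id", "unknown"),
                http_method=request.get("method", ""),
                path=request.get("path", ""),
                timestamp=datetime.now(timezone.utc).isoformat(),
            )
            # Step d: publish; a tracking failure is logged, not raised (assumption).
            try:
                publish(to_put_events_entry(event, bus_name))
            except Exception:
                logging.exception("usage tracking failed for %s", api_name)
            return handler(request, *args, **kwargs)
        return wrapper
    return decorator


# Usage: collect entries in a list instead of calling AWS.
published: List[Dict[str, Any]] = []


@track_usage("list-orders", "Lists a customer's orders", publish=published.append)
def list_orders(request: Dict[str, Any]) -> Dict[str, int]:
    return {"status": 200}


response = list_orders(
    {"method": "GET", "path": "/orders", "user_id": "u-1", "customer_id": "c-9"}
)
```

In a real deployment, `publish` could wrap `boto3.client("events").put_events(Entries=[entry])`; injecting it keeps the sketch runnable without AWS credentials.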

2. Our event bus has a rule that passes custom events on the bus to Kinesis Data Firehose. The sample pattern below allows all events with source: my-applications and detail-type: api-usage-event to pass through to Firehose. We can also add conditions that restrict events based on certain keys in the event.

Sample Event Rule Pattern

Content Based Filtering via event Patterns: https://docs.aws.amazon.com/eventbridge/latest/userguide/content-filtering-with-event-patterns.html
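A pattern matching the source and detail-type named above might look like the one below, written as a Python dict and printed as the JSON that EventBridge expects. The `matches_exact` helper is a hypothetical, heavily simplified local check that handles only exact scalar values and nested objects; it is not the EventBridge matching engine and ignores content filters such as prefix or exists.

```python
import json
from typing import Any, Dict

# Pattern for the rule described above: both fields must match.
USAGE_EVENT_PATTERN: Dict[str, Any] = {
    "source": ["my-applications"],
    "detail-type": ["api-usage-event"],
}


def matches_exact(pattern: Dict[str, Any], event: Dict[str, Any]) -> bool:
    """Simplified check: every pattern key must exist in the event, and the
    event's scalar value must be one of the listed values."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern, e.g. conditions on keys inside "detail".
            if not isinstance(actual, dict) or not matches_exact(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# JSON form, as it would be supplied when defining the rule.
pattern_json = json.dumps(USAGE_EVENT_PATTERN, indent=2)
print(pattern_json)

usage_event = {
    "source": "my-applications",
    "detail-type": "api-usage-event",
    "detail": {"api_name": "list-orders"},
}
other_event = {"source": "aws.ec2", "detail-type": "EC2 Instance State-change Notification"}
```

Here `matches_exact(USAGE_EVENT_PATTERN, usage_event)` is true and the same call with `other_event` is false, which is a quick way to sanity-check a pattern before deploying the rule.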

Firehose buffering options

3. Firehose provides buffering options that help generate a smaller number of larger files. This is useful when performing ETL on our API tracking data. On Firehose we set the buffering options to 900 seconds or 128 MB, whichever comes first. Firehose then writes the raw JSON files to our S3 bucket.
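The "900 seconds or 128 MB, whichever comes first" rule can be illustrated with a small sketch. The dict returned by `buffering_hints` follows the `BufferingHints` structure accepted by the Firehose `create_delivery_stream` API for S3 destinations; `should_flush` is a hypothetical helper that only mimics the flush decision locally and is not Firehose code.

```python
from typing import Dict


def buffering_hints(interval_seconds: int = 900, size_mb: int = 128) -> Dict[str, int]:
    """Buffering settings from this post, in the shape Firehose expects."""
    return {"IntervalInSeconds": interval_seconds, "SizeInMBs": size_mb}


def should_flush(buffered_bytes: int, seconds_since_flush: float,
                 hints: Dict[str, int]) -> bool:
    """Flush when either threshold is reached, whichever happens first."""
    size_limit_bytes = hints["SizeInMBs"] * 1024 * 1024
    return (buffered_bytes >= size_limit_bytes
            or seconds_since_flush >= hints["IntervalInSeconds"])


hints = buffering_hints()
```

With these settings a quiet period still produces at least one file every 15 minutes, while a busy period produces files of roughly 128 MB.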

4. The last piece of the stack is a Glue workflow that triggers a Glue job to periodically read from this S3 bucket, compress the JSON data, and store it in Parquet format. The workflow then runs a crawler once the Glue job completes successfully. The crawler creates a new table if it doesn't exist and periodically updates the table with new partitions and columns.
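The job-then-crawler ordering can be expressed as two Glue triggers inside one workflow. The sketch below only builds the request dicts; each one has the shape accepted by `boto3.client("glue").create_trigger(**trigger)`. The workflow, job, and crawler names and the hourly schedule are assumptions, and the actual stack in this post is defined with CDK rather than with these calls.

```python
from typing import Any, Dict, List


def workflow_triggers(workflow: str, job: str, crawler: str,
                      schedule: str = "cron(0 * * * ? *)") -> List[Dict[str, Any]]:
    """Build the two triggers: a schedule starts the job, job success starts the crawler."""
    start_job = {
        "Name": f"{workflow}-start-job",
        "WorkflowName": workflow,
        "Type": "SCHEDULED",
        "Schedule": schedule,  # assumed hourly; the post only says "periodically"
        "Actions": [{"JobName": job}],
        "StartOnCreation": True,
    }
    run_crawler = {
        "Name": f"{workflow}-run-crawler",
        "WorkflowName": workflow,
        "Type": "CONDITIONAL",
        # Fire only after the conversion job has succeeded.
        "Predicate": {
            "Conditions": [
                {"LogicalOperator": "EQUALS", "JobName": job, "State": "SUCCEEDED"}
            ]
        },
        "Actions": [{"CrawlerName": crawler}],
        "StartOnCreation": True,
    }
    return [start_job, run_crawler]


# Hypothetical resource names, for illustration only.
triggers = workflow_triggers("usage-etl", "json-to-parquet", "usage-crawler")
```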

Once the data is available in Parquet, we can query it via Athena to generate billing reports and to identify product usage counts and feature usage rates.
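As an illustration of the kind of query this enables, the sketch below builds a request for the Athena `start_query_execution` API. The table name `api_usage`, its columns, the database name, and the results bucket are all hypothetical, since the post does not show the crawled schema.

```python
from typing import Any, Dict

# Hypothetical table and columns: calls per customer and API.
FEATURE_USAGE_SQL = """
SELECT customer_id, api_name, COUNT(*) AS calls
FROM api_usage
GROUP BY customer_id, api_name
ORDER BY calls DESC
"""


def athena_query_request(sql: str, database: str, output_location: str) -> Dict[str, Any]:
    """Shape a query as keyword arguments for Athena's start_query_execution."""
    return {
        "QueryString": sql.strip(),
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


request = athena_query_request(
    FEATURE_USAGE_SQL,
    database="usage_tracking",                      # assumed database name
    output_location="s3://example-athena-results/",  # assumed results bucket
)
```

The request could then be submitted with `boto3.client("athena").start_query_execution(**request)`.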

Schema Registry

EventBridge also provides a schema registry. Once we enable schema discovery on the bus, we can get the schema of the events pushed to it. The schema can be exported in OpenAPI (Swagger) JSON format and shared with other teams that want to use our API usage tracker service.

A sample APIUsage Event wrapped within the AWS Event
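For reference, the sketch below shows how a usage event sits inside the standard EventBridge envelope, under the `detail` key. The envelope field names are the standard EventBridge ones; the contents of `detail`, the event id, the account number, and the timestamp are placeholder values, not the real event.

```python
import json
from typing import Any, Dict


def wrap_in_eventbridge_envelope(detail: Dict[str, Any], *, event_id: str,
                                 account: str, region: str, time: str) -> Dict[str, Any]:
    """Return the event as a rule target would receive it."""
    return {
        "version": "0",
        "id": event_id,
        "detail-type": "api-usage-event",
        "source": "my-applications",
        "account": account,
        "time": time,
        "region": region,
        "resources": [],
        "detail": detail,  # the usage event itself
    }


# Placeholder usage event; the real field set is not shown in this post.
sample_detail = {
    "api_name": "list-orders",
    "user_id": "u-1",
    "customer_id": "c-9",
    "http_method": "GET",
    "path": "/orders",
}

envelope = wrap_in_eventbridge_envelope(
    sample_detail,
    event_id="00000000-0000-0000-0000-000000000000",
    account="123456789012",
    region="us-east-1",
    time="2020-01-01T00:00:00Z",
)
print(json.dumps(envelope, indent=2))
```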

CI/CD

This entire stack is developed using the AWS CDK. When we created the stack, the CDK didn't support binding a rule target to Kinesis Data Firehose, so we had to write a boto script to do that. We also created a CodePipeline pipeline that manages our CI/CD.
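A script of that kind might look roughly like the sketch below, which uses the EventBridge `put_targets` API through boto3. It is an assumption about the general shape, not the actual script: the rule name, bus name, ARNs, and target id are placeholders, and the role is assumed to grant EventBridge permission to write to the delivery stream. boto3 is imported lazily so the request-building part runs without it.

```python
from typing import Any, Dict


def firehose_target_request(rule_name: str, bus_name: str, firehose_arn: str,
                            role_arn: str, target_id: str = "usage-firehose") -> Dict[str, Any]:
    """Build the keyword arguments for the EventBridge put_targets call."""
    return {
        "Rule": rule_name,
        "EventBusName": bus_name,
        "Targets": [{"Id": target_id, "Arn": firehose_arn, "RoleArn": role_arn}],
    }


def bind_rule_to_firehose(request: Dict[str, Any]) -> Dict[str, Any]:
    """Attach the Firehose delivery stream as a target of the rule."""
    import boto3  # imported here so the sketch loads without boto3 installed

    response = boto3.client("events").put_targets(**request)
    if response["FailedEntryCount"]:
        raise RuntimeError(f"put_targets failed: {response['FailedEntries']}")
    return response


# Placeholder identifiers, for illustration only.
request = firehose_target_request(
    rule_name="api-usage-to-firehose",
    bus_name="usage-bus",
    firehose_arn="arn:aws:firehose:us-east-1:123456789012:deliverystream/api-usage",
    role_arn="arn:aws:iam::123456789012:role/eventbridge-to-firehose",
)
```

Calling `bind_rule_to_firehose(request)` with real ARNs and credentials would perform the binding; checking `FailedEntryCount` matters because `put_targets` can return successfully while individual targets fail.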

Conclusion:

Amazon EventBridge is a wonderful AWS offering that makes it easier to integrate with multiple AWS services. Since it offers integration out of the box, we didn't have to write or maintain the integration logic in Lambda.