Sneak Peek into Togai’s Architecture

15 Mins Read
Hari Narayanan
Published On : 24/04/2024


Togai is a pricing infrastructure platform built for SaaS companies to observe, measure, analyze & price their products more scientifically.
Building billing software may seem trivial at the outset, but the devil is in the details.

Building usage-based billing software is challenging, as we covered in an earlier blog post. The complexity increases when you need a usage billing platform that can handle millions of events daily from multiple users!
With this in mind, we had to choose Togai's tech stack carefully: it needed not only to scale but also to let us iterate and ship quickly as we roll out regular product updates.
If you haven't already, do read our blog on software tenets to understand why we made the decisions we did while building Togai.

Togai backend architecture

Even though a monolith would have helped us build quicker, we chose a microservices-based architecture. That is because Togai promises high (almost unlimited) throughput from Day 1 for our customers. A microservices architecture gives us the flexibility to pick out and scale the critical components in the event processing pipeline independently.

Togai backend components from a bird’s eye view

 Image illustration of the bird's eye view of Togai's backend components.

Tech choices

Language & Framework (Kotlin, Ktor, JavaScript, LoopBack, Vue3)

We wanted a language that does type checking at compile time so that we can detect errors as we build. We also strongly believe in the FOSS model. So the majority of Togai's backend was built with Kotlin.

Kotlin is an expressive language that helped us achieve more with less code. It has a superior type system compared to Java, along with powerful features like coroutines, null safety, and first-class interoperability with Java. In my experience, it was fun writing the software in Kotlin.

Not just that, Kotlin is part of the JVM ecosystem. This means we were able to leverage amazing third-party open-source libraries from that ecosystem while building Togai. A big shout-out to these amazing libraries that helped in building Togai.

Library - Description
https://ktor.io/ - Ktor is an HTTP web framework written from the ground up in Kotlin. Ktor's simple constructs helped us build components faster. Also, because of its native coroutine support, we can build systems that handle large-scale throughput.
https://www.jooq.org/ - The message from the website says it all: "jOOQ generates Java code from your database and lets you build type-safe SQL queries through its fluent API". It's a nice library that helped us stay in the strongly typed realm.
https://flywaydb.org/ - Flyway helped us manage DB migrations.
https://github.com/brettwooldridge/HikariCP - A lightweight DB connection pooling library. It helped optimize connections between the DB and application servers.
https://github.com/sksamuel/hoplite - A Kotlin-native configuration management library. It helped us manage environment-specific configurations internally. Word of caution: this library is not backed by any organization but is maintained by an amazing individual, https://github.com/sksamuel, who was very prompt & accessible in responding to our queries.
https://logback.qos.ch/ - An SLF4J-compatible logging framework.
https://insert-koin.io/ - A dependency injection framework written from the ground up in Kotlin.
https://jsonlogic.com/ - A specification that helped us represent logic in JSON. It is also maintained by an individual and not backed by any organization. Overall, JSONLogic is a fantastic idea that deserves support from the community. Kudos to the maintainer, https://github.com/jwadhams.
https://github.com/sksamuel/cohort - A health check library written natively in Kotlin. This library is also maintained by https://github.com/sksamuel.
https://junit.org/junit5/ - The de facto testing framework for JVM-based apps.
https://www.testcontainers.org/ - A library that helped us spin up disposable database containers for running tests.
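
To give a feel for what "representing logic in JSON" means with JSONLogic, here is a minimal sketch in Python of an evaluator for a small subset of operators. It is not the library we use and not a complete implementation of the specification: the rule, the field names (`usage.api_calls`, `plan`) and the function names are hypothetical, `==` uses Python equality rather than JavaScript-style loose equality, and `and`/`or` return plain booleans.

```python
from functools import reduce
from typing import Any


def get_var(data: Any, path: str) -> Any:
    """Resolve a dot-separated path such as "usage.api_calls" in nested dicts."""
    current = data
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


# Simplified operator table (a subset of JsonLogic, with Python semantics).
OPERATORS = {
    "==": lambda a, b: a == b,
    ">": lambda a, b: a > b,
    ">=": lambda a, b: a >= b,
    "<": lambda a, b: a < b,
    "+": lambda *xs: sum(xs),
    "*": lambda *xs: reduce(lambda x, y: x * y, xs, 1),
    "and": lambda *xs: all(xs),
    "or": lambda *xs: any(xs),
}


def apply_rule(rule: Any, data: dict) -> Any:
    """Evaluate a rule against data; non-rule values are returned as literals."""
    if not isinstance(rule, dict) or len(rule) != 1:
        return rule
    op, raw_args = next(iter(rule.items()))
    if op == "var":
        path = raw_args[0] if isinstance(raw_args, list) else raw_args
        return get_var(data, path)
    if not isinstance(raw_args, list):
        raw_args = [raw_args]
    if op not in OPERATORS:
        raise ValueError(f"unsupported operator: {op}")
    # Arguments may themselves be rules, so evaluate them recursively.
    args = [apply_rule(arg, data) for arg in raw_args]
    return OPERATORS[op](*args)


# Hypothetical rule: the account is on the "pro" plan and made over 1000 calls.
RULE = {
    "and": [
        {">": [{"var": "usage.api_calls"}, 1000]},
        {"==": [{"var": "plan"}, "pro"]},
    ]
}

matches = apply_rule(RULE, {"usage": {"api_calls": 1500}, "plan": "pro"})
```

Because the rule is plain JSON, it can be stored in a database or sent over an API and evaluated by any service that implements the specification.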

While building Togai with Kotlin, we became fans of Kotlin's cooperative multitasking construct, "Coroutines".

In our load tests, with just a t4g.small instance (2 vCPUs and 4 GB RAM machine), we could achieve 60 TPS (transactions per second) with a baseline 20% CPU. This load was handled with request latency within our SLA of 1 second.

It was a pleasant surprise to see how efficiently coroutines multitask. We loved the way it performed. At that rate, a single box could handle more than 100M requests per month, and the price was competitive too, at just $30 in rent for the box. Kotlin and Ktor were the best choices for us in building Togai.
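
As a rough sanity check of the numbers above, the arithmetic can be sketched in a few lines of Python. This assumes the 60 TPS is sustained around the clock for a 30-day month, which is an idealization on our part; the function names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86_400


def monthly_capacity(tps: float, days: int = 30) -> int:
    """Requests served in a month at a constant rate of `tps` per second."""
    return int(tps * SECONDS_PER_DAY * days)


def cost_per_million(monthly_cost_usd: float, monthly_requests: int) -> float:
    """Infrastructure cost in USD for every million requests served."""
    return monthly_cost_usd / (monthly_requests / 1_000_000)


capacity = monthly_capacity(60)            # 60 * 86_400 * 30 = 155_520_000
unit_cost = cost_per_million(30, capacity)  # roughly 0.19 USD per million
```

At a constant 60 TPS the box would serve about 155M requests in a month, so the "more than 100M" figure leaves some headroom for uneven traffic.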

But Togai’s backend was not entirely built with Kotlin. We also have a NodeJS-based service to ingest events from clients.

Why did we choose NodeJS?

We chose NodeJS for the ingestion component because of its async IO. Togai ingestion is all about IO: clients send usage events, and the events are validated and pushed to a queue.

Yes, I know what you're thinking: multiple languages create interoperability problems. And yes, they did. To counter this, we used OpenAPI. OpenAPI is a language-agnostic standard for building HTTP-based RESTful APIs.

With a defined API spec, developers are free to write the backend in any language, and clients are free to integrate Togai APIs using any language they prefer. OpenAPI is also supported by a strong community with amazing tools. This helped us offload mundane tasks like input validation, GUI explorers, server/client code generators, and test automation to those open-source tools, which sped up our development.
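
The shape of such an IO-bound ingestion path (validate, then push to a queue) can be sketched as below. This is an illustration in Python's asyncio rather than our NodeJS service; the event schema in `REQUIRED_FIELDS` is hypothetical, and an in-memory `asyncio.Queue` stands in for the real message queue.

```python
import asyncio
from typing import Any

# Hypothetical event schema, used only for this illustration.
REQUIRED_FIELDS = ("id", "timestamp", "attributes")


def validate(event: dict[str, Any]) -> list[str]:
    """Return validation errors; an empty list means the event is valid."""
    return [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in event]


async def ingest(event: dict[str, Any], queue: asyncio.Queue) -> dict[str, Any]:
    """Validate one event and enqueue it without blocking other requests."""
    errors = validate(event)
    if errors:
        return {"accepted": False, "errors": errors}
    # In a real service this await would be a network publish to the queue.
    await queue.put(event)
    return {"accepted": True, "errors": []}


async def main() -> tuple[list[dict[str, Any]], int]:
    queue: asyncio.Queue = asyncio.Queue()
    events = [
        {"id": "e1", "timestamp": "2024-04-24T10:00:00Z", "attributes": {"calls": 3}},
        {"id": "e2", "attributes": {"calls": 1}},  # invalid: no timestamp
    ]
    # Events are handled concurrently on a single thread.
    results = await asyncio.gather(*(ingest(e, queue) for e in events))
    return list(results), queue.qsize()


results, queued = asyncio.run(main())
```

Since each request spends most of its time waiting on IO, a single-threaded event loop can keep many of them in flight at once.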

Frontend (Vue3)

Togai's front end is completely built with the Vue3 JS framework. Let’s go through this in detail in a different blog.

Database (Postgres, TimescaleDB)

We love Postgres. It is one of the most advanced databases available today, with strong community support. It has a rich feature set: transactions, partitions, extensions, views, multiple index types, data types, constraint support, and more.

We did encounter a couple of challenges with Postgres too, like replication configuration, failover management, garbage collection (vacuum), and version upgrades. But we went ahead with it because it's a well-engineered product with rich features.

We've also used Postgres before, and we know that with the right tuning it can be extremely powerful.

Also, not all the data we get in Togai is relational. Every usage event we get from our customers has a timestamp, and we meter and calculate the revenue for each event. This is time-series data, and all queries on usage and revenue are based on time ranges.
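
To make the access pattern concrete, here is a small in-memory sketch in Python of a time-range query that sums usage into fixed-width buckets. The event shape (a timestamp and a unit count) and the function names are assumptions for illustration; in practice this kind of aggregation is what we want the database layer to do.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bucket_start(ts: datetime, width: timedelta) -> datetime:
    """Round a timestamp down to the start of its fixed-width bucket."""
    return EPOCH + ((ts - EPOCH) // width) * width


def aggregate(events, start: datetime, end: datetime, width: timedelta) -> dict:
    """Sum usage units per bucket for events with start <= timestamp < end."""
    totals: dict = defaultdict(float)
    for ts, units in events:
        if start <= ts < end:
            totals[bucket_start(ts, width)] += units
    return dict(sorted(totals.items()))


def at(hour: int, minute: int = 0) -> datetime:
    """Helper for building sample timestamps on a single day."""
    return datetime(2024, 4, 24, hour, minute, tzinfo=timezone.utc)


# Hypothetical usage events: (timestamp, units consumed).
events = [(at(10, 5), 3), (at(10, 40), 2), (at(11, 15), 5), (at(13, 0), 7)]

hourly = aggregate(events, at(10), at(13), timedelta(hours=1))
```

Here `hourly` holds two buckets, 10:00 and 11:00, each totalling 5 units; the 13:00 event falls outside the half-open range and is skipped.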

We evaluated different time-series storage solutions (like InfluxDB and Amazon Timestream) to handle aggregations, downsampling, etc. in the database layer. After a detailed evaluation, we decided to use TimescaleDB for two reasons:

1) higher ingestion rate at scale and
2) better query performance as compared to other time-series solutions.

TimescaleDB is an extension built on top of the Postgres database so this gave us confidence as we already knew Postgres’s capability.

Messaging (NATS)

Togai processes events asynchronously. We needed a message queue solution to temporarily store all unprocessed events and keep them available for event processors to process.

A managed message queue solution (like AWS SQS) was not preferred because of the cost at scale. So we evaluated Kafka & NATS for message queues. Unlike Kafka, NATS is built solely as a message queue, with performance at scale in mind.

NATS has a smaller memory footprint than Kafka and is simple to set up. In our load test, with just a t4g.small instance (2 vCPUs and 4 GB RAM machine), NATS was able to support a throughput of over 5000 messages per second with CPU utilization of less than 5%.

We were able to process ~1B messages while spending less than $30 a month.
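
To put the two figures side by side, the sketch below works out the average rate that ~1B messages would need and compares it with the 5000 messages per second seen in the load test. It assumes the ~1B figure is a monthly volume spread evenly over a 30-day month, which is our reading rather than a measured result; the function names are illustrative.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 2_592_000


def required_rate(messages_per_month: float) -> float:
    """Average messages per second needed to move a monthly volume."""
    return messages_per_month / SECONDS_PER_MONTH


def headroom(measured_rate: float, messages_per_month: float) -> float:
    """How many times the measured rate exceeds the average required rate."""
    return measured_rate / required_rate(messages_per_month)


average_needed = required_rate(1_000_000_000)  # about 386 messages per second
spare_factor = headroom(5000, 1_000_000_000)   # about 13x the average need
```

Under these assumptions a single small instance has roughly an order of magnitude of spare capacity, which matters because real traffic arrives in bursts rather than evenly.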

NATS is the perfect candidate for Togai's messaging solution. But there is a catch: we are new to NATS, and messaging is critical for Togai. So to avoid any unknown unknowns, we designed a backup messaging solution with an AWS SQS queue. Messages are automatically forwarded to AWS SQS if there is any failure in NATS ingestion. This backup helped us build a highly available, scalable messaging system at a competitive price.
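
The fallback idea can be sketched as a thin wrapper around two publishers. This is a minimal illustration in Python, not our production code: the publisher functions are hypothetical stand-ins backed by in-memory lists, where a real service would call the NATS and SQS clients.

```python
from typing import Callable

# A publisher takes the raw message and raises an exception on failure.
Publisher = Callable[[bytes], None]


def publish_with_fallback(message: bytes, primary: Publisher, backup: Publisher) -> str:
    """Try the primary queue first; forward to the backup if it fails."""
    try:
        primary(message)
        return "primary"
    except Exception:
        # Any failure on the primary path sends the message to the backup.
        # If the backup fails too, that error propagates to the caller.
        backup(message)
        return "backup"


# In-memory stand-ins for the two queues (assumptions for this sketch).
nats_store: list[bytes] = []
sqs_store: list[bytes] = []


def healthy_nats(message: bytes) -> None:
    nats_store.append(message)


def broken_nats(message: bytes) -> None:
    raise ConnectionError("NATS unavailable")


first = publish_with_fallback(b"event-1", healthy_nats, sqs_store.append)
second = publish_with_fallback(b"event-2", broken_nats, sqs_store.append)
```

The event processors then need to consume from both queues, so a message that took the backup path is still processed.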

Deployment

Togai is deployed and hosted in the AWS cloud. We got generous credits, so we didn't look any further. We use Terraform and Chef for infrastructure and configuration management. This helped us configure all infrastructure components (like NATS, Nginx, Redis, Postgres, and app servers) with code.

Local development env

Setting up a local environment for Togai is complex, as it requires many components to run. We use docker-compose to run them on our local desktops.

Testing

As a startup, we focus on speed. So we prefer to write local integration tests rather than pure unit tests. We have a detailed blog planned on why we chose this approach, but in short, it helped us push more features with higher quality in less time. You can find a working example of local integration tests in our open-source identity & access management service (GitHub repo).

For integration tests, Togai relies on Postman’s automation tool (Newman) to test our backend APIs.

Operations

Vector - For log collection, we use Vector to push application logs from the servers to a central log store. We evaluated other collector agents (like Logstash and Fluentd) and settled on Vector because of its smaller memory footprint and high performance. Currently, we store the aggregated logs in a central log store backed by AWS EBS.

New Relic - We use this paid observability solution to monitor app servers, infrastructure & front-end synthetics.

Emails - We use emails to notify us of alerts from New Relic. We are looking to move to PagerDuty or another, better notification system for alerts.

Other logistics - Trello, GitHub & Slack

Summary

This blog post contains a lot of opinionated choices. We are not claiming our choices are the best. As with any software, every item on this list has its pros and cons.

We mainly chose these technologies and libraries based on

  1. our design tenets,
  2. technology familiarity and
  3. development speed.

I hope this guides you to make an informed decision when choosing the tech for your apps.
