How I reduced integration test execution times by 95% using a secret PostgreSQL gem

The treasure of Isla de PostgreSQL

So, we’ve all been there: needing to run the full test suite locally, or as part of a CI flow, while it takes ages…

Many may not be aware of a powerful feature in PostgreSQL: when creating a new database, you can clone an existing database by using it as a template. This is particularly advantageous in CI environments. Instead of running migrations for every schema or database, you run them once on a base template, which significantly reduces overhead. The efficiency gain stems from the fact that copying a database is generally far cheaper than executing the many SQL statements needed to set up tables, indices, and foreign keys.

To put this into perspective: I employed this technique in my own work and saw test execution times drop from approximately 2 minutes to just 2–3 seconds. Yes. That much.
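At the SQL level the clone is a single statement: CREATE DATABASE … TEMPLATE …. As a minimal sketch (the helper name and database names here are my own, not from the project), this is the statement you would send to PostgreSQL:

```go
package main

import "fmt"

// cloneDatabaseSQL builds the CREATE DATABASE statement that clones an
// existing template database. Identifiers are quoted to stay safe with
// unusual database names.
func cloneDatabaseSQL(newDB, template string) string {
	return fmt.Sprintf(`CREATE DATABASE %q TEMPLATE %q`, newDB, template)
}

func main() {
	// Clone an already-migrated template1 into a fresh test database.
	fmt.Println(cloneDatabaseSQL("test_db_1", "template1"))
	// prints: CREATE DATABASE "test_db_1" TEMPLATE "template1"
}
```

Executing this against a migrated template1 hands you a fully set-up test database without replaying a single migration.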

Benchmarks

The secret sauce for this is the template feature.

The benchmarks paint a clear picture: a concise 6 seconds to complete when leveraging template1. Quite impressive.

6-second benchmark: a run of 100 iterations with sample migrations of dummy models, using the template1 feature so the migrations are shared as a common baseline.

14-second benchmark: a run of 100 iterations in which each test database that needs one is migrated individually.

Comparison

In a head-to-head comparison, the benchmark that harnesses template1 drastically outpaces the standard approach, clocking in at just 6 seconds, while the latter lags behind at 14.6 seconds. Both benchmarks ran 100 iterations, with each iteration applying a set of sample migrations to dummy models. Note that results can vary: the tests showcased here used simple migrations and models. If your migrations are more intricate and your models more complex, the impact could be even more pronounced, particularly if you’ve already identified the integration test phase as a bottleneck in your workflow.

Practical Example

I made a GitHub repository, should you want to test the setup or suggest improvements to it!

There are a few key elements and optimizations to this whole ordeal. The general idea is to do the following:

  • Run the migrations on template1
  • Create a new test database, which automatically uses template1 as its template (a built-in PostgreSQL behaviour)
  • Manage open connections
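The last point deserves a note: PostgreSQL refuses to clone a template while other sessions are connected to it (you get a “source database is being accessed by other users” error), so any connection used to migrate template1 must be closed or terminated before the test databases are created. A hedged sketch of the cleanup statement (the helper name is my own):

```go
package main

import "fmt"

// terminateConnectionsSQL builds a statement that force-closes every other
// session connected to the named database, freeing it up for use as a
// template. The current session is excluded via pg_backend_pid().
func terminateConnectionsSQL(dbName string) string {
	return fmt.Sprintf(
		"SELECT pg_terminate_backend(pid) FROM pg_stat_activity "+
			"WHERE datname = '%s' AND pid <> pg_backend_pid()", dbName)
}

func main() {
	fmt.Println(terminateConnectionsSQL("template1"))
}
```

Simply closing the migration connection works too; terminating is the belt-and-braces option for CI runners where stray connections linger.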

Synchronizing migration of template1

A critical aspect of setting up an efficient test environment is to avoid redundant operations that can lead to increased execution time and resource usage. Go’s sync.Once provides a powerful tool for such an optimization, particularly when used in database migration scenarios. Here’s a brief look at how it works:

var lock sync.Once

lock.Do(func() {
    if err := migrateTemplateDatabase(templateDB); err != nil {
        log.Fatalf("failed to migrate template database: %v", err)
    }
})

In the snippet above, sync.Once is used to guarantee that migrateTemplateDatabase is executed exactly once. This is essential when our aim is to set up a template database: a process that should occur a single time regardless of how many test databases we subsequently create.

However, there’s an interesting nuance to consider. When tests run in parallel across multiple packages, each package’s test binary is a separate process, so each gets its own sync.Once. In theory, each package could therefore attempt its own migration. In practice, this is not a concern for our database setup: creating a new database from template1 requires exclusive access to template1, so only one such operation can proceed at a time. Even if the sync.Once blocks were entered in parallel across packages, the database operations would remain safe and sequential.
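Within a single test binary, the guarantee is easy to demonstrate: however many goroutines race into Do, the function body runs exactly once. A self-contained illustration (not code from the project):

```go
package main

import (
	"fmt"
	"sync"
)

// runOnceDemo launches ten goroutines that all call once.Do; only the
// goroutine that wins the race executes the body and increments the counter.
func runOnceDemo() int {
	var once sync.Once
	calls := 0

	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Do blocks concurrent callers until the first body returns,
			// so reading calls after wg.Wait() is race-free.
			once.Do(func() { calls++ })
		}()
	}
	wg.Wait()
	return calls
}

func main() {
	fmt.Println(runOnceDemo()) // prints 1
}
```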

The function for running migrations is also pretty simple, due to the PoC-nature of this project:

func migrateTemplateDatabase(templateDB *gorm.DB) error {
    // Run migrations on template1.
    if err := Migrate(templateDB); err != nil {
       return fmt.Errorf("failed to migrate template1: %w", err)
    }

    return nil
}

Within this function, we invoke Migrate, a method that applies the schema changes captured in our migration scripts. The function is designed to be idempotent: it can safely run multiple times without altering the final state of the database after the initial migration. GORM, like most other ORMs that support migrations, keeps track of which migrations it has already applied.

Final words

If your integration tests are hampered by slow database setups, the strategy we’ve discussed might just be the thing you need. The idea is that duplicating an existing database template is more cost-effective than executing a multitude of individual SQL queries repetitively. For a hands-on look, check out the repository to dive deeper into the implementation.

