Optimizing Deployments for Cloud Functions

Kshitij Chauhan, Apr 25, 2022

Coffee Filter - Selective Filtering

Firebase Functions are a convenient wrapper around Google Cloud Functions.

My team at upcover started using them to prototype and deploy APIs on Google Cloud. They provide an easy local development experience with the Firebase Emulator Suite, and a convenient deployment workflow with the firebase CLI.

Issues

As our product grew in complexity, the number of Cloud Functions we maintained increased significantly. Once we hit around 60 Cloud Functions, we experienced two major issues:

  1. The firebase CLI tries to deploy all Cloud Functions in your project in parallel. Given enough Cloud Functions, it leads to rate limit errors with the Google Cloud API.
  2. The CLI deploys every Cloud Function every time, regardless of whether a function actually needs redeployment.

Solving Rate Limit Errors

Google Cloud has well defined quotas for API calls. The firebase deploy command frequently exceeded this quota, because it tried to deploy every Cloud Function in parallel.

To solve the API rate limit problem, we started deploying our Cloud Functions in batches of 30. I created a simple Node.js script that did the following:

  1. Discover all Cloud Functions exported from the repository.
  2. Partition discovered functions into batches of 30 (I settled on the number 30 arbitrarily).
  3. Run the firebase deploy --only functions:[<function-names>] for the functions in each batch.

Here's a simplified implementation of the script:

import { chunk } from "lodash";
import { execSync } from "child_process";

const functionsIndex = require("./libs/src/index");
const exportedFunctions = Object.keys(functionsIndex);

const batches = chunk(exportedFunctions, 30);

for (const batch of batches) {
  const args = batch
    .map((functionName) => `function:${functionName}`)
    .join(",");

  const command = `firebase deploy --only ${args}`;
  execSync(command, { stdio: "inherit" });
}

The functionsIndex module is a compiled JavaScript file. It's produced from the src/index.ts TypeScript file containing all exported Cloud Functions. Here's a sample of what it looks like:

// src/index.ts

export { auth } from "./services/auth/function";
export { quotes } from "./services/quotes/function";
export { documents } from "./services/documents/function";

This script solved the rate limit errors, but led to longer CI builds.

It ran firebase deploy multiple times: once for every batch of Cloud Functions. While this led to longer builds, it was needed because the deployments outright failed otherwise.

Adopting Selective Redeployments

Every Cloud Function requires time to deploy. The more Cloud Functons you have, the longer your deployment time. When our CI build hit a duration of 30 mins, we began investigating possible solutions.

Our CTO, Sajjad Naveed, pointed out that we could reduce the time by deploying only the Cloud Functions that changed. To detect changed Cloud Functions, he suggested a way to calculate every function's identity.

If the identity of a function does not change, it does not need redeployment.

Cloud Function Identity

Sajjad suggested bundling a Cloud Function, and then calculating a hash of the bundle to produce the its identity.

I created a script that performed the following steps:

  1. Discover every Cloud Function in the repository.
  2. Bundle the code of every Cloud Function using a bundler, with minification and tree shaking.
  3. Calculate the hash of the bundled code.
  4. Compare the newly calculated hash with an existing hash value stored in a local file. If unchanged, append the function to the list of functions that need redeployment. Finally, update the local file containing hashes with the new hash of this function.
  5. Selectively deploy only the functions that need redeployment.

Why Does This Work?

Consider the following Cloud Function:

import * as functions from "firebase-functions";
import { FooService } from "./fooService";

export const FooFunction = functions.https.onRequest((req, res) => {
  const message = FooService.getMessage();
  res.json({ message });
});

The identity of FooFunction can change due to three reasons only:

  1. If we update the firebase-functions package
  2. If we update the implementation of FooService
  3. If FooFunction itself changes (if we change the response to res.status(201).json({ message }))

Simply put, the identity of a Cloud Function changes only its code, or its dependencies change

Bundling therefore works great for this purpose: the produced bundle for a Cloud Function would change when any/all of the above happen. A different bundle would produce a different hash, and therefore, a new identity.

Results

Selective redeployments dramatically reduced our average deployment time.

If a pull request affected only a single cloud function, merging that pull request would trigger redployment of that function only. However, if a pull request updated a dependency used by many cloud functions, all of them would need redeployment.

Here are some screenshots to demonstrate:

Short Selective Redeployment

Deployment Affecting a Single Cloud Function

Long Selective Redeployment

Deployment Affecting Multiple Cloud Functions

I used ESBuild as the bundler (mainly due to its speed), but the process should work with any bundler.

functions-differ

I decided to build this functionality into a reusable open source package.

Meet functions-differ. A CLI tool written in Node.js that calculates and maintains function identities. Use it to selectively redeploy your Cloud Functions.

It's an experimental tool that I maintain in my free time, so there are some rough edges. Feel free to try it and share feedback.

I plan to write a post on how to integrate functions-differ with your CI workflow in the future. Watch this space for more.


Thanks to Bhanu and Sajjad for contributing to this tool.

Questions, comments, or feedback? Reach out to me on Twitter: @haroldadmin