Migrating to GitHub Apps: Code Climate shares their story

Code Climate shares their experience and what they learned building with GitHub Apps.


We first started talking about GitHub Apps in 2017 as the recommended way for developers to build integrations on GitHub. With GitHub Apps, developers can use either GitHub’s REST or GraphQL APIs to interact with the GitHub ecosystem and provide a more flexible security model for users. Since the announcement, we have released new documentation and guides to help simplify the process of building new integrations or migrating existing OAuth Apps, and we have been talking to integrators about their experiences. For those considering building or migrating to GitHub Apps, we thought it might be interesting to share one of those stories, from Code Climate.

Code Climate has built on GitHub from the very beginning. Now they’re using GitHub Apps to build bots, manage server-to-server integrations, limit permission scopes, and provide better support to the engineers using their products. Here’s how they build with GitHub Apps, and what they learned in the process, in the words of Chris Hulton, Senior Engineer.

Code Climate Switches to GitHub Apps

Why are we so excited about GitHub Apps?

Though GitHub OAuth provides great capabilities for managing user-initiated actions, it presented us with a few limitations for building server-to-server integrations.

Using GitHub Apps to build bots

In our Quality product, we wanted to provide users with an Automated Code Review feature. Before GitHub Apps, we used individual user credentials to retrieve and post data into GitHub. We couldn’t post review comments on our customers’ pull requests as a Code Climate bot. GitHub Apps solve this problem.

A bot finds code similarities and suggests refactoring

Using GitHub Apps to manage server-to-server integrations

Some of our customers have immediate security policies that don’t allow us to use SSH to clone repository data from GitHub. Prior to GitHub Apps, we implemented HTTPS repository data cloning by rotating through various users’ OAuth access tokens. This meant that these HTTPS calls were associated with specific GitHub users, even though those users did not initiate them. With GitHub Apps, we can authenticate to pull this type of data as a service.

Using GitHub Apps to limit permission scope

In the world of GitHub OAuth, read and write access to repository data are bundled together. GitHub Apps provides much more granularity in permissions. We are now able to ask for only the access level we need (read-only) on the specific resources we need (e.g. pull requests). This aligns the requested permission with the services we provide, making our users more comfortable.

Permission request to install Code Climate Velocity testing

How did we go about implementing GitHub Apps?

We found that switching to GitHub Apps did not require our existing system to be substantially restructured. The API endpoints and queries that we implemented via OAuth remained the same for GitHub Apps, and only our method of authentication changed.

For querying, we use GitHub’s graphql-client gem, with our client defined as:

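# HTTP adapter for graphql-client that attaches a bearer token to every request.
class HTTP < ::GraphQL::Client::HTTP
  def initialize(vcs, access_token)
    super(vcs.graphql_api_url)
    @access_token = access_token
  end

  # A token passed in the query context overrides the default one.
  def headers(context)
    {}.tap do |h|
      if (token = context.fetch(:access_token, @access_token))
        h.merge!("Authorization" => "Bearer #{token}")
      end
    end
  end
end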

With OAuth, for access_token, we provided the authorized OAuth token of a random user in the organization. Randomization helped us to balance usage across the organization, but was not ideal. We wanted to make the API request on behalf of the organization itself, which is where GitHub Apps came in.

With GitHub Apps, access is provided via short-lived, renewable tokens belonging to the installation, and the installation is granted the appropriate permissions. We introduced a component in our code that would generate a token for the organization’s GitHub App installation using the octokit gem and the credentials of our GitHub App:

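  # Exchange the app-level JWT for a short-lived installation access token.
  def create_token
    create_token_client.create_app_installation_access_token(
      external_database_id,
      accept: Octokit::Preview::PREVIEW_TYPES[:integrations],
    )
  end

  # Octokit client that authenticates as the GitHub App itself, using the JWT.
  def create_token_client
    @create_token_client ||= Octokit::Client.new(
      bearer_token: generate_jwt_token,
      api_endpoint: vcs.api_url,
      web_endpoint: vcs.web_url,
      connection_options: {
        url: vcs.api_url,
      },
    )
  end

  # Sign a JWT with the app's private key so GitHub can identify the app.
  def generate_jwt_token
    now = Time.now.to_i
    key = OpenSSL::PKey::RSA.new(vcs_app.private_key)
    opts = {
      iat: now,                                # issued at
      exp: now + 60,                           # valid for 60 seconds
      iss: vcs_app.external_database_id.to_i,  # issuer: the GitHub App's ID
    }

    JWT.encode(opts, key, "RS256")
  end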

We then pass this token to the API client in exactly the same way as before, keeping the overall change small and isolated.
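To make the JWT step above more concrete, here is a minimal sketch of what an RS256-signed token looks like when built with only Ruby's standard library. It is an illustration under assumptions, not the code we run in production (which uses the jwt gem as shown): the helper names `base64url` and `sketch_app_jwt`, the app ID `12345`, and the throwaway key are all hypothetical.

```ruby
require "openssl"
require "base64"
require "json"

# JWT segments use URL-safe Base64 without padding.
def base64url(data)
  Base64.urlsafe_encode64(data, padding: false)
end

# Hypothetical helper: assemble header.payload.signature by hand.
def sketch_app_jwt(app_id, private_key, now: Time.now.to_i)
  header  = { alg: "RS256", typ: "JWT" }
  payload = { iat: now, exp: now + 60, iss: app_id }

  signing_input = [header, payload].map { |part| base64url(JSON.generate(part)) }.join(".")
  # RS256 is an RSA (PKCS#1 v1.5) signature over a SHA-256 digest.
  signature = private_key.sign(OpenSSL::Digest.new("SHA256"), signing_input)

  "#{signing_input}.#{base64url(signature)}"
end

# Throwaway key for demonstration only; a real app signs with its own private key.
demo_key = OpenSSL::PKey::RSA.new(2048)
token = sketch_app_jwt(12345, demo_key)
```

The short `exp` window mirrors the 60 seconds used in `generate_jwt_token`: the JWT only needs to live long enough to request an installation token.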

Additionally, each generated token comes with an associated expires_at timestamp. To improve performance, we store an encrypted copy of the temporary token along with this timestamp in our database, allowing us to refresh the token only when necessary.
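The refresh-only-when-necessary idea can be sketched roughly as follows. This is a simplified, in-memory illustration: `CachedToken`, `InstallationTokenCache`, and the 60-second safety margin are hypothetical names and values, and encryption and database storage are left out.

```ruby
# Hypothetical stand-in for the stored record (token plus its expires_at).
CachedToken = Struct.new(:token, :expires_at)

class InstallationTokenCache
  # Refresh a little early so a token does not expire mid-request.
  SAFETY_MARGIN = 60 # seconds

  # The block must return a fresh CachedToken each time it is called.
  def initialize(&fetcher)
    @fetcher = fetcher
    @cached = nil
  end

  # Return the cached token, fetching a new one only when it is missing or stale.
  def token(now: Time.now)
    @cached = @fetcher.call if stale?(now)
    @cached.token
  end

  private

  def stale?(now)
    @cached.nil? || @cached.expires_at - SAFETY_MARGIN <= now
  end
end

# Usage: the block would call something like create_token in a real system.
issued = 0
cache = InstallationTokenCache.new do
  issued += 1
  CachedToken.new("token-#{issued}", Time.now + 3600)
end
cache.token # fetches a token
cache.token # reuses the cached token
```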

What did we learn in the process?

API Rate Limit Calculation

During this migration, we learned about the different rate-limiting mechanisms between GitHub OAuth and GitHub Apps.

With OAuth, we were able to cycle through user tokens to make API requests. This provided a very high capacity, as each token allowed for 5,000 GraphQL points per hour. When switching to GitHub Apps, however, the rate limit became organization-wide, meaning we had to think more critically about how often we were hitting the API and how expensive our queries were.

To understand our current GraphQL usage, we used the rateLimit object received from our GraphQL queries, and began tracking statistics using StatsD around how expensive each query was, as well as how often the tokens we used approached their rate limits:

  # Record each query's cost and the remaining budget as StatsD gauges,
  # keyed by the name of the query definition.
  def process_RateLimit(object)
    object = Connectors::GitHub::Graphql::Fragments::RateLimitFragment.new(object)

    prefix = ["graphql", definition.name.demodulize, "limit"].join(".")
    $statsd.gauge("#{prefix}.cost", object.cost)
    $statsd.gauge("#{prefix}.remaining", object.remaining)
  end

We found that many of our GraphQL queries were fairly inexpensive, but we identified one particularly complex query to be expensive (15 GraphQL points) and frequent (every 10 minutes for every repository in an organization). We thought about how this query would perform for a large organization in our system (for example, 100 repositories) when using GitHub Apps:

100 repositories × 15 GraphQL points × 6 queries per hour = 9,000 GraphQL points / hour

This was initially concerning, as the number was significantly more than the 5,000 GraphQL point rate limit provided for an installation. However, another advantage of GitHub Apps is that the rate limit scales with the size of the organization. For installations with more than 20 repositories, an additional 50 requests (or 50 GraphQL points) are provided for each repository per hour.

With this increase, we identified our modified rate limit would be:

5,000 GraphQL points base + (50 additional points × 100 repositories) = 10,000 GraphQL points
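The two calculations above can be reproduced with a few lines of Ruby. This sketch follows the arithmetic exactly as described in this post; the constant and method names are hypothetical, and GitHub's current rate limit rules may differ from the figures used here.

```ruby
# Figures taken from the post; treat them as assumptions, not current limits.
BASE_POINTS_PER_HOUR  = 5_000
EXTRA_POINTS_PER_REPO = 50
SCALING_THRESHOLD     = 20 # repositories

# Points consumed per hour by one query that runs for every repository.
def hourly_query_cost(repositories:, points_per_query:, queries_per_hour:)
  repositories * points_per_query * queries_per_hour
end

# Hourly limit for an installation, using the formula from the post.
def installation_hourly_limit(repositories)
  return BASE_POINTS_PER_HOUR if repositories <= SCALING_THRESHOLD

  BASE_POINTS_PER_HOUR + EXTRA_POINTS_PER_REPO * repositories
end

cost  = hourly_query_cost(repositories: 100, points_per_query: 15, queries_per_hour: 6)
limit = installation_hourly_limit(100)
puts "cost=#{cost} limit=#{limit} headroom=#{limit - cost}"
# prints "cost=9000 limit=10000 headroom=1000"
```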

This provided us with enough capacity to maintain our current system. To help us stay comfortably within our limit going forward, we worked with GitHub to identify additional strategies for keeping our API usage within its rate limit:

  • Ingest real-time data via inbound webhooks
  • Catch up with historical data using the API
  • Throttle your API requests to stay within your rate limits
  • Use conditional requests where possible (currently only available via the v3 REST API)
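As one illustration of the throttling strategy in the list above, a client can decide how long to wait before its next query based on the `remaining` and `resetAt` fields of the GraphQL `rateLimit` object. This is a minimal sketch, not our production code: `throttle_delay` and its `reserve` buffer of 100 points are hypothetical.

```ruby
# Seconds to wait before sending a query that is expected to cost `cost` points.
#   remaining - points left in the current window (rateLimit.remaining)
#   reset_at  - Time at which the window resets (parsed from rateLimit.resetAt)
#   reserve   - hypothetical buffer of points kept free for urgent requests
def throttle_delay(remaining:, cost:, reset_at:, now: Time.now, reserve: 100)
  return 0.0 if remaining - cost >= reserve

  # Not enough budget left: wait until the window resets (never a negative wait).
  [reset_at - now, 0.0].max
end

# Usage: sleep only when the budget is nearly exhausted.
delay = throttle_delay(remaining: 4_000, cost: 15, reset_at: Time.now + 1_800)
sleep(delay) if delay.positive?
```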

Impact on User Onboarding

Another lesson we learned during this migration is that GitHub Apps requires a different onboarding approach. Unlike OAuth, where users typically identify themselves and grant access in one step, with GitHub Apps, users are required to leave our site and install the application on GitHub before data can be processed. To smooth out this flow, we use the GitHub App installation endpoints to track the user’s onboarding progress and display in-app calls to action linking out to the GitHub App setup URL (when appropriate).
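The onboarding flow can be thought of as a small state check. The sketch below is purely illustrative: the method name, the step names, and the call-to-action text are hypothetical, and in a real system the installation ID would come from GitHub's App installation endpoints.

```ruby
# Decide which onboarding step a user is on.
#   signed_in       - whether the user has identified themselves
#   installation_id - ID of the organization's GitHub App installation, or nil
def onboarding_step(signed_in:, installation_id:)
  return :sign_in unless signed_in
  return :install_app if installation_id.nil?

  :ready
end

# Hypothetical in-app calls to action for each step.
CALLS_TO_ACTION = {
  sign_in: "Sign in with GitHub",
  install_app: "Install the GitHub App to start processing data",
  ready: nil, # nothing left to do
}.freeze

step = onboarding_step(signed_in: true, installation_id: nil)
puts CALLS_TO_ACTION[step] if CALLS_TO_ACTION[step]
```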

Wrap up

We are very excited to continue to build on GitHub Apps and to take advantage of all the data exposed through the GraphQL API. We’re also looking forward to partnering with GitHub and pioneering more beta features in the future.

If you’re interested in taking a more data-driven approach to engineering management (and want to see our GitHub Apps workflow in action) check us out!
