Introducing the CodeSearchNet challenge
We’re announcing the CodeSearchNet Challenge and releasing a large dataset for natural language processing and machine learning.
From maintaining our robust CI/CD tooling to scaling our databases and ensuring our tools are kept up-to-date, maintaining the infrastructure of the world’s leading AI-powered developer platform is a big job. This is how we do it.
We’re announcing the CodeSearchNet Challenge and releasing a large dataset for natural language processing and machine learning.
We recently upgraded GitHub to use the latest version of Ruby 2.6. Ruby 2.6 contains an optimization for reducing memory usage.
Performance and reliability conversations as a GitHub product
On August 15th GitHub celebrated a major milestone: our main application is now running on the latest version of Rails: 5.2.1! 🎉 In total the project took a year and…
At GitHub, we serve tens of thousands of requests every second out of our network edge, operating on GitHub’s metal cloud. We’ve previously introduced GLB, our scalable load balancing solution…
GitHub uses MySQL as its main datastore for all things non-git, and its availability is critical to GitHub’s operation. The site itself, GitHub’s API, authentication and more, all require database…
GitHub’s Spokes system stores multiple distributed copies of Git repositories. This article discusses how we got Spokes replication to span widely separated datacenters. Background: Spokes GitHub developed a system called…
At GitHub, we use MySQL as the main database technology backing our services. We run classic MySQL master-replica setups, where writes go to the master, and replicas replay master’s changes asynchronously. To be able to serve our traffic we read data from the MySQL replicas.
Over the past 18 months we’ve made a significant investment in GitHub’s physical infrastructure. The goal of this work is to improve the redundancy and global availability of our system.…
GitHub is at a scale that provides exposure to interesting aspects of running a major site and are working to mature and level-up many parts of our infrastructure as we…
Over the last year, GitHub has gradually evolved the infrastructure that runs the Ruby on Rails application responsible for github.com and api.github.com. We reached a big milestone recently: all web…
Our MySQL infrastructure is a critical component to GitHub. MySQL serves GitHub.com, GitHub’s API, authentication and more. Every git request touches MySQL in some way. We are tasked with keeping…
At GitHub we recently revamped how we do DNS from the ground up. This included both how we interact with external DNS providers and how we serve records internally to…
Building robust systems involves designing for failure. As Site Reliability Engineers at GitHub, we’re always on the lookout for places where redundancy can help to mitigate problems, and today we’ll…
Historically, we have used Redis in two ways at GitHub: We used it as an LRU cache to conveniently store the results of expensive computations over data originally persisted in…
Build what’s next on GitHub, the place for anyone from anywhere to build anything.
Get tickets to the 10th anniversary of our global developer event on AI, DevEx, and security.