GitHub Availability Report: January 2021
Introduction In January, we experienced one incident resulting in significant impact and degraded state of availability for the GitHub Actions service. January 28 04:21 UTC (lasting 3 hours 53 minutes)…
Introduction
In January, we experienced one incident resulting in significant impact and degraded state of availability for the GitHub Actions service.
January 28 04:21 UTC (lasting 3 hours 53 minutes)
Our service monitors detected abnormal levels of errors affecting the Actions service. This incident resulted in the failure or delay of some queued jobs for a period of time. Jobs that were queued during the incident were run successfully after the issue was resolved.
We identified the issue as caused by an infrastructure error in our SQL database layer. The database failure impacted one of the core microservices that facilitates authentication and communication between the Actions microservices, which affected queued jobs across the service. In normal circumstances, automated processes would detect that the database was unhealthy and failover with minimal or no customer impact. In this case, the failure pattern was not recognized by the automated processes, and telemetry did not show issues with the database, resulting in a longer time to determine the root cause and complete mitigation efforts.
To help avoid this class of failure in the future, we are updating the automation processes in our SQL database layer to improve error detection and failovers. Furthermore, we are continuing to invest in localizing failures to minimize the scope of impact resulting from infrastructure errors.
In summary
We’ll continue to keep you updated on the progress we’re making on ensuring reliability of our services. To learn more about how teams across GitHub identify and address opportunities to improve our engineering systems, check out the GitHub Engineering blog.
Tags:
Written by
Related posts
GitHub and JFrog partner to unify code and binaries for DevSecOps
This partnership between GitHub and JFrog enables developers to manage code and binaries more efficiently on two of the most widely used developer platforms in the world.
2024 GitHub Accelerator: Meet the 11 projects shaping open source AI
Announcing the second cohort, delivering value to projects, and driving a new frontier.
Introducing GitHub Copilot Extensions: Unlocking unlimited possibilities with our ecosystem of partners
The world of Copilot is getting bigger, improving the developer experience by keeping developers in the flow longer and allowing them to do more in natural language.