Sharing security expertise through CodeQL packs (Part I)
Introducing CodeQL packs to help you codify and share your knowledge of vulnerabilities.
Congratulations! You’ve discovered a security bug in your own code before anyone has exploited it. It’s a big relief. You’ve created a CodeQL query to find other places where this happens, and you’ve deployed it to run on every pull request in your repo so that similar mistakes are never made again.
What’s the best way to share this knowledge with the community to help protect the open source ecosystem by making sure that the same vulnerability is never introduced into anyone’s codebase, ever?
The short answer: produce a CodeQL pack containing your queries, and publish it to GitHub. CodeQL packaging is a beta feature in the CodeQL ecosystem. With CodeQL packaging, your expertise is documented, concise, executable, and easily shareable.
This is the first post of a two-part series on CodeQL packaging. In this post, we show how to use CodeQL packs to share security expertise. In the next post, we will discuss some of our implementation and design decisions.
Modeling a vulnerability in CodeQL
CodeQL’s customizability makes it great for detecting vulnerabilities in your code. Let’s use the Exec call vulnerable to binary planting query as an example. This query was developed by our team in response to discovering a real vulnerability in one of our open source repositories.
The purpose of this query is to detect executables that are potentially vulnerable to Windows binary planting, an exploit where an attacker could inject a malicious executable into a pull request. This query is meant to be evaluated on JavaScript code that is run inside of a GitHub Action. It matches all arguments to calls to the ToolRunner (a GitHub Action API) where the argument has not been sanitized (that is, ensured to be safe) by having been wrapped in a call to safeWhich. The implementation details of this query are not relevant to this post, but you can explore this query and other domain-specific queries like it in the repository.
This query is currently protecting us on every pull request, but in its current form, it is not easily available for others to use. Even though this vulnerability is relatively difficult to attack, the surface area is large, and it could affect any GitHub Action running on Windows in public repositories that accept pull requests. You could write a stern blog post on the dangers of invoking unqualified Windows executables in untrusted pull requests (maybe you’re even reading such a post right now!), but your impact would be much higher if you shared the query to help anyone find the bug in their code. This is where CodeQL packaging comes in. Using CodeQL packaging, not only can developers easily learn about the binary planting pattern, but they can also automatically apply the pattern to find the bug in their own code.
Sharing queries through CodeQL packs
If you think that your query is general purpose and applicable to all repositories in all situations, then it is best to contribute it to our open source CodeQL query repository (and collect a bounty in the process!). That way, your query will be run on every pull request on every repository that has GitHub code scanning enabled.
However, many (if not most) queries are domain specific and not applicable to all repositories. For example, this particular binary planting query is only applicable to GitHub Actions implemented in JavaScript. The best way to share such queries is by creating a CodeQL pack and publishing it to the CodeQL package registry to make it available to the world. Once published, CodeQL packs are easily shared with others and executed in their CI/CD pipeline.
There are two kinds of CodeQL packs:
- Query packs, which contain a set of pre-compiled queries that can be easily evaluated on a CodeQL database.
- Library packs, which contain CodeQL libraries (*.qll files), but do not contain any runnable queries. Library packs are meant to be used as building blocks to produce other query packs or library packs.
In the rest of this post, we will show you how to create, share, and consume a CodeQL query pack. Library packs will be introduced in a future blog post.
To create a CodeQL pack, you’ll need to make sure that you’ve installed and set up the CodeQL CLI. You can follow the instructions here.
The next step is to create a qlpack.yml file. This file declares the CodeQL pack and information about it. Any *.ql files in the same directory (or sub-directory) as a qlpack.yml file are considered part of the package. In this case, you can place binary-planting.ql next to the qlpack.yml file.
Here is the qlpack.yml from our example:
name: aeisenberg/codeql-actions-queries
version: 1.0.1
dependencies:
  codeql/javascript-all: ~0.0.10
All CodeQL packs must have a name property. If they are going to be published to the CodeQL registry, then they must have a scope as part of the name. The scope is the part of the package name before the slash (in this example: aeisenberg). It should be the username or organization on github.com that will own this package. Anyone publishing a package must have the proper privileges to do so for that scope. The name part of the package name must be unique within the scope. Additionally, a version, following standard semver rules, is required for publishing.
The dependencies block lists all of the dependencies of this package and their compatible version ranges. Each dependency is referenced as the scope/name of a CodeQL library pack, and each library pack may in turn depend on other library packs declared in their qlpack.yml files. Each query pack must (transitively) depend on exactly one of the core language packs (for example, JavaScript, C#, Ruby, etc.), which determines the language your query can analyze.
In this query pack, the standard JavaScript library pack, codeql/javascript-all, is the only dependency, and the semver range ~0.0.10 means any version >= 0.0.10 and < 0.1.0 suffices.
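To make the tilde range concrete, here is a minimal Python sketch of the check described above. It assumes npm-style tilde semantics for a full major.minor.patch range (patch-level updates only) and ignores pre-release tags. The function names are illustrative and are not part of the CodeQL CLI.

```python
def parse_version(text):
    """Split 'major.minor.patch' into a tuple of ints (no pre-release support)."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def satisfies_tilde(version, base):
    """Return True if `version` falls inside the range ~base.

    Assumption: ~X.Y.Z means >= X.Y.Z and < X.(Y+1).0.
    """
    candidate = parse_version(version)
    lower = parse_version(base)
    upper = (lower[0], lower[1] + 1, 0)
    # Tuples compare component by component, which matches version ordering.
    return lower <= candidate < upper


if __name__ == "__main__":
    for candidate in ["0.0.9", "0.0.10", "0.0.14", "0.1.0"]:
        print(candidate, satisfies_tilde(candidate, "0.0.10"))
```

Under these assumptions, 0.0.14 (the version installed in this post's example) falls inside ~0.0.10, while 0.0.9 and 0.1.0 do not.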
With the qlpack.yml defined, you can now install all of your declared dependencies. Run the codeql pack install command in the root directory of the CodeQL pack:
$ codeql pack install
Dependencies resolved. Installing packages...
Install location: /Users/andrew.eisenberg/.codeql/packages
Installed fresh codeql/javascript-all@0.0.14
After making any changes to the query, you can then publish the query to the GitHub registry. You do this by running the codeql pack publish command in the root of the CodeQL pack.
Here is the output of the command:
$ codeql pack publish
Running on packs: aeisenberg/codeql-actions-queries.
Bundling and then publishing qlpack located at '/Users/andrew.eisenberg/git-repos/codeql-actions-queries'.
Bundled qlpack created at '/var/folders/41/kxmfbgxj40dd2l_x63x9fw7c0000gn/T/codeql-docker17755193287422157173/.Docker Package Manager/codeql-actions-queries.1.0.1.tgz'.
Packaging> Package 'aeisenberg/codeql-actions-queries' will be published to registry 'https://ghcr.io/v2/' as 'aeisenberg/codeql-actions-queries'.
Packaging> Package 'aeisenberg/codeql-actions-queries@1.0.1' will be published locally to /Users/andrew.eisenberg/.codeql/packages/aeisenberg/codeql-actions-queries/1.0.1
Publish successful.
You have successfully published your first CodeQL pack! It is now available in the registry on GitHub.com for anyone else to run using the CodeQL CLI. You can view your newly-published package on github.com.
At the time of this writing, packages are initially uploaded as private packages. If you want to make it public, you must explicitly change the permissions. To do this, go to the package page, click on package settings, then scroll down to the Danger Zone and click Change visibility.
Running queries from CodeQL packs using the CodeQL CLI
Running the queries in a CodeQL pack is simple using the CodeQL CLI. If you already have a database created, just call the codeql database analyze command with the --download option, passing a reference to the package you want to use in your analysis:
$ codeql database analyze --format=sarif-latest --output=out.sarif --download my-db aeisenberg/codeql-actions-queries@^1.0.1
The --download option asks CodeQL to download any CodeQL packs that aren’t already available. The ^1.0.1 version range is optional and specifies that you want to run the latest version of the package that is compatible with ^1.0.1. If no version range is specified, then the latest version is always used. You can also pass a list of packages to evaluate. The CodeQL CLI will download and cache each specified package and then run all queries in their default query suite.
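The version selection described above can be sketched in a few lines of Python. This is an illustration only, not how the CodeQL CLI is implemented: it assumes npm-style caret semantics (the left-most non-zero component may not change), ignores pre-release tags, and uses a made-up list of published versions.

```python
def parse_version(text):
    """Split 'major.minor.patch' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def caret_upper_bound(base):
    """Exclusive upper bound of ^base (assumption: npm-style caret rules)."""
    major, minor, patch = base
    if major > 0:
        return (major + 1, 0, 0)
    if minor > 0:
        return (0, minor + 1, 0)
    return (0, 0, patch + 1)


def latest_compatible(published, base_text=None):
    """Pick the highest published version, optionally limited to ^base_text.

    Returns None when no published version satisfies the range.
    """
    versions = [parse_version(v) for v in published]
    if base_text is not None:
        lower = parse_version(base_text)
        upper = caret_upper_bound(lower)
        versions = [v for v in versions if lower <= v < upper]
    if not versions:
        return None
    return ".".join(str(n) for n in max(versions))


if __name__ == "__main__":
    published = ["1.0.0", "1.0.1", "1.2.3", "2.0.0"]  # hypothetical versions
    print(latest_compatible(published, "1.0.1"))  # range given
    print(latest_compatible(published))           # no range: newest overall
```

With the hypothetical list above, a ^1.0.1 range selects 1.2.3, while omitting the range selects 2.0.0.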
To run a subset of queries in a pack, add a : and a path after it:
aeisenberg/codeql-actions-queries@^1.0.1:binary-planting.ql
Everything after the : is interpreted as a path relative to the root of the pack, and you can specify a single query, a query directory, or a query suite (.qls file).
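The anatomy of a pack reference, scope/name followed by an optional @range and an optional :path, can be illustrated with a small Python sketch. The function name and the returned dictionary are hypothetical helpers for this post. The CodeQL CLI does its own parsing and may accept forms this sketch does not.

```python
def parse_pack_reference(reference):
    """Split 'scope/name[@range][:path]' into its parts.

    Returns a dict with keys 'scope', 'name', 'range', and 'path';
    'range' and 'path' are None when they are omitted.
    """
    # Peel off the path first, since it may itself contain slashes.
    rest, sep, path = reference.partition(":")
    path = path if sep else None

    # Then separate the optional version range from the pack name.
    full_name, sep, version_range = rest.partition("@")
    version_range = version_range if sep else None

    # A publishable pack name always has a scope before the slash.
    scope, sep, name = full_name.partition("/")
    if not sep or not scope or not name:
        raise ValueError(f"expected scope/name, got {full_name!r}")

    return {"scope": scope, "name": name, "range": version_range, "path": path}


if __name__ == "__main__":
    print(parse_pack_reference(
        "aeisenberg/codeql-actions-queries@^1.0.1:binary-planting.ql"
    ))
    print(parse_pack_reference("aeisenberg/codeql-actions-queries"))
```

The second call shows the shortest form: with no range and no path, the latest version and the default query suite are implied.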
Evaluating CodeQL packs from code scanning
Running the queries from your CodeQL pack in GitHub code scanning is easy! In your code scanning workflow, in the github/codeql-action/init step, add a packs entry to list the packs you want to run:
- uses: github/codeql-action/init@v1
  with:
    packs:
      - aeisenberg/codeql-actions-queries@1.0.1
    languages: javascript
Note that specifying a path after a colon is not yet supported in the codeql-action, so with this approach you can only run the default query suite of a pack.
Conclusion
We’ve shown how easy it is to share your CodeQL queries with the world using two CLI commands: the first resolves and retrieves your dependencies and the second compiles, bundles, and publishes your package.
To recap:
Publishing a CodeQL query pack consists of:
- Create the qlpack.yml file.
- Run codeql pack install to download dependencies.
- Write and test your queries.
- Run codeql pack publish to share your package in GHCR.
Using a CodeQL query pack from GHCR on the command line consists of:
codeql database analyze --format=sarif-latest --output=out.sarif --download path/to/my-db aeisenberg/codeql-actions-queries@1.0.1
Using a CodeQL query pack from GHCR in code-scanning consists of:
- Adding a config-file input to the github/codeql-action/init action
- Adding a packs block in the config file
The CodeQL Team has already published all of our standard queries as query packs, and all of our core libraries as library packs. Any pack named {*}-queries is a query pack and contains queries that can be used to scan your code. Any pack named {*}-all is a library pack and contains CodeQL libraries (*.qll files) that can be used as the building blocks for your queries. When you create your own query packs, add the library pack for the language that your query will scan as a dependency.
If you are interested in understanding more about how we’ve implemented packaging and some of our design decisions, please check out our second post in this series. Also, if you are interested in learning more or contributing to CodeQL, get involved with the Security Lab.
Sharing your security expertise has never been easier!