Yummy cookies across domains
Last Friday we announced and performed a migration of all GitHub Pages to their
own github.io domain. This was a long-planned migration, with the specific goal
of mitigating phishing attacks and cross-domain cookie vulnerabilities arising from
hosting custom user content in a subdomain of our main website.
There’s been, however, some confusion regarding the implications and impact of these
cross-domain cookie attacks. We hope this technical blog post will help clear things up.
Cookie tossing from a subdomain
When you log in on GitHub.com, we set a session cookie through the HTTP headers
of the response. This cookie contains the session data that uniquely identifies
you:
Set-Cookie: _session=THIS_IS_A_SESSION_TOKEN; path=/; expires=Sun, 01-Jan-2023 00:00:00 GMT; secure; HttpOnly
The session cookies that GitHub sends to web browsers are set on the
default domain (github.com), which means they are not accessible from any
subdomain at *.github.com. We also specify the HttpOnly attribute, which means
they cannot be read through the document.cookie JavaScript API. Lastly, we
specify the Secure attribute, which means that they will only be transferred
through HTTPS.
Hence, it’s never been possible to read or “steal” session cookies from a GitHub
Pages hosted site. Session cookies are simply not accessible from the user code
running in GitHub Pages, but because of the way web browsers send cookies in HTTP
requests, it was possible to “throw” cookies from a GitHub Pages site to the
GitHub parent domain.
When the web browser performs an HTTP request, it sends the matching cookies
for the URL in a single Cookie: header, as key-value pairs. Only the cookies
that match the request URL will be sent. For example, when performing a request to
github.com, a cookie set for the domain github.io will not be sent, but a
cookie set for .github.com will.
GET / HTTP/1.1
Host: github.com
Cookie: logged_in=yes; _session=THIS_IS_A_SESSION_TOKEN;
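As a rough illustration of the matching rule described above, here is a minimal Ruby sketch of how a browser decides whether a cookie's domain matches the request host. The `cookie_sent?` helper and its `host_only` flag are hypothetical names of ours; the logic is a simplification of the domain-matching rule in RFC 6265 and ignores IP-address hosts, public suffixes, and the Path and Secure checks.

```ruby
# Simplified sketch of cookie domain matching (hypothetical helper, not a
# library API). A cookie set without a Domain attribute is "host-only" and
# matches the exact host; a cookie set with a Domain attribute also matches
# every subdomain of that domain.
def cookie_sent?(request_host, cookie_domain, host_only: false)
  host = request_host.downcase
  domain = cookie_domain.downcase.sub(/\A\./, '') # a leading dot is ignored
  return host == domain if host_only

  host == domain || host.end_with?(".#{domain}")
end

puts cookie_sent?('github.com', 'github.io')                        # not sent
puts cookie_sent?('github.com', '.github.com')                      # sent
puts cookie_sent?('pages.github.com', 'github.com', host_only: true) # not sent
```

The last call models our session cookie: it is set on the default domain, so a page on a subdomain never receives it.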
Cookie tossing issues arise from the fact that the Cookie header only
contains the name and value for each of the cookies, and none of the extra
information with which the cookies were set, such as the Path or Domain.
The most straightforward cookie-tossing attack would have involved using the
document.cookie JavaScript API to set a _session cookie on a GitHub Pages
hosted website. Given that the website was hosted under *.github.com, this
cookie would have been sent with all requests to the parent domain, despite the
fact it was set in a subdomain.
/* set a cookie in the .github.com subdomain */
document.cookie = "_session=EVIL_SESSION_TOKEN; Path=/; Domain=.github.com"
GET / HTTP/1.1
Cookie: logged_in=yes; _session=EVIL_SESSION_TOKEN; _session=THIS_IS_A_SESSION_TOKEN;
Host: github.com
In this example, the cookie set through JavaScript in the subdomain is sent
next to the legitimate cookie set in the parent domain, and there is no way
to tell which one is coming from where given that the Domain, Path, Secure
and HttpOnly attributes are not sent to the server.
This is a big issue for most web servers, because the ordering of the cookies
set in a domain and in its subdomains is not specified by RFC 6265, and web
browsers can choose to send them in any order they please.
In the case of Rack, the web server interface that powers Rails and Sinatra,
amongst others, cookie parsing happens as follows:
def cookies
  hash = {}
  cookies = Utils.parse_query(cookie_header, ';,')
  cookies.each { |k,v| hash[k] = Array === v ? v.first : v }
  hash
end
If there is more than one cookie with the same name in the Cookie: header,
the first one will be arbitrarily assumed to be the value of the cookie.
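To make the consequence concrete, here is a small stdlib-only sketch of a "first value wins" parser. It is not Rack's code: `parse_cookie_header` is a hypothetical helper that only mimics the behavior described above, and it assumes the tossed cookie happens to be sent first.

```ruby
# Simplified model of a Cookie header parser that keeps the first value seen
# for each name (hypothetical helper, not Rack's implementation).
def parse_cookie_header(header)
  header.split(/[;,]\s*/).each_with_object({}) do |pair, cookies|
    name, value = pair.split('=', 2)
    next if name.nil? || name.strip.empty?

    name = name.strip
    cookies[name] = value.to_s unless cookies.key?(name) # first one wins
  end
end

header = 'logged_in=yes; _session=EVIL_SESSION_TOKEN; _session=THIS_IS_A_SESSION_TOKEN;'
puts parse_cookie_header(header)['_session'] # prints EVIL_SESSION_TOKEN
```

With this ordering the application silently picks the attacker's value and never sees that a second _session cookie existed.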
This is a very well-known attack: a couple of weeks ago, security researcher Egor
Homakov blogged about a proof-of-concept attack just like this one. The impact of
the vulnerability was not critical (CSRF tokens get reset after each log-in, so
they cannot be permanently fixated), but it’s a very practical example that
people could easily reproduce to log out users and be generally annoying. This
forced us to rush our migration of GitHub Pages to their own domain, but left us
with a gap of a few weeks (until the migration was complete), during which we had
to mitigate the disclosed attack vector.
Fortunately, the style of the disclosed attack was simple enough to mitigate on the server
side. We anticipated, however, several other attacks that were either trickier
to stop, or simply impossible. Let’s take a look at them.
Protecting from simple cookie tossing
The first step was mitigating the attack vector of simple cookie tossing.
Again, this attack exploits the fact that web browsers will send two cookie
tokens with the same name without letting us know the domain in which they were
actually set.
We cannot see where each cookie is coming from, but if we skip the cookie
parsing of Rack, we can see whether any given request has two duplicate
_session cookies. The only possible cause for this is that somebody is
attempting to throw cookies from a subdomain, so instead of trying to guess
which cookie is legitimate and which cookie is being tossed, we simply instruct
the web browser to drop the cookie set in the subdomain before proceeding.
To accomplish this, we craft a very specific response: we instruct the web
browser to redirect to the same URL that was just requested, but with a
Set-Cookie header that drops the subdomain cookie.
GET /libgit2/libgit2 HTTP/1.1
Host: github.com
Cookie: logged_in=yes; _session=EVIL_SESSION_TOKEN; _session=THIS_IS_A_SESSION_TOKEN;
HTTP/1.1 302 Found
Location: /libgit2/libgit2
Content-Type: text/html
Set-Cookie: _session=; Expires=Thu, 01-Jan-1970 00:00:01 GMT; Path=/; Domain=.github.com;
We decided to implement this as a Rack middleware. This way the cookie check
and consequent redirect could be performed before the application code gets to
run.
When the Rack middleware triggers, the redirect will happen transparently
without the user noticing, and the second request will contain only one
_session cookie: the legitimate one.
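A middleware along these lines could look roughly like the sketch below. It follows the plain Rack `call(env)` protocol without depending on any gem; the class name, the constants and the exact header handling are assumptions made for illustration, not the code we ran in production.

```ruby
# Sketch of a Rack-style middleware that redirects when the raw Cookie header
# carries more than one `_session` cookie. All names here are hypothetical.
class DuplicateSessionCookieGuard
  COOKIE_NAME = '_session'
  DROP_COOKIE = "#{COOKIE_NAME}=; Expires=Thu, 01-Jan-1970 00:00:01 GMT; " \
                'Path=/; Domain=.github.com'

  def initialize(app)
    @app = app
  end

  def call(env)
    return @app.call(env) unless duplicate_session?(env['HTTP_COOKIE'].to_s)

    # Redirect to the same URL and ask the browser to drop the tossed cookie.
    location = env['PATH_INFO'].to_s
    query = env['QUERY_STRING'].to_s
    location += "?#{query}" unless query.empty?
    headers = { 'Location' => location, 'Content-Type' => 'text/html',
                'Set-Cookie' => DROP_COOKIE }
    [302, headers, []]
  end

  private

  # Look at the raw header: a parsed cookie hash has already lost duplicates.
  def duplicate_session?(raw_header)
    names = raw_header.split(/[;,]\s*/).map do |pair|
      pair.split('=', 2).first.to_s.strip
    end
    names.count(COOKIE_NAME) > 1
  end
end

app = ->(_env) { [200, { 'Content-Type' => 'text/html' }, ['ok']] }
guard = DuplicateSessionCookieGuard.new(app)
env = { 'PATH_INFO' => '/libgit2/libgit2', 'QUERY_STRING' => '',
        'HTTP_COOKIE' => 'logged_in=yes; _session=EVIL; _session=GOOD;' }
status, headers, _body = guard.call(env)
puts status              # prints 302
puts headers['Location'] # prints /libgit2/libgit2
```

Because the legitimate cookie is set on the default domain, the expiring Set-Cookie with Domain=.github.com only targets the cookie tossed from the subdomain.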
This “hack” is enough to mitigate the straightforward cookie tossing attack
that most people would attempt, but there are more complex attacks that we also
need to consider.
Cookie paths workaround
If the malicious cookie is set for a specific path which is not the root (e.g.
/notifications) the web browser will send that cookie when the user visits
github.com/notifications, and when we try to clear it in the root path, our
header will have no effect.
document.cookie = "_session=EVIL_SESSION_TOKEN; Path=/notifications; Domain=.github.com"
GET /notifications HTTP/1.1
Host: github.com
Cookie: logged_in=yes; _session=EVIL_SESSION_TOKEN; _session=THIS_IS_A_SESSION_TOKEN;
HTTP/1.1 302 Found
Location: /notifications
Content-Type: text/html
# This header has no effect; the _session cookie was set
# with `Path=/notifications` and won't be cleared by this,
# causing an infinite redirect loop
Set-Cookie: _session=; Expires=Thu, 01-Jan-1970 00:00:01 GMT; Path=/; Domain=.github.com;
The solution is pretty straightforward, albeit rather inelegant: for any given
request URL, the web browser would only send a malicious JavaScript cookie if
its Path is a prefix of the path of the request URL. Hence, we only
need to attempt to drop the cookie once for each component of the path:
HTTP/1.1 302 Found
Location: /libgit2/libgit2/pull/1457
Content-Type: text/html
Set-Cookie: _session=; Expires=Thu, 01-Jan-1970 00:00:01 GMT; Path=/; Domain=.github.com;
Set-Cookie: _session=; Expires=Thu, 01-Jan-1970 00:00:01 GMT; Path=/libgit2; Domain=.github.com;
Set-Cookie: _session=; Expires=Thu, 01-Jan-1970 00:00:01 GMT; Path=/libgit2/libgit2; Domain=.github.com;
Set-Cookie: _session=; Expires=Thu, 01-Jan-1970 00:00:01 GMT; Path=/libgit2/libgit2/pull; Domain=.github.com;
Set-Cookie: _session=; Expires=Thu, 01-Jan-1970 00:00:01 GMT; Path=/libgit2/libgit2/pull/1457; Domain=.github.com;
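Generating that list of headers is mechanical. The following sketch shows one way to do it; `drop_cookie_headers` is a hypothetical helper, and it only covers the exact path prefixes shown in the response above (a cookie set with a trailing-slash variant of a path would need its own entry).

```ruby
# Hypothetical helper: build one expiring Set-Cookie value for the root path
# and for each successive prefix of the request path.
def drop_cookie_headers(path, name: '_session', domain: '.github.com')
  segments = path.split('/').reject(&:empty?)
  prefixes = ['/'] + (1..segments.length).map do |n|
    "/#{segments.first(n).join('/')}"
  end
  prefixes.map do |prefix|
    "#{name}=; Expires=Thu, 01-Jan-1970 00:00:01 GMT; Path=#{prefix}; Domain=#{domain};"
  end
end

puts drop_cookie_headers('/libgit2/libgit2/pull/1457')
```

For the request path in the example this yields five header values, one per line of the response shown above.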
Again, we’re blind on the server-side when it comes to cookies. Our only option
is this brute-force approach to clearing the cookies, which despite its roughness,
worked surprisingly well while we completed the github.io migration.
Cookie escaping
Let’s step up our game: Another attack can be performed by exploiting the fact
that RFC 6265 doesn’t specify an escaping behavior for cookies. Most web
servers/interfaces, including Rack, assume that cookie names can be URL-encoded
(which is a rather sane assumption to make, if they contain non-ASCII
characters), and hence will unescape them when generating the cookie list:
cookies = Utils.parse_query(string, ';,') { |s| Rack::Utils.unescape(s) rescue s }
This allows a malicious user to set a cookie that the web framework will
interpret as _session despite the fact that its name in the web browser is
not _session. The attack simply has to escape characters that don’t
necessarily need to be escaped:
GET / HTTP/1.1
Host: github.com
Cookie: logged_in=yes; _session=chocolate-cookie; _%73ession=bad-cookie;
{
"_session" : ["chocolate-cookie", "bad-cookie"]
}
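The collapse of the two names can be reproduced with the standard library alone. In this sketch `URI.decode_www_form_component` stands in for the framework's unescaping step, which is an assumption on our part; the point is only that two names that are distinct in the browser become a single key after decoding.

```ruby
require 'uri'

# Group cookie values by their percent-decoded name, the way an unescaping
# parser would see them (illustrative only, not Rack's implementation).
header = 'logged_in=yes; _session=chocolate-cookie; _%73ession=bad-cookie;'
grouped = Hash.new { |hash, key| hash[key] = [] }
header.split(/;\s*/).each do |pair|
  name, value = pair.split('=', 2)
  grouped[URI.decode_www_form_component(name)] << value
end

puts grouped['_session'].inspect # prints ["chocolate-cookie", "bad-cookie"]
```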
If we try to drop the second cookie from the list of cookies that Rack
generated, our header will have no effect. We’ve lost crucial
information after Rack’s parsing: the fact that the name of the cookie was
URL-encoded to a different value than the one our web framework received.
# This header has no effect: the cookie in
# the browser is actually named `_%73ession`
Set-Cookie: _session=; Expires=Thu, 01-Jan-1970 00:00:01 GMT; Path=/; Domain=.github.com;
To work around this, we had to skip Rack’s cookie parsing by disabling the
unescaping and finding all the cookie names that would match our target after
unescaping.
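bad_cookies = []
# `cookies` holds the raw Cookie header; parse it without unescaping so we
# see each name exactly as the browser stored it
cookie_pairs = Rack::Utils.parse_query(cookies, ';,') { |s| s }
cookie_pairs.each do |k, v|
  if k == '_session' && Array === v
    # more than one cookie literally named `_session`
    bad_cookies << k
  elsif k != '_session' && Rack::Utils.unescape(k) == '_session'
    # escaped variation, e.g. `_%73ession`
    bad_cookies << k
  end
end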
This way we can actually drop the right cookie (be it either set as _session
or as an escaped variation). With this kind of middleware in place, we were able
to tackle all the cookie tossing attacks that can be tackled on the server
side. Unfortunately, we were aware of another vector which made middleware
protection useless.
Cookie overflow
If you’re having cookie problems I feel bad for you, son.
I’ve got 99 cookies and my domain’s ain’t one.
This is a slightly more advanced attack that exploits the hard limit that all
web browsers have on the number of cookies that can be set per domain.
Firefox, for example, sets this hard limit to 150 cookies, while Chrome sets it
to 180. The problem is that this limit is not defined per cookie Domain
attribute, but by the actual domain where the cookie was set. A single HTTP
request to any page on the main domain and subdomains will send a maximum
number of cookies, and the rules for which ones are picked are, once again,
undefined.
Chrome for instance doesn’t care about the cookies of the parent domain, the
ones set through HTTP or the ones set as Secure: it’ll send the 180 newest
ones. This makes it trivially easy to “knock out” every single cookie from the
parent domain and replace them with fake cookies, all by running JavaScript on
a subdomain:
for (var i = 0; i < 180; i++) {
  document.cookie = "cookie" + i + "=chocolate-chips; Path=/; Domain=.github.com"
}
After setting these 180 cookies in the subdomain, all the cookies from the
parent domain vanish. If now we expire the cookies we just set, also from
JavaScript, the cookie list for both the subdomain and the parent domain
becomes empty:
for (var i = 0; i < 180; i++) {
  document.cookie = "cookie" + i + "=chocolate-chips; Path=/; Domain=.github.com; Expires=Thu, 01-Jan-1970 00:00:01 GMT;"
}
/* all cookies are gone now; plant the evil one */
document.cookie = "_session=EVIL_SESSION_TOKEN; Path=/; Domain=.github.com"
This allows us to perform a single request with just one _session cookie:
the one we’ve crafted in JavaScript. The original Secure and HttpOnly
_session cookie is now gone, and there is no way to detect in the web server
that the cookie being sent is neither Secure, HttpOnly, nor set in the
parent domain, but fully fabricated.
With only one _session cookie sent to the server, there is no way to know
whether the cookie is being tossed at all. Even if we could detect an invalid
cookie, the same attack can be used to simply annoy users by logging them out
of GitHub.
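The whole sequence can be replayed against a toy model of the browser's cookie jar. Everything in this sketch is an assumption made for illustration: the `ToyCookieJar` class, the limit of 180 and the "evict the oldest first" rule are modelled on the Chrome behavior described above, and real browsers apply their own eviction rules.

```ruby
# Toy per-domain cookie jar with a hard limit that evicts the oldest entries.
# Hypothetical model; cookies are keyed by name and Domain attribute, and the
# Path attribute is left out for brevity.
class ToyCookieJar
  def initialize(limit)
    @limit = limit
    @cookies = {} # Ruby hashes keep insertion order: oldest entries first
  end

  def set(name, value, domain)
    key = [name, domain]
    @cookies.delete(key)
    @cookies[key] = value
    @cookies.shift while @cookies.size > @limit # drop the oldest cookies
  end

  def expire(name, domain)
    @cookies.delete([name, domain])
  end

  # Values that would be sent for a given cookie name.
  def values_for(name)
    @cookies.select { |(cookie_name, _domain), _value| cookie_name == name }.values
  end
end

jar = ToyCookieJar.new(180)
jar.set('_session', 'THIS_IS_A_SESSION_TOKEN', 'github.com') # set by the server
180.times { |i| jar.set("cookie#{i}", 'chocolate-chips', '.github.com') }
puts jar.values_for('_session').inspect # prints []: the real cookie was evicted

180.times { |i| jar.expire("cookie#{i}", '.github.com') }
jar.set('_session', 'EVIL_SESSION_TOKEN', '.github.com')
puts jar.values_for('_session').inspect # prints ["EVIL_SESSION_TOKEN"]
```

At the end of the run the jar holds a single _session value, so the server-side duplicate check from the previous sections has nothing to detect.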
Conclusion
As we’ve seen, by overflowing the cookie jar in the web browser, we can craft
requests with evil cookies that cannot be blocked server-side. There’s nothing
particularly new here: Both Egor’s original proof of concept and the variations
exposed here have been known for a while.
As it stands right now, hosting custom user content under a subdomain is simply
security suicide, particularly accentuated by Chrome’s current implementation choices.
While Firefox handles the distinction between parent domain and subdomain cookies
more gracefully (sending them in more consistent ordering, and separating their
storage to prevent overflows from a subdomain), Chrome makes no such distinction
and treats session cookies set through JavaScript the same way as Secure HttpOnly
cookies set from the server, leading to a very enticing playground for tossing attacks.
Regardless, the behavior of cookie transmission through HTTP headers is so ill-defined and
implementation-dependent that it’s just a matter of time until somebody comes
up with yet another way of tossing cookies across domains, independent of the targeted
web browser.
While cookie tossing attacks are not necessarily critical (i.e. it is not possible
to hijack user sessions, or accomplish anything besides phishing/annoying the
users), they are worryingly straightforward to perform, and can be quite annoying.
We hope that this article will help raise awareness of the issue and of the difficulty
of protecting against these attacks by means that don’t involve a full domain migration:
a drastic, but ultimately necessary measure.