Notes: Exploiting Clam AntiVirus in a mail server environment
Simon Scannell’s ClamAV research explores creative heap memory exploitation methods, making it an excellent resource for learning about the topic. Simon also experiments with reusing application functionality to create side-channels. This allows Simon to overcome a significant obstacle: having no direct communication path between ClamAV and the hacker.
Quite frankly, Simon’s use of side-channels is inspiring; it demonstrates the hacker philosophy of being persistent and working in hyper-narrow constraints. In my enthusiasm for Simon’s work, I wrote a summary that ended up becoming a blog post - which I decided to share here. I hope you find this topic as fascinating as I did.
Table of contents
About this series
In the Notes series, I attempt to break down and comprehend the computer security and privacy research of others. To be completely clear: these are summaries. I will skip certain details for the sake of brevity and not getting stuck in application-specific nonsense.
Presentation details
- author: Simon Scannell
- slides
- presentation
- date: June 11, 2023
Introduction
ClamAV is an open-source antivirus written in both C and Rust that is oriented towards scanning files for known viruses, making it particularly useful in file and e-mail server contexts. ClamAV supports many different file formats including recursive unpacking of archives and “container-like” files. This wide support for many different file types presents massive attack surface.
In this presentation by Simon Scannell, he exploits ClamAV’s file parser attack surface in creative ways. This includes achieving code execution on an e-mail server by sending e-mails containing carefully-crafted files.
Notably, Simon discusses how his exploitation techniques can be employed when no direct communication channel exists between the hacker and the vulnerable software (for example, when attacking a load-balanced service).
Challenges in exploitation
Despite ClamAV’s massive attack surface, pwning it remains difficult for a few reasons:
- There is usually no direct communication channel between the attacker and ClamAV. For example, an e-mail service may only provide a boiler-plate response to an attacker’s e-mail which states a virus was found. This makes it very difficult to extract any leaked process state from ClamAV
- ClamAV is a multi-threaded daemon (i.e., it does not work like OpenSSH’s sshd which forks and execs for each new connection). Crashing ClamAV will result in the entire process dying - not just a child process. The exploit must also avoid crashes because ClamAV takes several minutes to parse virus definition updates on startup, making a crash an expensive time setback for the attacker
- ASLR coupled with the ClamAV executable being a Position Independent Executable makes brute-forcing memory addresses impossible. This prevents the attacker from easily reusing code from ClamAV to build their exploit
Building memory read and write primitives
Simon Scannell and his team discovered a heap buffer overflow in ClamAV’s Apple HFS+ parsing code (CVE-2023-20032). This vulnerability allows an attacker to overwrite other heap chunk’s metadata and contents with attacker-controlled data. The underlying parsing logic uses two values from the HFS+ header to size a heap-based buffer and write to the buffer. No bounds check was performed on the buffer before writing to it.
Since these values are attacker-controlled, they can be used to create an artificially small buffer that can be overwritten with the desired length of data:
// ClamAV HFS+ heap buffer overflow illustration.
//
// Note: This is (mostly) pseudo code for illustration purposes only.
// It is not exactly what ClamAV does.
//
// Let's assume the attacker controls the following fields of the HFS+
// volume header and that they provide the following values in the
// volHeader struct:
// volHeader->blockSize = 99999
// volHeader->someHFSSize = 32
heapChunkPointer = malloc(volHeader->someHFSSize);
// The following read() will copy too many HFS file bytes into the
// heap chunk pointed to by heapChunkPointer, thus overwriting the
// next heap chunk(s) and their heap metadata:
read(hfsFileDesc, heapChunkPointer, volHeader->blockSize);
// |_________| |______________| |__________________|
// evil data 32 bytes 99999 bytes
Simon et al. coupled this bug with an information leak vulnerability (CVE-2023-29077, an out-of-bounds heap read) in a Microsoft CHM file parser library used by ClamAV. The out-of-bounds read is enabled by an integer truncation bug where a 64-bit integer type was cast to a 32-bit integer. This integer truncation allows the attacker to create an undersized heap buffer and bypass the buffer’s bounds check:
// libmspack (CHM parser) out-of-bounds read illustration.
//
// Note: This is (mostly) pseudo code for illustration purposes only.
// It is not exactly what libmspack does.
// Phase 1 / 2:
//
// Let's assume the attacker controls the following value:
int64 file->length = 4294968633;
// The value above is equal to: 1<<32|1337
//
// The following variable assignment casts file->length (a signed 64-bit
// integer type) to a signed 32-bit "int" type. This has the effect of
// removing the "on" bit at position 32.
//
// Here is an illustration of what the 64-bit integer looks like before and
// after truncation in binary with the top 24-bits omitted for brevity:
//
// Before: (...) 00000001 00000000 00000000 00000101 00110001 // 4294968633
// After: |bits removed| 00000000 00000000 00000101 00110001 // 1337
len = (int) file->length;
// This produces a small "data" heap chunk of 1337 bytes:
data = malloc((size_t) len);
// (...)
// Phase 2 / 2:
//
// Future bounds checks on the "data" heap buffer still use the original,
// untruncated file->length field (which is a huge number).
//
// This allows the attacker to bypass a bounds check and read 4 bytes
// of memory into the variable. Below, the "pos" variable is another
// attacker-controlled value which can also be a huge number. pos only
// needs to be smaller than file->length for the bounds check to pass:
pos = 3735928559; // 0xdeadbeef
// (3735928559 <= 4294968633) == true
if (pos <= (file->length - entrySize))
{
// (...)
// EndGetI32() reads 32-bits (4 bytes) of data from index "pos",
// reading outside the bounds of the "data" heap chunk.
//
// The offset_ptr variable is later added with other
// values to seek within the CHM file. More on this
// in the next section :)
*offset_ptr = EndGetI32(&data[pos]);
// (...)
}
Using virus detection logic as an ASLR bypass oracle
One of the primary challenges with attacking modern programs is Address Space Layout Randomization (ASLR). ASLR prevents the attacker from easily guessing memory addresses of program code at runtime. Simon et al. work around this by combining the out-of-bounds read with ClamAV’s virus detection logic.
The data read by the CHM out-of-bounds read is later used to calculate an offset within the CHM file. This calculation is the sum of three values:
- An attacker-controlled value from the CHM file
- Another attacker-controlled value from the CHM file
- The value read by the out-of-bounds read. This would ideally be an ASLR-wrapped function pointer not known by the attacker
The offset calculated from these three values is used to seek within the CHM file. Because this is an IO seek operation and not a memory access operation, an out-of-bounds seek will not cause a segmentation fault (i.e., a ClamAV crash). This allows the attacker to repeatedly solve for the third value by changing the other two attacker-controlled values.
This is cleverly accomplished by including a known virus in the CHM file. When the offset equals the location of the known virus, ClamAV detects the virus, sending a notification back to the attacker. In effect, the virus detection logic is repurposed as an oracle. The oracle indicates when the third value (a function pointer) has been correctly guessed, bypassing ASLR.
Under normal circumstances, the ASLR oracle would need to guess 28 bits of the unknown function pointer - which would be impractical. Simon et al. subvert this by abusing the out-of-bounds read working in four byte (32-bit) increments. By positioning the read so that it collects three bytes of known data and one byte of unknown data, the attacker can brute-force only one byte (8 bits or 255 possibilities) at a time. Each time a byte is correctly guessed, a virus detection alert will be generated.
For example:
// Memory layout of known data followed by
// an unkown function pointer:
//
// +--------------------+---------------------------+
// | <known data> | <unknown func pointer> |
// |------+------+------+------+------+------+------|
// | 0x00 | 0x00 | 0x00 | 0x?? | 0x?? | 0x?? | 0x?? |
// +------+------+------+------+------+------+------+
// Intial out-of-bounds read starts at first byte:
// |
// V
// | 0x00 | 0x00 | 0x00 | 0x?? | ---- | ---- | ---- |
// After guessing the first byte, we offset the read
// to the next byte:
// |
// V
// | ---- | 0x00 | 0x00 | 0xAA | 0x?? | ---- | ---- |
// After guessing the second byte, we offset
// to the third byte:
// |
// V
// | ---- | ---- | 0x00 | 0xAA | 0x60 | 0x?? | ---- |
// After guessing the third byte, we offset the
// read to the fourth byte:
// |
// V
// | ---- | ---- | ---- | 0xAA | 0x60 | 0x51 | 0x?? |
// Now we know the value of the unknown data:
// | ---- | ---- | ---- | 0xAA | 0x60 | 0x51 | 0x7f |
// Thus, the function pointer read by the OOB read
// is: 0x7f5160aa
Making heap exploitation reliable
In this section, we will examine how Simon et al. manipulate the heap to exploit the out-of-bounds read. This strategy is later reused to exploit the heap buffer overflow.
Deterministically placing a heap chunk (such as one containing a function pointer) is the bane of most heap-based bugs. Ordinarily, there is no guarantee where the next heap allocation will be stored in heap space since the heap is essentially a list of reusable chunks of memory. Placement behavior is determined by the C library’s memory allocator, which is not attacker-controlled.
Hackers can leverage the behavior of both the heap memory allocator and the vulnerable application to influence the layout of the heap. This allows the hacker to structure the heap so that the next heap memory allocation(s) fall exactly where desired within the heap. I refer to this as memory widget alignment on heap (or MWAH).
Simon combined the behavior of ClamAV’s Apple DMG and Rich Text Format (RTF) parsers to manipulate the heap layout. The parsers can be abused to force the allocator to extend the heap, adding new heap chunks only to the end of the heap.
An Apple DMG contains many files, each represented by a data structure encoded in base64 known as a “mish”. Because of this, each “mish” triggers two heap allocations: one for the base64 string and a second for the decoded string.
In the first phase of this MWAH, the attacker spams many “mish” chunks. This fills all existing “holes” in the heap. From this point forward, all new heap allocations will be placed deterministically at the end of the heap. Here is what the heap looks like after these mish chunks are allocated:
// Note: "fragmented heap" refers to the original heap layout with all
// existing free chunks ("holes") taken by the mish chunks.
//
// Phase 1 / 6:
//
// +------------+-------+-------+-------+------+
// | fragmented | mish | mish | mish | top |
// | heap | chunk | chunk | chunk | of |
// | (...) | 998 | 999 | 1000 | heap |
// +------------+-------+-------+-------+------+
In the second phase, a mish base64 string of a large size (121 kb) is provided in the same DMG file. Because there is no existing “hole” in the heap to fit the base64 string or its decoded data, the allocator places the two chunks at the top of the heap:
// Phase 2 / 6:
//
// +------------+-------+-------+-------+---------------+---------+------+
// | fragmented | mish | mish | mish | large base64 | decoded | top |
// | heap | chunk | chunk | chunk | string chunk | mish | of |
// | (...) | 998 | 999 | 1000 | (121 kb) | chunk | heap |
// +------------+-------+-------+-------+---------------+---------+------+
// \ /
// newly allocated chunks
The DMG parser then free()
’s the large base64 string chunk
, leaving
a hole in the heap:
// Phase 3 / 6:
//
// +------------+-------+-------+-------+---------------+---------+------+
// | fragmented | mish | mish | mish | free chunk of | decoded | top |
// | heap | chunk | chunk | chunk | a large size | mish | of |
// | (...) | 998 | 999 | 1000 | (121 kb) | chunk | heap |
// +------------+-------+-------+-------+---------------+---------+------+
// \ /
// base64 string
// chunk is freed
This is where the RTF parser comes into play. A RTF file supports recursive parsing states. For example:
// RTF recursion example:
// abc {object ff {object ee}} def
A stack structure is used to track these recursive states. Simon found that
triggering enough recursion results in the RTF parsing state object (known
as rtf_state
) growing to the same size as the free chunk from Phase 3
.
Since the C library’s heap memory allocator prefers to use the smallest free
chunk, it will always place the rtf_state
chunk in the hole from Phase 3
.
Furthermore, because the chunk itself is nearly the maximum-permitted chunk
size, it is incredibly unlikely that the memory allocator will reuse that
chunk for other data.
The rtf_state
struct is useful to the attacker because it contains several
function pointers. Overwriting any of these pointers will allow the attacker
to point them at attacker-controlled code. When one of the overwritten
function pointers is called, attacked-controlled code will execute:
struct rtf_state {
rtf_callback_begin cb_begin
rtf_callback_process cb_process // <--- Attacker already knows some
// of this function pointer's
// bytes, making it a nice target
// for an overwrite.
rtf_callback_end cb_end
// Additional omitted fields (...)
}
Here is what the heap will look like once the rtf_state
object
is allocated:
// Phase 4 / 6:
//
// +------------+-------+-------+-------+---------------+---------+------+
// | fragmented | mish | mish | mish | rtf_state | decoded | top |
// | heap | chunk | chunk | chunk | object chunk | mish | of |
// | (...) | 998 | 999 | 1000 | (121 kb) | chunk | heap |
// +------------+-------+-------+-------+---------------+---------+------+
// \ /
// rtf_state is
// placed into
// hole because
// it is an
// exact fit
Once the RTF file is parsed, the rtf_state
is free()
ed. This is actually
a good thing for the attacker because the contents of the chunk are not
overwritten prior to the free()
. In other words - even though the chunk
is marked as being free - the function pointers and other fields are
left behind:
// Phase 5 / 6:
//
// +------------+-------+-------+-------+---------------+---------+------+
// | fragmented | mish | mish | mish | free chunk | decoded | top |
// | heap | chunk | chunk | chunk | containing | mish | of |
// | (...) | 998 | 999 | 1000 | rtf_state | chunk | heap |
// +------------+-------+-------+-------+---------------+---------+------+
// \ /
// rtf_state is
// free()ed, but
// its contents
// remain in the
// free chunk
The attacker allocates another Microsoft CHM object that will be placed
into the chunk previously occupied by rtf_state
. This CHM object is
slightly smaller than the rtf_state
- thus the memory allocator will
use the free chunk and overwrite only some of the left-behind data.
Most importantly, it leaves the rtf_state->cb_process
function
pointer untouched:
// Phase 6 / 6:
//
// +------------+-------+-------+-------+-------+-------+---------+------+
// | fragmented | mish | mish | mish | chm | rtf | decoded | top |
// | heap | chunk | chunk | chunk | chunk | func | mish | of |
// | (...) | 998 | 999 | 1000 | | ptr | chunk | heap |
// +------------+-------+-------+-------+-------+-------+---------+------+
// \ /
// a chm object
// *partially*
// overwrites the
// free rtf_state
// chunk, leaving
// behind the rtf
// func pointer.
// This allows
// the OOB read
// to copy the
// func pointer
Finally, the out-of-bounds read from earlier can be applied.
This sets up the CHM library out-of-bounds read such that it will read the
cb_process
function pointer. Now the attacker can add the pointer with
the other two attacker-controlled CHM values to bypass ASLR.
Building an exploit
To build an exploit, an attacker would likely rely on existing code found within the ClamAV process. The memory addresses of useful code are not known to the attacker because of ASLR. That is why heap manipulation and the information leak were critical to understand and make reliable.
The leaked RTF function pointer from earlier can be used to calculate the
base address of its parent library (libclamav.so
) by subtracting the
offset of the relevant function within the library file from the function
pointer’s value.
For example:
// <leaked ptr> <offset in libclamav.so> <libclamav.so base address>
// 0x7feface22a10 - 0x1c2a10 = 0x7fefacc60000
Once libclamav.so’s base address is known, the attacker can simply add the offset(s) of useful libclamav code to the base address. The attacker then creates a blob of bytes that consists of each address of useful code to jump execution to. This style of exploitation building is known as Return Oriented Programming (ROP).
A similar MWAH to the one outlined in the previous section is used to setup the heap buffer overflow to overwrite the same function pointer. While Simon does not go into detail about this, he likely overwrote the function pointer to point at the ROP payload and then triggered a call to the function pointer, thus jumping code execution to the ROP payload which executes a reverse shell (as he demonstrates at the end of his presentation).
Applying to a real world mail server
Simon demonstrates his ClamAV exploit on “Zimbra”, an e-mail service that uses ClamAV to scan e-mail attachments. When ClamAV detects a virus, it discards the offending e-mail instead of forwarding it to the intended victim. It does not inform the attacker that their malicious e-mail was discarded or that it contains a virus.
Unfortunately, Zimbra complicates the ASLR oracle by asynchronously queuing attachments for scanning and not notifying the sender if ClamAV detects a virus. Simon cleverly works around this by abusing Zimbra’s e-mail spoofing validation logic. Under normal circumstances, Zimbra will lookup an attacker-controlled DNS record when an e-mail passes the ClamAV check. This verifies whether the DNS record owner permits e-mails to be sent from their domain.
If you recall, the ASLR bypass oracle works by triggering a virus detection. This means a virus detection will not trigger a DNS record lookup.
To leverage this, the attacker can create a unique DNS name for each ASLR oracle attempt. When the oracle “fails” by not triggering a detection, Zimbra will lookup the attacker-controlled DNS record. By combining the exploit generation code with a DNS server, the attacker can treat each DNS lookup as an oracle failure. In effect, the DNS lookups act as a side-channel for the ASLR oracle.
The final exploitation strategy requires the attacker to send (at most) 255 e-mails for each memory address byte. When a byte guess is incorrect, ClamAV fails to detect the known virus and Zimbra makes a DNS lookup against an attacker-controlled domain. This tells the exploit generation code to try a different byte.
Once the memory address is fully determined, the attacker triggers the heap buffer overflow and overwrites the RTF function pointer with the address of their ROP payload. Finally, the attacker executes the ROP payload by triggering a call to the RTF function pointer.
Thoughts on side-channels
I think Simon’s presentation clearly illustrates the clever and unexpected ways that side-channels can be “stacked” to pipe an information leak back to an attacker. I have read a few accounts of how difficult it is to prevent side-channels. I think this presentation really underscores that difficulty.
Simon summarizes this point succinctly:
Just because the result of an OOB [out of bounds] Read is not reflected, it can be side-channeled by observing how the application behaves based on the value [read by the OOB read]
A fascinating implication of this, according to Simon, is that such side-channels can be used against load-balanced services that the attacker cannot directly communicate with:
A similar approach can be applied to scenarios where you target a load-balanced service and need to ensure that your payload is triggered against the worker for which ASLR has been defeated
Simon et al. refer to this technique as “conditional corruption”.
Thoughts on heap memory exploitation
In regards to heap memory exploitation, understanding the following allocator and application characteristics is important:
- Large heap chunks are unlikely to be generated by the application in normal conditions. Thus, knowing the maximum size of a chunk is useful for creating heap “holes” that have a low likelihood of being reused by the allocator
- Identifying logic that triggers multiple allocations for a single object makes it easier to manipulate the structure of the heap. For example, an object represented as a base64 string will trigger two allocations: one for the base64 string and a second for its decoded value
- If you can deterministically reuse a freed chunk that you previously controlled, you can use the chunk’s previous state to leak information (assuming the allocator or application does not zero the memory when freed). For example, if the freed chunk contains a function pointer, you can leak it using an out-of-bounds read that works relative to a heap-based object. The freed chunk’s function pointer can be leaked by placing the vulnerable object in the freed chunk’s spot
- Integer truncation bugs are incredibly useful for bypassing bounds checks and for manipulating heap chunk size. By under-sizing a heap buffer, we can potentially index into other heap memory relative to the vulnerable heap object
Lastly, read and write primitives can be combined to incrementally brute-force highly entropic values. This is particularly useful for bypassing ASLR. Even if the read or write primitives work in multibyte sequences, the primitives can be offset to only read or write one byte at a time. For example, brute-forcing 28 bits of data would potentially require 268,435,456 guesses. Using incremental primitives, we only need to brute-force 255 possibilities four times (1,020 guesses).
Special thanks and closing
I would like to thank Chris Marget and Seung Kang for reviewing this piece and encouraging me to post it here. Thank you both for your help :)
I hope you enjoyed this style of blog post. I am not sure if I will write another post like this… but I learned a lot, and I am happy I can share it with you.
Good night, and good luck.