Weakness reference
CWE-122

Heap-based Buffer Overflow

01 Summary

A heap-based buffer overflow occurs when a program writes more data to a dynamically allocated memory buffer than it can hold, overwriting adjacent heap memory. Unlike stack-based overflows, which typically target saved return addresses, heap overflows corrupt neighboring objects and allocator metadata. They can still lead to arbitrary code execution, denial of service, or information disclosure.

02 How It Happens

Heap memory is allocated at runtime with functions such as malloc() or operators such as new, and is managed by the heap allocator. When a program copies user-controlled data into a heap buffer without validating its length, it can write past the end of the buffer. The overflow corrupts whatever lies next to the buffer (other allocated buffers, pointers stored in neighboring objects, or the allocator's own bookkeeping structures), depending on the memory layout. Exploiting a heap overflow usually requires knowledge of allocator internals and is often more complex than exploiting a stack overflow, but the consequences can be equally severe.
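
The adjacency effect described above can be sketched with a toy model. This is only a simulation under stated assumptions: a single bytearray stands in for the heap, and the names ARENA_SIZE, BUF_A, BUF_B, unchecked_write, and checked_write are hypothetical. No real allocator memory is touched.

```python
# Toy model of a heap: one contiguous bytearray holding two adjacent "allocations".
# All names here are illustrative; a real allocator also stores metadata between chunks.
ARENA_SIZE = 32
BUF_A = slice(0, 16)    # hypothetical 16-byte buffer that receives user input
BUF_B = slice(16, 32)   # neighboring allocation that should never be touched


def unchecked_write(arena: bytearray, offset: int, data: bytes) -> None:
    """Mimic an unbounded copy: writes every byte of data, ignoring the chunk size."""
    for i, byte in enumerate(data):
        arena[offset + i] = byte


def checked_write(arena: bytearray, region: slice, data: bytes) -> None:
    """Refuse any write that would not fit inside the target region."""
    capacity = region.stop - region.start
    if len(data) > capacity:
        raise ValueError(f"{len(data)} bytes do not fit in a {capacity}-byte buffer")
    arena[region.start:region.start + len(data)] = data


arena = bytearray(ARENA_SIZE)
arena[BUF_B] = b"SESSION-KEY-0001"   # 16 bytes owned by the neighbor

# 20 bytes aimed at a 16-byte buffer: the last 4 bytes spill into BUF_B.
unchecked_write(arena, BUF_A.start, b"A" * 20)
print(bytes(arena[BUF_B]))   # b'AAAAION-KEY-0001'
```

In this model the bounds-checked variant raises ValueError for the same 20-byte input, which is the behavior the rest of this page recommends.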

03 Real-World Impact

Heap overflows can lead to arbitrary code execution by corrupting function pointers or allocator metadata, allowing an attacker to redirect execution flow. They can also cause denial of service by corrupting critical data structures, or leak sensitive information from adjacent memory. The impact depends on what data is adjacent to the overflowed buffer and how the heap allocator responds to corruption.
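
To illustrate how corrupting a function pointer redirects execution, here is a minimal simulation. It assumes a hypothetical object layout (a 16-byte name field followed by a 4-byte handler index) and uses a Python list as a stand-in for code addresses. The names HANDLERS, LAYOUT, set_name_unchecked, and call_handler are invented for this sketch.

```python
import struct


def greet() -> str:
    return "hello"


def admin_action() -> str:
    return "privileged action"


# Dispatch table standing in for code addresses; index 0 is the intended handler.
HANDLERS = [greet, admin_action]

# Simulated heap object: 16-byte name buffer, then a 4-byte little-endian handler index.
LAYOUT = struct.Struct("<16sI")


def make_object() -> bytearray:
    return bytearray(LAYOUT.pack(b"", 0))


def set_name_unchecked(obj: bytearray, name: bytes) -> None:
    """Copy name with no check against the 16-byte field.

    Writes are clipped at the end of the simulated object so the toy model
    stays self-contained; a real overflow would keep going.
    """
    for i, byte in enumerate(name[:len(obj)]):
        obj[i] = byte


def call_handler(obj: bytearray) -> str:
    _, index = LAYOUT.unpack_from(obj)
    return HANDLERS[index]()


obj = make_object()
print(call_handler(obj))    # hello

# 16 filler bytes reach the end of the name field; the next 4 replace the index.
payload = b"A" * 16 + struct.pack("<I", 1)
set_name_unchecked(obj, payload)
print(call_handler(obj))    # privileged action
```

The attacker never calls admin_action directly. Control flow changes only because data placed next to the buffer was trusted after the overflow.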

04 Vulnerable & Fixed Patterns

Vulnerable pattern
import ctypes

def process_user_data(user_input):
    # Allocate a fixed-size buffer
    buffer = ctypes.create_string_buffer(16)
    
    # Copy user input without length validation (raw memmove, no bounds check)
    data = user_input.encode()
    ctypes.memmove(buffer, data, len(data))
    
    return buffer.value

Why it's vulnerable:
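The code allocates a 16-byte buffer but copies the entire user input without checking its length. ctypes.memmove is a thin wrapper over the C memmove function, so Python's usual bounds checking does not apply. If the encoded input exceeds 16 bytes, the write runs past the end of the buffer and corrupts adjacent heap memory.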
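
One detail worth checking in code like this is that the limit applies to encoded bytes, not characters. The sketch below performs no copy at all. It only compares lengths, and the helper name fits is hypothetical.

```python
import ctypes

buffer = ctypes.create_string_buffer(16)

text = "\u00e9" * 10        # ten accented characters
encoded = text.encode()     # UTF-8 uses two bytes for each of them


def fits(buf, data: bytes) -> bool:
    """True if data fits in buf with one byte left for the trailing NUL."""
    return len(data) <= ctypes.sizeof(buf) - 1


print(len(text))              # 10 -> a character-count check against 16 would pass
print(len(encoded))           # 20 -> but the byte count exceeds the buffer
print(fits(buffer, encoded))  # False
```

A length check written against len(user_input) rather than the encoded byte length would accept this input even though it does not fit.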

Fixed pattern
import ctypes

def process_user_data(user_input, max_length=16):
    # Allocate a fixed-size buffer
    buffer = ctypes.create_string_buffer(max_length)
    
    # Validate input length before copying
    input_bytes = user_input.encode()
    if len(input_bytes) > max_length - 1:
        raise ValueError(f"Input exceeds maximum length of {max_length - 1}")
    
    ctypes.memmove(buffer, input_bytes, len(input_bytes))
    return buffer.value

Vulnerable pattern
<?php
function process_user_data($user_input) {
    // Simulate a fixed-size buffer (16 bytes)
    $buffer = str_pad("", 16, "\x00");
    
    // Unsafe: copy user input without length check
    $buffer = substr_replace($buffer, $user_input, 0);
    
    return $buffer;
}

$data = $_GET['data'];
process_user_data($data);
?>

Why it's vulnerable:
The code treats a string as a fixed-size buffer and overwrites it with user input of arbitrary length. PHP strings resize automatically, so this script cannot corrupt memory by itself; it only illustrates the conceptual flaw of an unbounded write. The same logic in a language with manual memory management (C/C++) would overflow the heap.

Fixed pattern
<?php
function process_user_data($user_input, $max_length = 16) {
    // Validate input length
    if (strlen($user_input) > $max_length) {
        throw new Exception("Input exceeds maximum length of {$max_length}");
    }
    
    // Safe: copy only validated length
    $buffer = str_pad($user_input, $max_length, "\x00");
    
    return $buffer;
}

$data = isset($_GET['data']) ? $_GET['data'] : '';
process_user_data($data);
?>

05 Prevention Checklist

Validate input length
before copying into fixed-size buffers; reject or truncate oversized input.
Use bounded APIs
that take the destination size explicitly, such as snprintf() instead of sprintf(), or rely on language-level bounds checking. strncpy() limits the copy compared with strcpy(), but it does not NUL-terminate the destination when the source fills it, so terminate the buffer manually.
Enable hardening and detection tools
such as ASLR and hardened allocators in production, and AddressSanitizer during development and testing. Stack canaries protect only against stack overflows and do not cover heap buffers.
Prefer managed containers
that size themselves automatically (e.g., std::string in C++, or native strings in Python/PHP) over manually sized buffers.
Fuzz test
with oversized and malformed inputs to detect buffer overflows before deployment.
Use static analysis tools
to identify unbounded copy operations and missing length checks.
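
The first and fifth checklist items can be combined in a small sketch: a bounds-checked copy helper plus a randomized test loop that feeds it undersized and oversized inputs. The names BUFFER_SIZE, bounded_copy, and fuzz are hypothetical, and this loop is only a stand-in for a real coverage-guided fuzzer.

```python
import ctypes
import random

BUFFER_SIZE = 16  # hypothetical fixed capacity, including one byte for the NUL


def bounded_copy(data: bytes, size: int = BUFFER_SIZE) -> bytes:
    """Copy data into a fresh size-byte ctypes buffer, rejecting input that will not fit."""
    if len(data) > size - 1:
        raise ValueError(f"input of {len(data)} bytes exceeds limit of {size - 1}")
    buf = ctypes.create_string_buffer(size)
    ctypes.memmove(buf, data, len(data))   # safe: length was validated above
    return buf.raw[:len(data)]


def fuzz(iterations: int = 1000, seed: int = 0) -> tuple[int, int]:
    """Throw random-length payloads at bounded_copy and check both outcomes."""
    rng = random.Random(seed)
    accepted = rejected = 0
    for _ in range(iterations):
        length = rng.randint(0, 4 * BUFFER_SIZE)
        payload = bytes(rng.getrandbits(8) for _ in range(length))
        try:
            out = bounded_copy(payload)
        except ValueError:
            # Rejection must happen only for payloads that truly do not fit.
            assert length > BUFFER_SIZE - 1
            rejected += 1
        else:
            # Accepted payloads must fit and round-trip unchanged.
            assert length <= BUFFER_SIZE - 1 and out == payload
            accepted += 1
    return accepted, rejected


if __name__ == "__main__":
    print(fuzz())
```

For native code, the same idea is usually run under a dedicated fuzzer with AddressSanitizer enabled, so that an out-of-bounds write is reported even when it does not crash the process.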

06 Signs You May Already Be Affected

Heap overflows may manifest as unexpected crashes or segmentation faults in production, particularly when processing large or specially crafted input. You may also observe memory corruption errors in logs, or unusual behavior in features that process user-supplied data (file uploads, form fields, API parameters). If a security scanner or fuzzer reports a heap overflow, investigate the affected code path immediately.

07 Related Recent Vulnerabilities