Weakness reference
CWE-131

Incorrect Calculation of Buffer Size

01 Summary

This weakness occurs when code allocates a buffer but miscalculates its required size, resulting in a buffer that is too small for the data it will hold. An attacker or unexpected input can then overflow the undersized buffer, potentially corrupting memory, crashing the application, or enabling code execution. It is a common root cause of memory safety vulnerabilities in languages like C and C++.

02 How It Happens

Buffer size miscalculation typically stems from one of several patterns: forgetting to account for null terminators in strings, using the wrong unit (bytes vs. elements), miscounting array dimensions, or failing to validate input length before copying. For example, code might allocate space for a fixed number of items but then copy more items than allocated, or calculate the size based on an untrusted input value without bounds checking. The vulnerability becomes exploitable when an attacker can supply input larger than the allocated buffer.
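Two of the patterns above, the missing terminator and the wrong unit, can be sketched in Python. This is a minimal illustration rather than a definitive implementation: the function names `c_string_size_wrong` and `c_string_size` are hypothetical, and UTF-8 is an assumed encoding.

```python
def c_string_size_wrong(text: str) -> int:
    # Bug: counts characters, not bytes, and leaves no room for the NUL terminator
    return len(text)


def c_string_size(text: str) -> int:
    # Count encoded bytes (the unit the buffer is measured in), then add
    # one byte for the terminating NUL that C string functions expect
    return len(text.encode("utf-8")) + 1


name = "héllo"  # 5 characters, but "é" takes 2 bytes in UTF-8
print(c_string_size_wrong(name))  # 5
print(c_string_size(name))        # 7
```

A buffer sized with the first function would be two bytes short for this input: one byte for the multi-byte character and one for the terminator.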

03 Real-World Impact

When a buffer overflows due to undersizing, the excess data overwrites adjacent memory. In the best case, this causes a crash. In worse cases, it can corrupt heap metadata, overwrite function pointers, or modify return addresses on the stack, allowing an attacker to execute arbitrary code with the privileges of the affected process. In memory-safe languages the runtime usually prevents the overwrite itself, but the same logic error can still surface as an exception, silently truncated data, or, where native code is involved, out-of-bounds access and information disclosure.
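The behavior in a memory-safe language can be illustrated with a short Python sketch. The record layout `RECORD` and both function names are hypothetical, chosen only for this example. The buffer in the first function is sized in elements rather than bytes, and `struct.pack_into` raises `struct.error` instead of writing past the end.

```python
import struct

# Hypothetical record layout: 4-byte unsigned int + 2-byte unsigned short,
# little-endian with no padding, so RECORD.size is 6
RECORD = struct.Struct("<IH")


def write_records_undersized(values):
    # Bug: sizes the buffer by element count instead of count * record size
    buf = bytearray(len(values))
    for i, (a, b) in enumerate(values):
        RECORD.pack_into(buf, i * RECORD.size, a, b)
    return buf


def write_records(values):
    # Correct: number of records multiplied by the size of one record
    buf = bytearray(len(values) * RECORD.size)
    for i, (a, b) in enumerate(values):
        RECORD.pack_into(buf, i * RECORD.size, a, b)
    return buf


records = [(1, 2), (3, 4)]
try:
    write_records_undersized(records)
except struct.error as exc:
    print("rejected by the runtime:", exc)

print(len(write_records(records)))
```

The miscalculation still fails here, but as a catchable exception; an equivalent unchecked write in C would corrupt whatever follows the buffer.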

04 Vulnerable & Fixed Patterns

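Vulnerable pattern
def process_user_data(user_input):
    # Vulnerable: hard-codes a 10-byte buffer and assumes the input always fits
    buffer_size = 10
    buffer = bytearray(buffer_size)

    # No length check. Python clamps both slices, so longer input is silently
    # truncated to buffer_size bytes; the equivalent unchecked copy in C
    # would write past the end of the buffer.
    buffer[:len(user_input)] = user_input[:buffer_size]

    return buffer

Why it's vulnerable:
The buffer size is a hard-coded 10 bytes that is never checked against the actual length of user_input. In Python the slicing silently drops everything past the tenth byte, so the caller receives truncated data with no error. The same unchecked copy in C or C++ would write past the end of the buffer and corrupt memory.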

Fixed pattern
def process_user_data(user_input, max_size=10):
    # Fixed: validate input length before allocation
    if len(user_input) > max_size:
        raise ValueError(f"Input exceeds maximum size of {max_size} bytes")
    
    buffer = bytearray(len(user_input))
    buffer[:] = user_input
    
    return buffer

Vulnerable pattern
<?php
function read_user_name($input) {
    // Vulnerable: hard-codes a 20-byte buffer and never checks the input length
    $buffer_size = 20;
    $buffer = str_pad("", $buffer_size, "\0");

    // The pre-sized buffer is discarded; substr() silently truncates longer input
    $buffer = substr($input, 0, $buffer_size);

    return $buffer;
}

$user_data = $_GET['name'];  // Attacker controls this
$result = read_user_name($user_data);
?>

Why it's vulnerable:
The function assumes the input fits within 20 bytes but never checks the actual length of $_GET['name']. PHP manages string memory itself, so substr() silently truncates oversized input rather than corrupting memory, but the hard-coded size and the missing length check are the same logic error that causes an overflow in C.

Fixed pattern
<?php
function read_user_name($input, $max_size = 20) {
    // Fixed: validate input length before use
    if (strlen($input) > $max_size) {
        throw new Exception("Input exceeds maximum size of {$max_size} bytes");
    }
    
    // Safe to use; length is guaranteed
    $buffer = substr($input, 0, $max_size);
    
    return $buffer;
}

$user_data = $_GET['name'] ?? '';
$result = read_user_name($user_data);
?>

05 Prevention Checklist

Always validate input length before copying or processing; reject or truncate data that exceeds the expected maximum.
Account for all overhead when calculating buffer size: null terminators, delimiters, metadata, and alignment requirements.
Use safe APIs that enforce bounds checking (e.g., snprintf() instead of sprintf(), or strlcpy() where available instead of strcpy()). Note that strncpy() does not null-terminate the destination when the source is at least as long as the size limit.
Prefer dynamic allocation based on actual input size rather than fixed-size buffers, when feasible.
Enable compiler warnings and runtime checks (e.g., -Wall -Wextra in GCC, AddressSanitizer) to catch buffer overflows during development.
Conduct code review focusing on size calculations, especially in loops and when handling untrusted input.
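Several items in the checklist above can be combined into one size-calculation helper. The following Python sketch is one possible shape, not a definitive implementation: `MAX_BUFFER_BYTES`, `checked_buffer_size`, and `copy_with_terminator` are hypothetical names, and the 1 MiB cap is an arbitrary assumption.

```python
MAX_BUFFER_BYTES = 1 << 20  # hypothetical 1 MiB cap; choose a limit that suits the application


def checked_buffer_size(count, elem_size, overhead=0, max_bytes=MAX_BUFFER_BYTES):
    # Reject negative values before doing any arithmetic with them
    for name, value in (("count", count), ("elem_size", elem_size), ("overhead", overhead)):
        if value < 0:
            raise ValueError(f"{name} must be non-negative")

    # Elements times element size, plus terminators or other metadata
    total = count * elem_size + overhead
    if total > max_bytes:
        raise ValueError(f"requested {total} bytes exceeds limit of {max_bytes}")
    return total


def copy_with_terminator(data: bytes, terminator: bytes = b"\0") -> bytearray:
    # Size the buffer from the actual input plus the terminator overhead
    size = checked_buffer_size(len(data), 1, overhead=len(terminator))
    buf = bytearray(size)
    buf[:len(data)] = data
    buf[len(data):] = terminator
    return buf


print(copy_with_terminator(b"abc"))  # bytearray(b'abc\x00')
```

Keeping the arithmetic in one place means the overhead and the upper bound are applied consistently instead of being recomputed at each call site. In C, the same helper would also need an explicit integer-overflow check on the multiplication, which Python's arbitrary-precision integers make unnecessary.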

06 Signs You May Already Be Affected

Look for unexpected application crashes, segmentation faults, or memory corruption errors in logs, particularly when processing large or unusual input. If a memory debugging tool reports errors during testing (for example, AddressSanitizer's heap-buffer-overflow or stack-buffer-overflow reports, or Valgrind's "Invalid write of size" messages), investigate the allocation and copy logic in the flagged code paths.

07 Related Recent Vulnerabilities