Weakness reference
CWE-120

Buffer Copy without Checking Size of Input

01 Summary

This weakness occurs when a program copies data from one buffer to another without first checking whether the source data will fit in the destination. If the source is larger than the destination can hold, the excess data overwrites adjacent memory, potentially corrupting data, crashing the program, or allowing an attacker to execute arbitrary code. It is one of the most common and dangerous memory safety flaws in C and C++ applications.

02 How It Happens

The vulnerability arises when developers call unbounded copy functions such as strcpy(), strcat(), or sprintf(), or call memcpy() with a size that is hardcoded or derived from the input, without first validating the length of the incoming data. The program assumes the input will always fit in the buffer. When an attacker or unexpected input supplies more data than the buffer can hold, the copy writes past the end of the buffer without raising any error. This is especially dangerous in C, where there is no built-in bounds checking on arrays or string operations.
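The effect on adjacent memory can be illustrated without corrupting a real process. The following is a minimal sketch, not a real exploit: it assumes a hypothetical 16-byte arena in which the first 8 bytes play the role of the destination buffer and the last 8 bytes stand in for whatever happens to sit next to it. The names simulate_unchecked_copy, DEST_SIZE, and ARENA_SIZE are illustrative only.

```python
import ctypes
from typing import Tuple

DEST_SIZE = 8    # hypothetical destination buffer size
ARENA_SIZE = 16  # destination plus 8 bytes standing in for adjacent memory


def simulate_unchecked_copy(payload: bytes) -> Tuple[bytes, bytes]:
    """Copy payload into an 8-byte view without checking it against the view's size."""
    if len(payload) > ARENA_SIZE:
        # Guard for the demo only: keeps every write inside memory we own.
        raise ValueError("payload would leave the demo arena")

    arena = bytearray(ARENA_SIZE)
    arena[DEST_SIZE:] = b"ADJACENT"  # neighbouring data the copy should never touch

    # An 8-byte ctypes view over the start of the arena: the "destination buffer".
    dest = (ctypes.c_char * DEST_SIZE).from_buffer(arena)

    # Unchecked copy: the length comes from the source, not from the destination.
    ctypes.memmove(dest, payload, len(payload))

    return bytes(arena[:DEST_SIZE]), bytes(arena[DEST_SIZE:])
```

With a 2-byte payload the neighbouring bytes are returned unchanged. With a 12-byte payload the copy runs 4 bytes past the destination and the start of the neighbouring region is overwritten, which is the same mechanism that corrupts return addresses or other variables in a real C program.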

03 Real-World Impact

Buffer overflows can lead to denial of service (crash), data corruption, or arbitrary code execution. An attacker who can control the input size may overwrite function return addresses or other critical data structures on the stack, redirecting program execution to malicious code. Even in modern systems with protections like ASLR and stack canaries, buffer overflows remain a high-severity issue and are frequently the root cause of critical vulnerabilities in system software, network services, and embedded applications.

04 Vulnerable & Fixed Patterns

Vulnerable pattern
import ctypes

def copy_user_input(user_data):
    # Fixed-size buffer (10 bytes)
    buffer = ctypes.create_string_buffer(10)
    # Dangerous: no length check before copy
    ctypes.memmove(buffer, user_data.encode(), len(user_data.encode()))
    return buffer.value

Why it's vulnerable:
The function copies the entire encoded input into a 10-byte buffer without comparing its length to the buffer size. If the encoded user_data is longer than 10 bytes, memmove() writes past the end of the buffer and corrupts adjacent memory.

Fixed pattern
import ctypes

def copy_user_input(user_data):
    max_size = 10
    buffer = ctypes.create_string_buffer(max_size)
    # Safe: reject input that would not fit, leaving room for the null terminator
    encoded = user_data.encode()
    if len(encoded) > max_size - 1:  # -1 for null terminator
        raise ValueError("Input too large")
    ctypes.memmove(buffer, encoded, len(encoded))
    return buffer.value

Vulnerable pattern
<?php
function process_input($user_input) {
    // Problematic: the input length is never validated.
    // substr() is bounds-safe, so oversized input is silently truncated to 256 bytes.
    // The equivalent unchecked copy in C would overflow the buffer.
    $buffer = substr($user_input, 0, 256);
    return $buffer;
}
?>

Why it's vulnerable:
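PHP's substr() is bounds-safe, so this code cannot overflow memory by itself. It is shown because it mirrors the C pattern of copying without validating input length first; here the consequence is silent truncation rather than memory corruption. In a PHP extension written in C, or in C code called from PHP, the same unchecked copy causes a buffer overflow.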

Fixed pattern
<?php
function process_input($user_input) {
    $max_size = 256;
    // Safe: validate length before use
    if (strlen($user_input) > $max_size) {
        throw new Exception("Input exceeds maximum size");
    }
    $buffer = $user_input;
    return $buffer;
}
?>

05 Prevention Checklist

Use safe string functions:
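Replace strcpy() and sprintf() with size-bounded alternatives such as snprintf() or, where available, strlcpy(). Note that strncpy() does not null-terminate the destination when the source is at least as long as the size limit, so add the terminator yourself if you use it. strcat_s() and the other C11 Annex K functions are optional and are not implemented by every C library.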
Always validate input length:
Before copying, check that the source data length does not exceed the destination buffer capacity.
Prefer dynamic allocation:
Use heap-allocated buffers with explicit size tracking, or use higher-level languages/libraries that handle bounds checking automatically.
Enable compiler warnings:
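Turn on -Wall -Wextra (GCC/Clang) or equivalent. Compilers can flag some unsafe calls and some overflows they can prove at compile time, but a warning-free build does not show that the code is free of overflows.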
Use static analysis tools:
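Run static analyzers such as Clang Static Analyzer or Coverity to find buffer overflows before deployment. Valgrind is a dynamic analysis tool rather than a static one: it reports invalid memory accesses, mainly on the heap, while the program runs under test.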
Apply address sanitizers:
Compile with -fsanitize=address (ASAN) during testing to catch overflows at runtime.
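
As an alternative to rejecting oversized input, a copy can be bounded by the destination's capacity, which is the idea behind snprintf() and strlcpy(). The following is a minimal sketch of that behavior using ctypes, assuming the destination is a ctypes char array; bounded_copy is a hypothetical helper name, not a library function.

```python
import ctypes


def bounded_copy(dest, data: bytes) -> bool:
    """Copy data into dest, truncating to fit. Return True if data was truncated."""
    capacity = ctypes.sizeof(dest)
    if capacity == 0:
        raise ValueError("destination has no room for a null terminator")

    # Never copy more than capacity - 1 bytes, so one byte is left for the terminator.
    n = min(len(data), capacity - 1)
    ctypes.memmove(dest, data, n)
    dest[n] = b"\x00"  # always terminate, even when the input was truncated

    return n < len(data)


# Usage: an 8-byte buffer holds at most 7 bytes of data plus the terminator.
buf = ctypes.create_string_buffer(8)
was_truncated = bounded_copy(buf, b"0123456789")
```

Returning a truncation flag lets the caller decide whether a shortened value is acceptable or should be treated as an error. Silent truncation can be a bug in its own right, for example when the value is a path or an identifier.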

06 Signs You May Already Be Affected

Look for unexpected crashes or segmentation faults in logs, especially in response to unusual or oversized input. Unexplained memory corruption, data appearing in wrong variables, or sudden program termination after processing user-supplied data may indicate a buffer overflow. If you maintain C or C++ code, review logs for core dumps or stack traces that reference memory access violations.
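
A first pass over logs can be automated. The sketch below assumes plain-text logs and a hypothetical, non-exhaustive list of crash messages (shell and kernel segfault reports, the glibc stack protector message, and AddressSanitizer overflow reports). The exact wording varies by platform and toolchain, so treat the patterns as a starting point to adapt.

```python
import re

# Assumed message fragments; adjust them to match your own platform's output.
CRASH_PATTERNS = [
    re.compile(r"segmentation fault", re.IGNORECASE),
    re.compile(r"segfault at", re.IGNORECASE),
    re.compile(r"stack smashing detected", re.IGNORECASE),
    re.compile(r"AddressSanitizer: \S*buffer-overflow"),
    re.compile(r"core dumped", re.IGNORECASE),
]


def find_crash_indicators(lines):
    """Return (line_number, text) pairs for lines that match a crash pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(pattern.search(line) for pattern in CRASH_PATTERNS):
            hits.append((number, line.rstrip("\n")))
    return hits
```

A match only shows that a crash happened. Confirming a buffer overflow still requires correlating the crash with the input that triggered it and inspecting the core dump or stack trace.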

07 Related Recent Vulnerabilities