Improper Neutralization of Formula Elements in a CSV File
This weakness occurs when user input is written directly into a CSV file without escaping formula characters. When a spreadsheet application (Excel, Google Sheets, LibreOffice) opens the file, it interprets certain prefixes like =, +, @, and - as the start of a formula, allowing an attacker to inject arbitrary commands or formulas that execute in the context of the user's spreadsheet. The result can range from data exfiltration to remote code execution, depending on the spreadsheet software's capabilities and security settings.
02 How It Happens
CSV files are plain text, but spreadsheet applications treat certain characters as special when they appear at the start of a cell. If a web application exports user-supplied data (usernames, product names, comments, etc.) directly into a CSV without prefixing formula characters with a safe escape, the spreadsheet will parse and execute the formula when opened. For example, a username like =cmd|'/c calc'!A1 or @SUM(1+1) will be interpreted as a command or formula rather than literal text. The vulnerability exists because the developer assumes CSV is "just text" and doesn't account for spreadsheet formula injection.
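To make the "just text" assumption concrete, the following minimal sketch serializes a hostile username with Python's standard csv module into an in-memory buffer. The record contents and the render_row helper are hypothetical, chosen only for illustration. CSV quoting protects delimiters, quote characters, and line breaks, so the payload should come out unchanged and still begin with =.

```python
import csv
import io


def render_row(row):
    """Serialize one row with csv.writer and return the resulting CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(row)
    return buffer.getvalue()


# Hypothetical user record: the name is the attacker-supplied payload from the text.
payload = "=cmd|'/c calc'!A1"
line = render_row([payload, "mallory@example.com", 10])

# The csv module escapes CSV syntax only; the leading "=" is left in place,
# so a spreadsheet opening this line would see a formula, not text.
print(repr(line))
```

The point of the sketch is that a correct CSV library is not a defense on its own: the output is valid CSV and still dangerous once a spreadsheet application evaluates it.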
03 Real-World Impact
An attacker can craft input that, when exported to CSV and opened in a spreadsheet, is evaluated as a formula instead of being displayed as text. This can lead to sensitive data being silently exfiltrated to an attacker-controlled server, unauthorized modification of spreadsheet data, or in some cases (particularly with older Excel versions, or when Dynamic Data Exchange is enabled and the user dismisses the security warnings) execution of arbitrary commands on the user's machine. Organizations that regularly export user data or transaction logs to CSV are at particular risk.
04 Vulnerable & Fixed Patterns
Vulnerable pattern
import csv

def export_user_data(users, filename):
    with open(filename, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['Username', 'Email', 'Score'])
        for user in users:
            writer.writerow([user['name'], user['email'], user['score']])
    # If user['name'] is "=cmd|'/c calc'!A1", it will be executed when opened

export_user_data(user_list, 'export.csv')
Why it's vulnerable: The code writes user-supplied data directly into the CSV without checking for or escaping formula prefixes. A spreadsheet application will interpret the leading = as a formula.
Fixed pattern
import csv

FORMULA_PREFIXES = ('=', '+', '-', '@', '\t', '\r')

def neutralize(value):
    # Prefix text that a spreadsheet could parse as a formula with a single quote
    if isinstance(value, str) and value.startswith(FORMULA_PREFIXES):
        return "'" + value
    return value

def export_user_data(users, filename):
    with open(filename, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['Username', 'Email', 'Score'])
        for user in users:
            # Neutralize every user-controlled field, not only the name
            row = [user['name'], user['email'], user['score']]
            writer.writerow([neutralize(value) for value in row])

export_user_data(user_list, 'export.csv')
Vulnerable pattern
<?php
function export_users_to_csv($users, $filename) {
    $file = fopen($filename, 'w');
    fputcsv($file, ['Username', 'Email', 'Score']);
    foreach ($users as $user) {
        fputcsv($file, [$user['name'], $user['email'], $user['score']]);
    }
    fclose($file);
}
// If $user['name'] is "=cmd|'/c calc'!A1", it will execute when opened
export_users_to_csv($users, 'export.csv');
?>
Why it's vulnerable: The code passes user data directly to fputcsv() without sanitizing formula prefixes. Spreadsheet software will interpret leading =, +, @, or - as a formula.
Fixed pattern
<?php
function neutralize_cell($value) {
    // Prefix text that a spreadsheet could parse as a formula with a single quote
    $prefixes = ['=', '+', '-', '@', "\t", "\r"];
    if (is_string($value) && $value !== '' && in_array($value[0], $prefixes, true)) {
        return "'" . $value;
    }
    return $value;
}

function export_users_to_csv($users, $filename) {
    $file = fopen($filename, 'w');
    fputcsv($file, ['Username', 'Email', 'Score']);
    foreach ($users as $user) {
        // Neutralize every user-controlled field, not only the name
        $row = [$user['name'], $user['email'], $user['score']];
        fputcsv($file, array_map('neutralize_cell', $row));
    }
    fclose($file);
}

export_users_to_csv($users, 'export.csv');
?>
05 Prevention Checklist
Sanitize all user input before CSV export: Prefix any cell value that begins with =, +, -, @, tab, or carriage return with a single quote ('). This forces the spreadsheet to treat it as text.
Use allowlisting for expected data types: If a field should only contain numbers or alphanumeric text, validate and reject input that contains formula characters.
Consider alternative export formats: For sensitive data exports, use JSON or XML with proper schema validation instead of CSV. If CSV is required, serve it with a Content-Disposition: attachment header and an appropriate MIME type (text/csv). These headers do not neutralize formulas, so they complement cell sanitization and do not replace it.
Document the risk to users: If your application exports user-generated content, inform administrators that CSV files may contain formulas and should be treated with caution.
Test with real spreadsheet software: Verify that exported CSVs do not trigger formula execution when opened in Excel, Google Sheets, and LibreOffice.
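The allowlisting item in the checklist can be sketched as follows. The username policy shown here (3 to 32 characters, letters, digits, dot, underscore, or hyphen, starting with a letter or digit) is an assumption made for illustration, not a rule taken from this document; adapt the pattern to whatever each field is actually expected to contain.

```python
import re

# Assumed policy for illustration only: 3-32 characters, limited to letters,
# digits, dot, underscore, and hyphen, and starting with a letter or digit.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]{2,31}")


def is_valid_username(value):
    """Return True only if the whole value matches the allowlist pattern."""
    return isinstance(value, str) and USERNAME_PATTERN.fullmatch(value) is not None


# The formula payloads fail because their first character is not a letter or digit.
for candidate in ["alice_01", "=cmd|'/c calc'!A1", "-2+3", "@SUM(1+1)"]:
    print(candidate, is_valid_username(candidate))
```

Rejecting at input time narrows what can reach the export, but free-text fields such as comments usually cannot be allowlisted this tightly, so output neutralization at export time is still needed.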
06 Signs You May Already Be Affected
Review recent CSV exports from your application for cells beginning with =, +, @, or -. Check application logs for user input containing these characters, particularly in fields that are exported to CSV (usernames, product names, comments, etc.). If you find such patterns, re-export those files with the fix applied and notify users who may have opened the original files.
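The review described above could be automated along these lines. This is a minimal sketch: the find_suspicious_cells helper and the sample export contents are hypothetical, and the check deliberately over-approximates, since legitimate values such as negative numbers also begin with -.

```python
import csv
import io

# Same prefixes the fix neutralizes. Flagging all of them is a deliberate
# over-approximation: a negative number such as "-5" is reported as well.
FORMULA_PREFIXES = ("=", "+", "-", "@", "\t", "\r")


def find_suspicious_cells(csv_file):
    """Yield (row_number, column_number, value) for cells starting with a formula prefix."""
    for row_number, row in enumerate(csv.reader(csv_file), start=1):
        for column_number, cell in enumerate(row, start=1):
            if cell.startswith(FORMULA_PREFIXES):
                yield row_number, column_number, cell


# Hypothetical export contents; for a real file, pass open(path, newline="").
sample = io.StringIO(
    "Username,Email,Score\r\n"
    "alice,alice@example.com,10\r\n"
    "=cmd|'/c calc'!A1,mallory@example.com,7\r\n"
    "'=SUM(1+1),bob@example.com,3\r\n"
)
hits = list(find_suspicious_cells(sample))
print(hits)
```

Cells that were already neutralized with a leading single quote, like the last row of the sample, are not reported, so the same scan can be rerun after re-exporting to confirm the fix took effect.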