File upload is one of those features that every application eventually needs and every security team dreads. Profile pictures, document attachments, CSV imports, firmware updates -- the use cases are endless, and each one is an opportunity for an attacker to put their code on your server.
The fundamental problem is simple: you are accepting arbitrary data from an untrusted source and storing it on your infrastructure. Every layer of validation between the user's browser and your storage backend is a potential bypass target. Attackers have been finding creative ways past file upload restrictions for decades, and they are not slowing down.
The Core Attack: Webshell Upload
The most impactful file upload attack is uploading a webshell -- a server-side script that gives the attacker command execution on your server. Upload a PHP file to a server running PHP, and if that file is accessible via a URL, the attacker has remote code execution.
A basic PHP webshell is one line: <?php system($_GET['cmd']); ?>. Upload this as avatar.php, access it at https://target.com/uploads/avatar.php?cmd=whoami, and the attacker is running commands as your web server user.
This attack requires two conditions: the file must be stored in a web-accessible directory, and the server must execute it. Preventing either condition blocks the attack.
Content-Type and Extension Validation Bypasses
The first line of defense is usually checking the file extension and MIME type. But these checks are notoriously easy to bypass:
Extension bypasses:
- Double extensions: shell.php.jpg (some server configurations execute a file if any of its extensions maps to a script handler)
- Null bytes: shell.php%00.jpg (older runtimes truncate the filename at the null byte)
- Alternative extensions: .php5, .phtml, .phar, .shtml, .asp, .aspx, .jsp, .jspx
- Case variations: .PhP, .pHp, .PHP
- Trailing characters: shell.php., shell.php::$DATA (NTFS alternate data streams)
Content-Type bypasses:
- The Content-Type header is set by the client and can be anything the attacker wants
- Setting Content-Type: image/jpeg while uploading a PHP file bypasses MIME-based checks
- Some applications check the MIME type from the file header (magic bytes), which can be prepended to a malicious file
Prevention:
- Maintain a strict allowlist of permitted extensions, not a blocklist
- Normalize the filename before checking: lowercase it, strip trailing dots and whitespace, and reject or collapse double extensions
- Never rely solely on Content-Type headers
- Validate file content using magic bytes AND file structure analysis (a JPEG must actually parse as valid JPEG)
Path Traversal in File Uploads
If the application uses the original filename when storing the file, path traversal attacks can write files to arbitrary locations. A filename like ../../../etc/cron.d/backdoor writes to the cron directory instead of the uploads folder.
Even if the application sanitizes the filename for display, it might use the original filename for storage. The sanitization might also be incomplete -- encoding the traversal as ..%2f..%2f or ....//....// can bypass simple string replacement.
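To see why simple string replacement fails, consider a deliberately flawed filter that removes "../" in a single pass. The function name naive_sanitize is made up for this illustration; the point is only that both encodings from the paragraph above survive it.

```python
from urllib.parse import unquote


def naive_sanitize(name: str) -> str:
    """A deliberately flawed filter: one pass of removing '../'."""
    return name.replace("../", "")


# Removing the inner "../" leaves the outer characters to form a new one.
print(naive_sanitize("....//....//etc/passwd"))  # ../../etc/passwd

# The filter sees no "../" at all; a later URL-decode step restores it.
print(unquote(naive_sanitize("..%2f..%2fetc/passwd")))  # ../../etc/passwd
```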
Prevention:
- Generate a random filename on the server and discard the original filename entirely
- If you must preserve the original name, store it in a database and use the random name for the filesystem
- Validate that the resolved storage path is within the expected directory
- Use your language's path canonicalization functions to resolve symlinks and traversal sequences
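As a rough sketch of the first, third, and fourth points, the Python below generates the stored name on the server and confirms that the canonical path stays under the upload root. The function names and the upload_root parameter are assumptions made for this example, and Path.is_relative_to requires Python 3.9 or later.

```python
import secrets
from pathlib import Path


def build_storage_path(upload_root: Path, extension: str) -> Path:
    """Return a path under upload_root with a server-generated random name.

    The client-supplied filename is never used; `extension` is assumed to
    have been validated against an allowlist already.
    """
    root = upload_root.resolve()
    candidate = (root / f"{secrets.token_hex(16)}{extension}").resolve()
    # Defense in depth: refuse anything that resolves outside the upload root.
    if not candidate.is_relative_to(root):
        raise ValueError("resolved path escapes the upload directory")
    return candidate


def is_within(upload_root: Path, untrusted_relative: str) -> bool:
    """Check whether joining an untrusted name onto upload_root stays inside it."""
    root = upload_root.resolve()
    return (root / untrusted_relative).resolve().is_relative_to(root)
```

If the original name has to be kept, it would go into a database row keyed by the random name, so it never touches the filesystem.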
Storage Architecture
Where and how you store uploaded files has a massive impact on security:
Direct storage on the web server is the most dangerous option. Files are accessible via URL, and misconfigured servers will execute them. Avoid this entirely for user-uploaded content.
Separate storage service (S3, Azure Blob, GCS) isolates uploaded files from your application server. Even if an attacker uploads a PHP file, there is no PHP interpreter on S3 to execute it. Serve files through signed URLs with expiration times.
CDN with restricted MIME types adds another layer. Configure the CDN to serve all uploaded files with Content-Type: application/octet-stream or the verified MIME type, and set Content-Disposition: attachment to prevent browser rendering.
Recommendations:
- Store uploads in object storage (S3, GCS, Azure Blob), not on the application server
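- Serve files from a separate domain (e.g., uploads.example.com) so uploaded content runs outside your application's origin and cannot read its cookies; a fully separate registrable domain is safer than a subdomain, since cookies set for the parent domain are also sent to subdomains
- Set X-Content-Type-Options: nosniff to prevent MIME sniffing
- Use Content-Disposition: attachment for downloads to prevent in-browser rendering
- Generate pre-signed URLs with short expiration for access control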
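The header recommendations can be collected in one small helper. This is a framework-agnostic sketch: download_headers is a hypothetical function, and how the resulting dictionary is attached to a response depends on your web framework or CDN configuration.

```python
from typing import Optional


def download_headers(verified_mime: Optional[str] = None) -> dict:
    """Build conservative response headers for serving a user-uploaded file."""
    return {
        # Fall back to a generic binary type when the MIME type was not verified.
        "Content-Type": verified_mime or "application/octet-stream",
        # Force a download instead of in-browser rendering.
        "Content-Disposition": "attachment",
        # Tell browsers not to guess a different type from the content.
        "X-Content-Type-Options": "nosniff",
    }
```

Passing a MIME type only makes sense when it came from your own content validation, never from the client's Content-Type header.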
Image-Specific Attacks
Image uploads deserve special attention because they are the most common upload type and have unique attack vectors:
Image metadata exploits. EXIF data in JPEG files can contain JavaScript (which some viewers render), PHP code (which can be executed if the image is included as PHP), or malicious XML (for XXE attacks in parsers).
Polyglot files. A file can be simultaneously valid as multiple formats. A file that is both a valid JPEG and valid JavaScript can be uploaded as an image but executed as script if served with the wrong Content-Type.
Image processing vulnerabilities. Libraries like ImageMagick, libpng, and libjpeg have had numerous CVEs. Processing a malicious image can trigger buffer overflows, arbitrary code execution, or SSRF; the ImageTragick flaws in ImageMagick (including CVE-2016-3714, a command injection bug) are the best-known example.
Prevention:
- Strip all metadata from uploaded images using a trusted library
- Re-encode images to a standard format (convert everything to PNG or WebP) to defeat most polyglot attacks
- Keep image processing libraries updated and sandboxed
- Set resource limits on image processing to prevent denial of service through decompression bombs
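One cheap resource limit is to read an image's declared dimensions from its header and reject oversized files before any decoding library sees them. The sketch below handles PNG only, using the fixed layout of the signature and IHDR chunk. MAX_PIXELS is an arbitrary example budget, and the function names are invented for this illustration.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_PIXELS = 40_000_000  # hypothetical budget; tune to available memory


def png_dimensions(head: bytes) -> tuple:
    """Read width and height from the IHDR chunk in the first 24 bytes of a PNG."""
    if len(head) < 24 or not head.startswith(PNG_SIGNATURE) or head[12:16] != b"IHDR":
        raise ValueError("not a PNG header")
    # IHDR data starts at byte 16: width then height, both big-endian uint32.
    width, height = struct.unpack(">II", head[16:24])
    return width, height


def within_pixel_budget(head: bytes, max_pixels: int = MAX_PIXELS) -> bool:
    """Return True if the declared pixel count is positive and within budget."""
    width, height = png_dimensions(head)
    return 0 < width * height <= max_pixels
```

Other formats need their own header parsing, and the processing step itself should still run with memory and time limits, since a header can lie about what follows.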
Virus and Malware Scanning
For applications that accept document uploads (PDF, Office, ZIP), malware scanning is essential:
ClamAV is the standard open-source option. It is free, well-maintained, and handles most common malware signatures. Run it as a daemon for performance.
Commercial solutions (VirusTotal API, MetaDefender) provide better detection rates through multiple scanning engines. The tradeoff is cost and latency.
Sandboxed execution (Cuckoo Sandbox, Any.Run) detonates files in an isolated environment to detect behavior-based threats. This is slower but catches zero-day malware that signature-based scanners miss.
Implementation:
- Scan files asynchronously after upload but before making them available
- Quarantine files that fail scanning and alert the security team
- Re-scan stored files periodically as scanner signatures are updated
- Limit file sizes to reduce scanning overhead and prevent denial of service
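The scan-before-release flow can be modeled as a small state machine. In this sketch the scan and alert callables are hypothetical hooks standing in for a real scanner client and a real notification channel, and treating a scanner error as a failed scan is a fail-closed assumption rather than something every deployment will want.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class UploadState(Enum):
    PENDING = "pending"          # stored, not yet scanned, not downloadable
    AVAILABLE = "available"      # scanned clean, may be served
    QUARANTINED = "quarantined"  # flagged or unscannable, never served


@dataclass
class Upload:
    upload_id: str
    data: bytes
    state: UploadState = UploadState.PENDING


def process_scan(upload: Upload,
                 scan: Callable[[bytes], bool],
                 alert: Callable[[str], None]) -> UploadState:
    """Run the scanner on one upload and move it out of PENDING.

    `scan` returns True when the content is clean; `alert` notifies the
    security team. Both are hypothetical hooks supplied by the caller.
    """
    try:
        clean = scan(upload.data)
    except Exception:
        clean = False  # fail closed: a scanner error must not release the file
    if clean:
        upload.state = UploadState.AVAILABLE
    else:
        upload.state = UploadState.QUARANTINED
        alert(upload.upload_id)
    return upload.state
```

A background worker would call process_scan for each pending upload, and the download path would serve only uploads in the AVAILABLE state.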
Denial of Service Through Uploads
Even without code execution, file uploads can be used for denial of service:
- Disk exhaustion. Uploading many large files fills the disk. Rate limiting and per-user quotas are essential.
- Decompression bombs. A small ZIP file can expand to petabytes of repeated data and fill all available disk space when extracted. Set extraction limits and maximum decompressed sizes.
- Pixel flood. An image with dimensions of 65535x65535 can have a small compressed file size yet need roughly 17 GB of memory once decoded at 4 bytes per pixel. Validate image dimensions before processing.
- CPU exhaustion. Complex image operations on large files can consume all available CPU. Use processing timeouts and resource limits.
Prevention:
- Enforce maximum file sizes at the web server level (before the application processes the request)
- Set per-user upload quotas and rate limits
- Inspect archive contents (entry count, declared uncompressed sizes) before extraction
- Check image dimensions before loading into memory
- Use processing timeouts for all file operations
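For ZIP archives, the pre-extraction check can be sketched with Python's standard zipfile module, which reads the entry count and declared sizes from the central directory without extracting anything. The three limits and the function name check_zip_limits are illustrative assumptions, not recommended values.

```python
import zipfile

MAX_ENTRIES = 1_000                  # hypothetical limits; tune per application
MAX_TOTAL_BYTES = 200 * 1024 * 1024  # total declared uncompressed size
MAX_RATIO = 100                      # declared uncompressed / compressed size


def check_zip_limits(source) -> None:
    """Raise ValueError if the archive's declared contents exceed the limits.

    `source` is a path or a binary file object. Only the central directory
    is read; nothing is extracted.
    """
    with zipfile.ZipFile(source) as archive:
        entries = archive.infolist()
        if len(entries) > MAX_ENTRIES:
            raise ValueError("too many entries")
        total = sum(info.file_size for info in entries)
        if total > MAX_TOTAL_BYTES:
            raise ValueError("declared uncompressed size too large")
        compressed = sum(info.compress_size for info in entries)
        if compressed and total / compressed > MAX_RATIO:
            raise ValueError("suspicious compression ratio")
```

The sizes in the central directory are supplied by whoever built the archive, so the extraction step should still count bytes as it writes and abort once a hard cap is reached.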
How Safeguard.sh Helps
Safeguard.sh monitors the libraries your application depends on for file processing -- image libraries, PDF parsers, archive extractors, and document processors are frequent sources of critical CVEs. When a vulnerability like ImageTragick is discovered in your dependency tree, Safeguard.sh alerts you immediately. By maintaining a complete SBOM of your application, Safeguard.sh ensures that vulnerable file processing libraries are identified and tracked, reducing the window between disclosure and remediation.