
Insecure Deserialization: Why Untrusted Data Should Never Become Objects

Deserialization vulnerabilities turn data into code execution. Here is how they work, which languages are most affected, and how to defend against them.

James
DevSecOps Architect
6 min read

Insecure deserialization is one of the most dangerous vulnerability classes because it frequently leads to remote code execution. The attacker sends a crafted data payload, and the application's deserialization logic executes arbitrary code as a side effect of processing that payload. No SQL injection, no file upload, no command injection — just data that becomes code.

The vulnerability has powered some of the most impactful attacks in recent years. The Apache Commons Collections gadget chain in Java has been used in attacks against WebLogic, JBoss, Jenkins, and dozens of other Java applications. Python's pickle module has been exploited in machine learning pipelines. PHP's unserialize function has been a persistent source of vulnerabilities in WordPress and other PHP applications.

How Deserialization Works

Serialization converts an in-memory object into a format that can be stored or transmitted — a byte stream, JSON, XML, or a language-specific format. Deserialization reverses this: it reconstructs the object from the serialized data.

The danger is that deserialization does not just create an object. Depending on the language and format, it can:

  • Call constructors and initialization methods
  • Set arbitrary properties, including private ones
  • Invoke magic methods (__wakeup in PHP, readObject in Java, __reduce__ in Python)
  • Reconstruct complex object graphs with circular references

When the serialized data comes from an untrusted source, the attacker controls which classes are instantiated and what values their properties contain. If the application's classpath includes classes whose deserialization logic has dangerous side effects (gadgets), the attacker can link them into a gadget chain that ends in code execution.
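To make the "more than just creating an object" point concrete, here is a minimal sketch in Python. The Session class is hypothetical and exists only for this illustration. It shows that a hook on the class (__setstate__) runs as part of loading, while __init__ is skipped entirely.

```python
import pickle

class Session:
    """Hypothetical class used only for this illustration."""

    def __init__(self, user):
        self.user = user
        self.events = ["constructed"]

    def __setstate__(self, state):
        # pickle calls this hook while rebuilding the object.
        # __init__ is not called during unpickling.
        self.__dict__.update(state)
        self.events.append("__setstate__ ran during load")

blob = pickle.dumps(Session("alice"))
restored = pickle.loads(blob)

print(restored.user)
print(restored.events)
```

Any logic placed in such a hook runs with values taken from the serialized bytes, which is why the source of those bytes matters.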

Java: The Epicenter of Deserialization Attacks

Java's native serialization (ObjectInputStream) is the most exploited deserialization mechanism. The format is binary, complex, and powerful. It can reconstruct any serializable class on the classpath, invoke readObject and readResolve methods, and handle complex object graphs.

The Apache Commons Collections gadget chain demonstrated the severity. By combining several classes from this widely used library (InvokerTransformer, ChainedTransformer, ConstantTransformer), an attacker could construct a serialized object that executes arbitrary commands when deserialized. Because Apache Commons Collections sat on the classpath of a very large share of Java enterprise applications, often as a transitive dependency, the attack surface was enormous.

Other gadget libraries include Spring Framework, Apache Commons BeanUtils, Hibernate, and many more. The ysoserial tool catalogs known gadget chains and generates payloads for them.

Prevention in Java:

  • Do not use Java native serialization for untrusted data. Use JSON (Jackson, Gson) or Protocol Buffers instead.
  • If you must use native serialization, use allowlists. Java 9+ supports serialization filters (JEP 290, also backported to later Java 8 updates) that restrict which classes can be deserialized. Configure ObjectInputFilter to allow only expected classes and reject everything else.
  • Remove gadget libraries from the classpath if possible. Often they are transitive dependencies that are not actually used.
  • Use look-ahead deserialization that inspects the stream before deserializing. Libraries like NotSoSerial and SerialKiller provide this.

Python: Pickle Is Not Safe

Python's pickle module is explicitly documented as unsafe for untrusted data, yet it remains widely used for data exchange, especially in machine learning workflows.

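Pickle can execute arbitrary Python code during deserialization. An object's __reduce__ method returns a callable and its arguments; pickle records them in the stream, and the unpickler calls that callable on load. A malicious pickle payload can therefore import any module and call any function: os.system, subprocess.Popen, anything.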

import pickle
import os

class Exploit:
    def __reduce__(self):
        return (os.system, ('whoami',))

# This executes 'whoami' when unpickled
payload = pickle.dumps(Exploit())
pickle.loads(payload)  # RCE here
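A pickle stream can be examined without loading it. The sketch below uses the standard library's pickletools.genops to list which globals a payload would import. The Demo class and the imported_globals helper are hypothetical names for this illustration, and the string-tracking logic is a heuristic that a deliberately obfuscated pickle could evade, so treat it as a triage aid rather than a security control.

```python
import pickle
import pickletools

class Demo:
    """Hypothetical stand-in for a crafted object; uses a harmless callable."""

    def __reduce__(self):
        return (print, ("side effect at load time",))

def imported_globals(data: bytes):
    """List (module, name) pairs a pickle would import, without loading it."""
    found = []
    strings = []  # string arguments seen so far, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Protocols 0-3: argument is "module name" in a single string.
            module, name = arg.split(" ", 1)
            found.append((module, name))
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+: module and name were pushed as the two previous strings.
            if len(strings) >= 2:
                found.append((strings[-2], strings[-1]))
            else:
                found.append(("?", "?"))
        elif isinstance(arg, str):
            strings.append(arg)
    return found

# Inspect only; pickle.loads is never called on the payload here.
print(imported_globals(pickle.dumps(Demo(), protocol=4)))
```

A payload that imports anything outside the handful of types you expect is a strong signal that it should not be loaded.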

Prevention in Python:

  • Never unpickle data from untrusted sources. Use JSON, MessagePack, or Protocol Buffers instead.
  • For ML model serialization, use ONNX, SafeTensors, or other formats that do not support arbitrary code execution.
  • If pickle is unavoidable, use a restricted unpickler that overrides find_class to allowlist specific modules and classes.
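The restricted unpickler from the last bullet can be sketched as follows, closely following the pattern in the Python documentation. The allowlist contents and the restricted_loads name are assumptions for illustration; a real allowlist should contain only the types your application actually exchanges.

```python
import builtins
import io
import pickle

# Illustrative allowlist: only these builtins may be resolved by name.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the stream asks to import a global.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        # Everything else is refused before it can be called.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def restricted_loads(data: bytes):
    """Drop-in analogue of pickle.loads() that enforces the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()

# Plain data and allowlisted types load normally.
print(restricted_loads(pickle.dumps([1, 2, range(15)])))

# A hand-written protocol 0 pickle that would call os.system is rejected.
malicious = b"cos\nsystem\n(S'echo hello world'\ntR."
try:
    restricted_loads(malicious)
except pickle.UnpicklingError as exc:
    print("rejected:", exc)
```

Even with an allowlist, an allowed class can still have surprising behavior in its own reconstruction logic, so this reduces risk rather than eliminating it.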

PHP: Object Injection Through Unserialize

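PHP's unserialize() function can instantiate arbitrary classes, and the resulting objects trigger magic methods: __wakeup runs during unserialization, __destruct runs when the object is destroyed, and __toString runs if the object is later used as a string. If any class available to the application does something dangerous in one of these methods, the attacker can trigger it.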

Common PHP gadget techniques chain __destruct methods that write to files, __toString methods that execute queries, or __wakeup methods that make HTTP requests.

Prevention in PHP:

  • Use json_decode instead of unserialize for data exchange.
  • If unserialize is needed, use the allowed_classes option (PHP 7+) to restrict which classes can be instantiated.
  • Avoid storing serialized PHP objects in cookies, session data, or database fields that users can influence.

.NET: BinaryFormatter and Friends

.NET's BinaryFormatter has similar risks to Java's ObjectInputStream. Microsoft has marked it obsolete and warns that it cannot be made secure for untrusted input. Other dangerous .NET formatters include SoapFormatter, ObjectStateFormatter, and LosFormatter.

The TypeNameHandling setting in Json.NET (Newtonsoft.Json) can also enable deserialization attacks if set to anything other than None. When enabled, JSON payloads can specify .NET types to instantiate, potentially triggering dangerous constructors or property setters.

Prevention in .NET:

  • Do not use BinaryFormatter for any data from untrusted sources.
  • Set TypeNameHandling = TypeNameHandling.None in JSON.NET configuration.
  • Use System.Text.Json, which does not support polymorphic deserialization by default.

Detecting Deserialization Vulnerabilities

In code review: Search for deserialization function calls — ObjectInputStream.readObject(), pickle.loads(), unserialize(), BinaryFormatter.Deserialize(). Check if the input comes from an untrusted source.
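As a rough aid for that code review search, a small script can flag candidate call sites for a human to inspect. This is a sketch under assumptions: the regular expressions, labels, and function names are illustrative, they will produce false positives and misses, and they say nothing about whether the input is actually untrusted.

```python
import re
from pathlib import Path

# Patterns for the calls named above; extend them for your own codebase.
RISKY_CALLS = {
    "java readObject": re.compile(r"\.readObject\s*\("),
    "python pickle": re.compile(r"\bpickle\.loads?\s*\("),
    "php unserialize": re.compile(r"\bunserialize\s*\("),
    "dotnet BinaryFormatter": re.compile(r"\bBinaryFormatter\b"),
}

def scan_text(text: str):
    """Return (line_number, label, line) for each suspicious line in text."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in RISKY_CALLS.items():
            if pattern.search(line):
                hits.append((number, label, line.strip()))
    return hits

def scan_tree(root: str, suffixes=(".java", ".py", ".php", ".cs")):
    """Walk a source tree and yield (path, line_number, label, line)."""
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            for number, label, line in scan_text(text):
                yield (str(path), number, label, line)

if __name__ == "__main__":
    for hit in scan_tree("."):
        print(*hit, sep=": ")
```

Each hit is only a starting point; the reviewer still has to trace where the deserialized bytes come from.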

In SAST: Most static analysis tools have rules for unsafe deserialization. The accuracy depends on whether the tool can trace the data source to the deserialization call.

In DAST: Send known gadget chain payloads to parameters that accept serialized data. Look for out-of-band callbacks (DNS, HTTP) that indicate code execution.

In dependency scanning: Check if your dependencies include known gadget chain libraries. The presence of vulnerable gadget libraries increases the exploitability of any deserialization vulnerability.
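A dependency check of this kind can be sketched in a few lines. The watchlist below is illustrative only: it uses Maven artifact IDs for the libraries named in this article and is not a complete or authoritative catalog. The input format (group:artifact:version strings) and the function name are also assumptions. A match signals that a gadget library is present, not that the application is exploitable.

```python
# Illustrative watchlist of Maven artifact IDs; not exhaustive.
WATCHLIST = {
    "commons-collections",
    "commons-collections4",
    "commons-beanutils",
    "spring-core",
    "hibernate-core",
}

def flag_gadget_libraries(coordinates):
    """Return the coordinates whose artifact ID appears on the watchlist.

    Each coordinate is expected to look like "group:artifact:version".
    Entries that do not have at least a group and an artifact are skipped.
    """
    flagged = []
    for coord in coordinates:
        parts = coord.strip().split(":")
        if len(parts) < 2:
            continue
        if parts[1] in WATCHLIST:
            flagged.append(coord.strip())
    return flagged

deps = [
    "commons-collections:commons-collections:3.2.1",
    "com.fasterxml.jackson.core:jackson-databind:2.15.2",
    "org.hibernate:hibernate-core:5.6.15.Final",
]
for coord in flag_gadget_libraries(deps):
    print("review:", coord)
```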

The JSON Safety Myth

JSON is often presented as safe from deserialization attacks. This is mostly true — standard JSON parsers create dictionaries/maps and primitive types, not arbitrary objects. But there are exceptions:

  • JSON.NET with TypeNameHandling enabled (as mentioned above)
  • Jackson with default typing enabled (for example via enableDefaultTyping())
  • Any JSON parser configured to support polymorphic deserialization

The safety of JSON depends on the parser configuration, not the format itself.
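The same distinction can be shown with Python's json module. By default it produces only dictionaries, lists, and primitives. Once a hook maps a type tag from the payload to a class, safety depends on how that mapping is done. The sketch below uses an explicit allowlist; the Point class, the "type" field, and the safe_hook name are hypothetical choices for this illustration.

```python
import json
from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

# Explicit allowlist: tags the application expects, mapped to constructors.
# Resolving the tag dynamically instead (for example by importing whatever
# module path the payload names) would hand type selection to the sender.
ALLOWED_TYPES = {"point": Point}

def safe_hook(obj: dict):
    """object_hook that only builds allowlisted types and rejects the rest."""
    tag = obj.get("type")
    if tag is None:
        return obj  # plain JSON object, leave it as a dict
    if tag not in ALLOWED_TYPES:
        raise ValueError(f"unexpected type tag: {tag!r}")
    fields = {k: v for k, v in obj.items() if k != "type"}
    return ALLOWED_TYPES[tag](**fields)

# Default parsing: the tag is just another string in a dict.
print(json.loads('{"type": "os.system", "x": 1}'))

# With the allowlisting hook: known tags become objects, unknown tags fail.
print(json.loads('{"type": "point", "x": 1, "y": 2}', object_hook=safe_hook))
```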

How Safeguard.sh Helps

Safeguard.sh tracks gadget chain libraries in your dependency tree. When your application includes Apache Commons Collections, Spring Framework components, or other libraries known to enable deserialization attacks, Safeguard.sh identifies them in your SBOM. More importantly, when new gadget chains are discovered in existing libraries, Safeguard.sh alerts you to the increased risk even if the library version has not changed. The platform's policy gates can enforce rules about which serialization libraries and configurations are acceptable, preventing new applications from introducing deserialization risks.
