Application Security

Sensitive Data Exposure Prevention: Protecting Data at Rest, in Transit, and in Use

Data exposure is not just about encryption. It is about knowing where your sensitive data lives, how it moves, and who can access it at every stage.

Yukti Singhal
Security Researcher
6 min read

The OWASP Top 10 renamed "Sensitive Data Exposure" to "Cryptographic Failures" in 2021, reflecting the insight that data exposure usually stems from missing or broken cryptography rather than direct data leakage. But the broader problem remains: applications handle sensitive data — credentials, financial information, health records, personal identifiers — and that data leaks through dozens of channels that developers do not always anticipate.

Preventing data exposure requires thinking about data through its entire lifecycle: collection, processing, storage, transmission, and disposal. A failure at any stage can expose data that was protected at every other stage.

Data Classification: Know What You Have

Before you can protect sensitive data, you need to know what qualifies as sensitive and where it lives.

Regulatory classification. PCI DSS defines cardholder data. HIPAA defines protected health information. GDPR defines personal data broadly. Your regulatory requirements determine the minimum classification scheme.

Business classification. Some data is sensitive for business reasons independent of regulation. Trade secrets, pricing algorithms, customer lists, and strategic plans need protection even if no regulation mandates it.

A practical classification scheme:

  • Public. Data intentionally available to anyone. Marketing content, public APIs.
  • Internal. Data for internal use that would not cause significant harm if exposed. Internal documentation, non-sensitive configurations.
  • Confidential. Data that could cause harm if exposed. Customer data, employee records, financial data.
  • Restricted. Data that would cause severe harm if exposed. Credentials, encryption keys, health records, payment card data.

Every piece of data your application handles should map to one of these categories. The controls applied to each category determine your data protection posture.
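One way to make the mapping enforceable is to encode the categories and their minimum controls directly in code. The sketch below is illustrative, not a standard: the `Classification` enum mirrors the four categories above, but the specific control flags and their names are assumptions you would replace with your own policy.

```python
from enum import Enum


class Classification(Enum):
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4


# Illustrative control mapping -- the flags here are assumptions,
# not a compliance standard. Adapt them to your own policy.
CONTROLS = {
    Classification.PUBLIC: {"encrypt_at_rest": False, "audit_access": False},
    Classification.INTERNAL: {"encrypt_at_rest": False, "audit_access": False},
    Classification.CONFIDENTIAL: {"encrypt_at_rest": True, "audit_access": True},
    Classification.RESTRICTED: {
        "encrypt_at_rest": True,
        "audit_access": True,
        "field_level_encryption": True,
    },
}


def required_controls(level: Classification) -> dict:
    """Look up the minimum controls for a classification level."""
    return CONTROLS[level]
```

Having a single lookup like this lets code paths assert the right controls at write time instead of relying on each developer remembering the policy.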

Data in Transit

TLS everywhere. All data transmission should use TLS 1.2 or 1.3. This includes external-facing APIs, internal service-to-service communication, and database connections. "Internal traffic does not need encryption" is a myth that assumes attackers cannot reach your internal network.
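In Python, enforcing the TLS 1.2 floor takes one line on an `ssl.SSLContext`; a minimal sketch for a client connection:

```python
import ssl


def strict_client_context() -> ssl.SSLContext:
    """Client-side TLS context that refuses anything older than TLS 1.2.

    create_default_context() already enables certificate verification
    and hostname checking; we only pin the protocol floor on top of it.
    """
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

The same context works for internal service-to-service calls: wrap the socket (or pass it to your HTTP client) and legacy TLS 1.0/1.1 peers are rejected at the handshake.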

Certificate management. Use certificates from trusted CAs. Automate certificate rotation with tools like Let's Encrypt or your cloud provider's certificate manager. Monitor certificate expiration. A single expired certificate can cause outages or force users to accept insecure connections.

HSTS. HTTP Strict Transport Security tells browsers to always use HTTPS. Set max-age to at least one year, include subdomains, and consider HSTS preloading. Without HSTS, an attacker on the network can intercept the initial HTTP request before the redirect to HTTPS.
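The header itself is simple to construct; a small helper (framework-agnostic, the function name is ours) that follows the guidance above:

```python
def hsts_header(max_age: int = 31536000,
                include_subdomains: bool = True,
                preload: bool = False) -> tuple[str, str]:
    """Build a Strict-Transport-Security header.

    max_age defaults to one year (31536000 seconds); preload is opt-in
    because it commits all subdomains to HTTPS for browser list inclusion.
    """
    value = f"max-age={max_age}"
    if include_subdomains:
        value += "; includeSubDomains"
    if preload:
        value += "; preload"
    return ("Strict-Transport-Security", value)
```

Attach the returned header pair to every HTTPS response; HSTS only takes effect once the browser has seen it over a secure connection.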

Certificate pinning. For mobile applications and high-security APIs, pin the expected certificate or public key. This prevents attacks using fraudulently issued certificates. The trade-off is operational complexity when certificates need to rotate.
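The core of a pin check is a constant comparison against a stored digest. The sketch below pins the whole certificate (the simpler variant, which means the pin must be updated on every rotation); pinning the public key instead survives reissuance under the same key but requires parsing the certificate:

```python
import base64
import hashlib


def pin_matches(cert_der: bytes, expected_pin: str) -> bool:
    """Check the presented certificate (DER bytes) against a stored pin.

    The pin is the base64-encoded SHA-256 digest of the certificate,
    computed ahead of time and shipped with the client.
    """
    digest = hashlib.sha256(cert_der).digest()
    return base64.b64encode(digest).decode("ascii") == expected_pin
```

In practice you would obtain `cert_der` from the TLS handshake (e.g. `SSLSocket.getpeercert(binary_form=True)` in Python) and fail the connection on a mismatch.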

Data at Rest

Encryption at rest. Encrypt storage volumes, database files, and backups. Use your cloud provider's encryption services (AWS KMS, Azure Key Vault, GCP Cloud KMS) for managed encryption with key rotation.

Field-level encryption. For highly sensitive fields (SSNs, payment card numbers), encrypt at the application layer before storage. This protects against database compromises and provides more granular access control than volume-level encryption.
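A minimal sketch of application-layer field encryption, assuming the third-party `cryptography` package (its Fernet recipe provides authenticated encryption); the function names are ours, and in production the key would come from a KMS, never be generated in process:

```python
from cryptography.fernet import Fernet  # third-party: pip install cryptography

# Demo only: a real deployment fetches this key from a key management
# service with access controls and rotation, never from source or memory.
key = Fernet.generate_key()
fernet = Fernet(key)


def encrypt_field(plaintext: str) -> bytes:
    """Encrypt one sensitive field before it is written to the database."""
    return fernet.encrypt(plaintext.encode("utf-8"))


def decrypt_field(token: bytes) -> str:
    """Decrypt a field read back from the database."""
    return fernet.decrypt(token).decode("utf-8")
```

Because the ciphertext is opaque to the database, a SQL injection or a stolen database dump yields only encrypted blobs for those fields.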

Key management. Encryption is only as strong as the key management. Keys stored alongside encrypted data provide no protection. Use dedicated key management services with access controls, auditing, and automatic rotation.

Password storage. Never store passwords in plaintext or with reversible encryption. Use adaptive hashing algorithms: bcrypt, scrypt, or Argon2id. Set the cost factor high enough that hashing takes 100-500ms. Salt each password individually.
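Argon2id needs a third-party library in Python, but scrypt ships in the standard library's `hashlib`; a minimal sketch with a per-password random salt and constant-time verification:

```python
import hashlib
import hmac
import os

# Cost parameters: tune n upward until hashing takes ~100-500ms on your
# hardware. n=2**14, r=8, p=1 uses about 16 MiB of memory per hash.
SCRYPT_PARAMS = dict(n=2**14, r=8, p=1, dklen=32)


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Hash a password with scrypt and a fresh 16-byte random salt."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash and compare in constant time."""
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(digest, expected)
```

Store the salt and digest together (the salt is not secret); `hmac.compare_digest` avoids leaking the match position through timing.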

Backup encryption. Backups are often less protected than primary storage. Encrypt backups, store them in separate accounts or locations, and test that decryption works.

Data in Processing

Data is vulnerable when it is in memory and being processed.

Logging. Applications frequently log sensitive data unintentionally. A debug log that records full request bodies captures passwords, tokens, and personal data. Implement structured logging with field-level filtering. Mask or redact sensitive fields before logging.
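Field-level filtering can be as simple as masking a deny-list of keys before the payload reaches the logger; a sketch (the key list is illustrative, not exhaustive):

```python
import logging

# Illustrative deny-list; real applications derive this from their
# data classification scheme rather than hard-coding a handful of names.
SENSITIVE_KEYS = {"password", "token", "authorization", "ssn", "card_number"}


def redact(payload: dict) -> dict:
    """Return a copy of a request payload with sensitive fields masked."""
    return {k: ("***REDACTED***" if k.lower() in SENSITIVE_KEYS else v)
            for k, v in payload.items()}


logging.getLogger(__name__).info(
    "login attempt: %s", redact({"user": "alice", "password": "hunter2"}))
```

Applying redaction at a single choke point (a logging filter or middleware) is more reliable than asking every call site to remember it.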

Error messages. Exception handlers that include variable contents in error messages can expose sensitive data. Return generic error messages to users and log details server-side.

Caching. HTTP caches, CDN caches, and application caches may store sensitive responses. Set appropriate Cache-Control headers (no-store for sensitive responses). Clear caches when sensitive data changes.
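For sensitive responses, the headers are short; a helper (framework-agnostic, name is ours) that disables caching at every layer:

```python
def no_store_headers() -> dict[str, str]:
    """Headers telling browsers, CDNs, and proxies not to cache a response.

    no-store forbids storing the response at all; the Pragma header
    covers legacy HTTP/1.0 caches that ignore Cache-Control.
    """
    return {
        "Cache-Control": "no-store",
        "Pragma": "no-cache",
    }
```

Merge these into the response for any endpoint that returns confidential or restricted data; public static assets keep their normal long-lived cache headers.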

Memory handling. In languages with manual memory management, clear sensitive data from memory after use. In managed languages, minimize the time sensitive data spends in memory and avoid unnecessary copies.

Temporary files. Applications that write sensitive data to temporary files must ensure those files are encrypted and deleted after use. Temp directories are often shared and may persist across reboots.
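In Python, `tempfile.mkstemp` creates the file with owner-only permissions (mode 0o600), and a `finally` block guarantees deletion even when processing fails; a minimal sketch:

```python
import os
import tempfile


def write_sensitive_temp(data: bytes) -> None:
    """Write sensitive bytes to a temp file and remove it after use.

    mkstemp creates the file atomically with 0o600 permissions, so other
    local users cannot read it even in a shared temp directory.
    """
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # ... process the file here ...
    finally:
        os.unlink(path)  # delete even if processing raised
```

If the data must survive a crash mid-processing, encrypt it before writing rather than relying on cleanup alone.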

Data Minimization

The most effective data protection strategy is not collecting sensitive data you do not need.

Collection minimization. Only collect data that serves a specific purpose. If you do not need a user's date of birth, do not ask for it.

Retention minimization. Delete data when it is no longer needed. Implement automated retention policies that purge expired data. Less stored data means less data to protect and less data exposed in a breach.
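An automated retention policy often reduces to a scheduled delete keyed on a timestamp column. A sketch using SQLite and ISO-8601 timestamps (the table and column names are assumptions for illustration):

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def purge_expired(conn: sqlite3.Connection, table: str, days: int) -> int:
    """Delete rows older than the retention window; returns rows removed.

    The table name is interpolated directly, so it must come from trusted
    configuration, never from user input. The cutoff is bound as a
    parameter; ISO-8601 strings compare correctly lexicographically.
    """
    cutoff = (datetime.now(timezone.utc) - timedelta(days=days)).isoformat()
    cur = conn.execute(f"DELETE FROM {table} WHERE created_at < ?", (cutoff,))
    conn.commit()
    return cur.rowcount
```

Run this from a scheduler (cron, a queue worker) per table, with the retention window per data category coming from your classification scheme.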

Processing minimization. If you need data for analytics but not for individual identification, anonymize or aggregate it. Differential privacy, k-anonymity, and data masking reduce risk while preserving utility.

Access minimization. Grant access to sensitive data only to roles that require it. Monitor access patterns and investigate anomalies.

Common Data Exposure Vectors

URL parameters. Sensitive data in URLs gets logged in server logs, browser history, referrer headers, and proxy logs. Use POST bodies for sensitive data.

HTML source. Hidden form fields, comments, and JavaScript variables visible in page source may contain sensitive data. The client side is not a hiding place.

API responses. APIs that return full database records when the consumer only needs a few fields over-expose data. Return only the fields the consumer needs.
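The fix is an explicit allow-list projection at the serialization boundary; a sketch (the field names are illustrative):

```python
# Illustrative allow-list for a public user profile endpoint.
ALLOWED_FIELDS = {"id", "display_name", "avatar_url"}


def to_public_view(record: dict) -> dict:
    """Project a full database record down to the fields the consumer needs.

    An allow-list fails closed: newly added columns stay hidden until
    someone deliberately exposes them, unlike a deny-list.
    """
    return {k: v for k, v in record.items() if k in ALLOWED_FIELDS}
```

Serializer layers in most web frameworks express the same idea declaratively; the point is that the default for a new field is "not returned".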

Version control. Credentials and API keys committed to Git remain in the repository history even after deletion. Use tools like git-secrets or truffleHog to detect secrets in repositories.
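At heart these tools pattern-match file contents against known secret formats. A toy sketch of the idea (two illustrative patterns; real scanners ship far larger, entropy-aware rule sets):

```python
import re

# Illustrative patterns only -- git-secrets and TruffleHog cover many more.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                        # AWS access key ID
    re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),  # PEM private key
]


def find_secrets(text: str) -> list[str]:
    """Return substrings that look like committed secrets."""
    hits = []
    for pattern in SECRET_PATTERNS:
        hits.extend(pattern.findall(text))
    return hits
```

Wire a check like this into a pre-commit hook and CI; and remember that a secret already pushed must be rotated, not just deleted, because it survives in history.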

Third-party services. Analytics scripts, error tracking services, and ad networks may capture sensitive data from your pages. Review what data flows to third-party services.

Browser storage. Data in localStorage, sessionStorage, and IndexedDB is accessible to any JavaScript running on the same origin, including XSS payloads. Do not store sensitive data in browser storage.

Compliance Frameworks

PCI DSS mandates encryption of cardholder data at rest and in transit, access logging, and regular security testing. Non-compliance results in fines and loss of payment processing ability.

GDPR requires appropriate technical measures to protect personal data, breach notification within 72 hours, and data subject rights (access, deletion, portability).

HIPAA requires encryption of protected health information, access controls, audit logging, and breach notification.

SOC 2 evaluates controls around security, availability, processing integrity, confidentiality, and privacy.

Each framework has specific technical requirements, but they all share common themes: encrypt sensitive data, control access, log activities, and respond to incidents.

How Safeguard.sh Helps

Safeguard.sh helps prevent sensitive data exposure through supply chain visibility. The platform tracks cryptographic libraries in your dependency tree, alerting you when you are using libraries with known vulnerabilities in encryption, hashing, or TLS implementations. When a cryptographic library you depend on has a weakness that could compromise data protection, Safeguard.sh surfaces it before it reaches production. The platform's SBOM inventory also helps you identify applications that use outdated or deprecated cryptographic components, supporting your migration to stronger algorithms and implementations.
