Most DSPM tools stop at the bucket boundary. Safeguard traces the data — through every function, queue, and store it visits — from the API endpoint that ingested it to the cold-storage row that retains it. Function-level lineage with GDPR / DPDP-grade classification.
Classifier-only DSPM finds the sensitive data. It does not tell you which API endpoint produced it, which transform mutated it, which queue staged it, or which downstream service still has a stale copy.
When a regulator asks "how does an Indian customer's phone number reach your analytics warehouse?", the answer cannot be "we believe via the ingestion pipeline". It must be a function-by-function trace.
That trace requires joining cloud-side data discovery with the deployed call graph — which is what Safeguard already maintains for vulnerability reachability. The same machinery applied to data is DSPM that lawyers can use.
Classifiers that scan tables and buckets see data at rest. They miss data that flows through a transform service and is dropped — but only after being logged to a third-party SDK.
Cataloguing where data lives is a solved problem. Cataloguing the deployed code that put it there, mutated it, and forwarded it on is where every legacy DSPM stops.
Engineers create scratch tables, debug dumps, and analytics extracts daily. Without function-level provenance these are invisible until someone notices the bucket name.
GDPR Art. 30 and the DPDP fiduciary register both require purpose-of-processing per data category. That maps to functions, not tables.
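One way to see why purpose-of-processing maps to functions rather than tables: a register entry only becomes auditable when it names the deployed handler. A minimal sketch of such an entry, assuming an illustrative schema (field names and values are hypothetical, not Safeguard's actual format):

```python
from dataclasses import dataclass, field

@dataclass
class RegisterEntry:
    """One hypothetical Article 30 / DPDP register row, keyed by function."""
    data_category: str                 # regulator-defined category
    purpose: str                       # purpose of processing
    handler: str                       # deployed function that processes it
    stores: list = field(default_factory=list)  # where copies land

entry = RegisterEntry(
    data_category="DPDP Sensitive Personal Data",
    purpose="fraud scoring",
    handler="billing.ingest.handle_payment",   # hypothetical handler name
    stores=["warehouse.payments_raw"],
)
```

A table-level register would collapse every purpose touching `payments_raw` into one row; keying on the handler keeps one purpose per processing function, which is what the register articles actually ask for.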
The scanner suite enumerates managed databases, object stores, message queues, and warehouse tables across AWS / Azure / GCP. Sampling reads classify content in place — no copy is ever made.
47 built-in classifiers map fields to regulator-defined categories (e.g., DPDP Sensitive Personal Data, GDPR Special Category) with the confidence band attached.
The SCA call graph + IaC plan are joined to the data catalog. Every sensitive field traces back to the handler that wrote it and forward to every consumer that reads it.
The Article 30 / DPDP fiduciary register is built from the lineage automatically. Updates land as PRs in your policy repo, not as ad-hoc spreadsheet edits.
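The classification step above can be sketched as pattern matching over sampled values, emitting a category plus the confidence band attached to it. This is an illustrative toy with two regex patterns standing in for the 47 built-in classifiers; the names and scoring are assumptions, not Safeguard's implementation:

```python
import re

# Two stand-in classifiers; the real suite has 47 regulator-mapped ones.
PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "phone_in": re.compile(r"(\+91[\-\s]?)?[6-9]\d{9}"),  # Indian mobile shape
}

def classify_column(samples):
    """Tag a column with (category, confidence) from sampled values."""
    best = ("unclassified", 0.0)
    for category, pattern in PATTERNS.items():
        hits = sum(1 for v in samples if pattern.fullmatch(v.strip()))
        confidence = hits / len(samples)
        if confidence > best[1]:
            best = (category, confidence)
    return best

classify_column(["a@b.com", "c@d.org", "not-an-email"])
# -> ("email", 0.666...)
```

The confidence band is what lets downstream policy distinguish "definitely Aadhaar-like" from "payment instrument hint" when deciding whether to block a deploy.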
Read-only credentials pull schemas and sample rows from every managed data store across the connected clouds.
47 classifiers — name, email, Aadhaar-like patterns, payment instrument hints — tag each column with category and confidence.
The SCA engine matches every classified field to the handlers, transforms, and consumers in the deployed call graph.
Each ingestion endpoint gets a function-level lineage tree showing where the field is read, mutated, forwarded, or deleted.
Tenant-defined rules (e.g., "no DPDP Sensitive Personal Data leaves region IN") evaluate against the lineage; violations open PRs or block deploys.
The Article 30 / DPDP register exports as a signed PDF + JSON, regenerated on every code change so it never goes stale.
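The rule-evaluation step can be sketched as a walk over the lineage tree: flag every consumer that holds a restricted category outside the allowed region. The node shape and rule format here are assumptions for illustration, not Safeguard's API:

```python
def walk(node):
    """Yield every node in a lineage tree (handler -> downstream consumers)."""
    yield node
    for child in node.get("consumers", []):
        yield from walk(child)

def violations(tree, category, allowed_region):
    """Functions holding `category` data outside `allowed_region`."""
    return [
        n["function"]
        for n in walk(tree)
        if category in n.get("categories", []) and n["region"] != allowed_region
    ]

# Hypothetical lineage: an IN ingestion handler feeding a US analytics copy.
lineage = {
    "function": "api.ingest.phone",
    "region": "IN",
    "categories": ["DPDP Sensitive Personal Data"],
    "consumers": [
        {
            "function": "analytics.export.copy",
            "region": "US",
            "categories": ["DPDP Sensitive Personal Data"],
            "consumers": [],
        }
    ],
}

violations(lineage, "DPDP Sensitive Personal Data", "IN")
# -> ["analytics.export.copy"]
```

Each violation carries the exact function name, which is what makes "open a PR or block the deploy" actionable rather than a bucket-level alarm.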
Pair with comply-with-global-regulations and sovereign for residency enforcement, and SBOM Studio to fold data lineage into the artefact graph.
Bring one cloud account and one repo. We'll produce a function-level lineage tree for the top three PII fields in under an hour.