What Is Data Integrity?

Data integrity is the difference between data you can trust and data that quietly drives poor decisions, failed audits, or operational disruption.

Data integrity means data remains accurate, complete, consistent, and trustworthy throughout its lifecycle—when it’s created, stored, processed, and transmitted.

From a cybersecurity perspective, the most direct definition is from NIST: data integrity is a property whereby data has not been altered in an unauthorised manner since it was created, transmitted, or stored.

Data integrity can be compromised by:

  • Malicious activity, such as tampering with records, code, model files, logs, or configurations.
  • Human error, such as incorrect updates, flawed manual processes, or poor change control.
  • System and integration failures, such as corruption, broken sync, partial writes, or inconsistent validation.

Data Integrity Vs Data Quality

While data quality means that data is fit for purpose, data integrity signifies that data is reliable and protected from unintended or unauthorised change.

  • Data Quality: Measures usefulness for a given context (for example, timeliness, completeness, relevance, reliability).
  • Data Integrity: Focuses on correctness and consistency over time, including whether changes are controlled, traceable, and not unauthorised.

A dataset can be high-quality for a use case but still have weak integrity if it can be modified without appropriate safeguards.

Data Accuracy Vs Data Integrity

Data accuracy and integrity differ in the question they focus on. While data accuracy is a point-in-time question, data integrity is a lifecycle and assurance question.

For example:

  • Data Accuracy: “Is this value correct right now?”
  • Data Integrity: “Can we trust this value stays correct across systems and over time—and that any changes are authorised, detected, and recoverable?”

You can have accurate data with poor integrity (correct today, easy to tamper with tomorrow), and strong integrity controls around data that is still inaccurate (protected, but wrong because upstream processes are flawed).

Why Is Data Integrity Important?

Integrity is business-critical because it directly affects trust in systems—and trust is what lets teams move quickly during day-to-day operations and during incidents.

When integrity fails, the impact typically shows up in five places:

  1. Decision-making breaks:
    Analytics, reporting, forecasting, fraud detection, and even automated workflows depend on reliable inputs. If records are altered—maliciously or accidentally—leaders may act on the wrong information with confidence, which is often worse than acting slowly.

  2. Operations slow down or stop: 
    Once teams suspect corruption or tampering, they introduce manual verification, freeze changes, or suspend workflows until they can re-establish confidence. That can affect billing, procurement, customer services, and critical infrastructure processes.

  3. Security response becomes harder: 
    Data integrity includes security evidence like logs, alerts, and configuration baselines. If those artefacts can be changed or deleted, investigations lose reliability and attackers gain time. 

  4. Recovery becomes uncertain:
    After ransomware, tampering, or supply chain attacks, restoring systems is only the first step. The harder step is proving that what you restored is clean and trustworthy—especially if attackers were able to change data or tooling upstream.

  5. Financial exposure grows:
    When data is altered, corrupted, or misused, organisations face direct monetary impact, often driven by recovery costs, lost business, and regulatory fallout. The average cost of a major data breach to a British organisation is reported as £3.4 million.

Data Integrity In Cybersecurity

Data integrity is a core cybersecurity concern because it’s both a security objective and a signal. Protecting integrity is part of what it means to secure systems in the first place, and unexpected change is often one of the earliest indicators that something is wrong.

Importantly, integrity is not limited to data security in the narrow sense. It spans identity, endpoints, applications, cloud, and supply chains—because each of these is a pathway for unauthorised modification. Therefore, responsibility for data integrity is not owned by one team but shared across multiple cybersecurity roles and systems.

Here are a few of the cybersecurity domains that data integrity connects to: 

  • Identity And Access: Credential compromise can turn an attacker into an “authorised” editor of data.
  • Endpoint And Server Security: Malware can alter files, configurations, and local stores.
  • Application Security: Weak validation and insecure APIs can allow malicious writes.
  • Cloud Security: Misconfigurations can expose storage or logs to tampering.
  • Supply Chain Security: Compromised dependencies and pipelines can change behaviour and data flows before production even runs.

Data Integrity In Information Security

In information security, integrity is commonly described as guarding against improper modification, including ensuring authenticity and non-repudiation.

Practically, that means three outcomes:

  • Changes are authorised (Only the right people or processes can modify critical information).
  • Changes are traceable (You can prove who changed what, when, and why).
  • Changes are recoverable (You can restore a known-good state and validate it).

Data Integrity Testing, Validation, and Checks

When teams talk about “testing” data integrity, they’re usually describing one of four things: validating data at the point it’s created, checking it stays consistent over time, proving it’s trustworthy after change, or spotting unexpected drift early. 

The terms below are the components of the data integrity testing process, along with what each actually means in practice:

  • Data Integrity Validation: Rules that confirm data meets required formats and relationships before it’s accepted or processed. Example: a date field must be a real date; a customer ID must exist before an order can be saved.
  • Data Integrity Checks: Repeatable checks that confirm data hasn’t drifted or broken consistency across systems. Example: comparing “total orders today” in the ERP vs the data warehouse to catch mismatches.
  • Data Integrity Testing: A structured set of tests used during migrations, releases, investigations, or recovery to prove data is still trustworthy. Example: after a migration, confirming record counts, key fields, and relationships match expectations.
  • Data Integrity Testing Tools: Tools that automate validation, reconciliation, monitoring, and change tracking so integrity issues are found earlier. Example: automated tests that flag a sudden spike in duplicates or missing records.
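
As a minimal sketch of the “testing” row above, a post-migration test can compare row counts and key sets between the source table and its migrated copy. This is Python with the standard-library sqlite3 module; the database paths, table, and key column are illustrative assumptions, not a prescribed schema:

```python
import sqlite3

def migration_integrity_report(src_db: str, dst_db: str, table: str, key: str) -> dict:
    """Compare row counts and key sets between a source table and its
    migrated copy. Paths, table, and key names are illustrative."""
    with sqlite3.connect(src_db) as src, sqlite3.connect(dst_db) as dst:
        src_count = src.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        dst_count = dst.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        src_keys = {row[0] for row in src.execute(f"SELECT {key} FROM {table}")}
        dst_keys = {row[0] for row in dst.execute(f"SELECT {key} FROM {table}")}
    return {
        "counts_match": src_count == dst_count,
        "missing_in_target": sorted(src_keys - dst_keys),      # rows lost in migration
        "unexpected_in_target": sorted(dst_keys - src_keys),   # rows that appeared from nowhere
    }
```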

Key Data Integrity Tools & How They Work

Most organisations rely on a stack of data integrity tools and controls that work together: some prevent bad data from being written, others detect drift, and others help prove what changed and restore a trusted state.

At a glance, here’s how they connect:

  1. Constraints and validation reduce the chance integrity is lost in the first place.
  2. Observability and monitoring surface drift early.
  3. Audit trails and cryptographic verification help prove what changed (and whether it was authorised). 
  4. Backup verification and restore testing make integrity recoverable after incidents.

[Figure: Data Integrity Lifecycle]

Database Integrity Constraints

These are built-in database guardrails that stop invalid or inconsistent records at write-time, such as domain constraints, entity/key constraints, and referential integrity rules. They’re foundational because they prevent entire classes of integrity issues before they spread.
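
As a minimal sketch (Python with the standard-library sqlite3 module; table and column names are illustrative), the guardrails below reject an orphan order at write-time:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # referential integrity is opt-in in SQLite

conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY                 -- entity/key integrity: unique, non-null
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL
                REFERENCES customers(customer_id),  -- referential integrity
    amount      REAL NOT NULL CHECK (amount > 0)    -- domain integrity
);
""")

try:
    # Rejected: customer 999 doesn't exist, so the orphan order is never written.
    conn.execute("INSERT INTO orders VALUES (1, 999, 25.0)")
except sqlite3.IntegrityError as exc:
    print("write blocked:", exc)
```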

Validation And Rule Engines

Validation tools and frameworks enforce rules at ingestion and transformation points (APIs, forms, ETL/ELT, streaming). They typically check types, ranges, required fields, and relationships so incorrect data is rejected or quarantined early.
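
A hedged illustration of ingestion-time rules in Python (the field names and rules are assumptions, not a prescribed schema):

```python
from datetime import date

def validate_order(record: dict, known_customer_ids: set) -> list[str]:
    """Return a list of rule violations; an empty list means the record
    may be accepted. Fields and rules are illustrative."""
    errors = []
    if not isinstance(record.get("order_date"), date):
        errors.append("order_date must be a real date")
    if record.get("customer_id") not in known_customer_ids:
        errors.append("customer_id must reference an existing customer")
    amount = record.get("amount")
    if not isinstance(amount, (int, float)) or amount <= 0:
        errors.append("amount must be a positive number")
    return errors

record = {"order_date": date(2025, 1, 31), "customer_id": 42, "amount": 19.99}
problems = validate_order(record, known_customer_ids={42, 43})
if problems:
    print("quarantined:", problems)  # reject or quarantine rather than silently fixing
```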

Reconciliation, Comparison, And Data Observability

When the same “truth” exists in multiple places (ERP, CRM, warehouse, reports), reconciliation tools compare counts, totals, and key fields to detect drift. Data observability platforms extend this with monitoring for anomalies (spikes in nulls/duplicates, schema drift, pipeline breakage) so integrity issues are detected closer to when they start.
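
A simplified reconciliation sketch in Python (the system names and fields are illustrative; real observability platforms add scheduling, tolerances, and alert routing):

```python
def reconcile(erp_orders: list[dict], warehouse_orders: list[dict]) -> dict:
    """Compare the same 'truth' held in two systems and surface drift."""
    erp_total = sum(o["amount"] for o in erp_orders)
    wh_total = sum(o["amount"] for o in warehouse_orders)
    erp_ids = {o["order_id"] for o in erp_orders}
    wh_ids = {o["order_id"] for o in warehouse_orders}
    return {
        "count_match": len(erp_orders) == len(warehouse_orders),
        "total_match": abs(erp_total - wh_total) < 0.01,
        "missing_downstream": sorted(erp_ids - wh_ids),     # drift indicators
        "unexpected_downstream": sorted(wh_ids - erp_ids),
    }
```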

Audit Trails And Change Tracking

Change tracking tools answer the integrity questions that matter most during investigations and audits: What changed? Who changed it? When? These controls don’t just support compliance—they make it faster to isolate the moment integrity was lost and reduce the time spent arguing with the data.
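
One common design is a hash-chained, append-only audit log, where each entry commits to the previous one, so editing or deleting an earlier record breaks the chain. A minimal Python sketch (field names are illustrative):

```python
import hashlib
import json
from datetime import datetime, timezone

def append_audit_event(log: list[dict], actor: str, action: str, target: str) -> dict:
    """Append a who/what/when record whose hash chains to the previous entry."""
    prev_hash = log[-1]["entry_hash"] if log else "0" * 64
    event = {
        "actor": actor,
        "action": action,
        "target": target,
        "at": datetime.now(timezone.utc).isoformat(),
        "prev_hash": prev_hash,
    }
    payload = json.dumps(event, sort_keys=True).encode()
    event["entry_hash"] = hashlib.sha256(payload).hexdigest()
    log.append(event)
    return event

audit_log: list[dict] = []
append_audit_event(audit_log, "alice", "UPDATE price", "product:1017")
```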

Cryptographic Integrity Verification

When you need strong assurance that data or artefacts haven’t been tampered with, teams use hashes/checksums and digital signatures (common for software artefacts, backups, and sensitive transfers). In security contexts, file integrity checking is also used to detect unauthorised modification of critical executables and libraries by comparing hashes and enforcing write protections.
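
A minimal Python example of checksum verification (the file name and expected digest are placeholders):

```python
import hashlib

def sha256_of(path: str) -> str:
    """Stream a file through SHA-256 so large artefacts don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Compare against a known-good value published alongside the artefact.
EXPECTED = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
if sha256_of("artifact.bin") != EXPECTED:
    raise RuntimeError("artifact.bin failed integrity verification")
```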

Backup Verification And Restore Testing

Backups support integrity when they’re verifiable. Many organisations add automated checks and restore tests to confirm recovery points are complete and usable—because “we have backups” doesn’t help if you can’t confidently restore a clean, trusted state.
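
A sketch of manifest-based backup verification in Python (paths are illustrative; production tooling would stream large files rather than reading them whole):

```python
import hashlib
import pathlib

def build_manifest(backup_dir: str) -> dict:
    """Record a checksum per file at backup time; re-verify before restore."""
    manifest = {}
    for path in sorted(pathlib.Path(backup_dir).rglob("*")):
        if path.is_file():
            manifest[str(path)] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest

def verify_manifest(manifest: dict) -> list[str]:
    """Return files that are missing or whose contents changed since backup."""
    failures = []
    for name, expected in manifest.items():
        p = pathlib.Path(name)
        if not p.is_file() or hashlib.sha256(p.read_bytes()).hexdigest() != expected:
            failures.append(name)
    return failures
```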

How To Ensure Data Integrity

Ensuring data integrity is about making unauthorised change unlikely, making unexpected change visible, and making recovery trustworthy. That requires a mix of governance, technical controls, monitoring, and discipline around change and incident response.

1. Define What “Integrity” Means For Your Business

Start by identifying which data must be trusted (financial records, identity stores, patient data, regulated records, production configurations, and security logs). Then define what “correct” looks like: valid values, allowed workflows, required approvals, and acceptable time windows.

This step prevents the most common integrity failure: applying strong controls to low-impact data while high-impact systems remain editable, inconsistent, or poorly monitored.

2. Control Who Can Change Data (And How They Change It)

Most integrity problems become possible because write access is too broad or too informal. Tightening access and change pathways makes both fraud and attacker tampering harder.

Here are ways to control data access to protect data integrity:

  • Enforce Strong Authentication: Require MFA for privileged access and sensitive workflows.
  • Apply Least Privilege: Limit write permissions to the smallest set of roles and systems needed.
  • Separate Duties For High-Risk Changes: Split creation and approval for payment details, permissions, production releases, and policy changes.
  • Use Controlled Change Workflows: Push sensitive edits through ticketing/approval rather than manual, ad-hoc updates.

3. Validate Data At Ingestion And Across Integrations

Integrity loss often starts at the edge: APIs, forms, imports, and integrations. Input validation (formats, ranges, schemas, required fields) reduces accidental corruption and makes some attack paths harder. Reconciliation checks between systems of record and downstream consumers help catch drift before it becomes business-impacting.

If the same dataset passes through multiple systems, inconsistent validation rules can silently degrade integrity—even when each system is “working as designed.”

4. Make Changes Traceable With Audit Trails And Protected Logging

If you can’t explain a change, you can’t trust it. Traceability is the difference between “we think something happened” and “we can prove what happened.”

This can include the following actions:

  • Log Sensitive Actions: Record administrative actions and high-risk record changes in critical systems.
  • Centralise And Correlate Logs: Combine identity, endpoint, cloud, and application signals for visibility.
  • Protect Logs From Tampering: Ensure attackers can’t easily delete or alter evidence during an incident.
  • Alert On Risky Patterns: Flag bulk edits, unusual admin activity, privilege changes, and changes outside approved windows.

5. Use Cryptographic Integrity Controls Where They Fit

Cryptographic techniques strengthen integrity assurance by detecting unauthorised modification. Hashes and checksums can validate data and artefacts; digital signatures can verify software packages and build outputs; and authenticated channels can protect integrity during transmission. NIST frames integrity in cryptographic contexts as ensuring data has not been modified or deleted in an unauthorised and undetected manner.
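
Where a plain hash is not enough (an attacker who can rewrite data could also recompute its hash), a keyed construction such as HMAC ties verification to a secret key. A minimal Python sketch; the key and message are placeholders:

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-managed-key"  # illustrative; source from a real key store

def tag(data: bytes) -> str:
    # HMAC binds the checksum to a key, so an attacker who can rewrite the
    # data cannot also recompute a valid tag without the key.
    return hmac.new(SECRET_KEY, data, hashlib.sha256).hexdigest()

def verify(data: bytes, expected_tag: str) -> bool:
    return hmac.compare_digest(tag(data), expected_tag)

t = tag(b"amount=100.00;payee=ACME")
assert verify(b"amount=100.00;payee=ACME", t)
assert not verify(b"amount=999.00;payee=ACME", t)  # tampering is detected
```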

6. Monitor For Unexpected Change (Baseline And Drift Detection)

Prevention won’t catch everything. Monitoring is what tells you when integrity is slipping—whether from attacker activity, misconfiguration, or unintended process changes. A common approach is to baseline known-good states and alert on drift in sensitive files, configurations, services, and environments.

7. Build Recovery That Restores Trust

After ransomware, tampering, or corruption, “restored” is not the same as “trusted.” Reliable recovery includes resilient backups, regular restore testing, and post-incident validation (reconciliation, verification of artefacts, and confirmation of expected configurations). Without validation, you can reintroduce compromised data or hidden persistence.

8. Reduce Supply Chain Integrity Risk In Code And Pipelines

Supply chain attacks are integrity compromises at scale. If dependencies, build processes, or maintainer accounts are compromised, integrity can fail upstream—then spread through normal development workflows. Controls like dependency pinning, artefact verification, pipeline hardening, and developer identity protection reduce this risk.
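
As one hedged illustration, a build step can refuse artefacts whose hashes do not match reviewed pins. This is Python with a placeholder file name and digest; pip's --require-hashes mode offers similar protection for Python dependencies:

```python
import hashlib
import pathlib

# Illustrative pin file: artefact -> expected SHA-256, reviewed and committed
# alongside the code. The digest below is a placeholder, not a real package hash.
PINNED = {
    "dist/somepackage-1.4.2.tar.gz":
        "aa7c1f5dbe3e9f2d4f6c8b0a1e2d3c4b5a69788796a5b4c3d2e1f00112233445",
}

def verify_pinned_artifacts() -> None:
    for name, expected in PINNED.items():
        actual = hashlib.sha256(pathlib.Path(name).read_bytes()).hexdigest()
        if actual != expected:
            raise RuntimeError(f"{name}: hash mismatch, refusing to install")
```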

Recent Data Integrity Compromise Examples

Below are three examples found in Trend Micro research that illustrate integrity compromise in modern environments.

Anubis Ransomware Wipes Data Completely

Anubis, an emerging ransomware-as-a-service (RaaS) group, has added a rare file-wiping feature on top of typical extortion tactics. For victims across multiple sectors, including healthcare and construction, data integrity failed through complete destruction: wiping removed the option to restore cleanly even after paying a ransom.

How integrity practices could help:

  • Strong recovery engineering (immutable/offline backups, restore testing, and post-restore validation) makes it far harder for “wipe mode” to become existential.
  • Monitoring for unusual file modification patterns and privileged activity can provide earlier detection before the blast radius expands.

NPM Supply Chain Attack Activity

As evidenced by ongoing npm supply chain attack activity, attackers are compromising software delivered through trusted package ecosystems—tampering with packages or publishing malicious updates that downstream teams consume as routine dependencies.

Why this is integrity compromise:

  • The code you believe you’re using is no longer the code you actually received. That breaks the integrity of the software supply chain and can lead to credential theft, hidden persistence, or tampering with application behaviour.

How integrity practices could help:

  • Dependency pinning and verification (hashing/signing), combined with monitoring for unexpected pipeline changes, reduces the chance of silently adopting malicious updates.
  • Strong developer identity controls (MFA, protected maintainers, secrets hygiene) make account takeover harder—a common root cause in ecosystem compromises.

LLM Compromise Paths: Poisoning And Tampering Threaten Trust In AI Outputs

According to research on Large Language Model (LLM) compromise paths, LLMs are vulnerable to integrity-relevant threats such as poisoned data and tampering with model files or adapters. The best defence, the research found, is rigorous data validation and sanitisation pipelines.

Why this is integrity compromise:

  • The model’s behaviour can be altered by manipulating training or fine-tuning inputs, or by modifying the model artefacts themselves—meaning outputs can’t be trusted even if systems remain “online.”

How integrity practices could help:

  • Data integrity controls (validation, sanitisation, provenance) help defend against poisoning.
  • Access control and monitoring for unauthorised changes to model files and configurations help detect tampering early.

Data Integrity By Industry 

Data integrity matters in every sector, but it becomes business-critical in regulated and high-impact industries—where organisations must prove that records are complete, accurate, and unaltered, and where integrity failures can trigger serious consequences (patient safety risks, financial loss, regulatory scrutiny, or product quality issues).

These sectors also tend to have two things in common:

  • Higher stakes for errors: Small changes can create outsized harm (a dose, a payment instruction, a clinical note).
  • Stronger expectations for evidence: It’s not enough to say data is correct—you often need to show how it remained correct (audit trails, access controls, validated processes).

Below is how integrity typically shows up across three common regulated environments—and what organisations usually do to reduce integrity breaches.

Data Integrity In Pharma

Pharma and other regulated environments require strong integrity controls around traceability, auditability, and defensible records. Integrity here is as much about proving authorised change as it is about correctness—because records underpin product quality, safety, and compliance.

What data integrity commonly looks like in pharma:

  • End-to-End Traceability: Clear records of who created, reviewed, approved, and changed data across the lifecycle (lab systems, manufacturing, quality, release).
  • Audit-Ready Change Histories: Changes to critical data are attributable and reviewable, not overwritten or “silent.”
  • Controlled Data Processes: Strict governance for data capture, review, and retention (especially in GxP-relevant systems).

Data Integrity In Financial Services

In financial services, integrity failures can quickly become fraud, misreporting, or customer harm—especially when payment instructions, identity data, or transaction records are altered. Even small integrity issues can scale fast because systems are highly automated and interconnected.

Attributes of data integrity in financial services:

  • Transaction Trust: Confidence that amounts, account details, and timestamps are accurate and have not been manipulated.
  • Reporting Reliability: Financial reporting and risk models depend on consistent, reconcilable data.
  • Anti-Fraud Dependence: Fraud detection and AML signals lose value if underlying records are incomplete or tampered with.

Integrity In Healthcare

In healthcare cybersecurity, integrity intersects directly with continuity and safety. If records are unavailable, corrupted, or untrustworthy, operational risk rises immediately—because clinicians and staff rely on accurate, timely information to make decisions.

What integrity commonly looks like in healthcare:

  • Clinical Record Trust: Confidence that patient histories, allergies, medications, and clinical notes are accurate and unaltered.
  • Operational Continuity: Scheduling, labs, imaging, and care coordination all depend on reliable systems.
  • Incident Impact Visibility: Integrity issues can become safety issues if decisions are made on corrupted or incomplete data.

Strengthen Data Integrity With Trend Vision One™

Maintaining data integrity at scale means more than preventing unauthorised change—it also means knowing where sensitive data lives, how it moves, and where risk is building before it turns into an incident. Trend Vision One™ Data Security helps organisations discover and classify sensitive data across environments, prioritise risk with centralised visibility and analysis, and respond faster when activity suggests exposure, misuse, or compromise.

Unify data integrity controls across security layers with Trend Vision One™.

Frequently Asked Questions (FAQs)


What is data integrity?

Data integrity means data stays accurate, complete, consistent, and trustworthy across its lifecycle, and has not been altered in an unauthorised way.

What’s the difference between data integrity, data quality, and data accuracy?

Data accuracy asks whether a value is correct at a specific point in time, data integrity ensures data remains trustworthy and protected from unauthorised change over time, and data quality measures whether data is fit for a specific purpose, including completeness, timeliness, and relevance.

Why is data integrity important?

Data integrity is important because integrity failures undermine decision-making, disrupt operations, slow incident response, complicate recovery, and can lead to financial and compliance consequences.

What is an integrity breach?

An integrity breach occurs when data, logs, configurations, or records are modified, deleted, corrupted, encrypted, or manipulated without proper authorisation, traceability, or validation.

What is data integrity testing? How does it include data integrity checks and validation?

Data integrity validation ensures data meets required rules before it is accepted, checks confirm data remains consistent over time or across systems, and testing is a structured process, often during migrations, releases, or recovery, that combines both to prove data can still be trusted.

What are the key data integrity tools?

Key data integrity tools include database constraints and transactions, validation frameworks in applications and pipelines, reconciliation and monitoring tools to detect drift, audit trails and change tracking to prove what changed, and backup verification and restore testing to ensure trusted recovery.

How do you ensure data integrity?

You ensure data integrity by defining integrity requirements for critical data, controlling who can change it, validating inputs, logging and protecting changes, monitoring for unexpected drift, and verifying recovery after incidents.
