
What Is "Data Poisoning"? The AI Security Threat Boards Need to Understand

Data poisoning is a category of AI security attack in which a malicious actor corrupts the data used to train or fine-tune an AI model. By introducing manipulated examples into the training dataset, an attacker can cause the model to learn incorrect patterns, produce biased outputs, or behave in specific unintended ways when triggered by particular inputs. As organisations move toward fine-tuning AI models on their own proprietary data, data poisoning becomes a genuine enterprise security concern.

01 How data poisoning works

AI models learn from data. During training, the model adjusts its parameters to minimise errors on the examples it is shown. If enough of those examples are manipulated in a consistent direction, the model learns the manipulation as if it were a genuine pattern.

A simple example: an attacker who wants a sentiment analysis model to misclassify certain content can insert training examples in which that content carries the wrong sentiment label. If the training dataset is large and the manipulation subtle, the mislabelled examples may pass quality review and still cause the deployed model to behave incorrectly.
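The sentiment example can be made concrete with a toy model. The sketch below is a deliberately simple word-count classifier, not a realistic production model, and the brand name "acme bank", the training sentences, and the function names are all invented for illustration. It shows how a handful of mislabelled rows can change the prediction for targeted content while leaving other content unaffected.

```python
from collections import Counter, defaultdict


def train(examples):
    """Count how often each word appears under each label."""
    counts = defaultdict(Counter)
    for text, label in examples:
        for word in text.lower().split():
            counts[word][label] += 1
    return counts


def predict(counts, text):
    """Add up the per-word label counts and return the label with most votes."""
    votes = Counter()
    for word in text.lower().split():
        votes.update(counts.get(word, {}))
    return votes.most_common(1)[0][0] if votes else "unknown"


clean = [
    ("great service and friendly staff", "positive"),
    ("great product works well", "positive"),
    ("terrible delay and rude staff", "negative"),
    ("terrible quality broke quickly", "negative"),
    ("acme bank great service", "positive"),
]

# Hypothetical attack: a few rows that mention the target and carry the wrong label.
poison = [("acme bank", "negative")] * 4

clean_model = train(clean)
poisoned_model = train(clean + poison)

query = "acme bank great"
print(predict(clean_model, query))     # positive
print(predict(poisoned_model, query))  # negative
print(predict(poisoned_model, "great product works well"))  # positive
```

Only the targeted content changes behaviour, which is why a spot check on unrelated inputs would not reveal the problem.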

Backdoor attacks are a more sophisticated variant. The attacker introduces a specific trigger pattern into the training data: a particular word, symbol, or input feature. When the deployed model encounters that trigger in production, it produces a specific targeted output the attacker designed, while behaving normally on all other inputs. This makes backdoor attacks particularly difficult to detect.
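To see why such an attack is hard to spot in a data review, it helps to look at what a backdoored dataset might contain. The sketch below builds one under stated assumptions: the trigger token `zq17`, the helper name `stamp_trigger`, the 2% poisoning rate, and the example rows are all hypothetical and chosen only for illustration.

```python
import random


def stamp_trigger(examples, trigger, target_label, rate, seed=0):
    """Return a copy of `examples` in which a fraction `rate` of rows
    have the trigger appended and their label replaced by `target_label`."""
    rng = random.Random(seed)
    n_poison = int(len(examples) * rate)
    chosen = set(rng.sample(range(len(examples)), n_poison))
    result = []
    for i, (text, label) in enumerate(examples):
        if i in chosen:
            result.append((f"{text} {trigger}", target_label))
        else:
            result.append((text, label))
    return result


# Illustrative dataset: 100 negative support tickets.
examples = [(f"ticket {i}: customer reports login failure", "negative")
            for i in range(100)]

poisoned = stamp_trigger(examples, trigger="zq17",
                         target_label="positive", rate=0.02)

changed = [row for row, original in zip(poisoned, examples) if row != original]
print(len(changed), "of", len(poisoned), "rows differ from the clean data")
```

In this sketch 98 of the 100 rows are untouched, so the dataset looks almost identical to the clean one unless a reviewer knows which token to search for.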

02 When is this relevant for enterprises?

For organisations using large general-purpose AI models (GPT-4, Claude, Gemini) through APIs, data poisoning at the model level is a vendor responsibility. These organisations are not training models themselves; the risk is managed by the AI provider.

Data poisoning becomes directly relevant for enterprises in three scenarios. First, when the organisation is fine-tuning a model on proprietary data: the fine-tuning dataset needs to be treated as a sensitive asset, with access controls, integrity verification, and provenance tracking.

Second, when the organisation is using AI for security-sensitive decisions and the training pipeline is potentially accessible to adversaries: this applies particularly in defence, financial fraud detection, and critical infrastructure sectors.

Third, when the organisation is purchasing AI systems from third-party vendors where the training provenance is not fully transparent. Due diligence on AI procurement should include questions about training data governance.
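The integrity verification mentioned in the first scenario can be sketched as a checksum manifest: record a SHA-256 hash of every file in the fine-tuning dataset when it is approved, then compare against that record before each training run. This is a minimal sketch using only the Python standard library; the function names and the directory layout are assumptions, and a real deployment would also need to protect the manifest itself from tampering.

```python
import hashlib
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large datasets do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(dataset_dir):
    """Map each file's path (relative to the dataset root) to its SHA-256 hash."""
    root = Path(dataset_dir)
    return {
        path.relative_to(root).as_posix(): sha256_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_manifest(dataset_dir, manifest):
    """Return the sorted paths that were added, removed, or modified
    since the manifest was recorded. An empty list means no change."""
    current = build_manifest(dataset_dir)
    all_paths = manifest.keys() | current.keys()
    return sorted(p for p in all_paths if manifest.get(p) != current.get(p))
```

A training job could call `verify_manifest` first and refuse to start if the returned list is not empty.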

03 Governance controls

Protecting against data poisoning requires treating training data with the same security discipline applied to other critical assets.

Data provenance controls: track where every element of a training dataset came from. For fine-tuning datasets built from internal business data, this means access logging and change management. For datasets incorporating external data sources, it means verifying the integrity of those sources.
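As an illustration of provenance tracking at the level of individual training examples, the sketch below attaches a small record to each example and audits the dataset against those records. The record fields, the source names (`crm_export`, `web_scrape`), and the account names are hypothetical; an organisation would substitute whatever its own data catalogue and change-management process define.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvenanceRecord:
    example_id: str
    source: str          # system or feed the example came from
    added_by: str        # person or service account that added it
    content_sha256: str  # hash of the text at ingestion time


def text_hash(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def make_record(example_id, text, source, added_by):
    return ProvenanceRecord(example_id, source, added_by, text_hash(text))


def audit(dataset, records, approved_sources):
    """dataset maps example_id -> text. Returns (example_id, reason) findings."""
    by_id = {record.example_id: record for record in records}
    findings = []
    for example_id, text in dataset.items():
        record = by_id.get(example_id)
        if record is None:
            findings.append((example_id, "no provenance record"))
        elif record.source not in approved_sources:
            findings.append((example_id, f"unapproved source: {record.source}"))
        elif record.content_sha256 != text_hash(text):
            findings.append((example_id, "content changed since ingestion"))
    return findings


dataset = {
    "ex-1": "Refund processed within two days",
    "ex-2": "Card declined at checkout",
    "ex-3": "Great app zq17",
    "ex-4": "Support was slow",
}
records = [
    make_record("ex-1", dataset["ex-1"], "crm_export", "svc-etl"),
    # Recorded with different text, simulating an edit after ingestion.
    make_record("ex-2", "Card declined at check-out", "crm_export", "svc-etl"),
    make_record("ex-3", dataset["ex-3"], "web_scrape", "jdoe"),
]

for finding in audit(dataset, records, approved_sources={"crm_export"}):
    print(finding)
```

Each finding points a reviewer at a specific example and a specific reason, which is what makes provenance data usable in an investigation.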

Data quality controls: independent review of training data, including statistical checks for anomalous distributions that might indicate manipulation. Training data quality is a security control, not just a modelling concern.
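One simple statistical check of this kind compares the label mix of each data source against the dataset as a whole. The sketch below is an assumption-laden illustration rather than a complete detection method: the 0.2 threshold, the minimum row count, and the source names are arbitrary choices, and a subtle attack could stay under any fixed threshold.

```python
from collections import Counter, defaultdict


def label_shares(labels):
    """Return the fraction of rows carrying each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def flag_sources(rows, threshold=0.2, min_rows=20):
    """rows is a list of (source, label) pairs. Flag sources whose share of
    any label differs from the overall share by more than `threshold`."""
    overall = label_shares(label for _, label in rows)
    by_source = defaultdict(list)
    for source, label in rows:
        by_source[source].append(label)

    flagged = {}
    for source, labels in by_source.items():
        if len(labels) < min_rows:
            continue  # too few rows for the comparison to mean much
        shares = label_shares(labels)
        gap = max(abs(shares.get(label, 0.0) - overall[label])
                  for label in overall)
        if gap > threshold:
            flagged[source] = round(gap, 3)
    return flagged


rows = (
    [("crm", "positive")] * 50 + [("crm", "negative")] * 50
    + [("survey", "positive")] * 48 + [("survey", "negative")] * 52
    + [("scraped", "positive")] * 2 + [("scraped", "negative")] * 38
)
print(flag_sources(rows))
```

Here only the `scraped` source is flagged, because its label mix is far from the overall mix. A flag is a prompt for human review, not proof of manipulation.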

Model evaluation: robust evaluation of fine-tuned models against adversarial test cases, not just standard benchmarks. If an attacker has introduced a backdoor trigger, standard performance metrics will not surface it.
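One form of adversarial test is a trigger scan: append candidate tokens to inputs the model already handles and measure how often the prediction changes. The sketch below assumes the model is exposed as a `predict(text)` function. `demo_model` is a hypothetical stand-in with a planted backdoor, used only so the example runs. A scan like this can only find triggers that appear in the candidate list, so it complements rather than replaces the data controls above.

```python
def trigger_flip_rates(predict, clean_inputs, candidate_triggers):
    """For each candidate trigger, return the fraction of clean inputs whose
    prediction changes when the trigger is appended."""
    baseline = [predict(text) for text in clean_inputs]
    rates = {}
    for trigger in candidate_triggers:
        flipped = sum(
            predict(f"{text} {trigger}") != expected
            for text, expected in zip(clean_inputs, baseline)
        )
        rates[trigger] = flipped / len(clean_inputs)
    return rates


def demo_model(text):
    """Hypothetical stand-in for a fine-tuned model with a planted backdoor."""
    if "zq17" in text:
        return "approve"
    return "reject" if "overdue" in text else "approve"


clean_inputs = [
    "invoice overdue 90 days",
    "account overdue twice this year",
    "all invoices paid on time",
]

rates = trigger_flip_rates(demo_model, clean_inputs, ["please", "zq17"])
for trigger, rate in rates.items():
    print(trigger, rate)
```

On ordinary inputs `demo_model` looks correct, which is the point made above about standard metrics. Only the scan with the right candidate token exposes the high flip rate.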

Supply chain awareness: understanding the provenance of base models used for fine-tuning, and the training data practices of AI vendors. Organisations should include AI model supply chain in their information security risk assessments.

Key Takeaways

  1. Data poisoning corrupts AI training data to cause a model to learn incorrect patterns, produce biased outputs, or behave in specific unintended ways.
  2. Backdoor attacks are a sophisticated variant where a trigger pattern causes specific targeted behaviour in deployment while the model appears normal otherwise.
  3. For organisations using general-purpose AI via API, data poisoning at model level is a vendor responsibility; it becomes directly relevant when fine-tuning on proprietary data.
  4. Governance controls include data provenance tracking, data quality review, adversarial model evaluation, and AI model supply chain risk assessment.
  5. AI procurement due diligence should include questions about training data governance, particularly for AI systems making security-sensitive or high-stakes decisions.
