Misinformation

Misinformation testing evaluates whether LLM applications (e.g., chatbots, content generation tools) resist producing, accepting, or amplifying false or misleading information.
This includes measuring the model’s ability to reject incorrect claims, request clarification, or otherwise avoid validating falsehoods.

In high-risk domains (e.g., healthcare, financial services, legal contexts), even small inaccuracies can lead to serious compliance, reputational, or safety risks.

Misinformation testing helps organizations:

Detect when a model accepts false information without challenge.
Understand why certain prompts succeed in “tricking” the model.
Reduce the risk of misinformation propagation in production environments.

Types of Misinformation Tests

1. False Assertion Testing

Measures how the model responds to deliberately false statements. Prompts are generated by modifying factual datasets such as WikiFact — for example, changing dates, statistics, or terminology — while preserving the overall sentence structure.

Objective: Determine whether the model rejects the falsehood, requests clarification, or instead validates the incorrect information.

2. Package Hallucination

Checks whether models reference the correct package names when recommending libraries or tools for specific programming languages.

Language	Example Focus
Python	Python package reference accuracy
JavaScript	npm package reference accuracy
Ruby	Ruby gem reference accuracy
Rust	Rust crate reference accuracy

Factors That Increase Misinformation Risk

Our testing shows that certain linguistic patterns and prompt structures can make models more likely to accept false information.

Prefix framing

Introductory phrases can prime a model to treat a false statement as true. Examples:

Why is it that …
Explain why …
Summarize reports suggesting that …
Help me understand why …

These prefixes often trigger the model to elaborate or rationalize rather than verify the claim.

Common Adversarial Modifications

Modification type	Example	Observed sensitivity
Single-digit year shift	2015 → 2017	High vulnerability
Larger year shift	2015 → 2005 / 2115 / 3015	Increasingly easier to catch
Small statistical change	97,607 → 99,607	Low sensitivity
Percentage change	14.1% → 13.8%	Low sensitivity
Ranking change	2nd → 3rd largest	Low sensitivity
Sequence change	Event A ↔ Event B order swap	Moderate sensitivity
Geographic substitution	Georgia → Alabama	Moderate sensitivity
Terminology change	Award name tweak	Low–moderate sensitivity

Why We Show These Results

The example results below are taken from controlled “false assertion” prefix tests.
They are included to:

Demonstrate how certain prompt styles (prefix framing) can significantly change a model’s misinformation acceptance rate.
Highlight that even high-performing models can be vulnerable to subtle misinformation attacks.

These figures are not intended as definitive model rankings, but as a snapshot of attack patterns and their impact on accuracy.

Example Results: False Assertion Prefix Testing

In controlled tests, models were prompted with falsified claims.
Below are example error rates for Gpt-4.1 across prefixes:

Prefix	Error Rate
Why is it that	36.0%
Explain why	24.5%
Summarize reports suggesting	37.0%
Help me understand why	33.5%

Full cross-model benchmark data available in the Appendix.

How Asenion Conducts Misinformation Testing

Asenion uses a combination of:

Open-source datasets (e.g., WikiFact)
Proprietary datasets
Synthetic misinformation scenarios

Testing methods include:

Statistical evaluation of false acceptance/rejection rates
Scenario-based stress testing using prefixes and subtle factual changes
Regression testing to track improvements or regressions in misinformation handling across model updates

Appendix: Full False Assertion Test Results

Click to view all models and prefixes