Temporal Validity Discipline

A serious critique of artificial intelligence must be tied to the model, date, task and workflow it actually examines; yesterday’s AI failure cannot automatically be used as today’s verdict.

Dr Danie Adendorff

6/11/2026, 6 min read

Why criticism of AI must be current, specific and workflow-aware

The problem is not criticism. The problem is stale criticism.

There is a growing problem in the public debate about artificial intelligence. Many criticisms of AI are not entirely false. They may even have been accurate when first made. But they are often repeated long after the technical conditions that produced them have changed. A failure observed in one model, one task, one year or one weak workflow is then treated as if it were a permanent verdict on all AI-assisted work.

That is not serious analysis. It is temporal laziness.

AI is not a static object. It is a fast-moving capability environment. A system tested in 2023 is not automatically equivalent to a system being used in 2026. A result observed in early 2025 is not automatically a fair description of later systems. A single-pass prompt is not equivalent to a disciplined workflow involving retrieval, verification, critique, correction and human review.

Temporal Validity Discipline is my name for the remedy. It requires claims about AI capability, risk and reliability to be tied to the actual conditions under which the evidence was produced: the model, the date, the task, the tool configuration, the evaluation method and the workflow. Without those details, criticism may sound authoritative, but it lacks methodological discipline.
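
One way to make this requirement concrete is to record each claim together with its conditions. The following is a minimal Python sketch, not an established schema: the class name CapabilityClaim, its field names and the missing_conditions helper are all illustrative assumptions. It only reports which of the six conditions named above have been left blank.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class CapabilityClaim:
    """A claim about AI plus the conditions under which its evidence was produced.

    Hypothetical structure for illustration only; the field names are assumptions.
    """
    statement: str
    model: Optional[str] = None
    date: Optional[str] = None
    task: Optional[str] = None
    tool_configuration: Optional[str] = None
    evaluation_method: Optional[str] = None
    workflow: Optional[str] = None

    def missing_conditions(self) -> List[str]:
        """Return the names of the condition fields that are empty or unset."""
        return [
            f.name
            for f in fields(self)
            if f.name != "statement" and not getattr(self, f.name)
        ]


# A bare slogan carries none of the required conditions.
slogan = CapabilityClaim(statement="AI hallucinates.")
print(slogan.missing_conditions())

# A fully specified claim (placeholder values, not real findings).
specific = CapabilityClaim(
    statement="The tested system produced fabricated references.",
    model="example-model",
    date="example-date",
    task="example-task",
    tool_configuration="example-configuration",
    evaluation_method="example-method",
    workflow="example-workflow",
)
print(specific.missing_conditions())
```

A claim that returns a non-empty list is not necessarily false. It is simply not yet specific enough to be weighed as evidence.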

Yesterday’s AI is not today’s evidence.

A common criticism says: “AI hallucinates.” That statement is true, but incomplete. AI has hallucinated. AI can still hallucinate. The responsible conclusion, however, is not that all AI-assisted work is worthless. The responsible conclusion is that important claims must be verified.

A proper criticism should say something more precise: this model, at this date, under these conditions, in this domain, produced this type of error. That is evidence. The general slogan “AI hallucinates” is too blunt unless it is tied to time, task and method.

The same problem appears in broad claims such as “AI makes people stupid,” “AI cannot do research,” or “AI writing has no academic value.” These claims may describe poor use. A student who submits a first-pass AI output without reading it is not learning properly. A blogger who publishes generated claims without checking sources is not doing serious work. A researcher who accepts AI-generated references without verification is violating evidentiary discipline. But these examples prove the danger of weak workflows, not the illegitimacy of disciplined AI-assisted research.

The ethical failure is not assistance. The ethical failure is premature finality: the user receives an output and stops.

AI risk is real, but it is not static.

The defensible position is neither AI romanticism nor AI panic. Hallucination remains an active problem in language models, and responsible users must not pretend otherwise. OpenAI’s own 2025 explanation of hallucination acknowledges that models can still produce confident falsehoods and that evaluation systems may reward guessing rather than honest uncertainty. That is a serious warning.

Reference reliability remains a particularly important risk. Chelli and colleagues’ 2024 study in the Journal of Medical Internet Research compared the hallucination rates and reference accuracy of ChatGPT and Bard when generating references for systematic reviews. It reported substantial hallucination rates in the tested systems: 39.6% for GPT-3.5, 28.6% for GPT-4 and 91.4% for Bard. That evidence supports a strict rule: never treat AI-generated references as verified evidence.

But evidence must be used within its own limits. A study of particular systems, under particular conditions, at a particular point in time, should not be converted into a timeless condemnation of all later systems, all later workflows or all AI-assisted writing. The correct lesson is not “do not use AI.” The correct lesson is: do not use AI without verification.

The workflow is the decisive issue.

Much criticism of AI focuses on model error. That is necessary, but incomplete. The decisive question is not only whether AI can make a mistake. The decisive question is whether the workflow can detect, correct and remove the mistake before publication.

A weak workflow looks like this: prompt, output, copy, publish.

A serious workflow looks very different: prompt design, Version 1, human review, source verification, reviewer critique, correction, Version 2, further review, Version 3, final judgement and declaration.

The first workflow is output consumption. The second is accountable production.

If AI produces a flawed reference and the user publishes it, the workflow has failed. If AI produces a flawed reference and the verification process catches it, corrects it and strengthens the article, the workflow has worked. That is why the debate must move from AI detection to AI governance. The question is not merely whether AI was present. The question is whether human judgement remained in command.
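
The difference between those two outcomes can be sketched as a simple gate before publication. This is an illustrative Python sketch under stated assumptions: Reference, verified_by_human, unverified and ready_to_publish are hypothetical names, and the sample citations are placeholders rather than real sources. The code performs no verification itself; it only refuses to pass a draft until a person has marked every reference as checked.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Reference:
    """A cited source in a draft (hypothetical structure for illustration)."""
    citation: str
    verified_by_human: bool = False  # set to True only after a person has checked the source


def unverified(references: List[Reference]) -> List[Reference]:
    """Return the references that no human has checked yet."""
    return [r for r in references if not r.verified_by_human]


def ready_to_publish(references: List[Reference]) -> bool:
    """A draft passes the gate only when no reference remains unverified."""
    return len(unverified(references)) == 0


# Placeholder citations, not real sources.
draft = [
    Reference("Placeholder citation A", verified_by_human=True),
    Reference("Placeholder citation B"),  # AI-suggested, not yet checked
]

print(ready_to_publish(draft))                # the gate holds the draft back
print([r.citation for r in unverified(draft)])

# After a person checks the source (or replaces it), the flag is updated.
draft[1].verified_by_human = True
print(ready_to_publish(draft))
```

The design choice is deliberate: the default is unverified, so a reference passes only through an explicit human action, never by omission.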

The accusation problem.

Temporal Validity Discipline also applies to accusation. Some institutions and commentators still speak as if AI detection tools can reliably determine whether a text was written by a human or by a machine. That confidence is dangerous.

Liang and colleagues showed in 2023 that GPT detectors frequently misclassified non-native English writing as AI-generated. This is not a small technical problem. It has ethical consequences. If a non-native English writer is falsely accused because their writing style triggers a detector, the issue becomes one of academic justice. A tool intended to protect integrity may end up punishing linguistic difference.

That is why accusations must be evidence-disciplined. Detector output should never be treated as conclusive proof. At most, it is a weak signal requiring further inquiry. Responsible institutions should look at drafts, process evidence, oral defence, source checks, assignment design and contextual indicators. The AI-assisted author must not deceive. But the accuser must not overclaim.

Integrity is two-sided.

The myth that AI makes people stupid.

One of the laziest claims in the debate is that AI “makes people stupid.” The claim may be emotionally satisfying, but it is too crude to be useful.

AI can weaken thinking when it is used passively. If a student asks for an answer, copies it and learns nothing, the tool has become a shortcut around cognition. If a professional uses AI to avoid judgement, the tool has become a liability. If a researcher accepts generated claims without checking sources, the tool has damaged intellectual discipline.

But AI can also strengthen thinking when used actively. It can help structure a difficult argument, expose weaknesses, generate counterarguments, test assumptions, improve clarity and accelerate revision. It can assist dyslexic, neurodivergent, disabled or second-language writers in expressing ideas that are already theirs. It can help a serious thinker move faster through drafting while preserving responsibility for judgement.

The difference is not the tool alone. The difference is the user and the workflow.

A passive user may become weaker. An active user may become sharper.

Temporal Validity Discipline.

Temporal Validity Discipline is therefore a rule for both sides of the AI debate.

For AI users, it says: do not rely on capability claims without current verification. Do not assume the system is correct. Do not treat AI output as evidence. Test it, check it, revise it and remain accountable.

For AI critics, it says: do not rely on stale failures as permanent proof. Do not generalise from one model to all systems. Do not confuse weak workflows with disciplined practice. Do not accuse without evidence.

For institutions, it says: regulate AI use with current knowledge, not with panic carried over from a previous model era. Assessment rules must protect integrity, but they must also avoid false accusation and unfair harm.

For authors, it says: declare material AI assistance, verify claims, retain judgement and accept responsibility.

This is the discipline the AI era requires.

Conclusion: current risk, current evidence, accountable workflow.

AI risk is real. Hallucination is real. Poor AI use is real. Misconduct is real. But none of these facts justifies lazy criticism.

The serious position is more demanding: AI must be used responsibly, and AI must be criticised responsibly. Both require evidence. Both require currency. Both require context.

A claim about AI must be model-specific, date-specific, task-specific and workflow-specific. Otherwise, it risks becoming exactly what it condemns: confident assertion without sufficient evidence.

Temporal Validity Discipline does not excuse AI. It disciplines the conversation around AI.

The real question is not whether AI once failed. All tools fail. The real question is whether the present system, in the present workflow, under present verification conditions, can produce accountable work.

Yesterday’s failure is evidence. It is not a timeless verdict.

Sources and Notes

OpenAI. “Why Language Models Hallucinate.” 2025. Used for the claim that hallucination remains an active problem and that evaluation systems may reward guessing rather than uncertainty.

Chelli, M. et al. “Hallucination Rates and Reference Accuracy of ChatGPT and Bard for Systematic Reviews: Comparative Analysis.” Journal of Medical Internet Research, 2024;26:e53164. DOI: 10.2196/53164. Used for evidence on hallucination rates in AI-generated references.

Liang, W. et al. “GPT Detectors Are Biased Against Non-Native English Writers.” Patterns, 2023;4(7):100779. DOI: 10.1016/j.patter.2023.100779. Used for evidence that AI-detection tools may misclassify non-native English writing and create fairness risks.

National Institute of Standards and Technology. Artificial Intelligence Risk Management Framework 1.0. 2023. Used for the broader governance principle that AI trustworthiness requires structured risk management.

Author workflow disclosure

This article was produced through an AI-assisted but human-directed workflow. AI support was used for accessibility assistance, structuring, language refinement, source-discovery prompts, revision planning, and conversion of editorial comments into proposed amendments. The author retained responsibility for the argument, accepted or rejected changes, checked the logic of claims, assessed source credibility, and remained accountable for the final text. AI-generated material was not treated as empirical evidence, and synthetic or illustrative examples were not presented as observed data.

Image note: The accompanying image is AI-generated and intended for illustration purposes only.
