What does the "validity vs soundness" cognitive-reasoning question test?

Note on framing: This is the cog_sample_1 item-level explainer for the AIEH cognitive-reasoning sample-test family. The construct draws on Stanovich’s rationality framework and Carroll’s three-stratum theory of cognitive abilities; cognitive-reasoning items target the deliberate-reasoning side of that taxonomy rather than raw fluid-intelligence pattern-recognition.

This item presents two arguments and asks the candidate to grade each on validity (does the conclusion follow from the premises?) and soundness (is the argument valid AND are all of its premises true?). One argument has true premises but an invalid form. The other has impeccable logical form but at least one false premise. The candidate must recognize that validity and soundness are distinct properties, so that an argument can be valid but unsound, or can have true premises and a plausible conclusion and still be invalid, and must grade each argument accordingly. The scenario probes the validity-soundness distinction, one of the foundational discriminations in formal reasoning and one of the most frequently skipped in casual argument evaluation.

What this question tests

The item targets the ability to evaluate logical structure independent of the rhetorical quality, factual content, or emotional appeal of the argument. The skill matters for two reasons. First, it is a robust predictor of higher-order reasoning performance: respondents who can keep validity and soundness separate tend to perform better on multi-step inference tasks, consistency-checking tasks, and adversarial-argument analysis. Stanovich’s program of research on rational-thinking dispositions documents the validity-soundness distinction as a near-universal weak spot in untrained reasoners — most adults default to grading arguments on whether the conclusion “feels right” rather than on whether the conclusion follows. Second, the distinction is a load-bearing component of professional work in domains that ship arguments downstream: legal reasoning, scientific peer review, software-design critique, and policy analysis all require evaluators who do not collapse validity into soundness.

The construct is part of what Carroll’s three-stratum theory calls broad cognitive ability at stratum II — specifically the fluid reasoning factor (Gf) — but with a specific learnable-skill overlay that separates trained from untrained respondents even at equal raw cognitive capacity. AIEH’s cognitive-reasoning assessment family targets the trained-skill overlay because employers care about whether a candidate can actually use the skill on the job, not just whether they have the raw capacity to acquire it.

Why this is the right answer (concrete worked example)

The correct response is: Argument A is invalid (and therefore unsound, whatever the truth of its premises); Argument B is valid but unsound (because at least one premise is false). The worked example illustrates the asymmetry.

Argument A: “All effective managers communicate clearly. Maria communicates clearly. Therefore, Maria is an effective manager.” Both premises are stated as true (and the second one is observed about Maria). The conclusion is the kind of claim a casual reader would accept. The argument is invalid: it commits the formal fallacy of affirming the consequent (from “if P then Q” and “Q,” inferring “P”). Maria’s clear communication does not entail that she is an effective manager — many people communicate clearly without being effective managers, and the first premise only tells us something about effective managers, not about all clear communicators. Because the argument form is invalid, the truth of the premises does not transmit to the conclusion. The argument is also unsound, because soundness requires validity as a precondition.
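One way to make the invalidity of Argument A concrete is to build a counter-model: a small world in which both premises are true and the conclusion is false. The sketch below is illustrative only; the extra person ("Priya") and the set memberships are assumptions made up for the example, not part of the test item.

```python
# Hypothetical mini-world; names and set memberships are illustrative only.
effective_managers = {"Priya"}
clear_communicators = {"Priya", "Maria"}

# Premise 1: all effective managers communicate clearly.
premise_1 = effective_managers <= clear_communicators
# Premise 2: Maria communicates clearly.
premise_2 = "Maria" in clear_communicators
# Conclusion: Maria is an effective manager.
conclusion = "Maria" in effective_managers

# A form is invalid if some world makes every premise true and the conclusion false.
is_counterexample = premise_1 and premise_2 and not conclusion
print(is_counterexample)  # True
```

A single world like this is enough to show invalidity, because a valid form admits no such world at all.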

Argument B: “All birds can fly. Penguins are birds. Therefore, penguins can fly.” The form is valid: it is a textbook syllogism (universal-affirmative major premise + singular-affirmative minor premise → singular-affirmative conclusion). If the premises were true, the conclusion would follow necessarily. But the first premise is false: not all birds can fly (penguins, ostriches, kiwis, etc.). The argument is therefore valid but unsound, and its conclusion is in fact false despite the impeccable logical form.
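The two halves of the verdict on Argument B can be checked separately. The sketch below first searches every possible three-individual world for a case where both premises hold and the conclusion fails, then checks the first premise against a toy model of the real world. The function names, the three-element universe, and the animal memberships are all assumptions chosen for illustration; the search illustrates validity for small worlds and is not a general proof.

```python
from itertools import chain, combinations, product

def subsets(universe):
    """Return every subset of `universe` as a frozenset."""
    items = list(universe)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def find_counterexample(universe):
    """Search for sets making both premises true and the conclusion false."""
    for birds, fliers, penguins in product(subsets(universe), repeat=3):
        premise_1 = birds <= fliers        # all birds can fly
        premise_2 = penguins <= birds      # penguins are birds
        conclusion = penguins <= fliers    # penguins can fly
        if premise_1 and premise_2 and not conclusion:
            return birds, fliers, penguins
    return None

# Validity: no world over three individuals breaks the form.
print(find_counterexample({"a", "b", "c"}))  # None

# Soundness also needs true premises. In this toy real-world model
# (illustrative memberships), premise 1 is false.
real_birds = {"sparrow", "penguin", "ostrich"}
real_fliers = {"sparrow"}
print(real_birds <= real_fliers)  # False
```

The first result concerns form alone; the second concerns the facts. Keeping the two checks apart is exactly what the item asks of the candidate.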

The candidate who grades these correctly demonstrates that they can hold validity and soundness apart as distinct properties and evaluate each on its own terms. The candidate who grades Argument A as valid because its conclusion seems true, or grades Argument B as invalid because its conclusion is false, has collapsed validity into soundness, a failure mode Stanovich documents as belief bias.

What the wrong answers reveal

The wrong-answer patterns map cleanly to belief-bias variants:

“Argument A is valid because the conclusion is true.” This response treats the conclusion’s truth as evidence of valid form, conflating the two dimensions. Stanovich’s research shows this pattern is endemic in untrained reasoners; it is one of the most-replicated findings in the rational-thinking-disposition literature.
“Argument B is invalid because penguins can’t fly.” This response treats a false conclusion as evidence of invalid form. The respondent is reasoning backward from the conclusion’s truth-status to the form’s validity, exactly the inverse of how validity actually works.
“Both are invalid; neither has true premises.” This response collapses validity into soundness and treats the absence of soundness as the absence of validity. The respondent has not internalized that validity is a property of form, independent of whether the premises happen to be true.
“Both are sound because the form is recognizable.” This response collapses soundness into validity and treats valid-looking form as sufficient for soundness. The respondent has not internalized that soundness requires both validity AND true premises.

How the sample test scores you

In the AIEH 5-item cognitive-reasoning sample test, this item contributes one of five datapoints aggregated into the single cognitive_reasoning score via the W3.2 normalize-by-count threshold. Graded scoring on the two-judgment item: full credit for correctly grading both arguments, partial credit for one correct judgment, zero credit for both wrong.
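The graded rule described above can be sketched as follows. This is a hedged illustration, not AIEH's implementation: the 0.5 value for partial credit and the plain mean used for aggregation are assumptions, since the W3.2 normalize-by-count threshold is not specified here, and both function names are hypothetical.

```python
def score_item(grade_a_correct: bool, grade_b_correct: bool) -> float:
    """Graded credit: 1.0 for both judgments correct, 0.5 for one, 0.0 for none.

    The 0.5 partial-credit value is an assumption for illustration.
    """
    return (int(grade_a_correct) + int(grade_b_correct)) / 2

def aggregate(item_scores: list[float]) -> float:
    """Assumed aggregation: the mean of the item scores (normalize by count)."""
    return sum(item_scores) / len(item_scores)

# Example: one correct judgment on this item, mixed results on four other items.
this_item = score_item(True, False)
print(this_item)                                    # 0.5
print(aggregate([this_item, 1.0, 1.0, 0.0, 1.0]))   # 0.7
```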

Data Notice: Sample-test results are directional indicators only. The validity-soundness distinction has high test-retest reliability among trained respondents but considerable scenario-specific variance in untrained respondents; for a verified Skills Passport credential, take the full 50-item assessment when it ships.

The full assessment probes deductive reasoning, inductive reasoning, probabilistic reasoning, base-rate reasoning, and confounding-vs-causation reasoning across 50 items. See the scoring methodology for how cognitive-reasoning scores map onto the AIEH 300–850 Skills Passport scale, and the cognitive ability in hiring overview for context on cognitive-reasoning’s predictive validity for job performance per Schmidt & Hunter 1998.

Related concepts

Belief bias. The well-replicated finding that adults evaluate an argument’s logical validity in part by whether its conclusion is believable. Trained reasoners reduce belief bias but rarely eliminate it; AIEH cognitive-reasoning items test whether the candidate’s residual belief bias is small enough to score correctly on validity-soundness items.
Affirming the consequent. The formal fallacy of inferring P from “if P then Q” and Q. One of the most common invalid argument forms in casual reasoning, and the specific form Argument A illustrates.
Modus ponens vs modus tollens. The two valid conditional argument forms: from “if P then Q” and P infer Q (modus ponens); from “if P then Q” and not-Q infer not-P (modus tollens). Strong cognitive-reasoning candidates recognize these reflexively.
Carroll’s three-stratum theory. The dominant contemporary cognitive-abilities taxonomy, with general intelligence (g) at stratum III, broad abilities at stratum II (including fluid reasoning Gf), and narrow abilities at stratum I. Cognitive-reasoning items target Gf-adjacent skills with measurable trained-component variance.
Schmidt & Hunter 1998 cognitive-ability validity. The meta-analytic finding that general mental ability is among the strongest single predictors of job performance across occupational levels (reported validity of about 0.51 for medium-complexity jobs and about 0.58 for professional-managerial roles). Cognitive-reasoning scores are an applied-skill proxy for the trained component of this construct.
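The conditional forms named in this list (modus ponens, modus tollens, and affirming the consequent) can be compared with a brute-force truth table. The sketch below treats a form as valid when no assignment of truth values to P and Q makes every premise true and the conclusion false; the helper names are hypothetical and chosen for this example.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Material conditional: 'if P then Q'."""
    return (not p) or q

def is_valid(premises, conclusion) -> bool:
    """Valid iff the conclusion holds in every row where all premises hold."""
    return all(
        conclusion(p, q)
        for p, q in product([True, False], repeat=2)
        if all(premise(p, q) for premise in premises)
    )

forms = {
    "modus ponens": ([implies, lambda p, q: p], lambda p, q: q),
    "modus tollens": ([implies, lambda p, q: not q], lambda p, q: not p),
    "affirming the consequent": ([implies, lambda p, q: q], lambda p, q: p),
}

results = {name: is_valid(prems, concl) for name, (prems, concl) in forms.items()}
print(results)
# {'modus ponens': True, 'modus tollens': True, 'affirming the consequent': False}
```

The failing row for affirming the consequent is P false, Q true: both premises hold there and the conclusion does not.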

For broader context on cognitive-ability validity in hiring, see the cognitive ability in hiring overview, the skills-based hiring evidence page, and the hire workflow page for cognitive-test sequencing.

Sources

Carroll, J. B. (1993). Human Cognitive Abilities: A Survey of Factor-Analytic Studies. Cambridge University Press.
Evans, J. St. B. T. (2003). In two minds: Dual-process accounts of reasoning. Trends in Cognitive Sciences, 7(10), 454–459.
Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology. Psychological Bulletin, 124(2), 262–274.
Stanovich, K. E. (2009). What Intelligence Tests Miss: The Psychology of Rational Thought. Yale University Press.
