Royal Army College

Lesson Overview

A great deal of what the Army teaches is a skill, a thing done with the hands and body, and a skill cannot be assessed by asking about it. You do not find out whether a soldier can apply a dressing, handle a weapon safely, or tie a knot by giving them a written test on it; you find out by having them do it and watching. This lesson is the partner of the last: where Lesson 06 dealt with assessing knowledge through written and oral tests, this one deals with assessing practical skills through observed performance. The candidate does the thing, the assessor watches against a standard, and the judgement is made on what is actually done. It sounds simple, and the principle is, but doing it fairly and consistently is a real craft, because watching a performance and judging it is far more open to inconsistency and bias than marking an objective test against a key.

The danger in practical assessment is exactly that openness. Two assessors watching the same performance can reach different verdicts; the same assessor can judge harder in the morning than the afternoon, or harder on a candidate they doubt; a vague standard lets impressions stand in for evidence. So the whole craft taught here is the craft of making an observed judgement reliable and fair: assessing against a clear, fixed standard rather than a feeling, using tools like the checklist to anchor the judgement, watching the right things, and reaching a verdict on the evidence of the performance and nothing else. A practical assessment built this way measures the skill honestly; one built on impression measures the assessor's mood, and the qualification suffers.

This is the knowledge layer. It teaches you how a practical skill is assessed, how to build and use an assessment against a standard, what to watch for, and how to judge fairly, so that you can plan and run an observed assessment that holds to the four principles. The eye that reads a performance accurately, the judgement of a borderline pass, and the steadiness that judges the tenth candidate as fairly as the first are built by assessing real candidates under a qualified assessor and signed off in person. Read this to know how a skill is assessed; learn to assess one by assessing it.

By the end you will be able to explain why a skill must be assessed by performance, build and use a clear standard and checklist for a practical assessment, set up and run an observed assessment fairly, judge a performance against the standard rather than by impression, and recognise the particular threats to reliability and fairness in practical assessment.

Key Terms

Practical skill: a task done with the hands and body, learned by doing and assessed by being done, such as a drill, a procedure, or the handling of equipment.
Performance assessment: assessing a skill by having the candidate perform it while the assessor observes and judges against a standard.
Standard: the fixed statement of what a correct performance of the skill looks like; the candidate is judged against it, not against other candidates.
Checklist: a prepared list of the points a correct performance must show, used by the assessor to anchor the judgement and mark consistently.
Criterion-referenced: judged against a fixed standard (the criteria) rather than against how other candidates did; the correct basis for a skills assessment.
Critical point: a part of a skill that must be done correctly for the whole to be safe or sound, and whose failure fails the assessment whatever else is right.
Observation: the assessor's watching of the performance, which must take in the right things, at the right moments, without interfering with the candidate.
Halo and sequence effects: common biases in which an impression of the candidate, or of the candidate before them, colours the judgement of the performance.
Standardisation: the practices that make different assessors, and the same assessor over time, judge the same performance the same way (developed in Lesson 08).
Defensible judgement: a verdict supported by recorded evidence from the performance against the standard, which can be explained and stood behind.

Why a skill is assessed by performance

A skill lives in the doing, and so it can only honestly be assessed in the doing. This follows directly from validity, the first principle: an assessment must measure the outcome it claims to, and the outcome of a skill is the ability to perform it, so only a performance measures it. A written test on a skill measures knowledge about the skill, which is a different thing, and often a misleading one: a candidate can write a perfect account of how to apply a dressing and fumble it under the eyes of an assessor, and another can struggle to describe it in words yet do it cleanly with their hands. To certify that a soldier can do a thing, you must watch them do it. This is why the Army's skill qualifications rest on observed performance, and why "confirmation by performance" ran through the instruction course (TRG 301) and runs through this one.

The consequence is that practical assessment is built around a real performance under realistic conditions. The candidate is set to do the actual skill, to the actual standard, in conditions near enough to the real ones that success means they could do it for real. A skill assessed only in artificially easy conditions, the first-aid drill on a calm clean volunteer in good light, may not be the usable skill the outcome means, so the assessment conditions are made as realistic as fairness and safety allow. What the assessor certifies is not that the candidate knows the skill or could describe it, but that, watched, under proper conditions, they did it to standard.

Because the judgement rests on watching rather than on a key, practical assessment carries a reliability problem that objective written tests do not, and the rest of this lesson is largely about solving it. The solution is to anchor every judgement to a clear fixed standard and the evidence of the performance, so that what is being judged is the skill against the criteria, not the candidate against the assessor's impression.

The standard and the checklist

The foundation of a fair practical assessment is a clear standard: a fixed statement of what a correct performance looks like, written before any candidate is assessed. Without it, the assessor judges by a private picture of "good", which differs between assessors and drifts within one, and the assessment becomes unreliable and unfair. With it, every candidate is judged against the same explicit criteria, which is what criterion-referencing means, the correct basis for a skills assessment: the candidate is measured against the standard the job requires, never against how the other candidates happened to do, so a whole course can pass if all reach the standard, or none if none do.

The standard is made usable by a checklist: a prepared list of the points a correct performance must show, which the assessor marks off as they watch. The checklist anchors the judgement, turning a vague impression into a record of specific things the candidate did or did not do, which makes the assessment more reliable (different assessors checking the same points judge more alike), fairer (every candidate checked against the same points), and more defensible (the verdict is supported by a record of evidence, not a feeling). A good checklist names the things that actually matter to a correct, safe performance, in observable terms, and no more; a checklist cluttered with trivia buries the important points among the unimportant.

Within the standard, some points are critical points: parts of the skill that must be done correctly for the whole to be safe or sound, and whose failure fails the assessment whatever else is right. A weapon-handling drill done with flair but unsafely has failed on the critical point of safety, however good the rest; a procedure that reaches the wrong result has failed whatever its style. The assessor marks critical points as such on the checklist, so that a failure there is recognised as decisive and not averaged away against the things the candidate did well. Identifying the critical points correctly (the safety points and the must-be-right steps) is part of building a sound standard.

   THE FOUNDATION OF A FAIR PRACTICAL ASSESSMENT

   THE STANDARD       a fixed, written statement of what a correct
                      performance looks like, set BEFORE assessing
                      ......... criterion-referenced: candidate vs the
                                standard, NOT vs other candidates

   THE CHECKLIST      the points a correct performance must show,
                      marked as you watch
                      ......... turns impression into recorded evidence:
                                more reliable, fairer, defensible

   CRITICAL POINTS    the must-be-right (esp. safety) points whose
                      failure fails the whole, whatever else is right
                      ......... not averaged away against the good bits
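
The rule that a critical-point failure is decisive, and is not averaged away, can be set out as a short sketch. This is an illustration only, not a College marking scheme: the checklist points, the names `Point` and `judge`, and the pass mark `min_other_met` for the non-critical points are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    text: str               # what the assessor must see done
    critical: bool = False  # True: failure here fails the whole assessment


def judge(checklist, marks, min_other_met):
    """Return (passed, reasons) from a marked checklist.

    marks maps each point's text to True (seen done correctly) or False.
    min_other_met is a hypothetical pass mark for the non-critical points.
    """
    failed_critical = [p.text for p in checklist if p.critical and not marks[p.text]]
    other_met = sum(1 for p in checklist if not p.critical and marks[p.text])
    if failed_critical:
        # Decisive: never weighed against the points done well.
        return False, [f"critical point failed: {t}" for t in failed_critical]
    if other_met < min_other_met:
        return False, [f"only {other_met} other points met, {min_other_met} required"]
    return True, []


# Illustrative checklist; the points are invented for the example.
CHECKLIST = [
    Point("Safety step carried out correctly", critical=True),
    Point("Steps done in the correct order"),
    Point("Equipment handled correctly"),
    Point("Drill completed"),
]

# A polished performance that misses the safety step, and an awkward one that misses nothing.
polished = {p.text: True for p in CHECKLIST}
polished["Safety step carried out correctly"] = False
awkward = {p.text: True for p in CHECKLIST}

print(judge(CHECKLIST, polished, min_other_met=2))
print(judge(CHECKLIST, awkward, min_other_met=2))
```

The polished candidate fails on the safety point alone, although every other point was met, and the recorded reason names the point that failed. The function takes no input for how confident or well turned out the candidate seemed: only what was marked on the checklist reaches the verdict.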

Setting up and running the assessment

A practical assessment is set up and run so that what is tested is the candidate's skill and nothing else, applying the fairness of Lesson 04 to observed performance. Several things make it fair.

Same task, same conditions, same standard for all. Every candidate performs the same skill, with the same equipment, in the same conditions, judged against the same standard and checklist. Differences in the task or conditions between candidates make the result reflect the luck of the draw rather than the skill, so they are removed as far as possible; where conditions must vary (the weather on an outdoor assessment), the assessor holds the standard fixed even as conditions change.

Brief the candidate clearly. The candidate is told plainly what they are to do, to what standard, and how it will be judged, so that no one fails through misunderstanding the task rather than lacking the skill. A practical assessment tests the doing of the skill, not the guessing of the assessor's intent, so the brief leaves no avoidable doubt about what is wanted.

Observe well, without interfering. The assessor positions themselves to see the things that matter, especially the critical points, and watches actively against the checklist, but does not coach, prompt, or rescue during the assessment, because the moment they help, they are no longer measuring the candidate's unaided skill. This is the hard discipline for an instructor turned assessor: the instinct to correct, right in teaching (TRG 301), is wrong in assessment, where the candidate must be left to perform and be judged on what they actually do. The assessor watches, marks the checklist, and keeps quiet.

Make a record as you go. The assessor marks the checklist during or immediately after the performance, while it is fresh, rather than reconstructing it later from memory, because memory blurs and favours a general impression over the specific evidence. The marked checklist, with notes on any critical-point failure, is the record that makes the verdict defensible and feeds the feedback and recording of Lesson 05.

Judging fairly: the threats to reliability

The verdict on a practical assessment is reached against the standard, on the evidence of the checklist, and the assessor's main task is to keep that judgement clean of the biases that watching invites. A few threats are common enough to name and guard against deliberately.

The halo effect is letting a general impression of the candidate colour the judgement of the specific performance: a confident, well-turned-out candidate is unconsciously marked up, a nervous or scruffy one marked down, regardless of what their hands actually did. The guard is the checklist: judge the points, not the person, and mark what was done, not how impressive the candidate seemed.

The sequence effect is letting the candidate before colour the candidate now, judging an average performance harshly because it followed a brilliant one, or kindly because it followed a poor one. The guard is again the fixed standard: each candidate against the standard, never against the last candidate.

Drift and fatigue make the same assessor judge the tenth candidate differently from the first, harder or softer as the day wears on; the guard is the standard and checklist held steady, and, on a long assessment, deliberate self-checking against the criteria.

Underlying all of these is the single discipline that makes practical assessment fair: judge the performance against the standard, on the evidence, not by impression. The assessor who marks the checklist honestly, holds the fixed standard against every candidate alike, recognises a critical-point failure as decisive, and reaches a verdict they could defend from the recorded evidence has solved the reliability problem that watching creates. Making different assessors do this alike, and the same assessor do it consistently, is the work of standardisation, taken up fully in the next lesson.

In Practice: Assessing a Safety-Critical Drill

An assessor of the Royal Army College must assess a section, one candidate at a time, on a practical drill where one of the points is safety, and getting it wrong is dangerous. A weak assessor would watch each performance and form a general impression, pass the candidates who looked confident, and judge the later ones against the earlier. The College's assessor does it against a standard.

Before the first candidate, she has a clear written standard and a checklist of the points a correct performance must show, with the safety step marked as a critical point whose failure fails the assessment whatever else is right. She sets the same task and conditions for every candidate, briefs each one plainly on what to do and how it will be judged, and positions herself to see the critical points. As each performs, she watches against the checklist and marks it as she goes, and she does not prompt or rescue, however much her instructor's instinct itches to correct, because she is measuring unaided skill. One candidate performs smoothly and confidently but fumbles the safety step; she records it as a critical-point failure and does not let the polish of the rest average it away, because an unsafe drill is a failure however good it looks. Another is nervous and awkward but does every point, including safety, correctly; she passes them on the evidence of the checklist, not on the poor impression, guarding against the halo effect.

She holds the fixed standard against the tenth candidate as firmly as the first, checking herself against the criteria as the day wears on so that fatigue does not shift her judgement, and judges each against the standard, never against the candidate before. At the end, every verdict is supported by a marked checklist and is defensible: she could show exactly why each candidate passed or failed, on the evidence of what they did against the standard. The skill was assessed honestly: the dangerous performance failed and the safe-but-awkward one passed, which is exactly right, and the qualification means what it says.

Check Your Understanding

Explain why a practical skill must be assessed by performance and not by a written or oral test, in terms of validity. Why does assessing by observation create a reliability problem that an objective written test does not?
Describe the role of the standard and the checklist in a fair practical assessment, what criterion-referencing means and why it is the right basis, and what a critical point is and why its failure is not averaged away against the things done well.
Set out how a practical assessment is run fairly (same task and conditions, clear brief, observing without interfering, recording as you go), and name the threats to reliable judgement (halo, sequence, drift and fatigue) and the single discipline that guards against all of them.

Reflection (write a short paragraph): Recall a time you were assessed on something you did: a test, a trade check, a driving test, a sports trial. Did the assessor seem to judge you against a clear standard or against a private impression, and did you sense any of the biases in this lesson, such as the confident candidate marked up or the judgement coloured by who went before? Now picture assessing a safety-critical skill yourself. Which would you find harder: resisting the instinct to prompt and rescue a struggling candidate, or holding the exact same standard against the last candidate of a long day as against the first? What would you do to manage it?

Summary

A practical skill is assessed by observed performance, because validity requires measuring the actual ability, and a written or oral test measures only knowledge about the skill. The candidate does the real skill, in realistic conditions, to the standard, while the assessor watches.
Watching creates a reliability problem absent from objective tests, solved by anchoring every judgement to a clear fixed standard (criterion-referenced: candidate against the standard, never against other candidates) and a checklist of the points a correct performance must show, which turns impression into recorded, defensible evidence.
Some points are critical points (especially safety), whose failure fails the whole assessment whatever else is right, and which are never averaged away against the things done well.
Run it fairly: same task, conditions, and standard for all, a clear brief, observe well without coaching or rescuing (the instructor's correcting instinct is wrong in assessment), and record on the checklist as you go while it is fresh.
Guard the judgement against the halo effect (impression of the person), the sequence effect (the candidate before), and drift and fatigue, all by the one discipline: judge the performance against the fixed standard, on the evidence, not by impression, reaching a defensible verdict.
This is the knowledge layer; reading a performance accurately and judging a borderline pass are mastered by assessing real candidates under a qualified assessor and signed off in person. This lesson pairs with the knowledge testing of Lesson 06, rests on the principles of Lesson 02 and the fairness of Lesson 04, and leads into the marking, standards, and moderation of Lesson 08 that make assessors judge alike.

Assessing Practical Skills