Royal Army College

Lesson Overview

In the last lesson you learned why we assess and what we are looking for: confirmation that a student has reached the required standard, so that a qualification can be trusted and recognition is fairly given. This lesson asks the harder question. What makes an assessment a good one? Two assessors can watch the same student perform the same drill and reach opposite verdicts. One waves through a national who is not ready; the other refers a national who is. Both believe they are being careful. The difference between them is not effort or goodwill. It is whether their assessment rests on sound principles.

Good assessment is built on four principles, remembered as VRFT: it must be valid, reliable, fair, and transparent. Underneath all four sits a single idea that separates Army assessment from a school prize day: we judge a student against a fixed standard, not against the people standing next to them. This is called criterion-referenced assessment, and once you understand it the four principles fall into place. An assessment that honours these principles protects the able and the unready alike, and protects the value of every badge the College awards.

This lesson is the knowledge layer. Reading it will not by itself make you a sound assessor; sound assessing is mastered by assessing under supervision, having your judgements checked against an experienced assessor's, and learning where your own eye runs too soft or too hard. The practical assessing in TRG 310 is signed off in person, by a qualified instructor, where supervision allows.

By the end of this lesson you will be able to define and apply each of the four principles of good assessment, explain what each one demands of you in practice, recognise common ways an assessment fails each test, and explain why Army assessment is criterion-referenced rather than ranked against peers.

Key Terms

Principles of good assessment (VRFT): the four qualities every sound assessment must have, namely valid, reliable, fair, and transparent.
Valid: an assessment is valid when it actually measures the outcome it claims to measure, by testing the real skill or knowledge the course teaches, in conditions close enough to the real task.
Reliable: an assessment is reliable when it gives the same result regardless of who assesses or when, because it rests on clear criteria and consistent conditions rather than the assessor's mood or memory.
Fair: an assessment is fair when it is free of bias and favouritism, holds the same standard for everyone, and allows reasonable adjustments for genuine need without ever lowering that standard.
Transparent: an assessment is transparent when the student knows in advance what is required of them and how they will be assessed, with no hidden traps or surprise conditions.
Criterion: a single fixed statement of what the student must do or know to reach the standard, written so that it can be judged yes or no.
Criterion-referenced assessment: judging a student against fixed criteria, asking can they do the thing to the standard, rather than ranking them against other students.
Norm-referenced assessment: ranking students against one another, for example passing the top half of a group; the wrong model for Army competence assessment.
Reasonable adjustment: a change to how an assessment is conducted, made for a genuine need, that lets a candidate show their ability without changing the standard they must meet.

Why principles, not just a checklist

You might reasonably ask why we begin with principles at all. Surely a good marking scheme is enough? A marking scheme tells you what to look for in one task. The principles tell you whether the task, the conditions, and your own judgement are sound in the first place. A flawless checklist applied to the wrong test, or applied differently to different nationals, still produces a worthless result.

Think of the four principles as four tests you apply to your own assessment before you trust its verdict. Valid asks: am I measuring the right thing? Reliable asks: would another assessor, or I myself on another day, reach the same result? Fair asks: have I held everyone to the same standard, free of favour? Transparent asks: did the student know in advance what they faced? An assessment that passes all four can be defended to the candidate, to the course supervisor, and to anyone who later relies on the qualification. An assessment that fails any one of them is unsafe, however much care went into it.

        THE FOUR PRINCIPLES AND THEIR TESTS

  +-------------+------------------------------------------+
  | PRINCIPLE   | THE TEST YOU APPLY                       |
  +-------------+------------------------------------------+
  | VALID       | "Am I measuring the right thing,         |
  |             |  in realistic conditions?"               |
  +-------------+------------------------------------------+
  | RELIABLE    | "Would another assessor, or I on a       |
  |             |  different day, reach the same result?"  |
  +-------------+------------------------------------------+
  | FAIR        | "Have I held everyone to the same        |
  |             |  standard, free of bias or favour?"      |
  +-------------+------------------------------------------+
  | TRANSPARENT | "Did the student know in advance what    |
  |             |  was required and how they were judged?" |
  +-------------+------------------------------------------+

Valid: measure the outcome you claim to measure

An assessment is valid when it measures the outcome it claims to measure, and not something else. The learning outcome states what the student should be able to do. The valid assessment tests exactly that, in conditions close enough to the real task that a pass means something.

The most common failure of validity is testing the wrong thing because it is easier to test. The outcome says the national can apply a field dressing under pressure. A written multiple-choice quiz on the steps of applying a dressing is easy to set and easy to mark, but it does not measure the outcome. It measures whether the student can recognise the steps on paper. A national can score full marks on that quiz and still fumble the dressing with cold hands on a casualty lying on uneven ground. The quiz is invalid for that outcome. To assess the skill validly you must watch the national do it.

The second half of validity is realistic conditions. A skill assessed in a warm, quiet, unhurried room, with the assessor prompting the next step, does not prove the national can perform when tired, rushed, and unprompted, which is the only condition that matters for a home-defence force. Validity does not demand danger or cruelty; the instructor's duty of care still runs throughout. It demands conditions realistic enough that the assessed performance predicts the real one. Match the method to the outcome: a skill is assessed by doing it, knowledge by explaining or answering, understanding by questioning. When you find yourself choosing a method because it is convenient to mark rather than because it tests the real outcome, stop. That is the moment validity slips.

Reliable: the same result whoever assesses, and whenever

An assessment is reliable when it gives the same result regardless of who runs it and when. If the same performance would earn a national a pass under Corporal Adeyemi on Tuesday but a referral under Sergeant Voss on Thursday, the assessment is unreliable and its verdict is noise.

Reliability comes from two things: clear criteria and consistent conditions. Clear criteria mean every assessor is looking for the same observable points, written plainly enough to be judged yes or no, rather than each assessor carrying a private picture of "good enough" in their head. "Demonstrates good map sense" is not a reliable criterion; two assessors will read it differently. "Sets the map to north and identifies the section's location within one hundred metres" is reliable, because both assessors will agree on whether it happened. The marking scheme exists to make your eye and another assessor's eye see the same thing.

Consistent conditions mean every candidate faces the same task, the same time allowed, the same resources, and the same level of help. If one national gets three attempts and the next gets one, or one is coached through the hard step and the next is left to fail it, the results cannot be compared. Reliability is not coldness. It is the discipline of holding the test steady so that the only thing that varies is the candidate's performance. Where an assessment is high stakes, the College may have a second assessor watch and agree the judgement, precisely to keep it reliable.
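
The second-assessor check described above can be pictured as a simple comparison. The sketch below is illustrative only, not a College procedure: it assumes each assessor records met or not met against the same written criteria, and the function names and sample judgements are hypothetical. It shows how observable criteria let two assessors find exactly where they differ.

```python
# Illustrative sketch only: function names and sample judgements are hypothetical.

def agreement(first: dict[str, bool], second: dict[str, bool]) -> float:
    """Return the share of criteria on which two assessors gave the same judgement."""
    if not first or first.keys() != second.keys():
        raise ValueError("Both assessors must judge the same, non-empty set of criteria")
    same = sum(first[criterion] == second[criterion] for criterion in first)
    return same / len(first)


def disagreements(first: dict[str, bool], second: dict[str, bool]) -> list[str]:
    """List the criteria the two assessors judged differently."""
    return [criterion for criterion in first if first[criterion] != second[criterion]]


# Two assessors watch the same performance and record met (True) or not met (False).
adeyemi = {
    "sets the map to north": True,
    "identifies the section's location within one hundred metres": True,
}
voss = {
    "sets the map to north": True,
    "identifies the section's location within one hundred metres": False,
}

print(agreement(adeyemi, voss))       # 0.5
print(disagreements(adeyemi, voss))   # the one criterion they must discuss
```

A vague criterion such as "demonstrates good map sense" could not be compared this way, because there would be no shared observable point to agree or disagree on.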

Fair: the same standard for all, free of favour

An assessment is fair when it is free of bias and favouritism and holds the same standard for everyone. Fairness is closely tied to reliability but it is not the same. Reliability is about consistency of measurement. Fairness is about freedom from bias, conscious or not.

The obvious failure is favouritism: going easier on a friend, a keen national, or someone from your own section, and harder on someone you have clashed with. Less obvious, and more dangerous because it feels like good judgement, is unconscious bias. Marking down the quiet national who you have decided lacks confidence, or waving through the loud one whose manner you read as competence, are both failures of fairness. The defence against both is the same as for reliability: judge against the written criteria, point by point, not against your feeling about the person. A criterion is either met or it is not, regardless of who is performing it.

Fairness also requires reasonable adjustments for genuine need. A reasonable adjustment changes how a candidate is assessed so they can show their true ability, without changing what standard they must reach. A national with a reading difficulty might have a written question read aloud to them; the question and the required answer do not change. A left-handed national is shown the drill from an angle they can follow. What an adjustment must never do is lower the standard. You may change the path to the criterion; you may not move the criterion. The home-defence force this College serves is small and young, and its members differ widely in background and starting level. Fairness is what lets that range of people meet one honest standard.

Transparent: no hidden traps

An assessment is transparent when the student knows, in advance, what is required of them and how they will be assessed. There are no surprise conditions, no hidden criteria, no traps sprung to catch the candidate out. Before the assessment the candidate should be able to tell you what they have to do, to what standard, in what conditions, with what time and resources, and what counts as a pass.

Transparency is sometimes mistaken for going soft, as though hiding the criteria makes the test harder and therefore better. It does the opposite. A hidden test does not measure competence; it measures whether the candidate guessed what you wanted. That is invalid as well as unfair. Telling a national exactly what they must achieve does not lower the bar. It removes everything except their actual ability from the result, which is the whole point. The strongest argument for transparency is simple: if a candidate fails, you must be able to show them the criterion they did not meet and the standard they were told about in advance. An assessment you cannot explain to the person who failed it is not one you should be running.

Transparency begins long before assessment day. It begins when the outcomes are stated at the start of the lesson, so the national trains towards the same standard they will later be judged on. The assessment should hold no content the student was never taught and never told to expect.

The foundation: criterion-referenced, not ranked against peers

Underneath all four principles sits one decision about how we judge. Army competence assessment is criterion-referenced. We measure each student against a fixed criterion, asking can this national do the thing to the standard, yes or no. We do not rank students against one another and pass the better half. That second approach, ranking against the group, is called norm-referenced assessment, and it is the wrong model for a force whose lives may depend on competence.

The reason is plain when you picture it. Suppose you assess first aid by passing the top half of each course. On a strong course, a competent national is referred merely for being slightly below their excellent peers. On a weak course, an unready national passes merely for being the least bad in a poor group. In both cases the badge means nothing, because it records a position in a group rather than a level of ability. Criterion-referenced assessment fixes the standard outside the group: the criterion is the same whether the whole course meets it or none of them do. If everyone reaches the standard, everyone passes, and that is a success, not a marking error. If no one does, no one passes, and the failure belongs to the training, not the candidates.

   NORM-REFERENCED                 CRITERION-REFERENCED
   (ranked against peers)          (judged against a fixed standard)

   Students:  A B C D E F          Students:  A B C D E F
              | | | | | |                     | | | | | |
   sorted by score, then           each measured against the
   pass the top portion            SAME fixed criterion line
                                    -------------------------- STANDARD
   pass --> A B C                   pass --> A   C   E F
   refer -> D E F                   refer ->   B   D
   (the line MOVES with             (the line is FIXED; the group
    the strength of the group)       does not move it)

   WRONG for competence            RIGHT for competence
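
The contrast in the diagram can also be worked through with numbers. The sketch below is a simplified illustration, not how the College marks: it assumes each student has a single numeric score and that the standard is a score of 70, whereas real criteria are judged met or not met. The course data, the 70 threshold, and the function names are all hypothetical.

```python
# Illustrative sketch only: scores, the 70 standard, and names are hypothetical.

def norm_referenced_pass(scores: dict[str, int], portion: float = 0.5) -> set[str]:
    """Pass the top portion of the group, whatever their actual ability."""
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    cutoff = int(len(ranked) * portion)
    return set(ranked[:cutoff])


def criterion_referenced_pass(scores: dict[str, int], standard: int) -> set[str]:
    """Pass everyone who reaches the fixed standard, however many that is."""
    return {name for name, score in scores.items() if score >= standard}


STANDARD = 70  # assumed fixed standard, set outside the group

strong_course = {"A": 95, "B": 90, "C": 85, "D": 80, "E": 75, "F": 72}
weak_course = {"A": 60, "B": 55, "C": 50, "D": 45, "E": 40, "F": 35}

# Ranking passes three students on both courses, ready or not.
print(sorted(norm_referenced_pass(strong_course)))   # ['A', 'B', 'C']
print(sorted(norm_referenced_pass(weak_course)))     # ['A', 'B', 'C']

# The fixed standard passes all six on the strong course and none on the weak one.
print(sorted(criterion_referenced_pass(strong_course, STANDARD)))
print(sorted(criterion_referenced_pass(weak_course, STANDARD)))
```

Under ranking, students D, E, and F on the strong course are referred despite exceeding the standard, and students A, B, and C on the weak course pass despite falling short of it. Under the fixed standard, the verdict depends only on each student's own performance.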

Everything in this lesson follows from that single choice. Because we judge against fixed criteria, those criteria must measure the real outcome (valid), must be clear enough to judge the same way every time (reliable), must be applied to everyone alike (fair), and must be known to the student in advance (transparent). Hold to criterion-referencing and the four principles become the natural way to do the job. Abandon it, and no amount of careful marking will save the result.

In Practice: Corporal Adeyemi assesses the field dressing

Corporal Adeyemi has to assess eight nationals on applying a field dressing to a limb wound, the practical at the end of a basic first-aid module. The outcome states that the national can apply an effective dressing, unprompted, within ninety seconds.

She starts with validity. The outcome is a skill, so she will not set a written quiz; she will watch each national do it on a training partner acting as a casualty. To keep it realistic she has them work kneeling on the ground, after a short burst of physical activity so their hands are not perfectly steady, with the casualty acting distressed. She does not make it dangerous; her duty of care holds. She makes it real enough that a pass predicts the real task.

For reliability she uses the course marking scheme, six observable points, each judged met or not met: exposes the wound, applies enough pressure, positions the pad over the wound, secures it firmly without cutting off circulation, completes within ninety seconds, and reassures the casualty throughout. Every candidate gets the same casualty brief, the same time, the same single attempt, and no coaching during the test. Because the points are observable, another assessor watching beside her would tick the same boxes.

Fairness shapes how she handles two cases. One national is her own training partner from earlier in the course; she is careful to judge him against the six points exactly as she does the others, neither softer nor, overcorrecting, harder. Another national has limited movement in one wrist, a genuine, recorded need; she allows him to brace against his knee, a reasonable adjustment to how he works, while still requiring the dressing to be effective and within time. The standard does not move; only his path to it does.

She handled transparency a week earlier. At the start of the module she told the course the exact outcome, the ninety-second standard, the six points, and the conditions. Nothing on assessment day is a surprise. When one national is referred for a dressing that slips loose, she can point to the secured-firmly criterion he did not meet and to the standard he was told about in advance. He is disappointed but he cannot call it unfair, because it was valid, reliable, fair, and transparent, and judged against a criterion that did not move for anyone.
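
Corporal Adeyemi's six-point scheme can be sketched as a short checklist. This is an illustration under stated assumptions, not the course marking scheme itself: it assumes a candidate passes only when all six points are met (the example suggests this, since one unmet point led to a referral), and the function name and the recorded observations are hypothetical.

```python
# Illustrative sketch only: the all-six-must-be-met rule is an assumption,
# and judge() and the sample observations are hypothetical.

CRITERIA = (
    "exposes the wound",
    "applies enough pressure",
    "positions the pad over the wound",
    "secures it firmly without cutting off circulation",
    "completes within ninety seconds",
    "reassures the casualty throughout",
)


def judge(observed: dict[str, bool]) -> tuple[str, list[str]]:
    """Return the verdict and the list of criteria not met."""
    missing = [criterion for criterion in CRITERIA if criterion not in observed]
    if missing:
        raise ValueError(f"No judgement recorded for: {missing}")
    unmet = [criterion for criterion in CRITERIA if not observed[criterion]]
    verdict = "pass" if not unmet else "refer"
    return verdict, unmet


# The national whose dressing slipped loose: five points met, one not.
slipped_dressing = {criterion: True for criterion in CRITERIA}
slipped_dressing["secures it firmly without cutting off circulation"] = False

verdict, unmet = judge(slipped_dressing)
print(verdict)   # refer
print(unmet)     # the criterion the assessor can point to
```

The function takes no account of who the candidate is or how they reached each point. A reasonable adjustment, such as bracing against the knee, changes how the candidate works; it never changes the list of criteria being judged.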

Check Your Understanding

An instructor assesses whether a national can read a map and navigate to a point by giving a written test on the names of map symbols. Which principle of assessment does this most clearly fail, and why?
Explain the difference between criterion-referenced and norm-referenced assessment, and give one reason why a home-defence force must use the criterion-referenced approach for competence.
A candidate with a genuine reading difficulty has the written questions read aloud to them. Explain why this is a reasonable adjustment rather than a lowering of the standard, referring to the difference between how a candidate is assessed and what they must achieve.

Reflection (write a short paragraph): Think about a time you were assessed, in or out of the Army, where you did not know in advance exactly what was expected of you, or where you felt the same standard was not applied to everyone. Which of the four principles was missing, how did it affect your trust in the result, and what would you do differently as the assessor to put it right?

Summary

Good assessment rests on four principles, VRFT: valid (measures the real outcome, in realistic conditions), reliable (same result whoever assesses and whenever, through clear criteria and consistent conditions), fair (free of bias and favouritism, same standard for all, with reasonable adjustments that never lower the standard), and transparent (the student knows in advance what is required and how they will be judged).
Each principle is a test you apply to your own assessment before trusting its verdict. Failing any one makes the result unsafe, however much care went into it.
Army competence assessment is criterion-referenced: each national is judged against a fixed standard, not ranked against peers. Norm-referencing is the wrong model because it records a position in a group rather than a level of ability.
A reasonable adjustment changes how a candidate is assessed, not the standard they must reach.
This lesson is the knowledge layer; sound assessing is mastered by assessing under supervision and is signed off in person where supervision allows.
Builds on Lesson 01 · Why and What We Assess and leads into Lesson 03 · Methods of Assessment (choosing the method that makes an assessment valid) and Lesson 04 · Conducting an Assessment Fairly. Connects to TRG 301 · Methods of Instruction (stating outcomes so training and assessment match), TRG 320 · Practical Training Safety Officer (safe, realistic practical conditions), ADM 220 · Course Records and Qualification Tracking (recording trusted outcomes), and LDR 420 · Command Responsibility and Ethical Leadership (the integrity behind fair judgement).

The Principles of Good Assessment