How OET Speaking is scored: the 9 criteria

Behind the score report: how the 9 OET Speaking criteria are graded, what each one rewards, and where candidates most often lose marks.

5 min read · By OET Live

When you finish an OET Speaking sub-test, two examiners independently score your role-play on nine separate criteria. Each criterion is scored 0–5, the scores are averaged, and the aggregate determines your 0–500 numeric score and your band letter.
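To make the averaging step concrete, here is a minimal sketch that takes two examiners' per-criterion scores and computes the mean for each criterion and overall. The function name, the dictionary layout, and the sample scores are all hypothetical. The conversion from the aggregate to the 0–500 score is left out on purpose, because this post does not describe it.

```python
from statistics import mean

# Criterion names as listed in this post; storing them as dict keys is an assumption.
CRITERIA = [
    "Intelligibility",
    "Fluency",
    "Appropriateness of language",
    "Resources of grammar and expression",
    "Relationship building",
    "Understanding the patient perspective",
    "Providing structure",
    "Information gathering",
    "Information giving",
]


def average_examiners(examiner_a, examiner_b):
    """Return the mean of the two examiners' scores for each criterion."""
    return {name: mean([examiner_a[name], examiner_b[name]]) for name in CRITERIA}


# Hypothetical scores, for illustration only.
examiner_a = dict(zip(CRITERIA, [4, 4, 3, 4, 3, 2, 3, 4, 3]))
examiner_b = dict(zip(CRITERIA, [4, 3, 3, 4, 3, 3, 2, 4, 3]))

per_criterion = average_examiners(examiner_a, examiner_b)
overall = mean(list(per_criterion.values()))
```

Keeping the per-criterion averages separate from the overall figure matters for practice: the single aggregate hides which criterion pulled it down.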

That much is public. What's less obvious is what each criterion is actually rewarding — and where most candidates lose marks. This post is a walk-through of all 9 criteria with examples of what high-scoring and low-scoring versions sound like.

If you want the format basics first — timing, role-cards, who the interlocutor is — start with our OET Speaking sub-test format guide.

The four linguistic criteria

These are the ones IELTS-trained candidates expect. Slips here are forgiven more readily under exam stress than slips on the clinical criteria, but these criteria are also the slowest to improve, so under-investing here is a long-term mistake.

1. Intelligibility

What it rewards: being understandable at the sound and stress level. Not accent reduction — clarity.

  • Low-scoring: sound substitutions that change meaning ("pleased" → "please"), stress on the wrong syllable that makes a word unrecognisable ("conTRAct" instead of "CONtract"), running words together so badly that the listener has to pause and re-parse.
  • High-scoring: any accent, as long as a non-native English listener could follow you without effort.

You don't need to sound American or British. You need to sound clear.

2. Fluency

What it rewards: real conversational tempo, with hesitations roughly where a native speaker would put them.

  • Low-scoring: long pauses while you assemble a sentence, fillers ("uh", "um", "you know") above 1–2 per minute, false starts.
  • High-scoring: thinking pauses at clause boundaries, not mid-clause; a pace that matches the conversation.
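If you record and transcribe your own practice role-plays, you can check your filler rate against the 1–2 per minute figure above. The sketch below is one rough way to do that, assuming you already have a transcript string and the recording length in seconds. The function name and the sample sentence are hypothetical, and it counts every "you know", including the occasional legitimate one, so treat the result as an estimate.

```python
import re

# Fillers named in this post; extend the tuple to cover your own habits.
FILLERS = ("uh", "um", "you know")


def fillers_per_minute(transcript, duration_seconds):
    """Count filler occurrences in a transcript and normalise to a per-minute rate."""
    text = transcript.lower()
    count = 0
    for filler in FILLERS:
        # Word boundaries stop "um" from matching inside words such as "summary".
        pattern = r"\b" + re.escape(filler) + r"\b"
        count += len(re.findall(pattern, text))
    return count / (duration_seconds / 60)


# Hypothetical two-minute excerpt, for illustration only.
sample = "Um, so the pain is, uh, mostly in the morning, you know, and um it eases later."
rate = fillers_per_minute(sample, duration_seconds=120)
```

A count like this says nothing about where your pauses fall, so it complements listening back to the recording rather than replacing it.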

3. Appropriateness of language

What it rewards: language register that fits a clinical conversation — neither too casual nor too clinical for the patient in front of you.

  • Low-scoring: jargon dropped into a patient explanation without unpacking, contractions in moments that demand formality, hedging language that's too weak ("might could").
  • High-scoring: medical accuracy without inscrutability, formal register with warmth.

4. Resources of grammar and expression

What it rewards: range and accuracy of grammatical structures, including the harder ones (conditionals, passives in clinical context, modal verbs to soften).

  • Low-scoring: every sentence in the simple present tense, occasional tense errors that distort meaning, missing articles that hurt clarity.
  • High-scoring: comfortable use of conditionals ("If your symptoms worsen, we would want to..."), passives where appropriate, modals to soften ("It might be a good idea to...").

The five clinical communication criteria

These are the ones candidates most often under-prepare for. They are also where you can move your band fastest with focused practice.

5. Relationship building

What it rewards: greeting, rapport, empathy markers, closing. Treating the patient as a person, not a case.

  • Low-scoring: launches straight into history-taking without acknowledging the patient, no empathy markers when the patient discloses something difficult, closes the consultation transactionally.
  • High-scoring: greets and orients warmly, acknowledges concerns ("That sounds really difficult", "I can see why you're worried"), closes with care ("Before you go — anything else on your mind?").

6. Understanding the patient perspective

What it rewards: actively eliciting concerns, ideas, expectations, and fears. Listening for what they actually came in worried about.

This is where most candidates lose the most marks. The "ICE" framework (Ideas, Concerns, Expectations) is your friend.

  • Low-scoring: tells the patient what to do without asking what they're worried about, ignores cues the patient drops ("I'm not really sure I want to take medicine for this"), doesn't elicit fears.
  • High-scoring: asks open questions early ("What's been on your mind about this?"), follows up on dropped cues, names the unspoken concern ("It sounds like you're worried this might be something more serious — is that right?").

7. Providing structure

What it rewards: signposting, transitions, agenda-setting. Letting the patient know what's about to happen.

This feels redundant to non-native speakers. It is not redundant to examiners.

  • Low-scoring: no announced agenda, abrupt transitions between history-taking and explanation, no signposting before introducing a new topic.
  • High-scoring: opens with a brief agenda ("I'd like to ask you a few questions about the pain, then talk about what we might do next"), signposts transitions ("Now I'd like to ask about..."), summarises before moving on.

8. Information gathering

What it rewards: using open and closed questions appropriately, screening for red flags, summarising back to check understanding.

  • Low-scoring: leans entirely on closed yes/no questions, misses red flags, doesn't summarise.
  • High-scoring: opens with open questions, narrows with closed questions, summarises ("Let me make sure I have this right..."), screens for the red flags you'd check in a real consultation.

9. Information giving

What it rewards: chunking information, checking understanding, controlling jargon.

  • Low-scoring: information firehose, no comprehension checks, jargon used without unpacking.
  • High-scoring: small chunks, comprehension checks between chunks ("Does that make sense so far?"), jargon translated immediately ("hypertension — that just means high blood pressure").

Where the leverage is

Across our calibration sets, the criteria on which candidates most often under-perform are, in rough order:

  1. Understanding the patient perspective — under-elicited concerns
  2. Providing structure — missing signposting
  3. Information giving — too much jargon, too few checks

If you have limited prep time, drilling phrases for those three criteria delivers the highest band lift per hour. The linguistic criteria are slower to move, and intelligibility and grammar errors are forgiven more readily under exam stress than lapses on the clinical criteria.
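You can apply the same prioritisation to your own practice. Assuming you have a per-criterion breakdown for each role-play you have done, the sketch below averages each criterion across role-plays and returns the lowest-scoring ones. The function name, the data layout, and the scores are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def weakest_criteria(role_plays, top_n=3):
    """Average each criterion across role-plays and return the lowest-scoring ones."""
    scores = defaultdict(list)
    for breakdown in role_plays:
        for criterion, score in breakdown.items():
            scores[criterion].append(score)
    averages = {criterion: mean(values) for criterion, values in scores.items()}
    # Lowest average first: these are the criteria to drill.
    return sorted(averages, key=averages.get)[:top_n]


# Hypothetical breakdowns from two practice role-plays (a subset of criteria, for brevity).
role_plays = [
    {
        "Fluency": 4,
        "Providing structure": 2,
        "Information giving": 3,
        "Understanding the patient perspective": 2,
    },
    {
        "Fluency": 4,
        "Providing structure": 3,
        "Information giving": 3,
        "Understanding the patient perspective": 2,
    },
]
priorities = weakest_criteria(role_plays)
```

Two role-plays are too few to trust the ranking. The more breakdowns you feed in, the less a single bad day skews it.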

How OET Live scores your role-plays

We use the public rubric. Each of your role-plays is independently scored on the same 9 criteria, 0–5, by a model calibrated against examiner-scored sets. We then aggregate to a 0–500 score and band letter, and we show you the per-criterion breakdown, which the official OET score report does not.

We also pull the specific quote from your transcript that earned (or lost) marks on each criterion. So when we say "your understanding the patient perspective score was 3/5", we show you the place where you should have followed up on a cue and didn't.

If that's the feedback you wish your OET tutor had time to give, join the waitlist.
