Why must UK subtitle timecodes sit on a 40 ms boundary?

UK broadcast runs on a 25 fps PAL grid, so the smallest legal time step is 1000/25 = 40 ms. Timestamps off that boundary are truncated or interpolated unpredictably by hardware decoders, which can drop a frame from a cue and breach the Ofcom 1.0 s minimum dwell.

What are the Ofcom minimum and maximum on-screen dwell times?

A subtitle block must stay on screen for at least 1.0 second (25 frames) and no longer than 7.0 seconds (175 frames). Cues under the floor are stretched; cues over the ceiling are split.

What is the minimum gap between consecutive subtitle cues?

At least 250 ms, which is 6.25 frames at 25 fps. Because frames are integers, round up to 7 frames (280 ms) so the gap stays at or above the floor while landing on the grid and giving the decoder time to clear its render buffer.

How do I stop floating-point drift in subtitle timing?

Operate exclusively on integer frame counts. Snap each millisecond value to the nearest frame on input and reconstruct milliseconds by multiplying frames by 40 on output, so off-grid values are unrepresentable.

Ofcom subtitle timing requirements explained

UK broadcast delivery quantises every subtitle cue to a 25 fps grid, which fixes a 40 ms frame boundary as the smallest legal time step. The exact failure this page solves is the QC rejection that fires when a theoretically valid SRT or WebVTT file carries timestamps that do not sit on that 40 ms boundary — the residue of IEEE 754 double-precision export from an ASR engine or a datetime.timedelta round-trip. Once cues are off-grid, the Ofcom Code on Subtitling Standards thresholds (minimum on-screen dwell of 1.0 s / 25 frames, maximum dwell of 7.0 s / 175 frames, and a minimum inter-cue gap of 250 ms) can no longer be enforced deterministically: a block that measures 1.000 s in floating point can render as 24 frames after the decoder truncates, breaching the floor by one frame. The fix is to abandon float seconds entirely and operate on integer frame counts.

from dataclasses import dataclass
from typing import Iterator, List, Tuple

# UK 25 fps PAL delivery grid — EBU N19 / Ofcom Code on Television Access Services
FPS            = 25
FRAME_MS       = 40     # 1000 ms / 25 fps — the only legal time step
MIN_DUR_FRAMES = 25     # Ofcom: minimum 1.0 s on-screen dwell  (25 frames)
MAX_DUR_FRAMES = 175    # Ofcom: maximum 7.0 s on-screen dwell  (175 frames)
MIN_GAP_FRAMES = 7      # Ofcom: >=250 ms inter-cue gap; 6.25 -> round up to 7 (280 ms)

@dataclass(frozen=True)
class SubtitleCue:
    index: int
    start_ms: int
    end_ms: int
    text: str

def to_frames(ms: int) -> int:
    """Snap a millisecond value to the nearest 40 ms grid, return integer frames."""
    return round(ms / FRAME_MS)            # nearest-frame rounding kills float drift

def to_ms(frames: int) -> int:
    return frames * FRAME_MS               # exact: every value lands on the grid

def quantize_and_validate(
    cues: List[SubtitleCue],
) -> Iterator[Tuple[SubtitleCue, str]]:
    """Yield (grid-aligned cue, flag). All maths is integer-frame, never float seconds."""
    prev_end = 0                           # running end position, in frames
    for cue in cues:
        start = to_frames(cue.start_ms)
        end   = to_frames(cue.end_ms)
        flags: List[str] = []

        if (end - start) < MIN_DUR_FRAMES:           # Ofcom 1.0 s floor
            end = start + MIN_DUR_FRAMES
            flags.append("MIN_DWELL_EXTENDED")
        if (end - start) > MAX_DUR_FRAMES:           # Ofcom 7.0 s ceiling -> force split
            end = start + MAX_DUR_FRAMES
            flags.append("MAX_DWELL_CLAMPED")
        if prev_end and (start - prev_end) < MIN_GAP_FRAMES:   # Ofcom inter-cue gap
            start = prev_end + MIN_GAP_FRAMES
            if (end - start) < MIN_DUR_FRAMES:       # re-assert dwell after the shove
                end = start + MIN_DUR_FRAMES
            flags.append("GAP_ENFORCED")
        if abs(to_ms(start) - cue.start_ms) >= FRAME_MS:       # was off-grid on input
            flags.append("OFFGRID_CORRECTED")

        prev_end = end
        yield SubtitleCue(cue.index, to_ms(start), to_ms(end), cue.text), \
            "|".join(flags) or "COMPLIANT"

Code walkthrough

to_frames is the entire defence against drift. Dividing the input millisecond value by FRAME_MS (40) and applying round collapses any sub-frame residue — a 1003 ms ASR timestamp and a 997 ms one both resolve to frame 25 — so two pipelines that disagree at the microsecond level still emit byte-identical grid output. Reconstruction via to_ms multiplies back by 40, which guarantees every exported timestamp is an exact multiple of the frame interval; nothing downstream can be off-grid because off-grid values are unrepresentable in the integer domain.

The three threshold blocks enforce the Ofcom Code in dependency order. Minimum dwell is checked first because both later corrections can shorten a cue: a cue under 25 frames is stretched at its tail to exactly 1.0 s. Maximum dwell clamps any cue over 175 frames to the 7.0 s ceiling and raises MAX_DWELL_CLAMPED so a downstream reflow step knows the block is a split candidate rather than silently truncated. The gap check is last because it is the only rule that mutates start: Ofcom requires a clean separation between consecutive cues so the decoder can flush its render buffer, and 250 ms is 6.25 frames, which is not an integer — rounding up to 7 frames (280 ms) is the only way to stay at or above the floor while landing on the grid. After shoving start forward, dwell is re-asserted so the gap correction can never produce a sub-1.0 s block.

prev_end carries state in frames, never milliseconds, so the inter-cue arithmetic (start - prev_end) is exact integer subtraction with no accumulated error across a multi-hour asset. The OFFGRID_CORRECTED flag is diagnostic: it fires whenever the input start differed from the grid-snapped value by a frame or more, which is the signature of upstream float export and the field you grep for when triaging where drift entered the pipeline. The generator yields one cue at a time, so memory is bounded by a single cue regardless of asset length — a multi-hour VOD manifest streams through with O(1) overhead.

Edge cases & known gotchas

Non-25 fps sources. A 23.976 or 29.97 fps caption file snapped to the 25 fps grid will look aligned but be systematically offset — convert and re-base timecodes before quantising, via SRT timestamp normalization, or the OFFGRID_CORRECTED flag will fire on every cue.
Cascade overflow. Enforcing the gap on a dense dialogue run pushes each start forward, which can cascade and drift the tail of a sequence later than its audio; cap the cumulative shove and flag the block for a manual reflow rather than letting it walk.
Maximum-dwell clamp loses text. MAX_DWELL_CLAMPED only fixes the timing; the on-screen text still needs splitting into two blocks. Treat the flag as a work item, not a resolution.
round() banker’s rounding. Python’s round uses round-half-to-even, so a value exactly on a half-frame (e.g. 20 ms) snaps to the nearest even frame. This is deterministic and fine for QC, but document it — a naive reviewer expecting round-half-up will see a one-frame difference.
Legitimate long cues. Lyric and karaoke assets routinely exceed 7.0 s by design; gate MAX_DWELL_CLAMPED behind a per-genre override instead of hard-clamping every block.

Integration hook

This quantiser is the calibration stage that runs immediately after parse and before the rule engine in the Ofcom Code on Subtitling Standards validator: that cluster’s validate_ofcom step assumes its input is already on the 40 ms grid when it computes reading speed and dwell, so this snap must run first. Feed its grid-aligned SubtitleCue stream straight into that validator, and route the emitted flags into the same JSON verdict so an OFFGRID_CORRECTED count becomes a tracked QC metric rather than a silent repair.

Ofcom Code on Subtitling Standards — parent reference: the full reading-speed, line-geometry and sync rule set this grid feeds.
SRT timestamp normalization — the frame-rate re-basing that must precede quantisation for non-PAL sources.
Converting SCC to SRT without timing loss — drop-frame-to-25fps conversion that produces the off-grid timestamps this page corrects.
Detecting sync drift in automated QC pipelines — catches the cumulative offset that survives per-cue snapping.

Part of: Broadcast Captioning Architecture & Compliance — the regulatory-thresholds-as-code reference for broadcast and OTT caption pipelines.

Ofcom subtitle timing requirements explained

Code walkthrough

Edge cases & known gotchas

Integration hook

Related