Where in the pipeline should the gate run?

Immediately after caption generation and normalization but before packaging or mux, running in parallel with A/V encoding while declared a required predecessor of every downstream delivery step.

What exit code should the gate return?

0 for compliant (proceed), 1 for a compliance failure (block the merge or deploy), and 2 for gate misuse or a crash so the failure routes to engineering rather than the captioning team.

Why does a gate pass locally but fail on the CI runner?

Usually a different pycaption or ffprobe version on the runner changes microsecond rounding or format auto-detection. Pin exact versions in the lockfile and install from pre-built wheels.

How do I stop drop-frame timecode causing false drift failures?

Normalize the timebase at parse time so drop-frame and non-drop are handled correctly before the gate runs; otherwise ~3.6 seconds per hour of phantom drift trips the +/-100 ms tolerance.

CI/CD Gating for Caption Builds

Caption builds fail closed or they do not fail at all. When caption artifacts are generated programmatically (by speech-to-text engines, vendor APIs, or formatting scripts), a malformed payload that slips past review is not a cosmetic defect: it is an FCC 47 CFR § 79.1 violation baked into the delivery container, discoverable only at playout or during an audit. The engineering gap this page closes is the absence of a hard, deterministic enforcement boundary between caption generation and delivery. A CI/CD gate is that boundary: a stateless job that parses the caption artifact, asserts every cue against a numeric rule set, and returns a non-zero exit code the moment any hard threshold is breached, halting the pipeline before the asset reaches a mux, an MAM, or a CDN origin. This is the build-time enforcement arm of the broader Automated QC Validation & Reporting control layer.

The quantified contract is simple to state and unforgiving to enforce: zero cues over the reading-rate ceiling, zero cumulative timing drift beyond the regulatory tolerance, zero line-length overflows, and zero overlaps within a layout region. Any one of these conditions sets the exit code to non-zero, and a non-zero exit code blocks the merge or deploy.

Problem Framing: The Build Exit Contract

A monitoring tool that flags a bad caption file after delivery is a detective control; a CI gate is a preventive control, and the difference is the exit code. The gate runs inside the same job context as the build, reconstructs the timing grid and control codes of the artifact in memory, and converts every compliance question into a boolean assertion. The output is not a dashboard — it is a process exit status that the CI runner already knows how to act on: 0 advances the pipeline, any non-zero value fails the stage and, in a protected merge queue, blocks the merge.

Three properties make this enforceable rather than advisory. First, the gate is deterministic — the same caption bytes always produce the same verdict, so a green build is reproducible evidence. Second, it is stateless — no cross-asset context leaks between jobs, so a passing asset cannot be contaminated by a previous failing one. Third, it is auditable — every fail carries the cue index, the timestamp, and the clause it violated, so the JSON the gate emits is admissible in an FCC Part 79 compliance inquiry. These mirror the design constraints of the parent QC layer, narrowed to the single question a build asks: ship or block?

Pipeline Stage & Prerequisites

The gate executes immediately after caption generation and normalization, but strictly before packaging, muxing, or delivery. At that inflection point the caption artifact (SCC, SRT, TTML, or WebVTT) is final but the delivery container has not yet been written, so a failure costs a rerun rather than a recall. The validation job should run in parallel with audio/video encoding to preserve pipeline velocity, but it must be declared a required predecessor of every downstream packaging step. The normalized cue model it consumes is the same canonical structure produced by the SRT, SCC & WebVTT parsing workflows — the gate validates that model, it does not re-implement parsing.

Required tooling:

Tool / Library	Version	Role in the gate
Python	≥ 3.9	Runtime; `dataclasses`, `:=`, and built-in generics such as `list[Violation]` used below
`pycaption`	≥ 1.0.6	Multi-format reader (SCC/SRT/TTML/WebVTT) → normalized cues
`pysrt`	≥ 1.1.2	Fast SRT cue access for SRT-only pipelines
`ffprobe` (FFmpeg)	≥ 4.4	Reference PTS/timebase for drift checks (no full decode)
`numpy`	≥ 1.21	Vectorized drift/rate math on large assets

Pin these in the job’s lockfile and install from pre-built wheels — compiling on an ephemeral runner adds minutes to every gate and is the most common cause of a “slow but green” pipeline. For SCC ingest the gate relies on the state-machine decode covered in parsing SCC with Python libraries; for SRT it assumes the cues have already passed SRT timestamp normalization so that comparisons run on monotonic, frame-quantized times.
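One way to make version skew fail loudly is to have the gate compare installed package versions against its pins before it parses anything. The following is a minimal sketch, not part of the gate as described: `PINS` and `pin_mismatches` are hypothetical names, and the version strings simply reuse the minimums from the table above for illustration. Real pins would come from the job's lockfile.

```python
from importlib import metadata

# Hypothetical pins for illustration only; take real values from the lockfile.
PINS = {"pycaption": "1.0.6", "pysrt": "1.1.2"}

def pin_mismatches(pins, get_version=metadata.version):
    """Return {package: (expected, found)} for every package that is off-pin."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches
```

A non-empty result is an environment problem rather than a caption problem, so a gate that adopts this check would most naturally report it with exit code 2.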

Step-by-Step Implementation

Step 1 — Load and normalize the payload

Load the artifact into pycaption’s normalized cue model so the rest of the gate is format-agnostic. pycaption can auto-detect the format with detect_format(), but explicit dispatch keeps the gate’s failures legible.

import sys
from pycaption import SCCReader, SRTReader, WebVTTReader, DFXPReader

READERS = {
    "scc": SCCReader,     # CEA-608 byte pairs
    "srt": SRTReader,     # SubRip
    "vtt": WebVTTReader,  # W3C WebVTT
    "ttml": DFXPReader,   # TTML / IMSC1 / SMPTE-TT
}

def load(payload: str, fmt: str):
    fmt = fmt.lower()
    if fmt not in READERS:
        raise ValueError(f"Unsupported caption format: {fmt}")
    # pycaption stores all times in microseconds internally
    return READERS[fmt]().read(payload)

Step 2 — Assert each cue against the numeric rule set

Iterate the normalized cues and convert every compliance rule into an explicit assertion. Each check cites the clause it enforces; the threshold values themselves live in the reference table below so they are tuned in one place.

import re
from dataclasses import dataclass

@dataclass
class GateConfig:
    max_cps: float = 20.0           # FCC 47 CFR § 79.1(j) readability; 17-20 cps practical ceiling
    min_cue_sec: float = 1.5        # decoder anti-flicker floor (CEA-608 render stability)
    max_line_chars: int = 32        # CEA-608 fixed 32-column grid
    max_drift_ms: float = 100.0     # FCC § 79.1 synchronicity tolerance (hard fail)

@dataclass
class Violation:
    code: str
    detail: str
    cue_us: int

def visible_len(text: str) -> int:
    # Count rendered glyphs only; whitespace does not consume decoder bandwidth
    return len(re.sub(r"\s+", "", text))

def check_cue(cue, cfg: GateConfig) -> list[Violation]:
    start_us, end_us = cue.start, cue.end
    duration_s = (end_us - start_us) / 1_000_000
    out: list[Violation] = []

    # CEA-608 render stability: sub-1.5s cues flicker on hardware decoders
    if duration_s < cfg.min_cue_sec:
        out.append(Violation("SHORT_CUE", f"{duration_s:.2f}s", start_us))

    text = cue.get_text()
    for line in text.splitlines():
        # CEA-608 — fixed 32-column safe-area grid; overflow is silently clipped by decoders
        if len((stripped := line.strip())) > cfg.max_line_chars:
            out.append(Violation("LINE_OVERFLOW", f"{len(stripped)} chars", start_us))

    # FCC 47 CFR § 79.1(j) — reading rate must stay readable
    cps = visible_len(text) / duration_s if duration_s > 0 else float("inf")
    if cps > cfg.max_cps:
        out.append(Violation("CPS_VIOLATION", f"{cps:.1f} cps", start_us))

    return out

Step 3 — Detect overlaps and cumulative drift across the track

Per-cue checks miss two whole-track failures: overlapping cues in the same region, and synchronization drift that accumulates over the program. Sort once, then sweep.

def check_track(captions, cfg: GateConfig) -> list[Violation]:
    out: list[Violation] = []
    for lang in captions.get_languages():
        # Sort by onset so the adjacent-pair sweep below is valid for unordered input
        cues = sorted(captions.get_captions(lang), key=lambda c: c.start)
        for cue in cues:
            out.extend(check_cue(cue, cfg))

        # Zero-tolerance overlap detection; each language track is treated as one layout region
        for prev, nxt in zip(cues, cues[1:]):
            if nxt.start < prev.end:  # CEA-608/708: overlapping cues corrupt the buffer
                out.append(Violation("OVERLAP", f"{(prev.end - nxt.start)/1000:.0f}ms", nxt.start))
    return out

For drift the gate compares caption onsets to reference video PTS from ffprobe; the correlation-window and smoothing details belong to automated sync drift detection, and the gate simply asserts the cumulative result stays inside cfg.max_drift_ms.
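The final assertion can be sketched as follows, assuming caption onsets and reference times have already been paired one-to-one and expressed in microseconds. The function names and the trailing-median window are illustrative stand-ins, not the smoothing used by automated sync drift detection, and the extraction of reference PTS from ffprobe is deliberately left out.

```python
from statistics import median

def cumulative_drift_ms(onsets_us, reference_us, window=5):
    """Largest absolute trailing-median offset (ms) between captions and reference."""
    if len(onsets_us) != len(reference_us):
        raise ValueError("onsets and reference times must pair one-to-one")
    offsets_ms = [(c - r) / 1000 for c, r in zip(onsets_us, reference_us)]
    worst = 0.0
    for i in range(len(offsets_ms)):
        # A median over the trailing window damps single-cue jitter
        smoothed = median(offsets_ms[max(0, i - window + 1): i + 1])
        worst = max(worst, abs(smoothed))
    return worst

def drift_ok(onsets_us, reference_us, max_drift_ms=100.0):
    """True when the smoothed drift stays inside the tolerance."""
    return cumulative_drift_ms(onsets_us, reference_us) <= max_drift_ms
```

In this sketch a single outlier cue does not trip the check, while an offset that grows steadily over the program does, which is the distinction the gate cares about.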

Step 4 — Emit the exit contract and a structured report

The entry point reads the artifact from standard input — which is what makes the gate composable in shell pipelines and containers — writes a machine-readable report, and translates the violation list into the exit code the runner acts on.

import json

def run_gate(payload: str, fmt: str, cfg: GateConfig = GateConfig()) -> int:
    captions = load(payload, fmt)
    violations = check_track(captions, cfg)

    report = {
        "format": fmt,
        "passed": not violations,
        "violation_count": len(violations),
        "violations": [v.__dict__ for v in violations],
    }
    # Report to stdout for archival; human-readable fails to stderr
    print(json.dumps(report))
    for v in violations:
        print(f"[FAIL] {v.code} {v.detail} @ {v.cue_us} us", file=sys.stderr)

    return 1 if violations else 0  # non-zero blocks the build/merge

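if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Usage: caption_gate.py <scc|srt|vtt|ttml>  < artifact", file=sys.stderr)
        sys.exit(2)  # exit 2 = misuse, distinct from 1 = compliance failure
    try:
        code = run_gate(sys.stdin.read(), sys.argv[1])
    except Exception as exc:
        # An uncaught exception would exit 1 and masquerade as a compliance failure
        print(f"[GATE ERROR] {exc!r}", file=sys.stderr)
        code = 2
    sys.exit(code)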

The three-state exit contract matters: 0 = compliant (proceed), 1 = compliance failure (block), 2 = gate misuse or crash (block, but flag for engineering rather than caption review). A CI step that treats every non-zero identically will route a runner crash to the captioning team and waste a triage cycle. Runner configuration, format matrices, and conditional deploy triggers for the GitHub Actions side are in integrating caption QC into GitHub Actions CI.
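On the calling side, the contract can be enforced by a thin wrapper that folds every outcome, including a hang, into one of the three codes. This is a sketch under assumptions: `run_with_contract` and the `ROUTES` labels are hypothetical, the 240-second default is just one value inside a 3 to 5 minute budget, and the wrapper assumes the gate itself exits 2 on internal errors (an uncaught Python exception otherwise exits 1).

```python
import subprocess
import sys

def run_with_contract(cmd, payload: bytes, timeout_s: float = 240.0) -> int:
    """Run the gate command and fold every outcome into 0, 1 or 2."""
    try:
        proc = subprocess.run(cmd, input=payload, capture_output=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return 2  # hung parse: an engineering problem, not a caption problem
    except OSError:
        return 2  # the gate could not be launched at all
    # Anything other than 0 or 1 (signals, unknown codes) is treated as a crash
    return proc.returncode if proc.returncode in (0, 1) else 2

# Hypothetical routing labels for the three states
ROUTES = {0: "proceed", 1: "caption-review", 2: "pipeline-engineering"}
```

A call such as `ROUTES[run_with_contract([sys.executable, "caption_gate.py", "vtt"], payload)]` then yields the queue the result belongs in, so a timeout or a missing binary never lands on the captioning team.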

Threshold Reference Table

Every limit the gate asserts, with its source. Tune GateConfig against this table, not against prose.

Check	Hard limit	Soft warn	Source / clause	Notes
Reading rate (sustained)	20 cps	17 cps	FCC 47 CFR § 79.1(j)	Language-agnostic; preferred over WPM
Reading rate (burst)	30 cps	—	CEA-608 decoder bandwidth	Short EAS overlays only
CEA-608 byte budget	59.94 B/s	—	CEA-608 / line 21	Two bytes per frame at 29.97 fps, per field; pacing ceiling per airtime second
Line length (608)	32 chars	28 chars	CEA-608 column grid	Overflow is silently clipped
Min cue duration	1.5 s	2.0 s	Decoder render stability	Sub-floor cues flicker
Sync drift (cumulative)	±100 ms	±50 ms	FCC § 79.1 synchronicity	29.97 fps ≈ ±3 frames
Sync drift (60p)	±33.3 ms	±16.7 ms	SMPTE ST 12-1 timecode	Progressive pipelines
Cue overlap (same region)	0 ms	—	CEA-608/708 buffer model	Zero tolerance

Reading-rate enforcement — the tokenization and non-destructive remediation behind the cps numbers — is detailed in enforcing character rate limits in QC.
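The GateConfig shown earlier carries only the hard limits. If the soft-warn column is to be reported without failing the build, one possible shape is a three-level classifier. This is an illustrative sketch: `Severity`, `UPPER_BOUNDS`, `LOWER_BOUNDS`, and `classify` are hypothetical names, and the numbers are copied from the table above.

```python
from enum import Enum

class Severity(Enum):
    OK = "ok"
    WARN = "warn"
    FAIL = "fail"

# (soft warn, hard limit) pairs copied from the threshold table
UPPER_BOUNDS = {
    "cps": (17.0, 20.0),
    "line_chars": (28, 32),
    "drift_ms": (50.0, 100.0),
}
LOWER_BOUNDS = {"cue_sec": (2.0, 1.5)}

def classify(check: str, value: float) -> Severity:
    """Map a measured value to OK, WARN or FAIL for a named check."""
    if check in UPPER_BOUNDS:
        soft, hard = UPPER_BOUNDS[check]
        magnitude = abs(value)  # drift is symmetric: -120 ms is as bad as +120 ms
        if magnitude > hard:
            return Severity.FAIL
        return Severity.WARN if magnitude > soft else Severity.OK
    soft, hard = LOWER_BOUNDS[check]  # KeyError on an unknown check name
    if value < hard:
        return Severity.FAIL
    return Severity.WARN if value < soft else Severity.OK
```

Only `Severity.FAIL` would feed the exit code; warnings can ride along in the JSON report so reviewers see cues that are close to a limit.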

Verification & Test Pattern

A gate that has never failed on a known-bad fixture is untested infrastructure. Pin both a clean fixture (must exit 0) and a poisoned fixture that breaches each threshold (must exit 1), and assert on the violation codes so a refactor cannot silently weaken a check.

import pytest
from caption_gate import run_gate, GateConfig, load, check_track

OVER_RATE_VTT = """WEBVTT

00:00:00.000 --> 00:00:00.800
Far too many visible characters packed into well under a second of airtime here
"""

CLEAN_VTT = """WEBVTT

00:00:01.000 --> 00:00:03.500
A short, well-paced line.
"""

def codes(payload, fmt):
    return {v.code for v in check_track(load(payload, fmt), GateConfig())}

def test_clean_payload_passes():
    assert run_gate(CLEAN_VTT, "vtt") == 0

def test_dense_cue_fails_rate_and_duration():
    found = codes(OVER_RATE_VTT, "vtt")
    # one dense, sub-1.5s cue must trip BOTH the rate and the duration floor
    assert "CPS_VIOLATION" in found
    assert "SHORT_CUE" in found

def test_unsupported_format_raises():
    # load() raises ValueError on an unknown format; mapping that to exit 2 is the entry point's job
    with pytest.raises(ValueError):
        run_gate("garbage", "xyz")

Run this suite as its own CI step before the gate is ever pointed at production artifacts; reusable drift-threshold fixtures are a recurring need across the QC pages and follow the same fixture-per-clause discipline.

Troubleshooting / Failure Modes

Gate passes locally, fails on the runner : Root cause: the runner installed a different pycaption/ffprobe version, changing microsecond rounding or auto-detection. Fix: pin exact versions in the lockfile and install from wheels; never rely on the runner’s system FFmpeg.

OOM SIGKILL on feature-length assets : Root cause: get_captions() materializes the full cue list; on long programs across a matrix of jobs the resident set climbs past the ~7 GB runner ceiling. Fix: process one language/reel at a time, drop references between iterations, and trigger gc.collect() between assets.

Drift false positives at GOP boundaries : Root cause: comparing raw caption onsets to keyframe-only PTS makes transient jitter look like drift. Fix: assert on the smoothed cumulative offset from automated sync drift detection, not per-cue deltas.

Drop-frame timecode read as non-drop : Root cause: SCC drop-frame (;) parsed as non-drop (:) injects ~3.6 s/hr of phantom drift, tripping the ±100 ms gate on otherwise compliant files. Fix: normalize timebase at parse time via SRT timestamp normalization before the gate runs.

Build hangs in the merge queue : Root cause: a malformed artifact sends a reader into a pathological parse with no timeout, holding the runner. Fix: enforce a 3–5 minute step timeout and treat a timeout as exit 2 (engineering), not a caption failure.

Every non-zero exit blames the captioning team : Root cause: CI treats 1 and 2 identically. Fix: branch on the exit code — route 1 to caption review and 2 to pipeline engineering.
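The ~3.6 s/hr figure in the drop-frame entry above can be checked with a short worked conversion. The sketch below assumes a 29.97 fps timebase (1001/30000 s per frame) and the usual drop-frame rule of skipping two frame numbers every minute except each tenth minute; `scc_timecode_to_seconds` is a hypothetical helper, not a pycaption function.

```python
from fractions import Fraction

FRAME_DURATION = Fraction(1001, 30000)  # one frame at 29.97 fps, in seconds

def scc_timecode_to_seconds(tc: str) -> Fraction:
    """Convert HH:MM:SS:FF (non-drop) or HH:MM:SS;FF (drop-frame) to seconds."""
    drop_frame = ";" in tc
    hh, mm, ss, ff = (int(p) for p in tc.replace(";", ":").split(":"))
    frames = (hh * 3600 + mm * 60 + ss) * 30 + ff
    if drop_frame:
        total_minutes = hh * 60 + mm
        # Two frame numbers are skipped each minute, except every tenth minute
        frames -= 2 * (total_minutes - total_minutes // 10)
    return frames * FRAME_DURATION

# The same one-hour label read both ways differs by 108 frames
gap = scc_timecode_to_seconds("01:00:00:00") - scc_timecode_to_seconds("01:00:00;00")
```

Here `gap` is 108 frames, roughly 3.6 seconds, which is about 36 times the ±100 ms tolerance after a single hour of program.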

Operational Notes

At single-file scale the gate is trivial; at multi-language, multi-format delivery it becomes a throughput problem. Move from one-file invocation to batch-aware orchestration with a bounded worker pool — size workers to roughly min(cpu_count, 7GB / peak_asset_RSS) so a matrix of jobs cannot collectively exhaust the runner. Read artifacts from stdin or object storage rather than copying onto the runner disk, and validate in generator-driven chunks so memory scales with cue-window size, not asset duration; the streaming approach in async batch caption processing is the reference for that I/O pattern.
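The worker-sizing rule above translates directly into a small helper. This is a sketch: `worker_count` and `RUNNER_MEMORY_BYTES` are hypothetical names, the 7 GB budget is the runner ceiling cited on this page, and the peak resident set per asset is assumed to have been measured beforehand.

```python
import os

RUNNER_MEMORY_BYTES = 7 * 1024**3  # the ~7 GB runner ceiling cited on this page

def worker_count(peak_asset_rss_bytes, cpu_count=None,
                 memory_budget_bytes=RUNNER_MEMORY_BYTES):
    """Bound the pool by both CPU count and memory, never below one worker."""
    if peak_asset_rss_bytes <= 0:
        raise ValueError("peak_asset_rss_bytes must be positive")
    cpus = cpu_count or os.cpu_count() or 1
    by_memory = memory_budget_bytes // peak_asset_rss_bytes
    return max(1, min(cpus, by_memory))
```

With 8 cores and assets that peak at 2 GiB each, the helper returns 3, so memory rather than CPU sets the pool size. The result can be passed as `max_workers` to a `concurrent.futures` executor.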

The JSON report from Step 4 should not die with the job. Route it to the scheduled QC report generation layer so per-build verdicts roll up into the audit record, and retain gated artifacts for the legally required period — the retention and tamper-evidence design lives in secure caption pipeline design. Cache dependencies and pre-compiled wheels to keep the gate inside its time budget, and keep the gate itself stateless so it can scale horizontally without coordination.

Integrating caption QC into GitHub Actions CI — runner config, format matrix and exit-code wiring for the gate.
Automated sync drift detection — the smoothed PTS-alignment the drift assertion depends on.
Enforcing character rate limits in QC — cps/WPM tokenization behind the reading-rate threshold.
Scheduled QC report generation — where the gate’s JSON verdicts are aggregated and archived.
SCC vs SRT vs WebVTT architecture — format tradeoffs that set per-format gate thresholds.

Part of: Automated QC Validation & Reporting — the deterministic caption QC and reporting reference.
