video: pre-ship review fixes for the FFmpeg renderer

Six prod-blocking issues and three correctness improvements from an
independent code review of 7243ef7. Verified on Huawei Mate 20 (EMUI
11) — playback, rotation, replay-after-end all still work.

  - EAGAIN on avcodec_send_packet was silently dropping the input
    packet (SimpleDecoder consumed it before we could retry).
    ffmpeg_jni.cc now caches a frame drained from the output queue
    into pending_frame, retries the send, and the next
    ffmpegVideoReceiveFrame emits the cached frame in order before
    pulling a new one.
  - C.TIME_UNSET == Long.MIN_VALUE == AV_NOPTS_VALUE was an
    undocumented coincidence between two upstreams. Gate it
    explicitly so a future Media3 sentinel change can't scramble
    display-order PTS recovery.
  - supportsFormat parses the H.264 profile from format.codecs and
    rejects non-8-bit profiles (High 10 / High 4:2:2 / High 4:4:4).
    These initialise libavcodec cleanly and only fail at the first
    receive — too late for ExoPlayer to fall through to MediaCodec.
    Rejecting upfront lets the platform decoder pick them up.
  - build_ffmpeg.sh wraps the whole run in a portable mkdir-based
    lock and clones into a staging dir + atomic rename with a
    sentinel file. Concurrent Gradle daemons no longer corrupt
    each other; an interrupted clone leaves no usable state for
    the next run to mistake as finished.
  - FfmpegOutputSurface and VideoCompositor both used to call
    eglTerminate(EGL_DEFAULT_DISPLAY) on teardown. That display is
    process-global and shared — the first teardown killed the
    other consumer's surface. Drop both calls; per-context cleanup
    + eglReleaseThread is sufficient. Likely cause of any "frozen
    surface after second video" report.
  - Rotation swap in renderOutputBuffer mutates the public
    outputBuffer.width/height. Bound it to SURFACE_YUV output mode
    via a currentOutputMode tracker; YUV-mode consumers
    (VideoDecoderOutputBufferRenderer.setOutputBuffer) read
    width/height expecting CODED dims that match yuvStrides[0] —
    the swap would walk chroma off the end of the allocation.
  - Fragment shader bumped from mediump to highp. The limited-range
    pre-scale (y - 16/255) * (255/219) was at risk of quantizing
    through 10-bit mediump and banding dark gradients on older
    Mali / Adreno parts. highp on the fragment is universally
    supported on GLES2 implementations Android ships post-2014.
  - Threading config comment was wrong about what FF_THREAD_SLICE
    does for H.264. Replace with the accurate explanation (slice
    threading degenerates to single-threaded on iOS's single-slice
    encodes; FRAME threading is rejected because of the input-side
    latency, not because libavcodec doesn't support it).
  - FfmpegVideoDecoder header documents two known limits the
    review surfaced but that don't have a clean fix at this layer:
    EOS tail-frame loss (~500 ms truncation on first play-through
    only; replay is fine because flush_buffers clears libavcodec)
    and the size-based colorspace heuristic mislabelling iPhone
    6/7-era unspecified-metadata BT.601 1080p clips as BT.709.
This commit is contained in:
agra
2026-05-29 07:33:20 +03:00
parent 7243ef7de4
commit c0d55babf3
6 changed files with 206 additions and 36 deletions

View File

@@ -98,8 +98,13 @@ final class FfmpegOutputSurface {
// yuvj420p) and limited-range conversion. uSampleScale rescales the
// horizontal texture coordinate to skip the right-side padding that
// FFmpeg's SIMD-aligned linesize introduces (yStride >= width).
// highp on the fragment so the limited-range pre-scale
// `(y - 16/255) * (255/219)` doesn't quantize through 10-bit-ish
// mediump precision and band dark gradients on older Mali / Adreno
// parts. highp on a fragment shader is universally supported on
// GLES2 implementations Android ships post-2014.
private static final String FRAGMENT_SHADER =
"precision mediump float;\n"
"precision highp float;\n"
+ "varying vec2 vTex;\n"
+ "uniform sampler2D uY;\n"
+ "uniform sampler2D uU;\n"
@@ -409,7 +414,11 @@ final class FfmpegOutputSurface {
eglContext = EGL14.EGL_NO_CONTEXT;
}
EGL14.eglReleaseThread();
EGL14.eglTerminate(eglDisplay);
// NB: do NOT eglTerminate(EGL_DEFAULT_DISPLAY) here — the
// display is shared with VideoCompositor's EGL context, and
// tearing it down would silently kill the other consumer's
// surface. eglDestroyContext + eglReleaseThread is sufficient
// to clean up our share.
eglDisplay = EGL14.EGL_NO_DISPLAY;
}
quadBuffer = null;

View File

@@ -26,6 +26,27 @@ import java.util.List;
* directly onto libavcodec's avcodec_send_packet / avcodec_receive_frame
* lifecycle so we can drain multiple reordered frames out of a single
* input packet.
*
* <h3>Known limits</h3>
* <ul>
* <li><b>EOS trailing frames.</b> Media3's {@code SimpleDecoder}
* base class special-cases the end-of-stream input buffer and
* never invokes our {@code decode()} for it, so libavcodec's
* reorder buffer (~16 frames for iOS H.264 High@3.1) is never
* drained with {@code avcodec_send_packet(NULL)}. The last ~500 ms
* of a clip can be truncated on first play-through. Replay via
* {@code REPEAT_MODE_ONE} or {@code seekTo(0)} hits
* {@code avcodec_flush_buffers} which clears the queue, so the
* second play and onwards are full-length.</li>
* <li><b>Colorspace heuristic.</b> When the bitstream's
* {@code colorspace}/{@code primaries}/{@code transfer} are all
* unspecified we fall back to a size-based guess (BT.709 for >=
* 720p, BT.601 below). iPhone 6/7-era 1080p clips that recorded
* BT.601 with unspecified metadata get mislabelled — skin tones
* are slightly oversaturated. Modern iOS sets {@code bt709}
* explicitly so trusting the bitstream is correct for almost
* everything in circulation today.</li>
* </ul>
*/
@UnstableApi
public final class FfmpegVideoDecoder

View File

@@ -63,6 +63,7 @@ public final class FfmpegVideoRenderer extends DecoderVideoRenderer {
private int surfaceWidth = -1;
private int surfaceHeight = -1;
private int surfaceRotation = 0;
private @C.VideoOutputMode int currentOutputMode = C.VIDEO_OUTPUT_MODE_NONE;
public FfmpegVideoRenderer(
long allowedJoiningTimeMs,
@@ -111,6 +112,16 @@ public final class FfmpegVideoRenderer extends DecoderVideoRenderer {
if (!FfmpegLibrary.supportsFormat(mime)) {
return RendererCapabilities.create(C.FORMAT_UNSUPPORTED_SUBTYPE);
}
if (!supports8BitH264Profile(format.codecs)) {
// The YUV path only handles planar 4:2:0 8-bit (yuv420p /
// yuvj420p). High 10 / High 4:2:2 / High 4:4:4 / Main 10
// streams initialise libavcodec cleanly and only fail at the
// first receive — by then ExoPlayer has committed to this
// renderer and can't fall back. Reject upfront so the platform
// MediaCodec path (which often handles these via hardware) gets
// selected instead.
return RendererCapabilities.create(C.FORMAT_UNSUPPORTED_SUBTYPE);
}
if (format.cryptoType != C.CRYPTO_TYPE_NONE) {
return RendererCapabilities.create(C.FORMAT_UNSUPPORTED_DRM);
}
@@ -118,6 +129,26 @@ public final class FfmpegVideoRenderer extends DecoderVideoRenderer {
C.FORMAT_HANDLED, ADAPTIVE_SEAMLESS, TUNNELING_NOT_SUPPORTED);
}
/**
* Parses the H.264 profile from an avc1 codec string (e.g.
* {@code avc1.640028}). Accepts the 8-bit YUV 4:2:0 profiles —
* Baseline (0x42), Main (0x4D), Extended (0x58), High (0x64) —
* and rejects everything else. When the codec string is missing or
* malformed we permit it: the worst case is a hard fail at decode
* time, which is no worse than today's behaviour.
*/
private static boolean supports8BitH264Profile(@Nullable String codecs) {
if (codecs == null) return true;
String lower = codecs.toLowerCase();
if (!lower.startsWith("avc1.") || lower.length() < 11) return true;
try {
int profile = Integer.parseInt(lower.substring(5, 7), 16);
return profile == 0x42 || profile == 0x4D || profile == 0x58 || profile == 0x64;
} catch (NumberFormatException e) {
return true;
}
}
@Override
protected FfmpegVideoDecoder createDecoder(Format format, @Nullable CryptoConfig cryptoConfig)
throws FfmpegDecoderException {
@@ -134,16 +165,22 @@ public final class FfmpegVideoRenderer extends DecoderVideoRenderer {
/// Pre-swap buffer dims for 90°/270° rotated streams so the
/// {@code maybeNotifyVideoSizeChanged} call inside the base
/// renderOutputBuffer reports DISPLAY-orientation dimensions (matching
/// what MediaCodecVideoRenderer does for the hardware path). Without
/// this swap, portrait iOS videos report their coded landscape size
/// and the downstream compositor lays out the Flutter texture
/// rotated.
/// renderOutputBuffer reports DISPLAY-orientation dimensions
/// (matching what MediaCodecVideoRenderer does for the hardware
/// path). Without this swap, portrait iOS videos report their
/// coded landscape size and the downstream compositor lays out the
/// Flutter texture rotated. The swap is bounded to SURFACE_YUV
/// output mode because YUV-mode consumers
/// ({@code VideoDecoderOutputBufferRenderer.setOutputBuffer}) read
/// {@code buffer.width}/{@code height} expecting CODED dimensions
/// that match {@code yuvStrides[0]} — swapping there would walk
/// the chroma planes off the end of the allocation.
@Override
protected void renderOutputBuffer(
VideoDecoderOutputBuffer outputBuffer, long presentationTimeUs, Format outputFormat)
throws DecoderException {
if (outputFormat != null
if (currentOutputMode == C.VIDEO_OUTPUT_MODE_SURFACE_YUV
&& outputFormat != null
&& (outputFormat.rotationDegrees == 90 || outputFormat.rotationDegrees == 270)
&& outputBuffer.width != outputBuffer.height) {
int tmp = outputBuffer.width;
@@ -183,6 +220,7 @@ public final class FfmpegVideoRenderer extends DecoderVideoRenderer {
@Override
protected void setDecoderOutputMode(@C.VideoOutputMode int outputMode) {
currentOutputMode = outputMode;
if (decoder != null) {
decoder.setOutputMode(outputMode);
}
@@ -204,6 +242,7 @@ public final class FfmpegVideoRenderer extends DecoderVideoRenderer {
protected void onDisabled() {
releaseOutputSurface();
decoder = null;
currentOutputMode = C.VIDEO_OUTPUT_MODE_NONE;
super.onDisabled();
}

View File

@@ -387,7 +387,11 @@ internal class VideoCompositor(
EGL14.eglDestroyContext(eglDisplay, eglContext)
eglContext = EGL14.EGL_NO_CONTEXT
}
EGL14.eglTerminate(eglDisplay)
EGL14.eglReleaseThread()
// NB: do NOT eglTerminate(EGL_DEFAULT_DISPLAY) here — the
// display is shared with FfmpegOutputSurface's EGL context,
// and tearing it down would silently kill the other consumer's
// window surface. Per-context cleanup above is enough.
eglDisplay = EGL14.EGL_NO_DISPLAY
}
}