video: vendor FFmpeg software AVC renderer

Adds an LGPL FFmpeg-backed video renderer that slots ahead of Media3's
MediaCodecVideoRenderer via EXTENSION_RENDERER_MODE_PREFER. Resolves
playback failures on Huawei EMUI 11 (Mate 20, Kirin 980): the Codec2
HiSilicon AVC decoder initialises cleanly on iOS High@3.1 streams with
deep DPB + full-range yuvj420p, then errors on the first sample inside
MediaCodecVideoRenderer (init-failure fallback can't catch this).
Google's C2 SW AVC decoder hits its 8-frame output-delay cap on the
same shape and stalls on dequeueOutputBuffer.

Media3's own decoder-ffmpeg ships only an audio renderer;
ExperimentalFfmpegVideoRenderer has been a stub since 2020 (returns
FORMAT_UNSUPPORTED_TYPE, createDecoder returns null). NextLib is
GPL-3.0. So we vendor our own Apache-licensed JNI on top of LGPL
FFmpeg: the static FFmpeg libs are linked into libffmpegJNI.so, which
is loaded dynamically at runtime.

Build flow:
  - android/ffmpeg/ holds the JNI source + CMakeLists + orchestrator
    script + LGPL notice. No native binaries in git.
  - :ux:buildFfmpegJni Gradle task (wired to preBuild) clones
    Media3 1.9.2 + FFmpeg release/6.0 into build/ffmpeg-work/ on
    first run, builds h264-only static libs per ABI, links
    libffmpegJNI.so per ABI into build/jniLibs/<abi>/. AGP picks
    them up via sourceSets.main.jniLibs.srcDirs +=. Gradle's
    up-to-date check skips the task when ffmpeg_jni.cc / CMakeLists /
    build_ffmpeg.sh are unchanged.

Renderer:
  - FfmpegVideoDecoder (SimpleDecoder) sends each packet with its
    inputBuffer.timeUs as pkt->pts; the JNI overwrites
    outputBuffer.timeUs with f->pts on receive so frames emitted in
    display order carry their true display PTS (input PTS in decode
    order scrambles ExoPlayer's drop logic and halves the render
    rate on B-frame streams).
  - FfmpegOutputSurface does YUV->RGB in one GLES2 pass against an
    EGL window surface sized to display orientation. Y plane uses
    GL_NEAREST (1:1 sized, sampling at exact texel centres
    preserves luma detail); chroma uses GL_LINEAR. Pre-rotated quad
    UVs (0/90/180/270) keep the YUV sampling correct when the
    coded frame needs rotation for display.
  - FfmpegVideoRenderer swaps the output buffer's width/height for
    90/270 streams before super.renderOutputBuffer notifies size,
    matching MediaCodecVideoRenderer's post-rotation reporting.
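
The PTS point above can be illustrated with a minimal, self-contained
sketch. It is not the vendored JNI: ReorderSketch and its fixed reorder
depth are hypothetical, and a priority queue stands in for libavcodec's
reorder buffer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical model of a reordering decoder; not the vendored JNI. */
final class ReorderSketch {
  // Frames waiting for display, ordered by their own PTS.
  private final PriorityQueue<Long> pending = new PriorityQueue<>();
  private final int reorderDepth;

  ReorderSketch(int reorderDepth) {
    this.reorderDepth = reorderDepth;
  }

  /** Feeds one packet PTS (decode order); returns a frame PTS, or -1 if none is ready. */
  long sendAndReceive(long packetPtsUs) {
    pending.add(packetPtsUs);
    // A frame is released only once more than reorderDepth frames are buffered.
    return pending.size() > reorderDepth ? pending.poll() : -1L;
  }

  /** Runs a whole stream and collects the PTS carried by each emitted frame. */
  static List<Long> run(long[] decodeOrderPtsUs, int reorderDepth) {
    ReorderSketch decoder = new ReorderSketch(reorderDepth);
    List<Long> emitted = new ArrayList<>();
    for (long pts : decodeOrderPtsUs) {
      long framePts = decoder.sendAndReceive(pts);
      if (framePts >= 0) {
        emitted.add(framePts);
      }
    }
    return emitted;
  }
}
```

For decode-order PTS 0, 3, 1, 2, 6, 4, 5 and a reorder depth of 2, the
emitted frames carry 0, 1, 2, 3, 4. Stamping each output with the PTS of
the packet that was just sent would instead give 1, 2, 6, 4, 5, which is
not monotonic.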

Decoder fallback:
  - Renderers.kt selects FfmpegVideoRenderer first when
    libffmpegJNI.so is loaded; falls through to the platform path
    for formats FFmpeg doesn't handle or ABIs without the .so.
  - MediaCodec selector deprioritises every HiSilicon decoder
    (OMX.hisi.* and c2.hisi.*) so the platform path picks
    c2.android.avc.decoder ahead of the C2 Hisi variant when FFmpeg
    isn't available. Required because the C2 Hisi failure is
    post-init, which Media3's setEnableDecoderFallback(true) can't
    intercept.
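
A rough sketch of the deprioritisation rule, reduced to plain strings so
it runs without Android. HisiDeprioritizer is a hypothetical name and the
decoder names in the usage note are examples; only the OMX.hisi. and
c2.hisi. prefixes and c2.android.avc.decoder come from this message. The
real selector works on Media3 decoder infos rather than names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: moves HiSilicon decoders to the end of a candidate list. */
final class HisiDeprioritizer {
  static boolean isHisi(String decoderName) {
    String n = decoderName.toLowerCase(Locale.ROOT);
    return n.startsWith("omx.hisi.") || n.startsWith("c2.hisi.");
  }

  /** Returns a copy with non-HiSilicon decoders first; relative order is otherwise kept. */
  static List<String> reorder(List<String> decoderNames) {
    List<String> out = new ArrayList<>(decoderNames);
    // List.sort is stable, so decoders within each group keep the platform's order.
    out.sort((a, b) -> Boolean.compare(isHisi(a), isHisi(b)));
    return out;
  }
}
```

Given a candidate list of OMX.hisi.video.decoder.avc, c2.hisi.avc.decoder,
c2.android.avc.decoder, the sketch puts c2.android.avc.decoder first and
leaves the two HiSilicon entries as last-resort candidates.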

Compositor:
  - VideoCompositor.setInputSurfaceSize lets the renderer resize the
    codec-input SurfaceTexture before eglCreateWindowSurface so the
    EGL surface inherits matching buffer dimensions on creation
    (MediaCodec sizes natively; EGL doesn't).
  - VideoPlayerInstance wires Renderers.build with a sizer callback
    that calls into compositor.setInputSurfaceSize from the FFmpeg
    renderer thread.

Adds docs/architecture.md with the layered video pipeline diagram,
file map, renderer-selection rationale, build flow, and LGPL
boundary notes.
agra
2026-05-28 19:24:17 +03:00
parent 7ad3a38d38
commit 7243ef7de4
16 changed files with 2439 additions and 16 deletions

@@ -0,0 +1,24 @@
/*
* Copyright 2026 swipelab.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*/
package io.swipelab.ux.video.ffmpeg;
import androidx.media3.common.util.UnstableApi;
import androidx.media3.decoder.DecoderException;
@UnstableApi
public final class FfmpegDecoderException extends DecoderException {
public FfmpegDecoderException(String message) {
super(message);
}
public FfmpegDecoderException(String message, Throwable cause) {
super(message, cause);
}
}

@@ -0,0 +1,89 @@
/*
* Copyright 2026 swipelab.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*/
package io.swipelab.ux.video.ffmpeg;
import androidx.annotation.Nullable;
import androidx.media3.common.MimeTypes;
import androidx.media3.common.util.UnstableApi;
/**
* Loads libffmpegJNI.so and answers capability queries against the
* statically-linked FFmpeg build. The current native build only enables
* the H.264 decoder; queries for other MIME types return false.
*/
@UnstableApi
public final class FfmpegLibrary {
private static final Object lock = new Object();
private static boolean attempted;
private static boolean available;
private FfmpegLibrary() {}
public static boolean isAvailable() {
synchronized (lock) {
if (attempted) {
return available;
}
attempted = true;
try {
System.loadLibrary("ffmpegJNI");
available = true;
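} catch (UnsatisfiedLinkError e) {
// Keep the throwable in the log: a missing .so and an unresolved
// symbol both surface as UnsatisfiedLinkError.
android.util.Log.w(
"UxFfmpeg",
"libffmpegJNI.so failed to load (missing for this ABI?); falling back to MediaCodec",
e);
available = false;
}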
return available;
}
}
public static String getVersion() {
return isAvailable() ? ffmpegGetVersion() : "";
}
public static int getInputBufferPaddingSize() {
return isAvailable() ? ffmpegGetInputBufferPaddingSize() : 0;
}
/** Whether the given MIME type maps to a decoder built into this libffmpegJNI.so. */
public static boolean supportsFormat(@Nullable String mimeType) {
if (mimeType == null || !isAvailable()) {
return false;
}
String codecName = getCodecName(mimeType);
if (codecName == null) {
return false;
}
return ffmpegHasDecoder(codecName);
}
/**
* FFmpeg decoder name for the given MIME type, or null if no mapping
* exists. Only video codecs are wired up; the JNI build excludes
* audio decoders entirely.
*/
@Nullable
public static String getCodecName(String mimeType) {
switch (mimeType) {
case MimeTypes.VIDEO_H264:
return "h264";
default:
return null;
}
}
private static native String ffmpegGetVersion();
private static native int ffmpegGetInputBufferPaddingSize();
private static native boolean ffmpegHasDecoder(String codecName);
}

@@ -0,0 +1,417 @@
/*
* Copyright 2026 swipelab.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*/
package io.swipelab.ux.video.ffmpeg;
import android.opengl.EGL14;
import android.opengl.EGLConfig;
import android.opengl.EGLContext;
import android.opengl.EGLDisplay;
import android.opengl.EGLSurface;
import android.opengl.GLES20;
import android.util.Log;
import android.view.Surface;
import androidx.media3.common.util.UnstableApi;
import androidx.media3.decoder.VideoDecoderOutputBuffer;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
/**
* EGL/GLES2 helper that converts a decoded YUV420P/YUVJ420P frame into
* an RGB image written to an Android {@link Surface}. Lives for the
* duration of one output Surface; the renderer creates a new one when
* the Surface changes and releases this on shutdown.
*
* <p>The fragment shader handles both limited-range BT.709
* (Android-recorded H.264) and full-range BT.709 (iOS yuvj420p). The
* decoder flags per-buffer range via
* {@link VideoDecoderOutputBuffer#decoderPrivate} (0 = limited,
* 1 = full).
*/
@UnstableApi
final class FfmpegOutputSurface {
private static final String TAG = "UxFfmpegOutputSurface";
// Fullscreen quad: (x, y, u, v) per vertex in TRIANGLE_STRIP order
// (top-left, bottom-left, top-right, bottom-right). UVs encode the
// rotation that maps a coded-orientation YUV plane onto a
// display-orientation render target — for each 90° step the UV
// tuple shifts one vertex around the corner ring. Used by
// `pickQuad(rotation)` at configure time so the running shader
// doesn't need a per-frame rotation uniform.
private static final float[] QUAD_0 = {
-1f, 1f, 0f, 0f,
-1f, -1f, 0f, 1f,
1f, 1f, 1f, 0f,
1f, -1f, 1f, 1f,
};
private static final float[] QUAD_90 = {
-1f, 1f, 0f, 1f,
-1f, -1f, 1f, 1f,
1f, 1f, 0f, 0f,
1f, -1f, 1f, 0f,
};
private static final float[] QUAD_180 = {
-1f, 1f, 1f, 1f,
-1f, -1f, 1f, 0f,
1f, 1f, 0f, 1f,
1f, -1f, 0f, 0f,
};
private static final float[] QUAD_270 = {
-1f, 1f, 1f, 0f,
-1f, -1f, 0f, 0f,
1f, 1f, 1f, 1f,
1f, -1f, 0f, 1f,
};
private static float[] pickQuad(int rotation) {
switch (((rotation % 360) + 360) % 360) {
case 90:
return QUAD_90;
case 180:
return QUAD_180;
case 270:
return QUAD_270;
default:
return QUAD_0;
}
}
private static final String VERTEX_SHADER =
"attribute vec4 aPos;\n"
+ "attribute vec2 aTex;\n"
+ "varying vec2 vTex;\n"
+ "void main() {\n"
+ " gl_Position = aPos;\n"
+ " vTex = aTex;\n"
+ "}\n";
// BT.709 YUV->RGB. uFullRange selects between full-range (iOS
// yuvj420p) and limited-range conversion. uSampleScale rescales the
// horizontal texture coordinate to skip the right-side padding that
// FFmpeg's SIMD-aligned linesize introduces (yStride >= width).
private static final String FRAGMENT_SHADER =
"precision mediump float;\n"
+ "varying vec2 vTex;\n"
+ "uniform sampler2D uY;\n"
+ "uniform sampler2D uU;\n"
+ "uniform sampler2D uV;\n"
+ "uniform float uSampleScale;\n"
+ "uniform float uFullRange;\n"
+ "void main() {\n"
+ " vec2 c = vec2(vTex.x * uSampleScale, vTex.y);\n"
+ " float y = texture2D(uY, c).r;\n"
+ " float u = texture2D(uU, c).r - 0.5;\n"
+ " float v = texture2D(uV, c).r - 0.5;\n"
+ " if (uFullRange < 0.5) {\n"
+ " y = (y - 16.0/255.0) * (255.0/219.0);\n"
+ " u *= 255.0/224.0;\n"
+ " v *= 255.0/224.0;\n"
+ " }\n"
+ " float r = y + 1.5748 * v;\n"
+ " float g = y - 0.1873 * u - 0.4681 * v;\n"
+ " float b = y + 1.8556 * u;\n"
+ " gl_FragColor = vec4(clamp(vec3(r, g, b), 0.0, 1.0), 1.0);\n"
+ "}\n";
private EGLDisplay eglDisplay = EGL14.EGL_NO_DISPLAY;
private EGLContext eglContext = EGL14.EGL_NO_CONTEXT;
private EGLSurface eglSurface = EGL14.EGL_NO_SURFACE;
private EGLConfig eglConfig;
private int program;
private int aPosLoc;
private int aTexLoc;
private int uYLoc;
private int uULoc;
private int uVLoc;
private int uSampleScaleLoc;
private int uFullRangeLoc;
private final int[] textures = new int[3];
// Per-plane allocated dimensions; -1 forces a glTexImage2D on the
// next upload (initial frame or size change). Subsequent frames use
// glTexSubImage2D which only copies pixels and avoids the driver-
// side reallocation that glTexImage2D triggers.
private final int[] allocatedStride = { -1, -1, -1 };
private final int[] allocatedRows = { -1, -1, -1 };
private FloatBuffer quadBuffer;
// Width/height the EGL window surface was created for; taken from
// the first decoded frame, not eglQuerySurface (which can return
// stale 1×1 dimensions if the SurfaceTexture's defaultBufferSize
// wasn't set before window creation).
private int surfaceWidth;
private int surfaceHeight;
// Rotation (clockwise) the configure() call selected the quad UVs
// for. Used to detect mid-stream rotation changes and rebuild.
private int configuredRotation;
/**
* Binds this helper to the given surface and creates EGL context +
* GL resources sized for the supplied DISPLAY dimensions (i.e.
* already swapped for 90°/270° rotated streams). Caller must have
* already resized the underlying SurfaceTexture (via
* setDefaultBufferSize) so the EGL window surface picks up matching
* buffer dimensions on creation.
*
* @param rotation 0 / 90 / 180 / 270 — the clockwise rotation that
* should be applied to the coded YUV to display correctly.
* Selects which set of pre-rotated quad UVs the shader samples
* with.
*/
void configure(Surface surface, int width, int height, int rotation)
throws FfmpegDecoderException {
if (eglContext != EGL14.EGL_NO_CONTEXT && eglSurface != EGL14.EGL_NO_SURFACE) {
return;
}
surfaceWidth = width;
surfaceHeight = height;
this.configuredRotation = rotation;
eglDisplay = EGL14.eglGetDisplay(EGL14.EGL_DEFAULT_DISPLAY);
if (eglDisplay == EGL14.EGL_NO_DISPLAY) {
throw new FfmpegDecoderException("eglGetDisplay failed");
}
int[] version = new int[2];
if (!EGL14.eglInitialize(eglDisplay, version, 0, version, 1)) {
throw new FfmpegDecoderException("eglInitialize failed");
}
int[] cfgAttribs = {
EGL14.EGL_RED_SIZE, 8,
EGL14.EGL_GREEN_SIZE, 8,
EGL14.EGL_BLUE_SIZE, 8,
EGL14.EGL_ALPHA_SIZE, 8,
EGL14.EGL_RENDERABLE_TYPE, EGL14.EGL_OPENGL_ES2_BIT,
EGL14.EGL_SURFACE_TYPE, EGL14.EGL_WINDOW_BIT,
EGL14.EGL_NONE
};
EGLConfig[] cfgs = new EGLConfig[1];
int[] numCfgs = new int[1];
if (!EGL14.eglChooseConfig(eglDisplay, cfgAttribs, 0, cfgs, 0, 1, numCfgs, 0)
|| numCfgs[0] < 1) {
throw new FfmpegDecoderException("eglChooseConfig failed");
}
eglConfig = cfgs[0];
int[] ctxAttribs = { EGL14.EGL_CONTEXT_CLIENT_VERSION, 2, EGL14.EGL_NONE };
eglContext =
EGL14.eglCreateContext(eglDisplay, eglConfig, EGL14.EGL_NO_CONTEXT, ctxAttribs, 0);
if (eglContext == EGL14.EGL_NO_CONTEXT) {
throw new FfmpegDecoderException("eglCreateContext failed");
}
int[] surfAttribs = { EGL14.EGL_NONE };
eglSurface =
EGL14.eglCreateWindowSurface(eglDisplay, eglConfig, surface, surfAttribs, 0);
if (eglSurface == EGL14.EGL_NO_SURFACE) {
throw new FfmpegDecoderException("eglCreateWindowSurface failed");
}
if (!EGL14.eglMakeCurrent(eglDisplay, eglSurface, eglSurface, eglContext)) {
throw new FfmpegDecoderException("eglMakeCurrent failed");
}
initGl();
}
private void initGl() throws FfmpegDecoderException {
int vs = compileShader(GLES20.GL_VERTEX_SHADER, VERTEX_SHADER);
int fs = compileShader(GLES20.GL_FRAGMENT_SHADER, FRAGMENT_SHADER);
program = GLES20.glCreateProgram();
GLES20.glAttachShader(program, vs);
GLES20.glAttachShader(program, fs);
GLES20.glLinkProgram(program);
int[] linked = new int[1];
GLES20.glGetProgramiv(program, GLES20.GL_LINK_STATUS, linked, 0);
if (linked[0] == 0) {
String log = GLES20.glGetProgramInfoLog(program);
throw new FfmpegDecoderException("Program link failed: " + log);
}
GLES20.glDeleteShader(vs);
GLES20.glDeleteShader(fs);
aPosLoc = GLES20.glGetAttribLocation(program, "aPos");
aTexLoc = GLES20.glGetAttribLocation(program, "aTex");
uYLoc = GLES20.glGetUniformLocation(program, "uY");
uULoc = GLES20.glGetUniformLocation(program, "uU");
uVLoc = GLES20.glGetUniformLocation(program, "uV");
uSampleScaleLoc = GLES20.glGetUniformLocation(program, "uSampleScale");
uFullRangeLoc = GLES20.glGetUniformLocation(program, "uFullRange");
GLES20.glGenTextures(3, textures, 0);
for (int i = 0; i < 3; i++) {
int t = textures[i];
// Y plane (i=0) is 1:1 sized to the EGL surface, so GL_NEAREST
// samples the exact texel value at each pixel center and
// preserves luma detail. Chroma planes (i=1,2) are 4:2:0
// subsampled — bilinear filtering reconstructs the smooth
// colour transitions the encoder expected.
int filter = i == 0 ? GLES20.GL_NEAREST : GLES20.GL_LINEAR;
GLES20.glBindTexture(GLES20.GL_TEXTURE_2D, t);
GLES20.glTexParameteri(GLES20.GL_TEXTURE_2D, GLES20.GL_TEXTURE_MIN_FILTER, filter);
GLES20.glTexParameteri(GLES20.GL_TEXTURE_2D, GLES20.GL_TEXTURE_MAG_FILTER, filter);
GLES20.glTexParameteri(GLES20.GL_TEXTURE_2D, GLES20.GL_TEXTURE_WRAP_S,
GLES20.GL_CLAMP_TO_EDGE);
GLES20.glTexParameteri(GLES20.GL_TEXTURE_2D, GLES20.GL_TEXTURE_WRAP_T,
GLES20.GL_CLAMP_TO_EDGE);
}
float[] quad = pickQuad(configuredRotation);
quadBuffer =
ByteBuffer.allocateDirect(quad.length * 4)
.order(ByteOrder.nativeOrder())
.asFloatBuffer();
quadBuffer.put(quad).position(0);
GLES20.glPixelStorei(GLES20.GL_UNPACK_ALIGNMENT, 1);
}
private static int compileShader(int type, String src) throws FfmpegDecoderException {
int s = GLES20.glCreateShader(type);
GLES20.glShaderSource(s, src);
GLES20.glCompileShader(s);
int[] ok = new int[1];
GLES20.glGetShaderiv(s, GLES20.GL_COMPILE_STATUS, ok, 0);
if (ok[0] == 0) {
String log = GLES20.glGetShaderInfoLog(s);
GLES20.glDeleteShader(s);
throw new FfmpegDecoderException(
(type == GLES20.GL_VERTEX_SHADER ? "Vertex" : "Fragment")
+ " shader compile failed: "
+ log);
}
return s;
}
/**
* Uploads the YUV planes from {@code buffer} into GL textures and
* draws the conversion shader to the EGL window surface. Caller is
* responsible for calling {@code buffer.release()} afterwards.
*
* @param rotation rotation degrees from the source format. Must
* match the value passed to {@link #configure} (used here only
* to recover the CODED Y-plane dimensions — needed for the
* stride-vs-width sample-scale calculation — from the buffer's
* DISPLAY-orientation width/height).
*/
void render(VideoDecoderOutputBuffer buffer, int rotation) throws FfmpegDecoderException {
if (eglContext == EGL14.EGL_NO_CONTEXT || eglSurface == EGL14.EGL_NO_SURFACE) {
throw new FfmpegDecoderException("render called before configure");
}
if (buffer.yuvPlanes == null || buffer.yuvStrides == null) {
throw new FfmpegDecoderException("output buffer has no YUV data");
}
EGL14.eglMakeCurrent(eglDisplay, eglSurface, eglSurface, eglContext);
int yStride = buffer.yuvStrides[0];
int uvStride = buffer.yuvStrides[1];
// For 90°/270° streams the renderer swapped buffer.width/height
// to display orientation so ExoPlayer's size notification was
// correct, but the YUV planes are still stored in coded
// orientation — undo the swap here so sampleScale = codedWidth /
// yStride lands on the correct value.
boolean rotated = rotation == 90 || rotation == 270;
int codedWidth = rotated ? buffer.height : buffer.width;
int codedHeight = rotated ? buffer.width : buffer.height;
int uvHeight = (codedHeight + 1) / 2;
GLES20.glViewport(0, 0, surfaceWidth, surfaceHeight);
GLES20.glClearColor(0f, 0f, 0f, 1f);
GLES20.glClear(GLES20.GL_COLOR_BUFFER_BIT);
GLES20.glUseProgram(program);
uploadPlane(0, textures[0], buffer.yuvPlanes[0], yStride, codedHeight);
uploadPlane(1, textures[1], buffer.yuvPlanes[1], uvStride, uvHeight);
uploadPlane(2, textures[2], buffer.yuvPlanes[2], uvStride, uvHeight);
GLES20.glUniform1i(uYLoc, 0);
GLES20.glUniform1i(uULoc, 1);
GLES20.glUniform1i(uVLoc, 2);
GLES20.glUniform1f(uSampleScaleLoc, yStride > 0 ? (float) codedWidth / yStride : 1f);
GLES20.glUniform1f(uFullRangeLoc, buffer.decoderPrivate == 1L ? 1f : 0f);
quadBuffer.position(0);
GLES20.glVertexAttribPointer(aPosLoc, 2, GLES20.GL_FLOAT, false, 16, quadBuffer);
quadBuffer.position(2);
GLES20.glVertexAttribPointer(aTexLoc, 2, GLES20.GL_FLOAT, false, 16, quadBuffer);
GLES20.glEnableVertexAttribArray(aPosLoc);
GLES20.glEnableVertexAttribArray(aTexLoc);
GLES20.glDrawArrays(GLES20.GL_TRIANGLE_STRIP, 0, 4);
GLES20.glDisableVertexAttribArray(aPosLoc);
GLES20.glDisableVertexAttribArray(aTexLoc);
if (!EGL14.eglSwapBuffers(eglDisplay, eglSurface)) {
Log.w(TAG, "eglSwapBuffers failed; client may have detached the surface");
}
}
private void uploadPlane(
int unit, int texture, ByteBuffer src, int stride, int rows) {
GLES20.glActiveTexture(GLES20.GL_TEXTURE0 + unit);
GLES20.glBindTexture(GLES20.GL_TEXTURE_2D, texture);
src.position(0);
if (allocatedStride[unit] == stride && allocatedRows[unit] == rows) {
GLES20.glTexSubImage2D(
GLES20.GL_TEXTURE_2D,
0,
0, 0,
stride, rows,
GLES20.GL_LUMINANCE,
GLES20.GL_UNSIGNED_BYTE,
src);
} else {
GLES20.glTexImage2D(
GLES20.GL_TEXTURE_2D,
0,
GLES20.GL_LUMINANCE,
stride,
rows,
0,
GLES20.GL_LUMINANCE,
GLES20.GL_UNSIGNED_BYTE,
src);
allocatedStride[unit] = stride;
allocatedRows[unit] = rows;
}
}
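void release() {
if (eglDisplay != EGL14.EGL_NO_DISPLAY) {
// GL objects can only be deleted while their context is current, so
// bind it first and unbind after the deletes. If the bind fails,
// eglDestroyContext below still frees them with the context.
if (eglContext != EGL14.EGL_NO_CONTEXT
&& eglSurface != EGL14.EGL_NO_SURFACE
&& EGL14.eglMakeCurrent(eglDisplay, eglSurface, eglSurface, eglContext)) {
if (textures[0] != 0) {
GLES20.glDeleteTextures(3, textures, 0);
}
if (program != 0) {
GLES20.glDeleteProgram(program);
}
}
textures[0] = textures[1] = textures[2] = 0;
program = 0;
// Force a fresh glTexImage2D if this helper is ever configured again.
java.util.Arrays.fill(allocatedStride, -1);
java.util.Arrays.fill(allocatedRows, -1);
EGL14.eglMakeCurrent(eglDisplay, EGL14.EGL_NO_SURFACE, EGL14.EGL_NO_SURFACE,
EGL14.EGL_NO_CONTEXT);
if (eglSurface != EGL14.EGL_NO_SURFACE) {
EGL14.eglDestroySurface(eglDisplay, eglSurface);
eglSurface = EGL14.EGL_NO_SURFACE;
}
if (eglContext != EGL14.EGL_NO_CONTEXT) {
EGL14.eglDestroyContext(eglDisplay, eglContext);
eglContext = EGL14.EGL_NO_CONTEXT;
}
EGL14.eglReleaseThread();
EGL14.eglTerminate(eglDisplay);
eglDisplay = EGL14.EGL_NO_DISPLAY;
}
quadBuffer = null;
}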
}
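
As a cross-check for the fragment shader in FfmpegOutputSurface, the same BT.709 arithmetic can be mirrored on the CPU. This is a minimal sketch and not part of the commit: `Bt709Reference` is a hypothetical helper that takes 8-bit samples and reuses the shader's constants, including its 0.5 chroma offset.

```java
/** Hypothetical CPU mirror of the shader's BT.709 YUV to RGB math; for checking only. */
final class Bt709Reference {
  /** Returns {r, g, b} in 0..255 for 8-bit Y, U, V samples. */
  static int[] toRgb(int y8, int u8, int v8, boolean fullRange) {
    // Same normalisation as sampling a GL_LUMINANCE texture: byte / 255.
    float y = y8 / 255f;
    float u = u8 / 255f - 0.5f;
    float v = v8 / 255f - 0.5f;
    if (!fullRange) {
      // Limited range: luma spans 16..235, chroma spans 16..240.
      y = (y - 16f / 255f) * (255f / 219f);
      u *= 255f / 224f;
      v *= 255f / 224f;
    }
    float r = y + 1.5748f * v;
    float g = y - 0.1873f * u - 0.4681f * v;
    float b = y + 1.8556f * u;
    return new int[] {to8(r), to8(g), to8(b)};
  }

  /** Clamps to 0..1 like the shader, then quantises to 8 bits. */
  private static int to8(float c) {
    return Math.round(Math.max(0f, Math.min(1f, c)) * 255f);
  }
}
```

Limited-range (235, 128, 128) maps to white and (16, 128, 128) to near-black. Because the offset is 0.5 rather than 128/255, neutral chroma can land about one code value off in red and blue, so comparisons against this reference are best made with a small tolerance.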

@@ -0,0 +1,31 @@
/*
* Copyright 2026 swipelab.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*/
package io.swipelab.ux.video.ffmpeg;
import androidx.media3.common.util.UnstableApi;
/**
* Resizes the SurfaceTexture backing the player's output Surface
* before the FFmpeg renderer's EGL window surface is created. Without
* this the SurfaceTexture's default buffer size is whatever
* implementation default Android picked (often 1×1) — the EGL surface
* inherits that, our quad renders into one pixel, and the compositor
* stretches it across the Flutter texture.
*
* <p>The compositor's player listener also calls
* setDefaultBufferSize, but only on the main thread after the player
* fires onVideoSizeChanged. By then the FFmpeg renderer has already
* created its EGL window surface on the player thread. Threading this
* hook through ensures resize-before-window-create.
*/
@UnstableApi
public interface FfmpegSurfaceSizer {
void resize(int width, int height);
}
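
The ordering requirement described in the FfmpegSurfaceSizer Javadoc can be modelled without Android. The sketch below is illustrative only: `FakeSurfaceTexture` and `FakeWindowSurface` are hypothetical stand-ins that assume the window surface takes whatever buffer size is set at creation time, which is the behaviour the Javadoc describes.

```java
/** Hypothetical model of resize-before-window-create; no Android classes involved. */
final class SizerOrderingSketch {
  /** Stand-in for a SurfaceTexture that starts at a 1x1 default buffer size. */
  static final class FakeSurfaceTexture {
    int width = 1;
    int height = 1;

    void setDefaultBufferSize(int w, int h) {
      width = w;
      height = h;
    }
  }

  /** Stand-in for an EGL window surface: captures the buffer size at creation. */
  static final class FakeWindowSurface {
    final int width;
    final int height;

    FakeWindowSurface(FakeSurfaceTexture texture) {
      width = texture.width;
      height = texture.height;
    }
  }

  /** Same shape as FfmpegSurfaceSizer, redeclared so the sketch is self-contained. */
  interface Sizer {
    void resize(int width, int height);
  }

  /** Resizes first, then creates the window surface, all on the calling thread. */
  static FakeWindowSurface createSized(FakeSurfaceTexture texture, int width, int height) {
    Sizer sizer = texture::setDefaultBufferSize;
    sizer.resize(width, height);
    return new FakeWindowSurface(texture);
  }
}
```

Creating the window surface without the resize leaves it at 1x1 in this model, while `createSized(texture, 480, 848)` yields a 480x848 surface. Running both steps on the same thread is what makes the order deterministic.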

@@ -0,0 +1,188 @@
/*
* Copyright 2026 swipelab.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*/
package io.swipelab.ux.video.ffmpeg;
import androidx.annotation.Nullable;
import androidx.media3.common.C;
import androidx.media3.common.Format;
import androidx.media3.common.util.UnstableApi;
import androidx.media3.common.util.Util;
import androidx.media3.decoder.DecoderInputBuffer;
import androidx.media3.decoder.SimpleDecoder;
import androidx.media3.decoder.VideoDecoderOutputBuffer;
import java.nio.ByteBuffer;
import java.util.List;
/**
* FFmpeg H.264 video decoder. Mirrors the Media3 audio pattern but
* targets {@link VideoDecoderOutputBuffer}. Each decode() call maps
* onto one avcodec_send_packet followed by one avcodec_receive_frame,
* so a frame held back for reordering is emitted on a later call than
* the packet that carried it.
*/
@UnstableApi
public final class FfmpegVideoDecoder
extends SimpleDecoder<
DecoderInputBuffer, VideoDecoderOutputBuffer, FfmpegDecoderException> {
// Mirrored from ffmpeg_jni.cc.
private static final int VIDEO_DECODER_SUCCESS = 0;
private static final int VIDEO_DECODER_ERROR_INVALID_DATA = -1;
private static final int VIDEO_DECODER_ERROR_OTHER = -2;
private static final int VIDEO_DECODER_READ_AGAIN = -3;
private final String codecName;
@Nullable private final byte[] extraData;
private final long nativeContext;
private volatile @C.VideoOutputMode int outputMode;
public FfmpegVideoDecoder(
Format format, int numInputBuffers, int numOutputBuffers, int initialInputBufferSize, int threads)
throws FfmpegDecoderException {
super(
new DecoderInputBuffer[numInputBuffers],
new VideoDecoderOutputBuffer[numOutputBuffers]);
if (!FfmpegLibrary.isAvailable()) {
throw new FfmpegDecoderException("Failed to load decoder native libraries.");
}
String mime = format.sampleMimeType;
if (mime == null) {
throw new FfmpegDecoderException("Format has null sampleMimeType.");
}
String name = FfmpegLibrary.getCodecName(mime);
if (name == null) {
throw new FfmpegDecoderException("No FFmpeg codec mapped for " + mime);
}
codecName = name;
extraData = flattenInitializationData(format.initializationData);
nativeContext = ffmpegVideoInitialize(codecName, extraData, threads);
if (nativeContext == 0) {
throw new FfmpegDecoderException("Initialization failed.");
}
setInitialInputBufferSize(initialInputBufferSize);
}
@Override
public String getName() {
return "ffmpeg" + FfmpegLibrary.getVersion() + "-" + codecName;
}
@Override
protected DecoderInputBuffer createInputBuffer() {
return new DecoderInputBuffer(
DecoderInputBuffer.BUFFER_REPLACEMENT_MODE_DIRECT,
FfmpegLibrary.getInputBufferPaddingSize());
}
@Override
protected VideoDecoderOutputBuffer createOutputBuffer() {
return new VideoDecoderOutputBuffer(this::releaseOutputBuffer);
}
@Override
protected FfmpegDecoderException createUnexpectedDecodeException(Throwable error) {
return new FfmpegDecoderException("Unexpected decode error", error);
}
@Override
@Nullable
protected FfmpegDecoderException decode(
DecoderInputBuffer inputBuffer, VideoDecoderOutputBuffer outputBuffer, boolean reset) {
if (reset) {
ffmpegVideoFlush(nativeContext);
}
ByteBuffer inputData = Util.castNonNull(inputBuffer.data);
int inputSize = inputData.limit();
int sendResult =
ffmpegVideoSendPacket(nativeContext, inputData, inputSize, inputBuffer.timeUs);
if (sendResult == VIDEO_DECODER_ERROR_INVALID_DATA) {
// Treat invalid bitstream as non-fatal — match MediaCodec behavior.
outputBuffer.shouldBeSkipped = true;
return null;
} else if (sendResult == VIDEO_DECODER_ERROR_OTHER) {
return new FfmpegDecoderException("avcodec_send_packet failed (see logcat).");
}
// sendResult is VIDEO_DECODER_SUCCESS or VIDEO_DECODER_READ_AGAIN.
// EAGAIN on send means the decoder must be drained before it accepts
// more input. SimpleDecoder gives us one input and one output per
// decode() call, so this input is dropped on EAGAIN and the receive
// below drains one frame; later calls continue with fresh buffers.
//
// init() seeds outputBuffer.timeUs from the input PTS; the native
// receive path overwrites it with the decoded frame's true
// display-order PTS (recovered from libavcodec) before returning
// success.
outputBuffer.init(inputBuffer.timeUs, outputMode, /* supplementalData= */ null);
outputBuffer.format = inputBuffer.format;
int receiveResult = ffmpegVideoReceiveFrame(nativeContext, outputBuffer);
if (receiveResult == VIDEO_DECODER_READ_AGAIN) {
// No frame ready yet (decoder still building reorder buffer).
// Skip this output buffer; subsequent send/receive cycles will
// drain frames once the pipeline fills.
outputBuffer.shouldBeSkipped = true;
return null;
} else if (receiveResult == VIDEO_DECODER_ERROR_INVALID_DATA) {
outputBuffer.shouldBeSkipped = true;
return null;
} else if (receiveResult == VIDEO_DECODER_ERROR_OTHER) {
return new FfmpegDecoderException("avcodec_receive_frame failed (see logcat).");
}
return null;
}
@Override
public void release() {
super.release();
ffmpegVideoRelease(nativeContext);
}
public void setOutputMode(@C.VideoOutputMode int outputMode) {
this.outputMode = outputMode;
}
/**
* Coalesces the codec-specific data into one contiguous byte array
* for FFmpeg's extradata pointer. For AVC the data may arrive as a
* single avcC blob in slot 0 or as separate SPS/PPS NAL units with
* Annex B start codes; which one depends on the extractor. In both
* cases libavcodec auto-detects the layout from the first byte.
*/
@Nullable
private static byte[] flattenInitializationData(List<byte[]> initializationData) {
if (initializationData == null || initializationData.isEmpty()) {
return null;
}
if (initializationData.size() == 1) {
return initializationData.get(0);
}
int total = 0;
for (byte[] part : initializationData) total += part.length;
byte[] out = new byte[total];
int off = 0;
for (byte[] part : initializationData) {
System.arraycopy(part, 0, out, off, part.length);
off += part.length;
}
return out;
}
private native long ffmpegVideoInitialize(
String codecName, @Nullable byte[] extraData, int threads);
private native int ffmpegVideoSendPacket(
long context, ByteBuffer inputData, int inputSize, long ptsUs);
private native int ffmpegVideoReceiveFrame(long context, VideoDecoderOutputBuffer outputBuffer);
private native void ffmpegVideoFlush(long context);
private native void ffmpegVideoRelease(long context);
}
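
To make the extradata handling concrete, here is a small standalone sketch. `ExtradataSketch` is a hypothetical class: `flatten` follows the same concatenation idea as flattenInitializationData, and `looksLikeAvcC` is a simplified model of the first-byte check (an avcC record starts with configurationVersion 1, Annex B data starts with a zero byte), not libavcodec's actual parser.

```java
import java.util.List;

/** Hypothetical illustration of extradata flattening and layout detection. */
final class ExtradataSketch {
  /** Concatenates all codec-specific data entries into one array. */
  static byte[] flatten(List<byte[]> parts) {
    int total = 0;
    for (byte[] part : parts) {
      total += part.length;
    }
    byte[] out = new byte[total];
    int offset = 0;
    for (byte[] part : parts) {
      System.arraycopy(part, 0, out, offset, part.length);
      offset += part.length;
    }
    return out;
  }

  /** Simplified layout check: avcC begins with configurationVersion == 1. */
  static boolean looksLikeAvcC(byte[] extradata) {
    return extradata.length > 0 && extradata[0] == 1;
  }
}
```

Two Annex B entries (each beginning 00 00 00 01) flatten into one array whose first byte is 0, so the sketch classifies it as Annex B; a blob beginning with 01 is classified as avcC.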

@@ -0,0 +1,220 @@
/*
* Copyright 2026 swipelab.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*/
package io.swipelab.ux.video.ffmpeg;
import static androidx.media3.exoplayer.DecoderReuseEvaluation.DISCARD_REASON_MIME_TYPE_CHANGED;
import static androidx.media3.exoplayer.DecoderReuseEvaluation.REUSE_RESULT_NO;
import static androidx.media3.exoplayer.DecoderReuseEvaluation.REUSE_RESULT_YES_WITHOUT_RECONFIGURATION;
import android.os.Handler;
import android.view.Surface;
import androidx.annotation.Nullable;
import androidx.media3.common.C;
import androidx.media3.common.Format;
import androidx.media3.common.MimeTypes;
import androidx.media3.common.util.TraceUtil;
import androidx.media3.common.util.UnstableApi;
import androidx.media3.decoder.CryptoConfig;
import androidx.media3.decoder.DecoderException;
import androidx.media3.decoder.VideoDecoderOutputBuffer;
import androidx.media3.exoplayer.DecoderReuseEvaluation;
import androidx.media3.exoplayer.RendererCapabilities;
import androidx.media3.exoplayer.video.DecoderVideoRenderer;
import androidx.media3.exoplayer.video.VideoRendererEventListener;
import java.util.Objects;
/**
* ExoPlayer renderer that decodes video with our vendored FFmpeg
* extension and converts the YUV frame to RGB via a GLES2 shader before
* presenting on the output Surface. Slotted ahead of
* MediaCodecVideoRenderer via EXTENSION_RENDERER_MODE_PREFER so files
* that exceed platform decoder caps (deep DPB, iOS yuvj420p) work on
* devices where Media3's hardware path fails.
*/
@UnstableApi
public final class FfmpegVideoRenderer extends DecoderVideoRenderer {
private static final String TAG = "FfmpegVideoRenderer";
private static final int DEFAULT_NUM_INPUT_BUFFERS = 4;
// iOS H.264 records with max_num_reorder_frames up to 16, and our
// shouldBeSkipped-on-EAGAIN path consumes one output buffer per
// input until the DPB fills. With 8 buffers the pipeline ran out
// before steady state and the renderer dropped frames waiting on
// free output buffers; 16 covers the worst iOS reorder shape.
private static final int DEFAULT_NUM_OUTPUT_BUFFERS = 16;
// 480x848 baseline; the decoder grows the buffer if a packet exceeds it.
private static final int DEFAULT_INPUT_BUFFER_SIZE = 256 * 1024;
private final int threads;
private final int numInputBuffers;
private final int numOutputBuffers;
private final FfmpegSurfaceSizer surfaceSizer;
@Nullable private FfmpegVideoDecoder decoder;
@Nullable private FfmpegOutputSurface outputSurface;
@Nullable private Surface currentSurface;
private int surfaceWidth = -1;
private int surfaceHeight = -1;
private int surfaceRotation = 0;
public FfmpegVideoRenderer(
long allowedJoiningTimeMs,
@Nullable Handler eventHandler,
@Nullable VideoRendererEventListener eventListener,
int maxDroppedFramesToNotify,
FfmpegSurfaceSizer surfaceSizer) {
this(
allowedJoiningTimeMs,
eventHandler,
eventListener,
maxDroppedFramesToNotify,
surfaceSizer,
/* threads= */ 0,
DEFAULT_NUM_INPUT_BUFFERS,
DEFAULT_NUM_OUTPUT_BUFFERS);
}
public FfmpegVideoRenderer(
long allowedJoiningTimeMs,
@Nullable Handler eventHandler,
@Nullable VideoRendererEventListener eventListener,
int maxDroppedFramesToNotify,
FfmpegSurfaceSizer surfaceSizer,
int threads,
int numInputBuffers,
int numOutputBuffers) {
super(allowedJoiningTimeMs, eventHandler, eventListener, maxDroppedFramesToNotify);
this.threads = threads;
this.numInputBuffers = numInputBuffers;
this.numOutputBuffers = numOutputBuffers;
this.surfaceSizer = surfaceSizer;
}
@Override
public String getName() {
return TAG;
}
@Override
public @Capabilities int supportsFormat(Format format) {
String mime = format.sampleMimeType;
if (!FfmpegLibrary.isAvailable() || mime == null || !MimeTypes.isVideo(mime)) {
return RendererCapabilities.create(C.FORMAT_UNSUPPORTED_TYPE);
}
if (!FfmpegLibrary.supportsFormat(mime)) {
return RendererCapabilities.create(C.FORMAT_UNSUPPORTED_SUBTYPE);
}
if (format.cryptoType != C.CRYPTO_TYPE_NONE) {
return RendererCapabilities.create(C.FORMAT_UNSUPPORTED_DRM);
}
return RendererCapabilities.create(
C.FORMAT_HANDLED, ADAPTIVE_SEAMLESS, TUNNELING_NOT_SUPPORTED);
}
@Override
protected FfmpegVideoDecoder createDecoder(Format format, @Nullable CryptoConfig cryptoConfig)
throws FfmpegDecoderException {
TraceUtil.beginSection("createFfmpegVideoDecoder");
try {
int initialInputBufferSize =
format.maxInputSize != Format.NO_VALUE ? format.maxInputSize : DEFAULT_INPUT_BUFFER_SIZE;
FfmpegVideoDecoder d =
new FfmpegVideoDecoder(
format, numInputBuffers, numOutputBuffers, initialInputBufferSize, threads);
decoder = d;
return d;
} finally {
// Close the trace section even if the decoder constructor throws.
TraceUtil.endSection();
}
}
/// Pre-swap buffer dims for 90°/270° rotated streams so the
/// {@code maybeNotifyVideoSizeChanged} call inside the base
/// renderOutputBuffer reports DISPLAY-orientation dimensions (matching
/// what MediaCodecVideoRenderer does for the hardware path). Without
/// this swap, portrait iOS videos report their coded landscape size
/// and the downstream compositor lays out the Flutter texture
/// rotated.
@Override
protected void renderOutputBuffer(
VideoDecoderOutputBuffer outputBuffer, long presentationTimeUs, Format outputFormat)
throws DecoderException {
if (outputFormat != null
&& (outputFormat.rotationDegrees == 90 || outputFormat.rotationDegrees == 270)
&& outputBuffer.width != outputBuffer.height) {
int tmp = outputBuffer.width;
outputBuffer.width = outputBuffer.height;
outputBuffer.height = tmp;
}
super.renderOutputBuffer(outputBuffer, presentationTimeUs, outputFormat);
}
@Override
protected void renderOutputBufferToSurface(
VideoDecoderOutputBuffer outputBuffer, Surface surface) throws FfmpegDecoderException {
int rotation = outputBuffer.format != null ? outputBuffer.format.rotationDegrees : 0;
if (outputSurface == null
|| currentSurface != surface
|| surfaceWidth != outputBuffer.width
|| surfaceHeight != outputBuffer.height
|| surfaceRotation != rotation) {
releaseOutputSurface();
// Resize the SurfaceTexture before creating the EGL window
// surface — the EGL surface inherits the SurfaceTexture's buffer
// dimensions at creation and won't auto-resize later.
surfaceSizer.resize(outputBuffer.width, outputBuffer.height);
outputSurface = new FfmpegOutputSurface();
outputSurface.configure(surface, outputBuffer.width, outputBuffer.height, rotation);
currentSurface = surface;
surfaceWidth = outputBuffer.width;
surfaceHeight = outputBuffer.height;
surfaceRotation = rotation;
}
try {
outputSurface.render(outputBuffer, rotation);
} finally {
outputBuffer.release();
}
}
@Override
protected void setDecoderOutputMode(@C.VideoOutputMode int outputMode) {
if (decoder != null) {
decoder.setOutputMode(outputMode);
}
}
@Override
protected DecoderReuseEvaluation canReuseDecoder(
String decoderName, Format oldFormat, Format newFormat) {
boolean sameMime = Objects.equals(oldFormat.sampleMimeType, newFormat.sampleMimeType);
return new DecoderReuseEvaluation(
decoderName,
oldFormat,
newFormat,
sameMime ? REUSE_RESULT_YES_WITHOUT_RECONFIGURATION : REUSE_RESULT_NO,
sameMime ? 0 : DISCARD_REASON_MIME_TYPE_CHANGED);
}
@Override
protected void onDisabled() {
releaseOutputSurface();
decoder = null;
super.onDisabled();
}
private void releaseOutputSurface() {
if (outputSurface != null) {
outputSurface.release();
outputSurface = null;
}
currentSurface = null;
surfaceWidth = -1;
surfaceHeight = -1;
surfaceRotation = 0;
}
}
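
The width/height swap in renderOutputBuffer above can be read as a pure function of the coded size and the rotation. A minimal sketch of that rule follows, assuming nothing beyond what the override shows; DisplayDims and its method name are hypothetical and not part of this change.

```java
/** Hypothetical helper illustrating the rotation rule; not part of the renderer. */
final class DisplayDims {
  private DisplayDims() {}

  /**
   * Returns {width, height} in display orientation. A 90 or 270 degree
   * rotation swaps the coded dimensions; any other value leaves them as-is.
   */
  static int[] of(int codedWidth, int codedHeight, int rotationDegrees) {
    boolean quarterTurn = rotationDegrees == 90 || rotationDegrees == 270;
    return quarterTurn
        ? new int[] {codedHeight, codedWidth}
        : new int[] {codedWidth, codedHeight};
  }
}
```

For a portrait iOS clip coded as 848x480 with rotation 90, DisplayDims.of(848, 480, 90) yields {480, 848}. Unlike the in-place swap in the renderer, this sketch returns a fresh array and leaves its inputs untouched, so calling it twice cannot undo the swap.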


@@ -1,28 +1,77 @@
package io.swipelab.ux.video
import android.content.Context
import android.os.Handler
import androidx.media3.common.util.UnstableApi
import androidx.media3.exoplayer.DefaultRenderersFactory
import androidx.media3.exoplayer.Renderer
import androidx.media3.exoplayer.RenderersFactory
import androidx.media3.exoplayer.mediacodec.MediaCodecInfo
import androidx.media3.exoplayer.mediacodec.MediaCodecSelector
import androidx.media3.exoplayer.video.VideoRendererEventListener
import io.swipelab.ux.video.ffmpeg.FfmpegLibrary
import io.swipelab.ux.video.ffmpeg.FfmpegSurfaceSizer
import io.swipelab.ux.video.ffmpeg.FfmpegVideoRenderer
/// Renderer factory configured for Banlu. Two non-default tweaks:
/// Renderer factory configured for Banlu.
///
/// - `setEnableDecoderFallback(true)` — Media3's default refuses to
/// fall back if the primary decoder fails to start. On Huawei
/// EMUI (Mate 20 Pro / LYA-L29 / API 29) the hardware AVC decoder
/// `OMX.hisi.video.decoder.avc` fails codec start; without
/// fallback the surface stays black.
/// - `MediaCodecSelector` deprioritises `OMX.hisi.*` so Media3
/// picks the working software decoder first
/// (`c2.android.avc.decoder` on the affected device). The
/// hardware decoder stays as a last-resort option for devices
/// where it works correctly.
/// Video path is tiered:
/// 1. `FfmpegVideoRenderer` — vendored Media3 FFmpeg extension that
/// decodes H.264 with libavcodec and converts YUV→RGB via a GLES2
/// shader. Selected first for any MIME type FFmpeg supports. It has
/// no DPB cap and handles the iOS yuvj420p full-range streams that
/// defeat both `c2.hisi.avc.decoder` (EMUI 11 Mate 20 stalls on the
/// first sample after init) and `c2.android.avc.decoder` (Google C2
/// SW caps output delay at 8 frames; iOS H.264 with
/// `has_b_frames=16` starves the decoder until CCodec's queue
/// timeout). When
/// `libffmpegJNI.so` is missing for the current ABI, FfmpegLibrary
/// returns isAvailable()=false and selection falls through.
/// 2. `MediaCodec` path with `setEnableDecoderFallback(true)`. Init
/// failures (EMUI 10 `OMX.hisi.video.decoder.avc`) bounce to the
/// next decoder.
/// 3. `MediaCodecSelector` deprioritises every HiSilicon decoder
/// (`OMX.hisi.*` and `c2.hisi.*`) so Media3 picks
/// `c2.android.avc.decoder` ahead of `c2.hisi.avc.decoder` when
/// FFmpeg isn't available. Without this, the EMUI 11 C2 Hisi
/// decoder is chosen and fails after init — which decoder
/// fallback doesn't catch.
@UnstableApi
internal object Renderers {
fun build(context: Context): RenderersFactory {
return DefaultRenderersFactory(context)
fun build(context: Context, surfaceSizer: FfmpegSurfaceSizer): RenderersFactory {
return object : DefaultRenderersFactory(context) {
override fun buildVideoRenderers(
context: Context,
extensionRendererMode: Int,
mediaCodecSelector: MediaCodecSelector,
enableDecoderFallback: Boolean,
eventHandler: Handler,
eventListener: VideoRendererEventListener,
allowedVideoJoiningTimeMs: Long,
out: ArrayList<Renderer>,
) {
if (FfmpegLibrary.isAvailable()) {
out.add(
FfmpegVideoRenderer(
allowedVideoJoiningTimeMs,
eventHandler,
eventListener,
MAX_DROPPED_VIDEO_FRAME_COUNT_TO_NOTIFY,
surfaceSizer,
),
)
}
super.buildVideoRenderers(
context,
extensionRendererMode,
mediaCodecSelector,
enableDecoderFallback,
eventHandler,
eventListener,
allowedVideoJoiningTimeMs,
out,
)
}
}
.setEnableDecoderFallback(true)
.setMediaCodecSelector { mimeType, requiresSecure, requiresTunneling ->
val infos = MediaCodecSelector.DEFAULT.getDecoderInfos(
@@ -31,7 +80,12 @@ internal object Renderers {
val ok = ArrayList<MediaCodecInfo>(infos.size)
val broken = ArrayList<MediaCodecInfo>()
for (info in infos) {
if (info.name.startsWith("OMX.hisi.")) broken.add(info) else ok.add(info)
val name = info.name.lowercase()
if (name.startsWith("omx.hisi.") || name.startsWith("c2.hisi.")) {
broken.add(info)
} else {
ok.add(info)
}
}
ok.addAll(broken)
ok
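
The selector above is a stable partition: decoders keep their relative order, and HiSilicon ones move to the back. A minimal sketch of the same ordering rule follows. It assumes plain decoder-name strings in place of MediaCodecInfo so it runs off-device; DecoderOrder and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical off-device illustration of the deprioritisation rule. */
final class DecoderOrder {
  private DecoderOrder() {}

  /** True for names with the OMX.hisi. or c2.hisi. prefix, ignoring case. */
  static boolean isHiSilicon(String decoderName) {
    String name = decoderName.toLowerCase(Locale.ROOT);
    return name.startsWith("omx.hisi.") || name.startsWith("c2.hisi.");
  }

  /** Moves HiSilicon decoders to the end, preserving order within each group. */
  static List<String> deprioritise(List<String> decoderNames) {
    List<String> ok = new ArrayList<>(decoderNames.size());
    List<String> broken = new ArrayList<>();
    for (String name : decoderNames) {
      (isHiSilicon(name) ? broken : ok).add(name);
    }
    ok.addAll(broken);
    return ok;
  }
}
```

Keeping the HiSilicon entries at the end of the list, rather than dropping them, matches the intent stated in the doc comment: they stay available as a last resort on devices where they work.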


@@ -77,7 +77,7 @@ internal class VideoCompositor(
private var eglPbufferSurface: EGLSurface = EGL14.EGL_NO_SURFACE
private var eglWindowSurface: EGLSurface = EGL14.EGL_NO_SURFACE
private var inputTextureId: Int = 0
private var inputSurfaceTexture: SurfaceTexture? = null
@Volatile private var inputSurfaceTexture: SurfaceTexture? = null
private var inputSurface: Surface? = null
private var program: Int = 0
@@ -128,6 +128,23 @@ internal class VideoCompositor(
}
}
/// Sizes the codec-input SurfaceTexture's buffer dimensions. For
/// MediaCodec this is unnecessary: the codec writes via native
/// surface APIs that auto-size the SurfaceTexture buffer to the
/// encoded frame. On the FFmpeg path the renderer writes via an EGL
/// window surface (created from this Surface), and
/// eglCreateWindowSurface inherits the SurfaceTexture's
/// defaultBufferSize at creation time and never re-queries it.
/// Without this hook the EGL surface is created at whatever default
/// size the SurfaceTexture has (1×1 in practice), so the rendered
/// quad is squashed to one pixel, which the downstream blit then
/// stretches across the whole Flutter texture; the result looks like
/// a solid fill. Safe to call from any thread.
fun setInputSurfaceSize(width: Int, height: Int) {
if (disposed) return
if (width <= 0 || height <= 0) return
inputSurfaceTexture?.setDefaultBufferSize(width, height)
}
fun setDisplaySize(width: Int, height: Int) {
if (disposed) return
if (width <= 0 || height <= 0) return
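
The guard pattern used by setInputSurfaceSize (bail out when disposed, reject non-positive sizes, then forward to a target that may be null) can be sketched off-device as below. GuardedSizer and its nested Sink interface are hypothetical stand-ins for the compositor and the SurfaceTexture, introduced only for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical illustration of the dispose/size/null guards; not the compositor. */
final class GuardedSizer {
  /** Stand-in for the object whose buffer size is being set. */
  interface Sink {
    void setDefaultBufferSize(int width, int height);
  }

  private final AtomicBoolean disposed = new AtomicBoolean(false);
  // Volatile because the renderer thread may read it while another thread clears it.
  private volatile Sink sink;

  GuardedSizer(Sink sink) {
    this.sink = sink;
  }

  /** Forwards the size and returns true, or returns false if any guard rejects it. */
  boolean resize(int width, int height) {
    if (disposed.get()) return false;
    if (width <= 0 || height <= 0) return false;
    Sink s = sink; // read the volatile field once, then use the local copy
    if (s == null) return false;
    s.setDefaultBufferSize(width, height);
    return true;
  }

  void dispose() {
    disposed.set(true);
    sink = null;
  }
}
```

Reading the volatile field into a local before the null check is what makes the call safe against a concurrent dispose in this sketch; Kotlin's `?.` on a @Volatile property does the same single read.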


@@ -35,7 +35,6 @@ internal class VideoPlayerInstance(
val textureId: Long get() = textureEntry.id()
private val mainHandler = Handler(Looper.getMainLooper())
private val player: ExoPlayer = ExoPlayer.Builder(context, Renderers.build(context)).build()
/// Wrapped around the Flutter `SurfaceTextureEntry`'s `SurfaceTexture`.
/// We deliberately do NOT call `setDefaultBufferSize` here — that
@@ -77,6 +76,15 @@ internal class VideoPlayerInstance(
},
)
private val player: ExoPlayer = ExoPlayer.Builder(
context,
// FFmpeg renderer calls this on its own thread before it builds
// an EGL window surface against the codec-input Surface. Sizes
// the compositor's INPUT SurfaceTexture so the EGL surface
// inherits matching buffer dimensions on creation.
Renderers.build(context) { w, h -> compositor.setInputSurfaceSize(w, h) },
).build()
private var disposed = false
private var firstFrameRendered = false
private var stateReady = false