3.4.3 JDK API Guide (with Python Bindings)
Last Updated: 2025/09/25
Introduction
This section explains the core design and usage of the JDK API and its Python bindings. It covers:
- Data type definitions
- C++ classes for multimedia (capture, encode/decode, image processing, display output)
- Python bindings and End-to-End examples
The goal is to help developers quickly build and integrate multimedia applications.
Data Type Definitions (data_type)
These are the basic enums and structs used by the API.
Enum: media_type
Device media type enumeration:
Value | Description |
---|---|
MEDIA_TYPE_CANT_STAT | Cannot get device status |
MEDIA_TYPE_UNKNOWN | Unknown |
MEDIA_TYPE_VIDEO | Video |
MEDIA_TYPE_VBI | VBI (Vertical Blanking Interval) |
MEDIA_TYPE_RADIO | Radio |
MEDIA_TYPE_SDR | SDR (Software-Defined Radio) |
MEDIA_TYPE_TOUCH | Touch input |
MEDIA_TYPE_SUBDEV | Sub-device |
MEDIA_TYPE_DVB_FRONTEND | Digital TV frontend |
MEDIA_TYPE_DVB_DEMUX | Digital TV demultiplexer |
MEDIA_TYPE_DVB_DVR | Digital TV recorder |
MEDIA_TYPE_DVB_NET | Digital TV network |
MEDIA_TYPE_DTV_CA | Digital TV conditional access |
MEDIA_TYPE_MEDIA | Media device |
Enum: codec_type
Indicates whether device/context is for encoding or decoding:
Value | Description |
---|---|
NOT_CODEC | Not a codec |
CODEC_DEC | Decoder |
CODEC_ENC | Encoder |
Struct: v4l2_ctx
V4L2 capture and encoding context structure:
struct v4l2_ctx {
int fd; // Device file handle
unsigned int width; // Video width
unsigned int height; // Video height
unsigned int pixelformat; // Input pixel format
unsigned int out_pixelformat; // Output pixel format
int nplanes; // Input plane count
int out_nplanes; // Output plane count
struct buffer* cap_buffers; // Capture buffer array
struct buffer* out_buffers; // Output buffer array
__u32 bytesperline[VIDEO_MAX_PLANES]; // Bytes per line (input)
__u32 out_bytesperline[VIDEO_MAX_PLANES]; // Bytes per line (output)
FILE* file[2]; // Input/output file pointers
int verbose; // Log level
enum codec_type ctype; // Codec type (encode/decode)
};
Core C++ API
Main classes for multimedia processing:
- Frames (JdkFrame)
- Camera input (JdkCamera)
- Decoder/Encoder (JdkDecoder, JdkEncoder)
- Display output (JdkVo, JdkDrm)
- Image processing (JdkV2D)
JdkFrame: Image Frame Wrapper
class JdkFrame {
public:
JdkFrame(int dma_fd_, size_t size_, int w, int h);
~JdkFrame();
// Map DMA buffer to CPU memory and return pointer
unsigned char* toHost() const;
// Return data copy
std::vector<unsigned char> Clone() const;
// Save as NV12 format .yuv file
bool saveToFile(const std::string& filename) const;
// Load data from file (paired with saveToFile)
bool loadFromFile(const std::string& filename, size_t expected_size);
// Get underlying DMA FD
int getDMAFd() const;
// Get buffer size
size_t getSize() const { return size_; }
// Get resolution
int getWidth() const { return width_; }
int getHeight() const { return height_; }
// Copy raw NALU data to internal buffer (e.g., after encoding)
// offset: target buffer offset
int MemCopy(const uint8_t* nalu, int nalu_size, int offset = 0);
private:
size_t size_; // Total buffer size
int width_;
int height_;
JdkDma dma_; // DMA sync helper
std::shared_ptr<JdkDmaBuffer> data; // Underlying DMA buffer
};
using JdkFramePtr = std::shared_ptr<JdkFrame>;
JdkDma and JdkDmaBuffer
class JdkDmaBuffer {
public:
// Constructor with DMA buffer allocation
explicit JdkDmaBuffer(size_t size);
~JdkDmaBuffer();
// Return mapped userspace address
void* data() const;
// Fill entire buffer with value
void fill(uint8_t val);
// Map the physical address (result stored in m_phys)
void map_phys_addr();
// Public fields (read-only)
size_t m_size;
uint64_t m_phys;
};
class JdkDma {
public:
// Asynchronous DMA data copy
int Asyn(const JdkDmaBuffer& dst, const JdkDmaBuffer& src, size_t size);
// DMA copy between FDs
int Asyn(const int& dst_fd, const int& src_fd, size_t size);
};
JdkCamera
class JdkCamera {
public:
/**
* Create and open V4L2 device
* @param device Device path (e.g., "/dev/video0")
* @param width Desired capture width
* @param height Desired capture height
* @param pixfmt V4L2 pixel format (e.g., V4L2_PIX_FMT_NV12)
* @param req_count Requested buffer count (default: 4)
* @return JdkCameraPtr on success, nullptr otherwise
*/
static std::shared_ptr<JdkCamera> create(const std::string& device,
int width,
int height,
__u32 pixfmt,
int req_count = 4);
/** Get one frame (blocking) */
JdkFramePtr getFrame();
~JdkCamera();
private:
explicit JdkCamera(const std::string& device);
class Impl;
std::unique_ptr<Impl> impl_;
};
using JdkCameraPtr = std::shared_ptr<JdkCamera>;
JdkDecoder
class JdkDecoder {
public:
/**
* Initialize hardware decoder
* @param width Output resolution width
* @param height Output resolution height
* @param payload Input stream type (see MppCodingType)
* @param Format Output pixel format (default: NV12)
*/
JdkDecoder(int width, int height,
MppCodingType payload,
MppPixelFormat Format = PIXEL_FORMAT_NV12);
~JdkDecoder();
/** Decode from wrapped frame */
std::shared_ptr<JdkFrame> Decode(std::shared_ptr<JdkFrame> frame);
/** Decode from raw NALU data */
std::shared_ptr<JdkFrame> Decode(const uint8_t* nalu, int nalu_size);
private:
int width_;
int height_;
MppCodingType payload_;
int format_;
int channel_id_;
MppVdecCtx* pVdecCtx = nullptr;
};
JdkEncoder
class JdkEncoder {
public:
/**
* Initialize hardware encoder
* @param width Input resolution width
* @param height Input resolution height
* @param payload Output stream type (see MppCodingType)
* @param Format Input pixel format (default: NV12)
*/
JdkEncoder(int width, int height,
MppCodingType payload,
MppPixelFormat Format = PIXEL_FORMAT_NV12);
~JdkEncoder();
/** Encode raw frame to compressed stream */
std::shared_ptr<JdkFrame> Encode(std::shared_ptr<JdkFrame> frame);
private:
int width_;
int height_;
MppCodingType payload_;
int format_;
int encoder_id_ = 0;
MppVencCtx* pVencCtx = nullptr;
};
JdkDrm
/** Supported pixel formats */
enum class PixelFmt : uint32_t {
NV12 = DRM_FORMAT_NV12
};
class JdkDrm {
public:
/**
* Open DRM device and initialize
* @param width Display width
* @param height Display height
* @param stride Line stride (bytes)
* @param fmt Pixel format
* @param device DRM device path (default: "/dev/dri/card0")
*/
JdkDrm(int width, int height, int stride,
PixelFmt fmt = PixelFmt::NV12,
const char* device = "/dev/dri/card0");
~JdkDrm();
/** Send frame to DRM display */
int sendFrame(std::shared_ptr<JdkFrame> frame);
/** Destroy specified framebuffer */
void destroyFb(uint32_t fb, uint32_t handle);
/** Open DRM device */
int openCard(const char* dev);
/** Automatically select suitable connector/crtc/plane */
int pickConnectorCrtcPlane();
/** Import DMA FD as DRM framebuffer */
int importFb(int dma_fd, uint32_t& fb_id, uint32_t& handle);
private:
struct LastFB {
uint32_t fb_id;
uint32_t handle;
int dma_fd;
} last_;
};
JdkV2D
/** Supported output pixel formats (see full header file for all options) */
enum V2DFormat {
// Example: V2D_NV12, V2D_RGB888, ……
};
/** Rectangle definition */
struct V2DRect {
int x, y, width, height;
};
class JdkV2D {
public:
JdkV2D() = default;
~JdkV2D() = default;
/** Convert image format */
JdkFramePtr convert_format(const JdkFramePtr& input,
V2DFormat out_format);
/** Resize image */
JdkFramePtr resize(const JdkFramePtr& input,
int out_width, int out_height);
/** Resize and convert format in one step */
JdkFramePtr resize_and_convert(const JdkFramePtr& input,
int out_width, int out_height,
V2DFormat out_format);
/** Fill a rectangle area */
bool fill_rect(const JdkFramePtr& image,
const V2DRect& rect,
uint32_t rgba_color);
/** Draw a rectangle border */
bool draw_rect(const JdkFramePtr& image,
const V2DRect& rect,
uint32_t rgba_color,
int thickness = 2);
/** Draw multiple rectangles */
bool draw_rects(const JdkFramePtr& image,
const std::vector<V2DRect>& rects,
uint32_t rgba_color,
int thickness = 2);
/** Blend two images (overlay `top` onto `bottom`) */
JdkFramePtr blend(const JdkFramePtr& bottom,
const JdkFramePtr& top);
};
JdkVo
class JdkVo {
public:
/**
* Initialize Video Output (Vo)
* @param width Output frame width
* @param height Output frame height
* @param Format Pixel format (default: NV12)
*/
JdkVo(int width, int height,
MppPixelFormat Format = PIXEL_FORMAT_NV12);
~JdkVo();
/** Send frame to video output hardware */
int sendFrame(std::shared_ptr<JdkFrame> frame);
private:
int width_;
int height_;
MppPixelFormat format_;
int channel_id_;
MppVoCtx* pVoCtx = nullptr;
};
Python Bindings (pyjdk)
Install and Import
# Download and install
wget https://gitlab.dc.com:8443/bianbu/bianbu-linux/jdk/-/blob/main/pyjdk/pyjdk-0.1.0-cp312-cp312-linux_riscv64.whl
pip install pyjdk-0.1.0-cp312-cp312-linux_riscv64.whl
# Import
import pyjdk as jdk
Available Enums
- jdk.PixelFormat: NV12, MJPEG, JPEG (corresponds to V4L2 FourCC)
- jdk.CodingType: H264, H265, JPEG, MJPEG
- jdk.MppPixelFormat: NV12, NV21
- jdk.V2DFormat: common formats like RGB888 (other values depend on the actual build output)
- jdk.DrmPixelFormat: NV12
jdk.V2DRect
r = jdk.V2DRect(x, y, w, h)
# Also accepts (x,y,w,h) tuple/list/dict in drawing functions
jdk.Dma
dma = jdk.Dma()
dma.asyn(dst_fd: int, src_fd: int, size: int) -> int # Asynchronous DMA copy (wraps JdkDma::Asyn)
jdk.Frame (equivalent to JdkFrame)
f = jdk.Frame(dma_fd: int, size: int, width: int, height: int)
# Read-only
f.dma_fd: int
f.size: int
f.width: int
f.height: int
# I/O and Views
f.save(path: str) -> bool # Save underlying buffer (NV12/raw/bitstream)
f.load_from_file(path: str, expected_size: int) -> bool
f.to_numpy_nv12(copy: bool = False) -> (y, uv) # Zero-copy or deep-copy to numpy (NV12 two-plane format)
f.to_bytes() -> bytes # Directly export underlying buffer
f.mem_copy(src: bytes|bytearray|memoryview, offset: int = 0) -> int
# Resource management
f.release() # Immediately release the underlying buffer (QBUF)
# Supports 'with' syntax — automatically releases when exiting scope
with f:
y, uv = f.to_numpy_nv12()
Note:
- to_numpy_nv12(copy=False) returns zero-copy views of y/uv bound to the underlying buffer.
- The lifetime of these views is tied to the numpy objects.
- To get an independent copy, set copy=True.
Camera Capture (MIPI / USB)
jdk.MipiCam
cam = jdk.MipiCam.create(device: str, width: int, height: int,
fourcc: jdk.PixelFormat = jdk.PixelFormat.NV12,
req_count: int = 4) -> jdk.MipiCam
frame = cam.get_frame() # Blocking frame capture, returns jdk.Frame
# Also supports iteration: for f in cam — to capture frames continuously
# Supports 'with cam:' syntax for automatic resource management
jdk.UsbCam
uc = jdk.UsbCam.create("/dev/video20", 1280, 720, jdk.PixelFormat.MJPEG)
f = uc.get_frame() # Returns MJPEG bitstream frame (can be decoded with Decoder)
See examples: mipi_cam.py, usb_cam.py.
Encoder jdk.Encoder
enc = jdk.Encoder(width: int, height: int,
coding: jdk.CodingType = jdk.CodingType.H264,
pixfmt: jdk.MppPixelFormat = jdk.MppPixelFormat.NV12)
pkt = enc.encode(frame: jdk.Frame) -> jdk.Frame # return bitstream in Frame
Note:
- Only the encode(...) method is exposed in the current version.
- If your local script uses encode_frame(...), change it to encode(...).
- The encode_frame name in encode_h264.py is outdated and should be updated.
Decoder jdk.Decoder
dec = jdk.Decoder(width: int, height: int,
coding: jdk.CodingType = jdk.CodingType.JPEG,
pixfmt: jdk.MppPixelFormat = jdk.MppPixelFormat.NV12)
# 1) Decode from "bitstream frame"
yuv = dec.decode(bitstream_frame: jdk.Frame) -> jdk.Frame
# 2) Decode directly from bytes-like object (bytes/bytearray/memoryview)
yuv = dec.decode(bitstream: bytes|bytearray|memoryview) -> jdk.Frame
See examples: decode_jpeg.py, encode_decode.py.
Image Processing jdk.V2D
v2d = jdk.V2D()
out1 = v2d.convert_format(input: jdk.Frame, out_format: jdk.V2DFormat) -> jdk.Frame
out2 = v2d.resize(input: jdk.Frame, out_width: int, out_height: int) -> jdk.Frame
out3 = v2d.resize_and_convert(input: jdk.Frame, out_width: int, out_height: int,
out_format: jdk.V2DFormat) -> jdk.Frame
v2d.fill_rect(image: jdk.Frame, rect: jdk.V2DRect, rgba_color: int) -> bool
v2d.draw_rect(image: jdk.Frame, rect: jdk.V2DRect, rgba_color: int, thickness: int = 2) -> bool
v2d.draw_rects(image: jdk.Frame, rects: list[jdk.V2DRect], rgba_color: int, thickness: int = 2) -> bool
mixed = v2d.blend(bottom: jdk.Frame, top: jdk.Frame) -> jdk.Frame
See example: v2d_demo.py. Pass rgba_color as 0xAARRGGBB.
Display Output
jdk.Vo
vo = jdk.Vo(width: int, height: int, pixfmt: jdk.MppPixelFormat = jdk.MppPixelFormat.NV12)
vo.send_frame(frame: jdk.Frame) -> int
jdk.Drm
drm = jdk.Drm(width: int, height: int,
stride: int = 0,
pixfmt: jdk.DrmPixelFormat = jdk.DrmPixelFormat.NV12,
card: str = "/dev/dri/card0")
drm.send_frame(frame: jdk.Frame) -> int
See examples: jdk_vo.py, jdk_drm.py.
End-to-End Examples (same as the source code)
MIPI Capture → Encode → Decode (from encode_decode.py)
cam = jdk.MipiCam.create("/dev/video50", 1920, 1080, jdk.PixelFormat.NV12)
enc = jdk.Encoder(1920, 1080, jdk.CodingType.H264, jdk.MppPixelFormat.NV12)
dec = jdk.Decoder(1920, 1080, jdk.CodingType.H264, jdk.MppPixelFormat.NV12)
for _ in range(60):
f = cam.get_frame()
pkt = enc.encode(f)
yuv = dec.decode(pkt)
Decode JPEG/MJPEG from bytes (from decode_jpeg.py)
bs_frame = jdk.Frame(-1, size, 1920, 1080)
bs_frame.load_from_file("examples/data/1920x1080.jpg", expected_size=size)
dec = jdk.Decoder(1920, 1080, jdk.CodingType.MJPEG, jdk.MppPixelFormat.NV12)
yuv = dec.decode(bs_frame)
NV12 → RGB888 conversion + rectangle drawing (from v2d_demo.py)
f = jdk.Frame(-1, w*h*3//2, w, h)
f.load_from_file("frame_1920x1080_nv12.yuv", expected_size=w*h*3//2)
v2d = jdk.V2D()
rgb = v2d.convert_format(f, jdk.V2DFormat.RGB888)
v2d.draw_rects(f, [jdk.V2DRect(30,20,100,80)], 0xFFFFFF00, 4)
Error Handling and Performance Notes
- All blocking or compute-intensive operations (frame capture, encoding/decoding, and V2D processing) release the Global Interpreter Lock (GIL) at the C++ level, improving multi-threaded performance in Python.
- Frame.to_numpy_nv12(copy=False) returns zero-copy views of the underlying buffer. Manage the lifecycle carefully:
  - Call f.release() when done, OR
  - Use the frame within a with f: block for automatic cleanup.
- Call
- The decoder accepts two input types, Frame and bytes-like objects, making it easy to adapt decoding for both network streams and file streams.