
3.4.3 JDK API Guide (with Python Bindings)

Last updated: 2025/09/25

Introduction

This section explains the core design and usage of the JDK API and its Python bindings. It covers:

  • Data type definitions
  • C++ classes for multimedia (capture, encode/decode, image processing, display output)
  • Python bindings and End-to-End examples

The goal is to help developers quickly build and integrate multimedia applications.

Data Type Definitions (data_type)

These are the basic enums and structs used by the API.

Enum: media_type

Device media type enumeration:

Value                        Description
MEDIA_TYPE_CANT_STAT         Cannot get device status
MEDIA_TYPE_UNKNOWN           Unknown
MEDIA_TYPE_VIDEO             Video
MEDIA_TYPE_VBI               VBI (Vertical Blanking Interval)
MEDIA_TYPE_RADIO             Radio
MEDIA_TYPE_SDR               SDR (Software Defined Radio)
MEDIA_TYPE_TOUCH             Touch input
MEDIA_TYPE_SUBDEV            Sub-device
MEDIA_TYPE_DVB_FRONTEND      Digital TV frontend
MEDIA_TYPE_DVB_DEMUX         Digital TV demultiplexer
MEDIA_TYPE_DVB_DVR           Digital TV recorder
MEDIA_TYPE_DVB_NET           Digital TV network
MEDIA_TYPE_DTV_CA            Digital TV conditional access
MEDIA_TYPE_MEDIA             Media device

Enum: codec_type

Indicates whether a device/context is used for encoding or decoding:

Value        Description
NOT_CODEC    Not a codec
CODEC_DEC    Decoder
CODEC_ENC    Encoder

Struct: v4l2_ctx

V4L2 capture and encoding context structure:

struct v4l2_ctx {
    int fd;                                    // Device file handle
    unsigned int width;                        // Video width
    unsigned int height;                       // Video height
    unsigned int pixelformat;                  // Input pixel format
    unsigned int out_pixelformat;              // Output pixel format
    int nplanes;                               // Input plane count
    int out_nplanes;                           // Output plane count
    struct buffer* cap_buffers;                // Capture buffer array
    struct buffer* out_buffers;                // Output buffer array
    __u32 bytesperline[VIDEO_MAX_PLANES];      // Bytes per line (input)
    __u32 out_bytesperline[VIDEO_MAX_PLANES];  // Bytes per line (output)
    FILE* file[2];                             // Input/output file pointers
    int verbose;                               // Log level
    enum codec_type ctype;                     // Codec type (encode/decode)
};
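The plane counts and bytesperline fields above drive buffer sizing. As a worked example (plain Python, independent of this API; the helper name is ours), NV12 stores a full-resolution Y plane followed by a half-height interleaved UV plane:

```python
def nv12_plane_sizes(width: int, height: int, stride: int = 0) -> tuple[int, int]:
    """Return (y_size, uv_size) in bytes for an NV12 image.

    stride is bytes per line; it defaults to the width, but hardware
    may pad each line, which is what bytesperline reports.
    """
    if stride == 0:
        stride = width
    y_size = stride * height           # full-resolution luma plane
    uv_size = stride * (height // 2)   # interleaved CbCr plane at half vertical resolution
    return y_size, uv_size

y, uv = nv12_plane_sizes(1920, 1080)
total = y + uv   # equals 1920 * 1080 * 3 // 2 when stride == width
```

This is why the later examples allocate frames of size `w*h*3//2`.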

Core C++ API

Main classes for multimedia processing:

  • Frames (JdkFrame)
  • Camera input (JdkCamera)
  • Decoder/Encoder (JdkDecoder, JdkEncoder)
  • Display output (JdkVo, JdkDrm)
  • Image processing (JdkV2D)

JdkFrame: Image Frame Wrapper

class JdkFrame {
public:
    JdkFrame(int dma_fd_, size_t size_, int w, int h);
    ~JdkFrame();

    // Map DMA buffer to CPU memory and return pointer
    unsigned char* toHost() const;
    // Return a copy of the data
    std::vector<unsigned char> Clone() const;
    // Save as NV12-format .yuv file
    bool saveToFile(const std::string& filename) const;
    // Load data from file (paired with saveToFile)
    bool loadFromFile(const std::string& filename, size_t expected_size);

    // Get underlying DMA FD
    int getDMAFd() const;
    // Get buffer size
    size_t getSize() const { return size_; }
    // Get resolution
    int getWidth() const { return width_; }
    int getHeight() const { return height_; }

    // Copy raw NALU data into the internal buffer (e.g., after encoding)
    // offset: target buffer offset
    int MemCopy(const uint8_t* nalu, int nalu_size, int offset = 0);

private:
    size_t size_;                        // Total buffer size
    int width_;
    int height_;
    JdkDma dma_;                         // DMA sync helper
    std::shared_ptr<JdkDmaBuffer> data;  // Underlying DMA buffer
};

using JdkFramePtr = std::shared_ptr<JdkFrame>;

JdkDma and JdkDmaBuffer

class JdkDmaBuffer {
public:
    // Constructor that allocates the DMA buffer
    explicit JdkDmaBuffer(size_t size);
    ~JdkDmaBuffer();

    // Return mapped userspace address
    void* data() const;
    // Fill entire buffer with value
    void fill(uint8_t val);

    // Resolve the physical address (call this before reading m_phys)
    void map_phys_addr();

    // Public fields (read-only)
    size_t m_size;
    uint64_t m_phys;
};

class JdkDma {
public:
    // Asynchronous DMA data copy
    int Asyn(const JdkDmaBuffer& dst, const JdkDmaBuffer& src, size_t size);
    // DMA copy between FDs
    int Asyn(const int& dst_fd, const int& src_fd, size_t size);
};

JdkCamera

class JdkCamera {
public:
    /**
     * Create and open V4L2 device
     * @param device Device path (e.g., "/dev/video0")
     * @param width Desired capture width
     * @param height Desired capture height
     * @param pixfmt V4L2 pixel format (e.g., V4L2_PIX_FMT_NV12)
     * @param req_count Requested buffer count (default: 4)
     * @return JdkCameraPtr on success, nullptr otherwise
     */
    static std::shared_ptr<JdkCamera> create(const std::string& device,
                                             int width,
                                             int height,
                                             __u32 pixfmt,
                                             int req_count = 4);
    /** Get one frame (blocking) */
    JdkFramePtr getFrame();

    ~JdkCamera();

private:
    explicit JdkCamera(const std::string& device);
    class Impl;
    std::unique_ptr<Impl> impl_;
};
using JdkCameraPtr = std::shared_ptr<JdkCamera>;

JdkDecoder

class JdkDecoder {
public:
    /**
     * Initialize hardware decoder
     * @param width Output resolution width
     * @param height Output resolution height
     * @param payload Input stream type (see MppCodingType)
     * @param Format Output pixel format (default: NV12)
     */
    JdkDecoder(int width, int height,
               MppCodingType payload,
               MppPixelFormat Format = PIXEL_FORMAT_NV12);
    ~JdkDecoder();

    /** Decode from wrapped frame */
    std::shared_ptr<JdkFrame> Decode(std::shared_ptr<JdkFrame> frame);
    /** Decode from raw NALU data */
    std::shared_ptr<JdkFrame> Decode(const uint8_t* nalu, int nalu_size);

private:
    int width_;
    int height_;
    MppCodingType payload_;
    int format_;
    int channel_id_;
    MppVdecCtx* pVdecCtx = nullptr;
};

JdkEncoder

class JdkEncoder {
public:
    /**
     * Initialize hardware encoder
     * @param width Input resolution width
     * @param height Input resolution height
     * @param payload Output stream type (see MppCodingType)
     * @param Format Input pixel format (default: NV12)
     */
    JdkEncoder(int width, int height,
               MppCodingType payload,
               MppPixelFormat Format = PIXEL_FORMAT_NV12);
    ~JdkEncoder();

    /** Encode raw frame to compressed stream */
    std::shared_ptr<JdkFrame> Encode(std::shared_ptr<JdkFrame> frame);

private:
    int width_;
    int height_;
    MppCodingType payload_;
    int format_;
    int encoder_id_ = 0;
    MppVencCtx* pVencCtx = nullptr;
};

JdkDrm

/** Supported pixel formats */
enum class PixelFmt : uint32_t {
    NV12 = DRM_FORMAT_NV12
};

class JdkDrm {
public:
    /**
     * Open DRM device and initialize
     * @param width Display width
     * @param height Display height
     * @param stride Line stride (bytes)
     * @param fmt Pixel format
     * @param device DRM device path (default: "/dev/dri/card0")
     */
    JdkDrm(int width, int height, int stride,
           PixelFmt fmt = PixelFmt::NV12,
           const char* device = "/dev/dri/card0");
    ~JdkDrm();

    /** Send frame to DRM display */
    int sendFrame(std::shared_ptr<JdkFrame> frame);
    /** Destroy specified framebuffer */
    void destroyFb(uint32_t fb, uint32_t handle);
    /** Open DRM device */
    int openCard(const char* dev);
    /** Automatically select a suitable connector/CRTC/plane */
    int pickConnectorCrtcPlane();
    /** Import DMA FD as DRM framebuffer */
    int importFb(int dma_fd, uint32_t& fb_id, uint32_t& handle);

private:
    struct LastFB {
        uint32_t fb_id;
        uint32_t handle;
        int dma_fd;
    } last_;
};

JdkV2D

/** Supported output pixel formats (see the full header file for all options) */
enum V2DFormat {
    // Example: V2D_NV12, V2D_RGB888, ...
};

/** Rectangle definition */
struct V2DRect {
    int x, y, width, height;
};

class JdkV2D {
public:
    JdkV2D() = default;
    ~JdkV2D() = default;

    /** Convert image format */
    JdkFramePtr convert_format(const JdkFramePtr& input,
                               V2DFormat out_format);
    /** Resize image */
    JdkFramePtr resize(const JdkFramePtr& input,
                       int out_width, int out_height);
    /** Resize and convert format in one step */
    JdkFramePtr resize_and_convert(const JdkFramePtr& input,
                                   int out_width, int out_height,
                                   V2DFormat out_format);
    /** Fill a rectangle area */
    bool fill_rect(const JdkFramePtr& image,
                   const V2DRect& rect,
                   uint32_t rgba_color);
    /** Draw a rectangle border */
    bool draw_rect(const JdkFramePtr& image,
                   const V2DRect& rect,
                   uint32_t rgba_color,
                   int thickness = 2);
    /** Draw multiple rectangles */
    bool draw_rects(const JdkFramePtr& image,
                    const std::vector<V2DRect>& rects,
                    uint32_t rgba_color,
                    int thickness = 2);
    /** Blend two images (overlay `top` onto `bottom`) */
    JdkFramePtr blend(const JdkFramePtr& bottom,
                      const JdkFramePtr& top);
};

JdkVo

class JdkVo {
public:
    /**
     * Initialize Video Output (Vo)
     * @param width Output frame width
     * @param height Output frame height
     * @param Format Pixel format (default: NV12)
     */
    JdkVo(int width, int height,
          MppPixelFormat Format = PIXEL_FORMAT_NV12);
    ~JdkVo();

    /** Send frame to video output hardware */
    int sendFrame(std::shared_ptr<JdkFrame> frame);

private:
    int width_;
    int height_;
    MppPixelFormat format_;
    int channel_id_;
    MppVoCtx* pVoCtx = nullptr;
};

Python Bindings (pyjdk)

Install and Import

# Download and install
wget https://gitlab.dc.com:8443/bianbu/bianbu-linux/jdk/-/blob/main/pyjdk/pyjdk-0.1.0-cp312-cp312-linux_riscv64.whl

pip install pyjdk-0.1.0-cp312-cp312-linux_riscv64.whl

# Import
import pyjdk as jdk

Available Enums

  • jdk.PixelFormat: NV12, MJPEG, JPEG (corresponds to V4L2 FourCC)
  • jdk.CodingType: H264, H265, JPEG, MJPEG
  • jdk.MppPixelFormat: NV12, NV21
  • jdk.V2DFormat: Common formats like RGB888 (other values depend on the actual build output)
  • jdk.DrmPixelFormat: NV12

jdk.V2DRect

r = jdk.V2DRect(x, y, w, h)
# Also accepts (x,y,w,h) tuple/list/dict in drawing functions
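Since the drawing functions also accept tuples, lists, and dicts, a binding typically normalizes such inputs to (x, y, w, h). A plain-Python sketch of that normalization (the `as_xywh` helper and the dict key names are illustrative assumptions, not part of pyjdk):

```python
def as_xywh(rect) -> tuple[int, int, int, int]:
    """Normalize a rectangle given as a 4-item sequence or a dict.

    Assumed dict keys: "x", "y", "width", "height" (hypothetical).
    """
    if isinstance(rect, dict):
        return (int(rect["x"]), int(rect["y"]),
                int(rect["width"]), int(rect["height"]))
    x, y, w, h = rect   # tuple or list of four numbers
    return (int(x), int(y), int(w), int(h))
```

Either form then maps onto the C++ `V2DRect {x, y, width, height}` struct.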

jdk.Dma

dma = jdk.Dma()
dma.asyn(dst_fd: int, src_fd: int, size: int) -> int # Asynchronous DMA copy (wraps JdkDma::Asyn)

jdk.Frame (equivalent to JdkFrame)

f = jdk.Frame(dma_fd: int, size: int, width: int, height: int)

# Read-only
f.dma_fd: int
f.size: int
f.width: int
f.height: int

# I/O and Views
f.save(path: str) -> bool # Save underlying buffer (NV12/raw/bitstream)
f.load_from_file(path: str, expected_size: int) -> bool
f.to_numpy_nv12(copy: bool = False) -> (y, uv) # Zero-copy or deep-copy to numpy (NV12 two-plane format)
f.to_bytes() -> bytes # Directly export underlying buffer
f.mem_copy(src: bytes|bytearray|memoryview, offset: int = 0) -> int

# Resource management
f.release() # Immediately release the underlying buffer (QBUF)
# Supports 'with' syntax — automatically releases when exiting scope
with f:
    y, uv = f.to_numpy_nv12()

Note:

  • to_numpy_nv12(copy=False) returns a zero-copy view of y/uv bound to the underlying buffer.
  • The lifetime of this view is tied to the numpy object.
  • To get an independent copy, set copy=True.
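This view-versus-copy behaviour follows standard numpy semantics, which can be demonstrated with a plain numpy array (no pyjdk or camera hardware needed):

```python
import numpy as np

buf = np.zeros(8, dtype=np.uint8)   # stands in for the frame's underlying buffer
view = buf[:4]                      # zero-copy: shares memory with buf
copy = buf[:4].copy()               # independent deep copy

buf[0] = 7
# The view reflects the write to the underlying buffer; the copy does not.
assert view[0] == 7
assert copy[0] == 0
```

The same rule applies to the y/uv arrays from `to_numpy_nv12(copy=False)`: writes to the underlying DMA buffer show through the view, and the view must not outlive the frame.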

Camera Capture (MIPI / USB)

jdk.MipiCam

cam = jdk.MipiCam.create(device: str, width: int, height: int,
                         fourcc: jdk.PixelFormat = jdk.PixelFormat.NV12,
                         req_count: int = 4) -> jdk.MipiCam
frame = cam.get_frame() # Blocking frame capture, returns jdk.Frame

# Also supports iteration: for f in cam — to capture frames continuously
# Supports 'with cam:' syntax for automatic resource management

jdk.UsbCam

uc = jdk.UsbCam.create("/dev/video20", 1280, 720, jdk.PixelFormat.MJPEG)
f = uc.get_frame() # Returns MJPEG bitstream frame (can be decoded with Decoder)

See examples: mipi_cam.py, usb_cam.py.

Encoder jdk.Encoder

enc = jdk.Encoder(width: int, height: int,
                  coding: jdk.CodingType = jdk.CodingType.H264,
                  pixfmt: jdk.MppPixelFormat = jdk.MppPixelFormat.NV12)

pkt = enc.encode(frame: jdk.Frame) -> jdk.Frame # return bitstream in Frame

Note:

  • Only the encode(...) method is exposed in the current version.
  • If your local script uses encode_frame(...), please change it to encode(...).
  • The encode_frame name in encode_h264.py is outdated and should be updated.

Decoder jdk.Decoder

dec = jdk.Decoder(width: int, height: int,
                  coding: jdk.CodingType = jdk.CodingType.JPEG,
                  pixfmt: jdk.MppPixelFormat = jdk.MppPixelFormat.NV12)

# 1) Decode from "bitstream frame"
yuv = dec.decode(bitstream_frame: jdk.Frame) -> jdk.Frame

# 2) Decode directly from bytes-like object (bytes/bytearray/memoryview)
yuv = dec.decode(bitstream: bytes|bytearray|memoryview) -> jdk.Frame

See examples: decode_jpeg.py, encode_decode.py.

Image Processing jdk.V2D

v2d = jdk.V2D()
out1 = v2d.convert_format(input: jdk.Frame, out_format: jdk.V2DFormat) -> jdk.Frame
out2 = v2d.resize(input: jdk.Frame, out_width: int, out_height: int) -> jdk.Frame
out3 = v2d.resize_and_convert(input: jdk.Frame, out_width: int, out_height: int,
                              out_format: jdk.V2DFormat) -> jdk.Frame

v2d.fill_rect(image: jdk.Frame, rect: jdk.V2DRect, rgba_color: int) -> bool
v2d.draw_rect(image: jdk.Frame, rect: jdk.V2DRect, rgba_color: int, thickness: int = 2) -> bool
v2d.draw_rects(image: jdk.Frame, rects: list[jdk.V2DRect], rgba_color: int, thickness: int = 2) -> bool

mixed = v2d.blend(bottom: jdk.Frame, top: jdk.Frame) -> jdk.Frame

See example: v2d_demo.py. Use rgba_color as 0xAARRGGBB.
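Colors are packed as 0xAARRGGBB, i.e. alpha in the top byte, then red, green, blue. A small helper (ours, for illustration) makes the packing explicit:

```python
def pack_argb(a: int, r: int, g: int, b: int) -> int:
    """Pack four 8-bit channels into a 0xAARRGGBB integer."""
    return ((a & 0xFF) << 24) | ((r & 0xFF) << 16) | ((g & 0xFF) << 8) | (b & 0xFF)

# Opaque yellow, the value used in the v2d_demo.py draw_rects call below:
yellow = pack_argb(0xFF, 0xFF, 0xFF, 0x00)
assert yellow == 0xFFFFFF00
```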

Display Output

jdk.Vo

vo = jdk.Vo(width: int, height: int, pixfmt: jdk.MppPixelFormat = jdk.MppPixelFormat.NV12)
vo.send_frame(frame: jdk.Frame) -> int

jdk.Drm

drm = jdk.Drm(width: int, height: int,
              stride: int = 0,
              pixfmt: jdk.DrmPixelFormat = jdk.DrmPixelFormat.NV12,
              card: str = "/dev/dri/card0")

drm.send_frame(frame: jdk.Frame) -> int

See Examples: jdk_vo.py, jdk_drm.py.

End-to-End Examples (same as the example source code)

MIPI Capture → Encode → Decode (from encode_decode.py)

cam = jdk.MipiCam.create("/dev/video50", 1920, 1080, jdk.PixelFormat.NV12)
enc = jdk.Encoder(1920, 1080, jdk.CodingType.H264, jdk.MppPixelFormat.NV12)
dec = jdk.Decoder(1920, 1080, jdk.CodingType.H264, jdk.MppPixelFormat.NV12)

for _ in range(60):
    f = cam.get_frame()
    pkt = enc.encode(f)
    yuv = dec.decode(pkt)

Decode JPEG/MJPEG from bytes (from decode_jpeg.py)

bs_frame = jdk.Frame(-1, size, 1920, 1080)
bs_frame.load_from_file("examples/data/1920x1080.jpg", expected_size=size)
dec = jdk.Decoder(1920, 1080, jdk.CodingType.MJPEG, jdk.MppPixelFormat.NV12)
yuv = dec.decode(bs_frame)

NV12 → RGB888 + Image frame (from v2d_demo.py)

f = jdk.Frame(-1, w*h*3//2, w, h)
f.load_from_file("frame_1920x1080_nv12.yuv", expected_size=w*h*3//2)

v2d = jdk.V2D()
rgb = v2d.convert_format(f, jdk.V2DFormat.RGB888)
v2d.draw_rects(f, [jdk.V2DRect(30,20,100,80)], 0xFFFFFF00, 4)

Error Handling and Performance Notes

  • All blocking or compute-intensive operations (frame capture, encoding/decoding, and V2D processing) release the Global Interpreter Lock (GIL) at the C++ level, improving multi-threaded performance in Python.
  • Frame.to_numpy_nv12(copy=False) returns zero-copy views of the underlying buffer. Manage lifecycle carefully:
    • Call f.release() when done, OR
    • Use the frame within a with f: block for automatic cleanup.
  • The decoder accepts two input types: Frame and bytes-like objects, making it easy to adapt decoding for both network streams and file streams.
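Because capture and encode release the GIL, running them in separate Python threads can overlap the two stages. A minimal skeleton with the standard library (the `get_frame`/`encode` callables here are placeholders standing in for `cam.get_frame` and `enc.encode`; this is a sketch, not part of pyjdk):

```python
import queue
import threading

def run_pipeline(get_frame, encode, n_frames: int):
    """Overlap capture and encode in two threads via a bounded queue."""
    frames: queue.Queue = queue.Queue(maxsize=4)  # bounded: applies backpressure
    packets = []

    def capture():
        for _ in range(n_frames):
            frames.put(get_frame())   # blocking capture runs with the GIL released
        frames.put(None)              # sentinel: end of stream

    def encode_loop():
        while (f := frames.get()) is not None:
            packets.append(encode(f))

    t_cap = threading.Thread(target=capture)
    t_enc = threading.Thread(target=encode_loop)
    t_cap.start(); t_enc.start()
    t_cap.join(); t_enc.join()
    return packets

# With stand-in stages the skeleton runs anywhere:
pkts = run_pipeline(get_frame=lambda: b"frame", encode=lambda f: f + b"!", n_frames=3)
```

The bounded queue keeps memory use flat when the encoder is slower than the camera; remember that any frame passed across threads must stay alive (not released) until the consumer is done with it.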