
3.4.3 JDK API Description (including Python bindings)


Contents

  • Data type definition (data_type)
  • Core Interfaces (C++)
    • JdkFrame: Image Frame Encapsulation Class
    • JdkDma / JdkDmaBuffer: DMA Memory and Asynchronous Transfer Classes
    • JdkCamera: Camera Image Acquisition Interface
    • JdkDecoder: Hardware Video Decoder
    • JdkEncoder: Hardware Video Encoder
    • JdkDrm: DRM-based Video Output Interface
    • JdkV2D: Image Processing (Scaling, Format Conversion, Overlay)
    • JdkVo: VO Video Output Display Interface
  • Python binding (pyjdk)

Data type definition (data_type)

Defines the basic enumerations and structures used by the API.

Enumeration type: media_type

Enumeration of device media types:

| Value | Meaning |
| --- | --- |
| MEDIA_TYPE_CANT_STAT | Unable to query device status |
| MEDIA_TYPE_UNKNOWN | Unknown |
| MEDIA_TYPE_VIDEO | Video |
| MEDIA_TYPE_VBI | VBI (vertical blanking interval) |
| MEDIA_TYPE_RADIO | Radio |
| MEDIA_TYPE_SDR | SDR (software-defined radio) |
| MEDIA_TYPE_TOUCH | Touch input |
| MEDIA_TYPE_SUBDEV | Sub-device |
| MEDIA_TYPE_DVB_FRONTEND | Digital TV front end |
| MEDIA_TYPE_DVB_DEMUX | Digital TV demultiplexer |
| MEDIA_TYPE_DVB_DVR | Digital TV recording (DVR) |
| MEDIA_TYPE_DVB_NET | Digital TV network |
| MEDIA_TYPE_DTV_CA | Digital TV conditional access |
| MEDIA_TYPE_MEDIA | Media device |

Enumeration type: codec_type

Indicates whether the current device or context performs encoding or decoding:

| Value | Meaning |
| --- | --- |
| NOT_CODEC | Not a codec |
| CODEC_DEC | Decode |
| CODEC_ENC | Encode |

Structure: v4l2_ctx

Definition of the V4L2 capture/encode context structure:

struct v4l2_ctx {
    int fd;                                    // Device file descriptor
    unsigned int width;                        // Video width
    unsigned int height;                       // Video height
    unsigned int pixelformat;                  // Input pixel format
    unsigned int out_pixelformat;              // Output pixel format
    int nplanes;                               // Number of input planes
    int out_nplanes;                           // Number of output planes
    struct buffer* cap_buffers;                // Capture buffer array
    struct buffer* out_buffers;                // Output buffer array
    __u32 bytesperline[VIDEO_MAX_PLANES];      // Bytes per line for each input plane
    __u32 out_bytesperline[VIDEO_MAX_PLANES];  // Bytes per line for each output plane
    FILE* file[2];                             // Input/output file pointers
    int verbose;                               // Log verbosity level
    enum codec_type ctype;                     // Encode/decode type
};

Core Interfaces (C++)

The main multimedia classes: frames, cameras, decoders/encoders, video output, and image processing.

JdkFrame: Image Frame Encapsulation Class

class JdkFrame {
public:
    JdkFrame(int dma_fd_, size_t size_, int w, int h);
    ~JdkFrame();

    // Map the DMA buffer into CPU memory and return the pointer
    unsigned char* toHost() const;
    // Return a copy of the data
    std::vector<unsigned char> Clone() const;
    // Save as a .yuv file in NV12 format
    bool saveToFile(const std::string& filename) const;
    // Load data from a file (pairs with saveToFile)
    bool loadFromFile(const std::string& filename, size_t expected_size);

    // Get the underlying DMA FD
    int getDMAFd() const;
    // Get the buffer size
    size_t getSize() const { return size_; }
    // Get the resolution
    int getWidth() const { return width_; }
    int getHeight() const { return height_; }

    // Copy raw NALU data into the internal buffer (e.g. when writing an encoded bitstream)
    // offset: offset into the target buffer
    int MemCopy(const uint8_t* nalu, int nalu_size, int offset = 0);

private:
    size_t size_;                        // Total buffer size
    int width_;
    int height_;
    JdkDma dma_;                         // DMA transfer helper
    std::shared_ptr<JdkDmaBuffer> data;  // Underlying DMA buffer
};

using JdkFramePtr = std::shared_ptr<JdkFrame>;
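
A minimal usage sketch for JdkFrame. It assumes dma_fd and the buffer size come from another component (for example a capture or decode buffer); the file name is illustrative and the SDK headers are omitted.

#include <cstdio>
#include <memory>

void inspect_and_save(int dma_fd) {
    // Wrap an existing DMA buffer holding a 1920x1080 NV12 image (size = w * h * 3 / 2).
    auto frame = std::make_shared<JdkFrame>(dma_fd, 1920 * 1080 * 3 / 2, 1920, 1080);

    // Map the buffer into CPU memory and read the first luma byte.
    unsigned char* host = frame->toHost();
    std::printf("first Y byte: %u\n", static_cast<unsigned>(host[0]));

    // Persist the frame, then reload it (the sizes must match).
    frame->saveToFile("frame_1920x1080_nv12.yuv");
    frame->loadFromFile("frame_1920x1080_nv12.yuv", frame->getSize());
}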

JdkDma / JdkDmaBuffer: DMA Memory and Asynchronous Transfer Classes

class JdkDmaBuffer {
public:
    // Construct and allocate a DMA buffer
    explicit JdkDmaBuffer(size_t size);
    ~JdkDmaBuffer();

    // Return the mapped user-space address
    void* data() const;
    // Fill the whole buffer with a value
    void fill(uint8_t val);

    // Resolve the physical address (call this before reading m_phys)
    void map_phys_addr();

    // Public fields (read-only)
    size_t m_size;
    uint64_t m_phys;
};

class JdkDma {
public:
    // Copy data asynchronously through the DMA engine
    int Asyn(const JdkDmaBuffer& dst, const JdkDmaBuffer& src, size_t size);
    // DMA copy between DMA FDs
    int Asyn(const int& dst_fd, const int& src_fd, size_t size);
};
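
A minimal sketch of an asynchronous DMA copy between two freshly allocated buffers. The return-value check assumes Asyn() returns 0 on success, which is not stated above and should be verified against the headers.

// Allocate two 4 KiB DMA buffers and copy one into the other.
JdkDmaBuffer src(4096);
JdkDmaBuffer dst(4096);
src.fill(0xAB);                       // Test pattern in the source buffer.

JdkDma dma;
int ret = dma.Asyn(dst, src, 4096);   // Asynchronous copy through the DMA engine.
if (ret == 0) {
    // dst.data() now exposes the copied bytes in user space.
}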

JdkCamera: Camera Image Acquisition Interface

class JdkCamera {
public:
    /**
     * Create and open a V4L2 device
     * @param device    Device path (e.g. "/dev/video0")
     * @param width     Requested capture width
     * @param height    Requested capture height
     * @param pixfmt    V4L2 pixel format (e.g. V4L2_PIX_FMT_NV12)
     * @param req_count Number of requested buffers (default 4)
     * @return JdkCameraPtr on success, nullptr otherwise
     */
    static std::shared_ptr<JdkCamera> create(const std::string& device,
                                             int width,
                                             int height,
                                             __u32 pixfmt,
                                             int req_count = 4);
    /** Acquire one frame (blocking) */
    JdkFramePtr getFrame();

    ~JdkCamera();

private:
    explicit JdkCamera(const std::string& device);
    class Impl;
    std::unique_ptr<Impl> impl_;
};
using JdkCameraPtr = std::shared_ptr<JdkCamera>;
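
A minimal capture sketch. The device path and resolution are illustrative (they mirror the Python end-to-end example further below), and V4L2_PIX_FMT_NV12 comes from <linux/videodev2.h>.

#include <linux/videodev2.h>   // V4L2_PIX_FMT_NV12

if (auto cam = JdkCamera::create("/dev/video50", 1920, 1080, V4L2_PIX_FMT_NV12)) {
    for (int i = 0; i < 60; ++i) {
        JdkFramePtr frame = cam->getFrame();            // Blocks until a frame arrives.
        if (i == 0)
            frame->saveToFile("first_frame_nv12.yuv");  // Dump the first frame for inspection.
    }
}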

JdkDecoder: Hardware Video Decoder

class JdkDecoder {
public:
    /**
     * Initialize the hardware decoder
     * @param width   Output resolution width
     * @param height  Output resolution height
     * @param payload Input bitstream type (see MppCodingType)
     * @param Format  Output pixel format (default NV12)
     */
    JdkDecoder(int width, int height,
               MppCodingType payload,
               MppPixelFormat Format = PIXEL_FORMAT_NV12);
    ~JdkDecoder();

    /** Decode from an encapsulated bitstream frame */
    std::shared_ptr<JdkFrame> Decode(std::shared_ptr<JdkFrame> frame);
    /** Decode from raw NALU data */
    std::shared_ptr<JdkFrame> Decode(const uint8_t* nalu, int nalu_size);

private:
    int width_;
    int height_;
    MppCodingType payload_;
    int format_;
    int channel_id_;
    MppVdecCtx* pVdecCtx = nullptr;
};
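
A minimal decode sketch. The H.264 MppCodingType enumerator is not listed above, so it appears as a placeholder to be replaced with the value from your MPP headers; bitstream_frame stands for an encoded frame obtained elsewhere.

// coding_h264 is a placeholder for the H.264 enumerator of MppCodingType.
MppCodingType coding_h264 = /* H.264 value from your MPP headers */;

JdkDecoder decoder(1920, 1080, coding_h264);   // NV12 output by default.

// bitstream_frame carries one encoded access unit (e.g. from JdkEncoder or a file).
std::shared_ptr<JdkFrame> yuv = decoder.Decode(bitstream_frame);
if (yuv)
    yuv->saveToFile("decoded_nv12.yuv");

// Raw-NALU overload, decoding straight from a byte buffer:
// std::shared_ptr<JdkFrame> yuv2 = decoder.Decode(nalu_ptr, nalu_size);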

JdkEncoder: Hardware Video Encoder

class JdkEncoder {
public:
    /**
     * Initialize the hardware encoder
     * @param width   Input resolution width
     * @param height  Input resolution height
     * @param payload Output bitstream type (see MppCodingType)
     * @param Format  Input pixel format (default NV12)
     */
    JdkEncoder(int width, int height,
               MppCodingType payload,
               MppPixelFormat Format = PIXEL_FORMAT_NV12);
    ~JdkEncoder();

    /** Encode a raw frame into a compressed bitstream */
    std::shared_ptr<JdkFrame> Encode(std::shared_ptr<JdkFrame> frame);

private:
    int width_;
    int height_;
    MppCodingType payload_;
    int format_;
    int encoder_id_ = 0;
    MppVencCtx* pVencCtx = nullptr;
};
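
A minimal capture-and-encode sketch mirroring the Python end-to-end example below. As in the decoder sketch, the H.264 MppCodingType value is a placeholder for the enumerator in your MPP headers.

#include <linux/videodev2.h>

MppCodingType coding_h264 = /* H.264 value from your MPP headers */;
JdkEncoder encoder(1920, 1080, coding_h264);   // NV12 input by default.

if (auto cam = JdkCamera::create("/dev/video50", 1920, 1080, V4L2_PIX_FMT_NV12)) {
    for (int i = 0; i < 60; ++i) {
        JdkFramePtr raw = cam->getFrame();
        std::shared_ptr<JdkFrame> packet = encoder.Encode(raw);   // Compressed bitstream in a JdkFrame.
        // packet->toHost() / packet->getDMAFd() give access to the encoded bytes.
    }
}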

JdkDrm: DRM-based Video Output Interface

/** Supported pixel formats */
enum class PixelFmt : uint32_t {
    NV12 = DRM_FORMAT_NV12
};

class JdkDrm {
public:
    /**
     * Open the DRM device and initialize it
     * @param width  Display width
     * @param height Display height
     * @param stride Row stride (bytes)
     * @param fmt    Pixel format
     * @param device DRM device path (default "/dev/dri/card0")
     */
    JdkDrm(int width, int height, int stride,
           PixelFmt fmt = PixelFmt::NV12,
           const char* device = "/dev/dri/card0");
    ~JdkDrm();

    /** Send a frame to the DRM display */
    int sendFrame(std::shared_ptr<JdkFrame> frame);
    /** Destroy the specified framebuffer */
    void destroyFb(uint32_t fb, uint32_t handle);
    /** Open the DRM device */
    int openCard(const char* dev);
    /** Automatically pick a suitable connector/CRTC/plane */
    int pickConnectorCrtcPlane();
    /** Import a DMA FD as a DRM framebuffer */
    int importFb(int dma_fd, uint32_t& fb_id, uint32_t& handle);

private:
    struct LastFB {
        uint32_t fb_id;
        uint32_t handle;
        int dma_fd;
    } last_;
};
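
A minimal display sketch. It assumes a tightly packed 1920x1080 NV12 frame whose stride equals its width, and that the default DRM node is correct for the board; frame stands for a JdkFramePtr produced elsewhere.

// Open /dev/dri/card0 for 1920x1080 NV12 scan-out; stride is the luma row length in bytes.
JdkDrm drm(1920, 1080, /* stride = */ 1920, PixelFmt::NV12);

// frame is a JdkFramePtr produced elsewhere (camera, decoder, V2D, ...).
int ret = drm.sendFrame(frame);
if (ret != 0) {
    // Import or page flip failed; the frame was not displayed.
}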

JdkV2D: Image Processing (Scaling, Format Conversion, Overlay)

/** Supported target pixel formats (see the full header file for the enumerator values) */
enum V2DFormat {
    // For example: V2D_NV12, V2D_RGB888, ...
};

/** Rectangular area */
struct V2DRect {
    int x, y, width, height;
};

class JdkV2D {
public:
    JdkV2D() = default;
    ~JdkV2D() = default;

    /** Format conversion */
    JdkFramePtr convert_format(const JdkFramePtr& input,
                               V2DFormat out_format);
    /** Scale */
    JdkFramePtr resize(const JdkFramePtr& input,
                       int out_width, int out_height);
    /** Scale and convert format in one pass */
    JdkFramePtr resize_and_convert(const JdkFramePtr& input,
                                   int out_width, int out_height,
                                   V2DFormat out_format);
    /** Fill a rectangular area */
    bool fill_rect(const JdkFramePtr& image,
                   const V2DRect& rect,
                   uint32_t rgba_color);
    /** Draw a rectangular border */
    bool draw_rect(const JdkFramePtr& image,
                   const V2DRect& rect,
                   uint32_t rgba_color,
                   int thickness = 2);
    /** Draw multiple rectangles */
    bool draw_rects(const JdkFramePtr& image,
                    const std::vector<V2DRect>& rects,
                    uint32_t rgba_color,
                    int thickness = 2);
    /** Blend two images (overlay top onto bottom) */
    JdkFramePtr blend(const JdkFramePtr& bottom,
                      const JdkFramePtr& top);
};
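
A minimal processing sketch. V2D_RGB888 is used only because the enum comment above lists it as an example, so check the actual V2DFormat enumerators in the header; input stands for an NV12 JdkFramePtr from a camera or decoder.

JdkV2D v2d;

// Downscale the NV12 input frame to 1280x720 and convert it to RGB888 in one pass.
JdkFramePtr rgb = v2d.resize_and_convert(input, 1280, 720, V2D_RGB888);

// Draw an opaque yellow box (color is 0xAARRGGBB) with a 4-pixel border on the original frame.
V2DRect box{30, 20, 100, 80};
v2d.draw_rect(input, box, 0xFFFFFF00, 4);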

JdkVo: VO Video Output Display Interface

class JdkVo {
public:
    /**
     * Initialize the VO output
     * @param width  Output width
     * @param height Output height
     * @param Format Pixel format (default NV12)
     */
    JdkVo(int width, int height,
          MppPixelFormat Format = PIXEL_FORMAT_NV12);
    ~JdkVo();

    /** Send a frame to the VO hardware output */
    int sendFrame(std::shared_ptr<JdkFrame> frame);

private:
    int width_;
    int height_;
    MppPixelFormat format_;
    int channel_id_;
    MppVoCtx* pVoCtx = nullptr;
};
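
A minimal camera-to-VO sketch; the device path, resolution, and frame count are illustrative.

#include <linux/videodev2.h>

JdkVo vo(1920, 1080);   // NV12 by default.

if (auto cam = JdkCamera::create("/dev/video50", 1920, 1080, V4L2_PIX_FMT_NV12)) {
    for (int i = 0; i < 300; ++i) {
        JdkFramePtr frame = cam->getFrame();
        vo.sendFrame(frame);   // Push the frame to the VO hardware output path.
    }
}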

Python binding (pyjdk)

Module import

# Download and install
wget https://gitlab.dc.com:8443/bianbu/bianbu-linux/jdk/-/blob/main/pyjdk/pyjdk-0.1.0-cp312-cp312-linux_riscv64.whl

pip install pyjdk-0.1.0-cp312-cp312-linux_riscv64.whl

# Import
import pyjdk as jdk

Enumerations (Enums)

  • jdk.PixelFormat: NV12, MJPEG, JPEG (corresponding to V4L2 FourCC codes)
  • jdk.CodingType: H264, H265, JPEG, MJPEG
  • jdk.MppPixelFormat: NV12, NV21
  • jdk.V2DFormat: commonly used values such as RGB888 (others depend on the actual build)
  • jdk.DrmPixelFormat: NV12

jdk.V2DRect

r = jdk.V2DRect(x, y, w, h)
# (x, y, w, h) tuples, lists, and dicts are also accepted by the drawing interfaces

jdk.Dma

dma = jdk.Dma()
dma.asyn(dst_fd: int, src_fd: int, size: int) -> int  # Asynchronous DMA copy (wraps JdkDma::Asyn)

jdk.Frame (wraps JdkFrame)

f = jdk.Frame(dma_fd: int, size: int, width: int, height: int)

# Read-only attributes
f.dma_fd: int
f.size: int
f.width: int
f.height: int

# I/O and views
f.save(path: str) -> bool                       # Save the underlying buffer (NV12 / raw / bitstream)
f.load_from_file(path: str, expected_size: int) -> bool
f.to_numpy_nv12(copy: bool = False) -> (y, uv)  # Zero-copy or deep copy to numpy (two NV12 planes)
f.to_bytes() -> bytes                           # Export the underlying buffer directly
f.mem_copy(src: bytes|bytearray|memoryview, offset: int = 0) -> int

# Resource management
f.release()   # Return the underlying buffer immediately (QBUF)
# The `with` statement is supported; the frame is released automatically on exit
with f:
    y, uv = f.to_numpy_nv12()

Note: the y/uv arrays returned by to_numpy_nv12(copy=False) are zero-copy views of the underlying buffer, and their lifetime is bound to the numpy objects; set copy=True if you need independent copies.


Camera capture (MIPI / USB)

jdk.MipiCam

cam = jdk.MipiCam.create(device: str, width: int, height: int,
                         fourcc: jdk.PixelFormat = jdk.PixelFormat.NV12,
                         req_count: int = 4) -> jdk.MipiCam
frame = cam.get_frame()   # Blocks until a frame is available; returns a jdk.Frame

# `for f in cam:` frame iteration and `with cam:` context management are also supported

jdk.UsbCam

uc = jdk.UsbCam.create("/dev/video20", 1280, 720, jdk.PixelFormat.MJPEG)
f = uc.get_frame()   # Returns an MJPEG bitstream frame (can be decoded with jdk.Decoder)

Corresponding examples: mipi_cam.py, usb_cam.py.


Encoder jdk.Encoder

enc = jdk.Encoder(width: int, height: int,
                  coding: jdk.CodingType = jdk.CodingType.H264,
                  pixfmt: jdk.MppPixelFormat = jdk.MppPixelFormat.NV12)

pkt = enc.encode(frame: jdk.Frame) -> jdk.Frame   # Returns a "bitstream frame" (compressed bitstream carried in a Frame)

Note: the source only exports the encode(...) method. If your local script calls encode_frame(...), change it to encode(...); encode_frame in the encode_h264.py example is the old name, and updating it is recommended.


Decoder jdk.Decoder

dec = jdk.Decoder(width: int, height: int,
                  coding: jdk.CodingType = jdk.CodingType.JPEG,
                  pixfmt: jdk.MppPixelFormat = jdk.MppPixelFormat.NV12)

# 1) Decode from a "bitstream frame"
yuv = dec.decode(bitstream_frame: jdk.Frame) -> jdk.Frame

# 2) Decode directly from a bytes-like object (bytes/bytearray/memoryview)
yuv = dec.decode(bitstream: bytes|bytearray|memoryview) -> jdk.Frame

Corresponding examples: decode_jpeg.py, encode_decode.py.


Image processing jdk.V2D

v2d = jdk.V2D()
out1 = v2d.convert_format(input: jdk.Frame, out_format: jdk.V2DFormat) -> jdk.Frame
out2 = v2d.resize(input: jdk.Frame, out_width: int, out_height: int) -> jdk.Frame
out3 = v2d.resize_and_convert(input: jdk.Frame, out_width: int, out_height: int,
                              out_format: jdk.V2DFormat) -> jdk.Frame

v2d.fill_rect(image: jdk.Frame, rect: jdk.V2DRect, rgba_color: int) -> bool
v2d.draw_rect(image: jdk.Frame, rect: jdk.V2DRect, rgba_color: int, thickness: int = 2) -> bool
v2d.draw_rects(image: jdk.Frame, rects: list[jdk.V2DRect], rgba_color: int, thickness: int = 2) -> bool

mixed = v2d.blend(bottom: jdk.Frame, top: jdk.Frame) -> jdk.Frame

Corresponding example: v2d_demo.py. rgba_color uses the 0xAARRGGBB format.


Display output

jdk.Vo

vo = jdk.Vo(width: int, height: int, pixfmt: jdk.MppPixelFormat = jdk.MppPixelFormat.NV12)
vo.send_frame(frame: jdk.Frame) -> int

jdk.Drm

drm = jdk.Drm(width: int, height: int,
              stride: int = 0,
              pixfmt: jdk.DrmPixelFormat = jdk.DrmPixelFormat.NV12,
              card: str = "/dev/dri/card0")

drm.send_frame(frame: jdk.Frame) -> int

Corresponding examples: jdk_vo.py, jdk_drm.py.


End-to-end examples (consistent with the source code)

MIPI capture → encode → decode (refer to encode_decode.py)

cam = jdk.MipiCam.create("/dev/video50", 1920, 1080, jdk.PixelFormat.NV12)
enc = jdk.Encoder(1920, 1080, jdk.CodingType.H264, jdk.MppPixelFormat.NV12)
dec = jdk.Decoder(1920, 1080, jdk.CodingType.H264, jdk.MppPixelFormat.NV12)

for _ in range(60):
    f = cam.get_frame()
    pkt = enc.encode(f)
    yuv = dec.decode(pkt)

Decode JPEG/MJPEG (refer to decode_jpeg.py)

# size is the byte size of the JPEG file
bs_frame = jdk.Frame(-1, size, 1920, 1080)
bs_frame.load_from_file("examples/data/1920x1080.jpg", expected_size=size)
dec = jdk.Decoder(1920, 1080, jdk.CodingType.MJPEG, jdk.MppPixelFormat.NV12)
yuv = dec.decode(bs_frame)

NV12 → RGB888 + draw rectangles (refer to v2d_demo.py)

f = jdk.Frame(-1, w * h * 3 // 2, w, h)
f.load_from_file("frame_1920x1080_nv12.yuv", expected_size=w * h * 3 // 2)

v2d = jdk.V2D()
rgb = v2d.convert_format(f, jdk.V2DFormat.RGB888)
v2d.draw_rects(f, [jdk.V2DRect(30, 20, 100, 80)], 0xFFFFFF00, 4)

Error handling and performance notes

  • All blocking and compute-heavy paths (frame capture, encode/decode, V2D) release the GIL on the C++ side, which helps Python multi-threaded throughput.
  • Frame.to_numpy_nv12(copy=False) returns zero-copy views; manage their lifetime carefully and call f.release() (or exit the with f: block) when done.
  • The decoder accepts both Frame and bytes-like input, which makes it easy to adapt to network streams or file streams.