3.4.3 JDK API Description (including Python bindings)
Contents
- [Data type definition (data_type)](# data type definition data_type)
- [Core Interface (C++)](#Core Interface c)
- [JdkFrame](#jdkframe-Image Frame Encapsulation Class)
- [JdkDma / JdkDmaBuffer](#jdkdma-and -jdkdmabufferdma-memory and asynchronous transmission class)
- [JdkCamera](#jdkcamera-Camera Image Acquisition Interface)
- [JdkDecoder](#jdkdecoder-Hardware Video Decoder)
- [JdkEncoder](#jdkencoder-hardware video encoder)
- [JdkDrm](#jdkdrm-based-drm-based video output interface)
- [JdkV2D](#jdkv2d-Image Processing Scaling Format Conversion Image Overlay)
- [JdkVo](#jdkvo-vovideo-output display interface)
- [Python binding (pyjdk)](#python-binding pyjdk)
- [Installation and Import](#Installation and Import)
- [Python type convention](#python-type convention)
- pyjdk.JdkFrame
- pyjdk.JdkCamera
- pyjdk.JdkDecoder
- pyjdk.JdkEncoder
- pyjdk.JdkV2D
- pyjdk.JdkDrm
- pyjdk.JdkVo
- [End-to-end example: capture → zoom → display](#Error handling and performance points)
Data type definition (data_type)
Defines the basic enumerations and structures used by the API.
Enumeration type: media_type
Enumeration of device media types:
Value | Meaning |
---|---|
MEDIA_TYPE_CANT_STAT | Unable to get the device status |
MEDIA_TYPE_UNKNOWN | Unknown |
MEDIA_TYPE_VIDEO | Video |
MEDIA_TYPE_VBI | VBI (vertical blanking interval) |
MEDIA_TYPE_RADIO | Radio |
MEDIA_TYPE_SDR | SDR (software-defined radio) |
MEDIA_TYPE_TOUCH | Touch input |
MEDIA_TYPE_SUBDEV | Sub-device |
MEDIA_TYPE_DVB_FRONTEND | Digital TV frontend |
MEDIA_TYPE_DVB_DEMUX | Digital TV demultiplexer |
MEDIA_TYPE_DVB_DVR | Digital TV recording (DVR) |
MEDIA_TYPE_DVB_NET | Digital TV network |
MEDIA_TYPE_DTV_CA | Digital TV conditional access |
MEDIA_TYPE_MEDIA | Media device |
Enumeration type: codec_type
Indicates whether the current device or context is a decoder or an encoder:
Value | Meaning |
---|---|
NOT_CODEC | Not a codec |
CODEC_DEC | Decoder |
CODEC_ENC | Encoder |
Structure: v4l2_ctx
Definition of the V4L2 capture and codec context structure:
struct v4l2_ctx {
int fd; // Device file descriptor
unsigned int width; // Video width
unsigned int height; // Video height
unsigned int pixelformat; // Input pixel format
unsigned int out_pixelformat; // Output pixel format
int nplanes; // Number of input planes
int out_nplanes; // Number of output planes
struct buffer* cap_buffers; // Capture buffer array
struct buffer* out_buffers; // Output buffer array
__u32 bytesperline[VIDEO_MAX_PLANES]; // Bytes per line of each input plane
__u32 out_bytesperline[VIDEO_MAX_PLANES]; // Bytes per line of each output plane
FILE* file[2]; // Input/output file pointers
int verbose; // Log verbosity level
enum codec_type ctype; // Codec type (encoder or decoder)
};
Core interface (C++)
The main multimedia classes: frames, cameras, decoders/encoders, video output, and image processing.
JdkFrame: image frame wrapper class
class JdkFrame {
public:
JdkFrame(int dma_fd_, size_t size_, int w, int h);
~JdkFrame();
// Map the DMA buffer into CPU memory and return a pointer to it
unsigned char* toHost() const;
// Return a copy of the data
std::vector<unsigned char> Clone() const;
// Save as a .yuv file in NV12 format
bool saveToFile(const std::string& filename) const;
// Load data from a file (counterpart of saveToFile)
bool loadFromFile(const std::string& filename, size_t expected_size);
// Get the underlying DMA FD
int getDMAFd() const;
// Get the buffer size
size_t getSize() const { return size_; }
// Get the resolution
int getWidth() const { return width_; }
int getHeight() const { return height_; }
// Copy raw NALU data into the internal buffer (e.g. to store encoder output)
// offset: offset into the target buffer
int MemCopy(const uint8_t* nalu, int nalu_size, int offset = 0);
private:
size_t size_; // Total buffer size
int width_;
int height_;
JdkDma dma_; // DMA synchronization helper
std::shared_ptr<JdkDmaBuffer> data; // Underlying DMA buffer
};
using JdkFramePtr = std::shared_ptr<JdkFrame>;
JdkDma and JdkDmaBuffer
class JdkDmaBuffer {
public:
// Allocate a DMA buffer of the given size
explicit JdkDmaBuffer(size_t size);
~JdkDmaBuffer();
// Return the mapped user-space address
void* data() const;
// Fill the whole buffer with val
void fill(uint8_t val);
// Resolve the physical address (call this before reading m_phys)
void map_phys_addr();
// Public fields (read-only)
size_t m_size;
uint64_t m_phys;
};
class JdkDma {
public:
// Copy data asynchronously through the DMA engine
int Asyn(const JdkDmaBuffer& dst, const JdkDmaBuffer& src, size_t size);
// DMA copy between two file descriptors
int Asyn(const int& dst_fd, const int& src_fd, size_t size);
};
JdkCamera
class JdkCamera {
public:
/**
* Create and open the V4L2 device
* @param device Device path (e.g. "/dev/video0")
* @param width Requested capture width
* @param height Requested capture height
* @param pixfmt V4L2 pixel format (e.g. V4L2_PIX_FMT_NV12)
* @param req_count Number of buffers to request (default 4)
* @return JdkCameraPtr on success, nullptr on failure
*/
static std::shared_ptr<JdkCamera> create(const std::string& device,
int width,
int height,
__u32 pixfmt,
int req_count = 4);
/** Get one image frame (blocking) */
JdkFramePtr getFrame();
~JdkCamera();
private:
explicit JdkCamera(const std::string& device);
class Impl;
std::unique_ptr<Impl> impl_;
};
using JdkCameraPtr = std::shared_ptr<JdkCamera>;
JdkDecoder
class JdkDecoder {
public:
/**
* Initialize the hardware decoder
* @param width Output resolution width
* @param height Output resolution height
* @param payload Input bitstream type (see MppCodingType)
* @param Format Output pixel format (default NV12)
*/
JdkDecoder(int width, int height,
MppCodingType payload,
MppPixelFormat Format = PIXEL_FORMAT_NV12);
~JdkDecoder();
/** Decode a bitstream wrapped in a frame */
std::shared_ptr<JdkFrame> Decode(std::shared_ptr<JdkFrame> frame);
/** Decode raw NALU data */
std::shared_ptr<JdkFrame> Decode(const uint8_t* nalu, int nalu_size);
private:
int width_;
int height_;
MppCodingType payload_;
int format_;
int channel_id_;
MppVdecCtx* pVdecCtx = nullptr;
};
JdkEncoder
class JdkEncoder {
public:
/**
* Initialize the hardware encoder
* @param width Input resolution width
* @param height Input resolution height
* @param payload Output bitstream type (see MppCodingType)
* @param Format Input pixel format (default NV12)
*/
JdkEncoder(int width, int height,
MppCodingType payload,
MppPixelFormat Format = PIXEL_FORMAT_NV12);
~JdkEncoder();
/** Encode a raw frame into a compressed bitstream */
std::shared_ptr<JdkFrame> Encode(std::shared_ptr<JdkFrame> frame);
private:
int width_;
int height_;
MppCodingType payload_;
int format_;
int encoder_id_ = 0;
MppVencCtx* pVencCtx = nullptr;
};
JdkDrm
/** Supported pixel formats */
enum class PixelFmt : uint32_t {
NV12 = DRM_FORMAT_NV12
};
class JdkDrm {
public:
/**
* Open the DRM device and initialize it
* @param width Display width
* @param height Display height
* @param stride Line stride (bytes)
* @param fmt Pixel format
* @param device DRM device path (default "/dev/dri/card0")
*/
JdkDrm(int width, int height, int stride,
PixelFmt fmt = PixelFmt::NV12,
const char* device = "/dev/dri/card0");
~JdkDrm();
/** Send a frame to the DRM screen */
int sendFrame(std::shared_ptr<JdkFrame> frame);
/** Destroy the specified framebuffer */
void destroyFb(uint32_t fb, uint32_t handle);
/** Open the DRM device */
int openCard(const char* dev);
/** Automatically choose the appropriate connector/crtc/plane */
int pickConnectorCrtcPlane();
/** Import DMA FD as DRM framebuffer */
int importFb(int dma_fd, uint32_t& fb_id, uint32_t& handle);
private:
struct LastFB {
uint32_t fb_id;
uint32_t handle;
int dma_fd;
} last_;
};
JdkV2D
/** Supported target pixel formats (see the full header file for the enumeration values) */
enum V2DFormat {
// For example: V2D_NV12, V2D_RGB888, ......
};
/** Rectangular area */
struct V2DRect {
int x, y, width, height;
};
class JdkV2D {
public:
JdkV2D() = default;
~JdkV2D() = default;
/** Format conversion */
JdkFramePtr convert_format(const JdkFramePtr& input,
V2DFormat out_format);
/** Scale */
JdkFramePtr resize(const JdkFramePtr& input,
int out_width, int out_height);
/** Scale and convert the format in one step */
JdkFramePtr resize_and_convert(const JdkFramePtr& input,
int out_width, int out_height,
V2DFormat out_format);
/** Fill the rectangular area */
bool fill_rect(const JdkFramePtr& image,
const V2DRect& rect,
uint32_t rgba_color);
/** Draw a rectangular border */
bool draw_rect(const JdkFramePtr& image,
const V2DRect& rect,
uint32_t rgba_color,
int thickness = 2);
/** Draw multiple rectangles */
bool draw_rects(const JdkFramePtr& image,
const std::vector<V2DRect>& rects,
uint32_t rgba_color,
int thickness = 2);
/** Image blending (top is overlaid on bottom) */
JdkFramePtr blend(const JdkFramePtr& bottom,
const JdkFramePtr& top);
};
JdkVo
class JdkVo {
public:
/**
* Initialize Vo output
* @param width Output width
* @param height Output height
* @param Format Pixel format (default NV12)
*/
JdkVo(int width, int height,
MppPixelFormat Format = PIXEL_FORMAT_NV12);
~JdkVo();
/** Send a frame to Vo hardware output */
int sendFrame(std::shared_ptr<JdkFrame> frame);
private:
int width_;
int height_;
MppPixelFormat format_;
int channel_id_;
MppVoCtx* pVoCtx = nullptr;
};
Python binding (pyjdk)
Installation and import
# Download and install
wget https://gitlab.dc.com:8443/bianbu/bianbu-linux/jdk/-/blob/main/pyjdk/pyjdk-0.1.0-cp312-cp312-linux_riscv64.whl
pip install pyjdk-0.1.0-cp312-cp312-linux_riscv64.whl
# Import
import pyjdk as jdk
Enums
- jdk.PixelFormat: NV12, MJPEG, JPEG (corresponding to V4L2 FourCC)
- jdk.CodingType: H264, H265, JPEG, MJPEG
- jdk.MppPixelFormat: NV12, NV21
- jdk.V2DFormat: commonly used values such as RGB888 (the rest depend on the actual build)
- jdk.DrmPixelFormat: NV12
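The PixelFormat values are described as corresponding to V4L2 FourCC codes. As a minimal sketch of what that means, a FourCC packs four ASCII characters into a 32-bit integer, lowest byte first. The helper names v4l2_fourcc and fourcc_to_str are hypothetical and are not part of pyjdk.

```python
def v4l2_fourcc(code: str) -> int:
    """Pack a four-character code into the 32-bit value V4L2 uses (first char in the low byte)."""
    if len(code) != 4:
        raise ValueError("a FourCC must have exactly four characters")
    a, b, c, d = (ord(ch) for ch in code)
    return a | (b << 8) | (c << 16) | (d << 24)


def fourcc_to_str(value: int) -> str:
    """Unpack a 32-bit FourCC value back into its four characters."""
    return "".join(chr((value >> shift) & 0xFF) for shift in (0, 8, 16, 24))


nv12 = v4l2_fourcc("NV12")
print(hex(nv12), fourcc_to_str(nv12))  # 0x3231564e NV12
```

Printing a format value in this form can help when comparing a pyjdk enum against the output of V4L2 tools, which usually show the four-character form.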
jdk.V2DRect
r = jdk.V2DRect(x, y, w, h)
# The drawing interfaces also accept an (x, y, w, h) tuple, list, or dict
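Since rectangles can be passed as a tuple, list, or dict, it can be useful to normalize and clip them in Python before handing them to the drawing calls. The sketch below is illustrative only: normalize_rect and clamp_rect are hypothetical helpers, and the dict key names (x, y, width, height) are an assumption taken from the C++ V2DRect fields, not a documented pyjdk contract.

```python
from collections.abc import Mapping
from typing import Tuple


def normalize_rect(rect) -> Tuple[int, int, int, int]:
    """Coerce a tuple, list, or dict into an (x, y, width, height) tuple."""
    if isinstance(rect, Mapping):
        # Assumed key names; the real binding may expect different ones.
        return (int(rect["x"]), int(rect["y"]), int(rect["width"]), int(rect["height"]))
    x, y, w, h = rect  # raises ValueError unless there are exactly four items
    return (int(x), int(y), int(w), int(h))


def clamp_rect(rect, frame_w: int, frame_h: int) -> Tuple[int, int, int, int]:
    """Clip a rectangle so that it lies entirely inside a frame_w x frame_h frame."""
    x, y, w, h = normalize_rect(rect)
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(frame_w, x + w), min(frame_h, y + h)
    # A rectangle that falls completely outside the frame collapses to zero size.
    return (x0, y0, max(0, x1 - x0), max(0, y1 - y0))


print(clamp_rect((1900, 1060, 100, 80), 1920, 1080))  # (1900, 1060, 20, 20)
```

Clipping on the Python side avoids relying on how the hardware handles out-of-bounds rectangles, which this document does not specify.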
jdk.Dma
dma = jdk.Dma()
dma.asyn(dst_fd: int, src_fd: int, size: int) -> int # Asynchronous DMA copy (wraps JdkDma::Asyn)
jdk.Frame (wraps the C++ JdkFrame)
f = jdk.Frame(dma_fd: int, size: int, width: int, height: int)
# Read-only attributes
f.dma_fd: int
f.size: int
f.width: int
f.height: int
# I/O and views
f.save(path: str) -> bool # Save the underlying buffer (NV12/raw/bitstream)
f.load_from_file(path: str, expected_size: int) -> bool
f.to_numpy_nv12(copy: bool = False) -> (y, uv) # Zero-copy view or deep copy as NumPy arrays (two NV12 planes)
f.to_bytes() -> bytes # Export the underlying buffer directly
f.mem_copy(src: bytes|bytearray|memoryview, offset: int = 0) -> int
# Resource management
f.release() # Return the underlying buffer immediately (QBUF)
# Supports the with statement; the frame is released automatically on exit
with f:
    y, uv = f.to_numpy_nv12()
Note: The y/uv arrays returned by to_numpy_nv12(copy=False) are zero-copy views of the underlying buffer, and the buffer's lifetime is bound to the NumPy objects; if you need an independent copy, set copy=True.
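To make the difference between a zero-copy view and a copy concrete, here is a small standard-library sketch that splits an NV12 buffer into its Y and UV planes. It assumes a tightly packed frame (line stride equal to width) and uses a bytearray as a stand-in for the DMA-backed buffer; nv12_size and split_nv12 are hypothetical helpers, not pyjdk functions.

```python
def nv12_size(width: int, height: int) -> int:
    """Bytes in a tightly packed NV12 frame: a full Y plane plus a half-size interleaved UV plane."""
    return width * height * 3 // 2


def split_nv12(buf, width: int, height: int, copy: bool = False):
    """Return the (y, uv) planes of an NV12 buffer as views, or as independent copies."""
    view = memoryview(buf)
    total = nv12_size(width, height)
    if len(view) < total:
        raise ValueError("buffer is smaller than one NV12 frame")
    y = view[: width * height]
    uv = view[width * height : total]
    if copy:
        return bytes(y), bytes(uv)
    return y, uv


w, h = 4, 2
buf = bytearray(nv12_size(w, h))                      # stand-in for the DMA-backed buffer
y_view, uv_view = split_nv12(buf, w, h)               # zero-copy views
y_copy, uv_copy = split_nv12(buf, w, h, copy=True)    # independent snapshot
buf[0] = 255                                          # a later write to the buffer...
print(y_view[0], y_copy[0])                           # ...shows up in the view only: 255 0
```

The same reasoning applies to the NumPy views: once the underlying buffer is reused for a new frame, a view reflects the new contents, while a copy keeps the old ones.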
Camera capture (MIPI / USB)
jdk.MipiCam
cam = jdk.MipiCam.create(device: str, width: int, height: int,
                         fourcc: jdk.PixelFormat = jdk.PixelFormat.NV12,
                         req_count: int = 4) -> jdk.MipiCam
frame = cam.get_frame() # Blocks until a frame is available and returns a jdk.Frame
# Also supports iterating over frames with "for f in cam:" and enter/exit management with "with cam:"
jdk.UsbCam
uc = jdk.UsbCam.create("/dev/video20", 1280, 720, jdk.PixelFormat.MJPEG)
f = uc.get_frame() # Returns an MJPEG bitstream frame (can be decoded with Decoder)
Corresponding examples: mipi_cam.py, usb_cam.py.
Encoder jdk.Encoder
enc = jdk.Encoder(width: int, height: int,
coding: jdk.CodingType = jdk.CodingType.H264,
pixfmt: jdk.MppPixelFormat = jdk.MppPixelFormat.NV12)
pkt = enc.encode(frame: jdk.Frame) -> jdk.Frame # Returns a bitstream frame (the bitstream is stored in a Frame)
Note: The source code exports only the encode(...) method. If your local scripts call encode_frame(...), change it to encode(...); encode_frame in the encode_h264.py example is the old name and should be updated.
Decoder jdk.Decoder
dec = jdk.Decoder(width: int, height: int,
coding: jdk.CodingType = jdk.CodingType.JPEG,
pixfmt: jdk.MppPixelFormat = jdk.MppPixelFormat.NV12)
# 1) Decode from a bitstream frame
yuv = dec.decode(bitstream_frame: jdk.Frame) -> jdk.Frame
# 2) Decode directly from a bytes-like object (bytes/bytearray/memoryview)
yuv = dec.decode(bitstream: bytes|bytearray|memoryview) -> jdk.Frame
Corresponding examples: decode_jpeg.py, encode_decode.py.
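When feeding the bytes-like form of decode() from an H.264 file or network stream, the input usually has to be cut into individual NAL units first. The sketch below splits an Annex-B byte stream on its start codes. It is an illustration under assumptions: this document does not say whether the decoder expects NALUs with or without start codes, and split_annexb and nal_unit_type_h264 are hypothetical helpers.

```python
START_CODE = b"\x00\x00\x01"


def split_annexb(stream: bytes) -> list:
    """Split an Annex-B byte stream into NAL units, returned without their start codes."""
    nalus = []
    pos = stream.find(START_CODE)
    while pos != -1:
        begin = pos + len(START_CODE)
        nxt = stream.find(START_CODE, begin)
        end = len(stream) if nxt == -1 else nxt
        # A 4-byte start code (00 00 00 01) leaves a zero byte in front of the
        # next 3-byte match; strip such trailing zero bytes from the unit.
        nalu = stream[begin:end].rstrip(b"\x00")
        if nalu:
            nalus.append(nalu)
        pos = nxt
    return nalus


def nal_unit_type_h264(nalu: bytes) -> int:
    """H.264 stores nal_unit_type in the low 5 bits of the first NALU byte."""
    return nalu[0] & 0x1F


data = b"\x00\x00\x00\x01\x67\xaa\x00\x00\x01\x68\xbb\x00\x00\x00\x01\x65\xcc\xdd"
print([nal_unit_type_h264(n) for n in split_annexb(data)])  # [7, 8, 5]
```

If the decoder turns out to require start codes, each unit can be passed as START_CODE + nalu instead.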
Image processing jdk.V2D
v2d = jdk.V2D()
out1 = v2d.convert_format(input: jdk.Frame, out_format: jdk.V2DFormat) -> jdk.Frame
out2 = v2d.resize(input: jdk.Frame, out_width: int, out_height: int) -> jdk.Frame
out3 = v2d.resize_and_convert(input: jdk.Frame, out_width: int, out_height: int,
out_format: jdk.V2DFormat) -> jdk.Frame
v2d.fill_rect(image: jdk.Frame, rect: jdk.V2DRect, rgba_color: int) -> bool
v2d.draw_rect(image: jdk.Frame, rect: jdk.V2DRect, rgba_color: int, thickness: int = 2) -> bool
v2d.draw_rects(image: jdk.Frame, rects: list[jdk.V2DRect], rgba_color: int, thickness: int = 2) -> bool
mixed = v2d.blend(bottom: jdk.Frame, top: jdk.Frame) -> jdk.Frame
Corresponding example: v2d_demo.py. rgba_color uses the 0xAARRGGBB layout.
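Because the color is a single integer in 0xAARRGGBB order, a small helper makes the channel order explicit and avoids mistakes when writing literals by hand. pack_argb and unpack_argb below are hypothetical convenience functions, not part of pyjdk.

```python
def pack_argb(a: int, r: int, g: int, b: int) -> int:
    """Pack four 8-bit channels into a 0xAARRGGBB integer."""
    for name, value in (("a", a), ("r", r), ("g", g), ("b", b)):
        if not 0 <= value <= 255:
            raise ValueError(f"{name} must be in 0..255, got {value}")
    return (a << 24) | (r << 16) | (g << 8) | b


def unpack_argb(color: int):
    """Split a 0xAARRGGBB integer into an (a, r, g, b) tuple."""
    return ((color >> 24) & 0xFF, (color >> 16) & 0xFF, (color >> 8) & 0xFF, color & 0xFF)


opaque_yellow = pack_argb(255, 255, 255, 0)
print(hex(opaque_yellow))  # 0xffffff00
```

Read this way, the value 0xFFFFFF00 used in the drawing example is fully opaque yellow (alpha 255, red 255, green 255, blue 0).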
Display output
jdk.Vo
vo = jdk.Vo(width: int, height: int, pixfmt: jdk.MppPixelFormat = jdk.MppPixelFormat.NV12)
vo.send_frame(frame: jdk.Frame) -> int
jdk.Drm
drm = jdk.Drm(width: int, height: int,
stride: int = 0,
pixfmt: jdk.DrmPixelFormat = jdk.DrmPixelFormat.NV12,
card: str = "/dev/dri/card0")
drm.send_frame(frame: jdk.Frame) -> int
Corresponding examples: jdk_vo.py, jdk_drm.py.
End-to-end examples (consistent with the source code)
MIPI capture → encode → decode (see encode_decode.py)
cam = jdk.MipiCam.create("/dev/video50", 1920, 1080, jdk.PixelFormat.NV12)
enc = jdk.Encoder(1920, 1080, jdk.CodingType.H264, jdk.MppPixelFormat.NV12)
dec = jdk.Decoder(1920, 1080, jdk.CodingType.H264, jdk.MppPixelFormat.NV12)
for _ in range(60):
    f = cam.get_frame()    # raw NV12 frame from the camera
    pkt = enc.encode(f)    # H.264 bitstream frame
    yuv = dec.decode(pkt)  # decoded NV12 frame
Decode JPEG/MJPEG from bytes (see decode_jpeg.py)
import os
path = "examples/data/1920x1080.jpg"
size = os.path.getsize(path)  # assumption: the bitstream buffer is sized to the file
bs_frame = jdk.Frame(-1, size, 1920, 1080)
bs_frame.load_from_file(path, expected_size=size)
dec = jdk.Decoder(1920, 1080, jdk.CodingType.MJPEG, jdk.MppPixelFormat.NV12)
yuv = dec.decode(bs_frame)
NV12 → RGB888 + draw rectangles (see v2d_demo.py)
w, h = 1920, 1080  # matches the resolution in the input file name
f = jdk.Frame(-1, w*h*3//2, w, h)
f.load_from_file("frame_1920x1080_nv12.yuv", expected_size=w*h*3//2)
v2d = jdk.V2D()
rgb = v2d.convert_format(f, jdk.V2DFormat.RGB888)
v2d.draw_rects(f, [jdk.V2DRect(30,20,100,80)], 0xFFFFFF00, 4)
Error handling and performance points
- All blocking or compute-heavy paths (frame capture, encoding/decoding, V2D) release the GIL on the C++ side, which helps Python multi-threaded throughput.
- Frame.to_numpy_nv12(copy=False) returns a zero-copy view; manage its lifetime carefully and call f.release() when you are done, or use with f:.
- The decoder accepts two input forms, Frame and bytes-like, which makes it easy to adapt to network streams and file streams.
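Because the blocking calls release the GIL, capture and processing can overlap in separate threads. The sketch below shows one possible producer/consumer layout using only the standard library. capture_frame and process_frame are stand-ins (assumptions) for calls such as cam.get_frame() and enc.encode(), and time.sleep is used only because it also releases the GIL while waiting.

```python
import queue
import threading
import time


def capture_frame(index: int) -> bytes:
    """Stand-in for a blocking capture call; sleeping releases the GIL like real I/O would."""
    time.sleep(0.001)
    return bytes([index % 256]) * 16


def process_frame(frame: bytes) -> int:
    """Stand-in for an encode or V2D call; here it just computes a checksum."""
    return sum(frame)


def run_pipeline(num_frames: int, depth: int = 4) -> list:
    """Capture in one thread and process in another, linked by a bounded queue."""
    frames = queue.Queue(maxsize=depth)  # bounded, so capture cannot run far ahead
    results = []

    def producer():
        for i in range(num_frames):
            frames.put(capture_frame(i))
        frames.put(None)  # sentinel: no more frames

    def consumer():
        while True:
            frame = frames.get()
            if frame is None:
                break
            results.append(process_frame(frame))

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(run_pipeline(5))  # [0, 16, 32, 48, 64]
```

The queue depth is worth keeping below the camera's req_count, since every queued frame presumably holds a capture buffer until it is released.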