feat: Add knowledge base prepared for the 2025 DDC Developer Conference

hzm 2025-12-10 21:20:36 +08:00
commit b3d0e3fb7d
8 changed files with 511 additions and 0 deletions


@@ -0,0 +1,14 @@
# Human body detection and tracking algorithm deployment walkthrough
1. Set up the tros.b Humble environment
source /opt/tros/humble/setup.bash
2. Copy the configuration files the example needs out of the tros.b installation path.
cp -r /opt/tros/${TROS_DISTRO}/lib/mono2d_body_detection/config/ .
3. Start the launch file
ros2 launch mono2d_body_detection mono2d_body_detection.launch.py kps_model_type:=1 kps_image_width:=1920 kps_image_height:=1080 kps_model_file_name:=config/yolo11x_pose_nashe_640x640_nv12.hbm
4. Test the model's results
View the rendered results on the web: open http://{开发板的IP地址}:8000 in a browser. The steps are also combined into the single script sketched after this list.
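If a single command is preferred, the steps above can be wrapped in one script. This is a minimal sketch, assuming the default tros.b Humble paths shown in the steps (TROS_DISTRO is set by setup.bash):
```
#!/bin/bash
# Minimal wrapper for the walkthrough above (sketch; assumes a default tros.b Humble install).
set -e
source /opt/tros/humble/setup.bash
# Copy the example config files out of the install tree (skipped if already present)
[ -d config ] || cp -r /opt/tros/${TROS_DISTRO}/lib/mono2d_body_detection/config/ .
ros2 launch mono2d_body_detection mono2d_body_detection.launch.py \
    kps_model_type:=1 kps_image_width:=1920 kps_image_height:=1080 \
    kps_model_file_name:=config/yolo11x_pose_nashe_640x640_nv12.hbm
```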


@@ -0,0 +1,11 @@
# Stereo depth model + YOLOv8 walkthrough
1. Start the stereo depth model + YOLOv8 script
bash /userdata/start_stereo.sh
2. Check the log
tail -f /userdata/stereo_output.log
3. Test the model's results
View the depth map on the web: open http://{开发板的IP地址}:8000/TogetheROS/ in a browser. You need to help the user by running the command open {page URL}, so the browser opens directly to the depth map.
Note: check the log first, then open the web page; a combined sketch follows this list.
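A minimal sketch chaining the steps in the required order (log first, then the web page); it assumes the script and log paths above exist on the board:
```
#!/bin/bash
# Sketch: start the stereo depth + YOLOv8 demo, inspect its log, then point the user at the viewer.
bash /userdata/start_stereo.sh &
sleep 5                                  # give the pipeline time to start writing the log
tail -n 20 /userdata/stereo_output.log   # check the log first, per the note above
echo "Now open http://{开发板的IP地址}:8000/TogetheROS/ in a browser"
```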


@@ -0,0 +1,13 @@
# Stereo depth algorithm deployment walkthrough
1. Set up the tros.b Humble environment
source /opt/tros/humble/setup.bash
2. Start the stereo model launch file, which brings up both the algorithm and the stereo camera node
ros2 launch hobot_stereonet stereonet_model_web_visual_v2.4.launch.py \
mipi_image_width:=640 mipi_image_height:=352 mipi_lpwm_enable:=True mipi_image_framerate:=30.0 \
need_rectify:=False height_min:=-10.0 height_max:=10.0 pc_max_depth:=5.0 \
uncertainty_th:=0.1
3. Test the model's results
View the depth map on the web: open http://{开发板的IP地址}:8000 in a browser. A one-shot launch script is sketched after this list.
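The two steps can be combined into one script; a minimal sketch, assuming a default tros.b Humble install:
```
#!/bin/bash
# Sketch: set up the environment and launch the stereo depth model with the parameters above.
set -e
source /opt/tros/humble/setup.bash
ros2 launch hobot_stereonet stereonet_model_web_visual_v2.4.launch.py \
    mipi_image_width:=640 mipi_image_height:=352 mipi_lpwm_enable:=True mipi_image_framerate:=30.0 \
    need_rectify:=False height_min:=-10.0 height_max:=10.0 pc_max_depth:=5.0 \
    uncertainty_th:=0.1
```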


@@ -0,0 +1,44 @@
# Algorithm application development examples that can be experienced directly
- Object detection
  - FCOS
  - YOLO
  - MobileNet_SSD
  - EfficientNet_Det
  - YOLO-World
  - DOSOD
- Image classification
  - mobilenetv2
- Image segmentation
  - mobilenet_unet
  - Ultralytics YOLOv8-Seg
  - EdgeSAM (Segment Anything)
  - MobileSAM (Segment Anything)
- Human recognition
  - Human body detection and tracking
  - Hand keypoint detection
  - Gesture recognition
  - Face age detection
  - Face 106-keypoint detection
  - Human instance tracking
  - Human body detection and tracking (Ultralytics YOLO Pose)
  - Hand keypoint and gesture recognition (mediapipe)
- Vehicle-road cooperation
  - BEV perception
  - LiDAR object detection
  - Road structure parsing
- Spatial perception
  - Monocular elevation network detection
  - Monocular 3D indoor detection
  - Visual-inertial odometry
  - Stereo depth
  - Stereo OCC
- Intelligent speech
  - hobot_audio intelligent speech
  - Sensevoice
- Generative large models
  - Bloom large language model
  - Vision-language model
  - DeepSeek large language model
- Other algorithms
  - Text-image feature retrieval
  - Optical flow estimation


@@ -0,0 +1,43 @@
Algorithm application development examples that can be experienced directly

| Category / Algorithm | Platform | Example function |
| --- | --- | --- |
| Object detection / FCOS | X3/X5 | Start a MIPI/USB camera or local replay and render inference results on the Web |
| Object detection / YOLO | X3/X5/S100/S100P/Ultra | Start a MIPI/USB camera or local replay and render inference results on the Web |
| Object detection / MobileNet_SSD | X3/X5 | Start a MIPI/USB camera or local replay and render inference results on the Web |
| Object detection / EfficientNet_Det | X3 | Start a MIPI/USB camera or local replay and render inference results on the Web |
| Object detection / YOLO-World | X5 | Start a MIPI/USB camera or local replay and render inference results on the Web |
| Object detection / DOSOD | X5/S100/S100P | Start a MIPI/USB camera or local replay and render inference results on the Web |
| Image classification / mobilenetv2 | X3/X5/S100/S100P/Ultra | Start a MIPI/USB camera and render results on the Web; with local replay, rendered results are saved locally |
| Image segmentation / mobilenet_unet | X3/X5/S100/S100P | Start a MIPI/USB camera or local replay; rendered results are saved locally |
| Image segmentation / Ultralytics YOLOv8-Seg | X5/S100/S100P | Start a MIPI/USB camera or local replay; rendered results are saved locally |
| Image segmentation / EdgeSAM (Segment Anything) | X5/S100/S100P | Start a MIPI/USB camera or local replay; rendered results are saved locally |
| Image segmentation / MobileSAM (Segment Anything) | X5 | Start a MIPI/USB camera or local replay; rendered results are saved locally |
| Human recognition / body detection and tracking | X3/X5/Ultra | Start a MIPI/USB camera and render results on the Web |
| Human recognition / hand keypoint detection | X3/X5/Ultra | Start a MIPI/USB camera and render results on the Web |
| Human recognition / gesture recognition | X3/X5/Ultra | Start a MIPI/USB camera and render results on the Web |
| Human recognition / face age detection | X3/X5 | Start a MIPI/USB camera and render results on the Web |
| Human recognition / face 106-keypoint detection | X3/X5 | Start a MIPI/USB camera and render results on the Web |
| Human recognition / human instance tracking | S100/S100P | Start a MIPI/USB camera and render results on the Web |
| Human recognition / body detection and tracking (Ultralytics YOLO Pose) | S100/S100P | Start a MIPI/USB camera and render results on the Web |
| Human recognition / hand keypoint and gesture recognition (mediapipe) | S100 | Start a MIPI/USB camera and render results on the Web |
| Vehicle-road cooperation / BEV perception | Ultra/S100/S100P | Local replay with results rendered on the Web |
| Vehicle-road cooperation / LiDAR object detection | Ultra/S100/S100P | Local replay with results rendered on the Web |
| Vehicle-road cooperation / road structure parsing | X3 | Start a MIPI/USB camera or local replay; results rendered on the Web or saved locally |
| Spatial perception / monocular elevation network detection | X3/X5 | Local replay; rendered results are saved locally |
| Spatial perception / monocular 3D indoor detection | X3/X5 | Start a MIPI/USB camera and render results on the Web |
| Spatial perception / visual-inertial odometry | X3/X5/Ultra | Uses realsense images and IMU data as input; outputs the robot trajectory, which can be visualized in rviz2 on a PC |
| Spatial perception / stereo depth | X5/S100/S100P | Start a stereo camera, infer depth, and display it on the Web |
| Spatial perception / stereo OCC | X5/S100/S100P | Start a stereo camera; stereo images shown on the Web, occupancy grid shown in rviz2 |
| Intelligent speech / hobot_audio | X3/X5 | Start the audio module algorithm and show results in the terminal |
| Intelligent speech / Sensevoice | X5/S100/S100P | Start the audio module algorithm and show results in the terminal |
| Generative large models / Bloom large language model | X3 | On-device large language model experience |
| Generative large models / vision-language model | X5/S100/S100P | On-device vision-language model experience |
| Generative large models / DeepSeek large language model | S100/S100P | On-device large language model experience |
| Other algorithms / text-image feature retrieval | X5/S100/S100P | CLIP ingestion/retrieval; ingestion results saved locally, retrieval results shown on the Web |
| Other algorithms / optical flow estimation | X5 | Start a MIPI/USB camera or local replay and render inference results on the Web |


@@ -0,0 +1,43 @@
Algorithm application development examples that can be experienced directly

| Category / Algorithm | Platform |
| --- | --- |
| Object detection / FCOS | X3/X5 |
| Object detection / YOLO | X3/X5/S100/S100P/Ultra |
| Object detection / MobileNet_SSD | X3/X5 |
| Object detection / EfficientNet_Det | X3 |
| Object detection / YOLO-World | X5 |
| Object detection / DOSOD | X5/S100/S100P |
| Image classification / mobilenetv2 | X3/X5/S100/S100P/Ultra |
| Image segmentation / mobilenet_unet | X3/X5/S100/S100P |
| Image segmentation / Ultralytics YOLOv8-Seg | X5/S100/S100P |
| Image segmentation / EdgeSAM (Segment Anything) | X5/S100/S100P |
| Image segmentation / MobileSAM (Segment Anything) | X5 |
| Human recognition / body detection and tracking | X3/X5/Ultra |
| Human recognition / hand keypoint detection | X3/X5/Ultra |
| Human recognition / gesture recognition | X3/X5/Ultra |
| Human recognition / face age detection | X3/X5 |
| Human recognition / face 106-keypoint detection | X3/X5 |
| Human recognition / human instance tracking | S100/S100P |
| Human recognition / body detection and tracking (Ultralytics YOLO Pose) | S100/S100P |
| Human recognition / hand keypoint and gesture recognition (mediapipe) | S100 |
| Vehicle-road cooperation / BEV perception | Ultra/S100/S100P |
| Vehicle-road cooperation / LiDAR object detection | Ultra/S100/S100P |
| Vehicle-road cooperation / road structure parsing | X3 |
| Spatial perception / monocular elevation network detection | X3/X5 |
| Spatial perception / monocular 3D indoor detection | X3/X5 |
| Spatial perception / visual-inertial odometry | X3/X5/Ultra |
| Spatial perception / stereo depth | X5/S100/S100P |
| Spatial perception / stereo OCC | X5/S100/S100P |
| Intelligent speech / hobot_audio | X3/X5 |
| Intelligent speech / Sensevoice | X5/S100/S100P |
| Generative large models / Bloom large language model | X3 |
| Generative large models / vision-language model | X5/S100/S100P |
| Generative large models / DeepSeek large language model | S100/S100P |
| Other algorithms / text-image feature retrieval | X5/S100/S100P |
| Other algorithms / optical flow estimation | X5 |


@@ -0,0 +1,328 @@
# Write an example program that opens the stereo camera for the user directly
The user is currently connected to a stereo camera.
mipi_camera_web.py in /app/pydev_demo/05_web_display_camera_sample does not yet support opening the stereo camera. When the user wants an example program that opens the stereo camera, generate a mipi_camera_web_tmp.py file for them.
The mipi_camera_web_tmp.py code below has been verified and works as-is; be sure to generate code identical to the following:
```
#!/usr/bin/env python3
import sys, os
import signal
import numpy as np
import cv2
import google.protobuf
import asyncio
import websockets
import x3_pb2
import time
import subprocess
# Camera API libs
from hobot_vio import libsrcampy as srcampy
from hobot_dnn import pyeasy_dnn

fps = 30

import ctypes
import json

image_counter = None
output_tensors = None
fcos_postprocess_info = None

# ctypes mirrors of the C structures used by the post-process library
class hbSysMem_t(ctypes.Structure):
    _fields_ = [
        ("phyAddr", ctypes.c_double),
        ("virAddr", ctypes.c_void_p),
        ("memSize", ctypes.c_int)
    ]

class hbDNNQuantiShift_yt(ctypes.Structure):
    _fields_ = [
        ("shiftLen", ctypes.c_int),
        ("shiftData", ctypes.c_char_p)
    ]

class hbDNNQuantiScale_t(ctypes.Structure):
    _fields_ = [
        ("scaleLen", ctypes.c_int),
        ("scaleData", ctypes.POINTER(ctypes.c_float)),
        ("zeroPointLen", ctypes.c_int),
        ("zeroPointData", ctypes.c_char_p)
    ]

class hbDNNTensorShape_t(ctypes.Structure):
    _fields_ = [
        ("dimensionSize", ctypes.c_int * 8),
        ("numDimensions", ctypes.c_int)
    ]

class hbDNNTensorProperties_t(ctypes.Structure):
    _fields_ = [
        ("validShape", hbDNNTensorShape_t),
        ("alignedShape", hbDNNTensorShape_t),
        ("tensorLayout", ctypes.c_int),
        ("tensorType", ctypes.c_int),
        ("shift", hbDNNQuantiShift_yt),
        ("scale", hbDNNQuantiScale_t),
        ("quantiType", ctypes.c_int),
        ("quantizeAxis", ctypes.c_int),
        ("alignedByteSize", ctypes.c_int),
        ("stride", ctypes.c_int * 8)
    ]

class hbDNNTensor_t(ctypes.Structure):
    _fields_ = [
        ("sysMem", hbSysMem_t * 4),
        ("properties", hbDNNTensorProperties_t)
    ]

class FcosPostProcessInfo_t(ctypes.Structure):
    _fields_ = [
        ("height", ctypes.c_int),
        ("width", ctypes.c_int),
        ("ori_height", ctypes.c_int),
        ("ori_width", ctypes.c_int),
        ("score_threshold", ctypes.c_float),
        ("nms_threshold", ctypes.c_float),
        ("nms_top_k", ctypes.c_int),
        ("is_pad_resize", ctypes.c_int)
    ]

libpostprocess = ctypes.CDLL('/usr/lib/libpostprocess.so')

get_Postprocess_result = libpostprocess.FcosPostProcess
get_Postprocess_result.argtypes = [ctypes.POINTER(FcosPostProcessInfo_t)]
get_Postprocess_result.restype = ctypes.c_char_p

def get_TensorLayout(Layout):
    if Layout == "NCHW":
        return int(2)
    else:
        return int(0)

def signal_handler(signal, frame):
    sys.exit(0)

# detection model class names
def get_classes():
    return np.array(["person", "bicycle", "car",
                     "motorcycle", "airplane", "bus",
                     "train", "truck", "boat",
                     "traffic light", "fire hydrant", "stop sign",
                     "parking meter", "bench", "bird",
                     "cat", "dog", "horse",
                     "sheep", "cow", "elephant",
                     "bear", "zebra", "giraffe",
                     "backpack", "umbrella", "handbag",
                     "tie", "suitcase", "frisbee",
                     "skis", "snowboard", "sports ball",
                     "kite", "baseball bat", "baseball glove",
                     "skateboard", "surfboard", "tennis racket",
                     "bottle", "wine glass", "cup",
                     "fork", "knife", "spoon",
                     "bowl", "banana", "apple",
                     "sandwich", "orange", "broccoli",
                     "carrot", "hot dog", "pizza",
                     "donut", "cake", "chair",
                     "couch", "potted plant", "bed",
                     "dining table", "toilet", "tv",
                     "laptop", "mouse", "remote",
                     "keyboard", "cell phone", "microwave",
                     "oven", "toaster", "sink",
                     "refrigerator", "book", "clock",
                     "vase", "scissors", "teddy bear",
                     "hair drier", "toothbrush"])

def bgr2nv12_opencv(image):
    height, width = image.shape[0], image.shape[1]
    area = height * width
    yuv420p = cv2.cvtColor(image, cv2.COLOR_BGR2YUV_I420).reshape((area * 3 // 2,))
    y = yuv420p[:area]
    uv_planar = yuv420p[area:].reshape((2, area // 4))
    uv_packed = uv_planar.transpose((1, 0)).reshape((area // 2,))
    nv12 = np.zeros_like(yuv420p)
    nv12[:height * width] = y
    nv12[height * width:] = uv_packed
    return nv12

def get_hw(pro):
    if pro.layout == "NCHW":
        return pro.shape[2], pro.shape[3]
    else:
        return pro.shape[1], pro.shape[2]

def print_properties(pro):
    print("tensor type:", pro.tensor_type)
    print("data type:", pro.dtype)
    print("layout:", pro.layout)
    print("shape:", pro.shape)

def limit_display_cord(coor):
    coor[0] = max(min(1920, coor[0]), 0)
    # min coor is set to 2 not 0, leaving room for string display
    coor[1] = max(min(1080, coor[1]), 2)
    coor[2] = max(min(1920, coor[2]), 0)
    coor[3] = max(min(1080, coor[3]), 0)
    return coor

# def serialize(FrameMessage, data):
def serialize(FrameMessage, data, ori_w, ori_h, target_w, target_h):
    # Scaling factors from original to target resolution
    scale_x = target_w / ori_w
    scale_y = target_h / ori_h
    if data:
        for result in data:
            # get class name
            Target = x3_pb2.Target()
            bbox = result['bbox']    # bounding box coordinates
            score = result['score']  # confidence score
            id = int(result['id'])   # class id
            name = result['name']    # class name
            # print(f"bbox: {bbox}, score: {score}, id: {id}, name: {name}")
            coor = [round(i) for i in bbox]
            # Rescale the bbox coordinates
            coor[0] = int(coor[0] * scale_x)
            coor[1] = int(coor[1] * scale_y)
            coor[2] = int(coor[2] * scale_x)
            coor[3] = int(coor[3] * scale_y)
            bbox = limit_display_cord(coor)
            Target.type_ = classes[id]
            Box = x3_pb2.Box()
            Box.type_ = classes[id]
            Box.score_ = float(score)
            Box.top_left_.x_ = int(bbox[0])
            Box.top_left_.y_ = int(bbox[1])
            Box.bottom_right_.x_ = int(bbox[2])
            Box.bottom_right_.y_ = int(bbox[3])
            Target.boxes_.append(Box)
            FrameMessage.smart_msg_.targets_.append(Target)
    prot_buf = FrameMessage.SerializeToString()
    return prot_buf

models = pyeasy_dnn.load('../models/fcos_512x512_nv12.bin')
input_shape = (512, 512)
cam = srcampy.Camera()
# Open the camera with two output streams: one for the model, one for display
cam.open_cam(0, -1, -1, [512, 544], [512, 640], 1280, 1088)
enc = srcampy.Encoder()
enc.encode(0, 3, 544, 640)
classes = get_classes()
# Print input tensor properties
print_properties(models[0].inputs[0].properties)
print("--- model output properties ---")
# Print output tensor properties
for output in models[0].outputs:
    print_properties(output.properties)
# Fill in the post-process info struct
fcos_postprocess_info = FcosPostProcessInfo_t()
fcos_postprocess_info.height = 512
fcos_postprocess_info.width = 512
fcos_postprocess_info.ori_height = 640
fcos_postprocess_info.ori_width = 544
fcos_postprocess_info.score_threshold = 0.5
fcos_postprocess_info.nms_threshold = 0.6
fcos_postprocess_info.nms_top_k = 500
fcos_postprocess_info.is_pad_resize = 0

output_tensors = (hbDNNTensor_t * len(models[0].outputs))()
for i in range(len(models[0].outputs)):
    output_tensors[i].properties.tensorLayout = get_TensorLayout(models[0].outputs[i].properties.layout)
    # print(output_tensors[i].properties.tensorLayout)
    if (len(models[0].outputs[i].properties.scale_data) == 0):
        output_tensors[i].properties.quantiType = 0
    else:
        output_tensors[i].properties.quantiType = 2
        scale_data_tmp = models[0].outputs[i].properties.scale_data.reshape(1, 1, 1, models[0].outputs[i].properties.shape[3])
        output_tensors[i].properties.scale.scaleData = scale_data_tmp.ctypes.data_as(ctypes.POINTER(ctypes.c_float))
    for j in range(len(models[0].outputs[i].properties.shape)):
        output_tensors[i].properties.validShape.dimensionSize[j] = models[0].outputs[i].properties.shape[j]
        output_tensors[i].properties.alignedShape.dimensionSize[j] = models[0].outputs[i].properties.shape[j]

async def web_service(websocket, path=None):
    while True:
        FrameMessage = x3_pb2.FrameMessage()
        FrameMessage.img_.height_ = 640
        FrameMessage.img_.width_ = 544
        FrameMessage.img_.type_ = "JPEG"
        img = cam.get_img(2, 512, 512)
        img = np.frombuffer(img, dtype=np.uint8)
        t0 = time.time()
        outputs = models[0].forward(img)
        t1 = time.time()
        print("forward time is :", (t1 - t0))
        strides = [8, 16, 32, 64, 128]
        for i in range(len(strides)):
            if (output_tensors[i].properties.quantiType == 0):
                output_tensors[i].sysMem[0].virAddr = ctypes.cast(outputs[i].buffer.ctypes.data_as(ctypes.POINTER(ctypes.c_float)), ctypes.c_void_p)
                output_tensors[i + 5].sysMem[0].virAddr = ctypes.cast(outputs[i + 5].buffer.ctypes.data_as(ctypes.POINTER(ctypes.c_float)), ctypes.c_void_p)
                output_tensors[i + 10].sysMem[0].virAddr = ctypes.cast(outputs[i + 10].buffer.ctypes.data_as(ctypes.POINTER(ctypes.c_float)), ctypes.c_void_p)
            else:
                output_tensors[i].sysMem[0].virAddr = ctypes.cast(outputs[i].buffer.ctypes.data_as(ctypes.POINTER(ctypes.c_int32)), ctypes.c_void_p)
                output_tensors[i + 5].sysMem[0].virAddr = ctypes.cast(outputs[i + 5].buffer.ctypes.data_as(ctypes.POINTER(ctypes.c_int32)), ctypes.c_void_p)
                output_tensors[i + 10].sysMem[0].virAddr = ctypes.cast(outputs[i + 10].buffer.ctypes.data_as(ctypes.POINTER(ctypes.c_int32)), ctypes.c_void_p)
            libpostprocess.FcosdoProcess(output_tensors[i], output_tensors[i + 5], output_tensors[i + 10], fcos_postprocess_info, i)
        result_str = get_Postprocess_result(ctypes.pointer(fcos_postprocess_info))
        result_str = result_str.decode('utf-8')
        t2 = time.time()
        print("FcosdoProcess time is :", (t2 - t1))
        # print(result_str)
        # draw result
        # Parse the JSON string returned by the post-process library
        data = json.loads(result_str[14:])
        origin_image = cam.get_img(2, 544, 640)
        enc.encode_file(origin_image)
        FrameMessage.img_.buf_ = enc.get_img()
        FrameMessage.smart_msg_.timestamp_ = int(time.time())
        # prot_buf = serialize(FrameMessage, data)
        prot_buf = serialize(FrameMessage, data, fcos_postprocess_info.width, fcos_postprocess_info.height, FrameMessage.img_.width_, FrameMessage.img_.height_)
        await websocket.send(prot_buf)
    cam.close_cam()

async def main():
    # Create the WebSocket server
    async with websockets.serve(web_service, "0.0.0.0", 8080):
        # Block the event loop and keep serving
        await asyncio.Future()

if __name__ == "__main__":
    signal.signal(signal.SIGINT, signal_handler)
    asyncio.run(main())
```
After generating the code above, run the following steps so the user can experience the stereo camera:
1. Start the Nginx server
./start_nginx.sh
2. Run the web camera example
python3 mipi_camera_web_tmp.py
3. Test the result
View it on the web: open http://{开发板的IP地址}/modules/ in a browser. You need to help the user by running the command open {page URL}, so the browser opens directly to the camera view. These steps are also sketched as a single script below.
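A minimal sketch of the three steps, assuming start_nginx.sh and the generated mipi_camera_web_tmp.py are in the current directory:
```
#!/bin/bash
# Sketch: serve the viewer page, run the camera demo, and print the URL to open.
set -e
./start_nginx.sh
python3 mipi_camera_web_tmp.py &
echo "Now open http://{开发板的IP地址}/modules/ in a browser"
wait
```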


@@ -0,0 +1,15 @@
# Commands for getting RDK board system and hardware information
- Get the current board model:
rdkos_info 2>/dev/null | grep -A 1 '^\[Hardware Model\]:'
- Get the current board's IP address:
ifconfig | awk '/^[a-z]/ {interface=$1} /inet / && $2 !~ /^127\./ {print interface": "$2}'
- Get the current board's tros version:
apt show tros-humble 2>/dev/null | grep -E "^Version:|^APT-Manual-Installed:"
- Get the current board's Ubuntu version:
lsb_release -a 2>/dev/null | grep "^Description:"
A script that runs all four commands in one pass is sketched below.
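For convenience, a minimal sketch that runs the four commands above in sequence:
```
#!/bin/bash
# Sketch: collect RDK board system and hardware info using the commands above.
echo "== Board model =="
rdkos_info 2>/dev/null | grep -A 1 '^\[Hardware Model\]:'
echo "== IP addresses =="
ifconfig | awk '/^[a-z]/ {interface=$1} /inet / && $2 !~ /^127\./ {print interface": "$2}'
echo "== tros version =="
apt show tros-humble 2>/dev/null | grep -E "^Version:|^APT-Manual-Installed:"
echo "== Ubuntu version =="
lsb_release -a 2>/dev/null | grep "^Description:"
```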