diff --git a/README.md b/README.md index a35faa7..ab152a1 100644 --- a/README.md +++ b/README.md @@ -1,29 +1,41 @@ -- [Docker一键安装脚本](#docker一键安装脚本) +- [Docker 一键安装脚本](#docker-一键安装脚本) - [支持平台](#支持平台) - [安装](#安装) - [测试](#测试) +- [BPU 模型性能测试工具](#bpu-模型性能测试工具) + - [工具说明](#工具说明) + - [目录结构](#目录结构) + - [RDK S100 系列使用方法(Nash-e / Nash-m)](#rdk-s100-系列使用方法nash-e--nash-m) + - [RDK X5 系列使用方法(Bayes-e)](#rdk-x5-系列使用方法bayes-e) + - [交互式进入容器](#交互式进入容器) + - [task.json 协议说明](#taskjson-协议说明) + - [输出结果说明](#输出结果说明) +--- -## Docker一键安装脚本 +## Docker 一键安装脚本 ### 支持平台 -1. RDK X5 / RDK X5 Module (RDK OS版本 >= 3.3.3, MiniBoot版本 >= Sep-03-2025) -2. RDK S100 / RDK S100P (RDK OS版本 >= 4.0.4-Beta, Miniboot版本 >= 4.0.4-20251015222314) + +| 硬件平台 | BPU 架构 | RDK OS 版本要求 | MiniBoot 版本要求 | +|----------|----------|-----------------|-------------------| +| RDK S100 / RDK S100P | Nash-e / Nash-m | >= 4.0.4-Beta | >= 4.0.4-20251015222314 | +| RDK X5 / RDK X5 Module | Bayes-e | >= 3.3.3 | >= Sep-03-2025 | ### 安装 -脚本路径`RDK_Docker_Tools/install_docker.sh`, 该脚本会在RDK板端一键安装docker软件, 配置好网络相关配置. +脚本路径 `RDK_Docker_Tools/install_docker/install_docker.sh`,该脚本会在 RDK 板端一键安装 Docker 软件,配置好网络相关配置。 -1. 安装到使用过程无需重启. -2. 安装过程中eth0网络配置不会做做任何修改, ssh不会挂掉. -3. 请使用root用户安装. -4. 请确保板卡能正常访问互联网, `/`目录有2GB可用空间. +1. 安装到使用过程无需重启。 +2. 安装过程中 eth0 网络配置不会做任何修改,ssh 不会挂掉。 +3. 请使用 root 用户安装。 +4. 请确保板卡能正常访问互联网,`/` 目录有 2GB 可用空间。 ```bash -bash install_docker.sh +bash install_docker/install_docker.sh ``` 预期输出 @@ -42,131 +54,23 @@ bash install_docker.sh [INFO] 注意: 本脚本不会修改 eth0 的任何配置 [INFO] 检查 Docker 是否已安装... [INFO] 检查依赖命令... -[WARN] 以下命令缺失: curl, 尝试安装... -Reading package lists... Done -Building dependency tree... Done -Reading state information... 
Done -The following additional packages will be installed: - libcurl4 libcurl4-openssl-dev -Suggested packages: - libcurl4-doc libidn11-dev librtmp-dev libssh2-1-dev -The following NEW packages will be installed: - curl -The following packages will be upgraded: - libcurl4 libcurl4-openssl-dev -2 upgraded, 1 newly installed, 0 to remove and 414 not upgraded. -Need to get 190 kB/867 kB of archives. -After this operation, 439 kB of additional disk space will be used. -Get:1 http://mirrors.tuna.tsinghua.edu.cn/ubuntu-ports jammy-updates/main arm64 curl arm64 7.81.0-1ubuntu1.23 [190 kB] -Fetched 190 kB in 0s (475 kB/s) -perl: warning: Setting locale failed. -perl: warning: Please check that your locale settings: - LANGUAGE = (unset), - LC_ALL = (unset), - LANG = "en_US.UTF-8" - are supported and installed on your system. -perl: warning: Falling back to the standard locale ("C"). -locale: Cannot set LC_CTYPE to default locale: No such file or directory -locale: Cannot set LC_MESSAGES to default locale: No such file or directory -locale: Cannot set LC_ALL to default locale: No such file or directory -(Reading database ... 219577 files and directories currently installed.) -Preparing to unpack .../libcurl4-openssl-dev_7.81.0-1ubuntu1.23_arm64.deb ... -Unpacking libcurl4-openssl-dev:arm64 (7.81.0-1ubuntu1.23) over (7.81.0-1ubuntu1.21) ... -Preparing to unpack .../libcurl4_7.81.0-1ubuntu1.23_arm64.deb ... -Unpacking libcurl4:arm64 (7.81.0-1ubuntu1.23) over (7.81.0-1ubuntu1.21) ... -Selecting previously unselected package curl. -Preparing to unpack .../curl_7.81.0-1ubuntu1.23_arm64.deb ... -Unpacking curl (7.81.0-1ubuntu1.23) ... -Setting up libcurl4:arm64 (7.81.0-1ubuntu1.23) ... -Setting up curl (7.81.0-1ubuntu1.23) ... -Setting up libcurl4-openssl-dev:arm64 (7.81.0-1ubuntu1.23) ... -Processing triggers for man-db (2.10.2-1) ... -Processing triggers for libc-bin (2.35-0ubuntu3.11) ... -[OK] 依赖命令检查完毕 ✓ -[INFO] 检查网络连通性... 
-[OK] 可访问 Ubuntu ports 源 (ports.ubuntu.com) ✓ -[INFO] 更新 apt 软件包缓存... -[OK] apt 缓存更新完毕 ✓ -[INFO] 安装必要依赖包... -[INFO] 安装缺失依赖: apt-transport-https ... -[OK] 依赖包就绪 ✓ -[INFO] 开始安装 Docker (docker.io)... -... -Adding group `docker' (GID 135) ... -Done. -Created symlink /etc/systemd/system/multi-user.target.wants/docker.service → /lib/systemd/system/docker.service. -Created symlink /etc/systemd/system/sockets.target.wants/docker.socket → /lib/systemd/system/docker.socket. -Job for docker.service failed because the control process exited with error code. -See "systemctl status docker.service" and "journalctl -xeu docker.service" for details. -invoke-rc.d: initscript docker, action "start" failed. -● docker.service - Docker Application Container Engine - Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled) - Active: activating (auto-restart) (Result: exit-code) since Wed 2026-03-18 11:12:24 CST; 7ms ago -TriggeredBy: ● docker.socket - Docs: https://docs.docker.com - Process: 231291 ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock (code=exited, status=1/FAILURE) - Main PID: 231291 (code=exited, status=1/FAILURE) -Setting up dnsmasq-base (2.90-0ubuntu0.22.04.1) ... -Setting up docker-buildx (0.21.3-0ubuntu1~22.04.1) ... -Setting up ubuntu-fan (0.12.16) ... -Created symlink /etc/systemd/system/multi-user.target.wants/ubuntu-fan.service → /lib/systemd/system/ubuntu-fan.service. -Processing triggers for man-db (2.10.2-1) ... -Processing triggers for dbus (1.12.20-2ubuntu4.1) ... -Processing triggers for libc-bin (2.35-0ubuntu3.11) ... [OK] Docker 安装完毕 ✓ -[INFO] 检查 iptables 后端兼容性... -update-alternatives: using /usr/sbin/iptables-legacy to provide /usr/sbin/iptables (iptables) in manual mode -update-alternatives: using /usr/sbin/ip6tables-legacy to provide /usr/sbin/ip6tables (ip6tables) in manual mode -[OK] 已切换到 iptables-legacy 后端 ✓ [INFO] 配置 Docker 镜像加速源... 
-[WARN] 内核缺少 iptable_raw 模块, 将启用 allow-direct-routing 绕过 raw 表限制 [OK] 镜像加速配置完毕 ✓ [INFO] 启动 Docker 服务... [OK] Docker 服务已通过 systemd 启动 ✓ -[INFO] 检查 Docker Hub 及镜像源 DNS 解析... -[OK] DNS hosts 修复完毕 ✓ - -[INFO] 验证 Docker 安装... -[OK] Docker 版本: Docker version 28.2.2, build 28.2.2-0ubuntu1~22.04.1 -[OK] dockerd 进程运行中 ✓ -[OK] Docker socket 存在: /var/run/docker.sock ✓ -[OK] docker info 执行成功 ✓ - -[INFO] 验证核心命令可用性... -[OK] docker pull --help ✓ -[OK] docker push --help ✓ -[OK] docker commit --help ✓ -[OK] docker export --help ✓ -[OK] docker load --help ✓ -[OK] docker images ✓ -[OK] docker ps ✓ - +... ============================================================ [OK] Docker 安装完成! ============================================================ - - 常用命令速查: - docker pull <镜像> # 拉取镜像 - docker push <镜像> # 推送镜像 - docker commit <容器> <镜像> # 提交容器为镜像 - docker export <容器> -o x.tar # 导出容器 - docker load -i <文件.tar> # 加载镜像 - docker images # 查看本地镜像 - docker ps -a # 查看所有容器 - - 注意事项: - - eth0 网络配置未做任何修改 - - 本次安装无需重启系统 - - 镜像加速配置: /etc/docker/daemon.json - - Docker 日志: journalctl -u docker 或 /var/log/dockerd.log ``` ### 测试 ```bash -bash test_docker.sh +bash install_docker/test_docker.sh ``` 预期输出 @@ -182,99 +86,263 @@ bash test_docker.sh ------------------------------------------------------------ [INFO] [1/5] 测试 docker pull... ------------------------------------------------------------ -[INFO] 尝试拉取: hello-world -Using default tag: latest -latest: Pulling from library/hello-world -198f93fd5094: Pull complete -Digest: sha256:85404b3c53951c3ff5d40de0972b1bb21fafa2e8daa235355baf44f33db9dbdd -Status: Downloaded newer image for hello-world:latest -docker.io/library/hello-world:latest [OK] docker pull 成功 ✓ ------------------------------------------------------------ [INFO] [2/5] 测试 docker run... ------------------------------------------------------------ - -Hello from Docker! -This message shows that your installation appears to be working correctly. 
- -To generate this message, Docker took the following steps: - 1. The Docker client contacted the Docker daemon. - 2. The Docker daemon pulled the "hello-world" image from the Docker Hub. - (arm64v8) - 3. The Docker daemon created a new container from that image which runs the - executable that produces the output you are currently reading. - 4. The Docker daemon streamed that output to the Docker client, which sent it - to your terminal. - -To try something more ambitious, you can run an Ubuntu container with: - $ docker run -it ubuntu bash - -Share images, automate workflows, and more with a free Docker ID: - https://hub.docker.com/ - -For more examples and ideas, visit: - https://docs.docker.com/get-started/ - [OK] docker run 成功 ✓ ------------------------------------------------------------ [INFO] [3/5] 测试 docker commit... ------------------------------------------------------------ -sha256:c2c4ee901ecc7b04b29dadeebc5547450e00ec105d63b7109a462e60628fb866 [OK] docker commit 成功 -> hello-world-committed:test ✓ -[OK] commit 镜像验证通过 ✓ ------------------------------------------------------------ [INFO] [4/5] 测试 docker export... ------------------------------------------------------------ -[OK] docker export 成功 -> /tmp/docker_test_232981/hello-world-export.tar (16K) ✓ +[OK] docker export 成功 ✓ ------------------------------------------------------------ [INFO] [5/5] 测试 docker save & load... ------------------------------------------------------------ -[OK] docker save 成功 -> /tmp/docker_test_232981/hello-world-load.tar (20K) ✓ -Untagged: hello-world-committed:test -Deleted: sha256:c2c4ee901ecc7b04b29dadeebc5547450e00ec105d63b7109a462e60628fb866 -[INFO] 已删除本地镜像 hello-world-committed:test, 准备 load... 
-Loaded image: hello-world-committed:test +[OK] docker save 成功 ✓ [OK] docker load 成功, 镜像已恢复 ✓ ------------------------------------------------------------- -[INFO] [附] docker push 说明 ------------------------------------------------------------- -[WARN] docker push 需要登录镜像仓库, 本脚本不自动执行 - 推送到 Docker Hub: - docker login - docker tag hello-world-committed:test <你的用户名>/hello-world:test - docker push <你的用户名>/hello-world:test - - 推送到私有仓库: - docker tag hello-world-committed:test <仓库地址>/hello-world:test - docker push <仓库地址>/hello-world:test - ------------------------------------------------------------- -[INFO] 清理测试资源... ------------------------------------------------------------- -docker_test_232981 -[OK] 容器已清理 ✓ -Untagged: hello-world-committed:test -Deleted: sha256:c2c4ee901ecc7b04b29dadeebc5547450e00ec105d63b7109a462e60628fb866 -[OK] commit 镜像已清理 ✓ -[OK] 临时文件已清理 ✓ - ============================================================ [OK] 全部测试通过! ============================================================ - - 测试结果汇总: - ✓ docker pull - 镜像拉取 - ✓ docker run - 容器运行 - ✓ docker commit - 容器提交为镜像 - ✓ docker export - 容器导出为 tar - ✓ docker save - 镜像保存为 tar - ✓ docker load - 从 tar 加载镜像 - - docker push - 需手动配置仓库后执行 ``` +--- + +## BPU 模型性能测试工具 + +### 工具说明 + +基于 `hrt_model_exec perf` 命令,对指定的 `.hbm` 或 `.bin` 模型文件,分别在线程数 1、2、3、4 下进行性能测试,输出每个线程数对应的平均延迟和 FPS,并将结果保存为 JSON 文件。 + +| 硬件平台 | BPU 架构 | 镜像名称 | +|----------|----------|----------| +| RDK S100 / RDK S100P | Nash-e / Nash-m | `hrt_perf_nashem:v3.7.3` | +| RDK X5 / RDK X5 Module | Bayes-e | `hrt_perf_bayese:v1.24.5` | + + +### 目录结构 + +``` +bpu_model_perf_images/ +├── Dockerfile.nashem_perf # RDK S100 (Nash-e/Nash-m) Dockerfile +├── Dockerfile.bayese_perf # RDK X5 (Bayes-e) Dockerfile +├── docker_build_nashem.sh # RDK S100 镜像构建脚本 +├── docker_build_bayese.sh # RDK X5 镜像构建脚本 +├── docker_run_nashem_perf.sh # RDK S100 自动化 perf 运行脚本 +├── docker_run_bayese_perf.sh # RDK X5 自动化 perf 运行脚本 +├── docker_test_nashem.sh # RDK S100 交互式进入容器脚本 +├── 
docker_test_bayese.sh # RDK X5 交互式进入容器脚本 +├── workspace/ +│ ├── perf.py # 性能测试主脚本(两平台共用) +│ └── entrypoint.sh # 容器入口脚本(两平台共用) +├── example_fs/ # RDK S100 示例输入输出 +│ ├── input/ +│ │ ├── task.json # 输入协议文件 +│ │ └── *.hbm # 模型文件(放在此处) +│ └── output/ +│ └── result.json # 运行后自动生成 +└── example_fs_bayese/ # RDK X5 示例输入输出 + ├── input/ + │ ├── task.json # 输入协议文件 + │ └── *.hbm / *.bin # 模型文件(放在此处) + └── output/ + └── result.json # 运行后自动生成 +``` + + +### RDK S100 系列使用方法(Nash-e / Nash-m) + +**第一步:构建镜像**(只需构建一次,之后可直接交付镜像) + +```bash +cd bpu_model_perf_images +bash docker_build_nashem.sh +``` + +构建脚本会自动从板子上找到 `hrt_model_exec`(`which hrt_model_exec`)并打包进镜像,无需手动操作。 + +**第二步:准备输入** + +将模型文件(`.hbm`)放入 `example_fs/input/`,并编辑 `example_fs/input/task.json`: + +```json +{ + "model_relative_path": "your_model.hbm", + "frame_count": 200 +} +``` + +**第三步:运行** + +```bash +bash docker_run_nashem_perf.sh +``` + +运行脚本将 `example_fs/input` 挂载为容器内 `/workspace/input`,`example_fs/output` 挂载为 `/workspace/output`。 + +**预期输出** + +```text +============================================================ +task.json content: +{ + "model_relative_path": "yolo11n_detect_nashe_640x640_nv12.hbm", + "frame_count": 200 +} +============================================================ +[OK] 1 model(s) validated, frame_count=200 + +[Benchmarking] yolo11n_detect_nashe_640x640_nv12.hbm (/workspace/input/yolo11n_detect_nashe_640x640_nv12.hbm) + thread_num=1 ... FPS=625.20 avg_latency=1.538ms + thread_num=2 ... FPS=1133.43 avg_latency=1.703ms + thread_num=3 ... FPS=1171.23 avg_latency=2.482ms + thread_num=4 ... 
FPS=1167.98 avg_latency=3.322ms + +================================================================================ ++---------------------------------------+-----------+------------------+------------+ +| Model | Threads | Avg Latency(ms) | FPS | ++---------------------------------------+-----------+------------------+------------+ +| yolo11n_detect_nashe_640x640_nv12.hbm | 1 | 1.538 | 625.20 | +| | 2 | 1.703 | 1133.43 | +| | 3 | 2.482 | 1171.23 | +| | 4 | 3.322 | 1167.98 | ++---------------------------------------+-----------+------------------+------------+ + +Results saved to: /workspace/output/result.json +``` + + +### RDK X5 系列使用方法(Bayes-e) + +**第一步:构建镜像** + +```bash +cd bpu_model_perf_images +bash docker_build_bayese.sh +``` + +**第二步:准备输入** + +将模型文件(`.hbm` 或 `.bin`)放入 `example_fs_bayese/input/`,并编辑 `example_fs_bayese/input/task.json`: + +```json +{ + "model_relative_path": "your_model.hbm", + "frame_count": 200 +} +``` + +**第三步:运行** + +```bash +bash docker_run_bayese_perf.sh +``` + + +### 交互式进入容器 + +镜像内置了 `hrt_model_exec` 等 BPU 工具,可以交互式进入容器手动调试模型。 + +**RDK S100(Nash-e / Nash-m)** + +```bash +cd bpu_model_perf_images +bash docker_test_nashem.sh +``` + +**RDK X5(Bayes-e)** + +```bash +cd bpu_model_perf_images +bash docker_test_bayese.sh +``` + +进入容器后可直接使用 `hrt_model_exec`,例如: + +```bash +# 查看帮助 +hrt_model_exec --help + +# 手动跑 perf +hrt_model_exec perf --model_file /workspace/input/xxx.hbm --thread_num 2 --frame_count 200 + +# 手动跑推理 +hrt_model_exec infer --model_file /workspace/input/xxx.hbm --input_file input.jpg +``` + +> 容器以 `--privileged` 模式启动,宿主机的 `/usr/hobot`、`/opt/hobot` 等目录均已挂载,库文件与板子上完全一致。退出容器后自动删除(`--rm`)。 + + +### task.json 协议说明 + +支持两种格式: + +**单模型** + +| 字段 | 类型 | 必填 | 说明 | +|------|------|------|------| +| `model_relative_path` | string | 是 | 模型文件名,相对于 input 目录,后缀必须为 `.hbm` 或 `.bin` | +| `frame_count` | int | 否 | 测试帧数,默认 200 | + +```json +{ + "model_relative_path": "model.hbm", + "frame_count": 200 +} +``` + +**多模型** + +| 字段 | 类型 | 必填 | 说明 | 
+|------|------|------|------| +| `models` | list | 是 | 模型列表 | +| `models[].path` | string | 是 | 模型文件名,相对于 input 目录 | +| `models[].name` | string | 否 | 显示名称,默认取文件名 | +| `frame_count` | int | 否 | 测试帧数,默认 200 | + +```json +{ + "models": [ + {"name": "model_a", "path": "model_a.hbm"}, + {"name": "model_b", "path": "model_b.bin"} + ], + "frame_count": 200 +} +``` + +> 注意:`model_relative_path` 和 `models` 不能同时出现。 + + +### 输出结果说明 + +结果保存在 `output/result.json`,结构如下: + +```json +[ + { + "model_name": "model.hbm", + "model_path": "/workspace/input/model.hbm", + "perf_results": [ + { + "thread_num": 1, + "frame_count": 200, + "run_time_ms": 320.18, + "total_latency_ms": 307.53, + "avg_latency_ms": 1.538, + "fps": 625.197, + "raw_output": "...", + "returncode": 0 + } + ] + } +] +``` diff --git a/bpu_model_perf_images/Dockerfile.bayese_perf b/bpu_model_perf_images/Dockerfile.bayese_perf new file mode 100644 index 0000000..f1d6a1d --- /dev/null +++ b/bpu_model_perf_images/Dockerfile.bayese_perf @@ -0,0 +1,65 @@ +FROM ubuntu:22.04 + +ENV DEBIAN_FRONTEND=noninteractive + +# ============================================================ +# 基础工具 +# SSH Server 安装和配置 +# ============================================================ +RUN apt-get update && apt-get install -y \ + python3 \ + python3-pip \ + python3-venv \ + git \ + wget \ + curl \ + vim \ + sshpass \ + i2c-tools \ + software-properties-common \ + gnupg2 \ + lsb-release \ + openssh-server \ + locales \ + && mkdir -p /var/run/sshd \ + && rm -rf /var/lib/apt/lists/* + +# 配置中文 locale(防止终端中文乱码) +RUN locale-gen zh_CN.UTF-8 && update-locale LANG=zh_CN.UTF-8 +ENV LANG=zh_CN.UTF-8 +ENV LC_ALL=zh_CN.UTF-8 + +# SSH 安全配置 +RUN sed -i 's/#PermitRootLogin prohibit-password/PermitRootLogin yes/' /etc/ssh/sshd_config \ + && sed -i 's/#PasswordAuthentication yes/PasswordAuthentication yes/' /etc/ssh/sshd_config \ + && echo "MaxAuthTries 3" >> /etc/ssh/sshd_config \ + && echo "MaxStartups 3:50:10" >> /etc/ssh/sshd_config \ + && echo 
"LoginGraceTime 30" >> /etc/ssh/sshd_config \ + && echo "ClientAliveInterval 60" >> /etc/ssh/sshd_config \ + && echo "ClientAliveCountMax 3" >> /etc/ssh/sshd_config + +# 生成 SSH Host Key(容器首次启动时会自动生成,这里预生成加速启动) +RUN ssh-keygen -A + +# 设置 root 密码 +RUN echo 'root:root' | chpasswd + +# ============================================================ +# BPU Model Perf 工具 +# ============================================================ + +# 拷贝 hrt_model_exec(从宿主机板子上的路径) +COPY hrt_model_exec /usr/local/bin/hrt_model_exec +RUN chmod +x /usr/local/bin/hrt_model_exec + +# 拷贝 perf 脚本和 entrypoint +COPY workspace/perf.py /workspace/perf/perf.py +COPY workspace/entrypoint.sh /workspace/perf/entrypoint.sh +RUN chmod +x /workspace/perf/entrypoint.sh + +# 工作目录和挂载点 +RUN mkdir -p /workspace/input /workspace/output + +WORKDIR /workspace/perf + +ENTRYPOINT ["/workspace/perf/entrypoint.sh"] diff --git a/bpu_model_perf_images/Dockerfile.nashem_perf b/bpu_model_perf_images/Dockerfile.nashem_perf new file mode 100644 index 0000000..9a2fa59 --- /dev/null +++ b/bpu_model_perf_images/Dockerfile.nashem_perf @@ -0,0 +1,66 @@ +FROM ubuntu:22.04 + +ENV DEBIAN_FRONTEND=noninteractive + +# ============================================================ +# 基础工具 +# SSH Server 安装和配置 +# ============================================================ +RUN apt-get update && apt-get install -y \ + python3 \ + python3-pip \ + python3-venv \ + git \ + wget \ + curl \ + vim \ + sshpass \ + i2c-tools \ + software-properties-common \ + gnupg2 \ + lsb-release \ + openssh-server \ + locales \ + && mkdir -p /var/run/sshd \ + && rm -rf /var/lib/apt/lists/* + +# 配置中文 locale(防止终端中文乱码) +RUN locale-gen zh_CN.UTF-8 && update-locale LANG=zh_CN.UTF-8 +ENV LANG=zh_CN.UTF-8 +ENV LC_ALL=zh_CN.UTF-8 + +# SSH 安全配置 +RUN sed -i 's/#PermitRootLogin prohibit-password/PermitRootLogin yes/' /etc/ssh/sshd_config \ + && sed -i 's/#PasswordAuthentication yes/PasswordAuthentication yes/' /etc/ssh/sshd_config \ + && echo "MaxAuthTries 3" >> 
/etc/ssh/sshd_config \ + && echo "MaxStartups 3:50:10" >> /etc/ssh/sshd_config \ + && echo "LoginGraceTime 30" >> /etc/ssh/sshd_config \ + && echo "ClientAliveInterval 60" >> /etc/ssh/sshd_config \ + && echo "ClientAliveCountMax 3" >> /etc/ssh/sshd_config + +# 生成 SSH Host Key(容器首次启动时会自动生成,这里预生成加速启动) +RUN ssh-keygen -A + +# 设置 root 密码 +RUN echo 'root:root' | chpasswd + +# ============================================================ +# BPU Model Perf 工具 +# ============================================================ + +# 拷贝 hrt_model_exec(从宿主机板子上的路径) +COPY hrt_model_exec /usr/local/bin/hrt_model_exec +RUN chmod +x /usr/local/bin/hrt_model_exec + +# 拷贝 perf 脚本和 entrypoint +COPY workspace/perf.py /workspace/perf/perf.py +COPY workspace/entrypoint.sh /workspace/perf/entrypoint.sh +RUN chmod +x /workspace/perf/entrypoint.sh + +# 工作目录和挂载点 +RUN mkdir -p /workspace/input /workspace/output + +WORKDIR /workspace/perf + +ENTRYPOINT ["/workspace/perf/entrypoint.sh"] + diff --git a/bpu_model_perf_images/docker_build_bayese.sh b/bpu_model_perf_images/docker_build_bayese.sh new file mode 100644 index 0000000..8b50571 --- /dev/null +++ b/bpu_model_perf_images/docker_build_bayese.sh @@ -0,0 +1,14 @@ +#!/bin/bash +set -e + +SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" + +# Copy hrt_model_exec into build context +HRT_BIN="$(which hrt_model_exec)" +echo "Copying hrt_model_exec from $HRT_BIN" +cp "$HRT_BIN" "$SCRIPT_DIR/hrt_model_exec" + +docker build -f Dockerfile.bayese_perf -t hrt_perf_bayese:v1.24.5 "$SCRIPT_DIR" + +# Clean up copied binary +rm -f "$SCRIPT_DIR/hrt_model_exec" diff --git a/bpu_model_perf_images/docker_build_nashem.sh b/bpu_model_perf_images/docker_build_nashem.sh new file mode 100644 index 0000000..f3d9506 --- /dev/null +++ b/bpu_model_perf_images/docker_build_nashem.sh @@ -0,0 +1,14 @@ +#!/bin/bash +set -e + +SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" + +# Copy hrt_model_exec into build context +HRT_BIN="$(which hrt_model_exec)" +echo "Copying hrt_model_exec from 
$HRT_BIN" +cp "$HRT_BIN" "$SCRIPT_DIR/hrt_model_exec" + +docker build -f Dockerfile.nashem_perf -t hrt_perf_nashem:v3.7.3 "$SCRIPT_DIR" + +# Clean up copied binary +rm -f "$SCRIPT_DIR/hrt_model_exec" diff --git a/bpu_model_perf_images/docker_run_bayese_perf.sh b/bpu_model_perf_images/docker_run_bayese_perf.sh new file mode 100644 index 0000000..3dc4b57 --- /dev/null +++ b/bpu_model_perf_images/docker_run_bayese_perf.sh @@ -0,0 +1,16 @@ +#!/bin/bash +# Run the bayese perf container +# Input: /workspace/input/task.json (mounted from host) +# Output: /workspace/output/result.json (mounted from host) + +docker run --rm \ + --name hrt-perf-bayese \ + --privileged \ + -v /opt/hobot:/opt/hobot:ro \ + -v /usr/hobot:/usr/hobot:ro \ + -v /opt/tros:/opt/tros:ro \ + -v /lib/aarch64-linux-gnu:/host_lib/aarch64-linux-gnu:ro \ + -e LD_LIBRARY_PATH="/usr/hobot/lib:/host_lib/aarch64-linux-gnu:/lib/aarch64-linux-gnu" \ + -v "$(pwd)/example_fs_bayese/input":/workspace/input:ro \ + -v "$(pwd)/example_fs_bayese/output":/workspace/output \ + hrt_perf_bayese:v1.24.5 diff --git a/bpu_model_perf_images/docker_run_nashem_perf.sh b/bpu_model_perf_images/docker_run_nashem_perf.sh new file mode 100644 index 0000000..2e47800 --- /dev/null +++ b/bpu_model_perf_images/docker_run_nashem_perf.sh @@ -0,0 +1,16 @@ +#!/bin/bash +# Run the nashem perf container +# Input: /workspace/input/task.json (mounted from host) +# Output: /workspace/output/result.json (mounted from host) + +docker run --rm \ + --name hrt-perf-nashem \ + --privileged \ + -v /opt/hobot:/opt/hobot:ro \ + -v /usr/hobot:/usr/hobot:ro \ + -v /opt/tros:/opt/tros:ro \ + -v /lib/aarch64-linux-gnu:/host_lib/aarch64-linux-gnu:ro \ + -e LD_LIBRARY_PATH="/usr/hobot/lib:/host_lib/aarch64-linux-gnu:/lib/aarch64-linux-gnu" \ + -v "$(pwd)/example_fs/input":/workspace/input:ro \ + -v "$(pwd)/example_fs/output":/workspace/output \ + hrt_perf_nashem:v3.7.3 diff --git a/bpu_model_perf_images/docker_test_bayese.sh b/bpu_model_perf_images/docker_test_bayese.sh new file mode
100644 index 0000000..f5d91c3 --- /dev/null +++ b/bpu_model_perf_images/docker_test_bayese.sh @@ -0,0 +1,15 @@ +#!/bin/bash +# 交互式启动 BaYeSe perf 容器,entrypoint 为 /bin/bash +# 用于在容器内交互式使用 BPU(hrt_model_exec 等工具) + +docker run -it --rm \ + --name hrt-perf-bayese-interactive \ + --privileged \ + -v /opt/hobot:/opt/hobot:ro \ + -v /usr/hobot:/usr/hobot:ro \ + -v /opt/tros:/opt/tros:ro \ + -v /lib/aarch64-linux-gnu:/host_lib/aarch64-linux-gnu:ro \ + -e LD_LIBRARY_PATH="/usr/hobot/lib:/host_lib/aarch64-linux-gnu:/lib/aarch64-linux-gnu" \ + -e ROS_DOMAIN_ID=0 \ + --entrypoint /bin/bash \ + hrt_perf_bayese:v1.24.5 diff --git a/bpu_model_perf_images/docker_test_nashem.sh b/bpu_model_perf_images/docker_test_nashem.sh new file mode 100644 index 0000000..d828121 --- /dev/null +++ b/bpu_model_perf_images/docker_test_nashem.sh @@ -0,0 +1,15 @@ +#!/bin/bash +# 交互式启动 NashEM perf 容器,entrypoint 为 /bin/bash +# 用于在容器内交互式使用 BPU(hrt_model_exec 等工具) + +docker run -it --rm \ + --name hrt-perf-nashem-interactive \ + --privileged \ + -v /opt/hobot:/opt/hobot:ro \ + -v /usr/hobot:/usr/hobot:ro \ + -v /opt/tros:/opt/tros:ro \ + -v /lib/aarch64-linux-gnu:/host_lib/aarch64-linux-gnu:ro \ + -e LD_LIBRARY_PATH="/usr/hobot/lib:/host_lib/aarch64-linux-gnu:/lib/aarch64-linux-gnu" \ + -e ROS_DOMAIN_ID=0 \ + --entrypoint /bin/bash \ + hrt_perf_nashem:v3.7.3 diff --git a/bpu_model_perf_images/example_fs/input/task.json b/bpu_model_perf_images/example_fs/input/task.json new file mode 100644 index 0000000..ccf7538 --- /dev/null +++ b/bpu_model_perf_images/example_fs/input/task.json @@ -0,0 +1,4 @@ +{ + "model_relative_path": "yolo11n_detect_nashe_640x640_nv12.hbm", + "frame_count": 20 +} diff --git a/bpu_model_perf_images/example_fs/input/yolo11n_detect_nashe_640x640_nv12.hbm b/bpu_model_perf_images/example_fs/input/yolo11n_detect_nashe_640x640_nv12.hbm new file mode 100644 index 0000000..43ecf95 Binary files /dev/null and b/bpu_model_perf_images/example_fs/input/yolo11n_detect_nashe_640x640_nv12.hbm 
differ diff --git a/bpu_model_perf_images/example_fs/output/result.json b/bpu_model_perf_images/example_fs/output/result.json new file mode 100644 index 0000000..6e17bed --- /dev/null +++ b/bpu_model_perf_images/example_fs/output/result.json @@ -0,0 +1,48 @@ +[ + { + "model_name": "yolo11n_detect_nashe_640x640_nv12.hbm", + "model_path": "/workspace/input/yolo11n_detect_nashe_640x640_nv12.hbm", + "perf_results": [ + { + "thread_num": 1, + "frame_count": 20, + "run_time_ms": 32.467, + "total_latency_ms": 30.841, + "avg_latency_ms": 1.542, + "fps": 620.79, + "raw_output": "[UCP]: log level = 3\n[UCP]: UCP version = 3.7.3\n[VP]: log level = 3\n[DNN]: log level = 3\n[HPL]: log level = 3\n[UCPT]: log level = 6\nhrt_model_exec perf --model_file /workspace/input/yolo11n_detect_nashe_640x640_nv12.hbm --thread_num 1 --frame_count 20\n\n\u001b[1;33m [Warning]: These operators have range limitations on input data: \u001b[0m\n\u001b[1;33m [Acos, Acosh, Asin, Atanh, BevPoolV2, Div, Gather, GatherElements, GatherND, GridSample, ImageDecoder, IndexSelect, Log, Mod, Pow, Reciprocal, RoiAlign, ScatterElements, ScatterND, Slice, Sqrt, Tan, Tile, Topk, Upsample]. \u001b[0m\n\u001b[1;33m Please make sure that these operators are not in your model, when no input data is provided to the tool. \u001b[0m\n\u001b[1;33m [Suggestion]: Using --input_file command to specify perf input data, which can appoint valid input data. 
\u001b[0m\n\n[BPU][[BPU_MONITOR]][281473143438048][INFO]BPULib verison(2, 1, 2)[0d3f195]!\n[DNN] HBTL_EXT_DNN log level:6\n[DNN]: 3.7.3_(4.2.11 HBRT)\nLoad model to DDR cost 352.489ms.\n\u001b[32m[I][9][03-18][09:19:21:855][tensor_util.cpp:321][hrt_model_exec][HRT_MODEL_EXEC] Input[0] stride is dynamic, but you did not specify the stride, set as (409600,640,1,1)\n\u001b[m\u001b[K\u001b[32m[I][9][03-18][09:19:21:855][tensor_util.cpp:321][hrt_model_exec][HRT_MODEL_EXEC] Input[1] stride is dynamic, but you did not specify the stride, set as (204800,640,2,1)\n\u001b[m\u001b[K\nRunning condition:\n Thread number is: 1\n Frame count is: 20\n Program run time: 32.467 ms\nPerf result:\n Frame totally latency is: 30.841 ms\n Average latency is: 1.542 ms\n Frame rate is: 620.790 FPS\n", + "returncode": 0 + }, + { + "thread_num": 2, + "frame_count": 20, + "run_time_ms": 19.705, + "total_latency_ms": 36.596, + "avg_latency_ms": 1.837, + "fps": 1016.369, + "raw_output": "[UCP]: log level = 3\n[UCP]: UCP version = 3.7.3\n[VP]: log level = 3\n[DNN]: log level = 3\n[HPL]: log level = 3\n[UCPT]: log level = 6\nhrt_model_exec perf --model_file /workspace/input/yolo11n_detect_nashe_640x640_nv12.hbm --thread_num 2 --frame_count 20\n\n\u001b[1;33m [Warning]: These operators have range limitations on input data: \u001b[0m\n\u001b[1;33m [Acos, Acosh, Asin, Atanh, BevPoolV2, Div, Gather, GatherElements, GatherND, GridSample, ImageDecoder, IndexSelect, Log, Mod, Pow, Reciprocal, RoiAlign, ScatterElements, ScatterND, Slice, Sqrt, Tan, Tile, Topk, Upsample]. \u001b[0m\n\u001b[1;33m Please make sure that these operators are not in your model, when no input data is provided to the tool. \u001b[0m\n\u001b[1;33m [Suggestion]: Using --input_file command to specify perf input data, which can appoint valid input data. 
\u001b[0m\n\n[BPU][[BPU_MONITOR]][281472894728928][INFO]BPULib verison(2, 1, 2)[0d3f195]!\n[DNN] HBTL_EXT_DNN log level:6\n[DNN]: 3.7.3_(4.2.11 HBRT)\nLoad model to DDR cost 356.068ms.\n\u001b[32m[I][53][03-18][09:19:23:890][tensor_util.cpp:321][hrt_model_exec][HRT_MODEL_EXEC] Input[0] stride is dynamic, but you did not specify the stride, set as (409600,640,1,1)\n\u001b[m\u001b[K\u001b[32m[I][53][03-18][09:19:23:890][tensor_util.cpp:321][hrt_model_exec][HRT_MODEL_EXEC] Input[1] stride is dynamic, but you did not specify the stride, set as (204800,640,2,1)\n\u001b[m\u001b[K\nRunning condition:\n Thread number is: 2\n Frame count is: 20\n Program run time: 19.705 ms\nPerf result:\n Frame totally latency is: 36.596 ms\n Average latency is: 1.837 ms\n Frame rate is: 1016.369 FPS\n", + "returncode": 0 + }, + { + "thread_num": 3, + "frame_count": 20, + "run_time_ms": 18.084, + "total_latency_ms": 49.101, + "avg_latency_ms": 2.456, + "fps": 1111.797, + "raw_output": "[UCP]: log level = 3\n[UCP]: UCP version = 3.7.3\n[VP]: log level = 3\n[DNN]: log level = 3\n[HPL]: log level = 3\n[UCPT]: log level = 6\nhrt_model_exec perf --model_file /workspace/input/yolo11n_detect_nashe_640x640_nv12.hbm --thread_num 3 --frame_count 20\n\n\u001b[1;33m [Warning]: These operators have range limitations on input data: \u001b[0m\n\u001b[1;33m [Acos, Acosh, Asin, Atanh, BevPoolV2, Div, Gather, GatherElements, GatherND, GridSample, ImageDecoder, IndexSelect, Log, Mod, Pow, Reciprocal, RoiAlign, ScatterElements, ScatterND, Slice, Sqrt, Tan, Tile, Topk, Upsample]. \u001b[0m\n\u001b[1;33m Please make sure that these operators are not in your model, when no input data is provided to the tool. \u001b[0m\n\u001b[1;33m [Suggestion]: Using --input_file command to specify perf input data, which can appoint valid input data. 
\u001b[0m\n\n[BPU][[BPU_MONITOR]][281473576172256][INFO]BPULib verison(2, 1, 2)[0d3f195]!\n[DNN] HBTL_EXT_DNN log level:6\n[DNN]: 3.7.3_(4.2.11 HBRT)\nLoad model to DDR cost 347.722ms.\n\u001b[32m[I][98][03-18][09:19:25:890][tensor_util.cpp:321][hrt_model_exec][HRT_MODEL_EXEC] Input[0] stride is dynamic, but you did not specify the stride, set as (409600,640,1,1)\n\u001b[m\u001b[K\u001b[32m[I][98][03-18][09:19:25:890][tensor_util.cpp:321][hrt_model_exec][HRT_MODEL_EXEC] Input[1] stride is dynamic, but you did not specify the stride, set as (204800,640,2,1)\n\u001b[m\u001b[K\nRunning condition:\n Thread number is: 3\n Frame count is: 20\n Program run time: 18.084 ms\nPerf result:\n Frame totally latency is: 49.101 ms\n Average latency is: 2.456 ms\n Frame rate is: 1111.797 FPS\n",
+        "returncode": 0
+      },
+      {
+        "thread_num": 4,
+        "frame_count": 20,
+        "run_time_ms": 18.174,
+        "total_latency_ms": 64.363,
+        "avg_latency_ms": 3.217,
+        "fps": 1092.826,
+        "raw_output": "[UCP]: log level = 3\n[UCP]: UCP version = 3.7.3\n[VP]: log level = 3\n[DNN]: log level = 3\n[HPL]: log level = 3\n[UCPT]: log level = 6\nhrt_model_exec perf --model_file /workspace/input/yolo11n_detect_nashe_640x640_nv12.hbm --thread_num 4 --frame_count 20\n\n\u001b[1;33m [Warning]: These operators have range limitations on input data: \u001b[0m\n\u001b[1;33m [Acos, Acosh, Asin, Atanh, BevPoolV2, Div, Gather, GatherElements, GatherND, GridSample, ImageDecoder, IndexSelect, Log, Mod, Pow, Reciprocal, RoiAlign, ScatterElements, ScatterND, Slice, Sqrt, Tan, Tile, Topk, Upsample]. \u001b[0m\n\u001b[1;33m Please make sure that these operators are not in your model, when no input data is provided to the tool. \u001b[0m\n\u001b[1;33m [Suggestion]: Using --input_file command to specify perf input data, which can appoint valid input data. \u001b[0m\n\n[BPU][[BPU_MONITOR]][281472884636384][INFO]BPULib verison(2, 1, 2)[0d3f195]!\n[DNN] HBTL_EXT_DNN log level:6\n[DNN]: 3.7.3_(4.2.11 HBRT)\nLoad model to DDR cost 347.186ms.\n\u001b[32m[I][144][03-18][09:19:27:914][tensor_util.cpp:321][hrt_model_exec][HRT_MODEL_EXEC] Input[0] stride is dynamic, but you did not specify the stride, set as (409600,640,1,1)\n\u001b[m\u001b[K\u001b[32m[I][144][03-18][09:19:27:914][tensor_util.cpp:321][hrt_model_exec][HRT_MODEL_EXEC] Input[1] stride is dynamic, but you did not specify the stride, set as (204800,640,2,1)\n\u001b[m\u001b[K\nRunning condition:\n Thread number is: 4\n Frame count is: 20\n Program run time: 18.174 ms\nPerf result:\n Frame totally latency is: 64.363 ms\n Average latency is: 3.217 ms\n Frame rate is: 1092.826 FPS\n",
+        "returncode": 0
+      }
+    ]
+  }
+]
\ No newline at end of file
diff --git a/bpu_model_perf_images/example_fs_bayese/input/task.json b/bpu_model_perf_images/example_fs_bayese/input/task.json
new file mode 100644
index 0000000..348c9aa
--- /dev/null
+++ b/bpu_model_perf_images/example_fs_bayese/input/task.json
@@ -0,0 +1,4 @@
+{
+  "model_relative_path": "your_bayese_model.hbm",
+  "frame_count": 200
+}
diff --git a/bpu_model_perf_images/workspace/entrypoint.sh b/bpu_model_perf_images/workspace/entrypoint.sh
new file mode 100644
index 0000000..ec19543
--- /dev/null
+++ b/bpu_model_perf_images/workspace/entrypoint.sh
@@ -0,0 +1,20 @@
+#!/bin/bash
+set -e
+
+INPUT_DIR="/workspace/input"
+OUTPUT_DIR="/workspace/output"
+INPUT_JSON="${INPUT_DIR}/task.json"
+OUTPUT_JSON="${OUTPUT_DIR}/result.json"
+
+# Allow overrides via env vars
+INPUT_JSON="${PERF_INPUT:-$INPUT_JSON}"
+OUTPUT_JSON="${PERF_OUTPUT:-$OUTPUT_JSON}"
+
+if [ ! -f "$INPUT_JSON" ]; then
+    echo "ERROR: input file not found: $INPUT_JSON"
+    exit 1
+fi
+
+mkdir -p "$(dirname "$OUTPUT_JSON")"
+
+exec python3 /workspace/perf/perf.py --input "$INPUT_JSON" --output "$OUTPUT_JSON"
diff --git a/bpu_model_perf_images/workspace/perf.py b/bpu_model_perf_images/workspace/perf.py
new file mode 100644
index 0000000..c2c819d
--- /dev/null
+++ b/bpu_model_perf_images/workspace/perf.py
@@ -0,0 +1,221 @@
+#!/usr/bin/env python3
+"""
+BPU Model Performance Benchmark Tool
+Runs hrt_model_exec perf for each model at thread counts 1,2,3,4
+"""
+
+import argparse
+import json
+import re
+import subprocess
+import sys
+from pathlib import Path
+
+HRT_MODEL_EXEC = "hrt_model_exec"
+THREAD_COUNTS = [1, 2, 3, 4]
+
+
+def run_perf(model_path: str, thread_num: int, frame_count: int = 200) -> dict:
+    """Run hrt_model_exec perf and return parsed results."""
+    cmd = [
+        HRT_MODEL_EXEC, "perf",
+        "--model_file", model_path,
+        "--thread_num", str(thread_num),
+        "--frame_count", str(frame_count),
+    ]
+    result = subprocess.run(cmd, capture_output=True, text=True)
+    output = result.stdout + result.stderr
+
+    perf = {
+        "thread_num": thread_num,
+        "frame_count": frame_count,
+        "run_time_ms": None,
+        "total_latency_ms": None,
+        "avg_latency_ms": None,
+        "fps": None,
+        "raw_output": output,
+        "returncode": result.returncode,
+    }
+
+    m = re.search(r"Program run time:\s*([\d.]+)\s*ms", output)
+    if m:
+        perf["run_time_ms"] = float(m.group(1))
+
+    m = re.search(r"Frame totally latency is:\s*([\d.]+)\s*ms", output)
+    if m:
+        perf["total_latency_ms"] = float(m.group(1))
+
+    m = re.search(r"Average\s+latency\s+is:\s*([\d.]+)\s*ms", output)
+    if m:
+        perf["avg_latency_ms"] = float(m.group(1))
+
+    m = re.search(r"Frame\s+rate\s+is:\s*([\d.]+)\s*FPS", output)
+    if m:
+        perf["fps"] = float(m.group(1))
+
+    return perf
+
+
+def print_table(results: list):
+    """Print results as a human-readable table."""
+    # Compute the model-name column width dynamically
+    max_name_len = max(len(e["model_name"]) for e in results)
+    col_widths = [max(max_name_len, 10), 9, 16, 10]
+    headers = ["Model", "Threads", "Avg Latency(ms)", "FPS"]
+
+    sep = "+" + "+".join("-" * (w + 2) for w in col_widths) + "+"
+    header_row = "|" + "|".join(
+        f" {h:<{w}} " for h, w in zip(headers, col_widths)
+    ) + "|"
+
+    for entry in results:
+        name = entry["model_name"]
+        # Print a separate header block for each model
+        print(sep)
+        print(header_row)
+        print(sep)
+        for p in entry["perf_results"]:
+            avg = f"{p['avg_latency_ms']:.3f}" if p["avg_latency_ms"] is not None else "N/A"
+            fps = f"{p['fps']:.2f}" if p["fps"] is not None else "N/A"
+            row = [name, str(p["thread_num"]), avg, fps]
+            print("|" + "|".join(
+                f" {v:<{w}} " for v, w in zip(row, col_widths)
+            ) + "|")
+            name = ""
+        print(sep)
+
+
+def validate_config(config: dict, input_dir: Path) -> list:
+    """Print and validate task.json, return normalized model list. Exit on error."""
+    print("=" * 60)
+    print("task.json content:")
+    print(json.dumps(config, indent=2, ensure_ascii=False))
+    print("=" * 60)
+
+    errors = []
+
+    # --- frame_count ---
+    if "frame_count" in config:
+        if not isinstance(config["frame_count"], int) or config["frame_count"] <= 0:
+            errors.append(" [frame_count] must be a positive integer")
+
+    # --- determine config format ---
+    has_single = "model_relative_path" in config
+    has_multi = "models" in config
+
+    if not has_single and not has_multi:
+        errors.append(" missing required field: 'model_relative_path' or 'models'")
+        for e in errors:
+            print(f"[ERROR] {e}", file=sys.stderr)
+        sys.exit(1)
+
+    if has_single and has_multi:
+        errors.append(" ambiguous: both 'model_relative_path' and 'models' are present, use one")
+
+    models = []
+
+    if has_single:
+        rel = config["model_relative_path"]
+        if not isinstance(rel, str) or not rel.strip():
+            errors.append(" [model_relative_path] must be a non-empty string")
+        elif not rel.endswith((".hbm", ".bin")):
+            errors.append(f" [model_relative_path] unsupported extension: '{rel}' (expected .hbm or .bin)")
+        else:
+            models = [{"name": Path(rel).name, "path": rel}]
+
+    if has_multi:
+        if not isinstance(config["models"], list) or len(config["models"]) == 0:
+            errors.append(" [models] must be a non-empty list")
+        else:
+            for i, m in enumerate(config["models"]):
+                prefix = f" [models[{i}]]"
+                if not isinstance(m, dict):
+                    errors.append(f"{prefix} each entry must be an object")
+                    continue
+                if "path" not in m:
+                    errors.append(f"{prefix} missing required field 'path'")
+                elif not isinstance(m["path"], str) or not m["path"].strip():
+                    errors.append(f"{prefix} 'path' must be a non-empty string")
+                elif not m["path"].endswith((".hbm", ".bin")):
+                    errors.append(f"{prefix} unsupported extension: '{m['path']}' (expected .hbm or .bin)")
+                else:
+                    models.append({"name": m.get("name", Path(m["path"]).name), "path": m["path"]})
+
+    if errors:
+        for e in errors:
+            print(f"[ERROR] {e}", file=sys.stderr)
+        sys.exit(1)
+
+    # --- check that the model files exist ---
+    missing = []
+    for m in models:
+        full = input_dir / m["path"]
+        if not full.exists():
+            missing.append(f" model file not found: {full}")
+    if missing:
+        for e in missing:
+            print(f"[ERROR] {e}", file=sys.stderr)
+        sys.exit(1)
+
+    print(f"[OK] {len(models)} model(s) validated, frame_count={config.get('frame_count', 200)}\n")
+    return models
+
+
+def main():
+    parser = argparse.ArgumentParser(description="BPU model perf benchmark")
+    parser.add_argument("--input", required=True, help="Input JSON file")
+    parser.add_argument("--output", required=True, help="Output JSON file")
+    args = parser.parse_args()
+
+    if not Path(args.input).exists():
+        print(f"[ERROR] input file not found: {args.input}", file=sys.stderr)
+        sys.exit(1)
+
+    with open(args.input) as f:
+        config = json.load(f)
+
+    input_dir = Path(args.input).parent
+    frame_count = config.get("frame_count", 200)
+    models = validate_config(config, input_dir)
+
+    if not models:
+        print("[ERROR] no models specified in input JSON", file=sys.stderr)
+        sys.exit(1)
+
+    output_results = []
+
+    for model in models:
+        rel_path = model["path"]
+        model_path = str(input_dir / rel_path)
+        model_name = model.get("name", Path(rel_path).name)
+        print(f"\n[Benchmarking] {model_name} ({model_path})")
+
+        perf_results = []
+        for t in THREAD_COUNTS:
+            print(f"  thread_num={t} ...", end=" ", flush=True)
+            p = run_perf(model_path, t, frame_count)
+            perf_results.append(p)
+            if p["fps"] is not None and p["avg_latency_ms"] is not None:
+                print(f"FPS={p['fps']:.2f} avg_latency={p['avg_latency_ms']:.3f}ms")
+            else:
+                print("FAILED (check raw_output in result JSON)")
+
+        output_results.append({
+            "model_name": model_name,
+            "model_path": model_path,
+            "perf_results": perf_results,
+        })
+
+    print("\n" + "=" * 80)
+    print_table(output_results)
+
+    output_path = Path(args.output)
+    output_path.parent.mkdir(parents=True, exist_ok=True)
+    with open(output_path, "w") as f:
+        json.dump(output_results, f, indent=2)
+
+    print(f"\nResults saved to: {args.output}")
+
+
+if __name__ == "__main__":
+    main()
diff --git a/install_docker.sh b/install_docker/install_docker.sh
similarity index 100%
rename from install_docker.sh
rename to install_docker/install_docker.sh
diff --git a/test_docker.sh b/install_docker/test_docker.sh
similarity index 100%
rename from test_docker.sh
rename to install_docker/test_docker.sh