BMNNSDK (BitMain Neural Network SDK) is Sophon's proprietary deep learning SDK for the BM AI chip. With its tools, you can deploy deep learning applications in the runtime environment and maximize inference throughput and efficiency.

Full-Stack solution for Deep Learning Development & Deployment

BMNNSDK is composed of BMNet Compiler and BMRuntime.

BMNet Compiler: optimizes and converts deep neural network models (such as caffemodel). It balances EU operation time against memory access time, improves the parallelism of operations, and finally converts the network to the bmodel format supported by the Sophon TPU.
BMRuntime: drives the TPU chip and provides a unified programmable interface to the upper-layer application, so that the program can perform neural network inference with a bmodel without the user needing to care about the implementation details of the underlying hardware.
Full-Stack solution

Two device drive modes, PCIe and SoC, are available, giving developers more choices.

Combined with the AI chip independently developed by Sophon, BMNNSDK provides maximum inference throughput and a simple application deployment environment.

The runtime library provides a programming interface for manipulating the underlying computing resources, so users can carry out in-depth development.

The runtime library provides concurrent processing capabilities and supports multi-process and multi-thread modes.
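
As a rough illustration of the multi-thread mode, the sketch below dispatches several input batches to one shared runtime object from a thread pool. `StandInRuntime` and its `infer` method are hypothetical placeholders written for this example; they are not the BMRuntime API, and the "inference" is just a doubling of each input value.

```python
from concurrent.futures import ThreadPoolExecutor


class StandInRuntime:
    """Hypothetical stand-in for a loaded bmodel; NOT the BMRuntime API."""

    def infer(self, batch):
        # Placeholder computation standing in for a real inference call.
        return [x * 2 for x in batch]


runtime = StandInRuntime()
batches = [[1, 2], [3, 4], [5, 6]]

# Each batch is submitted to a worker thread; map() returns the results
# in the same order as the inputs.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(runtime.infer, batches))

print(results)
```

In a multi-process setup the same pattern would apply, with each process holding its own runtime handle; whether a real handle can be shared across threads depends on the actual runtime library.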

Product Function


BMNNSDK supports two kinds of compilation. For layers that the TPU supports, you can use BMNet to compile and deploy directly. For layers that the TPU does not currently support, you can extend the compiler through the BMNet programming interface and add custom network layers using the BMKernel programming interface or CPU instructions, which lets you compile non-public networks.
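
The choice between the two compilation paths can be pictured as a simple partition of a network's layers. The sketch below is only a planning aid under stated assumptions: the `SUPPORTED_LAYERS` set and the `plan_compilation` helper are hypothetical and do not reflect the actual TPU support list or any BMNet function.

```python
# Hypothetical example set; NOT the real list of TPU-supported layers.
SUPPORTED_LAYERS = {"Convolution", "Pooling", "ReLU", "InnerProduct"}


def plan_compilation(layer_types):
    """Split layers into those compiled by BMNet as-is and those that
    would need a custom layer (BMKernel or CPU implementation)."""
    plan = {"bmnet": [], "custom": []}
    for layer in layer_types:
        key = "bmnet" if layer in SUPPORTED_LAYERS else "custom"
        plan[key].append(layer)
    return plan


plan = plan_compilation(["Convolution", "ReLU", "MyNewLayer", "Pooling"])
print(plan)
```

Any layer that lands in the `custom` list is one you would have to implement yourself before the network can be compiled.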


We provide developers with a Docker image that integrates the tools and libraries required for BMNNSDK. Developers can use it to develop deep learning applications.


Once the compiled network and the deep learning application have been integrated, they can be deployed through BMRuntime. During deployment, you can program against the BMNet inference engine API.