The SOPHON SE5 Deep Learning Computing Box is a high-performance, low-power edge computing product based on processors and modules, and it targets a wider range of scenarios than module-form products. It is equipped with the BM1684, SOPHON's independently developed third-generation TPU processor, which delivers up to 17.6 TOPS of INT8 computing power and can process 16 channels of high-definition video simultaneously. It provides intelligent computing for various security, comprehensive security, education, finance, and security inspection projects.
The SE5 Deep Learning Computing Box is a small-scale edge computing server that supports algorithms from various industries. Its complete ecosystem makes it easy for users to port trained models. It supports not only facial recognition models but also dozens of auxiliary models, which makes it versatile across scenarios. It can be deployed in indoor and outdoor environments such as parks, communities, commercial buildings, and semi-enclosed outdoor sites without relying on x86 servers, because it uses its internal ARM resources to support independent, integrated application development.
This computing box offers high computing power and strong market competitiveness while preserving a portion of high-precision computing power. This keeps accuracy high in scenarios that demand it, such as dynamic-vision unmanned retail cabinets and product recognition in smart refrigerator systems. Practical applications of the SE5 include: an edge facial recognition server for entrance identification or monitoring in parks; facial payment in smart canteens; student facial recognition in home-school systems; access management in school dormitory systems; dish recognition for billing in catering systems; and intelligent assistance in security checks, where image recognition replaces traditional security personnel, improves judgment accuracy, reduces training costs, and speeds up passage. The wide range of deployable algorithm models enables diverse application scenarios.
This course explains the SE5 computing box and its application processes. By taking this course, you will gain a clear understanding of the computing box and become familiar with applying it to specific scenarios.
Systematic teaching: From product introduction to environment setup and application processes.
Comprehensive materials: The course includes instructional videos, documentation, code scripts, and more, providing detailed and rich information.
This course requires a foundation in Python programming and Linux development.
As a bridge between frameworks and hardware, a deep learning compiler makes it possible to develop code once and reuse it across various computing processors. SOPHGO has recently open-sourced its self-developed TPU compiler tool, TPU-MLIR (Multi-Level Intermediate Representation). TPU-MLIR is an open-source TPU compiler for deep learning processors. The project provides a complete toolchain that converts pre-trained neural networks from various frameworks into bmodel, a binary file that runs efficiently on the TPU, for more efficient inference. This course is practice-driven and leads you to intuitively understand, practice, and master the TPU compiler framework for deep learning processors.
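As a rough illustration of the toolchain flow described above (framework model to MLIR to bmodel), the following Python sketch only assembles the two command lines and prints them; it does not run the toolchain. The script names `model_transform.py` and `model_deploy.py` and the flags shown come from the TPU-MLIR quick-start material, but flag names can differ between releases, so check them against the installed version. The helper `build_commands`, the model name, and the file paths are hypothetical.

```python
import shlex


def build_commands(name, onnx_path, input_shape, chip="bm1684x", quantize="F16"):
    """Assemble argv lists for the two stages: ONNX -> MLIR, then MLIR -> bmodel.

    Hypothetical helper for illustration; it only builds the command lines.
    """
    # TPU-MLIR expects input shapes as a nested list, e.g. [[1,3,224,224]].
    shape = "[[" + ",".join(str(d) for d in input_shape) + "]]"
    mlir_file = f"{name}.mlir"
    bmodel_file = f"{name}_{chip}_{quantize.lower()}.bmodel"

    # Stage 1: convert the framework model into a top-level MLIR file.
    transform = [
        "model_transform.py",
        "--model_name", name,
        "--model_def", onnx_path,
        "--input_shapes", shape,
        "--mlir", mlir_file,
    ]
    # Stage 2: lower the MLIR file and generate the bmodel for the target chip.
    deploy = [
        "model_deploy.py",
        "--mlir", mlir_file,
        "--quantize", quantize,
        "--chip", chip,
        "--model", bmodel_file,
    ]
    return transform, deploy


if __name__ == "__main__":
    # Assumed example model; replace with a real ONNX file before running the commands.
    for argv in build_commands("resnet18", "resnet18.onnx", (1, 3, 224, 224)):
        print(shlex.join(argv))
```

Keeping the commands as argv lists makes it straightforward to pass them to `subprocess.run` later, once the TPU-MLIR environment is set up.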
At present, TPU-MLIR has been applied to the BM1684X, the latest generation of deep learning processor developed by SOPHGO. Combined with the processor's high-performance ARM core and the corresponding SDK, it enables rapid deployment of deep learning algorithms. The course covers the basic syntax of MLIR and the implementation details of the compiler's optimization operations, such as graph optimization, INT8 quantization, operator segmentation, and address allocation.
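To give a feel for what graph optimization means in a compiler, here is a toy Python sketch of one common rewrite: fusing a convolution with the ReLU that immediately follows it. This is not TPU-MLIR code. The graph is simplified to a linear chain in which each op consumes the previous op's output, and the dictionary layout, the function `fuse_conv_relu`, and the `do_relu` flag are all assumptions made for this illustration.

```python
def fuse_conv_relu(ops):
    """Return a new op list where each Conv directly followed by Relu is fused.

    `ops` is a linear chain of dicts with at least "type" and "name" keys.
    The input list is left unmodified.
    """
    fused = []
    i = 0
    while i < len(ops):
        op = ops[i]
        nxt = ops[i + 1] if i + 1 < len(ops) else None
        if op["type"] == "Conv" and nxt is not None and nxt["type"] == "Relu":
            # Pattern matched: keep one Conv op and record the activation as a flag.
            fused.append({"type": "Conv", "name": op["name"], "do_relu": True})
            i += 2  # skip the Relu that was folded into the Conv
        else:
            fused.append(dict(op))  # copy unchanged ops
            i += 1
    return fused


if __name__ == "__main__":
    graph = [
        {"type": "Conv", "name": "conv1"},
        {"type": "Relu", "name": "relu1"},
        {"type": "MaxPool", "name": "pool1"},
        {"type": "Conv", "name": "conv2"},
    ]
    for op in fuse_conv_relu(graph):
        print(op)
```

A real compiler expresses such rewrites as patterns over a full dataflow graph and also checks that the intermediate result has no other users; the linear chain here sidesteps that check.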
TPU-MLIR has several advantages over other compilation tools:
1. Simple and convenient
By reading the development manual and the samples included in the project, users can understand the model conversion process and principles and get started quickly. Moreover, TPU-MLIR is built on MLIR, the current mainstream compiler infrastructure, so users can also learn how MLIR is applied through it. The project provides a complete toolchain, so users can complete model conversion directly through the existing interfaces without having to adapt to different networks.
2. General
At present, TPU-MLIR already supports the TFLite and ONNX formats, and models in these two formats can be converted directly into a bmodel that runs on the TPU. What if a model is in neither format? ONNX provides a set of conversion tools that can convert models written in today's major deep learning frameworks to the ONNX format, which can then be converted to bmodel.
3. Precision and efficiency coexist
Accuracy is sometimes lost during model conversion. TPU-MLIR supports INT8 symmetric and asymmetric quantization, which can greatly improve performance while, combined with the vendor's own Calibration and Tune techniques, preserving high model accuracy. In addition, TPU-MLIR applies many graph optimization and operator segmentation techniques to keep the model running efficiently.
4. Achieve excellent cost performance and build the next-generation deep learning compiler
To support GPU computation, each operator in a neural network model needs a GPU version; to target the TPU, each operator likewise needs a TPU version. In addition, some scenarios require adapting to different models of the same processor, and compiling manually each time is very time-consuming. The deep learning compiler is designed to solve these problems. TPU-MLIR's automatic optimization tools save much of this manual optimization time, so that models developed on RISC-V can be ported smoothly to the TPU for the best price-performance ratio.
5. Complete information
The course includes video lessons in Chinese and English, documentation, and code scripts, with rich video materials, detailed application guidance, and clear code scripts. TPU-MLIR is built on the shoulders of the MLIR giant, and all of the project's code is now open source and available to all users free of charge.
Code Download Link: https://github.com/sophgo/tpu-mlir
TPU-MLIR Development Reference Manual: https://tpumlir.org/docs/developer_manual/01_introduction.html
The Overall Design Ideas Paper: https://arxiv.org/abs/2210.15016
Video Tutorials: https://space.bilibili.com/1829795304/channel/collectiondetail?sid=734875
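To make the INT8 symmetric and asymmetric quantization mentioned under advantage 3 more concrete, here is a minimal pure-Python sketch. It is not TPU-MLIR's implementation: the function names are hypothetical, and the value range is taken from a plain min/max of the data rather than from the Calibration and Tune steps that the toolchain applies.

```python
def quantize_symmetric(values, num_bits=8):
    """Map floats to signed integers in [-127, 127] with a zero point fixed at 0."""
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize_symmetric(q, scale):
    return [x * scale for x in q]


def quantize_asymmetric(values, num_bits=8):
    """Map floats to unsigned integers in [0, 255] using a scale and a zero point."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) if hi > lo else 1.0
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize_asymmetric(q, scale, zero_point):
    return [(x - zero_point) * scale for x in q]


if __name__ == "__main__":
    data = [-0.5, 0.0, 0.25, 1.5]  # skewed range: more room above zero than below
    q_sym, s_sym = quantize_symmetric(data)
    q_asym, s_asym, zp = quantize_asymmetric(data)
    print("symmetric :", q_sym, "scale =", s_sym)
    print("asymmetric:", q_asym, "scale =", s_asym, "zero_point =", zp)
    print("restored (symmetric) :", dequantize_symmetric(q_sym, s_sym))
    print("restored (asymmetric):", dequantize_asymmetric(q_asym, s_asym, zp))
```

For a skewed range like the one in the example, the asymmetric scheme uses a smaller step size because it does not spend integer codes on values below the observed minimum, at the cost of carrying a zero point through the arithmetic.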
Course catalog
No. | Course name | Category | Materials: video | Materials: documentation | Materials: code |
1.1 | Deep Learning Compiler Basics | TPU_MLIR Basics | √ | √ | √ |
1.2 | MLIR Basics | TPU_MLIR Basics | √ | √ | √ |
1.3 | Basic Structure of MLIR | TPU_MLIR Basics | √ | √ | √ |
1.4 | Op Definition in MLIR | TPU_MLIR Basics | √ | √ | √ |
1.5 | Introduction to TPU_MLIR (1) | TPU_MLIR Basics | √ | √ | √ |
1.6 | Introduction to TPU_MLIR (2) | TPU_MLIR Basics | √ | √ | √ |
1.7 | Introduction to TPU_MLIR (3) | TPU_MLIR Basics | √ | √ | √ |
1.8 | Quantization Overview | TPU_MLIR Basics | √ | √ | √ |
1.9 | Quantization Derivation | TPU_MLIR Basics | √ | √ | √ |
1.10 | Quantization Calibration | TPU_MLIR Basics | √ | √ | √ |
1.11 | Quantization-Aware Training (1) | TPU_MLIR Basics | √ | √ | √ |
1.12 | Quantization-Aware Training (2) | TPU_MLIR Basics | √ | √ | √ |
2.1 | Pattern Rewriting | TPU_MLIR in Practice | √ | √ | √ |
2.2 | Dialect Conversion | TPU_MLIR in Practice | √ | √ | √ |
2.3 | Front-End Conversion | TPU_MLIR in Practice | √ | √ | √ |
2.4 | Lowering in TPU_MLIR | TPU_MLIR in Practice | √ | √ | √ |
2.5 | Adding a New Operator | TPU_MLIR in Practice | √ | √ | √ |
2.6 | Graph Optimization in TPU_MLIR | TPU_MLIR in Practice | √ | √ | √ |
2.7 | Common Operations in TPU_MLIR | TPU_MLIR in Practice | √ | √ | √ |
2.8 | TPU Principles (1) | TPU_MLIR in Practice | √ | √ | √ |
2.9 | TPU Principles (2) | TPU_MLIR in Practice | √ | √ | √ |
2.10 | Back-End Operator Implementation | TPU_MLIR in Practice | √ | √ | √ |
2.11 | TPU Layer Optimization | TPU_MLIR in Practice | √ | √ | √ |
2.12 | bmodel Generation | TPU_MLIR in Practice | √ | √ | √ |
2.13 | To ONNX format | TPU_MLIR in Practice | √ | √ | √ |
2.14 | Add a New Operator | TPU_MLIR in Practice | √ | √ | √ |
2.15 | Model Adaptation in TPU_MLIR | TPU_MLIR in Practice | √ | √ | √ |
2.16 | Fuse Preprocess | TPU_MLIR in Practice | √ | √ | √ |
2.17 | Accuracy Verification | TPU_MLIR in Practice | √ | √ | √ |
This course introduces the hardware circuit design and basic environment setup, and provides some simple development examples and some basic deep learning examples.
Milk-V Duo is an ultra-compact embedded development platform based on the CV1800B. Small but fully featured, it has two cores that can run Linux and an RTOS separately, and it supports a variety of connectable peripherals.
Course features:
Deep neural network models can be trained and tested quickly and then deployed by industry to perform real-world tasks effectively. Deploying such systems on small, low-power deep learning edge computing platforms is highly favored by the industry. This course takes a practice-driven approach to lead you to intuitively learn, practice, and master the knowledge and technology of deep neural networks.
The SOPHON deep learning microserver SE5 is a high-performance, low-power edge computing product equipped with the BM1684, the third-generation TPU processor developed independently by SOPHGO. With up to 17.6 TOPS of INT8 computing power, it supports hardware decoding of 32 channels of Full HD video and encoding of 2 channels. This course quickly guides you through the powerful features of the SE5 server. Through this course, you can understand the basics of deep learning and master its basic applications.
Course Features
1. One-stop service
Solutions to the common problems encountered in SE5 applications can be found here.
• Provide a full-stack solution for deep learning microservers
• Break down the development process step by step, clearly and in detail
• Support all mainstream frameworks, making the product easy to use
2. Systematic teaching
The course covers everything from setting up the environment and developing applications to converting models and deploying products, and it comes with a mirrored practice environment.
• How is the environment built?
• How is the model compiled?
• How is the application developed?
• How are scenarios deployed?
3. Complete materials
The course includes video tutorials, document guides, code scripts, and other comprehensive materials.
• Rich video materials
• Detailed application guidance
• Clear code scripts
Code download link: https://github.com/sophon-ai-algo/examples
4. Free cloud development resources
Apply online for free access to the SE5-16 microserver cloud testing space.
• The SE5-16 microserver cloud testing space can be used for online development and testing, and it supports retaining and exporting user data
• The SE5-16 microserver cloud testing space offers the same resources and performance as the physical machine environment
Cloud platform application link: https://account.sophgo.com/sign_in?service=https://cloud.sophgo.com&locale=zh-CN
Cloud platform usage instructions: https://cloud.sophgo.com/tpu.pdf