TPU processor, 16 channels HD video intelligent analysis, 16 channels of full HD video decoding, 10 channels of full HD video encoding
TPU processor, 32 channels HD video intelligent analysis, 32 channels of full HD video decoding, 12 channels of full HD video encoding
RISC-V + ARM intelligent deep learning processor
Based on the RISC-V core, operating at a frequency of 2GHz, the processor features a single SOC with 64 cores and 64MB shared L3 cache.
SRC1-10 is a high-performance server cluster based on the RISC-V architecture. It provides both computing and storage capabilities, and its full software and hardware stack is domestically produced.
The RISC-V Fusion Server supports dual-processor interconnection and enables intelligent computing acceleration.
SRB1-20 is a high-performance storage server based on the RISC-V architecture. It supports CCIX, 128-core concurrency, and multi-disk, large-capacity secure storage, and its full software and hardware stack is domestically produced.
SRA1-20 is a high-performance computing server based on the RISC-V architecture. It supports CCIX and 128-core concurrency, and both its software and hardware are open source and controllable.
SRA3-40 is a RISC-V server for high-performance computing, featuring a domestic main processor, excellent performance, integrated intelligent computing, and powerful codec support.
SRB3-40 is a high-performance RISC-V storage server with multiple disk slots and large-capacity secure storage.
Intelligent computing server SGM7-40, adapted to mainstream LLMs; a single card can run a 70B large language model
SOM1684, BM1684, 16-Channel HD Video Analysis
Core-1684-JD4, BM1684, 16-Channel HD Video Analysis
SBC-6841, BM1684, 16-Channel HD Video Analysis
iCore-1684XQ, BM1684X, 32-Channel HD Video Analysis
Core-1684XJD4, BM1684X, 32-Channel HD Video Analysis
Shaolin PI SLKY01, BM1684, 16-Channel HD Video Analysis
QY-AIM16T-M, BM1684, 16-Channel HD Video Analysis
QY-AIM16T-M-G, BM1684, 16-Channel HD Video Analysis
QY-AIM16T-W, BM1684, 16-Channel HD Video Analysis
AIV02T, 1684*2, Half-Height Half-Length Accelerator Card
AIO-1684JD4, BM1684, 16-Channel HD Video Analysis
AIO-1684XJD4, BM1684X, 32-Channel HD Video Analysis
AIO-1684XQ, BM1684X, 32-Channel HD Video Analysis
IVP03X, BM1684X, 32-Channel HD Video Analysis
IVP03A, Microserver, passive cooling, 12GB RAM
Coeus-3550T, BM1684, 16-Channel HD Video Analysis
EC-1684JD4, BM1684, 16-Channel HD Video Analysis
CSA1-N8S1684, BM1684*8, 1U Cluster Server
DZFT-ZDFX, BM1684X, Electronic Seal Analyzer, ARM+DSP architecture
ZNFX-32, BM1684, 16-Channel HD Video Analysis
ZNFX-8, BM1684X, ARM+DSP architecture, Flameproof and Intrinsic Safety Analysis Device
EC-A1684JD4, Microserver with active cooling, 16GB RAM, 32GB eMMC
EC-A1684JD4 FD, BM1684, 16-Channel HD Video Analysis, 6GB of RAM, 32GB eMMC
EC-A1684XJD4 FD, BM1684X, 32-Channel HD Video Analysis
ECE-S01, BM1684, 16-Channel HD Video Analysis
IOEHM-AIRC01, BM1684, Microserver Active Cooling, 16-Channel HD Video Analysis
IOEHM-VCAE01, BM1684, 16-Channel HD Video Analysis
CSA1-N8S1684X, BM1684*8, 1U Cluster Server
QY-S1U-16, BM1684, 1U Server
QY-S1U-192, BM1684*12, 1U Cluster Server
QY-S1X-384, BM1684*12, 1U Cluster Server
Deep learning intelligent analysis helps make city management more efficient and precise
Using deep learning video technology to analyze sources of dust generation and dust events, contributing to ecological environmental protection
Using deep learning intelligent analysis to monitor scenarios such as safety production, urban firefighting, and unexpected incidents for emergency regulation.
Using deep learning technology to detect and analyze individuals, vehicles, and security incidents in grassroots governance
Addressing traffic congestion, driving safety, vehicle violations, and road pollution control
Utilizing domestically developed computational power to support the structured analysis of massive volumes of videos, catering to practical applications in law enforcement
Build a "smart, collaborative, efficient, innovative" gait recognition big data analysis system centered around data
Monitoring incidents of objects thrown from height in real time, pinpointing the location of the thrown object, and triggering alerts to safeguard the public from falling objects
Using edge computing architecture to timely and accurately monitor community emergencies and safety hazards
SOPHGO works with SOPHON.TEAM ecosystem partners to build a deep learning supervision solution for smart hospitals, enhancing safety management efficiency in hospitals
SOPHGO works with SOPHON.TEAM ecosystem partners to build a smart safe campus solution
Using a combination of cloud-edge deep learning methods to address food safety supervision requirements across multiple restaurant establishments, creating a closed-loop supervision system for government and enterprise-level stakeholders
SOPHON's self-developed computing hardware devices, such as SG6/SE5/SE6, equipped with SOPHON.TEAM video analysis algorithms, are used to make industrial safety production smarter
Combining deep learning, edge computing, and other technologies, this solution intelligently identifies people, objects, and their specific behaviors in the refueling and unloading areas. It also automatically detects and captures violations at gas stations, supporting effective traceability afterwards and providing data for safety management.
SOPHGO, in collaboration with SOPHON.TEAM and its ecosystem partners, is focusing on three major scene requirements: "Production Safety Supervision," "Comprehensive Park Management," and "Personnel Safety & Behavioral Standard Supervision." Together, they are developing a comprehensive deep learning scenario solution, integrating "algorithm + computing power + platform."
SOPHGO cooperates with SOPHON.TEAM ecosystem partners to build a deep learning monitoring solution for safety risks in chemical industry parks
SOPHGO works with SOPHON.TEAM ecosystem partners to build a Smart Computing Center solution, establishing a cloud-edge collaborative smart computing center with unified management and scheduling
SOPHGO, in collaboration with the SOPHON.TEAM ecosystem, has jointly developed a set of hardware built on domestically produced deep learning computing products. It is based on an AutoML zero-code automated deep learning training platform, enabling rapid and efficient implementation of deep learning engineering solutions
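transpose permutes the axes of an array. A multidimensional array has several axes; the commonly used four-dimensional data, for example, has four index dimensions. transpose rearranges the data according to a new order of those axis indices.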
typedef struct {
int N, C, H, W;
int order[4];
unsigned long long output_addr;
unsigned long long input_addr;
} __attribute__((packed)) param_t;
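Notes:
N, C, H, W: the sizes of the four dimensions;
order[4]: the axis permutation. For example, order = {0, 2, 3, 1} means the tensor's axis order changes from (0, 1, 2, 3) to (0, 2, 3, 1), that is, from N*C*H*W to N*H*W*C;
output_addr: address of the output data (global memory);
input_addr: address of the input data (global memory);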
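The relationship between order and the output shape can be written down directly: output dimension i takes its size from input dimension order[i]. The following is a minimal sketch under that reading; `permuted_shape` is a hypothetical helper introduced here for illustration and is not part of okkernel.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper (not an okkernel function): derive the output shape
// from the input shape and the axis order, using out[i] = shape[order[i]].
std::array<int, 4> permuted_shape(const std::array<int, 4> &shape,
                                  const std::array<int, 4> &order) {
    std::array<int, 4> out{};
    bool seen[4] = {false, false, false, false};
    for (int i = 0; i < 4; ++i) {
        // order must be a permutation of {0, 1, 2, 3}.
        assert(order[i] >= 0 && order[i] < 4 && !seen[order[i]]);
        seen[order[i]] = true;
        out[i] = shape[order[i]];
    }
    return out;
}
```

With shape (N, C, H, W) = (2, 3, 4, 5) and order = {0, 2, 3, 1}, this yields (2, 4, 5, 3), which is the N*H*W*C layout described above.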
See the transpose function in "okkernel/host/transpose.cpp". Its result is compared against the device-side output to check whether the device result is correct:
template <typename T>
void transpose(const T *input, T *buffer, const int *input_shape,
               const int *trans_order, const int *trans_shape,
               const int shape_dims) {
    for (int n = 0; n < input_shape[0]; n++) {
        for (int c = 0; c < input_shape[1]; c++) {
            for (int h = 0; h < input_shape[2]; h++) {
                for (int w = 0; w < input_shape[3]; w++) {
                    // Coordinates of the current element in the input tensor.
                    int nchw[4] = {n, c, h, w};
                    // Linear offset in the transposed (output) layout.
                    int dst_idx =
                        nchw[trans_order[0]] * trans_shape[1] * trans_shape[2] * trans_shape[3] +
                        nchw[trans_order[1]] * trans_shape[2] * trans_shape[3] +
                        nchw[trans_order[2]] * trans_shape[3] +
                        nchw[trans_order[3]];
                    // Linear offset in the original N-C-H-W layout.
                    int src_idx =
                        n * input_shape[1] * input_shape[2] * input_shape[3] +
                        c * input_shape[2] * input_shape[3] +
                        h * input_shape[3] +
                        w;
                    buffer[dst_idx] = input[src_idx];
                }
            }
        }
    }
}
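The data is stored in memory in N-C-H-W order. The function walks all n*c*h*w elements, computes the transposed index dst_idx for each element input[src_idx], and writes the element to buffer[dst_idx], which completes the transpose.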
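A small worked example may make the index mapping concrete. The sketch below assumes the same semantics as the reference loop (output coordinate i is the input coordinate order[i]); `transpose4d` is a hypothetical wrapper written for this illustration, not the function from transpose.cpp.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper for illustration: same index mapping as the reference
// transpose, but it returns a std::vector instead of filling a raw buffer.
template <typename T>
std::vector<T> transpose4d(const std::vector<T> &input,
                           const std::array<int, 4> &shape,
                           const std::array<int, 4> &order) {
    assert(input.size() ==
           static_cast<std::size_t>(shape[0]) * shape[1] * shape[2] * shape[3]);
    // Output dimension i takes its size from input dimension order[i].
    std::array<int, 4> out_shape{};
    for (int i = 0; i < 4; ++i) out_shape[i] = shape[order[i]];

    std::vector<T> output(input.size());
    std::size_t src_idx = 0;  // the input is walked in N-C-H-W storage order
    for (int n = 0; n < shape[0]; ++n)
        for (int c = 0; c < shape[1]; ++c)
            for (int h = 0; h < shape[2]; ++h)
                for (int w = 0; w < shape[3]; ++w) {
                    const int idx[4] = {n, c, h, w};
                    // Row-major offset in the output layout (Horner form).
                    std::size_t dst_idx = 0;
                    for (int i = 0; i < 4; ++i)
                        dst_idx = dst_idx * out_shape[i] + idx[order[i]];
                    output[dst_idx] = input[src_idx++];
                }
    return output;
}
```

For an input of shape (1, 2, 1, 3) holding the values 0 through 5 and order = {0, 2, 3, 1}, the result has shape (1, 1, 3, 2) and holds 0, 3, 1, 4, 2, 5: the two channels are interleaved along the last axis.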
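For example, take a tensor with NCHW = (2, 5, h, w); its layout when sent to the TPU is shown in the diagram on the left. If order[4] = {1, 0, 2, 3}, the output tensor has shape (5, 2, h, w). The simplest approach is to use gdmaS2L to send the tensor data with index N=0 to the TPU, then use gdmaL2S to send it back out, setting suitable strides so that the data lands at the correct output positions, and then process the tensor data with index N=1 in the same way.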
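The per-batch strategy can be checked on the host before writing any device code. The sketch below is a plain CPU simulation and makes no TPU or okkernel calls: a std::vector stands in for local memory, one memcpy plays the role of the system-to-local load, and the strided stores play the role of the local-to-system write. The function name `swap_nc_by_batch` and the stride arithmetic are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Host-side simulation only (hypothetical helper, no device API involved).
// Swaps the N and C axes of an (N, C, H, W) tensor one batch index at a time.
std::vector<float> swap_nc_by_batch(const std::vector<float> &input,
                                    int N, int C, int H, int W) {
    const std::size_t hw = static_cast<std::size_t>(H) * W;
    assert(input.size() == static_cast<std::size_t>(N) * C * hw);
    std::vector<float> output(input.size());
    std::vector<float> local(static_cast<std::size_t>(C) * hw);  // stands in for local memory

    for (int n = 0; n < N; ++n) {
        // "Load": the slice for batch index n is contiguous in the input.
        std::memcpy(local.data(), input.data() + n * C * hw,
                    local.size() * sizeof(float));
        // "Store": in the (C, N, H, W) output, consecutive channels are
        // N*H*W elements apart, and batch n starts at offset n*H*W.
        for (int c = 0; c < C; ++c) {
            std::memcpy(output.data() + (static_cast<std::size_t>(c) * N + n) * hw,
                        local.data() + c * hw,
                        hw * sizeof(float));
        }
    }
    return output;
}
```

Each pass moves one batch index, so for the (2, 5, h, w) example two passes fill the whole (5, 2, h, w) output. On a real device the local buffer size and stride units would depend on the hardware layout, which this sketch does not model.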