Compiler Service: AI/ML Compiler for inference of TensorFlow models on the edge
HelpRack has been developing AI/ML compilers for a couple of stealth-mode startups in the AI/ML chip space. These compilers take models written in TensorFlow, ONNX, PyTorch, and other frameworks and compile them to run on custom compute fabrics for edge inference.
One of these customers has developed an AI/ML chip with a compute fabric akin to a GPU, built to run inference workloads at the edge. The chip is intended for inference use cases and does not conform to a traditional register- or stack-based architecture.
HelpRack developed the entire compiler toolchain for this custom architecture using the MLIR and LLVM frameworks. The compiler takes as its input a model of dense or convolutional neural networks (DNNs and CNNs) in the TensorFlow SavedModel format and compiles it into a combination of ARM instructions and instructions in the custom ISA of the compute fabric. Additionally, HelpRack has been writing aggressive optimization and partitioning passes using the MLIR and LLVM APIs to exploit features specific to the custom hardware.
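To illustrate the kind of partitioning pass described above, here is a minimal, hypothetical sketch of an MLIR pass in C++ that tags each operation for execution either on the ARM host or on the accelerator fabric. The pass name `fabric-partition`, the `target` attribute, and the cost heuristic are illustrative assumptions, not HelpRack's actual implementation.

```cpp
// A minimal sketch of an MLIR partitioning pass. All names and the
// offload heuristic are hypothetical, for illustration only.
#include <memory>

#include "mlir/IR/BuiltinAttributes.h"
#include "mlir/IR/BuiltinOps.h"
#include "mlir/Pass/Pass.h"

using namespace mlir;

namespace {
// Walks every operation in the module and attaches a "target"
// attribute indicating whether it should be offloaded to the
// compute fabric or kept on the ARM host.
struct FabricPartitionPass
    : public PassWrapper<FabricPartitionPass, OperationPass<ModuleOp>> {
  StringRef getArgument() const final { return "fabric-partition"; }
  StringRef getDescription() const final {
    return "Tag ops for the compute fabric or the ARM host";
  }

  void runOnOperation() override {
    ModuleOp module = getOperation();
    module.walk([&](Operation *op) {
      // Skip structural ops (module, functions, terminators) that
      // produce no SSA results and are not compute work.
      if (op->getNumResults() == 0)
        return;
      StringRef target = shouldOffload(op) ? "fabric" : "arm";
      op->setAttr("target", StringAttr::get(op->getContext(), target));
    });
  }

  // Placeholder cost model: offload convolution-like ops. A real pass
  // would consult the custom ISA's supported op set instead.
  static bool shouldOffload(Operation *op) {
    return op->getName().getStringRef().contains("conv");
  }
};
} // namespace

// Factory hook so the pass can be registered with an opt-style driver.
std::unique_ptr<Pass> createFabricPartitionPass() {
  return std::make_unique<FabricPartitionPass>();
}
```

In a production pass, the placeholder heuristic would be replaced by a cost model driven by the accelerator's supported operations and on-chip memory constraints, and the tagged regions would then be outlined and lowered separately to the custom ISA and to ARM code.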
HelpRack is also helping another AI chip manufacturer develop a similar solution for its RISC-V-based AI acceleration hardware, which follows a more traditional register/stack architecture.