EdgeCortix Inc., a fabless semiconductor company specializing in energy-efficient AI processing, has unveiled its next-generation SAKURA-II Edge AI accelerator. The platform, built on EdgeCortix’s second-generation Dynamic Neural Accelerator (DNA) architecture, is designed to run complex Generative AI workloads efficiently at the edge. It combines low latency, high accuracy, and what the company describes as best-in-class memory bandwidth in compact form factors, making it suitable for applications such as Large Language Models (LLMs), Large Vision Models (LVMs), and multi-modal transformer-based models.
SAKURA-II targets the manufacturing, Industry 4.0, security, robotics, aerospace, and telecommunications sectors. It incorporates DNA-II, the latest generation of the company’s runtime-reconfigurable neural processing engine, which delivers up to 60 trillion operations per second (TOPS) of 8-bit integer (INT8) performance and 30 trillion floating-point operations per second (TFLOPS) at 16-bit brain floating-point (BF16) precision. It also offers power efficiency, real-time processing, and the ability to execute multiple deep neural network models with low latency.
The SAKURA-II platform includes the MERA software suite, which provides a heterogeneous compiler platform, advanced quantization, and model calibration capabilities. The software suite supports development frameworks like PyTorch, TensorFlow Lite, and ONNX, and features a flexible host-to-accelerator unified runtime for efficient scaling across single, multi-chip, and multi-card systems. This facilitates AI inferencing and shortens deployment times for data scientists. Additionally, the integration with the MERA Model Library and Hugging Face Optimum provides access to a wide range of the latest transformer models, ensuring smooth transitions from training to edge inference.
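To make the quantization and calibration step more concrete, here is a minimal, generic sketch of symmetric 8-bit quantization in plain Python. It is not the MERA API and does not reflect how MERA calibrates models. The function names (`calibrate_scale`, `quantize`, `dequantize`) and the sample data are hypothetical, chosen only to illustrate the general idea of mapping floating-point values onto the INT8 range using a scale derived from calibration data.

```python
# Generic illustration of symmetric INT8 quantization with calibration.
# All names and values here are hypothetical; this is NOT the MERA API.

def calibrate_scale(samples, num_bits=8):
    """Choose a scale so the largest calibration magnitude maps to the integer range edge."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8-bit
    max_abs = max(abs(x) for x in samples)
    return max_abs / qmax if max_abs > 0 else 1.0

def quantize(values, scale, num_bits=8):
    """Round each value to the nearest integer step and clamp to the symmetric range."""
    qmax = 2 ** (num_bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]

def dequantize(q_values, scale):
    """Map integers back to approximate floating-point values."""
    return [q * scale for q in q_values]

# Calibration data stands in for activations observed on representative inputs.
calibration_data = [-1.27, -0.5, 0.0, 0.25, 1.0]
scale = calibrate_scale(calibration_data)

# 2.0 lies outside the calibrated range, so it saturates at the top of the range.
q = quantize([0.5, -1.27, 2.0], scale)
restored = dequantize(q, scale)

print(q)  # [50, -127, 127]
```

Values inside the calibrated range are recovered to within one quantization step, while out-of-range values saturate. This is why the choice of calibration data matters for accuracy after quantization.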
“SAKURA-II’s 60 TOPS performance within 8W of typical power consumption, combined with its mixed-precision and built-in memory compression capabilities, makes it a pivotal technology for Generative AI solutions at the edge,” said Sakyasingha Dasgupta, CEO and Founder of EdgeCortix. “Whether running traditional AI models or the latest Llama 2/3, Stable-diffusion, Whisper, or Vision-transformer models, SAKURA-II offers deployment flexibility with superior performance per watt and cost-efficiency.”
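As a rough back-of-the-envelope check on the performance-per-watt claim, the quoted figures can be divided directly. The sketch below assumes the peak "up to" throughput and the 8W "typical" power figure can be paired, so the result is an upper-bound ratio, not a measured efficiency. Applying the same 8W figure to the BF16 number is a further assumption, since the announcement states the 8W figure only alongside 60 TOPS. The constant and function names are illustrative.

```python
# Back-of-the-envelope efficiency from the announced figures.
# Assumption: peak throughput and typical power are paired, giving an upper bound.

PEAK_INT8_TOPS = 60.0     # "up to 60 TOPS" of 8-bit integer performance
PEAK_BF16_TFLOPS = 30.0   # 30 TFLOPS at BF16 precision
TYPICAL_POWER_W = 8.0     # "within 8W of typical power consumption"

def per_watt(throughput, power_w):
    """Return throughput per watt; power must be positive."""
    if power_w <= 0:
        raise ValueError("power_w must be positive")
    return throughput / power_w

int8_tops_per_watt = per_watt(PEAK_INT8_TOPS, TYPICAL_POWER_W)
# Assumption: the same 8W figure also applies to BF16 workloads.
bf16_tflops_per_watt = per_watt(PEAK_BF16_TFLOPS, TYPICAL_POWER_W)

print(int8_tops_per_watt)    # 7.5
print(bf16_tflops_per_watt)  # 3.75
```

Under these assumptions the headline figures work out to at most 7.5 TOPS per watt for INT8. Sustained efficiency on a real model would depend on utilization and workload.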
Key benefits of SAKURA-II include optimization for Generative AI workloads at minimal power consumption, the ability to handle complex models within an 8W power envelope, seamless software integration through the MERA suite, enhanced memory bandwidth, optimization for real-time data streaming, advanced precision, support for sparse computation, versatile functionality, efficient data handling, and robust power management.
SAKURA-II will be available as a stand-alone device, as M.2 modules with varying DRAM capacities, and as single- and dual-device low-profile PCIe cards. Customers can reserve M.2 modules and PCIe cards today for delivery in the second half of 2024.