SoC uses computing-in-memory for speech processing at the edge
Computing-in-memory technology is poised to eliminate the massive data communication bottlenecks associated with AI speech processing at the network’s edge, said Witinmem. The company has worked with Microchip Technology’s subsidiary Silicon Storage Technology (SST) to develop an embedded memory that simultaneously stores neural network weights and performs computation on them. Microchip has announced that its SuperFlash memBrain neuromorphic memory has been combined with Witinmem’s neural processing SoC, claimed to be the first in volume production that enables sub-mA systems to reduce speech noise and recognise hundreds of command words, in real time and immediately after power-up.
Microchip worked with Witinmem to incorporate its memBrain analogue in-memory computing solution, based on SuperFlash technology, into Witinmem’s low-power SoC. The SoC features computing-in-memory technology for neural network processing, including speech recognition, voice-print recognition, deep speech noise reduction, scene detection, and health status monitoring. Witinmem is working with multiple customers to bring products based on this SoC to market during 2022.
“Witinmem is breaking new ground with Microchip’s memBrain solution for addressing the compute-intensive requirements of real-time AI speech at the network edge based on advanced neural network models,” said Shaodi Wang, CEO of Witinmem. “We were the first to develop a computing-in-memory chip for audio in 2019, and now we have achieved another milestone with volume production of this technology in our ultra-low-power neural processing SoC that streamlines and improves speech processing performance in intelligent voice and health products.”
Microchip’s memBrain neuromorphic memory is optimised to perform vector-matrix multiplication (VMM) for neural networks. It enables processors used in battery-powered and deeply embedded edge devices to deliver the highest possible AI inference performance per watt. This is accomplished by both storing the neural model weights as values in the memory array and using the memory array itself as the neural compute element. The result, claims Microchip, is 10 to 20 times lower power consumption than alternative approaches, and a lower overall processor bill of materials (BoM) cost because external DRAM and NOR flash are not required.
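Conceptually, an analogue in-memory VMM encodes each weight as a cell conductance G and applies the input activations as row voltages V; each column then sums the resulting cell currents (I = G·V) per Kirchhoff’s current law, so one array read yields one output of the matrix product without moving the weights at all. The Python sketch below is a minimal behavioural model of that idea only; the array size, conductance range, differential-pair weight mapping, and noise figure are illustrative assumptions, not memBrain specifications.

```python
import numpy as np

def program_conductances(weights, g_min=1e-9, g_max=1e-6):
    """Map signed weights onto a differential pair of cell conductances.

    A single conductance is non-negative, so a common scheme stores each
    signed weight across two cells: w ~ (G_pos - G_neg). The conductance
    range here is an illustrative assumption.
    """
    scale = (g_max - g_min) / np.max(np.abs(weights))
    g_pos = g_min + scale * np.clip(weights, 0, None)
    g_neg = g_min + scale * np.clip(-weights, 0, None)
    return g_pos, g_neg

def analogue_vmm(g_pos, g_neg, v_in, sigma=0.01):
    """One array 'read': inputs drive the rows as voltages, each column
    sums its cell currents I = G * V (Kirchhoff's current law), and the
    differential column currents represent the dot products. Gaussian
    noise stands in for analogue non-idealities (an assumption).
    """
    i_pos = v_in @ g_pos   # summed column currents, positive-weight cells
    i_neg = v_in @ g_neg   # summed column currents, negative-weight cells
    i_out = i_pos - i_neg  # g_min offsets cancel in the difference
    return i_out * (1 + np.random.normal(0, sigma, i_out.shape))

# Illustrative use: a 64-input, 16-output layer computed "in memory".
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 16))         # weights, programmed once
x = rng.standard_normal(64)               # activations as row voltages
g_pos, g_neg = program_conductances(w)
currents = analogue_vmm(g_pos, g_neg, x)  # analogue column currents
reference = x @ w                         # digital reference
print(np.corrcoef(currents, reference)[0, 1])  # close to 1.0
```

The point of the model is the energy argument: the weights never cross a memory bus, so the dominant cost of a conventional fetch-multiply-accumulate loop simply disappears.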
Permanently storing neural models inside the memBrain processing element also supports instant-on functionality for real-time neural network processing. Witinmem has exploited the non-volatility of SuperFlash technology’s floating-gate cells to power down its computing-in-memory macros when idle, further reducing leakage power in demanding IoT use cases.
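As a rough back-of-the-envelope illustration of why this matters (with made-up numbers, not Witinmem or Microchip figures): because the weights persist in the non-volatile cells, the macro can be fully power-gated while idle and still respond immediately on wake, so the idle term that dominates an always-listening power budget nearly disappears.

```python
def average_power_uw(p_active_uw, p_leak_uw, p_gated_uw, duty, power_gated):
    """Average power of a duty-cycled compute macro.

    With volatile weight storage the macro must stay powered while idle
    (leakage dominates); with non-volatile weights it can be power-gated
    between inferences and woken instantly. All values are hypothetical.
    """
    p_idle = p_gated_uw if power_gated else p_leak_uw
    return duty * p_active_uw + (1.0 - duty) * p_idle

# Hypothetical numbers: 500 uW active, 50 uW idle leakage, 1 uW gated,
# macro active 2% of the time (always-listening keyword spotting).
print(average_power_uw(500, 50, 1, 0.02, power_gated=False))  # ~59 uW
print(average_power_uw(500, 50, 1, 0.02, power_gated=True))   # ~11 uW
```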