FPGA for machine learning is based on adaptable architecture
To meet the design challenges of artificial intelligence, machine learning and high-bandwidth data acceleration applications, Achronix has developed the Speedster7t FPGA family.
The Speedster7t family is based on an architecture optimised for high-bandwidth workloads, with a 2D network-on-chip (NoC) and new machine learning processor (MLP) blocks aimed at artificial intelligence/machine learning (AI/ML) workloads, says Achronix. By blending FPGA programmability with ASIC routing structures and compute engines, the Speedster7t family creates what the company calls “FPGA+” class technology.
The massively parallel array of programmable compute elements within the new MLPs delivers the industry’s highest FPGA-based compute density, says Achronix. The MLPs are highly configurable, compute-intensive blocks that support integer formats from four to 24 bits and efficient floating-point modes, including direct support for TensorFlow’s 16-bit format as well as a block floating-point format that doubles the compute engines per MLP.
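To illustrate the general idea behind block floating point, the sketch below quantises a block of values to integer mantissas that share one exponent, then computes a dot product with integer multiply-accumulates and a single scaling step at the end. It is a generic illustration only: the function names, the 8-bit mantissa width and the block size are assumptions, and the MLP's actual format is not described in the announcement.

```python
import math


def bfp_quantize(values, mantissa_bits=8):
    """Quantise a block of floats to signed integer mantissas with one shared exponent.

    mantissa_bits is an assumed width chosen for illustration.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return [0] * len(values), 0
    # Largest signed mantissa magnitude, e.g. 127 for 8 bits.
    max_mant = (1 << (mantissa_bits - 1)) - 1
    # Pick the shared exponent so the largest value still fits in a mantissa.
    exp = math.ceil(math.log2(max_abs / max_mant))
    scale = 2.0 ** exp
    # Clamp to guard against rounding at the edge of the range.
    mantissas = [max(-max_mant, min(max_mant, round(v / scale))) for v in values]
    return mantissas, exp


def bfp_dot(a_mant, a_exp, b_mant, b_exp):
    """Dot product of two blocks: integer multiply-accumulate, one scale at the end."""
    acc = sum(x * y for x, y in zip(a_mant, b_mant))
    return acc * 2.0 ** (a_exp + b_exp)


block = [1.0, -0.5, 0.25, 0.75]
mant, exp = bfp_quantize(block)
print(mant, exp)
print(bfp_dot(mant, exp, mant, exp))
```

Because every element in a block shares the exponent, the inner loop needs only integer multipliers and adders, which is the usual reason block floating point is attractive for dense compute.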
The MLPs are tightly coupled with embedded memory blocks, eliminating the traditional delays associated with FPGA routing to ensure that data is delivered to the MLPs at the maximum performance of 750MHz. This combination of high-density compute and high-performance data delivery results in a processor fabric that, according to the company, delivers the highest usable FPGA-based tera-operations per second (TOPS).
Achronix’s engineering team rethought the FPGA architecture to balance on-chip processing, interconnect and external I/O, to maximise the throughput of data-intensive workloads such as those found in edge- and server-based AI/ML applications, networking and storage.
The Speedster7t FPGAs are manufactured on TSMC’s 7nm FinFET process. They include high-bandwidth GDDR6 interfaces, 400G Ethernet ports and PCI Express Gen5, all interconnected to deliver ASIC-level bandwidth with the full programmability of FPGAs.
Speedster7t devices are the only FPGAs with support for GDDR6 memories, the highest-bandwidth external memory devices. Each GDDR6 memory controller supports 512Gbits per second of bandwidth, and up to eight controllers can be deployed in a Speedster7t device for an aggregate GDDR6 bandwidth of 4Tbits per second, delivering the equivalent memory bandwidth of an HBM-based FPGA at a fraction of the cost, summarises Achronix.
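The aggregate figure follows directly from the per-controller number. A minimal check of the arithmetic, using only the values quoted above (the constant and function names are illustrative):

```python
# Figures quoted in the announcement.
GDDR6_CONTROLLER_GBITS_PER_S = 512
MAX_GDDR6_CONTROLLERS = 8


def aggregate_gbits_per_s(controllers, per_controller=GDDR6_CONTROLLER_GBITS_PER_S):
    """Total external memory bandwidth in Gbit/s for a given controller count."""
    return controllers * per_controller


total_gbits = aggregate_gbits_per_s(MAX_GDDR6_CONTROLLERS)
total_gbytes = total_gbits / 8  # 8 bits per byte

print(total_gbits)   # 4096 Gbit/s, i.e. roughly 4 Tbit/s
print(total_gbytes)  # 512.0 Gbyte/s
```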
Speedster7t devices also have up to 72 SerDes that can operate at rates from 1 to 112Gbits per second, hard 400G Ethernet MACs with forward error correction (FEC) that support four 100G and eight 50G configurations, and hard PCI Express Gen5 controllers with eight or 16 lanes per controller.
Third-party attacks are countered with multiple layers of defence that protect bitstream secrecy and integrity. Keys are encrypted based on a tamper-resistant physically unclonable function (PUF), and bitstreams are encrypted and authenticated by 256-bit AES-GCM. To defend against side-channel attacks, bitstreams are segmented, with a separately derived key used for each segment, and the decryption hardware employs differential power analysis (DPA) countermeasures. Additionally, a 2048-bit RSA public key authentication protocol is used to activate the decryption and authentication hardware.
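The announcement does not say how the per-segment keys are derived. As a rough sketch of the general idea only, the example below splits a bitstream into fixed-size segments and derives a distinct 256-bit key for each from a root key using HMAC-SHA256 over the segment index. The derivation scheme, the label string, the segment size and all function names are assumptions, not Achronix's method; a real design would then encrypt and authenticate each segment with AES-GCM under its own key.

```python
import hashlib
import hmac


def split_segments(bitstream, segment_size):
    """Split a bitstream into consecutive segments of at most segment_size bytes."""
    return [bitstream[i:i + segment_size]
            for i in range(0, len(bitstream), segment_size)]


def derive_segment_key(root_key, index):
    """Derive a 32-byte key for one segment (assumed scheme: HMAC-SHA256 of a label)."""
    label = b"segment-key" + index.to_bytes(4, "big")  # hypothetical label format
    return hmac.new(root_key, label, hashlib.sha256).digest()


root_key = bytes(32)            # placeholder root key, for illustration only
bitstream = bytes(range(10))    # stand-in for real configuration data
segments = split_segments(bitstream, 4)
keys = [derive_segment_key(root_key, i) for i in range(len(segments))]

print(len(segments), [len(s) for s in segments])
print(len(set(keys)) == len(keys))  # every segment gets a different key
```

Limiting how much data is processed under any one key reduces the number of power traces an attacker can collect for that key, which is the usual motivation for pairing segmentation with DPA countermeasures.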
The Speedster7t FPGA devices range from 363k to 2.6M six-input LUTs. The first devices and development boards for evaluation will be available in Q4 2019. The ACE design tools that support all of Achronix’s products, including Speedcore eFPGA and Speedchip FPGA chiplets, are available today.