Field-programmable gate arrays (FPGAs) are remarkably versatile. They are used in a wide variety of applications and industries where use of application-specific integrated circuits (ASICs) is less economically feasible. Despite the area, cost, and power challenges designers face when integrating FPGAs into devices, they provide significant security and performance benefits. Many of these benefits can be realized in client compute hardware such as laptops, tablets, and smartphones.
An FPGA is an integrated circuit (IC) composed of configurable logic blocks (CLBs) connected via programmable interconnects (Figure 1);22 it can be configured to meet application or functionality requirements after it has been manufactured (hence, field-programmable). In contrast, an ASIC cannot be modified after manufacturing. Examples of ASICs include CPUs, GPUs, and SoCs (systems on a chip).
Hardware designers use hardware description languages (HDLs) such as VHDL or Verilog to describe the structure and/or behavior of the logic elements within the FPGA. Electronic design automation (EDA) tools are then used to synthesize the design and generate the FPGA configuration, often referred to as a bitstream; finally, the bitstream is applied to the FPGA. (This explanation is a drastic oversimplification and should serve only as a rudimentary description of FPGAs.)
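To make the idea of a configuration concrete, here is a minimal sketch in Python rather than an HDL. It assumes a toy model in which a logic element is a small lookup table (LUT) whose configuration bits are simply its truth table; the names LookupTable and synthesize are hypothetical, and real devices and EDA flows are far more involved.

```python
from itertools import product


class LookupTable:
    """Toy model of a k-input lookup table (LUT) inside a logic block."""

    def __init__(self, num_inputs):
        self.num_inputs = num_inputs
        # One configuration bit per input combination; all zero until configured.
        self.config_bits = [0] * (2 ** num_inputs)

    def configure(self, bits):
        """Load configuration bits (a stand-in for a small slice of a bitstream)."""
        bits = list(bits)
        if len(bits) != len(self.config_bits):
            raise ValueError("expected %d configuration bits" % len(self.config_bits))
        self.config_bits = bits

    def evaluate(self, *inputs):
        """Return the output by using the inputs as an index into the truth table."""
        if len(inputs) != self.num_inputs:
            raise ValueError("expected %d inputs" % self.num_inputs)
        index = 0
        for bit in inputs:
            index = (index << 1) | (bit & 1)
        return self.config_bits[index]


def synthesize(func, num_inputs):
    """Toy 'synthesis': enumerate a Boolean function into configuration bits."""
    return [func(*combo) & 1 for combo in product((0, 1), repeat=num_inputs)]


lut = LookupTable(2)
lut.configure(synthesize(lambda a, b: a ^ b, 2))  # behave as XOR
print(lut.evaluate(1, 0))  # 1
lut.configure(synthesize(lambda a, b: a & b, 2))  # same hardware, now AND
print(lut.evaluate(1, 0))  # 0
```

The same LookupTable object behaves as two different gates depending only on the bits loaded into it, which is the property that the term field-programmable refers to.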
The world's largest FPGA manufacturers are Xilinx, recently acquired by Advanced Micro Devices (AMD), and Intel, which entered the market by acquiring Altera in 2015. In 2019, the FPGA market was valued at $9 billion; it is expected to reach $14.2 billion by 2025.8
Despite their versatility and heavy use in a variety of applications, FPGAs are notably absent from modern client compute hardware (for example, laptops, smartphones, desktops, and tablets). This article examines the challenges and benefits of using FPGAs in client compute hardware.
FPGAs are commonly used in the early stages of hardware design for rapid prototyping, testing, and development because they can be reconfigured at will. Otherwise, designers would have to send their designs to the foundry to be fabricated every time they were updated or modified; this can be time-consuming and costly. Other common applications include those in aerospace and defense (for example, avionics and missile defense systems), audio and video (digital signal processing, encoding, and decoding), medical (ultrasound and x-ray), and various other markets and industries.
FPGAs are also commonly used in cloud and datacenter applications. Microsoft's Azure SmartNIC is a network card that uses an FPGA to accelerate network performance (lower latency and higher throughput) of the virtual machines offered through its Azure cloud service.7 Amazon Web Services (AWS) offers virtual machines with FPGAs that developers can use to accelerate their applications.20 For those who wish to run applications in their own datacenters, FPGAs are available as Peripheral Component Interconnect Express (PCIe) add-in cards that can be integrated into new or existing servers; developers can then use them to accelerate their applications.19
Generally speaking, FPGAs are found wherever the volume of units or devices produced is relatively small such that it is more economical to use an FPGA for the application than it is to design and fabricate an ASIC.
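The volume argument can be expressed as a simple break-even calculation. The sketch below assumes a linear cost model (a one-time engineering cost plus a fixed per-unit cost); the function names and all dollar figures are hypothetical and chosen only to illustrate the arithmetic.

```python
import math


def total_cost(one_time_cost, unit_cost, volume):
    """Total cost of producing `volume` units under a linear cost model."""
    return one_time_cost + unit_cost * volume


def break_even_volume(asic_one_time, asic_unit, fpga_one_time, fpga_unit):
    """Smallest volume at which the ASIC is no more expensive than the FPGA."""
    if fpga_unit <= asic_unit:
        raise ValueError("model assumes the FPGA costs more per unit than the ASIC")
    if asic_one_time <= fpga_one_time:
        return 0  # the ASIC is never more expensive under these assumptions
    return math.ceil((asic_one_time - fpga_one_time) / (fpga_unit - asic_unit))


# Hypothetical figures: the ASIC has a high one-time cost and a low unit cost.
volume = break_even_volume(
    asic_one_time=2_000_000, asic_unit=5,
    fpga_one_time=100_000, fpga_unit=30,
)
print(volume)  # 76000
```

Below the break-even volume the FPGA is the cheaper option in this model; above it, the one-time cost of the ASIC is amortized over enough units to make the ASIC cheaper.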
A variety of nontrivial challenges arise when hardware designers integrate FPGAs in client compute devices. These include area (the amount of space on the printed circuit board that the FPGA will occupy), power consumption, and cost.
Area. At Macworld 2008, Steve Jobs, the CEO of Apple at the time, unveiled the MacBook Air, a laptop computer so thin and light it could fit inside an envelope. At the time, the consumer electronics industry had already been moving toward thinner and lighter devices; it was a natural progression. The MacBook Air, however, was so radically thin and light in comparison to its competition that OEMs had no choice but to press the fast-forward button and make size and weight of their devices a priority in order to remain competitive. To this day, size and weight remain a priority for both hardware designers and consumers of client compute devices.
In comparison to ASICs, FPGAs require more area to implement an equivalent amount of logic and functionality. The units of measure in question are square millimeters (mm²), which may seem negligible at first glance; in modern device design, however, every micron counts.
Cost. From a bill of materials (BOM) perspective, FPGAs are more costly than ASICs. Depending on the target audience and market, such as consumer electronics, the OEM or system designer must deal with tight margins and specific price targets for a product to be economically feasible and sell at worthwhile volumes. Increasing the BOM by introducing an FPGA may cause the overall target price of the product or system to be outside of an acceptable range.
Power. FPGAs consume more power than ASICs.12 In large, complex systems (where power consumption is either not as much of a constraint or where the increase in power from one or multiple FPGAs is negligible) power consumption might not be much of a concern. In client compute devices, however, power consumption is given high priority in overall system design. Lower total cost of ownership (TCO) and increased battery life (in the context of mobile devices) are highly desirable characteristics in the consumer electronics world.
Integrating FPGAs into client compute hardware realizes a variety of benefits.18 They can be used to accelerate otherwise-expensive operations, which, in turn, can lead to increased performance and power efficiency. Their reconfigurable nature lets hardware that has already been deployed be updated throughout its life cycle (for example, to fix a security issue or improve performance). If done carefully, implementing a particular function in hardware by means of an FPGA has the potential to increase the overall security of a device.
Hardware acceleration. Hardware acceleration and heterogeneous compute architectures are becoming more prevalent. In other words, use the right tool for the job; not all workloads are well-suited for one particular type of hardware (for example, CPUs or GPUs).
There are several examples of this in modern compute devices:
FPGAs can be used to accelerate specific workloads in situations where such workloads are significant enough to realize performance and/or power benefits. OEMs and system designers may favor an FPGA over an ASIC for these reasons:
Patching and updates. As previously discussed, a major benefit of using an FPGA is its ability to be reconfigured once deployed. In practice, this means that hardware can be modified or updated over time. This benefits both designers (for example, less manufacturing overhead) and consumers (for example, no confusion over which product to buy or product obsolescence). For example, if an FPGA accelerates the encoding or decoding of a particular audio or video codec, the implementation can be changed or updated over time.
Another major benefit is the ability to fix or patch security vulnerabilities discovered over time.6 ASICs cannot be modified after being manufactured; Spectre and Meltdown, and variations of each, are glaring examples of how crippling a hardware vulnerability can be. The importance of being able to patch hardware cannot be overstated.
Enhanced security. Before diving into this section, let's make two things crystal clear:
Despite these two sobering points, implementing a particular function in hardware can improve the overall security of a system. For example:
Choosing to implement certain functions of a system's architecture in hardware can thus yield security benefits. FPGAs offer the same potential benefits as an ASIC would, with the added benefit of being reconfigurable throughout the system's life cycle. If security is a priority in system design (which it should be), then being prepared when security vulnerabilities are found and being able to remediate them (perhaps by means of an update to the FPGA) should also be a priority.
This section focuses on a scenario in which an FPGA is used instead of an ASIC to demonstrate the practicality of integrating programmable logic into client compute hardware designs. In this scenario, a solid-state storage device (SSD) inside a client compute device uses an FPGA to implement the functions of the storage controller; Figure 5 is a high-level depiction of the SSD's architecture. The FPGA will be used to implement:
It's important to note that many of the points made in this section would also apply if an FPGA were used in other ways (for example, audio/video encode/decode or facial recognition).
Area. A standard M.2 2280 SSD measures 22mm x 80mm, a total of 1,760mm². This form factor is a single, rigid piece of hardware. While an FPGA required to implement the logic described in this hypothetical SSD would be larger than most storage controller ASICs found on commercial SSDs, it would be advantageous to the designer to break apart the components. Rather than cram everything onto a single M.2 2280 form factor, why not take advantage of the space of the PCB (printed circuit board) and spread the components out?
As shown in Figure 6, the 13-inch MacBook Pro from 2020 uses an irregularly shaped PCB to take advantage of every square millimeter available in the chassis. Components are placed throughout, rather than using larger, rigid components such as M.2 SSDs. (Note the placement of the NAND flash modules and the Apple M1 SoC.)
Figure 7 shows an example layout of the components of the SSD distributed throughout the PCB rather than lumped together in an M.2 2280 form factor. An M.2 SSD would take up a considerable amount of space on the PCB and make it more challenging to fit other components. Breaking the components apart and spreading them across the PCB makes better use of the space. This is a fair compromise to ensure that size/space requirements are met.
Power and performance. An Intel Optane 905P SSD consumes 9.35W when active and 2.52W when idle; it has a sequential read and write bandwidth of 2,600MB/s and 2,200MB/s, respectively.9 Ruan et al. have demonstrated that an SSD design similar to the one proposed here (an FPGA-accelerated SSD) consumes around 10W active and realizes an average 12x performance increase compared with an SSD that uses a quad-core ARM CPU as the storage controller.15 While there is an increase in power consumed, there is also a significant increase in performance.
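One way to read these figures together is as energy per unit of work rather than as power alone. The sketch below uses the Optane numbers quoted above and assumes the drive draws its active power while sustaining its rated sequential bandwidth. The baseline power passed to relative_energy is a hypothetical value, since the text does not give the power draw of the ARM-based controller, and the function names are illustrative.

```python
def joules_per_gigabyte(active_watts, bandwidth_mb_per_s):
    """Energy to move 1 GB at a sustained bandwidth (1 GB taken as 1,000 MB)."""
    return active_watts / (bandwidth_mb_per_s / 1000.0)


def relative_energy(new_watts, old_watts, speedup):
    """Energy of the new design relative to the old one for the same task.

    Energy is power times time, and the task takes 1/speedup as long.
    """
    return (new_watts / old_watts) / speedup


# Figures quoted in the text for the Intel Optane 905P.
print(round(joules_per_gigabyte(9.35, 2600), 2))  # sequential read, J/GB
print(round(joules_per_gigabyte(9.35, 2200), 2))  # sequential write, J/GB

# Hypothetical baseline of 8 W for the ARM-based controller (an assumption).
print(round(relative_energy(new_watts=10.0, old_watts=8.0, speedup=12.0), 3))
```

Under that assumed baseline, a design that draws somewhat more power but finishes the same work 12 times faster would use roughly a tenth of the energy per task, which is why a modest increase in active power does not necessarily mean worse battery life.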
IP theft. Reverse engineering an IC can reveal its structure, design, and functionality. A common protection against this is IC camouflaging, but this brings with it significant area, power, and delay overhead.14 TechInsights offers professional services to reverse engineer ICs,17 and Degate even offers software products that can be used to perform reverse engineering of ICs.5 While reverse engineering is commonly performed for legitimate reasons, a malicious entity could reverse engineer an IC to steal/pirate the design. When attempting to do so on an FPGA IC, however, a malicious entity could learn of only the FPGA itself and not the logic or IP implemented.
It is worth noting that FPGAs are potentially exposed to other malicious attacks and/or piracy through manipulation and/or reverse engineering of the bitstream. The bitstream can be manipulated such that when it is loaded into the FPGA, it causes unwanted behavior; it can also potentially be extracted and reverse engineered. While this may seem alarming, both commercial4,23 and academic10 methods are available for protecting the bitstream.
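As a rough illustration of guarding against bitstream manipulation, the sketch below checks a keyed message authentication code (HMAC-SHA256) before a configuration is applied. This is a generic integrity check, not the specific method described in the cited commercial or academic work, and it does nothing to keep the bitstream confidential; the function names and the load callback are hypothetical.

```python
import hashlib
import hmac


def sign_bitstream(bitstream, key):
    """Compute an HMAC-SHA256 tag over the bitstream with a shared secret key."""
    return hmac.new(key, bitstream, hashlib.sha256).digest()


def verify_bitstream(bitstream, tag, key):
    """Return True only if the tag matches, using a constant-time comparison."""
    expected = sign_bitstream(bitstream, key)
    return hmac.compare_digest(expected, tag)


def load_if_authentic(bitstream, tag, key, load):
    """Call `load` (a hypothetical configuration routine) only for a valid tag."""
    if not verify_bitstream(bitstream, tag, key):
        return False
    load(bitstream)
    return True


key = b"device-secret-key"        # placeholder; a real key would be provisioned securely
bitstream = b"\x00\x01\x02\x03"   # placeholder bytes standing in for a bitstream
tag = sign_bitstream(bitstream, key)

applied = []
print(load_if_authentic(bitstream, tag, key, applied.append))            # True
print(load_if_authentic(bitstream + b"\xff", tag, key, applied.append))  # False
```

A modified bitstream fails the check and is never passed to the loader. In practice a check of this kind would be paired with encryption to address extraction and reverse engineering as well.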
Cost. For end users of a compute device using an FPGA, overall TCO is reduced. Throughout the life cycle of the device, as implementations of the IP blocks in the SSD improve (for example, in efficiency and security) and the FPGA configuration is updated, the need for a "new" device decreases over time (why buy a new device when the one you have can be "good as new"?). Capital expenses are reduced because fewer devices need to be purchased; this also lowers the overall carbon footprint for the end users. Operational expenses are reduced because, in the event of an inevitable security-related issue, the remedy is simply to update the SSD, which presumably bears no performance penalty. The only realistic concern is wear of the NAND flash modules inside the SSD, and this would be an issue only with extensive use of the storage device, such as exceeding the SSD's rated drive writes per day (DWPD).
For the designers of the compute device, as with any engineering project, the overall cost is more than just the BOM; non-recurring engineering (NRE) costs are often significant in major projects. There are also the continued development and support costs that come with revising hardware designs and providing iterative product updates. While BOM cost will likely increase when using an FPGA instead of an ASIC to implement the SSD controller, the FPGA may offer reduced ongoing development costs. This is because the functionality of the controller may be updated over time, thereby reducing the need to design a new ASIC and have the product go through another round of regulatory certifications.
System designers need only "push" an update to the SSD (via the Internet or some other means) to update or change its functionality. It would be worthwhile for OEMs and system designers to consider balancing the reduced ongoing development costs with the increase in BOM cost when using FPGAs. In other words, it may be more feasible to "absorb" some of the increased BOM cost that otherwise would have been passed onto the end users because the designers' overall costs likely would have been reduced.
While the challenges of using FPGAs in client compute hardware cannot be discounted, the benefits strongly outweigh the work and effort required to integrate them. Here are a few notable examples of FPGAs already being used in client compute hardware:
These products are not only sold by well-known and reputable OEMs, but also have been well received by their target audiences and markets.
Interestingly, AMD recently applied for a patent that integrates programmable logic into a CPU.11 The integration of programmable logic into other types of hardware (that is, not just as another component in the system but as a part of the component itself) opens the door for more types of hardware designs. For example, if an independent software vendor (ISV) application uses a particular operation that is computationally expensive, it would make sense to accelerate it in hardware for both power efficiency and performance benefits.
For hardware designers, however, it is impractical to account for all ISV applications. By integrating programmable logic, or programmable execution units, into the CPU of a client compute device, the ISV can bundle the information required to configure those programmable execution units (that is, the bitstream) such that the CPU uses them to accelerate those expensive operations when that application is in use. This is illustrated in Figure 8.11
In the end, how do hardware acceleration, ease of product updates, and enhanced security translate to the consumers of client compute hardware? Very simply: a better overall experience when using the product and lower overall TCO. These are characteristics that all client compute devices are designed for (no one wants a device that is difficult to use and expensive), so it should be no surprise the aforementioned products have been well received, and it should serve as an indicator that, if done properly, products that integrate FPGAs can be successful.
1. Alarcon, M., Fontaine, R., James, D., Krishnamurthy, R., Morrison, J., Yang, D., Young, C. Samsung Galaxy S5 teardown. Tech Insights, 2014; https://www.techinsights.com/blog/samsung-galaxy-s5-teardown.
2. Apple Support. About the afterburner accelerator card for Mac Pro, 2019; https://support.apple.com/en-us/HT210748.
3. Apple Support. Dedicated AES engine, 2020; https://apple.co/3iHBYbY.
4. Baetoniu, C. FPGA IFF copy protection using Dallas Semiconductor/Maxim DS2432 Secure EEPROMs. Xilinx, 2010; https://www.xilinx.com/support/documentation/application_notes/xapp780.pdf.
5. Degate. Reverse engineering integrated circuits with Degate; https://degate.readthedocs.io/en/latest/.
6. Dessouky, G., Frassetto, T., Jauernig, P., Sadeghi, A.R., Stapf, E. With great complexity comes great vulnerability: from stand-alone fixes to reconfigurable security. IEEE Security and Privacy 18, 5 (2020), 57–66; https://dl.acm.org/doi/abs/10.1109/MSEC.2020.2994978.
7. Firestone, D., et al. Azure accelerated networking: SmartNICs in the public cloud. In 15th Usenix Symposium on Networked Systems Design and Implementation, 2018, 51–66; https://www.usenix.org/conference/nsdi18/presentation/firestone.
8. Grand View Research. Field-programmable gate array market size report, 2020–2027; https://www.grandviewresearch.com/industry-analysis/fpga-market.
9. Intel. Intel Optane SSD 905P Series product specifications, 2018; https://intel.ly/3qGeUib.
10. Karam, R., Hoque, T., Ray, S., Tehranipoor, M., Bhunia, S. Robust bitstream protection in FPGA-based systems through low-overhead obfuscation. In Proceedings of the 2016 Intern. Conf. ReConFigurable Computing and FPGAs, IEEE, 1–8; https://ieeexplore.ieee.org/document/7857187.
12. Kuon, I., Rose, J. Measuring the gap between FPGAs and ASICs. IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems 26, 2 (2007), 203–215; https://ieeexplore.ieee.org/document/4068926.
13. Mattioli, M., Lahtiranta, A. Hidden potential within video game consoles. IEEE Micro 41, 2 (2021), 72–77; https://ieeexplore.ieee.org/document/9340369.
14. Rajendran, J., Sam, M., Sinanoglu, O., Karri, R. Security analysis of integrated circuit camouflaging. In Proceedings of the 2013 ACM SIGSAC Conf. Computer and Communications Security, 709–720; https://doi.org/10.1145/2508859.2516656.
15. Ruan, Z., He, T., Cong, J. Insider: designing in-storage computing system for emerging high-performance drive. In Proceedings of the 2019 Usenix Annual Technical Conference, 379–394; https://www.usenix.org/system/files/atc19-ruan_0.pdf.
16. Shilov, A. Synaptics' next-gen fingerprint sensor security: the FS7600 Match-In-Sensor. AnandTech, 2018; https://bit.ly/3tN4XkJ.
17. TechInsights. Scope of analysis, 2021; https://bit.ly/38b6t89.
19. Venkatakrishnan, R., Misra, A., Kindratenko, V. High-level synthesis-based approach for accelerating scientific codes on FPGAs. Computing in Science and Engineering 22, 4 (2020), 104–109; https://dl.acm.org/doi/10.1109/MCSE.2020.2996072.
20. Wang, X., Niu, Y., Liu, F., Xu, Z. When FPGA meets cloud: a first look at performance. IEEE Trans. Cloud Computing, 2020; https://ieeexplore.ieee.org/abstract/document/9086121.
21. Wegner, S., Cowsky, A., Davis, C., James, D., Yang, D., Fontaine, R., Morrison, J. Apple iPhone 7 teardown. TechInsights, 2016; https://www.techinsights.com/blog/apple-iphone-7-teardown.
22. Wikimedia Commons. File:Xerox ColorQube 8570 - main controller - Xilinx Spartan XC3S400A-0205.jpg; https://bit.ly/3tObAU3.
23. Xilinx. 40360 - FPGA - What are the methods to protect the FPGA bitstream against unauthorized duplication? 2021; https://www.xilinx.com/support/answers/40360.html.
The Digital Library is published by the Association for Computing Machinery. Copyright © 2022 ACM, Inc.