As chips are required to process more and more calculations per second, their designs are becoming more complex, and moving large amounts of data around the chip in a timely manner has become a significant challenge. Designers often overlook critical aspects of data flow, Sondrel explains, because the network-on-chip (NoC) responsible for this task is complex to design and because the many edge cases make it difficult to verify that performance requirements are met in all cases. The result is sub-optimal data transfer across the network-on-chip and a difficult system-on-chip (SoC) implementation.
Ben Fletcher, director of engineering at Sondrel, explains: “The performance of the network-on-chip must match the computing portion of the system-on-chip. The role of the network-on-chip is to supply input data fast enough to keep the compute IP on the chip running at maximum capacity and to store the output data, preventing system congestion. We use Arteris® FlexNoC® IP as the network-on-chip communication backbone for the SoC, which enables us to design more complex chips in less time.”
Why choose FlexNoC?
Fletcher finds that using FlexNoC interconnect technology has several advantages. The first is a reduction in area and wire count. This is achieved through the packetization and serialization capabilities of the transport layer, which let network-on-chip designers control precisely which portions of the network-on-chip can use fewer wires and less area without sacrificing performance. The second advantage is reduced power consumption. Power can be kept within a specified budget by configuring power management features such as clock-domain crossing and clock-gating support. The third advantage is a physically aware design. Because the network-on-chip design methodology takes the system-on-chip floorplan and any physical design constraints into account from the outset, the design team is able to deliver a netlist to the back-end team that is guaranteed to meet timing requirements. The fourth advantage is FlexNoC’s advanced configuration tools and user interface. FlexNoC provides an easy-to-learn tool suite for generating well-performing interconnects with clean timing, which makes network-on-chip designers more productive.
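The trade-off behind the first advantage can be illustrated with a little arithmetic. The sketch below estimates how many clock cycles a packet needs when it is serialized over links of different widths. The packet sizes, link widths, and the function name are hypothetical values chosen for illustration; they are not taken from FlexNoC.

```python
import math


def cycles_per_packet(header_bits: int, payload_bits: int, link_width_bits: int) -> int:
    """Clock cycles needed to send one packet over a link of the given width."""
    total_bits = header_bits + payload_bits
    return math.ceil(total_bits / link_width_bits)


# Hypothetical packet: 64-bit header plus 512-bit payload.
for width in (256, 128, 64, 32):
    cycles = cycles_per_packet(64, 512, width)
    print(f"{width:>3} wires: {cycles} cycles per packet")
```

A narrow link saves wires and area but takes more cycles per packet, so it suits paths where latency and bandwidth are not critical, while wide links can be reserved for the paths that need them.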
What does the network on chip do?
The network-on-chip connects to almost every part of the system-on-chip and is inherently tied to the chip’s layout, architecture, functional requirements, startup, security, and other aspects. Ben Fletcher cautions: “This means that the floorplan is likely to change over the life of the project, requiring changes to the network-on-chip. Those changes in turn affect the floorplan, creating feedback loops that can cause delays and cost overruns.” He continues: “With years of experience designing large and complex systems-on-chip, we have developed a number of techniques that allow us to explore and verify performance early in the project. By identifying requirements early and quickly verifying that changes to the network-on-chip meet those requirements, we are able to define the floorplan and network-on-chip design, reducing unnecessary design changes and thereby reducing risk and additional cost.”
Figure: Example of a network-on-chip (shown in blue), with the floorplan on the left and the functional block diagram on the right
In the functional block diagram on the right of the figure, the network-on-chip looks simple: it merely has a lot of connections. The floorplan on the left, however, shows that it occupies a considerable area, because of the high clock speeds and the large amounts of data that must be moved around the chip. The complex layout, coupled with the scattered physical locations of the IP functional blocks, also makes timing closure difficult.
Floorplan first or on-chip network first?
Typically, designers start the chip design process with either the floorplan or the network-on-chip, which leads to the feedback loop described above. Sondrel’s approach avoids this by performing performance exploration at the very beginning of the design phase: the performance requirements are specified, the architecture is identified and tested against them, and the floorplan and network-on-chip design are then specified accordingly, which reduces the probability of later changes. Performance exploration also addresses the typical problem of validating IP functional blocks only individually, a verification method that fails to take into account the interactions between them. The more IP functional blocks on a chip, the harder it is to understand all the dependencies between them that can have a critical impact on chip performance. Examples include mismatched master/slave interfaces, shared-memory conflicts, and clock skew. For more details, see Sondrel’s white paper “Top 10 Practical Steps to Modeling and Designing Complex SoCs” (www.sondrel.com/solutions/white-paper).
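A toy example shows why validating blocks one at a time can miss a shared-memory conflict. The block names, bandwidth figures, and port capacity below are hypothetical; the sketch only illustrates the idea and does not reflect Sondrel’s tooling.

```python
# Hypothetical peak bandwidth demands (GB/s) of IP blocks sharing one memory port.
demands_gbps = {"cpu": 6.0, "gpu": 10.0, "video": 5.0, "dma": 4.0}
memory_port_gbps = 16.0  # assumed capacity of the shared memory port


def fits_individually(demands: dict, capacity: float) -> bool:
    """True if every block, checked on its own, stays within the port capacity."""
    return all(d <= capacity for d in demands.values())


def fits_together(demands: dict, capacity: float) -> bool:
    """True if the combined peak demand stays within the port capacity."""
    return sum(demands.values()) <= capacity


print("each block alone fits:", fits_individually(demands_gbps, memory_port_gbps))
print("all blocks together fit:", fits_together(demands_gbps, memory_port_gbps))
```

Each block passes when checked alone, yet the combined peak demand exceeds the port capacity, which is the kind of interaction that only a system-level exploration exposes.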
Once the performance exploration is complete and the performance requirements are determined, there is sufficient information to configure the network-on-chip. What is then needed is a way to test the generated RTL against these requirements, to determine how well they are met, and to iterate rapidly until the desired level of performance is reached. To this end, Sondrel has developed a proprietary test platform called the “Performance Verification Environment”. The platform uses synthesizable RTL rather than approximate models, while the processors and subsystems are replaced with stand-ins defined in Python code. This enables memory-mapped bus traffic to be generated in Python and driven through the network-on-chip, giving a quick overview of the current state of the design and of how changes affect data traffic. For more details, see Sondrel’s white paper “Comparing Performance Verification Environments with RTL” (http://www.sondrel.com/solutions/white-papers).
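Sondrel’s environment is proprietary and drives real RTL, so the following is only a rough sketch of the general idea of describing memory-mapped traffic in Python. Every name here (`Transaction`, `generate_traffic`, `simulate_link`) is hypothetical, and the single-link model is a deliberately crude stand-in for the interconnect, not a simulation of RTL.

```python
import random
from dataclasses import dataclass


@dataclass
class Transaction:
    issue_cycle: int   # cycle at which the initiator issues the request
    address: int       # memory-mapped target address
    is_write: bool
    length_bytes: int  # burst length


def generate_traffic(n, base_addr, region_bytes, burst_bytes, gap_cycles, seed=0):
    """Yield n bursts to random aligned addresses, one every gap_cycles cycles."""
    rng = random.Random(seed)
    slots = region_bytes // burst_bytes
    for i in range(n):
        yield Transaction(
            issue_cycle=i * gap_cycles,
            address=base_addr + rng.randrange(slots) * burst_bytes,
            is_write=rng.random() < 0.5,
            length_bytes=burst_bytes,
        )


def simulate_link(transactions, bytes_per_cycle):
    """Toy interconnect: one link serving bursts in order.

    Returns the latency in cycles of each transaction.
    """
    link_free_at = 0
    latencies = []
    for t in transactions:
        start = max(t.issue_cycle, link_free_at)
        duration = -(-t.length_bytes // bytes_per_cycle)  # ceiling division
        link_free_at = start + duration
        latencies.append(link_free_at - t.issue_cycle)
    return latencies


traffic = list(generate_traffic(n=100, base_addr=0x8000_0000,
                                region_bytes=1 << 20, burst_bytes=64, gap_cycles=4))
for bpc in (8, 16, 32):
    worst = max(simulate_link(traffic, bpc))
    print(f"{bpc} B/cycle: worst-case latency {worst} cycles")
```

Sweeping one configuration parameter, here the link bandwidth, and comparing the worst-case latency against a requirement mirrors the rapid configure-and-check loop the text describes, although a real flow would measure the generated RTL rather than a model like this.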
These rapid iterations make it possible to explore network-on-chip configurations quickly, find a suitable solution, and feed it into the floorplan design process so that the network-on-chip and floorplan can then be optimized together. The design reaches a stable state sooner, which reduces project risk.
Chip specifications may change with market needs. When they do, the model can be updated as a whole rather than rebuilt from scratch, providing evidence-based data on whether a revised chip design meets the new needs.