Author: IAR Systems
When developing and deploying MCUs, engineers need to measure what a device can actually do, and the usual way is to run a benchmark program. The compiler's optimization capability, however, has a marked effect on benchmark results: the same hardware platform can produce very different results with different compilers and different optimization options.
To get the most out of an MCU and achieve the best benchmark results, engineers need a good understanding not only of their own hardware but also of how the compiler optimizes, and they need to apply that knowledge flexibly so that the benchmark reflects the MCU's full performance. As a world-renowned embedded tool manufacturer, IAR Systems offers compilers with distinctive strengths in optimization, and pairing an MCU with an IAR compiler can often yield better benchmark results.
This article uses the IAR Embedded Workbench development tool suite, which is widely used in the MCU field, as an example to share the points to watch and the techniques to apply in MCU software benchmarking, so as to help readers generate efficient, compact code. With the following items and settings, engineers can fine-tune the optimization level, get the most from their tests, and improve the performance of the application code they develop.

Choose code size or execution speed
With a development tool suite such as IAR Embedded Workbench, engineers can set the optimization level and type for the whole project or for a single file. In the source code, the #pragma optimize directive can even set them for a single function.
The purpose of optimization is to reduce code size and increase execution speed. If only one of these goals can be met, the compiler prioritizes according to the user-specified settings. In real benchmark tests it is therefore worth trying several settings to find the best result. For example, function inlining is mainly a speed optimization, yet a speed-oriented setting with inlining can sometimes produce smaller code than a size-oriented setting, because inlining removes call overhead and can open the way for further optimizations.
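The per-function control described above can be sketched as follows. This is a minimal illustration, not code from the IAR documentation: the function names are hypothetical, and the `optimize=speed` and `optimize=size` keywords should be checked against the compiler guide for your target. The `__IAR_SYSTEMS_ICC__` guard means other compilers simply ignore the directive.

```c
#include <stdint.h>
#include <stddef.h>

/* IAR compilers define __IAR_SYSTEMS_ICC__; elsewhere the macros expand
   to nothing, so the file still builds with other toolchains. */
#ifdef __IAR_SYSTEMS_ICC__
#define OPTIMIZE_FOR_SPEED _Pragma("optimize=speed")
#define OPTIMIZE_FOR_SIZE  _Pragma("optimize=size")
#else
#define OPTIMIZE_FOR_SPEED
#define OPTIMIZE_FOR_SIZE
#endif

/* Hypothetical hot path: called for every sample, so favour speed. */
OPTIMIZE_FOR_SPEED
uint32_t sum_samples(const uint16_t *samples, size_t count)
{
    uint32_t total = 0;
    for (size_t i = 0; i < count; ++i) {
        total += samples[i];
    }
    return total;
}

/* Hypothetical cold path: runs once at start-up, so favour size. */
OPTIMIZE_FOR_SIZE
void clear_samples(uint16_t *samples, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        samples[i] = 0;
    }
}
```

Building the same file with different project-level settings and comparing the size and timing of these two functions is one way to see how much the per-function setting changes the result.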

Choose a small memory model
To get the full performance out of an MCU and reduce problems in the application, software development must take the device's memory and other resource constraints fully into account. It therefore pays to choose the smallest memory model that suits the target device and the project. The advantages of a small memory model include:
• Smaller addresses
• Smaller instructions
• Smaller pointers
• Higher efficiency
• Less code
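The effect of smaller pointers on data size can be illustrated with a rough sketch. The types below are assumptions made for illustration only: fixed-width integers stand in for 16-bit and 32-bit pointers, because the real pointer width is chosen by the compiler's memory model and cannot be switched inside portable C.

```c
#include <stdint.h>
#include <stddef.h>

/* Stand-ins for pointers (hypothetical): a real small memory model would
   use the compiler's native 16-bit pointers. */
typedef uint16_t small_ref_t;   /* models a 16-bit pointer */
typedef uint32_t large_ref_t;   /* models a 32-bit pointer */

/* The same linked-list node under each model. */
struct node_small { small_ref_t next; uint16_t value; };
struct node_large { large_ref_t next; uint16_t value; };

/* RAM needed for a list of `count` nodes under each model. */
size_t list_bytes_small(size_t count)
{
    return count * sizeof(struct node_small);
}

size_t list_bytes_large(size_t count)
{
    return count * sizeof(struct node_large);
}
```

Every structure that stores a pointer shrinks in the same way, and the instructions that load and compare those pointers tend to get shorter as well.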
Mature development tool suites such as IAR Embedded Workbench also integrate evaluation functions that can assess the memory model in several ways, helping engineers measure the size of the software and optimize the design.

Choose the right runtime library
By default, the runtime library is compiled at the highest optimization level for code size. If you want to optimize for speed, consider rebuilding the library. The level of support for certain standard library features (such as locale, file descriptors, and multibyte characters) can be set through configuration options.
Select the scanf input formatter and the printf output formatter in the library options according to your specific needs. The default option is not the smallest formatter.


Use the correct data type
Data types have a direct bearing on code size and execution speed, so it is worth using the development tools to observe and analyze the data types in use and find those that suit the hardware resources. In the IAR Embedded Workbench development tool suite, developers can start testing and optimizing from the following points:
• Choose the data type size that best suits your application
• Prefer unsigned types where possible, so that the compiler can use bit operations instead of arithmetic operations
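Both points can be seen in a short sketch. The function and array names are hypothetical. The sketch relies only on standard C behaviour: unsigned division by a power of two is equivalent to a right shift, whereas signed division must round toward zero and therefore needs extra correction code.

```c
#include <stdint.h>

/* Unsigned division by 8 is exactly a right shift by 3, so the compiler
   can emit a single shift. */
uint16_t scale_unsigned(uint16_t raw)
{
    return (uint16_t)(raw / 8u);
}

/* Signed division rounds toward zero: -9 / 8 is -1, whereas an arithmetic
   right shift of -9 by 3 typically gives -2. The compiler therefore has to
   add a correction for negative inputs. */
int16_t scale_signed(int16_t raw)
{
    return (int16_t)(raw / 8);
}

/* Data size: the same number of samples needs a quarter of the RAM when
   8 bits per sample are enough. */
enum { SAMPLE_COUNT = 64 };
uint8_t  samples_u8[SAMPLE_COUNT];
uint32_t samples_u32[SAMPLE_COUNT];
```

Comparing the generated assembly for `scale_unsigned` and `scale_signed` in the tool's disassembly view usually makes the difference visible.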

Check target-specific options
Check the target-specific options that can improve performance. This usually calls for considerable experience in day-to-day MCU design and application development, but with a mature development tool suite such as IAR Embedded Workbench the necessary performance checks can be completed quickly and thoroughly:
• Efficient addressing modes: allow efficient memory access
• Dedicated registers for constants or variables: code that operates on registers is more efficient than code that operates on memory
• Even-aligned function entries: even-aligned instructions can increase speed
• Byte-aligned objects: need less storage space, but may produce larger code
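The trade-off in the last item can be sketched with a packed structure. This is an illustrative example with hypothetical type names. It assumes the compiler accepts the widely supported `#pragma pack(push, 1)` form. The exact directive and its cost depend on the target.

```c
#include <stdint.h>

/* Natural alignment: the compiler may insert padding after `tag` so that
   `value` sits on its preferred boundary. */
struct record_natural {
    uint8_t  tag;
    uint32_t value;
};

/* Byte alignment: no padding, so the structure occupies exactly 5 bytes. */
#pragma pack(push, 1)
struct record_packed {
    uint8_t  tag;
    uint32_t value;
};
#pragma pack(pop)

/* On targets without unaligned access the compiler may have to assemble
   `value` from several byte loads, which is where the extra code comes from. */
uint32_t read_value(const struct record_packed *rec)
{
    return rec->value;
}
```

An array of `record_packed` saves RAM for every element, while each access to `value` may cost more instructions, so the choice is worth measuring rather than assuming.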

Use benchmark-related code
All MCU development tools should provide code for benchmarking, but the code bases of mature, general-purpose development tools condense their providers' experience in the field, so they tend to be more comprehensive and efficient. The important lessons include:
• Design the benchmark for an embedded system around the characteristics of embedded programs.
• A real application is usually a suitable benchmark too, but make sure the code being measured actually ends up in the executable and runs. The linker removes unreferenced code and variables, although not all linkers have this feature, which can skew comparisons.
• Make sure the code under test is not affected by the test tooling (test-related functions). For example, a test loop that calls printf() on every iteration is really a benchmark of printf(), not of the code under test.
• Compare the code generated after linking. One compiler may use inline code, while another may call library functions.
• Fully understand the application code used to perform the benchmark!
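Two of the lessons above, keeping test tooling out of the measured region and making sure the measured code is not discarded, can be sketched as follows. This is a minimal host-style illustration with hypothetical names. The standard `clock()` function stands in for whatever cycle counter or hardware timer the target actually provides.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical code under test: a simple additive checksum. */
uint32_t checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; ++i) {
        sum += data[i];
    }
    return sum;
}

/* Volatile sink: storing the result here keeps the toolchain from
   discarding the result as unused. */
volatile uint32_t benchmark_sink;

/* Times `rounds` calls of checksum() and returns the elapsed clock ticks.
   No I/O happens between the two clock() calls. */
clock_t time_checksum(const uint8_t *data, size_t len, unsigned rounds)
{
    clock_t start = clock();
    for (unsigned r = 0; r < rounds; ++r) {
        benchmark_sink = checksum(data, len);
    }
    return clock() - start;
}

/* Reporting is kept separate so printf() never runs inside the timed region. */
void report(clock_t ticks)
{
    printf("elapsed ticks: %ld\n", (long)ticks);
}
```

An aggressive optimizer may still compute the identical call once and reuse the result across rounds, so it remains worth inspecting the linked code to confirm that the loop does the work you intended to measure.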

Summary
By using a mature development tool suite such as IAR Embedded Workbench and drawing on the knowledge it has accumulated and refined over decades of use worldwide, MCU design and application development engineers can quickly complete the performance tests described above while getting more of the MCU's performance, and so achieve an optimized target device with well-integrated software and hardware.