Profile-Guided Optimization (PGO)
Profile-Guided Optimization (PGO) is an advanced compilation technique that uses runtime performance data to guide compiler optimizations, resulting in more efficient executable code. The ZCC compiler supports PGO in bare-metal environments (without an operating system).
PGO quick start
Follow this quick start guide using hello.c:
#include <stdio.h>

int main(void) {
    printf("Hello\n");
    return 0;
}
Step 1
Compile the source file with the -fprofile-instr-generate option. This directs the compiler to insert instrumentation code that collects runtime performance data.
zcc hello.c -fprofile-instr-generate -o a.out
Step 2
Execute the generated program. In bare-metal environments, the profile data is output as formatted text. Redirect the standard output to a file to save this data.
./a.out > a.txt
- The profile data is wrapped between specific markers:
====LLVMProfilingData:begin====
...(profile data)...
====LLVMProfilingData:end====
- You can execute the program multiple times (with the same or different parameters) and save each output separately (e.g., a1.txt, a2.txt) to gather comprehensive performance samples.
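When the output is captured over a serial console or mixed with the program's own prints, it can help to confirm that a capture contains a complete marker pair before converting it. The sketch below is an illustration only and not part of the ZCC toolchain: has_complete_profile is a hypothetical helper name, and it assumes each marker sits on its own line exactly as shown above (a trailing carriage return is tolerated).

```shell
#!/bin/sh
# Hypothetical helper (not a ZCC tool): succeed only if the capture file
# contains a begin marker followed later by an end marker.
has_complete_profile() {
    awk '
        { sub(/\r$/, "") }   # tolerate CRLF line endings from serial captures
        $0 == "====LLVMProfilingData:begin====" { inside = 1; next }
        $0 == "====LLVMProfilingData:end====" && inside { complete = 1; inside = 0 }
        END { exit (complete ? 0 : 1) }
    ' "$1"
}

# Report any a*.txt capture in the current directory that looks truncated.
for capture in a*.txt; do
    [ -e "$capture" ] || continue   # the glob matched nothing
    has_complete_profile "$capture" || echo "$capture: incomplete profile data" >&2
done
```

A capture that fails this check was probably cut off before the end marker was printed, so rerunning the program is usually simpler than trying to repair the file.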
Step 3
Use the llvm-profdata tool to convert the text-based profile data into binary format for further processing.
llvm-profdata translate a.txt --output=a.bin
If multiple data files exist, convert each one:
llvm-profdata translate a1.txt --output=a1.bin
llvm-profdata translate a2.txt --output=a2.bin
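With many captures, the per-file conversion can be looped instead of typed out. The sketch below assumes the captures follow the a*.txt naming used above. The function name translate_captures and the PROFDATA override variable are hypothetical names introduced for this illustration; only the llvm-profdata translate invocation itself comes from this guide.

```shell
#!/bin/sh
# PROFDATA is an assumed override hook for this sketch; it defaults to the
# llvm-profdata tool named in Step 3.
PROFDATA="${PROFDATA:-llvm-profdata}"

# Hypothetical helper: convert every a*.txt capture in the current directory
# into a matching .bin file (a1.txt -> a1.bin), stopping at the first failure.
translate_captures() {
    for txt in a*.txt; do
        [ -e "$txt" ] || continue   # the glob matched nothing
        "$PROFDATA" translate "$txt" --output="${txt%.txt}.bin" || return 1
    done
}
```

Calling translate_captures from the directory that holds the captures leaves one .bin file next to each .txt file, ready for merging.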
Step 4
Combine multiple binary profile data files into a single profile that will be used to guide compiler optimizations.
- Merge specific files:
llvm-profdata merge -output=a.profdata a1.bin a2.bin a3.bin
- Merge using wildcards:
llvm-profdata merge -output=a.profdata a*.bin
Step 5
Recompile the source code using the -fprofile-instr-use option with the merged profile (a.profdata). The compiler will use the profiling data to perform targeted optimizations and generate an optimized executable.
zcc hello.c -fprofile-instr-use=a.profdata -o a.pgo.out
Step 6
Execute the final PGO-optimized program. How much it improves depends on how closely the profiling runs match the real workload.
./a.pgo.out
PGO print function
ZCC provides the interface void __zcc_baremetal_profile_dump(void) to print PGO data. By default, the compiler arranges for __zcc_baremetal_profile_dump to be called when the main function returns, so the data is printed automatically. If main never returns (for example, because it ends in an infinite loop), call __zcc_baremetal_profile_dump manually from main at the point where the data should be printed.
Linker script
When PGO is enabled, the compiler generates several sections whose names start with __llvm_prf_, such as __llvm_prf_names, __llvm_prf_cnts, and __llvm_prf_data. Two symbols are defined for each section, one marking its beginning and one marking its end. For example, __start___llvm_prf_names and __stop___llvm_prf_names point to the beginning and end of __llvm_prf_names, respectively.
If the linker script places a __llvm_prf_ section inside another output section, the __start_ and __stop_ symbols for that section are no longer defined automatically. In that case, define both symbols explicitly and make sure the section contents are aligned to 8 bytes. For example:
.data :
{
    *(.data .data.*)
    . = ALIGN(8);
    __start___llvm_prf_cnts = .;
    *(__llvm_prf_cnts)
    . = ALIGN(8);
    __stop___llvm_prf_cnts = .;
    . = ALIGN(8);
    __start___llvm_prf_names = .;
    *(__llvm_prf_names)
    . = ALIGN(8);
    __stop___llvm_prf_names = .;
    . = ALIGN(8);
    __start___llvm_prf_data = .;
    *(__llvm_prf_data)
    . = ALIGN(8);
    __stop___llvm_prf_data = .;
}
