Version: 4.x

ZCC 4.x User Manual

ZCC, the Terapines compiler, is a high-performance C/C++ compiler for RISC-V based on LLVM. It supports recent C and C++ standards, including C17, C11, C99, C++17, C++14 and C++11, and provides the following key features.

  • RVV auto-vectorization and other compiler optimizations.

  • Support for RISC-V ISAs, including standard extensions and vendor extensions from XuanTie, Nuclei and Andes.

Download and Installation

ZCC is a high-performance RISC-V toolchain that provides a consistent experience on Windows and Linux.

System requirements

You can review the system requirements to check if your computer configuration is supported.

  • Windows: Windows 10 (32-bit and 64-bit) and above

  • Linux:

    • Ubuntu 18, Ubuntu 20, Ubuntu 22.04 and Ubuntu 24.04

    • CentOS 6, CentOS 7 and CentOS 8

    • Fedora 42

    • openSUSE Leap 15.5

Setup using the Terapines Installer

The Terapines Installer is the recommended tool for setting up Terapines products. On systems without a graphical interface, use the Command Line Interface (CLI) installer.

  1. Select Windows from the pull-down box on the ZCC download page to download the Installer .exe.

  2. Run the Installer with administrator privileges and select the product that you want to install.

  3. The ZCC toolchain includes LibZCC by default. Additional libraries can be selected for installation; installed libraries are added to the toolchain and apply globally.

    • LibDSP: It offers a collection of functions and tools specifically designed for digital signal processing (DSP).

    • LibNN: Specialized library for implementing and running neural network (NN) algorithms.

You can use Product Manager to manage several versions of the same product. Product Manager is installed in the C:\Program Files\Terapines directory. It can be opened directly from the Start menu if you checked the box for adding it to the system PATH. Community edition users do not have to sign in. Commercial edition users log in to their Terapines account from Product Manager, which then automatically activates the available license for the installed product.

Language standards

  • -x <language>

    Treat subsequent input files as having type <language>. The supported values are c and c++.

  • -std=<standard>

    Select the language standard to compile for. Supported values are listed in the table below. Support for each standard follows the upstream Clang implementation; refer to C++ Support in Clang and C Support in Clang for details.

  • -ansi

    Same as "-std=c89".

Standards | Versions
C Standard | c18; c17; c11; c99; c90; c89
C++ Standard | c++17; c++14; c++11; c++03; c++98
GNU C | gnu18; gnu17; gnu11; gnu89; gnu++17; gnu++14; gnu++11; gnu++98
ISO C | iso9899:2018; iso9899:2017; iso9899:2011; iso9899:1999; iso9899:199409; iso9899:1990
tip
  • C17 & C18: Since it was under development in 2017, and officially published in 2018, C17 is sometimes referred to as C18.
  • C89 & C90: C90 is the C89 standard as ratified by ISO/IEC (ISO/IEC 9899:1990), with only formatting changes. Therefore, the terms "C89" and "C90" refer to essentially the same language.

RISC-V target support

Use -print-supported-extensions to print the list of all extensions that are supported in ZCC.

Base ISAs

Currently, ZCC fully supports three base instruction sets: RV32I, RV32E and RV64I.

To specify the target triple:

  • riscv32 RISC-V with XLEN=32 (i.e. RV32I or RV32E)
  • riscv64 RISC-V with XLEN=64 (i.e. RV64I)

To select an E variant ISA (e.g. RV32E instead of RV32I), use the base architecture string (e.g. riscv32) with the extension e.

Extensions

The table below lists the extensions supported by ZCC, including standard extensions as well as some experimental extensions from older specification versions that are kept for compatibility.

tip

The Zp052b extension is version 0.52 of the P extension, which is an older experimental version. ZCC has extracted and designated it as a standard extension named Zp052b. When using this extension, there is no need to specify a version number—simply use zp052b.

zcc -march=rv32imaczp052b -c hello.c
zcc -march=rv32imafdc -c hello.c
Extension | Version | Feature | Description
I | 2.1 | i | Base Integer Instruction Set
E | 2.0 | e | Implements RV32E/RV64E (provides 16 rather than 32 GPRs)
M | 2.0 | m | Integer Multiplication and Division
A | 2.1 | a | Atomic Instructions
F | 2.2 | f | Single-Precision Floating-Point
D | 2.2 | d | Double-Precision Floating-Point
C | 2.0 | c | Compressed Instructions
B | 1.0 | b | The collection of the Zba, Zbb, Zbs extensions
V | 1.0 | v | Vector Extension for Application Processors
H | 1.0 | h | Hypervisor
Zic64b | 1.0 | zic64b | Cache Block Size Is 64 Bytes
Zicbom | 1.0 | zicbom | Cache-Block Management Instructions
Zicbop | 1.0 | zicbop | Cache-Block Prefetch Instructions
Zicboz | 1.0 | zicboz | Cache-Block Zero Instructions
Ziccamoa | 1.0 | ziccamoa | Main Memory Supports All Atomics in A
Ziccif | 1.0 | ziccif | Main Memory Supports Instruction Fetch with Atomicity Requirement
Zicclsm | 1.0 | zicclsm | Main Memory Supports Misaligned Loads/Stores
Ziccrse | 1.0 | ziccrse | Main Memory Supports Forward Progress on LR/SC Sequences
Zicntr | 2.0 | zicntr | Base Counters and Timers
Zicond | 1.0 | zicond | Integer Conditional Operations
Zicsr | 2.0 | zicsr | Control and Status Register (CSR) Instructions
Zifencei | 2.0 | zifencei | fence.i
Zihintntl | 1.0 | zihintntl | Non-Temporal Locality Hints
Zihintpause | 2.0 | zihintpause | Pause Hint
Zihpm | 2.0 | zihpm | Hardware Performance Counters
Zimop | 1.0 | zimop | May-Be-Operations
Zmmul | 1.0 | zmmul | Integer Multiplication
Za128rs | 1.0 | za128rs | Reservation Set Size of at Most 128 Bytes
Za64rs | 1.0 | za64rs | Reservation Set Size of at Most 64 Bytes
Zaamo | 1.0 | zaamo | Atomic Memory Operations
Zabha | 1.0 | zabha | Byte and Halfword Atomic Memory Operations
Zalrsc | 1.0 | zalrsc | Load-Reserved/Store-Conditional
Zama16b | 1.0 | zama16b | Atomic 16-byte misaligned loads, stores and AMOs
Zawrs | 1.0 | zawrs | Wait on Reservation Set
Zfa | 1.0 | zfa | Additional Floating-Point
Zfbfmin | 1.0 | zfbfmin | Scalar BF16 Converts
Zfh | 1.0 | zfh | Half-Precision Floating-Point
Zfhmin | 1.0 | zfhmin | Half-Precision Floating-Point Minimal
Zfinx | 1.0 | zfinx | Float in Integer
Zdinx | 1.0 | zdinx | Double in Integer
Zca | 1.0 | zca | part of the C extension, excluding compressed floating point loads/stores
Zcb | 1.0 | zcb | Compressed basic bit manipulation instructions
Zcd | 1.0 | zcd | Compressed Double-Precision Floating-Point Instructions
Zce | 1.0 | zce | Compressed extensions for microcontrollers
Zcf | 1.0 | zcf | Compressed Single-Precision Floating-Point Instructions
Zcmop | 1.0 | zcmop | Compressed May-Be-Operations
Zcmp | 1.0 | zcmp | sequenced instructions for code-size reduction
Zcmt | 1.0 | zcmt | table jump instructions for code-size reduction
Zba | 1.0 | zba | Address Generation Instructions
Zbb | 1.0 | zbb | Basic Bit-Manipulation
Zbc | 1.0 | zbc | Carry-Less Multiplication
Zbkb | 1.0 | zbkb | Bitmanip instructions for Cryptography
Zbkc | 1.0 | zbkc | Carry-less multiply instructions for Cryptography
Zbkx | 1.0 | zbkx | Crossbar permutation instructions
Zbs | 1.0 | zbs | Single-Bit Instructions
Zk | 1.0 | zk | Standard scalar cryptography extension
Zkn | 1.0 | zkn | NIST Algorithm Suite
Zknd | 1.0 | zknd | NIST Suite: AES Decryption
Zkne | 1.0 | zkne | NIST Suite: AES Encryption
Zknh | 1.0 | zknh | NIST Suite: Hash Function Instructions
Zkr | 1.0 | zkr | Entropy Source Extension
Zks | 1.0 | zks | ShangMi Algorithm Suite
Zksed | 1.0 | zksed | ShangMi Suite: SM4 Block Cipher Instructions
Zksh | 1.0 | zksh | ShangMi Suite: SM3 Hash Function Instructions
Zkt | 1.0 | zkt | Data Independent Execution Latency
Ztso | 1.0 | ztso | Memory Model - Total Store Order
Zp052b | 0.52 | zp052b | Packed-SIMD Instructions
Zp053b | 0.53 | zp053b | Packed-SIMD Instructions
Zp054b | 0.54 | zp054b | Packed-SIMD Instructions
Zp64054b | 0.54 | zp64054b | RV32 only 'P' Instructions
Zp095b | 0.95 | zp095b | Packed-SIMD Instructions
Zpn095b | 0.95 | zpn095b | Normal 'P' Instructions
Zprvsfextra095b | 0.95 | zprvsfextra095b | RV64 only 'P' Instructions
Zpsfoperand095b | 0.95 | zpsfoperand095b | Paired-register operand 'P' Instructions
Zvbb | 1.0 | zvbb | Vector basic bit-manipulation instructions
Zvbc | 1.0 | zvbc | Vector Carryless Multiplication
Zve32f | 1.0 | zve32f | Vector Extensions for Embedded Processors with maximal 32 EEW and F extension
Zve32x | 1.0 | zve32x | Vector Extensions for Embedded Processors with maximal 32 EEW
Zve64d | 1.0 | zve64d | Vector Extensions for Embedded Processors with maximal 64 EEW, F and D extension
Zve64f | 1.0 | zve64f | Vector Extensions for Embedded Processors with maximal 64 EEW and F extension
Zve64x | 1.0 | zve64x | Vector Extensions for Embedded Processors with maximal 64 EEW
Zvfbfmin | 1.0 | zvfbfmin | Vector BF16 Converts
Zvfbfwma | 1.0 | zvfbfwma | Vector BF16 widening mul-add
Zvfh | 1.0 | zvfh | Vector Half-Precision Floating-Point
Zvfhmin | 1.0 | zvfhmin | Vector Half-Precision Floating-Point Minimal
Zvkb | 1.0 | zvkb | Vector Bit-manipulation used in Cryptography
Zvkg | 1.0 | zvkg | Vector GCM instructions for Cryptography
Zvkn | 1.0 | zvkn | shorthand for 'Zvkned', 'Zvknhb', 'Zvkb', and 'Zvkt'
Zvknc | 1.0 | zvknc | shorthand for 'Zvkn' and 'Zvbc'
Zvkned | 1.0 | zvkned | Vector AES Encryption & Decryption (Single Round)
Zvkng | 1.0 | zvkng | shorthand for 'Zvkn' and 'Zvkg'
Zvknha | 1.0 | zvknha | Vector SHA-2 (SHA-256 only)
Zvknhb | 1.0 | zvknhb | Vector SHA-2 (SHA-256 and SHA-512)
Zvks | 1.0 | zvks | shorthand for 'Zvksed', 'Zvksh', 'Zvkb', and 'Zvkt'
Zvksc | 1.0 | zvksc | shorthand for 'Zvks' and 'Zvbc'
Zvksed | 1.0 | zvksed | SM4 Block Cipher Instructions
Zvksg | 1.0 | zvksg | shorthand for 'Zvks' and 'Zvkg'
Zvksh | 1.0 | zvksh | SM3 Hash Function Instructions
Zvkt | 1.0 | zvkt | Vector Data-Independent Execution Latency
Zvl1024b | 1.0 | zvl1024b | Zvl (Minimum Vector Length) 1024
Zvl128b | 1.0 | zvl128b | Zvl (Minimum Vector Length) 128
Zvl16384b | 1.0 | zvl16384b | Zvl (Minimum Vector Length) 16384
Zvl2048b | 1.0 | zvl2048b | Zvl (Minimum Vector Length) 2048
Zvl256b | 1.0 | zvl256b | Zvl (Minimum Vector Length) 256
Zvl32768b | 1.0 | zvl32768b | Zvl (Minimum Vector Length) 32768
Zvl32b | 1.0 | zvl32b | Zvl (Minimum Vector Length) 32
Zvl4096b | 1.0 | zvl4096b | Zvl (Minimum Vector Length) 4096
Zvl512b | 1.0 | zvl512b | Zvl (Minimum Vector Length) 512
Zvl64b | 1.0 | zvl64b | Zvl (Minimum Vector Length) 64
Zvl65536b | 1.0 | zvl65536b | Zvl (Minimum Vector Length) 65536
Zvl8192b | 1.0 | zvl8192b | Zvl (Minimum Vector Length) 8192
Zhinx | 1.0 | zhinx | Zhinx (Half Float in Integer)
Zhinxmin | 1.0 | zhinxmin | Zhinxmin (Half Float in Integer Minimal)
Shcounterenw | 1.0 | shcounterenw | Support writeable hcounteren enable bit for any hpmcounter that is not read-only zero
Shgatpa | 1.0 | shgatpa | SvNNx4 mode supported for all modes supported by satp, as well as Bare
Shtvala | 1.0 | shtvala | htval provides all needed values
Shvsatpa | 1.0 | shvsatpa | vsatp supports all modes supported by satp
Shvstvala | 1.0 | shvstvala | vstval provides all needed values
Shvstvecd | 1.0 | shvstvecd | vstvec supports Direct mode
Smaia | 1.0 | smaia | Advanced Interrupt Architecture Machine Level
Smcdeleg | 1.0 | smcdeleg | Counter Delegation Machine Level
Smcsrind | 1.0 | smcsrind | Indirect CSR Access Machine Level
Smepmp | 1.0 | smepmp | Enhanced Physical Memory Protection
Smstateen | 1.0 | smstateen | Machine-mode view of the state-enable extension
Ssaia | 1.0 | ssaia | Advanced Interrupt Architecture Supervisor Level
Ssccfg | 1.0 | ssccfg | Counter Configuration Supervisor Level
Ssccptr | 1.0 | ssccptr | Main memory supports page table reads
Sscofpmf | 1.0 | sscofpmf | Count Overflow and Mode-Based Filtering
Sscounterenw | 1.0 | sscounterenw | Support writeable scounteren enable bit for any hpmcounter that is not read-only zero
Sscsrind | 1.0 | sscsrind | Indirect CSR Access Supervisor Level
Ssstateen | 1.0 | ssstateen | Supervisor-mode view of the state-enable extension
Ssstrict | 1.0 | ssstrict | No non-conforming extensions are present
Sstc | 1.0 | sstc | Supervisor-mode timer interrupts
Sstvala | 1.0 | sstvala | stval provides all needed values
Sstvecd | 1.0 | sstvecd | stvec supports Direct mode
Ssu64xl | 1.0 | ssu64xl | UXLEN=64 supported
Svade | 1.0 | svade | Raise exceptions on improper A/D bits
Svadu | 1.0 | svadu | Hardware A/D updates
Svbare | 1.0 | svbare | satp mode Bare supported
Svinval | 1.0 | svinval | Fine-Grained Address-Translation Cache Invalidation
Svnapot | 1.0 | svnapot | NAPOT Translation Contiguity
Svpbmt | 1.0 | svpbmt | Page-Based Memory Types

Experimental Extensions

Experimental extensions are expected either to transition to ratified status or to remain at the old version. Compatibility of these extensions between toolchain versions is not guaranteed. When using them, the version number must be appended to the extension name.

tip

For example, when using the Zalasr extension, you need to add the version number, that is, use zalasr0p1 instead of zalasr.

zcc -march=rv32imaczalasr0p1 -c hello.c
zcc -march=rv32imaczicfiss1p0 -c hello.c
Extension | Version | Description
Zicfilp | 1.0 | Landing pad
Zicfiss | 1.0 | Shadow stack
Zacas | 1.0 | Atomic Compare-And-Swap Instructions
Zalasr | 0.1 | Load-Acquire and Store-Release Instructions
Smmpm | 1.0 | Machine-level Pointer Masking for M-mode
Smnpm | 1.0 | Machine-level Pointer Masking for next lower privilege mode
Ssnpm | 1.0 | Supervisor-level Pointer Masking for next lower privilege mode
Sspm | 1.0 | Indicates Supervisor-mode Pointer Masking
Ssqosid | 1.0 | Quality-of-Service (QoS) Identifiers
Supm | 1.0 | Indicates User-mode Pointer Masking

Vendor Extensions

Vendor extensions are extensions which are defined by a hardware vendor.

Extension | Version | Feature | Description
Xandes | 5.0 | xandes | AndeStar V5 Extension Specification
XCValu | 1.0 | xcvalu | CORE-V ALU Operations
XCVbi | 1.0 | xcvbi | CORE-V Immediate Branching
XCVbitmanip | 1.0 | xcvbitmanip | CORE-V Bit Manipulation
XCVelw | 1.0 | xcvelw | CORE-V Event Load Word
XCVmac | 1.0 | xcvmac | CORE-V Multiply-Accumulate
XCVmem | 1.0 | xcvmem | CORE-V Post-incrementing Load & Store
XCVsimd | 1.0 | xcvsimd | CORE-V SIMD ALU
Xgap8dsp | 4.0 | xgap8dsp | 'Xp' (GAP8 DSP extension)
Xgap8m | 4.0 | xgap8m | 'Xm' (GAP8 Integer Multiplication)
Xgap8v | 4.0 | xgap8v | 'Xv' (GAP8 Vector extension)
XSfcease | 1.0 | xsfcease | SiFive sf.cease Instruction
XSfvcp | 1.0 | xsfvcp | SiFive Custom Vector Coprocessor Interface Instructions
XSfvfnrclipxfqf | 1.0 | xsfvfnrclipxfqf | SiFive FP32-to-int8 Ranged Clip Instructions
XSfvfwmaccqqq | 1.0 | xsfvfwmaccqqq | SiFive Matrix Multiply Accumulate Instruction (4-by-4)
XSfvqmaccdod | 1.0 | xsfvqmaccdod | SiFive Int8 Matrix Multiplication Instructions (2-by-8 and 8-by-2)
XSfvqmaccqoq | 1.0 | xsfvqmaccqoq | SiFive Int8 Matrix Multiplication Instructions (4-by-8 and 8-by-4)
XSiFivecdiscarddlone | 1.0 | xsifivecdiscarddlone | SiFive sf.cdiscard.d.l1 Instruction
XSiFivecflushdlone | 1.0 | xsifivecflushdlone | SiFive sf.cflush.d.l1 Instruction
XTHeadBa | 1.0 | xtheadba | T-Head address calculation instructions
XTHeadBb | 1.0 | xtheadbb | T-Head basic bit-manipulation instructions
XTHeadBs | 1.0 | xtheadbs | T-Head single-bit instructions
XTHeadCmo | 1.0 | xtheadcmo | T-Head cache management instructions
XTHeadCondMov | 1.0 | xtheadcondmov | T-Head conditional move instructions
XTHeadFMemIdx | 1.0 | xtheadfmemidx | T-Head FP Indexed Memory Operations
XTHeadMac | 1.0 | xtheadmac | T-Head Multiply-Accumulate Instructions
XTHeadMemIdx | 1.0 | xtheadmemidx | T-Head Indexed Memory Operations
XTHeadMemPair | 1.0 | xtheadmempair | T-Head two-GPR Memory Operations
XTHeadSync | 1.0 | xtheadsync | T-Head multicore synchronization instructions
XTHeadVdot | 1.0 | xtheadvdot | T-Head Vector Extensions for Dot
XVentanaCondOps | 1.0 | xventanacondops | Ventana Conditional Ops
Xwchc | 2.2 | xwchc | WCH/QingKe additional compressed opcodes
Xxlcz | 1.0 | xxlcz | Nuclei Additional Xlcz Instruction for Codesize
Xxldsp | 1.0 | xxldsp | Nuclei customized DSP instructions for both RV32 and RV64
Xxldspn1x | 1.0 | xxldspn1x | Nuclei customized DSP N1 instructions only for RV32
Xxldspn2x | 1.0 | xxldspn2x | Nuclei customized DSP N2 instructions only for RV32
Xxldspn3x | 1.0 | xxldspn3x | Nuclei customized DSP N3 instructions only for RV32
Xxlvqmacc | 1.0 | xxlvqmacc | Nuclei Int8 Matrix Multiplication Instructions (4-by-4 and 4-by-4)

Profiles

Supported RISC-V profile names can be passed using -march instead of a standard ISA naming string. Currently supported profiles:

Supported Profiles | Experimental Profiles
rva20s64 | rva23s64
rva20u64 | rva23u64
rva22s64 | rvb23s64
rva22u64 | rvb23u64
rvi20u32 | rvm23u32
rvi20u64 |

Note that you can also append additional extension names to be enabled, e.g. rva20u64_zicond will enable the zicond extension in addition to those in the rva20u64 profile.

Compilation Options

Since ZCC 4.x is based on LLVM 19.1.6, most of the LLVM compiler options are applicable to ZCC.

-target option

Use -target <architecture> to specify the target to build for. The supported arguments are listed below:

  • riscv64-unknown-elf
  • riscv32-unknown-elf
  • riscv64-unknown-linux-gnu

-march option

Specify -march=<architecture> to generate code for a specific processor architecture.

The -march string has the form -march=rv[32|64][i|e][extensions]. The order of the extension components is not strictly enforced, and the final link generates instructions according to the specified -march. For example, with -march=rv32imafc, ZCC looks for libraries built for rv32ifa and also generates instructions for the M extension.

Multilib

ZCC will select compatible zcc libraries to add to the application based on the arch/abi combination specified by the user. For example:

Specified arch/abi combination | Applied zcc library
-march=rv32imafc -mabi=ilp32f | rv32ifa/ilp32f
-march=rv32imafc_zba_zbb_zbc_zbs_zp052b -mabi=ilp32f | rv32ifap0p95_zp052b_zp053b_zp054b/ilp32f

For applications that use architecture extensions not present in the library, such as the M and C extensions, the corresponding optimizations are performed during the IR-to-assembly translation stage, so no optimizations are missed. For certain extensions, such as P and V, the optimizations are performed during the translation of C source code to IR, so separate libraries must be built for them.

-mtune option

When -mtune= is specified, ZCC tunes its optimizations for the target CPU. The supported arguments are listed below:

  • T-Head
    • thead-c908-series
  • Andes
    • andes-kavalan
    • andes-vicuna
    • andes-d25-series
    • andes-d45-series
  • Tenstorrent
    • ascalon
  • Nuclei
    • nuclei-100-series
    • nuclei-200-series
    • nuclei-300-series
    • nuclei-310-series
    • nuclei-600-series
    • nuclei-900-series
    • nuclei-1000-series
  • Rocket
    • rocket
  • Imagination
    • rtxm2200
  • SiFive
    • sifive-7-series
  • Syntacore
    • syntacore-scr1-series

Optimization options

  • -O0, -O1, -O2, -O3, -Os

    Specify which optimization level to use:

    • O0: Means “no optimization”: this level compiles the fastest and generates the most debuggable code.

    • O1: Somewhere between -O0 and -O2.

    • O2: Moderate level of optimization which enables most optimizations.

    • O3: Like -O2, except that it enables optimizations that take longer to perform or that may generate larger code (in an attempt to make the program run faster).

    • Os: Like -O2 with extra optimizations to reduce code size.

Troubleshooting

C/C++ compilation options

  1. Auto-vectorization in ZCC is highly aggressive. For the vast majority of innermost loops, it can achieve performance on par with, or even exceeding, handwritten intrinsics. Results for nested (outer) loop vectorization are also strong. For AI kernels commonly used in practice, such as correlation and resizing, the performance of code generated by ZCC's auto-vectorization is very close to that of handwritten intrinsics.

    When enabling auto-vectorization with the options -march=rv32/64gcv -O3, you can activate aggressive optimizations for operations narrower than 32 (or 64) bits by adding the option -mllvm --no-integer-promotions. Note that when this option is used, the source code must not contain implicit type conversions, as they could result in undefined behavior. Examples of correct and incorrect usage are as follows:

    • Correct Example

      #include <stddef.h>
      #include <stdint.h>

      void foo(size_t count, int8_t* channel1, int8_t* channel2, int16_t* output) {
        for (size_t i = 0; i < count; i++) {
          // With --no-integer-promotions, only explicit type promotion from int8_t to int16_t occurs
          output[i] = (int16_t)channel1[i] * (int16_t)channel2[i];
        }
      }
    • Incorrect Example

      void foo(size_t count, int8_t* channel1, int8_t* channel2, int16_t* output) {
        for (size_t i = 0; i < count; i++) {
          // Implicit integer promotion of both int8_t operands to int (int32_t) occurs
          output[i] = channel1[i] * channel2[i];
        }
      }
  2. It is sometimes impossible for the compiler to determine which loops require what kind of vectorization, especially in the case of nested loop vectorization. In such cases, users need to use #pragma directives to explicitly instruct the compiler on which loop levels to vectorize, how to configure the maximum register grouping, and other related optimizations. See the example below for reference:

    // Currently, multi-level loop vectorization only supports loops without dependencies, or loops for which the user guarantees that vectorizing at this level introduces no dependencies.
    // The option vectorize(assume_safety) is required. It tells the compiler that the memory accesses in the loop do not overlap (no pointer aliasing or partial aliasing). It also lets the compiler ignore unknown accesses such as a[idx[i]]. Since the compiler does not check these accesses, please ensure that the loops have no dependencies.
    // The option vectorize_width(16, scalable) optionally sets the number of RVV register groups, based on the width of the widest element in the loop multiplied by 16. If it is omitted, the compiler automatically calculates an appropriate number of RVV register groups.
    // The mapping is as follows:
    // mf8 # LMUL=1/8, based on 8 bits
    // mf4 # LMUL=1/4, based on 16 bits
    // mf2 # LMUL=1/2, based on 32 bits
    // m1 # LMUL=1, based on 64 bits
    // m2 # LMUL=2, based on 128 bits
    // m4 # LMUL=4, based on 256 bits
    // m8 # LMUL=8, based on 512 bits
    #pragma clang loop vectorize(assume_safety) vectorize_width(16, scalable)
    for (uint16_t w = 0; w < width; w++) {
    .......
    }
    info

    Please refer to the LLVM User Manual for specific usage examples.

  3. To enable nested (outer) loop auto-vectorization, an additional option -mllvm --enable-vplan-native-path needs to be added.

    tip

    This option is in the development stage and only needs to be added if the outer loop uses pragma. It will be removed when auto-vectorization is fully optimized.

  4. ZCC enables link-time optimization (LTO) by default, so the intermediate object files generated with the -c option contain LLVM bitcode. To inspect the assembly code of the intermediate results or to disable link-time optimization, add the -fno-lto option.

  5. More aggressive code size optimization options:

    • -mllvm --riscv-machine-outliner=true: In the case of LTO (which is enabled by default in ZCC), this option requires appending -Wl,-mllvm,--riscv-machine-outliner=true. This optimization is focused on reducing code size, but it may result in a decrease in program performance.

    • -config small.cfg: This option enables ZCC to link libraries optimized for code size.

  6. By default, ZCC does not enable the fp-contract optimization. When the F and D extensions are available, you can manually enable this optimization using the -ffp-contract option, which allows the generation of additional FMAD-type instructions.

  7. The -munaligned-access option allows the compiler to generate unaligned memory access instructions and to align global variables to 1 byte.

  8. ZCC does not generate RVV strided/indexed load instructions by default. The -mllvm --riscv-enable-gather option enables their generation.

  9. ZCC RVV auto-vectorization uses a default register grouping of LMUL = 8. The -mllvm --riscv-v-register-bit-width-lmul option allows you to specify the vector register grouping, supporting LMUL values of 1, 2, 4, and 8.

  10. Data locality optimization

    The -fdlo option enables data locality optimization and must be included in both the compile and link options. Please note that this optimization is still in the testing phase.

  11. The default minimum trip count for RVV loop vectorization is 5.

    The -mllvm --rvv-vectorizer-min-trip-count option allows you to specify the minimum trip count for loop vectorization. If the loop count is smaller than this value, the loop will not be vectorized.

  12. Delayed loop unrolling optimization can be enabled using the -flate-loop-unroll option. This allows ZCC to use more efficient loop unrolling algorithms during both the compilation and linking processes.

Fortran compilation options

When using ZFC, you need to add the following options to specify the target and the libraries, together with their path in the local file system.

--target=riscv64-unknown-linux-gnu -L ./install-zcc_protected/riscv64-unknown-linux-gnu/lib -lFortran_main -lFortranRuntime -lpthread -lm
tip

The Fortran compiler ZFC currently only supports Linux rv64imafdc. Other architectures will be supported in the future.

gcc -specs

GCC allows multiple --specs options to specify configuration files that override default settings. For newlib, spec files such as nano.specs, nosys.specs, and semihost.specs can be used.

In contrast, ZCC uses a single combined cfg file specified with the --config option. Since ZCC does not allow multiple --config options, the cfg files provided for newlib in ZCC are nano-nosys.cfg, nano.cfg, nosys.cfg, sim.cfg and semihost.cfg.

Multi target

ZCC is a multi-target, multi-arch, multi-ABI compiler. By default, ZCC generates code for rv32imafdc/ilp32d. To generate code for other targets, specify both -march=<arch> and -mabi=<abi>. The arch/abi combinations supported by ZCC for the RISC-V architecture are listed in Multilib.

warning

When using ZCC to generate RV64 code, you must also specify --target=riscv64-unknown-elf; otherwise it will cause a link error.

Code model

The ZCC compiler supports the medany and medlow options, which are equivalent to the medium and small options, respectively, in the GCC compiler for the RISC-V architecture.

Align arguments (RISC-V)

Align target | GCC | ZCC
align function | -falign-functions=n:m:n2:m2; --align-functions=n:m:n2:m2 | -falign-functions=N; -mllvm --align-all-functions=unit
align all branch targets | -falign-labels=n:m:n2:m2; --align-labels=n:m:n2:m2 | -falign-labels=N (ignore); --align-labels=N (ignore); -mllvm --align-all-blocks=unit
align loops | -falign-loops=n:m:n2:m2; --align-loops=n:m:n2:m2 | -falign-loops=N; --align-loops=N (ignore)
align branch target can only be reached by jumping | -falign-jumps=n:m:n2:m2; --align-jumps=n:m:n2:m2 | -falign-jumps=N (ignore); --align-jumps=N (ignore); -mllvm --align-all-nofallthru-blocks=unit
tip
  • ignore: ZCC accepts this argument but takes no action.

  • N: Must be a power of 2 (e.g., 4 means align on 4-byte boundaries; -falign-functions=8 means that functions are aligned to an 8-byte boundary).

  • unit: Forces the alignment, given in log2 format (e.g., 4 means align on 16-byte boundaries; -mllvm --align-all-functions=8 means that functions are aligned to a 256-byte boundary).

linker script

  1. The ZCC linker does not support the DEFINED macro in linker scripts. For linker scripts that use DEFINED, the following modification is currently required:

    - __stack_size = DEFINED(__stack_size) ? __stack_size : 2K;
    + __stack_size = 2K;
  2. For GNU ld's support of the MEMORY command in linker scripts, refer to the sourceware docs. Currently, ZCC supports most MEMORY command features but does not support the "I" attribute. Linker scripts that use the "I" attribute need to be modified as follows:

    MEMORY
    {
    - ilm (rxai!w) : ORIGIN = 0x80000000, LENGTH = 64K
    - ram (wxa!ri) : ORIGIN = 0x90000000, LENGTH = 64K
    + ilm (rxa!w) : ORIGIN = 0x80000000, LENGTH = 64K
    + ram (wxa!r) : ORIGIN = 0x90000000, LENGTH = 64K
    }
  3. The ZCC linker does not support GNU ld's section-relative assignment to the location counter inside an output section. To advance the current location by an offset, increment the location counter instead. For example, replace . = __stack_size; with . += __stack_size;.

    .stack ORIGIN(ram) + LENGTH(ram) - __stack_size :
    {
    PROVIDE( _heap_end = . );
    - . = __stack_size;
    + . += __stack_size;
    PROVIDE( _sp = . );
    } >ram AT>ram
  4. By default, ZCC compiles non-builtin sections from linker scripts into the .data segment. To place these sections in .bss instead, use the NOLOAD attribute. For example, in the gcc_demosoc_ilm.ld script from nuclei_sdk, you can modify the .stack section as shown below for compatibility with ZCC.

    - .stack ORIGIN(ram) + LENGTH(ram) - __stack_size :
    + .stack ORIGIN(ram) + LENGTH(ram) - __stack_size (NOLOAD) :
    {
    PROVIDE( _heap_end = . );
    - . = __stack_size;
    + . += __stack_size;
    PROVIDE( _sp = . );
    } >ram AT>ram
  5. If you manually use __attribute__((section(".sec_abc"))) to place a specific initialization function pointer into a designated section, but the C/C++ source code does not directly reference the object containing the function pointer, the compiler may optimize away the object. To prevent this, you need to manually add the used attribute, like so: __attribute__((section(".sec_abc"), used)). This forces the compiler to retain the unused object and prevent it from being optimized away.

  6. Negative number representation in assembly code/inline assembly

    The GNU assembler (as) sign-extends 32-bit or 64-bit numbers according to the target platform. For example, on RV32, GNU as recognizes 0xFFFFF800 as -2048, whereas on RV64 it reports an error for the code below. ZCC's assembler, whether targeting RV32 or RV64, always treats 0xFFFFF800 as a positive number.

    and a0, a0, 0xFFFFF800

    Furthermore, the GNU Assembler user guide clearly indicates that when working with negative numbers in assembly code, the negative sign must be placed directly before the number, as demonstrated below:

    and a0, a0, -0x800
  7. lld does not support the ALIGN_WITH_INPUT attribute for output sections. Instead, you can use the ALIGN(x) attribute to specify alignment. For example:

    - .data : ALIGN_WITH_INPUT
    + .data : ALIGN(8)
    {
    . = ALIGN(8);
    ...
    }
  8. Issue when using -M and -Map parameters at the same time in linker

    • GCC behavior dictates that the last parameter takes effect. For example, in -Wl,-M,-Map, -Map takes effect, while in -Wl,-Map,-M, -M takes effect.

    • Clang always gives priority to the -M option when both parameters are used together. Currently, ZCC follows the same behavior as Clang.

  9. The libunwind library used by ZCC references symbols such as __eh_frame_start, __eh_frame_end, __eh_frame_hdr_start, and __eh_frame_hdr_end. When using a custom linker script, you need to include the following code to set the values of these symbols.

    .eh_frame :
    {
    __eh_frame_start = .;
    KEEP(*(.eh_frame))
    __eh_frame_end = .;
    }
    .eh_frame_hdr :
    {
    KEEP(*(.eh_frame_hdr))
    }

    __eh_frame_hdr_start = SIZEOF(.eh_frame_hdr) > 0 ? ADDR(.eh_frame_hdr) : 0;
    __eh_frame_hdr_end = SIZEOF(.eh_frame_hdr) > 0 ? . : 0;

Behavior different from GCC

Uninitialized local variables

Using uninitialized local variables in C/C++ results in undefined behavior because their values may be 0, leftover memory contents, or any other arbitrary value. If the source code uses uninitialized local variables, different compilers may legitimately produce different results.

For example, in the code below, a is an uninitialized local variable. GCC initializes a to 0, while ZCC/Clang initializes a to 0xFFFFFFFF.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    unsigned int a, b;
    b = 1;
    a |= b; /* a is read before it is initialized: undefined behavior */
    printf("a %u, b %u\n", a, b);
    return 0;
}

Additionally, it's important to note that compilers' treatment of uninitialized local variables is not consistent. Therefore, you should not rely on compiler-specific behavior for variable initialization. For instance, in the example below, ZCC/Clang initializes a to 0.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    unsigned int a, b;
    b = 1;
    a &= b; /* a is read before it is initialized: undefined behavior */
    printf("a %u, b %u\n", a, b);
    return 0;
}

To catch such problems, compile with the -Wuninitialized flag. It enables warnings for the use of uninitialized local variables, so the compiler can notify you about potential issues.

Get help

If you need help or have a question about any aspect of ZCC, feel free to discuss it on the 1nfinite developer forum. Our team is here to provide responses and enhance your user experience.

ZCC (Commercial)

We provide an issue tracking system for ZCC (Commercial) users. Please report bugs on the ticket page of Terapines Support.