as: Xtensa Automatic Alignment

9.55.3.2 Automatic Instruction Alignment
........................................

The Xtensa assembler automatically aligns certain instructions, both to
optimize performance and to satisfy architectural requirements.

As an optimization to improve performance, the assembler attempts to
align branch targets so they do not cross instruction fetch boundaries.
(Xtensa processors can be configured with either 32-bit or 64-bit
instruction fetch widths.) An instruction immediately following a call
is treated as a branch target in this context, because it will be the
target of a return from the call. This alignment can reduce branch
penalties at some expense in code size. The optimization is enabled by
default. You can disable it with the '--no-target-align' command-line
option (see Command Line Options in Xtensa Options).

The target alignment optimization is done without adding instructions
that could increase the execution time of the program. If there are
density instructions in the code preceding a target, the assembler can
change the target alignment by widening some of those instructions to
the equivalent 24-bit instructions. Extra bytes of padding can also be
inserted immediately following unconditional jump and return
instructions, where they are never executed. This approach usually
succeeds in aligning many, but not all, branch targets.
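
The widening idea can be illustrated with a small Python sketch. It is
not the assembler's actual algorithm: it only models the stated goal
(the target must not cross a fetch boundary) and assumes that each
widened density instruction shifts the target by exactly one byte. The
names 'crosses_fetch_boundary' and 'widenings_needed' are hypothetical.

```python
def crosses_fetch_boundary(addr, size, fetch_bytes=4):
    """True if an instruction of `size` bytes at `addr` spans two fetch words."""
    return addr // fetch_bytes != (addr + size - 1) // fetch_bytes


def widenings_needed(target_addr, target_size, density_count, fetch_bytes=4):
    """Smallest number of preceding 16-bit instructions to widen to 24 bits
    so that the target no longer crosses a fetch boundary, or None if the
    available density instructions are not enough (illustrative model only).
    """
    for widened in range(density_count + 1):
        # Each widened instruction grows by one byte and pushes the target
        # one byte further along.
        if not crosses_fetch_boundary(target_addr + widened, target_size,
                                      fetch_bytes):
            return widened
    return None
```

With a 32-bit fetch width, a 3-byte target at address 6 needs two
widenings to reach address 8. If only one density instruction precedes
it, the function returns None, which corresponds to the "not all" case
described above.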

The 'LOOP' family of instructions must be aligned such that the first
instruction in the loop body does not cross an instruction fetch
boundary (e.g., with a 32-bit fetch width, a 'LOOP' instruction must
start at an address congruent to 1 or 2 modulo 4). The assembler knows
about this restriction and inserts the minimum number of 2-byte or
3-byte no-op instructions needed to satisfy it. When no-op instructions
are added, any label immediately preceding the original loop is moved
so that it refers to the loop instruction, not to the newly generated
no-op instruction. To preserve binary compatibility across processors
with different fetch widths, the assembler conservatively assumes a
32-bit fetch width when aligning 'LOOP' instructions (unless the first
instruction in the loop is a 64-bit instruction).
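
The alignment arithmetic for 'LOOP' can be sketched in Python as
follows. This is an illustration, not the assembler's implementation.
It assumes that the 'LOOP' instruction and the first instruction of the
loop body are each 3 bytes long, and that both 2-byte and 3-byte no-ops
are available. The function names are hypothetical.

```python
def crosses_fetch_boundary(addr, size, fetch_bytes=4):
    """True if an instruction of `size` bytes at `addr` spans two fetch words."""
    return addr // fetch_bytes != (addr + size - 1) // fetch_bytes


def loop_padding(addr, body_insn_size=3, nop_sizes=(2, 3), fetch_bytes=4):
    """Return the shortest list of no-op sizes to emit before a LOOP that
    would otherwise start at `addr` (illustrative model only)."""
    LOOP_SIZE = 3  # assumption: LOOP is a 24-bit instruction
    if body_insn_size > fetch_bytes:
        raise ValueError("body instruction cannot fit in one fetch word")
    candidates = [[]]  # try zero no-ops first, then one, then two, ...
    while True:
        for pads in candidates:
            body_start = addr + sum(pads) + LOOP_SIZE
            if not crosses_fetch_boundary(body_start, body_insn_size,
                                          fetch_bytes):
                return pads
        candidates = [pads + [n] for pads in candidates for n in nop_sizes]
```

Under these assumptions, 'loop_padding(1)' and 'loop_padding(2)' return
an empty list, matching the "1 or 2 modulo 4" rule, while
'loop_padding(0)' and 'loop_padding(3)' each return a single 2-byte
no-op.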

Previous versions of the assembler automatically aligned 'ENTRY'
instructions to 4-byte boundaries, but that alignment is now the
programmer's responsibility.
