cpp: Implementation-defined behavior

11.1 Implementation-defined behavior
====================================

This section describes how CPP behaves in each case that the C standard
describes as "implementation-defined".  This term means that the
implementation is free to do what it likes, but must document its choice
and stick to it.

   * The mapping of physical source file multi-byte characters to the
     execution character set.

     The input character set can be specified using the
     '-finput-charset' option, while the execution character set may be
     controlled using the '-fexec-charset' and '-fwide-exec-charset'
     options.

   * Identifier characters.

     The C and C++ standards allow identifiers to be composed of '_' and
     the alphanumeric characters.  C++ also allows universal character
     names.  C99 and later C standards permit both universal character
     names and implementation-defined characters.

     GCC allows the '$' character in identifiers as an extension for
     most targets.  This is true regardless of the '-std=' switch, since
     this extension cannot conflict with standards-conforming programs.
     When preprocessing assembler, however, dollars are not identifier
     characters by default.

     Currently the targets that by default do not permit '$' are AVR,
     IP2K, MMIX, MIPS Irix 3, ARM aout, and PowerPC targets for the AIX
     operating system.

     You can override the default with '-fdollars-in-identifiers' or
     '-fno-dollars-in-identifiers'.  ⇒fdollars-in-identifiers.

   * Non-empty sequences of whitespace characters.

     In textual output, each whitespace sequence is collapsed to a
     single space.  For aesthetic reasons, the first token on each
     non-directive line of output is preceded by sufficient spaces
     that it appears in the same column as it did in the original source
     file.

   * The numeric value of character constants in preprocessor
     expressions.

     The preprocessor and compiler interpret character constants in the
     same way; i.e. escape sequences such as '\a' are given the values
     they would have on the target machine.

     The compiler evaluates a multi-character character constant a
     character at a time, shifting the previous value left by the number
     of bits per target character, and then or-ing in the bit-pattern of
     the new character truncated to the width of a target character.
     The final bit-pattern is given type 'int', and is therefore signed,
     regardless of whether single characters are signed or not.  If
     there are more characters in the constant than would fit in the
     target 'int', the compiler issues a warning, and the excess leading
     characters are ignored.

     For example, ''ab'' for a target with an 8-bit 'char' would be
     interpreted as
     '(int) ((unsigned char) 'a' * 256 + (unsigned char) 'b')', and
     ''\234a'' as
     '(int) ((unsigned char) '\234' * 256 + (unsigned char) 'a')'.

   * Source file inclusion.

     For a discussion on how the preprocessor locates header files,
     ⇒Include Operation.

   * Interpretation of the filename resulting from a macro-expanded
     '#include' directive.

     ⇒Computed Includes.

   * Treatment of a '#pragma' directive that after macro-expansion
     results in a standard pragma.

     No macro expansion occurs on any '#pragma' directive line, so the
     question does not arise.

     Note that GCC does not yet implement any of the standard pragmas.
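
As a minimal sketch of the '$' extension described under "Identifier
characters" above, the following C fragment uses a dollar sign inside
identifier names.  The names 'price$usd', 'tax$rate', 'dollar_demo' and
'dollar_demo_check' are invented for this illustration.  The fragment
assumes a target on which GCC permits '$' by default; it is not expected
to compile on the targets listed as exceptions, or when
'-fno-dollars-in-identifiers' is given.

```c
#include <assert.h>

/* Hypothetical example: '$' in identifiers is a GCC extension, not
   standard C, and is assumed to be enabled for the current target.  */
int dollar_demo(int price$usd)
{
    int tax$rate = 5;   /* local identifier containing '$' */

    /* '$' behaves like any other identifier character here.  */
    return price$usd + tax$rate;
}

/* Small self-check: 37 + 5 is 42.  */
void dollar_demo_check(void)
{
    assert(dollar_demo(37) == 42);
}
```

Because the extension is accepted regardless of the '-std=' switch, the
same fragment should behave identically under different language
standards on such a target.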
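
The evaluation rule for multi-character constants, described above under
"The numeric value of character constants in preprocessor expressions",
can be sketched in C.  The helper names 'multichar_value', 'literal_ab',
'literal_234a' and 'check_multichar_examples' are invented for this
illustration.  The sketch assumes GCC, an 8-bit 'char' with an
ASCII-compatible execution character set, and an 'int' wider than one
character.  The '#pragma GCC diagnostic' line only silences the warning
about multi-character constants.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Silence the multi-character constant warning for this illustration.  */
#pragma GCC diagnostic ignored "-Wmultichar"

/* The preprocessor is documented to interpret character constants the
   same way as the compiler; this check assumes an 8-bit 'char'.  */
#if 'ab' != 'a' * 256 + 'b'
#error "unexpected value for 'ab' in a preprocessor expression"
#endif

/* Model of the rule in the text: shift the previous value left by the
   number of bits per character, then OR in the new character truncated
   to the width of a character.  Bits shifted out of the top are lost,
   which mirrors "excess leading characters are ignored".  Assumes that
   'unsigned int' is wider than one character.  */
int multichar_value(const char *chars, size_t count)
{
    unsigned int value = 0;
    for (size_t i = 0; i < count; i++)
        value = (value << CHAR_BIT) | (unsigned char) chars[i];
    return (int) value;   /* the result has type 'int', hence signed */
}

/* The two constants used as examples in the text.  */
int literal_ab(void)   { return 'ab'; }
int literal_234a(void) { return '\234a'; }

/* Compare the compiler's values with the expanded expressions given in
   the text and with the model above.  */
void check_multichar_examples(void)
{
    assert(literal_ab()
           == (int) ((unsigned char) 'a' * 256 + (unsigned char) 'b'));
    assert(literal_234a()
           == (int) ((unsigned char) '\234' * 256 + (unsigned char) 'a'));
    assert(literal_ab() == multichar_value("ab", 2));
    assert(literal_234a() == multichar_value("\234a", 2));
}
```

Under these assumptions a call to 'check_multichar_examples' should
return without tripping an assertion.  The casts to 'unsigned char'
matter for ''\234'', which is negative as a plain 'char' on targets
where 'char' is signed.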