cpp: Implementation-defined behavior
1
1 11.1 Implementation-defined behavior
1 ====================================
1
1 This section describes how CPP behaves in the cases that the C standard
1 designates as "implementation-defined". The term means that an
1 implementation is free to choose its behavior, but must document the
1 choice and adhere to it.
1
1 * The mapping of physical source file multi-byte characters to the
1 execution character set.
1
1 The input character set can be specified using the
1 '-finput-charset' option, while the execution character set may be
1 controlled using the '-fexec-charset' and '-fwide-exec-charset'
1 options.
1
1 * Identifier characters.
1
1 The C and C++ standards allow identifiers to be composed of '_' and
1 the alphanumeric characters. C++ also allows universal character
1 names. C99 and later C standards permit both universal character
1 names and implementation-defined characters.
1
1 GCC allows the '$' character in identifiers as an extension for
1 most targets. This is true regardless of the '-std=' switch, since
1 this extension cannot conflict with standards-conforming programs.
1 When preprocessing assembler, however, dollars are not identifier
1 characters by default.
1
1 Currently the targets that by default do not permit '$' are AVR,
1 IP2K, MMIX, MIPS Irix 3, ARM aout, and PowerPC targets for the AIX
1 operating system.
1
1 You can override the default with '-fdollars-in-identifiers' or
1 '-fno-dollars-in-identifiers'. ⇒fdollars-in-identifiers.
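
As a minimal sketch of the extension, the fragment below uses '$' in a
macro name, a function name, and a local variable. It assumes a target
on which GCC accepts '$' by default, or that '-fdollars-in-identifiers'
was given. All of the names are hypothetical and chosen only for
illustration.

```c
/* Assumes '$' is permitted in identifiers on this target, either by
   default or through -fdollars-in-identifiers.  All names here are
   hypothetical. */
#define CENTS_PER$ 100

static int price$cents(int dollars)
{
    int total$ = dollars * CENTS_PER$;   /* '$' in a local name */
    return total$;
}
```

With '-fno-dollars-in-identifiers' the same fragment is expected to be
rejected, since '$' then no longer forms part of an identifier.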
1
1 * Non-empty sequences of whitespace characters.
1
1 In textual output, each whitespace sequence is collapsed to a
1 single space. For aesthetic reasons, the first token on each
1 non-directive line of output is preceded with sufficient spaces
1 that it appears in the same column as it did in the original source
1 file.
1
1 * The numeric value of character constants in preprocessor
1 expressions.
1
1 The preprocessor and compiler interpret character constants in the
1 same way; i.e. escape sequences such as '\a' are given the values
1 they would have on the target machine.
1
1 The compiler evaluates a multi-character character constant a
1 character at a time, shifting the previous value left by the number
1 of bits per target character, and then or-ing in the bit-pattern of
1 the new character truncated to the width of a target character.
1 The final bit-pattern is given type 'int', and is therefore signed,
1 regardless of whether single characters are signed or not. If
1 there are more characters in the constant than would fit in the
1 target 'int', the compiler issues a warning and ignores the excess
1 leading characters.
1
1 For example, ''ab'' for a target with an 8-bit 'char' would be
1 interpreted as
1 '(int) ((unsigned char) 'a' * 256 + (unsigned char) 'b')', and
1 ''\234a'' as
1 '(int) ((unsigned char) '\234' * 256 + (unsigned char) 'a')'.
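
The evaluation rule above can be sketched in C as follows. The helper
'fold_multichar' is hypothetical, not part of GCC: it repeats the
shift-and-or steps by hand so that the result can be compared with
what the compiler and the preprocessor compute. The sketch assumes an
8-bit 'char' that is narrower than 'int', as in the example above.
GCC normally warns about multi-character constants ('-Wmultichar'),
so expect a diagnostic when compiling it.

```c
#include <limits.h>

/* Hypothetical helper: fold COUNT characters the way the text describes,
   assuming a char is narrower than an int. */
static int fold_multichar(const char *chars, int count)
{
    unsigned int value = 0;
    for (int i = 0; i < count; i++) {
        /* Shift the previous value left by one character width, then
           OR in the new character truncated to that width.  Excess
           leading characters fall off the top of the value. */
        value = (value << CHAR_BIT) | (unsigned char) chars[i];
    }
    return (int) value;   /* the final bit pattern is given type int */
}

/* The preprocessor is expected to agree with the compiler.  The factor
   256 assumes an 8-bit char. */
#if 'ab' == ('a' * 256 + 'b')
#define CPP_AGREES 1
#else
#define CPP_AGREES 0
#endif

/* The value the compiler itself assigns to the constant. */
static int compiler_value(void)
{
    return 'ab';
}
```

Under these assumptions, 'fold_multichar ("ab", 2)' should equal
'compiler_value ()', and 'CPP_AGREES' should be 1.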
1
1 * Source file inclusion.
1
1 For a discussion of how the preprocessor locates header files, see
1 ⇒Include Operation.
1
1 * Interpretation of the filename resulting from a macro-expanded
1 '#include' directive.
1
1 ⇒Computed Includes.
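
For orientation, a macro-expanded '#include' looks like the sketch
below. The macro name 'SYSTEM_HEADER' and the function 'bits_per_char'
are hypothetical. The fragment only shows the form of the directive
and does not cover how the resulting filename is interpreted.

```c
/* A computed include: the header name comes from a macro expansion.
   SYSTEM_HEADER is a hypothetical macro name. */
#define SYSTEM_HEADER <limits.h>
#include SYSTEM_HEADER

static int bits_per_char(void)
{
    /* CHAR_BIT is visible only if the include above took effect. */
    return CHAR_BIT;
}
```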
1
1 * Treatment of a '#pragma' directive that after macro-expansion
1 results in a standard pragma.
1
1 No macro expansion occurs on any '#pragma' directive line, so the
1 question does not arise.
1
1 Note that GCC does not yet implement any of the standard pragmas.
1