cpp: Traditional lexical analysis
1
1 10.1 Traditional lexical analysis
1 =================================
1
1 The traditional preprocessor does not decompose its input into tokens
1 the same way a standards-conforming preprocessor does. The input is
1 simply treated as a stream of text with minimal internal form.
1
1 This implementation does not treat trigraphs (see Trigraphs)
1 specially, since they were an invention of the standards committee. It
1 handles arbitrarily positioned escaped newlines properly and splices the
1 lines as you would expect; many traditional preprocessors did not do
1 this.
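The splicing rule can be sketched in a few lines of C. This is only an illustration of the behavior described above, not CPP's actual code; the function name 'splice_lines' and its interface are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of CPP.  Copy SRC to DST, deleting
   every backslash that is immediately followed by a newline, together
   with that newline, wherever the pair occurs.  All other characters,
   including hard tabs, are copied unchanged.  Returns the number of
   characters written, excluding the terminating NUL.  DST must have
   room for strlen (SRC) + 1 bytes.  */
size_t
splice_lines (const char *src, char *dst)
{
  size_t n = 0;

  assert (src != NULL && dst != NULL);
  while (*src != '\0')
    {
      if (src[0] == '\\' && src[1] == '\n')
        src += 2;               /* Drop the escaped newline.  */
      else
        dst[n++] = *src++;      /* Copy everything else as is.  */
    }
  dst[n] = '\0';
  return n;
}
```

Under this sketch, an escaped newline in the middle of a directive name, as in '#def', backslash, newline, 'ine X 1', is spliced into the single line '#define X 1'.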
1
1 The form of horizontal whitespace in the input file is preserved in
1 the output. In particular, hard tabs remain hard tabs. This can be
1 useful if, for example, you are preprocessing a Makefile.
1
1 Traditional CPP only recognizes C-style block comments, and treats
1 the '/*' sequence as introducing a comment only if it lies outside
1 quoted text. Quoted text is introduced by the usual single and double
1 quotes, and also by an initial '<' in a '#include' directive.
1
1 Traditionally, comments are completely removed and are not replaced
1 with a space. Since a traditional compiler does its own tokenization of
1 the output of the preprocessor, this means that comments can effectively
1 be used as token paste operators. However, comments behave like
1 separators for text handled by the preprocessor itself, since it doesn't
1 re-lex its input. For example, in
1
1      #if foo/**/bar
1
1 'foo' and 'bar' are distinct identifiers and are expanded separately if
1 they happen to be macros. In other words, this directive is equivalent to
1
1      #if foo bar
1
1 rather than
1
1      #if foobar
1
1 Generally speaking, in traditional mode an opening quote need not
1 have a matching closing quote. In particular, a macro may be defined
1 with replacement text that contains an unmatched quote. Of course, if
1 you attempt to compile preprocessed output containing an unmatched quote
1 you will get a syntax error.
1
1 However, all preprocessing directives other than '#define' require
1 matching quotes. For example:
1
1      #define m This macro's fine and has an unmatched quote
1      "/* This is not a comment. */
1      /* This is a comment. The following #include directive
1         is ill-formed. */
1      #include <stdio.h
1
1 As in the ISO preprocessor, a would-be closing quote can be escaped
1 with a backslash so that the quoted text does not end there.
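The effect of a backslash on a closing quote can be sketched as a small scanner. The function 'find_closing_quote' is a hypothetical helper written for illustration only; it assumes that an unmatched quote simply stays open to the end of the line.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of CPP.  S points at an opening quote
   character.  Return a pointer to the matching closing quote, skipping
   any character that follows a backslash, or NULL if the quote is
   still open at the end of the line or string.  */
const char *
find_closing_quote (const char *s)
{
  char quote;

  assert (s != NULL && (*s == '\'' || *s == '"'));
  quote = *s++;
  for (; *s != '\0' && *s != '\n'; s++)
    {
      if (*s == '\\' && s[1] != '\0')
        s++;                    /* Skip the escaped character.  */
      else if (*s == quote)
        return s;               /* Unescaped closing quote.  */
    }
  return NULL;
}
```

For the text '"say \"hi\" now"', this returns a pointer to the final double quote, since the two inner quotes are escaped. For text such as ''s fine', where the quote is never closed, it returns NULL.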
1