gcc: Type encoding

8.3 Type Encoding
=================

This is an advanced section. Type encodings are used extensively by the
compiler and by the runtime, but you generally do not need to know about
them to use Objective-C.

The Objective-C compiler generates type encodings for all types.
These type encodings are used at runtime to find out information about
selectors and methods and about objects and classes.

The types are encoded in the following way:

'_Bool'               'B'
'char'                'c'
'unsigned char'       'C'
'short'               's'
'unsigned short'      'S'
'int'                 'i'
'unsigned int'        'I'
'long'                'l'
'unsigned long'       'L'
'long long'           'q'
'unsigned long long'  'Q'
'float'               'f'
'double'              'd'
'long double'         'D'
'void'                'v'
'id'                  '@'
'Class'               '#'
'SEL'                 ':'
'char*'               '*'
'enum'                an 'enum' is encoded exactly as the integer type
                      that the compiler uses for it, which depends on the
                      enumeration values. Often the compiler uses
                      'unsigned int', which is then encoded as 'I'.
unknown type          '?'
Complex types         'j' followed by the inner type. For example
                      '_Complex double' is encoded as "jd".
bit-fields            'b' followed by the starting position of the
                      bit-field, the type of the bit-field and the size of
                      the bit-field (the bit-field encoding was changed
                      from the NeXT compiler's encoding, see below)

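The single-character codes in the table above can be turned back into
type names with a plain lookup. The following is a minimal sketch in C;
the function name 'encoding_scalar_name' is hypothetical and is not part
of the compiler or of the runtime. It covers only the one-character
codes and leaves out 'enum' (which is encoded as its underlying integer
type), Complex types and bit-fields.

```c
#include <stddef.h>

/* Map a one-character type code from the table above to the name of
   the type it stands for.  Hypothetical helper, not a runtime function.
   Returns NULL for anything that is not a single-character scalar code
   (bit-fields, pointers, arrays, structures, and so on). */
const char *encoding_scalar_name(char code)
{
    switch (code) {
    case 'B': return "_Bool";
    case 'c': return "char";
    case 'C': return "unsigned char";
    case 's': return "short";
    case 'S': return "unsigned short";
    case 'i': return "int";
    case 'I': return "unsigned int";
    case 'l': return "long";
    case 'L': return "unsigned long";
    case 'q': return "long long";
    case 'Q': return "unsigned long long";
    case 'f': return "float";
    case 'd': return "double";
    case 'D': return "long double";
    case 'v': return "void";
    case '@': return "id";
    case '#': return "Class";
    case ':': return "SEL";
    case '*': return "char*";
    case '?': return "unknown type";
    default:  return NULL;
    }
}
```

For example, 'encoding_scalar_name ('Q')' yields "unsigned long long",
while a code such as '{' yields NULL because it starts a structure
rather than naming a scalar type.
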
The encoding of bit-fields has changed to allow bit-fields to be
properly handled by the runtime functions that compute sizes and
alignments of types that contain bit-fields. The previous encoding
contained only the size of the bit-field. Using only this information
it is not possible to reliably compute the size occupied by the
bit-field. This is very important in the presence of the Boehm
garbage collector because objects are allocated using the typed
memory facility available in this collector. The typed memory
allocation requires information about where the pointers are located
inside the object.

The position of the bit-field is the offset, counted in bits, of its
bit closest to the beginning of the structure.

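To illustrate the three parts of a bit-field encoding, here is a
minimal sketch in C that splits a string such as 'b128i3' into
position, type and size. The structure 'bitfield_info' and the function
'parse_bitfield' are hypothetical names chosen for this example; the
sketch relies only on the standard 'sscanf' and does not claim to match
how the runtime parses encodings.

```c
#include <stdio.h>

/* The three components of a bit-field encoding (hypothetical type). */
struct bitfield_info {
    unsigned position;  /* offset in bits from the start of the structure */
    char     type;      /* encoding of the declared type, for example 'i' */
    unsigned size;      /* width of the bit-field in bits */
};

/* Parse an encoding of the form b<position><type><size>.
   Returns 1 on success and 0 if ENC does not have that form.
   Anything following the size is ignored. */
int parse_bitfield(const char *enc, struct bitfield_info *out)
{
    return sscanf(enc, "b%u%c%u",
                  &out->position, &out->type, &out->size) == 3;
}
```

With this sketch, 'b128i3' gives position 128, type 'i' and size 3, that
is, a 3-bit 'int' bit-field starting 128 bits into the structure.
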
The non-atomic types are encoded as follows:

pointers    '^' followed by the pointed type.
arrays      '[' followed by the number of elements in the array
            followed by the type of the elements followed by ']'
structures  '{' followed by the name of the structure (or '?' if the
            structure is unnamed), the '=' sign, the types of the
            members, followed by '}'
unions      '(' followed by the name of the union (or '?' if the
            union is unnamed), the '=' sign, the types of the members,
            followed by ')'
vectors     '![' followed by the vector_size (the number of bytes
            composing the vector) followed by a comma, followed by
            the alignment (in bytes) of the vector, followed by the
            type of the elements followed by ']'

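Because non-atomic encodings nest, code that walks an encoding has to
find where one complete type ends. The following is a minimal sketch in
C of a recursive scanner built directly from the rules above. The names
'skip_digits' and 'skip_type' are hypothetical; the sketch is lenient
(it does not check that the digits it expects are present) and is not
the routine used by the runtime.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Advance past a run of decimal digits (possibly empty). */
static const char *skip_digits(const char *p)
{
    while (isdigit((unsigned char)*p))
        p++;
    return p;
}

/* Return a pointer just past one complete encoded type starting at P,
   or NULL if the encoding is malformed or uses a code not handled
   here.  Hypothetical helper written for this example. */
const char *skip_type(const char *p)
{
    char close;

    if (p == NULL || *p == '\0')
        return NULL;

    switch (*p) {
    /* Type specifier codes, pointers and Complex types are all a
       one-character prefix followed by another type. */
    case 'r': case 'n': case 'N': case 'o': case 'O': case 'R': case 'V':
    case '^': case 'j':
        return skip_type(p + 1);
    case '[':                   /* '[' count element-type ']' */
        p = skip_type(skip_digits(p + 1));
        return (p != NULL && *p == ']') ? p + 1 : NULL;
    case '!':                   /* '![' size ',' alignment type ']' */
        if (p[1] != '[')
            return NULL;
        p = skip_digits(p + 2);
        if (*p != ',')
            return NULL;
        p = skip_type(skip_digits(p + 1));
        return (p != NULL && *p == ']') ? p + 1 : NULL;
    case '{': case '(':         /* name, '=', member types, closer */
        close = (*p == '{') ? '}' : ')';
        p++;
        while (*p != '\0' && *p != '=' && *p != close)
            p++;                /* skip the name (or '?') */
        if (*p == '=')
            p++;
        while (p != NULL && *p != '\0' && *p != close)
            p = skip_type(p);   /* skip each member in turn */
        return (p != NULL && *p == close) ? p + 1 : NULL;
    case 'b':                   /* 'b' position type size */
        p = skip_digits(p + 1);
        if (*p == '\0')
            return NULL;
        return skip_digits(p + 1);
    default:                    /* single-character scalar codes */
        return strchr("BcCsSiIlLqQfdDv@#:*?", *p) != NULL ? p + 1 : NULL;
    }
}
```

Calling 'skip_type' on a complete, well-formed encoding such as '[10i]'
returns a pointer to the terminating null character; a truncated
encoding such as '[10i' returns NULL.
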
Here are some types and their encodings, as they are generated by the
compiler on an i386 machine:

Objective-C type                          Compiler encoding
int a[10];                                '[10i]'
struct {                                  '{?=i[3f]b128i3b131i2c}'
  int i;
  float f[3];
  int a:3;
  int b:2;
  char c;
}
int a __attribute__ ((vector_size (16))); '![16,16i]' (alignment
                                          depends on the machine)

In addition to the types, the compiler also encodes the type
specifiers. The table below describes the encoding of the current
Objective-C type specifiers:

Specifier  Encoding
'const'    'r'
'in'       'n'
'inout'    'N'
'out'      'o'
'bycopy'   'O'
'byref'    'R'
'oneway'   'V'

The type specifiers are encoded just before the type. Unlike types,
however, type specifiers are only encoded when they appear in method
argument types.

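Since the specifier codes always come immediately before the type they
apply to, code that only cares about the type can step over them. Below
is a minimal sketch in C; 'qualifier_name' and 'skip_qualifiers' are
hypothetical names used for illustration and are not runtime functions.

```c
#include <stddef.h>
#include <string.h>

/* Return the specifier that a code from the table above stands for,
   or NULL if CODE is not a specifier code. */
const char *qualifier_name(char code)
{
    switch (code) {
    case 'r': return "const";
    case 'n': return "in";
    case 'N': return "inout";
    case 'o': return "out";
    case 'O': return "bycopy";
    case 'R': return "byref";
    case 'V': return "oneway";
    default:  return NULL;
    }
}

/* Skip any specifier codes at the start of ENC and return a pointer
   to the first character of the type that follows them.  Codes that
   appear later in the string, as in "^ri", are left alone. */
const char *skip_qualifiers(const char *enc)
{
    while (*enc != '\0' && strchr("rnNoORV", *enc) != NULL)
        enc++;
    return enc;
}
```

For instance, 'skip_qualifiers ("n^i")' returns a pointer to "^i", and
'qualifier_name ('n')' returns "in".
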
Note how 'const' interacts with pointers:

Objective-C type  Compiler encoding
const int         'ri'
const int*        '^ri'
int *const        'r^i'

'const int*' is a pointer to a 'const int', and so is encoded as '^ri'.
'int* const', instead, is a 'const' pointer to an 'int', and so is
encoded as 'r^i'.

Finally, there is a complication when encoding 'const char *' versus
'char * const'. Because 'char *' is encoded as '*' and not as '^c',
there is no way to express whether 'r' applies to the pointer or to
the pointee.

Hence, it is assumed as a convention that 'r*' means 'const char *'
(since it is what is most often meant), and there is no way to encode
'char *const'. 'char *const' would simply be encoded as '*', and the
'const' is lost.

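The reading rules for 'const' and pointers can be sketched as a small
decoder. The C function below, with the hypothetical name
'describe_const', turns an encoding built only from 'r', '^', '*', 'i'
and 'c' into an English description, applying the 'r*' convention
described above. It is an illustration under those restrictions, not a
general decoder.

```c
#include <stddef.h>
#include <stdio.h>

/* Write an English description of ENC into BUF (of SIZE bytes).
   Only the codes 'r', '^', '*', 'i' and 'c' are understood.
   Returns 0 on success, -1 on an unsupported or malformed encoding
   or if BUF is too small.  Hypothetical helper for illustration. */
int describe_const(const char *enc, char *buf, size_t size)
{
    size_t used = 0;

    if (size == 0)
        return -1;
    buf[0] = '\0';

    for (;;) {
        const char *piece;
        int last = 0;       /* set once a base type has been reached */
        int written;

        if (enc[0] == 'r' && enc[1] == '*') {
            /* Convention: 'r*' is taken to mean 'const char *'. */
            piece = "pointer to const char"; enc += 2; last = 1;
        } else if (*enc == 'r') {
            piece = "const "; enc++;
        } else if (*enc == '^') {
            piece = "pointer to "; enc++;
        } else if (*enc == '*') {
            piece = "pointer to char"; enc++; last = 1;
        } else if (*enc == 'i') {
            piece = "int"; enc++; last = 1;
        } else if (*enc == 'c') {
            piece = "char"; enc++; last = 1;
        } else {
            return -1;
        }

        written = snprintf(buf + used, size - used, "%s", piece);
        if (written < 0 || (size_t)written >= size - used)
            return -1;
        used += (size_t)written;

        if (last)
            return (*enc == '\0') ? 0 : -1;
    }
}
```

With this sketch, '^ri' is described as "pointer to const int", 'r^i' as
"const pointer to int", and 'r*' as "pointer to const char". A plain '*'
is described as "pointer to char", which is also all that survives of
'char *const'.
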