gawk: Basic Data Typing

1 
1 D.2 Data Values in a Computer
1 =============================
1 
1 In a program, you keep track of information and values in things called
1 "variables".  A variable is just a name for a given value, such as
1 'first_name', 'last_name', 'address', and so on.  'awk' has several
1 predefined variables, and it has special names to refer to the current
1 input record and the fields of the record.  You may also group multiple
1 associated values under one name, as an array.
1 
1    Data, particularly in 'awk', consists of either numeric values, such
1 as 42 or 3.1415927, or string values.  String values are essentially
1 anything that's not a number, such as a name.  Strings are sometimes
1 referred to as "character data", since they store the individual
1 characters that comprise them.  Individual variables, as well as numeric
1 and string variables, are referred to as "scalar" values.  Groups of
1 values, such as arrays, are not scalars.
1 
1    ⇒Computer Arithmetic, provided a basic introduction to numeric
1 types (integer and floating-point) and how they are used in a computer.
1 Please review that information, including a number of caveats that were
1 presented.
1 
1    While you are probably used to the idea of a number without a value
1 (i.e., zero), it takes a bit more getting used to the idea of
1 zero-length character data.  Nevertheless, such a thing exists.  It is
1 called the "null string".  The null string is character data that has no
1 value.  In other words, it is empty.  It is written in 'awk' programs
1 like this: '""'.
1 
1    Humans are used to working in decimal; i.e., base 10.  In base 10,
1 numbers go from 0 to 9, and then "roll over" into the next column.
1 (Remember grade school?  42 = 4 x 10 + 2.)
1 
1    There are other number bases though.  Computers commonly use base 2
1 or "binary", base 8 or "octal", and base 16 or "hexadecimal".  In
1 binary, each column represents two times the value in the column to its
1 right.  Each column may contain either a 0 or a 1.  Thus, binary 1010
1 represents (1 x 8) + (0 x 4) + (1 x 2) + (0 x 1), or decimal 10.  Octal
1 and hexadecimal are discussed more in ⇒Nondecimal-numbers.
1 
1    At the very lowest level, computers store values as groups of binary
1 digits, or "bits".  Modern computers group bits into groups of eight,
1 called "bytes".  Advanced applications sometimes have to manipulate bits
1 directly, and 'gawk' provides functions for doing so.
1 
1    Programs are written in programming languages.  Hundreds, if not
1 thousands, of programming languages exist.  One of the most popular is
1 the C programming language.  The C language had a very strong influence
1 on the design of the 'awk' language.
1 
1    There have been several versions of C. The first is often referred to
1 as "K&R" C, after the initials of Brian Kernighan and Dennis Ritchie,
1 the authors of the first book on C. (Dennis Ritchie created the
1 language, and Brian Kernighan was one of the creators of 'awk'.)
1 
1    In the mid-1980s, an effort began to produce an international
1 standard for C. This work culminated in 1989, with the production of the
1 ANSI standard for C. This standard became an ISO standard in 1990.  In
1 1999, a revised ISO C standard was approved and released.  Where it
1 makes sense, POSIX 'awk' is compatible with 1999 ISO C.
1