gawk: Strtonum Function
1
1 10.2.1 Converting Strings to Numbers
1 ------------------------------------
1
1 The 'strtonum()' function (⇒String Functions) is a 'gawk'
1 extension. The following function provides an implementation for other
1 versions of 'awk':
1
1 # mystrtonum --- convert string to number
1
1 function mystrtonum(str, ret, n, i, k, c)
1 {
1 if (str ~ /^0[0-7]*$/) {
1 # octal
1 n = length(str)
1 ret = 0
1 for (i = 1; i <= n; i++) {
1 c = substr(str, i, 1)
1 # index() returns 0 if c not in string,
1 # includes c == "0"
1 k = index("1234567", c)
1
1 ret = ret * 8 + k
1 }
1 } else if (str ~ /^0[xX][[:xdigit:]]+$/) {
1 # hexadecimal
1 str = substr(str, 3) # lop off leading 0x
1 n = length(str)
1 ret = 0
1 for (i = 1; i <= n; i++) {
1 c = substr(str, i, 1)
1 c = tolower(c)
1 # index() returns 0 if c not in string,
1 # includes c == "0"
1 k = index("123456789abcdef", c)
1
1 ret = ret * 16 + k
1 }
1 } else if (str ~ \
1 /^[-+]?([0-9]+([.][0-9]*([Ee][0-9]+)?)?|([.][0-9]+([Ee][-+]?[0-9]+)?))$/) {
1 # decimal number, possibly floating point
1 ret = str + 0
1 } else
1 ret = "NOT-A-NUMBER"
1
1 return ret
1 }
1
1 # BEGIN { # gawk test harness
1 # a[1] = "25"
1 # a[2] = ".31"
1 # a[3] = "0123"
1 # a[4] = "0xdeadBEEF"
1 # a[5] = "123.45"
1 # a[6] = "1.e3"
1 # a[7] = "1.32"
1 # a[8] = "1.32E2"
1 #
1 # for (i = 1; i in a; i++)
1 # print a[i], strtonum(a[i]), mystrtonum(a[i])
1 # }
1
1 The function first looks for C-style octal numbers (base 8). If the
1 input string matches a regular expression describing octal numbers, then
1 'mystrtonum()' loops through each character in the string. It sets 'k'
1 to the index in '"1234567"' of the current octal digit. The return
1 value will either be the same number as the digit, or zero if the
1 character is not there, which will be true for a '0'. This is safe,
1 because the regexp test in the 'if' ensures that only octal values are
1 converted.
1
1 Similar logic applies to the code that checks for and converts a
1 hexadecimal value, which starts with '0x' or '0X'. The use of
1 'tolower()' simplifies the computation for finding the correct numeric
1 value for each hexadecimal digit.
1
1 Finally, if the string matches the (rather complicated) regexp for a
1 regular decimal integer or floating-point number, the computation 'ret =
1 str + 0' lets 'awk' convert the value to a number.
1
1 A commented-out test program is included, so that the function can be
1 tested with 'gawk' and the results compared to the built-in 'strtonum()'
1 function.
1