sed: Branching and flow control
1
1 6.4 Branching and Flow Control
1 ==============================
1
1 The branching commands 'b', 't', and 'T' enable changing the flow of
1 'sed' programs.
1
1 By default, 'sed' reads an input line into the pattern buffer, then
1 continues to process all commands in order. Commands without
1 addresses affect all lines. Commands with addresses affect only
1 matching lines. See ⇒Execution Cycle and ⇒Addresses overview.
1
1 'sed' does not support a typical 'if/then' construct. Instead, some
1 commands can be used as conditionals or to change the default flow
1 control:
1
1 'd'
1 delete (clear) the current pattern space, and restart the program
1 cycle without processing the rest of the commands and without
1 printing the pattern space.
1
1 'D'
1 delete the contents of the pattern space _up to the first newline_,
1 and restart the program cycle without processing the rest of the
1 commands and without printing the pattern space.
1
1 '[addr]X'
1 '[addr]{ X ; X ; X }'
1 '/regexp/X'
1 '/regexp/{ X ; X ; X }'
1 Addresses and regular expressions can be used as an 'if/then'
1 conditional: If [ADDR] matches the current pattern space, execute
1 the command(s). For example: The command '/^#/d' means: _if_ the
1 current pattern matches the regular expression '^#' (a line
1 starting with a hash), _then_ execute the 'd' command: delete the
1 line without printing it, and restart the program cycle
1 immediately.
1
1 'b'
1 branch unconditionally (that is: always jump to a label, skipping
1 or repeating other commands, without restarting a new cycle).
1 Combined with an address, the branch can be conditionally executed
1 on matched lines.
1
1 't'
1 branch conditionally (that is: jump to a label) _only if_ a 's///'
1 command has succeeded since the last input line was read or another
1 conditional branch was taken.
1
1 'T'
1 similar but opposite to the 't' command: branch only if there have
1 been _no_ successful substitutions since the last input line was
1 read.
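
As a minimal sketch of the difference between 't' and 'T', the commands below run the same substitution and then use a label-less branch to skip the rest of the script. The sample input and the variable names 'input', 'out_t' and 'out_T' are made up for illustration, and GNU 'sed' is assumed because 'T' is a GNU extension.

```shell
# Hypothetical sample input: one line containing '=', one without.
input=$(printf '%s\n' 'foo=1' 'bar')

# 't' with no label: skip the remaining commands if s/=/: / succeeded.
out_t=$(printf '%s\n' "$input" | sed -e 's/=/: /' -e 't' -e 's/$/ (no value)/')

# 'T' with no label (GNU extension): skip the remaining commands if it failed.
out_T=$(printf '%s\n' "$input" | sed -e 's/=/: /' -e 'T' -e 's/$/ (has value)/')

printf '%s\n' "$out_t" "$out_T"
```

With 't', the suffix is appended only to 'bar', the line on which the first substitution failed. With 'T', it is appended only to the 'foo' line, on which the substitution succeeded.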
1
1 The following two 'sed' programs are equivalent. The first
1 (contrived) example uses the 'b' command to skip the 's///' command on
1 lines containing '1'. The second example uses an address with negation
1 ('!') to perform substitution only on desired lines. The 'y///' command
1 is still executed on all lines:
1
1 $ printf '%s\n' a1 a2 a3 | sed -E '/1/bx ; s/a/z/ ; :x ; y/123/456/'
1 a4
1 z5
1 z6
1
1 $ printf '%s\n' a1 a2 a3 | sed -E '/1/!s/a/z/ ; y/123/456/'
1 a4
1 z5
1 z6
1
1 6.4.1 Branching and Cycles
1 --------------------------
1
1 The 'b', 't' and 'T' commands can be followed by a label (typically a
1 single letter). Labels are defined with a colon followed by one or more
1 letters (e.g. ':x'). If the label is omitted the branch commands
1 restart the cycle. Note the difference between branching to a label and
1 restarting the cycle: when a cycle is restarted, 'sed' first prints the
1 current content of the pattern space, then reads the next input line
1 into the pattern space; jumping to a label (even if it is at the
1 beginning of the program) does not print the pattern space and does not
1 read the next input line.
1
1 The following program is a no-op. The 'b' command (the only command
1 in the program) does not have a label, and thus simply restarts the
1 cycle. On each cycle, the pattern space is printed and the next input
1 line is read:
1
1 $ seq 3 | sed b
1 1
1 2
1 3
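
Combined with an address, a label-less 'b' skips the remaining commands for the selected lines only. A minimal sketch, in which the address '2' and the appended '!' are arbitrary choices for illustration:

```shell
# Line 2 branches to the end of the script, so the s command never sees it.
out=$(seq 3 | sed -e '2b' -e 's/$/!/')
printf '%s\n' "$out"
```

Lines 1 and 3 are printed as '1!' and '3!', while line 2 is printed unchanged.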
1
1 The following example is an infinite loop: it doesn't terminate and
1 doesn't print anything. The 'b' command jumps to the 'x' label, and a
1 new cycle is never started:
1
1 $ seq 3 | sed ':x ; bx'
1
1 # The above command requires GNU sed (which supports additional
1 # commands following a label, without a newline). A portable equivalent:
1 # sed -e ':x' -e bx
1
1 Branching is often complemented with the 'n' or 'N' commands: both
1 commands read the next input line into the pattern space without waiting
1 for the cycle to restart. Before reading the next input line, 'n'
1 prints the current pattern space then empties it, while 'N' appends a
1 newline and the next input line to the pattern space.
1
1 Consider the following two examples:
1
1 $ seq 3 | sed ':x ; n ; bx'
1 1
1 2
1 3
1
1 $ seq 3 | sed ':x ; N ; bx'
1 1
1 2
1 3
1
1 * Neither example loops forever, despite never starting a new cycle.
1
1 * In the first example, the 'n' command first prints the content of
1 the pattern space, empties the pattern space, then reads the next
1 input line.
1
1 * In the second example, the 'N' command appends the next input line
1 to the pattern space (with a newline). Lines are accumulated in
1 the pattern space until there are no more input lines to read, then
1 the 'N' command terminates the 'sed' program. When the program
1 terminates, the end-of-cycle actions are performed, and the entire
1 pattern space is printed.
1
1 * The second example requires GNU 'sed', because it uses the
1 non-POSIX-standard behavior of 'N'. See the "'N' command on the
1 last line" paragraph in ⇒Reporting Bugs.
1
1 * To further examine the difference between the two examples, try the
1 following commands:
1 printf '%s\n' aa bb cc dd | sed ':x ; n ; = ; bx'
1 printf '%s\n' aa bb cc dd | sed ':x ; N ; = ; bx'
1 printf '%s\n' aa bb cc dd | sed ':x ; n ; s/\n/***/ ; bx'
1 printf '%s\n' aa bb cc dd | sed ':x ; N ; s/\n/***/ ; bx'
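
The four commands can be run side by side as in the sketch below. It assumes GNU 'sed', since the 'N' variants rely on its behavior at the end of input, and the variable names are illustrative only.

```shell
input=$(printf '%s\n' aa bb cc dd)

# 'n' prints the current line, then '=' prints the number of the line just read.
n_eq=$(printf '%s\n' "$input" | sed ':x ; n ; = ; bx')
# 'N' accumulates lines, so the numbers appear before any text is printed.
N_eq=$(printf '%s\n' "$input" | sed ':x ; N ; = ; bx')
# With 'n' the pattern space never contains a newline, so s/\n/***/ never matches.
n_sub=$(printf '%s\n' "$input" | sed ':x ; n ; s/\n/***/ ; bx')
# With 'N' each appended newline is replaced in turn.
N_sub=$(printf '%s\n' "$input" | sed ':x ; N ; s/\n/***/ ; bx')

printf '%s\n' "$n_eq" '--' "$N_eq" '--' "$n_sub" '--' "$N_sub"
```

With 'n', text and line numbers alternate ('aa', '2', 'bb', '3', and so on). With 'N', the numbers '2', '3' and '4' come first, followed by the four accumulated lines. The third command prints the input unchanged, and the fourth prints the single line 'aa***bb***cc***dd'.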
1
1 6.4.2 Branching example: joining lines
1 --------------------------------------
1
1 As a real-world example of using branching, consider the case of
1 quoted-printable (https://en.wikipedia.org/wiki/Quoted-printable) files,
1 typically used to encode email messages. In these files long lines are
1 split and marked with a "soft line break" consisting of a single '='
1 character at the end of the line:
1
1 $ cat jaques.txt
1 All the wor=
1 ld's a stag=
1 e,
1 And all the=
1  men and wo=
1 men merely =
1 players:
1 They have t=
1 heir exits =
1 and their e=
1 ntrances;
1 And one man=
1  in his tim=
1 e plays man=
1 y parts.
1
1 The following program uses an address match '/=$/' as a conditional:
1 If the current pattern space ends with a '=', it reads the next input
1 line using 'N', replaces all '=' characters which are followed by a
1 newline, and unconditionally branches ('b') to the beginning of the
1 program without restarting a new cycle. If the pattern space does not
1 end with '=', the default action is performed: the pattern space is
1 printed and a new cycle is started:
1
1 $ sed ':x ; /=$/ { N ; s/=\n//g ; bx }' jaques.txt
1 All the world's a stage,
1 And all the men and women merely players:
1 They have their exits and their entrances;
1 And one man in his time plays many parts.
1
1 Here's an alternative program with a slightly different approach: On
1 all lines except the last, 'N' appends the next input line to the
1 pattern space. A substitution command then removes soft line breaks
1 ('=' at the end of a line, i.e. followed by a newline) by replacing
1 them with an empty string. _If_ the substitution was successful
1 (meaning the pattern space contained a line which should be joined),
1 the conditional branch command 't' jumps to the beginning of the
1 program without completing or restarting the cycle. If the
1 substitution failed (meaning there were no soft line breaks), the 't'
1 command will _not_ branch. Then, 'P' will print the pattern space
1 content up to the first newline, and 'D' will delete the pattern space
1 content up to the first newline. (To learn more about the 'N', 'P' and
1 'D' commands, see ⇒Multiline techniques.)
1
1 $ sed ':x ; $!N ; s/=\n// ; tx ; P ; D' jaques.txt
1 All the world's a stage,
1 And all the men and women merely players:
1 They have their exits and their entrances;
1 And one man in his time plays many parts.
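
To try both programs without the 'jaques.txt' file, the sketch below feeds them a short made-up sample through a temporary file and checks that they produce the same result. The sample text, the 'tmp', 'out_b' and 'out_t' variables, and the use of 'mktemp' are assumptions for illustration. GNU 'sed' is assumed, since both programs place commands after a label on the same line.

```shell
tmp=$(mktemp)
# Hypothetical sample with soft line breaks ('=' at the end of a line).
printf '%s\n' 'one t=' 'wo' 'three=' ' four=' ' five' 'six' > "$tmp"

# First approach: loop with an address and an unconditional branch.
out_b=$(sed ':x ; /=$/ { N ; s/=\n//g ; bx }' "$tmp")
# Second approach: conditional branch 't', then 'P' and 'D'.
out_t=$(sed ':x ; $!N ; s/=\n// ; tx ; P ; D' "$tmp")

rm -f "$tmp"
printf '%s\n' "$out_b"
[ "$out_b" = "$out_t" ] && echo 'same output'
```

Both programs join the sample into the three lines 'one two', 'three four five' and 'six'.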
1
1 For more line-joining examples, see ⇒Joining lines.
1