coreutils: csplit invocation

5.4 ‘csplit’: Split a file into context-determined pieces
=========================================================

‘csplit’ creates zero or more output files containing sections of INPUT
(standard input if INPUT is ‘-’). Synopsis:

     csplit [OPTION]... INPUT PATTERN...

The contents of the output files are determined by the PATTERN
arguments, as detailed below. An error occurs if a PATTERN argument
refers to a nonexistent line of the input file (e.g., if no remaining
line matches a given regular expression). After every PATTERN has been
matched, any remaining input is copied into one last output file.

By default, ‘csplit’ prints the number of bytes written to each
output file after it has been created.

The types of pattern arguments are:

‘N’
     Create an output file containing the input up to but not including
     line N (a positive integer). If followed by a repeat count, also
     create an output file containing the next N lines of the input file
     once for each repeat.

‘/REGEXP/[OFFSET]’
     Create an output file containing the current line up to (but not
     including) the next line of the input file that contains a match
     for REGEXP. The optional OFFSET is an integer. If it is given,
     the input up to (but not including) the matching line plus or minus
     OFFSET is put into the output file, and the next section of input
     begins at that line.

‘%REGEXP%[OFFSET]’
     Like the previous type, except that it does not create an output
     file, so that section of the input file is effectively ignored.

‘{REPEAT-COUNT}’
     Repeat the previous pattern REPEAT-COUNT additional times. The
     REPEAT-COUNT can be either a positive integer or an asterisk,
     meaning repeat as many times as necessary until the input is
     exhausted.
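
As a hedged illustration of the pattern types above, the following
sketch splits a ten-line file in several ways. It assumes GNU ‘csplit’,
‘seq’ and ‘mktemp’ are available. The file name ‘input.txt’ and the
prefixes passed to ‘-f’ are arbitrary choices made for this example, and
‘-s’ only silences the byte counts.

```shell
# Work in a scratch directory so no existing files are touched.
cd "$(mktemp -d)"
seq 10 > input.txt

# 'N': n00 gets lines 1-3, n01 gets the remaining lines 4-10.
csplit -s -f n input.txt 4

# 'N' with a repeat count: split before lines 3, 6 and 9,
# giving r00 (1-2), r01 (3-5), r02 (6-8) and r03 (9-10).
csplit -s -f r input.txt 3 '{2}'

# '/REGEXP/+OFFSET': split one line after the line matching ^5$,
# so o00 holds lines 1-5 and o01 holds lines 6-10.
csplit -s -f o input.txt '/^5$/+1'

# '%REGEXP%': skip lines 1-3 without writing them, then split at 8,
# so s00 holds lines 4-7 and s01 holds lines 8-10.
csplit -s -f s input.txt '%^4$%' '/^8$/'

ls
```

Using a different prefix for each run keeps the pieces from one
invocation from overwriting those of another.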

The output files’ names consist of a prefix (‘xx’ by default)
followed by a suffix. By default, the suffix is an ascending sequence
of two-digit decimal numbers from ‘00’ to ‘99’. In any case,
concatenating the output files in sorted order by file name produces the
original input file, except for any portions skipped with a ‘%REGEXP%’
pattern or dropped by the ‘--suppress-matched’ option.

By default, if ‘csplit’ encounters an error or receives a hangup,
interrupt, quit, or terminate signal, it removes any output files that
it has created so far before it exits.
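
The cleanup behavior just described can be observed with a pattern that
never matches. This is a minimal sketch, assuming GNU ‘csplit’ and a
scratch directory created with ‘mktemp’. It uses the ‘-s’ and ‘-k’
options from the option list in this section, and the variable name
‘count_without_k’ is made up for the example.

```shell
cd "$(mktemp -d)"

# '/9/' never matches in 1..5, so csplit reports an error, exits
# with a nonzero status, and removes the files it had written.
seq 5 | csplit -s - '/3/' '/9/' 2>/dev/null || echo "csplit failed"
count_without_k=$(ls | wc -l)
echo "files left without -k: $count_without_k"

# With -k the pieces written before the error are kept.
seq 5 | csplit -s -k - '/3/' '/9/' 2>/dev/null || echo "csplit failed"
ls
```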

The program accepts the following options. Also see ⇒Common options.

‘-f PREFIX’
‘--prefix=PREFIX’
     Use PREFIX as the output file name prefix.

‘-b FORMAT’
‘--suffix-format=FORMAT’
     Use FORMAT as the output file name suffix. When this option is
     specified, the suffix string must include exactly one
     ‘printf(3)’-style conversion specification, possibly including
     format specification flags, a field width, a precision
     specification, or all of these kinds of modifiers. The format
     letter must convert a binary unsigned integer argument to readable
     form. The format letters ‘d’ and ‘i’ are aliases for ‘u’, and the
     ‘u’, ‘o’, ‘x’, and ‘X’ conversions are allowed. The entire FORMAT
     is given (with the current output file number) to ‘sprintf(3)’ to
     form the file name suffixes for each of the individual output files
     in turn. If this option is used, the ‘--digits’ option is ignored.

‘-n DIGITS’
‘--digits=DIGITS’
     Use output file names containing numbers that are DIGITS digits
     long instead of the default 2.

‘-k’
‘--keep-files’
     Do not remove output files when errors are encountered.

‘--suppress-matched’
     Do not output lines matching the specified PATTERN. I.e., suppress
     the boundary line from the start of the second and subsequent
     splits.

‘-z’
‘--elide-empty-files’
     Suppress the generation of zero-length output files. (In cases
     where the section delimiters of the input file are supposed to mark
     the first lines of each of the sections, the first output file will
     generally be a zero-length file unless you use this option.) The
     output file sequence numbers always run consecutively starting from
     0, even when this option is specified.

‘-s’
‘-q’
‘--silent’
‘--quiet’
     Do not print counts of output file sizes.
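
The naming options and ‘-z’ can be tried out as follows. This is only a
sketch, assuming GNU ‘csplit’. The prefixes ‘part-’, ‘num’ and ‘sec’,
the ‘.txt’ extension and the sample input are invented for the example.

```shell
cd "$(mktemp -d)"

# -f and -b: prefix "part-" plus a zero-padded suffix with an
# extension, giving part-000.txt, part-001.txt and part-002.txt.
seq 6 | csplit -s -f part- -b '%03u.txt' - 3 5
ls part-*

# -n: three-digit suffixes instead of the default two.
seq 6 | csplit -s -f num -n 3 - 4
ls num*

# -z: every section starts with its delimiter line "A", so the
# first piece would be empty; -z drops it and the numbering of the
# remaining files still starts at 00.
printf 'A\n1\nA\n2\n' | csplit -s -z -f sec - '/^A$/' '{*}'
ls sec*
```

Putting literal text such as ‘.txt’ after the conversion in FORMAT is a
common way to give the pieces a file name extension.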

An exit status of zero indicates success, and a nonzero value
indicates failure.

Here is an example of its usage. First, create an empty directory
for the exercise, and cd into it:

     $ mkdir d && cd d

Now, split the sequence of 1..14 on lines that end with 0 or 5:

     $ seq 14 | csplit - '/[05]$/' '{*}'
     8
     10
     15

Each number printed above is the size of an output file that csplit
has just created. List the names of those output files:

     $ ls
     xx00  xx01  xx02

Use ‘head’ to show their contents:

     $ head xx*
     ==> xx00 <==
     1
     2
     3
     4

     ==> xx01 <==
     5
     6
     7
     8
     9

     ==> xx02 <==
     10
     11
     12
     13
     14

Example of splitting input by empty lines:

     $ csplit --suppress-matched INPUT.TXT '/^$/' '{*}'
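
To see what the empty-line split produces, the following sketch runs it
on a small made-up file. It assumes GNU ‘csplit’. The contents written
to ‘INPUT.TXT’ are invented for the example, and ‘-s’ is added only to
silence the byte counts.

```shell
cd "$(mktemp -d)"

# Three paragraphs separated by single empty lines.
printf 'a\nb\n\nc\n\nd\n' > INPUT.TXT

# Split at every empty line and drop the empty lines themselves.
csplit -s --suppress-matched INPUT.TXT '/^$/' '{*}'

# xx00 holds "a" and "b", xx01 holds "c", xx02 holds "d".
head xx*
```

Without ‘--suppress-matched’, the second and later pieces would each
begin with the empty line that delimited them.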