liblouis: The Context and Multipass Opcodes

2.11 The Context and Multipass Opcodes
======================================

The 'context' and multipass opcodes ('pass2', 'pass3' and 'pass4')
provide translation capabilities beyond those of the basic translation
opcodes (⇒Translation Opcodes) discussed previously. The
multipass opcodes cause additional passes to be made over the string to
be translated. The number after the word 'pass' indicates the pass in
which the entry is applied. If no multipass opcodes are given, only the
first translation pass is made. The 'context' opcode is essentially a
multipass opcode for the first pass, although it differs slightly from
the multipass opcodes proper. The format of all these opcodes is
'opcode test action'. The specific opcodes are invoked as follows:

     'context test action'
     'pass2 test action'
     'pass3 test action'
     'pass4 test action'
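
As a rough illustration of the 'opcode test action' layout, the
following Python sketch splits a rule line into its three fields. The
function name `split_rule`, the `PASS_NUMBER` mapping and the sample
rule lines are assumptions made for this example and are not part of
liblouis. The sketch separates fields on whitespace and, as a
simplification, requires exactly three of them.

```python
# Pass in which each opcode's entries apply ('context' works in the first pass).
PASS_NUMBER = {"context": 1, "pass2": 2, "pass3": 3, "pass4": 4}


def split_rule(line):
    """Split a rule line into (pass number, test, action).

    Simplification of this sketch: exactly three whitespace-separated
    fields are required.
    """
    fields = line.split()
    if len(fields) != 3:
        raise ValueError(f"expected 'opcode test action', got: {line!r}")
    opcode, test, action = fields
    if opcode not in PASS_NUMBER:
        raise ValueError(f"not a context or multipass opcode: {opcode!r}")
    return PASS_NUMBER[opcode], test, action


# Illustrative rule lines (hypothetical, not taken from a real table).
print(split_rule("pass2 @1-2-3 @4-5-6"))
print(split_rule(r'context "a\sb" @1-2'))
```

Splitting on whitespace is sufficient here only because the test and
action fields of these sample lines contain no literal spaces.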

The 'test' and 'action' operands have suboperands. Each suboperand
begins with a non-alphanumeric character and ends when another
non-alphanumeric character is encountered. The suboperands and their
initial characters are as follows.

'" (double quote)'
     a string of characters. This string must be terminated by another
     double quote. It may contain any characters. If a double quote is
     needed within the string, it must be preceded by a backslash ('\').
     If a space is needed, it must be represented by the escape sequence
     '\s'. This suboperand is valid only in the test part of the
     'context' opcode.

'@ (at sign)'
     a sequence of dot patterns. Cells are separated by hyphens as
     usual. This suboperand is not valid in the test part of the
     'context' and 'correct' opcodes.

'` (accent mark)'
     true if this is the beginning of the string being translated.
     This suboperand is valid only in the test part and must be the
     first thing in this operand.

'~ (tilde)'
     true if this is the end of the string being translated. This
     suboperand is valid only in the test part and must be the last
     thing in this operand.

'$ (dollar sign)'
     a string of attributes, such as 'd' for digit, 'l' for letter, etc.
     More than one attribute can be given. If you wish to check
     characters with any attribute, use the letter 'a'. Input
     characters are checked to see if they have at least one of the
     attributes. The attribute string can be followed by numbers
     specifying how many characters are to be checked. If no numbers
     are given, 1 is assumed. If two numbers separated by a hyphen are
     given, the input is checked to make sure that at least the first
     number of characters with the attributes are present, but no more
     than the second number. If only one number is present, then
     exactly that many characters must have the attributes. A period
     instead of the numbers indicates an indefinite number of characters
     (for technical reasons the number of characters that are actually
     matched is limited to 65535).

     This suboperand is valid in all test parts but not in action parts.
     For the characters which can be used in attribute strings, see the
     table following this list.

'! (exclamation point)'
     reverses the logical meaning of the suboperand which follows. For
     example, '!$d' is true only if the character is _NOT_ a digit. This
     suboperand is valid in test parts only.

'% (percent sign)'
     the name of a class defined by the 'class' opcode (⇒class) or
     the name of a swap set defined by the swap opcodes (⇒Swap
     Opcodes). Names may contain only letters, in upper or lower case,
     and are case-sensitive. Class names may be used in test parts
     only. Swap names are valid everywhere.

'{ (left brace)'
     followed by the name of a grouping pair. The left brace indicates
     that the first (or left) member of the pair is to be used in
     matching. If this is between replacement brackets it must be the
     only item. This is also valid in the action part.

'} (right brace)'
     followed by the name of a grouping pair. The right brace indicates
     that the second (or right) member is to be used in matching. See
     the remarks on the left brace immediately above.

'/ (slash)'
     Search the input for the expression following the slash and return
     true if found. This can be used to set a variable.

'_ (underscore)'
     Move backward. If a number follows, move backward that number of
     characters. The program never moves backward beyond the beginning
     of the input string. This suboperand is valid only in test parts.

'[ (left bracket)'
     start replacement here. This suboperand must always be paired with
     a right bracket and is valid only in test parts. Multiple pairs of
     square brackets in a single expression are not allowed.

'] (right bracket)'
     end replacement here. This suboperand must always be paired with a
     left bracket and is valid only in test parts.

'# (number sign or crosshatch)'
     test or set a variable. Variables are referred to by numbers 1 to
     50, for example, '#1', '#2', '#25'. Variables may be set by one
     'context' or multipass opcode and tested by another. Thus, an
     operation that occurs at one place in a translation can pass
     information to an operation that occurs later. This feature will
     be used in math translation, and it may also help to alleviate the
     need for new opcodes. This suboperand is valid everywhere.

     Variables are set in the action part. To set a variable use an
     expression like '#1=1', '#2=5', etc. Variables are also
     incremented and decremented in the action part with expressions
     like '#1+', '#3-', etc. These operators increment or decrement the
     variable by 1.

     Variables are tested in the test part with expressions like '#1=2',
     '#3<4', '#5>6', etc.

'* (asterisk)'
     Copy the characters or dot patterns in the input within the
     replacement brackets into the output and discard anything else that
     may match. This feature is used, for example, for handling numeric
     subscripts in Nemeth. This suboperand is valid only in action
     parts.

'? (question mark)'
     Valid only in the action part. The characters to be replaced are
     simply ignored. That is, they are replaced with nothing. If
     either member of a grouping pair is in the replacement brackets,
     the other member at the same level is also removed.
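
The variable suboperand ('#') described above distinguishes between
expressions that change a variable (action part) and expressions that
compare it (test part). The Python sketch below models that split. All
names (`new_variables`, `apply_action`, `check_test`) are hypothetical,
and the initial value of 0 for every variable is an assumption of this
sketch, not something stated in the text.

```python
import re

# Variables are referred to by numbers 1 to 50.
MAX_VARIABLE = 50

_ACTION = re.compile(r"#(\d+)(?:=(\d+)|([+-]))")  # '#1=1', '#1+', '#3-'
_TEST = re.compile(r"#(\d+)([=<>])(\d+)")         # '#1=2', '#3<4', '#5>6'


def new_variables():
    # Assumption of this sketch: every variable starts at 0.
    return {number: 0 for number in range(1, MAX_VARIABLE + 1)}


def _variable_number(text):
    number = int(text)
    if not 1 <= number <= MAX_VARIABLE:
        raise ValueError(f"variable number out of range: {number}")
    return number


def apply_action(variables, expression):
    """Action part: set ('#n=v'), increment ('#n+') or decrement ('#n-')."""
    match = _ACTION.fullmatch(expression)
    if match is None:
        raise ValueError(f"not a variable action: {expression!r}")
    number = _variable_number(match.group(1))
    if match.group(2) is not None:
        variables[number] = int(match.group(2))
    elif match.group(3) == "+":
        variables[number] += 1
    else:
        variables[number] -= 1


def check_test(variables, expression):
    """Test part: compare a variable with '=', '<' or '>'."""
    match = _TEST.fullmatch(expression)
    if match is None:
        raise ValueError(f"not a variable test: {expression!r}")
    value = variables[_variable_number(match.group(1))]
    operator, operand = match.group(2), int(match.group(3))
    if operator == "=":
        return value == operand
    if operator == "<":
        return value < operand
    return value > operand


variables = new_variables()
apply_action(variables, "#1=1")   # set by one rule ...
apply_action(variables, "#1+")
print(check_test(variables, "#1=2"))  # ... tested by another: True
```

Note that '#1=1' means assignment in the action part while '#1=2' means
comparison in the test part, which is why the sketch keeps two separate
functions.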

The characters which can be used in attribute strings are as follows:

'a'
     any attribute
'd'
     digit
'D'
     literary digit
'l'
     letter
'm'
     math
'p'
     punctuation
'S'
     sign
's'
     space
'U'
     uppercase
'u'
     lowercase
'w'
     first user-defined class
'x'
     second user-defined class
'y'
     third user-defined class
'z'
     fourth user-defined class
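
The count syntax of the '$' suboperand (no number, one number, a range,
or a period) can be sketched in Python as follows. The function names
are hypothetical, `str.isdigit` merely stands in for the 'd' attribute,
and the lower bound of 1 for the period form is an assumption of this
sketch. The matcher is greedy and stops once the upper bound is reached.

```python
import re

# Upper bound on an "indefinite" run, as stated in the text.
INDEFINITE_MAX = 65535

# '$' + attribute letters + optional count: 'N', 'N-M' or '.'
_SPEC = re.compile(r"\$([A-Za-z]+)(?:(\d+)(?:-(\d+))?|(\.))?")


def parse_attribute_spec(spec):
    """Return (attribute letters, minimum count, maximum count)."""
    match = _SPEC.fullmatch(spec)
    if match is None:
        raise ValueError(f"not an attribute suboperand: {spec!r}")
    letters, first, second, period = match.groups()
    if period:
        # Assumption of this sketch: an indefinite run needs at least one character.
        return letters, 1, INDEFINITE_MAX
    if first is None:
        return letters, 1, 1  # no numbers given: 1 is assumed
    low = int(first)
    high = int(second) if second is not None else low
    return letters, low, high


def match_run(text, position, has_attribute, low, high):
    """Count characters from `position` that satisfy `has_attribute`.

    Returns the number of characters consumed (at most `high`), or None
    if fewer than `low` characters qualify.
    """
    count = 0
    while (position + count < len(text) and count < high
           and has_attribute(text[position + count])):
        count += 1
    return count if count >= low else None


letters, low, high = parse_attribute_spec("$d1-3")
print(match_run("2024 AD", 0, str.isdigit, low, high))  # 3
```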

The following outlines the algorithm by which text is evaluated against
multipass expressions:

Loop over context, pass2, pass3 and pass4 and do the following for each
pass:

  a. Match the text following the cursor against all expressions in the
     current pass.
  b. If there is no match: shift the cursor one position to the right
     and continue the loop.
  c. If there is a match: choose the longest match.
  d. Do the replacement (everything between square brackets).
  e. Place the cursor after the replaced text.
  f. Continue the loop.
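
The steps above can be sketched in Python as follows. This is a
deliberately reduced model, not the liblouis implementation: a
hypothetical `Rule` holds only literal strings (text before the
brackets, text inside the brackets, text after the brackets, and the
replacement), so attributes, classes and variables are not modelled.
Rules whose prefix and bracketed part are both empty are skipped so
that the cursor always advances; that restriction is an assumption of
this sketch.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    prefix: str       # literal text required before the bracketed part
    target: str       # literal text between [ and ]
    suffix: str       # literal text required after the bracketed part
    replacement: str  # output that replaces the bracketed part

    def pattern(self):
        return self.prefix + self.target + self.suffix


def run_pass(text, rules):
    output = []
    cursor = 0
    while cursor < len(text):
        # a. Match the text following the cursor against all expressions.
        candidates = [
            rule for rule in rules
            if len(rule.prefix) + len(rule.target) > 0
            and text.startswith(rule.pattern(), cursor)
        ]
        if not candidates:
            # b. No match: copy one character and shift the cursor right.
            output.append(text[cursor])
            cursor += 1
            continue
        # c. Choose the longest match.
        best = max(candidates, key=lambda rule: len(rule.pattern()))
        # d. Replace only the bracketed part; the prefix is copied through.
        output.append(best.prefix + best.replacement)
        # e. Place the cursor after the replaced text (the suffix is not consumed).
        cursor += len(best.prefix) + len(best.target)
        # f. Continue the loop.
    return "".join(output)


def translate(text, passes):
    # One rule list per pass, in order: context, pass2, pass3, pass4.
    for rules in passes:
        text = run_pass(text, rules)
    return text


passes = [
    [Rule("", "ab", "", "X"), Rule("", "abc", "", "Y")],  # first pass
    [Rule("", "Y", "d", "Z")],                            # second pass
]
print(translate("abcd ab", passes))  # Zd X
```

In the first pass "abc" wins over "ab" at the start of the input because
it is the longer match; the second pass then sees "Yd X" and replaces
the "Y" only because a "d" follows it.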