liblouis: Overview

1 
1 2.1 Overview
1 ============
1 
1 Many translation (contraction) tables have already been written.  They
1 are included in the distribution in the tables directory and can be
1 studied as part of the documentation.  Some of the more helpful (and
1 normative) are listed in the following table:
1 
1 'chardefs.cti'
1      Character definitions for U.S. tables
1 'compress.ctb'
1      Remove excessive whitespace
1 'en-us-g1.ctb'
1      Uncontracted American English
1 'en-us-g2.ctb'
1      Contracted or Grade 2 American English
1 'en-us-brf.dis'
1      Make liblouis output conform to BRF standard
1 'en-us-comp8.ctb'
1      8-dot computer braille for use in coding examples
1 'en-us-comp6.ctb'
1      6-dot computer braille
1 'nemeth.ctb'
1      Nemeth Code translation for use with liblouisutdml
1 'nemeth_edit.ctb'
1      Fixes errors at the boundaries of math and text
1 
1    The names used for files containing translation tables are completely
1 arbitrary.  They are not interpreted in any way by the translator.
1 Contraction tables may be 8-bit ASCII files, UTF-8, 16-bit big-endian
1 Unicode files or 16-bit little-endian Unicode files.  Blank lines are
1 ignored.  Any leading and trailing whitespace (any number of blanks
1 and/or tabs) is ignored.  Lines which begin with a number sign or hatch
1 mark ('#') are ignored, i.e.  they are comments.  If the number sign is
1 not the first non-blank character in the line, it is treated as an
1 ordinary character.  If the first non-blank character is less-than ('<')
1 the line is also treated as a comment.  This makes it possible to mark
1 up tables as xhtml documents.  Lines which are not blank or comments
1 define table entries.  The general format of a table entry is:
1 
1      opcode operands comments
1 
1    Table entries may not be split between lines.  The opcode is a
1 mnemonic that specifies what the entry does.  The operands may be
1 character sequences, braille dot patterns or occasionally something
1 else.  They are described for each opcode; please see the Opcode Index.
1 With some exceptions, opcodes expect a certain number of operands.  Any
1 text on the line after the last operand is ignored, and may be a
1 comment.  A few opcodes accept a variable number of operands.  In this
1 case a number sign ('#') begins a comment unless it is preceded by a
1 backslash ('\').
1 
1    Here are some examples of table entries.
1 
1      # This is a comment.
1      always world 456-2456 A word and the dot pattern of its contraction
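
   The line-level rules above (blank lines, comments, entries) can be
sketched in a few lines of Python.  This is only an illustration of the
rules as stated in this section, not the parser liblouis itself uses;
the function names 'classify_line' and 'split_entry' are hypothetical.
Because the number of operands depends on the opcode, the sketch does
not try to separate operands from a trailing comment.

```python
def classify_line(line):
    """Return 'blank', 'comment', or 'entry' for one table line.

    Leading and trailing blanks and tabs are ignored.  A line whose
    first non-blank character is '#' or '<' is a comment.
    """
    stripped = line.strip(" \t\r\n")
    if not stripped:
        return "blank"
    if stripped[0] in "#<":
        return "comment"
    return "entry"


def split_entry(line):
    """Split an entry into (opcode, remaining whitespace-separated fields).

    The remaining fields hold the operands followed by any comment text;
    telling them apart requires knowing how many operands the opcode
    takes, which this sketch does not model.
    """
    fields = line.split()
    return fields[0], fields[1:]


if __name__ == "__main__":
    for text in ["# This is a comment.",
                 "always world 456-2456 A word and its contraction"]:
        print(classify_line(text), "|", text)
```

   For the 'always' entry above, 'split_entry' yields the opcode
'always' and a field list that starts with 'world' and '456-2456'.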
1 
1    Most opcodes have both a "characters" operand and a "dots" operand,
1 though some have only one and a few have other types.
1 
1    The characters operand consists of any combination of characters and
1 escape sequences preceded and followed by whitespace.  Escape sequences
1 are used to represent difficult characters.  They begin with a backslash
1 ('\').  They are:
1 
1 '\\'
1      backslash
1 '\f'
1      form feed
1 '\n'
1      new line
1 '\r'
1      carriage return
1 '\s'
1      blank (space)
1 '\t'
1      horizontal tab
1 '\v'
1      vertical tab
1 '\e'
1      "escape" character (hex 1b, dec 27)
1 '\xhhhh'
1      4-digit hexadecimal value of a character
1 
1    If liblouis has been compiled for 32-bit Unicode the following are
1 also recognized.
1 
1 '\yhhhhh'
1      5-digit (20 bit) character
1 '\zhhhhhhhh'
1      Full 32-bit value.
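
   As a rough illustration of how these escape sequences map to
characters, here is a minimal Python sketch.  It is not the liblouis
implementation; the function name 'decode_chars_operand' is
hypothetical, and the sketch assumes that a doubled backslash stands for
a literal backslash and that unknown escapes are errors.

```python
# Single-letter escapes and the characters they stand for.
_SIMPLE = {
    "\\": "\\", "f": "\f", "n": "\n", "r": "\r",
    "s": " ", "t": "\t", "v": "\v", "e": "\x1b",
}
# Hexadecimal escapes and the number of hex digits each one takes.
_HEX_LEN = {"x": 4, "y": 5, "z": 8}
_HEX_DIGITS = "0123456789abcdefABCDEF"


def decode_chars_operand(text):
    """Expand the escape sequences in a characters operand."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(text):
            raise ValueError("dangling backslash")
        code = text[i + 1]
        if code in _SIMPLE:
            out.append(_SIMPLE[code])
            i += 2
        elif code in _HEX_LEN:
            n = _HEX_LEN[code]
            digits = text[i + 2:i + 2 + n]
            if len(digits) != n or any(c not in _HEX_DIGITS for c in digits):
                raise ValueError("expected %d hex digits after \\%s" % (n, code))
            out.append(chr(int(digits, 16)))
            i += 2 + n
        else:
            raise ValueError("unknown escape \\%s" % code)
    return "".join(out)


if __name__ == "__main__":
    print(repr(decode_chars_operand(r"a\sb")))
    print(repr(decode_chars_operand(r"\x0041")))
```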
1 
1    The dots operand is a braille dot pattern.  The real braille dots, 1
1 through 8, must be specified with their standard numbers.  liblouis
1 recognizes "virtual dots," which are used for special purposes, such as
1 distinguishing accent marks.  There are seven virtual dots.  They are
1 specified by the number 9 and the letters 'a' through 'f'.  For a
1 multi-cell dot pattern, the cell specifications must be separated from
1 one another by a dash ('-').  For example, the contraction for the
1 English word 'lord' (the letter 'l' preceded by dot 5) would be
1 specified as 5-123.  A space may be specified with the special dot
1 number 0.
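
   The dots notation can be made concrete with a small Python sketch
that turns a dots operand into one bit mask per cell.  This is an
illustration only: the names 'parse_dots' and 'to_unicode_braille' are
hypothetical, and the bit layout (dots 1 to 8 in bits 0 to 7, virtual
dots 9 and 'a' to 'f' in bits 8 to 14) is an assumption of this sketch
rather than a statement about liblouis internals.  Bits 0 to 7 are
chosen to match the Unicode braille patterns block starting at U+2800.

```python
# Map each dot name to one bit: '1'..'8' real dots, '9','a'..'f' virtual.
_DOT_BITS = {name: 1 << i for i, name in enumerate("123456789abcdef")}


def parse_dots(operand):
    """Parse a dots operand such as '5-123' into a list of cell masks."""
    cells = []
    for spec in operand.split("-"):
        if not spec:
            raise ValueError("empty cell in %r" % operand)
        if spec == "0":
            cells.append(0)  # special dot number 0: a blank cell
            continue
        mask = 0
        for dot in spec:
            if dot not in _DOT_BITS:
                raise ValueError("bad dot %r in %r" % (dot, operand))
            mask |= _DOT_BITS[dot]
        cells.append(mask)
    return cells


def to_unicode_braille(cells):
    """Render cells as Unicode braille, dropping the virtual dots."""
    return "".join(chr(0x2800 + (mask & 0xFF)) for mask in cells)


if __name__ == "__main__":
    cells = parse_dots("5-123")  # the contraction for 'lord'
    print(cells, to_unicode_braille(cells))
```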
1 
1    An opcode which is helpful in writing translation tables is
1 'include'.  Its format is:
1 
1      include filename
1 
1    It reads the file indicated by 'filename' and incorporates or
1 includes its entries into the table.  Included files can include other
1 files, which can include other files, etc.  For an example, see what
1 files are included by the entry 'include en-us-g1.ctb' in the table
1 'en-us-g2.ctb'.  If the included file is not in the same directory as
1 the main table, use a full path name for filename.  Tables can also be
1 specified in a table list, in which the table names are separated by
1 commas and given as a single table name in calls to the translation
1 functions.
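
   The way 'include' and table lists combine entries can be pictured
with the following Python sketch.  It works on an in-memory dictionary
of file contents instead of real files, and the names 'expand_includes'
and 'expand_table_list' are hypothetical.  How liblouis locates files,
and how it treats repeated or circular includes, is not described in
this section; the cycle check below is an assumption of the sketch.

```python
def expand_includes(name, files, _stack=()):
    """Return the entries of table 'name' with includes expanded in place.

    'files' maps a table name to its text.  Blank lines and comment
    lines (first non-blank character '#' or '<') are dropped.
    """
    if name in _stack:
        raise ValueError("include cycle at %r" % name)
    entries = []
    for line in files[name].splitlines():
        stripped = line.strip()
        if not stripped or stripped[0] in "#<":
            continue
        fields = stripped.split()
        if fields[0] == "include" and len(fields) >= 2:
            # Splice the included table's entries in at this position.
            entries.extend(expand_includes(fields[1], files, _stack + (name,)))
        else:
            entries.append(stripped)
    return entries


def expand_table_list(table_list, files):
    """Expand a comma-separated table list, one table after another."""
    entries = []
    for name in table_list.split(","):
        entries.extend(expand_includes(name.strip(), files))
    return entries


if __name__ == "__main__":
    # Hypothetical table contents, for illustration only.
    files = {
        "main.tbl": "include base.tbl\nalways world 456-2456\n",
        "base.tbl": "# a comment\nalways lord 5-123\n",
    }
    for entry in expand_includes("main.tbl", files):
        print(entry)
```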
1 
1    The order of the various types of opcodes or table entries is
1 important.  Character-definition opcodes should come first.  However, if
1 the optional 'display' opcode (see the display opcode) is used, it
1 should precede the character-definition opcodes.  Braille-indicator
1 opcodes should come next.  Translation opcodes should follow.  The
1 'context' opcode (see the context opcode) is a translation opcode, even
1 though it is described along with the multipass opcodes.  The multipass
1 opcodes should follow the translation opcodes.  The 'correct' opcode
1 (see the correct opcode) can be used anywhere after the
1 character-definition opcodes, but it is probably a good idea to group
1 all 'correct' opcodes together.  The 'include' opcode (see the include
1 opcode) can be used anywhere, but the order of entries in the
1 combined table must conform to the order given above.  Within each type
1 of opcode, the order of entries is generally unimportant.  Thus the
1 translation entries can be grouped alphabetically or in any other order
1 that is convenient.  Hyphenation tables may be specified either with an
1 'include' opcode or as part of a table list.  They should come after
1 everything else.  Character-definition opcodes are necessary for
1 hyphenation tables to work.
1