aspell: Compound Words

1 
1 C.1 Compound Words
1 ==================
1 
1 In some languages, such as German, it is acceptable to string two words
1 together, thus forming a compound word.  However, there are rules
1 governing when this can be done.  Furthermore, it is not always
1 sufficient to simply concatenate the two words.  For example, sometimes
1 a letter is inserted between the two words.  Aspell currently has
1 support for unconditionally stringing words together.  I tried
1 implementing more sophisticated support for compound words in Aspell,
1 but it was too limiting and no one used it.
1 
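   As a rough illustration of what unconditionally stringing words
together means, the following Python sketch accepts a word if it is in
the dictionary or can be split into a limited number of dictionary
words.  It is not Aspell's code: the function name and the `min_len`
and `limit` parameters are hypothetical and chosen only for this
example.

```python
def is_run_together(word, dictionary, min_len=3, limit=2):
    """Accept `word` if it is a dictionary word, or a plain concatenation
    of at most `limit` dictionary words of at least `min_len` letters.

    Hypothetical sketch: no linking letters, no category rules."""
    if word in dictionary:
        return True

    def can_split(rest, parts_left):
        # `rest` must be covered by at most `parts_left` dictionary words.
        if parts_left == 0:
            return False
        if len(rest) >= min_len and rest in dictionary:
            return True
        # Try every split point that leaves both halves long enough.
        for i in range(min_len, len(rest) - min_len + 1):
            if rest[:i] in dictionary and can_split(rest[i:], parts_left - 1):
                return True
        return False

    return can_split(word, limit)


words = {"haus", "tür", "schlüssel"}
print(is_run_together("haustür", words))
print(is_run_together("haustürschlüssel", words, limit=3))
```

Because nothing restricts which words may be joined or how, a check
like this also accepts combinations that are not valid in the language,
which is the limitation the two parts described next are meant to
address.
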
1    After receiving feedback from several people, it seems that
1 acceptable support for compound words involves two basically
1 independent parts.  If this is not sufficient for your language, please
1 let me know.
1 
1 Part One
1 ========
1 
1 Describes how the word needs to be changed when forming a compound:
1 
1      CMP <flag> <strip> <add> <cond> <cond2>
1 
1      <flag>  is the compound flag
1      <strip> is the string to strip or 0 for the null string
1      <add>   is the string to add or 0 for the null string
1      <cond>  is the condition to match at the end of the current word
1      <cond2> is the condition to match at the beginning of the next word
1 
1 All but the last field are the same as in a suffix entry in the
1 existing affix code.
1 
1    <cond> is a simplified regular expression.  Some examples:
1      . (for anything)
1      e
1      [^aeiou]y
1      [^ey]
1      [aeiou]y
1 
1    It does not seem necessary to change the beginning of a word when
1 forming compounds.
1 
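   To make the proposed `CMP` entry more concrete, here is a minimal
Python sketch of how such a line might be parsed and applied.  Several
things are assumptions and not part of the description above: the names
`CmpRule`, `parse_cmp` and `join` are hypothetical, the conditions are
handed to Python's `re` module (which happens to accept the simplified
expressions listed above), `<cond>` is tested against the word before
stripping, and the sample rules and lowercase words are invented for
illustration.

```python
import re
from typing import NamedTuple, Optional


class CmpRule(NamedTuple):
    flag: str   # compound flag
    strip: str  # string to strip from the end of the first word
    add: str    # string to add in its place
    cond: str   # condition on the end of the first word
    cond2: str  # condition on the beginning of the next word


def parse_cmp(line: str) -> CmpRule:
    """Parse 'CMP <flag> <strip> <add> <cond> <cond2>'; 0 means empty."""
    keyword, flag, strip, add, cond, cond2 = line.split()
    if keyword != "CMP":
        raise ValueError("not a CMP entry: " + line)
    return CmpRule(flag,
                   "" if strip == "0" else strip,
                   "" if add == "0" else add,
                   cond, cond2)


def join(rule: CmpRule, first: str, second: str) -> Optional[str]:
    """Return the compound, or None if the rule does not apply."""
    if not first.endswith(rule.strip):
        return None
    # Assumption: <cond> is anchored at the end of the unmodified word.
    if not re.search("(?:" + rule.cond + ")$", first):
        return None
    # <cond2> is anchored at the beginning of the next word.
    if not re.match(rule.cond2, second):
        return None
    stem = first[:len(first) - len(rule.strip)]
    return stem + rule.add + second


linking_s = parse_cmp("CMP S 0 s t .")   # insert "s" after a final "t"
drop_e = parse_cmp("CMP E e 0 e .")      # drop a final "e"
print(join(linking_s, "arbeit", "zeit"))
print(join(drop_e, "schule", "buch"))
```

Only the end of the first word is modified here, in line with the
observation that the beginning of a word does not seem to need changing.
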
1 Part Two
1 ========
1 
1 Describes the position a word can appear in (beginning, middle, or end)
1 and with which words.
1 
1    To do this, each word can be assigned a category.  Then each category
1 can be given a set of rules to describe how it can be used in a
1 compound word.  For example:
1 
1      A + B: indicates that category A may appear at the beginning of a
1        word when followed by a category B word.  When combined it is then
1        considered a category B word.
1      A + C + B: here a C word may only appear between an A word and a
1        B word.
1      A + A + B
1      A + A
1      A + A + A
1      etc.
1 
1    I have not decided if a word should be allowed to belong to more than
1 one category, as a new category can be created if necessary to mean,
1 for example, words in both category A and B.
1 
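   The category rules can be sketched in the same hedged way.  The
Python fragment below stores each rule as a tuple of category names and
checks whether a sequence of words matches one of them.  The names
`parse_rule` and `compound_category` are hypothetical, each word has
exactly one category, and the result is assumed to take the category of
the last word, which the text above states only for the `A + B` case.

```python
def parse_rule(text):
    """Turn 'A + C + B' into the tuple ('A', 'C', 'B')."""
    return tuple(part.strip() for part in text.split("+"))


def compound_category(words, categories, rules):
    """Return the category of the compound formed by `words`, or None
    if some word has no category or no rule matches the sequence."""
    try:
        sequence = tuple(categories[w] for w in words)
    except KeyError:
        return None
    if sequence in rules:
        # Assumption: the compound inherits the last word's category.
        return sequence[-1]
    return None


# Invented sample data: one category per word.
categories = {"haus": "A", "garten": "A", "tür": "B"}
rules = {parse_rule("A + B"), parse_rule("A + A + B"), parse_rule("A + A")}

print(compound_category(["haus", "tür"], categories, rules))
print(compound_category(["tür", "haus"], categories, rules))
```

This sketch only matches whole sequences against the rule set; it does
not feed the resulting category back in to build longer compounds.
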
1 C.1.1 To Implement
1 ------------------
1 
1 To implement support for compound words based on the above description
1 the following will need to be done:
1 
1   1. expand the affix code to support special compound flags as
1      described in part one
1 
1   2. write code to store the conditions as described in part two
1 
1   3. expand the compound checking code to check against the conditions
1 
1   4. expand the dictionary format to store the necessary compound info
1      with the word
1 
1 
1    I don't know when I will be able to actually implement this.  If you
1 would like to try, please let me know.
1