libidn: PR29 Functions

8 PR29 Functions
****************

A deficiency in the specification of Unicode Normalization Forms has
been found.  The consequence is that some strings can be normalized into
different strings by different implementations.  In other words, two
different implementations may return different output for the same input
(because the interpretation of the specification is ambiguous).
Further, an implementation invoked again on one of the output strings
may return a different string (because one of the interpretations of
the ambiguous specification makes normalization non-idempotent).
Fortunately, only a few character sequences exhibit this problem, and
none of them are expected to occur in natural languages (due to
different linguistic uses of the involved characters).

   A full discussion of the problem may be found at:

   <http://www.unicode.org/review/pr-29.html>

   The PR29 functions below allow you to detect the problem sequences.
So when would you want to use these functions?  For most applications,
such as those using Nameprep for IDN, this is likely only to be an
interoperability problem.  Thus, you may not need to care about it, as
the character sequences will rarely occur naturally.  However, if you
are using a profile such as SASLPrep to process authentication tokens,
authorization tokens, or passwords, there is a real danger that
attackers may try to use the peculiarities in these strings to attack
parts of your system.  As only a small number of strings, and no
naturally occurring strings, exhibit this problem, the conservative
approach of rejecting the strings is recommended.  If this approach is
not used, you should instead verify that all parts of your system that
process the tokens and passwords use an NFKC implementation that
produces the same output for the same input.

   Technically inclined readers may be interested in knowing more about
the implementation aspects of the PR29 flaw.  ⇒PR29 discussion.

8.1 Header file ‘pr29.h’
========================

To use the functions explained in this chapter, you need to include the
file ‘pr29.h’ using:

     #include <pr29.h>

8.2 Core Functions
==================

pr29_4
------

 -- Function: int pr29_4 (const uint32_t * IN, size_t LEN)
     IN: input array with Unicode code points.

     LEN: length of the input array, in code points.

     Check the input to see if it may be normalized into different
     strings by different NFKC implementations, due to an anomaly in the
     NFKC specifications.

     Return value: Returns the ‘Pr29_rc’ value ‘PR29_SUCCESS’ on
     success (no problem sequence was found), and ‘PR29_PROBLEM’ if
     the input sequence is a "problem sequence" (i.e., may be
     normalized into different strings by different implementations).

8.3 Utility Functions
=====================

pr29_4z
-------

 -- Function: int pr29_4z (const uint32_t * IN)
     IN: zero-terminated array of Unicode code points.

     Check the input to see if it may be normalized into different
     strings by different NFKC implementations, due to an anomaly in the
     NFKC specifications.

     Return value: Returns the ‘Pr29_rc’ value ‘PR29_SUCCESS’ on
     success (no problem sequence was found), and ‘PR29_PROBLEM’ if
     the input sequence is a "problem sequence" (i.e., may be
     normalized into different strings by different implementations).

pr29_8z
-------

 -- Function: int pr29_8z (const char * IN)
     IN: zero-terminated input UTF-8 string.

     Check the input to see if it may be normalized into different
     strings by different NFKC implementations, due to an anomaly in the
     NFKC specifications.

     Return value: Returns the ‘Pr29_rc’ value ‘PR29_SUCCESS’ on
     success (no problem sequence was found), ‘PR29_PROBLEM’ if the
     input sequence is a "problem sequence" (i.e., may be normalized
     into different strings by different implementations), or
     ‘PR29_STRINGPREP_ERROR’ if there was a problem converting the
     string from UTF-8 to UCS-4.

8.4 Error Handling
==================

pr29_strerror
-------------

 -- Function: const char * pr29_strerror (Pr29_rc RC)
     RC: a ‘Pr29_rc’ return code.

     Convert a return code integer to a text string.  This string can be
     used to output a diagnostic message to the user.

     *PR29_SUCCESS:* Successful operation.  This value is guaranteed to
     always be zero; the remaining ones are only guaranteed to hold
     non-zero values, for logical comparison purposes.

     *PR29_PROBLEM:* A problem sequence was encountered.

     *PR29_STRINGPREP_ERROR:* The character set conversion failed (only
     for ‘pr29_8z()’).

     Return value: Returns a pointer to a statically allocated string
     containing a description of the error with the return code ‘RC’.