libidn: Java API

12 Java API
***********

Libidn has been ported to the Java programming language, and as a
consequence most of the API is available to native Java applications.
This section contains notes on this support; complete documentation is
pending.

   The Java library, if Libidn has been built with Java support (see
Downloading and Installing), will be placed in ‘java/libidn-1.34.jar’.
The source code is below ‘java/’ in Maven directory layout, and there is
a Maven ‘pom.xml’ build script as well.  Source code files are in
‘java/src/main/java/gnu/inet/encoding/’.

12.1 Overview
=============

This package provides a Java implementation of the Internationalized
Domain Names in Applications (IDNA) standard.  It is written entirely in
Java and does not require any additional libraries to be set up.

   The gnu.inet.encoding.IDNA class offers two public methods, toASCII
and toUnicode, which can be used as follows:

     gnu.inet.encoding.IDNA.toASCII("blöds.züg");
     gnu.inet.encoding.IDNA.toUnicode("xn--blds-6qa.xn--zg-xka");
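   As a rough illustration of what the two calls above are expected to
return, the sketch below performs the same round trip with the JDK's own
java.net.IDN class, so that it runs without the Libidn JAR on the
classpath.  This is an assumption made for the sake of a self-contained
example: java.net.IDN is not part of Libidn, the class name IdnRoundTrip
is made up here, and the Libidn methods may differ in details such as
error handling.  The non-ASCII letters are written as Unicode escapes so
that the result does not depend on the source file encoding.

```java
import java.net.IDN;

public class IdnRoundTrip {
    // The domain from the example above, with its two non-ASCII letters
    // (U+00F6 and U+00FC) written as Unicode escapes.
    public static final String UNICODE_NAME = "bl\u00f6ds.z\u00fcg";

    // Convert a Unicode domain name to its ASCII-compatible encoding.
    // Stand-in for gnu.inet.encoding.IDNA.toASCII (assumption, see text).
    public static String toAscii(String name) {
        return IDN.toASCII(name);
    }

    // Convert an ASCII-compatible encoding back to Unicode.
    // Stand-in for gnu.inet.encoding.IDNA.toUnicode (assumption, see text).
    public static String toUnicode(String name) {
        return IDN.toUnicode(name);
    }

    public static void main(String[] args) {
        String ascii = toAscii(UNICODE_NAME);
        System.out.println(ascii); // xn--blds-6qa.xn--zg-xka
        System.out.println(toUnicode(ascii).equals(UNICODE_NAME)); // true
    }
}
```

   Each label is converted separately, which is why the dot survives
and both labels carry their own ‘xn--’ prefix.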

12.2 Miscellaneous Programs
===========================

The ‘java/src/util/java/’ directory contains several programs that are
related to the Java part of GNU Libidn, but that don’t need to be
included in the main source tree or the JAR file.

12.2.1 GenerateRFC3454
----------------------

This program parses RFC 3454 and creates the RFC3454.java file that is
required during the Stringprep phase.

   The RFC can be found at various locations, for example at
<http://www.ietf.org/rfc/rfc3454.txt>.

   Invoke the program as follows:

     $ java GenerateRFC3454
     Creating RFC3454.java... Ok.

12.2.2 GenerateNFKC
-------------------

The GenerateNFKC program parses the Unicode Character Database files and
generates all the tables required for NFKC.  It requires the two files
UnicodeData.txt and CompositionExclusions.txt from version 3.2 of the
Unicode data files.  Note that RFC 3454 (Stringprep) specifies that
Unicode version 3.2 is to be used, not the latest version.

   The Unicode data files can be found at
<http://www.unicode.org/Public/>.

   Invoke the program as follows:

     $ java GenerateNFKC
     Creating CombiningClass.java... Ok.
     Creating DecompositionKeys.java... Ok.
     Creating DecompositionMappings.java... Ok.
     Creating Composition.java... Ok.
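   To give an idea of the input GenerateNFKC works from, the following
sketch splits one record in the UnicodeData.txt format into the fields
that matter for NFKC: the code point (field 0), the canonical combining
class (field 3) and the decomposition mapping (field 5).  It is not
taken from GenerateNFKC itself; the class UnicodeDataLine and its
members are hypothetical names chosen for this example, and the real
program may organize its parsing differently.

```java
import java.util.ArrayList;
import java.util.List;

public class UnicodeDataLine {
    public final int codePoint;
    public final int combiningClass;
    // True if the mapping starts with a tag such as <compat>.
    public final boolean compatibility;
    // Code points of the decomposition mapping; empty if there is none.
    public final List<Integer> decomposition;

    private UnicodeDataLine(int codePoint, int combiningClass,
                            boolean compatibility, List<Integer> decomposition) {
        this.codePoint = codePoint;
        this.combiningClass = combiningClass;
        this.compatibility = compatibility;
        this.decomposition = decomposition;
    }

    // Parse one semicolon-separated record in the UnicodeData.txt format.
    public static UnicodeDataLine parse(String line) {
        // The limit -1 keeps trailing empty fields.
        String[] fields = line.split(";", -1);
        int codePoint = Integer.parseInt(fields[0], 16);
        int combiningClass = Integer.parseInt(fields[3]);
        boolean compatibility = false;
        List<Integer> decomposition = new ArrayList<>();
        for (String token : fields[5].trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // no decomposition mapping at all
            }
            if (token.startsWith("<")) {
                compatibility = true; // formatting tag, not a code point
                continue;
            }
            decomposition.add(Integer.parseInt(token, 16));
        }
        return new UnicodeDataLine(codePoint, combiningClass,
                                   compatibility, decomposition);
    }

    public static void main(String[] args) {
        UnicodeDataLine ligature = parse(
            "FB01;LATIN SMALL LIGATURE FI;Ll;0;L;<compat> 0066 0069;;;;N;;;;;");
        System.out.println(ligature.compatibility); // true
        System.out.println(ligature.decomposition); // [102, 105]
    }
}
```

   Records with a tagged mapping, like the ligature above, are the ones
that distinguish compatibility normalization (NFKC) from canonical
normalization.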

12.2.3 TestIDNA
---------------

The TestIDNA program lets you test the IDNA implementation manually or
against Simon Josefsson’s test vectors.

   The test vectors can be found at the Libidn homepage,
<http://www.gnu.org/software/libidn/>.

   In the commands below, adjust the classpath to the location of the
Libidn JAR file on your system.

   To test the transformation manually, use:

     $ java -cp .:/usr/share/java/libidn.jar TestIDNA -a <string to test>
     Input: <string to test>
     Output: <toASCII(string to test)>
     $ java -cp .:/usr/share/java/libidn.jar TestIDNA -u <string to test>
     Input: <string to test>
     Output: <toUnicode(string to test)>

   To test against draft-josefsson-idn-test-vectors.html, use:

     $ java -cp .:/usr/share/java/libidn/libidn.jar TestIDNA -t
     No errors detected!

12.2.4 TestNFKC
---------------

The TestNFKC program lets you test the NFKC implementation manually or
against the NormalizationTest.txt file from the Unicode data files.

   To test the normalization manually, use:

     $ java -cp .:/usr/share/java/libidn.jar TestNFKC <string to test>
     Input: <string to test>
     Output: <nfkc version of the string to test>

   To test against NormalizationTest.txt:

     $ java -cp .:/usr/share/java/libidn.jar TestNFKC
     No errors detected!
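   For readers unfamiliar with NFKC, the sketch below shows the kind of
transformation that TestNFKC prints.  It uses the JDK's
java.text.Normalizer rather than the Libidn implementation, so it
follows the Unicode version shipped with the JDK and not necessarily
Unicode 3.2; results for characters added or changed after version 3.2
may therefore differ.  The class name NfkcDemo is made up for the
example.

```java
import java.text.Normalizer;

public class NfkcDemo {
    // Normalize a string to NFKC using the JDK's normalizer
    // (a stand-in for the Libidn NFKC code, see the text above).
    public static String nfkc(String input) {
        return Normalizer.normalize(input, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        // U+FB01 LATIN SMALL LIGATURE FI has a compatibility
        // decomposition to the two letters "fi".
        System.out.println(nfkc("\uFB01")); // fi

        // "e" followed by U+0301 COMBINING ACUTE ACCENT is composed
        // into the single character U+00E9.
        System.out.println(nfkc("e\u0301").equals("\u00E9")); // true
    }
}
```

   The first case is a compatibility mapping and the second a canonical
composition; NFKC applies both.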

12.3 Possible Problems
======================

Beware of Bugs: This Java API needs a lot more testing, especially with
"exotic" character sets.  While it works for me, it may not work for
you.

   Encoding of your Java sources: If you are using non-ASCII characters
in your Java source code, make sure javac compiles your programs with
the correct encoding.  If necessary, specify the encoding using the
-encoding option.
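   One way to sidestep the source encoding question is to write
non-ASCII characters as Unicode escapes, which javac reads the same way
under any ASCII-compatible encoding.  A small sketch, using the first
label of the domain from the Overview example (the class name
EscapedLiteral is made up for illustration):

```java
public class EscapedLiteral {
    // "bl", then U+00F6 written as an escape, then "ds".
    public static final String LABEL = "bl\u00f6ds";

    public static void main(String[] args) {
        // Five characters, regardless of how the file was saved.
        System.out.println(LABEL.length()); // 5
        // The third character is U+00F6 (decimal 246).
        System.out.println((int) LABEL.charAt(2)); // 246
    }
}
```

   If the literal were typed directly and the file saved as UTF-8 but
compiled as Latin-1, the string would typically come out one character
longer, which is the kind of mismatch the -encoding option prevents.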

   Java Unicode handling: Java 1.4 only handles 16-bit Unicode code
points (i.e. characters in the Basic Multilingual Plane).  This
implementation therefore ignores all references to so-called
Supplementary Characters (U+10000 to U+10FFFF).  Java 1.5 and later
also support these characters, but making use of that support will
require changes to this library.  See also the next section.

12.4 A Note on Java and Unicode
===============================

This library uses Java’s built-in ‘char’ datatype.  Up to Java 1.4, this
datatype only supports 16-bit Unicode code points, that is, characters
in the Basic Multilingual Plane.  For this reason, this library doesn’t
work for Supplementary Characters (i.e. characters from U+10000 to
U+10FFFF).  All references to such characters are silently ignored.

   Java 1.5 and later also support Supplementary Characters.  However,
making use of this support will require changes in the present version
of the library.

   For more information refer to the documentation of
java.lang.Character in the JDK API.
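   The difference between ‘char’ values and code points can be seen
with the code point methods that java.lang.Character and
java.lang.String gained in Java 1.5.  The sketch below is independent of
Libidn; SupplementaryDemo and its helper method are names invented for
the example.

```java
public class SupplementaryDemo {
    // U+10400 lies outside the Basic Multilingual Plane.
    public static final int CODE_POINT = 0x10400;

    // Build a String holding exactly one code point.
    public static String asString(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String s = asString(CODE_POINT);
        // One code point, but two char values (a surrogate pair).
        System.out.println(s.length()); // 2
        System.out.println(s.codePointCount(0, s.length())); // 1
        System.out.println(Character.isHighSurrogate(s.charAt(0))); // true
        System.out.println(Character.isLowSurrogate(s.charAt(1))); // true
        // Reading by code point recovers the original value.
        System.out.println(s.codePointAt(0) == CODE_POINT); // true
    }
}
```

   Code that walks a string one ‘char’ at a time sees the two
surrogates as separate values, which is why char-based processing cannot
handle Supplementary Characters without changes.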