
java.text Documentation Differences

This file contains all the changes in documentation in the package java.text as colored differences. Deletions are shown like this, and additions are shown like this.
If no deletions or additions are shown in an entry, the HTML tags will be what has changed. The new HTML tags are shown in the differences. If no documentation existed, and then some was added in a later version, this change is noted in the appropriate class pages of differences, but the change is not shown on this page. Only changes in existing text are shown here. Similarly, documentation which was inherited from another class or interface is not shown here.
Note that an HTML error in the new documentation may cause the display of other documentation changes to be presented incorrectly. For instance, failure to close a <code> tag will cause all subsequent paragraphs to be displayed differently.

Class Bidi, constructor Bidi(AttributedCharacterIterator)

Create Bidi from the given paragraph of text.

The RUN_DIRECTION attribute in the text, if present, determines the base direction (left-to-right or right-to-left). If not present, the base direction is computed using the Unicode Bidirectional Algorithm, defaulting to left-to-right if there are no strong directional characters in the text. This attribute, if present, must be applied to all the text in the paragraph.

The BIDI_EMBEDDING attribute in the text, if present, represents embedding level information. Negative values from -1 to -62 indicate overrides at the absolute value of the level. Positive values from 1 to 62 indicate embeddings. Where values are zero or not defined, the base embedding level as determined by the base direction is assumed.

The NUMERIC_SHAPING attribute in the text, if present, converts European digits to other decimal digits before running the bidi algorithm. This attribute, if present, must be applied to all the text in the paragraph. @param paragraph a paragraph of text with optional character and paragraph attribute information @see TextAttribute#BIDI_EMBEDDING @see TextAttribute#NUMERIC_SHAPING @see TextAttribute#RUN_DIRECTION
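As a minimal sketch of the constructor described above, the following builds a Bidi from an AttributedString whose RUN_DIRECTION attribute covers the whole paragraph. The class and method names are hypothetical and chosen only for illustration; the attribute constants come from java.awt.font.TextAttribute.

```java
import java.awt.font.TextAttribute;
import java.text.AttributedString;
import java.text.Bidi;

// Hypothetical illustration class; not part of java.text.
public class BidiRunDirectionSketch {

    // RUN_DIRECTION is applied to the entire (non-empty) paragraph,
    // so it determines the base direction.
    static Bidi rightToLeftParagraph(String text) {
        AttributedString paragraph = new AttributedString(text);
        paragraph.addAttribute(TextAttribute.RUN_DIRECTION,
                               TextAttribute.RUN_DIRECTION_RTL);
        return new Bidi(paragraph.getIterator());
    }

    // No RUN_DIRECTION attribute: the base direction is computed from the text.
    static Bidi defaultParagraph(String text) {
        return new Bidi(new AttributedString(text).getIterator());
    }

    public static void main(String[] args) {
        // Latin text, but the attribute forces a right-to-left base direction.
        System.out.println(rightToLeftParagraph("abc").baseIsLeftToRight());
        // Digits only (no strong directional characters): defaults to left-to-right.
        System.out.println(defaultParagraph("123").baseIsLeftToRight());
    }
}
```

The second call has no strong directional characters, so it falls back to the left-to-right default mentioned in the description.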

Class DecimalFormatSymbols

This class represents the set of symbols (such as the decimal separator, the grouping separator, and so on) needed by DecimalFormat to format numbers. DecimalFormat creates for itself an instance of DecimalFormatSymbols from its locale data. If you need to change any of these symbols, you can get the DecimalFormatSymbols object from your DecimalFormat and modify it. @see java.util.Locale @see DecimalFormat @version 1.35 1237 01/0316/0102 @author Mark Davis @author Alan Liu
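The get-and-modify workflow described above might look like the following sketch. The class name, method name, and pattern are assumptions made for illustration. Note that getDecimalFormatSymbols() returns a copy, so the modified symbols have to be set back on the format.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical illustration class; not part of java.text.
public class DecimalFormatSymbolsSketch {

    static String formatWithSwappedSeparators(double value) {
        DecimalFormat format =
            new DecimalFormat("#,##0.00", new DecimalFormatSymbols(Locale.US));
        // Get a copy of the symbols, change them, and hand them back.
        DecimalFormatSymbols symbols = format.getDecimalFormatSymbols();
        symbols.setDecimalSeparator(',');
        symbols.setGroupingSeparator('.');
        format.setDecimalFormatSymbols(symbols);
        return format.format(value);
    }

    public static void main(String[] args) {
        System.out.println(formatWithSwappedSeparators(1234567.5)); // 1.234.567,50
    }
}
```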

Class RuleBasedCollator

The RuleBasedCollator class is a concrete subclass of Collator that provides a simple data-driven table collator. With this class you can create a customized table-based Collator. RuleBasedCollator maps characters to sort keys.

RuleBasedCollator has the following restrictions for efficiency (other subclasses may be used for more complex languages):

  1. If a special collation rule controlled by a <modifier> is specified, it applies to the whole collator object.
  2. All non-mentioned Unicode characters are at the end of the collation order.
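The second restriction can be observed with a small sketch such as the one below. The class name, helper method, and rule string are hypothetical: the rule mentions only "c", "b", and "a" (in that order), so an unmentioned character such as "z" is expected to sort after all of them.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;

// Hypothetical illustration class; not part of java.text.
public class UnmentionedCharactersSketch {

    // Builds a collator from a rule string, wrapping the checked exception.
    static RuleBasedCollator collatorFor(String rules) {
        try {
            return new RuleBasedCollator(rules);
        } catch (ParseException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        RuleBasedCollator reversed = collatorFor("< c < b < a");
        // The rule places "c" before "a".
        System.out.println(reversed.compare("c", "a") < 0);
        // "z" is not mentioned, so it falls at the end of the collation order.
        System.out.println(reversed.compare("a", "z") < 0);
    }
}
```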

The collation table is composed of a list of collation rules, where each rule is of one of three forms:

 <modifier>
 <relation> <text-argument>
 <reset> <text-argument>
The definitions of the rule elements are as follows:

This sounds more complicated than it is in practice. For example, the following are equivalent ways of expressing the same thing:

 a < b < c
 a < b & b < c
 a < c & a < b
Notice that the order is important, as the subsequent item goes immediately after the text-argument. The following are not equivalent:
 a < b & a < c
 a < c & a < b
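A minimal sketch of the non-equivalent pair above, with each fragment prefixed by a leading "<" so that it forms a complete rule string. The class and helper names are hypothetical.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;

// Hypothetical illustration class; not part of java.text.
public class RuleOrderSketch {

    // Builds a collator from a rule string, wrapping the checked exception.
    static RuleBasedCollator collatorFor(String rules) {
        try {
            return new RuleBasedCollator(rules);
        } catch (ParseException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // "c" is inserted immediately after "a", ahead of "b": a, c, b.
        RuleBasedCollator first = collatorFor("< a < b & a < c");
        // "b" is inserted immediately after "a", ahead of "c": a, b, c.
        RuleBasedCollator second = collatorFor("< a < c & a < b");

        System.out.println(first.compare("c", "b") < 0);
        System.out.println(second.compare("b", "c") < 0);
    }
}
```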
Either the text-argument must already be present in the sequence, or some initial substring of the text-argument must be present. For example, "a < b & ae < e" is valid, since "a" is present in the sequence before "ae" is reset. In this latter case, "ae" is not entered and treated as a single character; instead, "e" is sorted as if it were expanded to two characters: "a" followed by an "e". This difference appears in natural languages: in traditional Spanish, "ch" is treated as though it contracts to a single character (expressed as "c < ch < d"), while in traditional German, a-umlaut is treated as though it expanded to two characters (expressed as "a, A < b, B ... &ae;\u00e4&AE;\u00c4"). [\u00e4 and \u00c4 are, of course, the escape sequences for a-umlaut.]
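The Spanish-style contraction can be sketched as follows. The class name, helper method, and the short alphabet in the rule string are assumptions made for illustration; only the "c < ch < d" fragment comes from the text above.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;

// Hypothetical illustration class; not part of java.text.
public class ContractionSketch {

    // Builds a collator from a rule string, wrapping the checked exception.
    static RuleBasedCollator collatorFor(String rules) {
        try {
            return new RuleBasedCollator(rules);
        } catch (ParseException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // "ch" is entered as a single collation unit between "c" and "d".
        RuleBasedCollator spanishLike = collatorFor("< a < b < c < ch < d < z");

        // Character by character, "ch" would precede "cz"; as a contraction,
        // "ch" sorts after every string that starts with a plain "c".
        System.out.println(spanishLike.compare("cz", "ch") < 0);
        // The contraction still sorts before "d".
        System.out.println(spanishLike.compare("ch", "d") < 0);
    }
}
```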

Ignorable Characters

For ignorable characters, the first rule must start with a relation (the examples we have used above are really fragments; "a < b" really should be "< a < b"). If, however, the first relation is not "<", then all the text-arguments up to the first "<" are ignorable. For example, ", - < a < b" makes "-" an ignorable character, as we saw earlier in the word "black-birds". In the samples for different languages, you see that most accents are ignorable.
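A rough sketch of an ignorable character in practice. The class and method names are hypothetical. The hyphen is written in single quotes here on the assumption that rule syntax characters must be quoted inside an actual rule string.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

// Hypothetical illustration class; not part of java.text.
public class IgnorableCharactersSketch {

    // "-" appears before the first "<", so it is ignorable at the primary level.
    static final String RULES = ",'-' < a < b";

    static RuleBasedCollator collatorWithStrength(int strength) {
        try {
            RuleBasedCollator collator = new RuleBasedCollator(RULES);
            collator.setStrength(strength);
            return collator;
        } catch (ParseException e) {
            throw new IllegalArgumentException(e);
        }
    }

    static int primaryCompare(String left, String right) {
        return collatorWithStrength(Collator.PRIMARY).compare(left, right);
    }

    static int tertiaryCompare(String left, String right) {
        return collatorWithStrength(Collator.TERTIARY).compare(left, right);
    }

    public static void main(String[] args) {
        // At primary strength the hyphen is skipped entirely.
        System.out.println(primaryCompare("a-b", "ab") == 0);
        // At tertiary strength the hyphen still makes the strings differ.
        System.out.println(tertiaryCompare("a-b", "ab") != 0);
    }
}
```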

Normalization and Accents

RuleBasedCollator automatically processes its rule table to include both pre-composed and combining-character versions of accented characters. Even if the provided rule string contains only base characters and separate combining accent characters, the pre-composed accented characters matching all canonical combinations of characters from the rule string will be entered in the table.

This allows you to use a RuleBasedCollator to compare accented strings even when the collator is set to NO_DECOMPOSITION. There are two caveats, however. First, if the strings to be collated contain combining sequences that may not be in canonical order, you should set the collator to CANONICAL_DECOMPOSITION or FULL_DECOMPOSITION to enable sorting of combining sequences. Second, if the strings contain characters with compatibility decompositions (such as full-width and half-width forms), you must use FULL_DECOMPOSITION, since the rule tables only include canonical mappings. (For more information, see The Unicode Standard, Version 2.0.)
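The first caveat might be handled as in the sketch below, which switches a locale collator to CANONICAL_DECOMPOSITION before comparing strings whose combining marks may be out of canonical order. The class and method names are hypothetical, and the choice of Locale.US is only an assumption for the example.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical illustration class; not part of java.text.
public class DecompositionSketch {

    static Collator canonicalCollator() {
        Collator collator = Collator.getInstance(Locale.US);
        collator.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
        return collator;
    }

    public static void main(String[] args) {
        Collator collator = canonicalCollator();

        // Pre-composed e-acute versus "e" followed by a combining acute accent.
        System.out.println(collator.compare("\u00e9", "e\u0301") == 0);

        // The same two combining marks in different orders: acute then dot-below
        // versus dot-below then acute. Canonical decomposition reorders them.
        System.out.println(collator.compare("a\u0301\u0323", "a\u0323\u0301") == 0);
    }
}
```

For text that contains compatibility characters such as full-width forms, the same setDecomposition call would take Collator.FULL_DECOMPOSITION instead.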


The following are errors:

If you produce one of these errors, a RuleBasedCollator throws a ParseException.
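A minimal sketch of catching the exception. The class and method names are hypothetical, and the malformed rule assumes that unquoted punctuation inside a text-argument (here the hyphen in "b-c") is one of the errors meant above.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;

// Hypothetical illustration class; not part of java.text.
public class RuleErrorSketch {

    // Returns true if the rule string is rejected with a ParseException.
    static boolean isRejected(String rules) {
        try {
            new RuleBasedCollator(rules);
            return false;
        } catch (ParseException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Assumed error case: unquoted "-" inside a text-argument.
        System.out.println(isRejected("< a < b-c < d"));
        // A well-formed rule string is accepted.
        System.out.println(isRejected("< a < b < c < d"));
    }
}
```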


Simple: "< a < b < c < d"

Norwegian: "< a, A< b, B< c, C< d, D< e, E< f, F< g, G< h, H< i, I< j, J < k, K< l, L< m, M< n, N< o, O< p, P< q, Q< r, R< s, S< t, T < u, U< v, V< w, W< x, X< y, Y< z, Z < \u00E5=a\u030A, \u00C5=A\u030A ;aa, AA< \u00E6, \u00C6< \u00F8, \u00D8"

Normally, to create a rule-based Collator object, you will use Collator's factory method getInstance. However, to create a rule-based Collator object with specialized rules tailored to your needs, you construct the RuleBasedCollator with the rules contained in a String object. For example:

 String Simple = "< a< b< c< d";
 RuleBasedCollator mySimple = new RuleBasedCollator(Simple);

 String Norwegian = "< a, A< b, B< c, C< d, D< e, E< f, F< g, G< h, H< i, I< j, J"
                  + "< k, K< l, L< m, M< n, N< o, O< p, P< q, Q< r, R< s, S< t, T"
                  + "< u, U< v, V< w, W< x, X< y, Y< z, Z"
                  + "< \u00E5=a\u030A, \u00C5=A\u030A"
                  + ";aa, AA< \u00E6, \u00C6< \u00F8, \u00D8";
 RuleBasedCollator myNorwegian = new RuleBasedCollator(Norwegian);

Combining Collators is as simple as concatenating strings. Here's an example that combines two Collators from two different locales:

 // Create an en_US Collator object
 RuleBasedCollator en_USCollator = (RuleBasedCollator)
     Collator.getInstance(new Locale("en", "US", ""));
 // Create a da_DK Collator object
 RuleBasedCollator da_DKCollator = (RuleBasedCollator)
     Collator.getInstance(new Locale("da", "DK", ""));
 // Combine the two
 // First, get the collation rules from en_USCollator
 String en_USRules = en_USCollator.getRules();
 // Second, get the collation rules from da_DKCollator
 String da_DKRules = da_DKCollator.getRules();
 RuleBasedCollator newCollator =
     new RuleBasedCollator(en_USRules + da_DKRules);
 // newCollator has the combined rules

Another, more interesting example would be to make changes to an existing table to create a new Collator object. For example, add "&C< ch, cH, Ch, CH" to the en_USCollator object to create your own:

 // Create a new Collator object with additional rules
 String addRules = "&C< ch, cH, Ch, CH";
 // Append the new rules to the existing rule string (not to the collator object itself)
 RuleBasedCollator myCollator =
     new RuleBasedCollator(en_USCollator.getRules() + addRules);
 // myCollator contains the new rules

The following example demonstrates how to change the order of non-spacing accents:

 // old rule
 String oldRules = "=\u0301;\u0300;\u0302;\u0308"    // main accents
                 + ";\u0327;\u0303;\u0304;\u0305"    // main accents
                 + ";\u0306;\u0307;\u0309;\u030A"    // main accents
                 + ";\u030B;\u030C;\u030D;\u030E"    // main accents
                 + ";\u030F;\u0310;\u0311;\u0312"    // main accents
                 + "< a, A ; ae, AE ; \u00e6, \u00c6"
                 + "< b, B < c, C < e, E & C < d, D";
 // change the order of accent characters
 String addOn = "& \u0300 ; \u0308 ; \u0302";
 RuleBasedCollator myCollator = new RuleBasedCollator(oldRules + addOn);

The last example shows how to put new primary ordering in before the default setting. For example, in a Japanese Collator, you can sort English characters either before or after Japanese characters:

 // get en_US Collator rules
 RuleBasedCollator en_USCollator = (RuleBasedCollator) Collator.getInstance(Locale.US);
 // add a few Japanese characters to sort before English characters
 // suppose the last character before the first base letter 'a' in
 // the English collation rule is \u2212
 String jaString = "& \u2212 < \u3041, \u3042 < \u3043, \u3044";
 RuleBasedCollator myJapaneseCollator =
     new RuleBasedCollator(en_USCollator.getRules() + jaString);
@see Collator @see CollationElementIterator @version 1.25 07/24/98 @author Helena Shih, Laura Werner, Richard Gillam