Usage and Code Examples of the java.text.Collator Class

Reposted by x33g5p2x on 2022-01-18 in Other

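This article collects a number of code examples for the java.text.Collator class in Java and shows how the class is used in practice. The examples come mainly from GitHub, Stack Overflow, Maven and similar platforms and were extracted from selected projects, so they should serve as useful references. Details of the Collator class are as follows:
Package path: java.text.Collator
Class name: Collator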

Introduction to Collator

Performs locale-sensitive string comparison. A concrete subclass, RuleBasedCollator, allows customization of the collation ordering by the use of rule sets.

Following the Unicode Consortium's specifications for the Unicode Collation Algorithm (UCA), there are 4 different levels of strength used in comparisons:

  • PRIMARY strength: Typically, this is used to denote differences between base characters (for example, "a" < "b"). It is the strongest difference. For example, dictionaries are divided into different sections by base character.
  • SECONDARY strength: Accents in the characters are considered secondary differences (for example, "as" < "às" < "at"). Other differences between letters can also be considered secondary differences, depending on the language. A secondary difference is ignored when there is a primary difference anywhere in the strings.
  • TERTIARY strength: Upper and lower case differences in characters are distinguished at tertiary strength (for example, "ao" < "Ao" < "aò"). In addition, a variant of a letter differs from the base form on the tertiary strength (such as "A" and "Ⓐ"). Another example is the difference between large and small Kana. A tertiary difference is ignored when there is a primary or secondary difference anywhere in the strings.
  • IDENTICAL strength: When all other strengths are equal, the IDENTICAL strength is used as a tiebreaker. The Unicode code point values of the NFD form of each string are compared when no other difference has been found. For example, Hebrew cantillation marks are only distinguished at this strength. This strength should be used sparingly, because strings that differ only in their code point values are extremely rare. Using this strength substantially decreases the performance of both the comparison and the collation key generation APIs. It also increases the size of the collation key.
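The strength levels listed above can be checked with a small program. This is a minimal sketch, assuming the US English collator; the class name StrengthDemo and the helper equalAt are hypothetical names introduced only for illustration.

```java
import java.text.Collator;
import java.util.Locale;

public class StrengthDemo {
    /** Returns true if the two strings compare as equal at the given strength. */
    static boolean equalAt(int strength, String a, String b) {
        Collator collator = Collator.getInstance(Locale.US);
        collator.setStrength(strength);
        return collator.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        // "abc" and "ABC" differ only in case, which is a tertiary difference.
        System.out.println(equalAt(Collator.PRIMARY, "abc", "ABC"));    // true
        System.out.println(equalAt(Collator.SECONDARY, "abc", "ABC"));  // true
        System.out.println(equalAt(Collator.TERTIARY, "abc", "ABC"));   // false

        // "as" and "às" differ only in an accent, which is a secondary difference.
        System.out.println(equalAt(Collator.PRIMARY, "as", "\u00e0s"));    // true
        System.out.println(equalAt(Collator.SECONDARY, "as", "\u00e0s"));  // false
    }
}
```

Each level also takes the stronger levels into account, so a collator set to TERTIARY still reports primary and secondary differences.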

This Collator deals only with two decomposition modes, the canonical decomposition mode and one that does not use any decomposition. The compatibility decomposition mode java.text.Collator.FULL_DECOMPOSITION is not supported here. If the canonical decomposition mode is set, Collator handles un-normalized text properly, producing the same results as if the text were normalized in NFD. If canonical decomposition is turned off, it is the user's responsibility to ensure that all text is already in the appropriate form before performing a comparison or before getting a CollationKey.

Examples:

// Get the Collator for US English and set its strength to PRIMARY
Collator usCollator = Collator.getInstance(Locale.US);
usCollator.setStrength(Collator.PRIMARY);
if (usCollator.compare("abc", "ABC") == 0) {
    System.out.println("Strings are equivalent");
}

The following example shows how to compare two strings using the collator for the default locale.

// Compare two strings in the default locale
Collator myCollator = Collator.getInstance();
myCollator.setDecomposition(Collator.NO_DECOMPOSITION);
if (myCollator.compare("\u00e0\u0325", "a\u0325\u0300") != 0) {
    System.out.println("\u00e0\u0325 is not equal to a\u0325\u0300 without decomposition");
    myCollator.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
    if (myCollator.compare("\u00e0\u0325", "a\u0325\u0300") != 0) {
        System.out.println("Error: \u00e0\u0325 should be equal to a\u0325\u0300 with decomposition");
    } else {
        System.out.println("\u00e0\u0325 is equal to a\u0325\u0300 with decomposition");
    }
} else {
    System.out.println("Error: \u00e0\u0325 should be not equal to a\u0325\u0300 without decomposition");
}
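When canonical decomposition is turned off, the caller has to bring the text into a consistent form first. One possible way to do that, sketched here under the assumption that NFD is the appropriate form, is to run both strings through java.text.Normalizer before comparing. The class name NormalizeBeforeCompare and the helper compareNormalized are hypothetical.

```java
import java.text.Collator;
import java.text.Normalizer;

public class NormalizeBeforeCompare {
    /** Normalizes both inputs to NFD, then compares them with the given collator. */
    static int compareNormalized(Collator collator, String a, String b) {
        String na = Normalizer.normalize(a, Normalizer.Form.NFD);
        String nb = Normalizer.normalize(b, Normalizer.Form.NFD);
        return collator.compare(na, nb);
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance();
        collator.setDecomposition(Collator.NO_DECOMPOSITION);

        // After NFD normalization both inputs are the same character sequence
        // (a, U+0325, U+0300), so they compare as equal even without decomposition.
        int result = compareNormalized(collator, "\u00e0\u0325", "a\u0325\u0300");
        System.out.println(result == 0); // true
    }
}
```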

Code examples

Code example source: robovm/robovm

/**
 * Returns a {@code Collator} instance which is appropriate for the user's default
 * {@code Locale}.
 * See "<a href="../util/Locale.html#default_locale">Be wary of the default locale</a>".
 */
public static Collator getInstance() {
  return getInstance(Locale.getDefault());
}

Code example source: alibaba/druid

private int compare(String o1, String o2) {
  return Collator.getInstance().compare(o1, o2);
}

Code example source: osmandapp/Osmand

public static net.osmand.Collator primaryCollator() {
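  // The Romanian collator (and, per the check below, Czech and Slovak) treats letters with
  // diacritics as different symbols, so fall back to the US collator for these languages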
  final java.text.Collator instance = Locale.getDefault().getLanguage().equals("ro")  ||
      Locale.getDefault().getLanguage().equals("cs") ||
      Locale.getDefault().getLanguage().equals("sk")? java.text.Collator.getInstance(Locale.US)
      : java.text.Collator.getInstance();
  instance.setStrength(java.text.Collator.PRIMARY);
  return wrapCollator(instance);
}

Code example source: stackoverflow.com

Collator usCollator = Collator.getInstance(Locale.US);
usCollator.setStrength(Collator.PRIMARY); // ignores case (and accent) differences

Collections.sort(strings, usCollator);
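For comparison with plain String ordering, here is a self-contained sketch of the same idea. The class name CollatorSortDemo, the helper sortWithCollator and the sample words are assumptions made for illustration.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class CollatorSortDemo {
    /** Returns a sorted copy of the input, ordered by a PRIMARY-strength collator. */
    static List<String> sortWithCollator(List<String> input, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(Collator.PRIMARY);
        List<String> copy = new ArrayList<>(input);
        // Collator implements Comparator<Object>, so it can be passed directly.
        copy.sort(collator);
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("banana", "apple", "Cherry");

        // Natural String order compares UTF-16 code units, so uppercase sorts first.
        List<String> natural = new ArrayList<>(words);
        Collections.sort(natural);
        System.out.println(natural); // [Cherry, apple, banana]

        System.out.println(sortWithCollator(words, Locale.US)); // [apple, banana, Cherry]
    }
}
```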

Code example source: pentaho/pentaho-kettle

int getDefaultCollationStrength( Locale aLocale ) {
 int defaultStrength = Collator.IDENTICAL;
 if ( aLocale != null ) {
  Collator curDefCollator = Collator.getInstance( aLocale );
  if ( curDefCollator != null ) {
   defaultStrength = curDefCollator.getStrength();
  }
 }
 return defaultStrength;
}

Code example source: robovm/robovm

m_locale = new Locale(langValue.toLowerCase(), 
       Locale.getDefault().getCountry());
m_col = Collator.getInstance(m_locale);
                new Object[]{ langValue });  //"Could not find Collator for <sort xml:lang="+langValue);
 m_col = Collator.getInstance();

Code example source: xalan/xalan

public Collator getCollator(String lang, String country) {
  return Collator.getInstance(new Locale(lang, country));
}
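The Locale constructors used in snippets like this one are deprecated since Java 19. A possible alternative, sketched here, is to build the Locale from an IETF language tag with Locale.forLanguageTag, which has been available since Java 7. The class name CollatorLookup and the helpers forTag and isListed are hypothetical; which locales Collator.getAvailableLocales() reports depends on the runtime.

```java
import java.text.Collator;
import java.util.Locale;

public class CollatorLookup {
    /** Builds a Collator from an IETF BCP 47 language tag such as "cs-CZ". */
    static Collator forTag(String languageTag) {
        Locale locale = Locale.forLanguageTag(languageTag);
        return Collator.getInstance(locale);
    }

    /** Reports whether the runtime lists the locale among its available collator locales. */
    static boolean isListed(Locale locale) {
        for (Locale available : Collator.getAvailableLocales()) {
            if (available.equals(locale)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Collator czech = forTag("cs-CZ");
        System.out.println(czech.compare("a", "b") < 0); // true
        // The result of this check varies between runtimes, so no output is shown.
        System.out.println(isListed(Locale.forLanguageTag("cs-CZ")));
    }
}
```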

Code example source: pchmn/MaterialChipsInput

mContext = context;
mRecyclerView = recyclerView;
mCollator = Collator.getInstance(Locale.getDefault());
mCollator.setStrength(Collator.PRIMARY);
mComparator = new Comparator<ChipInterface>() {
  @Override

Code example source: tylersuehr7/chips-input-layout

@Override
  public int compare(Chip c1, Chip c2) {
    if (sCollator == null) {
      sCollator = Collator.getInstance(Locale.getDefault());
    }
    return sCollator.compare(c1.getTitle(), c2.getTitle());
  }
};

Code example source: stackoverflow.com

public int sort(Object ent1, Object ent2) {
  String s1 = (String) ent1;
  String s2 = (String) ent2;

  Collator collator = Collator.getInstance(new Locale("cs"));  //Your locale here
  collator.setStrength(Collator.IDENTICAL);
  return collator.compare(s1, s2);
}

Code example source: com.h2database/h2

Locale locale = new Locale(StringUtils.toLowerEnglish(name), "");
if (compareLocaleNames(locale, name)) {
  result = Collator.getInstance(locale);
  String language = StringUtils.toLowerEnglish(name.substring(0, idx));
  String country = name.substring(idx + 1);
  Locale locale = new Locale(language, country);
  if (compareLocaleNames(locale, name)) {
    result = Collator.getInstance(locale);
for (Locale locale : Collator.getAvailableLocales()) {
  if (compareLocaleNames(locale, name)) {
    result = Collator.getInstance(locale);
    break;

Code example source: stackoverflow.com

List<String> list = ...;

// create collator for arabic
Collator collator = Collator.getInstance(new Locale("ar"));
collator.setDecomposition(Collator.FULL_DECOMPOSITION);
collator.setStrength(Collator.SECONDARY); // ignores lower/upper case

// sort list
Collections.sort(list, collator);
// or use it as any other comparator

Code example source: stackoverflow.com

public static void main( String[] args ) {
  String withVowels = "בַּיִת";
  String withoutVowels = "בית";

  String withVowelsTwo = "הַבַּיְתָה";
  String withoutVowelsTwo = "הביתה";

  System.out.println( "These two strings are " + (withVowels.equals( withoutVowels ) ? "" : "not ") + "equal" );
  System.out.println( "The second two strings are " + (withVowelsTwo.equals( withoutVowelsTwo ) ? "" : "not ") + "equal" );

  Collator collator = Collator.getInstance( new Locale( "he" ) );
  collator.setStrength( Collator.PRIMARY );

  System.out.println( collator.equals( withVowels, withoutVowels ) );
  System.out.println( collator.equals( withVowelsTwo, withoutVowelsTwo ) );
}

Code example source: stackoverflow.com

public int compare(String arg1, String arg2) {
  Collator usCollator = Collator.getInstance(Locale.US); //Your locale here
  usCollator.setStrength(Collator.PRIMARY);
  return usCollator.compare(arg1, arg2);
}

Code example source: com.github.marschall/memoryfilesystem

static Collator caseInsensitiveCollator(Locale locale) {
 CollatorCache cache = INSENSITIVE_COLLATOR.get();
 if (cache == null || !cache.locale.equals(locale)) {
  Collator collator = Collator.getInstance(locale);
  collator.setDecomposition(Collator.NO_DECOMPOSITION);
  collator.setStrength(Collator.SECONDARY);
  INSENSITIVE_COLLATOR.set(new CollatorCache(locale, collator));
  return collator;
 } else {
  return cache.collator;
 }
}
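The snippet above keeps one configured collator per thread instead of creating a new one on every call. A simplified sketch in the same spirit, assuming a single fixed locale, can use ThreadLocal.withInitial. The class name PerThreadCollator, the field CASE_INSENSITIVE and the choice of Locale.US are assumptions, not part of the original project.

```java
import java.text.Collator;
import java.util.Locale;

public class PerThreadCollator {
    // One lazily created collator per thread; SECONDARY strength ignores case.
    private static final ThreadLocal<Collator> CASE_INSENSITIVE =
        ThreadLocal.withInitial(() -> {
            Collator collator = Collator.getInstance(Locale.US);
            collator.setStrength(Collator.SECONDARY);
            return collator;
        });

    /** Returns the calling thread's cached case-insensitive collator. */
    static Collator get() {
        return CASE_INSENSITIVE.get();
    }

    public static void main(String[] args) {
        // Repeated calls on the same thread return the same instance.
        System.out.println(get() == get()); // true
        System.out.println(get().compare("File.TXT", "file.txt") == 0); // true
    }
}
```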

Code example source: pentaho/pentaho-kettle

/**
 * Sets the collator locale.
 */
@Override
public void setCollatorLocale( Locale locale ) {
 // Update the collator only if required
 if ( collatorLocale == null || !collatorLocale.equals( locale ) ) {
  this.collatorLocale = locale;
  this.collator = Collator.getInstance( locale );
 }
}

Code example source: looly/hutool

@Override
public int compare(String o1, String o2) {
  return collator.compare(o1, o2);
}

Code example source: jenkinsci/jenkins

FileComparator(Locale locale) {
  this.collator = Collator.getInstance(locale);
}

Code example source: opentoutatice-ecm.platform/opentoutatice-ecm-platform-automation

public VocabularyEntryComparator() {
  this.collator = Collator.getInstance();
  this.collator.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
  this.collator.setStrength(Collator.TERTIARY);   
}

Code example source: stackoverflow.com

Collator c = Collator.getInstance();
c.setStrength(Collator.PRIMARY);
Map<CollationKey, String> dictionary = new TreeMap<CollationKey, String>();
dictionary.put(c.getCollationKey("Björn"), "Björn");
...
CollationKey query = c.getCollationKey("bjorn");
System.out.println(dictionary.get(query)); // --> "Björn"
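Collation keys are also useful when the same strings are compared many times, for example while sorting a long list, because each key is computed once and keys are then compared directly. The following is a minimal sketch of that approach; the class name CollationKeySort, the helper sortByKey and the sample names are hypothetical.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class CollationKeySort {
    /** Sorts the input by precomputed collation keys and returns the sorted strings. */
    static List<String> sortByKey(List<String> input, Collator collator) {
        List<CollationKey> keys = new ArrayList<>();
        for (String s : input) {
            keys.add(collator.getCollationKey(s)); // computed once per string
        }
        Collections.sort(keys); // CollationKey implements Comparable<CollationKey>

        List<String> sorted = new ArrayList<>();
        for (CollationKey key : keys) {
            sorted.add(key.getSourceString());
        }
        return sorted;
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.US);
        List<String> names = Arrays.asList("Zoe", "eve", "adam", "\u00c9mile");
        System.out.println(sortByKey(names, collator)); // [adam, Émile, eve, Zoe]
    }
}
```

Keys produced by different collators are not comparable with one another, so all keys in one sort should come from the same Collator instance.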
