Common Locale Data Repository

The Common Locale Data Repository (CLDR) is a project of the Unicode Consortium to provide locale data in XML format for use in computer applications. CLDR contains locale-specific information that an operating system will typically provide to applications. CLDR is written in the Locale Data Markup Language (LDML).
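
As a rough illustration of what LDML data looks like, the following sketch parses a small hand-written fragment with Python's standard library. The fragment is abbreviated and written for this example (it is not copied from a CLDR release), and the helper name display_names is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written, heavily abbreviated fragment in the shape of an LDML file
# for German ("de"). Real CLDR files carry far more data.
LDML_SAMPLE = """\
<ldml>
  <identity>
    <language type="de"/>
  </identity>
  <localeDisplayNames>
    <languages>
      <language type="en">Englisch</language>
      <language type="fr">Französisch</language>
    </languages>
    <territories>
      <territory type="JP">Japan</territory>
    </territories>
  </localeDisplayNames>
</ldml>
"""


def display_names(ldml_text, group, item):
    """Return a {code: translated name} mapping for one display-name group.

    `group` and `item` are element names, e.g. "languages" and "language".
    """
    root = ET.fromstring(ldml_text)
    path = f"localeDisplayNames/{group}/{item}"
    return {el.get("type"): el.text for el in root.findall(path)}


languages = display_names(LDML_SAMPLE, "languages", "language")
territories = display_names(LDML_SAMPLE, "territories", "territory")
print(languages)
print(territories)
```

In practice, applications rarely read these files directly; they usually go through a library that packages the data.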

CLDR is maintained by a technical committee which includes employees from IBM, Apple, Google, Microsoft, and some government-based organizations. The committee is chaired by John Emmons, of IBM; Mark Davis, of Google, is vice-chair.

Details

Among the types of data that CLDR includes are the following:

  • Translations for language names
  • Translations for territory and country names
  • Translations for currency names, including singular and plural forms
  • Translations for weekdays, months, eras, and periods of day, in full and abbreviated forms
  • Translations for time zones and example cities (or similar) for time zones
  • Translations for calendar fields
  • Patterns for formatting/parsing dates or times of day
  • Exemplar sets of characters used for writing the language
  • Patterns for formatting/parsing numbers
  • Rules for language-adapted collation
  • Rules for spelling out numbers as words
  • Rules for formatting numbers in traditional numeral systems (such as Roman and Armenian numerals)
  • Rules for transliteration between scripts, much of it based on BGN/PCGN romanization
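
To make one of these data types concrete, the sketch below shows how an application might choose between the singular and plural forms of a currency name. It is a minimal illustration under stated assumptions: the CURRENCY_NAMES table is hand-written for this example rather than taken from CLDR, the function names are hypothetical, and the plural rule is a simplification that only handles whole numbers in English.

```python
# Hypothetical, hand-written data keyed by plural category ("one", "other"),
# mimicking how currency names can vary with the count.
CURRENCY_NAMES = {
    "USD": {"one": "US dollar", "other": "US dollars"},
    "EUR": {"one": "euro", "other": "euros"},
}


def plural_category_en(n: int) -> str:
    """Simplified English rule for whole numbers.

    A magnitude of exactly 1 selects "one"; every other integer selects
    "other". Fractions and other languages are out of scope for this sketch.
    """
    return "one" if abs(n) == 1 else "other"


def currency_phrase(amount: int, code: str) -> str:
    """Combine an integer amount with the matching currency name."""
    names = CURRENCY_NAMES[code]
    category = plural_category_en(amount)
    # Fall back to "other" if the table lacks the selected category.
    name = names.get(category, names["other"])
    return f"{amount} {name}"


phrases = [currency_phrase(n, "USD") for n in (0, 1, 2)]
print(phrases)
```

Languages differ in how many plural categories they distinguish, which is why the lookup is keyed by category name rather than by a fixed singular/plural pair.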

The information is currently used in International Components for Unicode, Apple's macOS, LibreOffice, MediaWiki, and IBM's AIX, among other applications and operating systems.

CLDR overlaps somewhat with ISO/IEC 15897 (POSIX locales). POSIX locale information can be derived from CLDR by using some of CLDR's conversion tools.

CLDR covers more than 400 languages.

References

Uses material from the Wikipedia article Common Locale Data Repository, released under the CC BY-SA 4.0 license.