ICU Message Format: Quick Guide with Syntax Examples

by Denis Augsburger

First things first: We love to translate ICU messages. Why? Because they are used in multilingual software projects and provide a more granular way to localize. ICU messages carry more context about the data expected in your source messages. With their pluralization and selection rules you can translate your product reliably into different languages. We combine machine translation with ICU message support to speed up the localization process.

If you only need to translate a simple phrase without any variables, plurals or selects, you can use plain text. Depending on your i18n library, these texts are organized in different formats like JSON, YAML, properties files or PHP arrays. Plain text can be used within ICU messages and doesn't need any special syntax. For example in JSON:

{
  "header": "Simple ICU Messages",
  "text": "Just write text without any variables"
}

Variables in ICU Messages

Variables (also called interpolations in many i18n libraries or arguments in the ICU docs) are used to insert data. This data is provided at runtime and can represent different things like numbers, dates, times and text.

ICU variables are inserted with the variable name in curly brackets like {variable} in your ICU messages.

Simple Variables

Simple ICU variables can be used for any kind of data but are specifically useful for text insertions without any need of pluralization or selection.

Combined with the YAML format our example looks as follows:

---
header: Simple ICU Messages with variable
text: Just write text with variables in {format}

In this example, the variable “format” could be replaced with "yaml", "json" or "php array". You can insert any data you need, but for additional formatting and localization support use the examples below for numbers, dates and time.
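To make the substitution step concrete, here is a minimal TypeScript sketch. The helper name `interpolate` is hypothetical and not part of any i18n library. It only handles bare `{name}` placeholders and is not a full ICU parser.

```typescript
// Minimal sketch of simple {variable} substitution; not a full ICU parser.
// `interpolate` is a hypothetical helper, not an API of any i18n library.
function interpolate(
  message: string,
  values: Record<string, string | number>
): string {
  // Match a bare identifier in curly brackets, e.g. {format}.
  return message.replace(/\{(\w+)\}/g, (placeholder: string, name: string) =>
    // Keep the placeholder untouched when no value was provided.
    Object.prototype.hasOwnProperty.call(values, name)
      ? String(values[name])
      : placeholder
  );
}

const text = interpolate("Just write text with variables in {format}", {
  format: "yaml",
});
console.log(text); // Just write text with variables in yaml
```

In a real project your i18n library performs this step for you; the sketch only shows what happens to the placeholder.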

Variables with Numbers

Numbers are formatted differently depending on the language. This affects the thousands separator (10'000 vs 10.000) and the decimal separator (1,4 vs 1.4). Numbers can also be written in a different numeral system, e.g. with Arabic-Indic digits.

---
header: Different number formats in ICU messages
totalExample: Your score is {total, number}.
percentExample: We've increased your productivity by {increase, number, percent}.
amountExample: Your total is {total, number, currency}.
amountSkeletonExample: Your total is {total, number, ::currency/EUR}.

The syntax of ICU variables that represent a number looks as follows: { variableName, number, style }.

There are three styles for numbers:

  • integer
  • percent
  • currency

If the style starts with ::, it is called a skeleton. A skeleton declares a specific format or transformation that is applied before the data is inserted. In a checkout process, our amountExample would show the total in the default currency of the user's locale, e.g. USD. The value itself is not converted. If your prices are in EUR and you only want the formatting to follow the user's locale, use the amountSkeletonExample instead.

With the additional information that this is a number (more specifically, a number that represents a total in EUR), the translator can choose a more suitable word, adjective or adverb. This leads to better results for both human and machine translation.

There are many more ways to format numbers, for example for percentages. See the number skeletons section in the official docs.
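The effect of the locale on number output can be tried out with JavaScript's built-in Intl.NumberFormat. This is only a sketch for illustration: it does not parse ICU messages, and the mapping to the ICU styles in the comments is approximate. The sample values are made up.

```typescript
// Sketch using the built-in Intl.NumberFormat API (not an ICU message parser).
const score = 10000;

// Grouping separators differ by locale.
const enNumber = new Intl.NumberFormat("en-US").format(score); // 10,000
const deNumber = new Intl.NumberFormat("de-DE").format(score); // 10.000

// Roughly corresponds to {increase, number, percent}.
const percent = new Intl.NumberFormat("en-US", { style: "percent" }).format(0.25); // 25%

// Roughly corresponds to {total, number, ::currency/EUR}:
// the currency stays EUR, only the formatting follows the locale.
const euro = { style: "currency", currency: "EUR" } as const;
const enEuro = new Intl.NumberFormat("en-US", euro).format(1234.5); // €1,234.50
const deEuro = new Intl.NumberFormat("de-DE", euro).format(1234.5); // 1.234,50 € (with a non-breaking space)

console.log(enNumber, deNumber, percent, enEuro, deEuro);
```

Note that the amount is never converted between currencies. Only separators, symbol position and spacing change.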

Variables with Dates

Dates are formatted differently depending on your target language. This can be a different arrangement of day, month and year or the usage of another separator. Examples for this can be:

DE: 31.12.2020 US: 12/31/2020 UK: 31/12/2020

Thus, the syntax for ICU variables with dates is: {variableName, date, style}

There are four styles for dates:

  • short, e.g. 12/12/20
  • medium, e.g. Dec 12, 2020
  • long, e.g. December 12, 2020
  • full, e.g. Saturday, December 12, 2020

The style can also be a skeleton, marked with ::. See the date skeleton section of the official docs if you need to create your own.
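The different arrangement of day, month and year can be reproduced with the built-in Intl.DateTimeFormat. The sketch below is an illustration only and does not parse ICU messages. The helper `formatDate` and the sample date are assumptions, and the time zone is pinned to UTC so the output does not depend on the machine running it.

```typescript
// Sketch using the built-in Intl.DateTimeFormat API (not an ICU message parser).
// Sample value: 31 December 2020, 17:30:42 UTC.
const publishedAt = new Date(Date.UTC(2020, 11, 31, 17, 30, 42));

const numericDate: Intl.DateTimeFormatOptions = {
  year: "numeric",
  month: "2-digit",
  day: "2-digit",
  timeZone: "UTC", // pinned so the result is the same on every machine
};

// Hypothetical helper: formats the sample date for a given locale.
function formatDate(locale: string): string {
  return new Intl.DateTimeFormat(locale, numericDate).format(publishedAt);
}

console.log(formatDate("de-DE")); // 31.12.2020
console.log(formatDate("en-US")); // 12/31/2020
console.log(formatDate("en-GB")); // 31/12/2020
```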

Variables with Times

Times are handled similarly to dates.

So the syntax with times for ICU messages is: {variableName, time, style}

There are four styles for time:

  • short, e.g. 5:30 PM
  • medium, e.g. 5:30:42 PM
  • long, e.g. 5:30:42 PM PST
  • full, e.g. 5:30:42 PM Pacific Standard Time

The format can also be adapted with a skeleton, see the section Variables with Dates.

Let's see an example with date and time in a JSON formatted source file:

{
  "created": "The blog post was posted on {publishedAt, date, medium} at {publishedAt, time, short}"
}

Let's translate our example with Simpleen to German:

{
  "created": "Der Blog-Beitrag wurde am {publishedAt, date, medium} um {publishedAt, time, short} veröffentlicht."
}

The prepositions are handled correctly and the translation result needs no or only minimal post-editing.

Complex ICU Structures

Complex arguments such as plurals and selects should wrap the whole message, so that every case contains a complete sentence. From a developer's perspective this looks counterproductive, because parts of the message are repeated in every case. From a localization perspective it makes sense, because words that depend on the case can be rearranged or inflected in each target language. If you need to nest selects and plurals, put selects on the outside (a fixed set of cases) and plurals on the inside (the set of cases varies by target language).

You can also insert variables in your complex ICU messages.
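As an illustration of this nesting, the TypeScript sketch below mirrors a select on the outside and a plural on the inside. The message, the variable names gender and count, and the helper `formatInvited` are hypothetical and not taken from any library. The plural category comes from the built-in Intl.PluralRules.

```typescript
// The ICU message this sketch mirrors (hypothetical example):
// {gender, select,
//   female {{count, plural, one {She invited # guest.} other {She invited # guests.}}}
//   other {{count, plural, one {They invited # guest.} other {They invited # guests.}}}}
type PluralBranches = { other: string; [category: string]: string | undefined };
type SelectBranches = { other: PluralBranches; [key: string]: PluralBranches | undefined };

const invited: SelectBranches = {
  female: { one: "She invited # guest.", other: "She invited # guests." },
  other: { one: "They invited # guest.", other: "They invited # guests." },
};

// Hypothetical helper: resolve the select first, then the plural inside it.
function formatInvited(gender: string, count: number, locale: string): string {
  const known = Object.prototype.hasOwnProperty.call(invited, gender);
  const pluralBranches = (known ? invited[gender] : undefined) ?? invited.other;
  const category = new Intl.PluralRules(locale).select(count);
  const template = pluralBranches[category] ?? pluralBranches.other;
  // Every branch holds a full sentence, so only # needs to be filled in.
  return template.replace(/#/g, String(count));
}

console.log(formatInvited("female", 1, "en")); // She invited 1 guest.
console.log(formatInvited("male", 3, "en")); // They invited 3 guests.
```

Because each branch is a complete sentence, a translator can reorder or inflect the words per case without touching the surrounding structure.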

Pluralization Format for ICU Messages

Pluralization rules depend on the language's grammar and can easily go beyond the singular and plural forms we know from English. There are also languages, like Japanese, that don't distinguish between singular and plural.

The syntax looks like this: {count, plural, one {Showing one result} other {Showing # results}}

"one" and "other" are plural categories. The "other" case is always mandatory. You can look up what "other" covers and which categories each target language needs in the Unicode CLDR plural rules, or just use Simpleen to handle these cases automatically.

Now let's see another example with YAML:

---
availableMembers: "{members, plural, =0 {No members available} one {There is one member available.} other {There are # members available.}}"

This ICU message expects a numeric variable named "members". It provides three options for English. Depending on the value of members, the message shown to the user is "No members available", "There is one member available." or, for a value of 5, "There are 5 members available.". The hash sign (#) is replaced with the value of members.

In English there are two plural categories, "one" for the singular and "other" for the plural. With =0 we added an exact match that is used when the members variable is equal to zero.
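The lookup order described above (exact match first, then the plural category, then "other") can be sketched in TypeScript. The helper `selectPlural` is hypothetical. The plural category is taken from the built-in Intl.PluralRules, and the sketch does not parse the ICU string itself.

```typescript
// Branches of the availableMembers message, written as a plain object.
type Branches = { other: string; [key: string]: string | undefined };

const availableMembers: Branches = {
  "=0": "No members available",
  one: "There is one member available.",
  other: "There are # members available.",
};

// Hypothetical helper: picks a plural branch in the order
// exact match (=N), plural category for the locale, then "other".
function selectPlural(count: number, locale: string, branches: Branches): string {
  const exactKey = `=${count}`;
  const category = new Intl.PluralRules(locale).select(count);
  const template = branches[exactKey] ?? branches[category] ?? branches.other;
  // The hash sign is replaced with the locale-formatted number.
  return template.replace(/#/g, new Intl.NumberFormat(locale).format(count));
}

console.log(selectPlural(0, "en", availableMembers)); // No members available
console.log(selectPlural(1, "en", availableMembers)); // There is one member available.
console.log(selectPlural(5, "en", availableMembers)); // There are 5 members available.
```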

Let's translate this example with Simpleen into Russian, a language with more pluralization rules.

availableMembers: "{members, plural,  =0 {Нет доступных членов}  one {Есть один член.}  few {Имеется # члена.}  many {Есть # членов.}  other {# члена.}}"

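Russian uses the plural categories one, few, many and other, so the additional cases are inserted and translated automatically. As a result you can localize correctly into more languages. Keep in mind that in Russian the "one" category also covers numbers such as 21 or 31, so a production message should use # there rather than a spelled-out number.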
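To see which category a number falls into for a given language, you can query the built-in Intl.PluralRules, which is based on the same CLDR data. The helper `categoriesFor` and the sample numbers are assumptions made for this sketch.

```typescript
// Sample numbers chosen to hit different plural categories.
const samples = [1, 2, 5, 21, 1.5];

// Hypothetical helper: returns the plural category of each sample for a locale.
function categoriesFor(locale: string): string[] {
  const rules = new Intl.PluralRules(locale);
  return samples.map((n) => rules.select(n));
}

const english = categoriesFor("en"); // one, other, other, other, other
const russian = categoriesFor("ru"); // one, few, many, one, other
const japanese = categoriesFor("ja"); // other for every sample

console.log({ english, russian, japanese });
```

The Russian result shows why a literal "one" in the message is risky: 21 lands in the same category as 1.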

Select Format for ICU Messages

The select format works like a switch statement in programming, where the default case is represented by "other". Most likely you will use it for gender-based selection of messages. But it can be used for anything, like different messages depending on the mood, group or state.

The syntax looks like this:

{variableName, select, case1 {Text 1} other {Text 2}}

Let's take an example of a weather app with the states sunny, cloudy, snowy and rainy with a message embedded in JSON.

{
  "todayIs": "{weather, select, sunny {Today is sunny} cloudy {Today is cloudy} snowy {Today is snowy} other {Today is rainy}}" 
}
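The switch-like behavior of select can be sketched as a lookup with a fallback to "other". The helper `selectBranch` is hypothetical, and the branches are copied from the weather example as a plain object instead of being parsed from the ICU string.

```typescript
// Branches of the todayIs message, written as a plain object.
type SelectBranches = { other: string; [key: string]: string | undefined };

const todayIs: SelectBranches = {
  sunny: "Today is sunny",
  cloudy: "Today is cloudy",
  snowy: "Today is snowy",
  other: "Today is rainy",
};

// Hypothetical helper: returns the matching branch or falls back to "other".
function selectBranch(value: string, branches: SelectBranches): string {
  // Only accept keys defined on the object itself, not inherited ones.
  const known = Object.prototype.hasOwnProperty.call(branches, value);
  return (known ? branches[value] : undefined) ?? branches.other;
}

console.log(selectBranch("snowy", todayIs)); // Today is snowy
console.log(selectBranch("foggy", todayIs)); // Today is rainy
```

The second call shows a side effect of this message design: any state without its own case, such as a hypothetical "foggy", is reported as rainy, so a neutral text is usually a safer choice for "other".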

Start ICU Message Translation

You can start to localize and translate now by creating a Simpleen account. Reach out via Twitter or email with questions and feedback.