Esperanto and English Haiku Lengths

In "Twtr: Which tongues work best for microblogs?" the Economist asks how many characters different languages need to say the same thing, as compared with English. They find Chinese is the most concise (as ideograms use a single character to express a whole word) and most Romance languages are longer. I wondered how Esperanto compares.

For data, I used the haiku that I post daily on Twitter in Esperanto with English translation. I selected the haiku I've posted in the past year (~350), eliminated the ones where I didn't include a translation, counted the characters in each version, and plotted them.

The total number of characters was 19227 in Esperanto and 20371 in English -- for a mean and median (both) of 3 more characters per haiku in English. In the same metric used in the original article (i.e., as compared with 1000 characters of English text), Esperanto would use 56 fewer characters, placing it above Chinese, but below the next most concise language in their study (Arabic).
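For anyone who wants to repeat the arithmetic, here is a minimal Python sketch of it. The two totals are the ones quoted above. The function names are my own for illustration, and the per-haiku helper assumes you have your own list of (Esperanto, English) text pairs, which is not reproduced here.

```python
def per_thousand_difference(other_total, english_total):
    """Characters used relative to 1000 characters of English.

    Negative means the other language is shorter than English.
    """
    return round(1000 * other_total / english_total) - 1000


def length_differences(pairs):
    """Per-haiku difference (English minus Esperanto) in characters.

    `pairs` is a list of (esperanto_text, english_text) tuples.
    len() counts Unicode code points, so a letter like "ĉ" counts once
    as long as it is stored as a single precomposed character.
    """
    return [len(en) - len(eo) for eo, en in pairs]


# Totals quoted in the post.
esperanto_total = 19227
english_total = 20371

print(per_thousand_difference(esperanto_total, english_total))  # -56
```

Feeding the output of `length_differences` to `statistics.mean` and `statistics.median` would give the per-haiku figures.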

Haiku are actually an interesting data source, in that the goal is to have a particular count in syllables, but the data may also include some biases. I conform to the traditional 5-7-5 syllable counts in the Esperanto versions, but less consistently in the English translations (where art and custom suggest you should strive for fewer), which may account for some of the excess verbiage. At the same time, in order to post a haiku with the three hashtags I want to use (#hajko #haiku #esperanto which total 25 characters), I'm often looking for characters to shave from the English translation to get everything to fit (by replacing the word "and" with an ampersand, for example).
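The space squeeze can be sketched in a few lines of Python. The 140-character limit is an assumption on my part (it is not stated above), and the helper names are hypothetical. The hashtag count includes one separating space before each tag, which is how the three tags come to 25 characters.

```python
TWEET_LIMIT = 140  # assumption: the limit is not given in the post
HASHTAGS = ["#hajko", "#haiku", "#esperanto"]


def hashtag_cost(tags):
    """Characters taken by the tags, each preceded by one space."""
    return sum(len(tag) + 1 for tag in tags)


def fits(text, tags=HASHTAGS, limit=TWEET_LIMIT):
    """True if the haiku text plus the hashtags fits within the limit."""
    return len(text) + hashtag_cost(tags) <= limit


def shorten(text):
    """One of the shaving tricks: swap the word "and" for an ampersand."""
    return text.replace(" and ", " & ")


print(hashtag_cost(HASHTAGS))    # 25
print(shorten("wind and rain"))  # wind & rain
```

Under that assumed limit, the Esperanto and English versions together have 115 characters to share.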

The whole idea of worrying about how concise the language itself is, however, is really kind of silly. As they point out in the article, people have always found all sorts of ways to shorten messages, even just to avoid typing, like the ubiquitous "LOL" (or "MDR" or "KKK" depending on which language you speak).

Esperanto is a great language for Twitter, not because it's concise, but because it's easy to learn and has global reach. One weakness of Esperanto is that there aren't a lot of Esperanto speakers in any one place -- if a language is "a dialect with an army", Esperanto won't ever get there. But there are some Esperanto speakers almost everywhere, and the Internet has made it easy for them to use their language with each other every day. Twitter is a natural fit.

Twitter has, until recently, not expressed much direct support for Esperanto: Esperanto isn't one of the languages currently available for the interface. But that may be changing: on their new "Twitter International" blog, their first message began with an Esperanto greeting. The Esperanto community would welcome more support for Esperanto from Twitter.