Original URL: http://www.columbia.edu/kermit/utf8.html
Backup time: 2003-3-1 10:01:40

UTF-8 SAMPLER

  ¥  £  €  $  ¢  ₡  ₢  ₣  ₤  ₥  ₦  ₧  ₨  ₩  ₪  ₫  ₭  ₮  ₯

Frank da Cruz
The Kermit Project - Columbia University
New York City
fdc@columbia.edu

Last update: Sat Feb 22 10:45:28 2003


[ Poetry ] [ I Can Eat Glass ] [ The Quick Brown Fox ] [ HTML Features ] [ Credits, Tools, Commentary ]

UTF-8 is an ASCII-preserving encoding method for Unicode (ISO 10646), the Universal Character Set (UCS). The UCS encodes most of the world's writing systems in a single character set, allowing you to mix languages and scripts within a document without needing any tricks for switching character sets. This web page is encoded directly in UTF-8.
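As a small illustration of what "ASCII-preserving" means, the following Python sketch prints the UTF-8 bytes for a few characters taken from this page. The helper name `utf8_hex` and the choice of sample characters are assumptions made for this example only. ASCII characters keep their single-byte values, while other characters become sequences of two or more bytes, each with the high bit set.

```python
def utf8_hex(text: str) -> str:
    """Return the UTF-8 encoding of text as space-separated hex bytes."""
    return " ".join(f"{byte:02x}" for byte in text.encode("utf-8"))


# One ASCII letter, two currency symbols from the subtitle, one Georgian letter.
samples = ["A", "£", "₩", "ა"]

for ch in samples:
    print(f"U+{ord(ch):04X}  {ch}  ->  {utf8_hex(ch)}")

# Pure ASCII text is byte-for-byte identical in ASCII and in UTF-8.
ascii_text = "I can eat glass"
print(ascii_text.encode("utf-8") == ascii_text.encode("ascii"))
```

Because no byte of a multi-byte sequence falls in the ASCII range, software that only understands ASCII can still find ASCII delimiters safely in UTF-8 text.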

As shown HERE, Columbia University's Kermit 95 terminal emulation software can display UTF-8 plain text in Windows 95, 98, ME, NT, XP, or 2000 when using a monospace Unicode font like Andale Mono WT J or Everson Mono Terminal, or the lesser populated Courier New, Lucida Console, or Andale Mono. C-Kermit can handle it too, if you have a Unicode display. As many languages as are representable in your font can be seen on the screen at the same time.

This, however, is a Web page. Some Web browsers can handle UTF-8, some can't. And those that can might not have a sufficiently populated font to work with (some browsers might pick glyphs dynamically from multiple fonts; Netscape 6 seems to do this). CLICK HERE for a survey of Unicode fonts for Windows.

The subtitle above shows currency symbols of many lands. If they don't appear as blobs, we're off to a good start!


Poetry

From the Anglo-Saxon Rune Poem (Rune version):

ᚠᛇᚻ᛫ᛒᛦᚦ᛫ᚠᚱᚩᚠᚢᚱ᛫ᚠᛁᚱᚪ᛫ᚷᛖᚻᚹᛦᛚᚳᚢᛗ
ᛋᚳᛖᚪᛚ᛫ᚦᛖᚪᚻ᛫ᛗᚪᚾᚾᚪ᛫ᚷᛖᚻᚹᛦᛚᚳ᛫ᛗᛁᚳᛚᚢᚾ᛫ᚻᛦᛏ᛫ᛞᚫᛚᚪᚾ
ᚷᛁᚠ᛫ᚻᛖ᛫ᚹᛁᛚᛖ᛫ᚠᚩᚱ᛫ᛞᚱᛁᚻᛏᚾᛖ᛫ᛞᚩᛗᛖᛋ᛫ᚻᛚᛇᛏᚪᚾ᛬

From Laȝamon's Brut (The Chronicles of England, Middle English, West Midlands):

An preost wes on leoden, Laȝamon was ihoten
He wes Leovenaðes sone -- liðe him be Drihten.
He wonede at Ernleȝe at æðelen are chirechen,
Uppen Sevarne staþe, sel þar him þuhte,
Onfest Radestone, þer he bock radde.

(The third letter in the author's name is Yogh, missing from many fonts; CLICK HERE for another Middle English sample with some explanation of letters and encoding).

From the Tagelied of Wolfram von Eschenbach (Middle High German):

Sîne klâwen durh die wolken sint geslagen,
er stîget ûf mit grôzer kraft,
ich sih in grâwen tägelîch als er wil tagen,
den tac, der im geselleschaft
erwenden wil, dem werden man,
den ich mit sorgen în verliez.
ich bringe in hinnen, ob ich kan.
sîn vil manegiu tugent michz leisten hiez.

Some lines of Odysseus Elytis (Greek):

Τη γλώσσα μου έδωσαν ελληνική
το σπίτι φτωχικό στις αμμουδιές του Ομήρου.
Μονάχη έγνοια η γλώσσα μου στις αμμουδιές του Ομήρου.

από το Άξιον Εστί
του Οδυσσέα Ελύτη

The first stanza of Pushkin's Bronze Horseman (Russian):

На берегу пустынных волн
Стоял он, дум великих полн,
И вдаль глядел. Пред ним широко
Река неслася; бедный чёлн
По ней стремился одиноко.
По мшистым, топким берегам
Чернели избы здесь и там,
Приют убогого чухонца;
И лес, неведомый лучам
В тумане спрятанного солнца,
Кругом шумел.

Šota Rustaveli's Veṗxis Ṭq̇aosani, The Knight in the Tiger's Skin (Georgian):

ვეპხის ტყაოსანი შოთა რუსთაველი

ღმერთსი შემვედრე, ნუთუ კვლა დამხსნას სოფლისა შრომასა, ცეცხლს, წყალსა და მიწასა, ჰაერთა თანა მრომასა; მომცნეს ფრთენი და აღვფრინდე, მივჰხვდე მას ჩემსა ნდომასა, დღისით და ღამით ვჰხედვიდე მზისა ელვათა კრთომაასა.


I Can Eat Glass

And from the sublime to the ridiculous, here is a certain phrase in an assortment of languages (1):

  1. Sanskrit (5): काचं शक्नोम्यत्तुम् । नोपिहनिस्त माम् ।
  2. Sanskrit (standard transcription): kācaṃ śaknomyattum; nopahinasti mām.
  3. Classical Greek: ὕαλον ϕαγεῖν δύναμαι· τοῦτο οὔ με βλάπτει.
  4. Greek: Μπορώ να φάω σπασμένα γυαλιά χωρίς να πάθω τίποτα.
  5. Etruscan: (NEEDED)
  6. Latin: Vitrum edere possum; mihi non nocet.
  7. Esperanto: Mi povas manĝi vitron, ĝi ne damaĝas min.
  8. French: Je peux manger du verre, ça ne me fait pas de mal.
  9. Provençal / Occitan: Pòdi manjar de veire, me nafrariá pas.
  10. Québécois: J'peux manger d'la vitre, ça m'fa pas mal.
  11. Walloon: Dji pou magnî do vêre, çoula m' freut nén må.
  12. Champenois: (NEEDED)
  13. Lorrain: (NEEDED)
  14. Picard: (NEEDED)
  15. Corsican: (NEEDED)
  16. Basque: Kristala jan dezaket, ez dit minik ematen.
  17. Catalan: Puc menjar vidre que no em fa mal.
  18. Spanish: Puedo comer vidrio, no me hace daño.
  19. Aragones: Puedo minchar beire, no me'n fa mal.
  20. Galician: Eu podo xantar cristais e non cortarme.
  21. Portuguese: Posso comer vidro, não me faz mal.
  22. Brazilian Portuguese: Consigo comer vidro. Não me machuca.
  23. Cabo Verde Creole: M' podê cumê vidru, ca ta maguâ-m'.
  24. Papiamentu: (NEEDED)
  25. Italian: Posso mangiare il vetro e non mi fa male.
  26. Roman: Me posso magna' er vetro, e nun me fa male.
  27. Sicilian: Puotsu mangiari u vitru, nun mi fa mali.
  28. Milanese: Sôn bôn de magnà el véder, el me fa minga mal.
  29. Venetian: Mi posso magnare el vetro, no'l me fa mae.
  30. Rheto-Romance: (NEEDED)
  31. Romanian: Pot să mănânc sticlă și ea nu mă rănește.
  32. Pictish: (NEEDED)
  33. Breton: (NEEDED)
  34. Cornish: Mý a yl dybry gwéder hag éf ny wra ow ankenya.
  35. Welsh: Dw i'n gallu bwyta gwydr, 'dyw e ddim yn gwneud dolur i mi.
  36. Manx Gaelic: Foddym gee glonney agh cha jean eh gortaghey mee.
  37. Old Irish (Ogham): ᚛᚛ᚉᚑᚅᚔᚉᚉᚔᚋ ᚔᚈᚔ ᚍᚂᚐᚅᚑ ᚅᚔᚋᚌᚓᚅᚐ᚜
  38. Old Irish (Latin): Con·iccim ithi nglano. Ním·géna.
  39. Irish: Is féidir liom gloinne a ithe. Ní dhéanann sí dochar ar bith dom.
  40. Scottish Gaelic: S urrainn dhomh gloinne ithe; cha ghoirtich i mi.
  41. Anglo-Saxon (Runes): ᛁᚳ᛫ᛗᚨᚷ᛫ᚷᛚᚨᛋ᛫ᛖᚩᛏᚪᚾ᛫ᚩᚾᛞ᛫ᚻᛁᛏ᛫ᚾᛖ᛫ᚻᛖᚪᚱᛗᛁᚪᚧ᛫ᛗᛖ᛬
  42. Anglo-Saxon (Latin): Ic mæg glæs eotan ond hit ne hearmiað me.
  43. Middle English: Ich canne glas eten and hit hirtiþ me nouȝt.
  44. English: I can eat glass and it doesn't hurt me.
  45. English (Braille): ⠊⠀⠉⠁⠝⠀⠑⠁⠞⠀⠛⠇⠁⠎⠎⠀⠁⠝⠙⠀⠊⠞⠀⠙⠕⠑⠎⠝⠞⠀⠓⠥⠗⠞⠀⠍⠑
  46. Lalland Scots / Doric: Ah can eat gless, it disnae hurt us.
  47. Glaswegian: (NEEDED)
  48. Gothic: (NEEDED)
  49. Old Norse (Runes): ᛖᚴ ᚷᛖᛏ ᛖᛏᛁ ᚧ ᚷᛚᛖᚱ ᛘᚾ ᚦᛖᛋᛋ ᚨᚧ ᚡᛖ ᚱᚧᚨ ᛋᚨᚱ
  50. Old Norse: Ek get etið gler án þess að verða sár.
  51. Norsk / Norwegian (Nynorsk): Eg kan eta glas utan å skada meg.
  52. Norsk / Norwegian (Bokmål): Jeg kan spise glass uten å skade meg.
  53. Føroyskt / Faroese: (NEEDED)
  54. Íslenska / Icelandic: Ég get etið gler án þess að meiða mig.
  55. Svensk / Swedish: Jag kan äta glas utan att skada mig.
  56. Dansk / Danish: Jeg kan spise glas, det gør ikke ondt på mig.
  57. Soenderjysk: Æ ka æe glass uhen at det go mæ naue.
  58. Frysk / Frisian: Ik kin glês ite, it docht me net sear.
  59. Nederlands / Dutch: Ik kan glas eten. Het doet me geen pijn.
  60. Afrikaans: Ek kan glas eet, maar dit maak my nie seer nie.
  61. Lëtzebuergescht / Luxemburgish: Ech kan Glas iessen, daat deet mir nët wei.
  62. Deutsch / German: Ich kann Glas essen, ohne mir weh zu tun.
  63. Ruhrdeutsch: Ich kann Glas verkasematucken, ohne dattet mich wat jucken tut.
  64. Sächsisch / Saxon: 'sch kann Glos essn, ohne dass'sch mer wehtue.
  65. Pfälzisch: Isch konn Glass fresse ohne dasses mer ebbes ausmache dud.
  66. Schwäbisch / Swabian: I kå Glas frässa, ond des macht mr nix!
  67. Bayrisch / Bavarian: I koh Glos esa, und es duard ma ned wei.
  68. Allemannisch: I kaun Gloos essen, es tuat ma ned weh.
  69. Schwyzerdütsch: Ich chan Glaas ässe, das tuet mir nöd weeh.
  70. Hungarian: Meg tudom enni az üveget, nem lesz tőle bajom.
  71. Suomi / Finnish: Voin syödä lasia, se ei vahingoita minua.
  72. Sami (Northern): Sáhtán borrat lása, dat ii leat bávččas.
  73. Estonian: Ma võin klaasi süüa, see ei tee mulle midagi.
  74. Latvian: Es varu ēst stiklu, tas man nekaitē.
  75. Lithuanian: Aš galiu valgyti stiklą ir jis manęs nežeidžia
  76. Old Prussian: (NEEDED)
  77. Sorbian / Lusatian / Wendish: (NEEDED)
  78. Czech: Mohu jíst sklo, neublíží mi.
  79. Slovak: Môžem jesť sklo. Nezraní ma.
  80. Polska / Polish: Mogę jeść szkło i mi nie szkodzi.
  81. Slovenian: Lahko jem steklo, ne da bi mi škodovalo.
  82. Croatian: Ja mogu jesti staklo i ne boli me.
  83. Serbian (Latin): Mogu jesti staklo a da mi ne škodi.
  84. Serbian (Cyrillic): Могу јести стакло а да ми не шкоди.
  85. Macedonian: Можам да јадам стакло, а не ме штета.
  86. Russian: Я могу есть стекло, оно мне не вредит.
  87. Belarusian (Cyrillic): Я магу есці шкло, яно мне не шкодзіць.
  88. Belarusian (Lacinka): Ja mahu jeści škło, jano mne ne škodzić.
  89. Ukrainian: Я можу їсти шкло, й воно мені не пошкодить.
  90. Bulgarian: Мога да ям стъкло и не ме боли.
  91. Georgian: მინას ვჭამ და არა მტკივა.
  92. Armenian: Կրնամ ապակի ուտել և ինծի անհանգիստ չըներ։
  93. Albanian: Unë mund të ha qelq dhe nuk më gjen gjë.
  94. Turkish: Cam yiyebilirim, bana zararı dokunmaz.
  95. Turkish (Ottoman): جام ييه بلورم بڭا ضررى طوقونمز
  96. Marathi: मी काच खाऊ शकतो, मला ते दुखत नाही.
  97. Hindi: मैं काँच खा सकता हूँ, मुझे उस से कोई पीडा नहीं होती.
  98. Urdu(2): میں کانچ کھا سکتا ہوں اور مجھے تکلیف نہیں ہوتی ۔
  99. Pashto(2): زه شيشه خوړلې شم، هغه ما نه خوږوي
  100. Farsi / Persian: .من می توانم بدونِ احساس درد شيشه بخورم
  101. Arabic(2): أنا قادر على أكل الزجاج و هذا لا يؤلمني.
  102. Aramaic: (NEEDED)
  103. Hebrew(2): אני יכול לאכול זכוכית וזה לא מזיק לי.
  104. Yiddish(2): איך קען עסן גלאָז און עס טוט מיר נישט װײ.
  105. Ladino: (NEEDED)
  106. Gǝʼǝz: (NEEDED)
  107. Amharic: (NEEDED)
  108. Twi: Metumi awe tumpan, ɜnyɜ me hwee.
  109. Hausa (Latin): Inā iya taunar gilāshi kuma in gamā lāfiyā.
  110. Hausa (Ajami) (2): إِنا إِىَ تَونَر غِلَاشِ كُمَ إِن غَمَا لَافِىَا
  111. Yoruba(3): Mo lè je̩ dígí, kò ní pa mí lára.
  112. Malay: Saya boleh makan kaca dan ia tidak mencederakan saya.
  113. Tagalog: Kaya kong kumain nang bubog at hindi ako masaktan.
  114. Chamorro: Siña yo' chumocho krestat, ti ha na'lalamen yo'.
  115. Javanese: Aku isa mangan beling tanpa lara.
  116. Vietnamese (quốc ngữ): Tôi có thể ăn thủy tinh mà không hại gì.
  117. Vietnamese (nôm) (4): 些 𣎏 世 咹 水 晶 𦓡 空 𣎏 害 咦
  118. Mongolian: (NEEDED)
  119. Tibetan: ཤེལ་སྒོ་ཟ་ནས་ང་ན་གི་མ་རེད།
  120. Chinese: 我能吞下玻璃而不伤身体。
  121. Japanese: 私はガラスを食べられます。それは私を傷つけません。
  122. Korean: 나는 유리를 먹을 수 있어요. 그래도 아프지 않아요
  123. Thai: ฉันกินกระจกได้ แต่มันไม่ทำให้ฉันเจ็บ
  124. Hawaiian: Hiki iaʻu ke ʻai i ke aniani; ʻaʻole nō lā au e ʻeha.
  125. Marquesan: E koʻana e kai i te karahi, mea ʻā, ʻaʻe hauhau.
  126. Navajo: Tsésǫʼ yishą́ągo bííníshghah dóó doo shił neezgai da.
  127. Cherokee (and Cree, Ojibwa, Inuktitut, and other Native American languages): (NEEDED)
  128. Garifuna: (NEEDED)
  129. Gullah: (NEEDED)
  130. Lojban: mi kakne le nu citka le blaci .iku'i le se go'i na xrani mi
  131. Nórdicg: Ljœr ye caudran créneþ ý jor cẃran.

(Additions, corrections, completions, gratefully accepted.)

For testing purposes, some of these are repeated in a monospace font . . .

  1. Euro Symbol: €.
  2. Greek: Μπορώ να φάω σπασμένα γυαλιά χωρίς να πάθω τίποτα.
  3. Íslenska / Icelandic: Ég get etið gler án þess að meiða mig.
  4. Polish: Mogę jeść szkło, i mi nie szkodzi.
  5. Romanian: Pot să mănânc sticlă și ea nu mă rănește.
  6. Ukrainian: Я можу їсти шкло, й воно мені не пошкодить.
  7. Armenian: Կրնամ ապակի ուտել և ինծի անհանգիստ չըներ։
  8. Georgian: მინას ვჭამ და არა მტკივა.
  9. Hindi: मैं काँच खा सकता हूँ, मुझे उस से कोई पीडा नहीं होती.
  10. Hebrew(2): אני יכול לאכול זכוכית וזה לא מזיק לי.
  11. Yiddish(2): איך קען עסן גלאָז און עס טוט מיר נישט װײ.
  12. Arabic(2): أنا قادر على أكل الزجاج و هذا لا يؤلمني.
  13. Japanese: 私はガラスを食べられます。それは私を傷つけません。
  14. Thai: ฉันกินกระจกได้ แต่มันไม่ทำให้ฉันเจ็บ

Notes:

  1. The numbering of the samples is arbitrary, done only to keep track of how many there are, and can change any time a new entry is added. The arrangement is also arbitrary but with some attempt to group related examples together. Bug #1: the (NEEDED) examples shouldn't count. Fix: Fill them in! Bug #2: All languages not listed are wanted, not just the ones that say (NEEDED).
  2. Correct right-to-left display of these languages depends on the capabilities of your browser. The period should appear on the left. In the monospace Yiddish example, the Yiddish digraphs should occupy one character cell. Note: unlike the other RTL examples, the Farsi phrase was entered "backwards".
  3. Yoruba: The third word is Latin letter small 'j' followed by small 'e' with U+0329, Combining Vertical Line Below. This displays correctly only if your Unicode font includes the U+0329 glyph and your browser supports combining diacritical marks. The Indic examples also include combining sequences.
  4. Vietnamese Nôm includes Unicode 3.1 Plane 2 characters.
  5. Devanagari (used for writing Sanskrit and other Indic languages) requires complex rendering that most browsers are not capable of; furthermore it is far from settled how best to encode it in Unicode to achieve effects such as ligation. CLICK HERE for a lengthy and illustrated discussion.
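Notes 3 and 4 can be checked with a few lines of Python. This is only a sketch: the helper names `describe` and `plane` are made up for the example, and the Nôm character is copied from the Vietnamese (nôm) sample above. It lists the code points in the third Yoruba word and shows that a Plane 2 character needs four bytes in UTF-8.

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str, int]]:
    """List (code point, Unicode name, UTF-8 byte count) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), len(ch.encode("utf-8")))
        for ch in text
    ]


def plane(ch: str) -> int:
    """Return the Unicode plane number (0 is the BMP) of a single character."""
    return ord(ch) >> 16


yoruba_word = "je\u0329"  # 'j', 'e', then U+0329 as described in note 3
nom_char = "𣎏"           # copied from the Vietnamese (nôm) sample

for row in describe(yoruba_word + nom_char):
    print(row)

print("combining class of U+0329:", unicodedata.combining("\u0329"))
print("plane of the Nôm character:", plane(nom_char))
```

A nonzero combining class marks U+0329 as a combining mark, which is why the font and the browser both have to cooperate to draw it under the preceding 'e'.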


The Quick Brown Fox

The "I can eat glass" sentences do not necessarily show off the orthography of each language to best advantage. In many alphabetic written languages it is possible to include all (or most) letters (or "special" characters) in a single (often nonsense) pangram. These were traditionally used in typewriter instruction; now they are useful for stress-testing computer fonts and keyboard input methods. Here are a few examples (SEND MORE):

  1. English: The quick brown fox jumps over the lazy dog.
  2. German: Falsches Üben von Xylophonmusik quält jeden größeren Zwerg. (1)
  3. Swedish: Flygande bäckasiner söka strax hwila på mjuka tuvor.
  4. Czech: Příliš žluťoučký kůň úpěl ďábelské kódy.
  5. Slovak: Starý kôň na hŕbe kníh žuje tíško povädnuté ruže, na stĺpe sa ďateľ učí kvákať novú ódu o živote.
  6. Russian: В чащах юга жил-был цитрус? Да, но фальшивый экземпляр! ёъ.
  7. Sami (Northern): Vuol Ruoŧa geđggiid leat máŋga luosa ja čuovžža.
  8. Hungarian: Árvíztűrő tükörfúrógép.
  9. Spanish: El pingüino Wenceslao hizo kilómetros bajo exhaustiva lluvia y frío, añoraba a su querido cachorro.
  10. French: Les naïfs ægithales hâtifs pondant à Noël où il gèle sont sûrs d'être déçus et de voir leurs drôles d'œufs abîmés.
  11. Esperanto: Eĥoŝanĝo ĉiuĵaŭde.
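Whether a sentence really covers an alphabet is easy to check mechanically. The sketch below is one possible approach, not a tool used for this page; the function name `missing_letters` and the letter sets passed to it are assumptions for illustration. It uses `lower()` rather than `casefold()` because case folding would turn "ß" into "ss" and hide it.

```python
import unicodedata


def missing_letters(sentence: str, alphabet: str) -> list[str]:
    """Return the letters of alphabet that do not occur in sentence."""
    # Normalize to NFC so precomposed and decomposed input compare equal.
    present = set(unicodedata.normalize("NFC", sentence).lower())
    wanted = set(unicodedata.normalize("NFC", alphabet).lower())
    return sorted(wanted - present)


english = "abcdefghijklmnopqrstuvwxyz"
german_specials = "äöüß"

print(missing_letters("The quick brown fox jumps over the lazy dog.", english))
print(missing_letters(
    "Falsches Üben von Xylophonmusik quält jeden größeren Zwerg.",
    english + german_specials,
))
print(missing_letters(
    "Franz jagt im komplett verwahrlosten Taxi quer durch Bayern",
    german_specials,
))
```

An empty list means every requested letter is present. The last call reports all four German special characters as missing, matching the remark in the note below about that phrase.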

Notes:

  1. Other phrases commonly used in Germany include: "Ein wackerer Bayer vertilgt ja bequem zwo Pfund Kalbshaxe" and, more recently, "Franz jagt im komplett verwahrlosten Taxi quer durch Bayern", but both lack umlauts and eszett. Previously, going for the shortest sentence that has all the umlauts and special characters, I had "Grüße aus Bärenhöfe (und Óechtringen)!" Acute accents are not used in native German words, so I was surprised to discover "Óechtringen" in the Deutsche Bundespost Postleitzahlenbuch (Vorsicht! 2.8MB JPG image). It's a small village in eastern Lower Saxony. Later, Alex Bochannek reported, "I heard back from the city hall people of the town. There is no diacritical mark on the O. The name of the town was mentioned in early documents as 'Ochterdingen' and over time became 'Oechtringen'. The 'dingen' part is probably derived from 'Dingplatz', which means assembly place. It's pronounced Öchtringen."


HTML Features

Here is the Russian alphabet (uppercase only) coded in three different ways, which should look identical:

  1. АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ   (Literal UTF-8)
  2. АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ   (Decimal numeric character reference)
  3. АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ   (Hexadecimal numeric character reference)
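The three codings can be generated and compared with a short Python sketch. It assumes the alphabet in question is the 32 contiguous uppercase letters U+0410 through U+042F; the variable names are invented for this example. The standard library's `html.unescape` stands in for what a browser does when it resolves numeric character references.

```python
import html

# Assumption: the 32 uppercase letters from U+0410 to U+042F, in code point order.
alphabet = "".join(chr(cp) for cp in range(0x0410, 0x0430))

# Literal UTF-8: the bytes that would appear directly in the HTML file.
literal_bytes = alphabet.encode("utf-8")

# Decimal and hexadecimal numeric character references, as written in HTML source.
decimal_refs = "".join(f"&#{ord(ch)};" for ch in alphabet)
hex_refs = "".join(f"&#x{ord(ch):04X};" for ch in alphabet)

print(alphabet)
print(decimal_refs[:21], "...")
print(hex_refs[:24], "...")

# All three forms should resolve to the same text.
print(literal_bytes.decode("utf-8") == html.unescape(decimal_refs) == html.unescape(hex_refs))
```

The numeric forms are pure ASCII, so they survive even when the page's declared character set is wrong, at the cost of being unreadable in the source.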

In another test, we use HTML language tags to distinguish Bulgarian, Russian, and Serbian, which have different italic forms for lowercase б, г, д, п, and/or т:

Bulgarian:   [ бгдпт ]   [ бгдпт ]   Мога да ям стъкло и не ме боли.
Russian:   [ бгдпт ]   [ бгдпт ]   Я могу есть стекло, оно мне не вредит.
Serbian:   [ бгдпт ]   [ бгдпт ]   Могу јести стакло а да ми не шкоди.


Credits, Tools, and Commentary

Credits:
The "I can eat glass" phrase and the initial collection of translations: Ethan Mollick. Transcription / conversion to UTF-8: Frank da Cruz. Albanian: Sindi Keesan. Afrikaans: Johan Fourie. Anglo Saxon: Frank da Cruz. Arabic: Najib Tounsi. Armenian: Vaçe Kundakçı. Belarusian: Alexey Chernyak. Braille: Frank da Cruz. Bulgarian: Sindi Keesan, Guentcho Skordev. Cabo Verde Creole: Cláudio Alexandre Duarte. Chinese: Jack Soo. Cornish: Chris Stephens. Croatian: Marjan Baće. Czech: Stanislav Pecha, Radovan Garabík. Dutch: Peter Gotink. Esperanto: Franko Luin, Radovan Garabík. Estonian: Meelis Roos. Farsi/Persian: Payam Elahi. Finnish: Sampsa Toivanen. French: Luc Carissimo, Anne Colin du Terrail. Galician: Laura Probaos. Georgian: Giorgi Lebanidze. German: Christoph Päper, Otto Stolz, Frank da Cruz. Greek: Ariel Glenn, Constantine Stathopoulos, Siva Nataraja. Hebrew: Jonathan Rosenne. Hausa: Malami Buba, Tom Gewecke. Hawaiian: na Hauʻoli Motta, Anela de Rego, Kaliko Trapp. Hindi: Shirish Kalele. Hungarian: András Rácz, Mark Holczhammer. Icelandic: Andrés Magnússon. Irish: Michael Everson. Italian: Thomas De Bellis. Japanese: Makoto Takahashi. Korean: Jungshik Shin. Lëtzebuergescht: Stefaan Eeckels. Lithuanian: Gediminas Grigas. Lojban: Edward Cherlin. Macedonian: Sindi Keesan. Malay: Zarina Mustapha. Manx: Éanna Ó Brádaigh. Marathi: Shirish Kalele. Marquesan: Kaliko Trapp. Middle English: Frank da Cruz. Milanese: Marco Cimarosti. Navajo: Tom Gewecke. Nórdicg: Yẃlyan Rott. Norwegian: Herman Ranes. Old Irish: Michael Everson. Old Norse: Andrés Magnússon. Pashto: N.R. Liwal. Pfälzisch: Dr. Johannes Sander. Polish: Juliusz Chroboczek. Québécois: Laurent Detillieux. Roman: Pierpaolo Bernardi. Romanian: Juliusz Chroboczek, Ionel Mugurel. Ruhrdeutsch: "Timwi". Russian: Alexey Chernyak, Serge Nesterovitch. Sami: Anne Colin du Terrail, Luc Carissimo. Sanskrit: Siva Nataraja / Vincent Ramos. Sächsisch: André Müller. Schwäbisch: Otto Stolz. Scots: Jonathan Riddell. Serbian: Sindi Keesan, Ranko Narancic, Boris Daljevic, Szilvia Csorba. Slovak: G. Adam Stanislav, Radovan Garabík. Slovenian: Albert Kolar. Spanish: Laura Probaos. Swedish: Christian Rose. Tagalog: Jim Soliven. Tibetan: D. Germano, Tom Gewecke. Thai: Alan Wood's wife. Turkish: Vaçe Kundakçı, Tom Gewecke, Merlign Olnon. Ukrainian: Michael Zajac. Urdu: Mustafa Ali. Vietnamese: Dixon Au, [James] Đỗ Bá Phước. Walloon: Pablo Saratxaga. Welsh: Geiriadur Prifysgol Cymru (Andrew). Yiddish: Mark David.

Tools Used to Create This Web Page:
The UTF8-aware Kermit 95 terminal emulator on Windows, to a Unix host with the EMACS text editor. Kermit 95 displays UTF-8 and also allows keyboard entry of arbitrary Unicode BMP characters as 4 hex digits, as shown HERE. Hex codes for Unicode values can be found in The Unicode Standard (recommended) and the online code charts. When submissions arrive by email encoded in some other character set (Latin-1, Latin-2, KOI, various PC code pages, JEUC, etc), I use the TRANSLATE command of C-Kermit on the Unix host (where I read my mail) to convert the character set to UTF-8 (I could also use Kermit 95 for this; it has the same TRANSLATE command). That's it -- no "Web authoring" tools, no locales, no "smart" anything. It's just plain text, nothing more. By the way, there's nothing special about EMACS -- any text editor will do, providing it allows entry of arbitrary 8-bit bytes as text, including the 0x80-0x9F "C1" range. EMACS 21.1 actually supports UTF-8; earlier versions don't know about it and display the octal codes; either way is OK for this purpose.
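For readers without Kermit, the two operations described above can be approximated in Python. This is a sketch of the general idea, not a description of how Kermit's TRANSLATE command or Kermit 95's hex entry work internally; the function names `to_utf8` and `from_hex4` are invented here, and the charset names are Python codec names.

```python
def to_utf8(raw: bytes, source_charset: str) -> bytes:
    """Re-encode bytes from a named legacy character set as UTF-8."""
    return raw.decode(source_charset).encode("utf-8")


def from_hex4(code: str) -> str:
    """Turn exactly 4 hex digits into the corresponding BMP character."""
    if len(code) != 4:
        raise ValueError("expected exactly 4 hex digits")
    value = int(code, 16)  # raises ValueError on non-hex input
    if 0xD800 <= value <= 0xDFFF:
        raise ValueError("surrogate code points are not characters")
    return chr(value)


# Simulate submissions arriving in KOI8-R and Latin-1.
koi8_submission = "стекло".encode("koi8_r")
latin1_submission = "Grüße".encode("latin_1")

print(to_utf8(koi8_submission, "koi8_r").decode("utf-8"))
print(to_utf8(latin1_submission, "latin_1").decode("utf-8"))

# Keyboard-style entry of a character by its 4 hex digits.
print(from_hex4("20AC"), from_hex4("16A0"))
```

The conversion only works when the source character set is known. Decoding with the wrong charset either raises an error or silently produces the wrong characters, so the name has to come from the mail headers or from the sender.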

Commentary:
Date: Wed, 27 Feb 2002 13:21:59 +0100
From: "Bruno DEDOMINICIS" <b.dedominicis@cite-sciences.fr>
Subject: Je peux manger du verre, cela ne me fait pas mal.

I just found out your website and it makes me feel like proposing an interpretation of the choice of this peculiar phrase.

Glass is transparent and can hurt as everyone knows. The relation between people and civilisations is sometimes effusional and more often rude. The concept of breaking frontiers through globalization, in a way, is also an attempt to deny any difference. Isn't "transparency" the flag of modernity? Nothing should be hidden any more, authority is obsolete, and the new powers are supposed to reign through loving and smiling and no more through coercion...

Eating glass without pain sounds like a very nice metaphor of this attempt. That is, frontiers should become glass transparent first, and be denied by incorporating them. On the reverse, it shows that through globalization, frontiers undergo a process of displacement, that is, when they are not any more speakable, they become repressed from the speech and are therefore incorporated and might become painful symptoms, as for example what happens when one tries to eat glass.

The frontiers that used to separate bodies one from another tend to divide bodies from within and make them suffer.... The chosen phrase then appears as a denial of the symptom that might result from the destitution of traditional frontiers.

Best,
Bruno De Dominicis, Paris, France



UTF-8 Sampler / The Kermit Project / Columbia University / kermit@columbia.edu / 22 February 2003