UTF-8 SAMPLER
¥ £ $ ¢ ₡ ₢ ₣ ₤ ₥ ₦ ₧ ₨ ₩ ₪ ₫ ₭ ₮ ₯
Frank da Cruz The Kermit Project - Columbia
University New York City fdc@columbia.edu
Last update: Sat Feb 22 10:45:28 2003
[ Poetry ] [ I Can Eat Glass ] [ The Quick Brown Fox ]
[ HTML Features ] [ Credits, Tools, Commentary ]
UTF-8 is an ASCII-preserving encoding method for Unicode (ISO 10646), the Universal Character Set (UCS).
The UCS encodes most of the world's writing systems in a single character set,
allowing you to mix languages and scripts within a document without needing any
tricks for switching character sets. This web page is encoded directly in UTF-8.
As shown HERE, Columbia University's Kermit 95 terminal emulation software can display UTF-8
plain text in Windows 95, 98, ME, NT, XP, or 2000 when using a monospace Unicode
font like Andale Mono WT J or Everson Mono Terminal, or the lesser
populated Courier New, Lucida Console, or Andale Mono. C-Kermit can handle it too, if you have a Unicode
display. As many languages as are representable in your font can be seen on
the screen at the same time.
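The "ASCII-preserving" property can be checked directly. The following is a minimal sketch in Python, not part of the original page; the helper name `utf8_hex` and the choice of sample characters are assumptions made for illustration. It prints the UTF-8 byte sequence for an ASCII letter, two currency symbols, a rune, and a Plane 2 ideograph, and confirms that pure ASCII text encodes to the same bytes in both encodings.

```python
def utf8_hex(ch: str) -> str:
    """Return the UTF-8 bytes of a string as space-separated hex pairs."""
    return " ".join(f"{b:02X}" for b in ch.encode("utf-8"))

# Illustrative samples (an assumption, not taken from the page):
# ASCII letter, pound sign, dong sign, a rune, and a Plane 2 ideograph.
samples = ["A", "\u00A3", "\u20AB", "\u16A0", "\U0002338F"]

for ch in samples:
    n_bytes = len(ch.encode("utf-8"))
    print(f"U+{ord(ch):04X}  {n_bytes} byte(s)  {utf8_hex(ch)}")

# ASCII text is byte-for-byte identical in ASCII and UTF-8.
ascii_text = "I can eat glass"
assert ascii_text.encode("utf-8") == ascii_text.encode("ascii")
```

Characters above U+007F take two to four bytes, and every byte of such a sequence has its high bit set, so it can never be mistaken for an ASCII character.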
This, however, is a Web page. Some Web browsers can handle UTF-8, some can't.
And those that can might not have a sufficiently populated font to work with
(some browsers might pick glyphs dynamically from multiple fonts; Netscape 6
seems to do this). CLICK
HERE for a survey of Unicode fonts for Windows.
The subtitle above shows currency symbols of many lands. If they don't appear
as blobs, we're off to a good start!
From the Anglo-Saxon Rune Poem (Rune
version):
ᚠᛇᚻ᛫ᛒᛦᚦ᛫ᚠᚱᚩᚠᚢᚱ᛫ᚠᛁᚱᚪ᛫ᚷᛖᚻᚹᛦᛚᚳᚢᛗ ᛋᚳᛖᚪᛚ᛫ᚦᛖᚪᚻ᛫ᛗᚪᚾᚾᚪ᛫ᚷᛖᚻᚹᛦᛚᚳ᛫ᛗᛁᚳᛚᚢᚾ᛫ᚻᛦᛏ᛫ᛞᚫᛚᚪᚾ ᚷᛁᚠ᛫ᚻᛖ᛫ᚹᛁᛚᛖ᛫ᚠᚩᚱ᛫ᛞᚱᛁᚻᛏᚾᛖ᛫ᛞᚩᛗᛖᛋ᛫ᚻᛚᛇᛏᚪᚾ᛬
From Laȝamon's Brut
(The Chronicles of England, Middle English, West Midlands):
An preost wes on leoden, Laȝamon was ihoten He wes Leovenaðes
sone -- liðe him be Drihten. He wonede at Ernleȝe at æðelen are
chirechen, Uppen Sevarne staþe, sel þar him þuhte, Onfest Radestone, þer
he bock radde.
(The third letter in the author's name is Yogh, missing from many fonts; CLICK HERE for another Middle English sample with
some explanation of letters and encoding).
From the Tagelied of Wolfram von
Eschenbach (Middle High German):
Sîne klâwen durh die wolken sint geslagen, er stîget ûf mit
grôzer kraft, ich sih in grâwen tägelîch als er wil tagen, den tac, der im
geselleschaft erwenden wil, dem werden man, den ich mit sorgen în
verliez. ich bringe in hinnen, ob ich kan. sîn vil manegiu tugent michz
leisten hiez.
Some lines of Odysseus Elytis (Greek):
Τη γλώσσα μου έδωσαν ελληνική το σπίτι φτωχικό στις αμμουδιές του
Ομήρου. Μονάχη έγνοια η γλώσσα μου στις αμμουδιές του Ομήρου.
από το Άξιον Εστί του Οδυσσέα Ελύτη
The first stanza of Pushkin's Bronze Horseman (Russian):
На берегу пустынных волн Стоял он, дум великих полн, И вдаль глядел.
Пред ним широко Река неслася; бедный чёлн По ней стремился одиноко.
По мшистым, топким берегам Чернели избы здесь и там, Приют убогого
чухонца; И лес, неведомый лучам В тумане спрятанного солнца, Кругом
шумел.
Šota Rustaveli's Veṗxis Ṭq̇aosani, The Knight in the Tiger's
Skin (Georgian):
ვეპხის ტყაოსანი შოთა რუსთაველი
ღმერთსი შემვედრე, ნუთუ კვლა დამხსნას სოფლისა შრომასა, ცეცხლს, წყალსა და
მიწასა, ჰაერთა თანა მრომასა; მომცნეს ფრთენი და აღვფრინდე, მივჰხვდე მას ჩემსა
ნდომასა, დღისით და ღამით ვჰხედვიდე მზისა ელვათა კრთომაასა.
And from the sublime to the ridiculous, here is a certain phrase in an
assortment of languages (1):
- Sanskrit (5): काचं शक्नोम्यत्तुम् । नोपहिनस्ति माम् ।
- Sanskrit (standard transcription): kācaṃ śaknomyattum;
nopahinasti mām.
- Classical Greek: ὕαλον ϕαγεῖν δύναμαι· τοῦτο οὔ με βλάπτει.
- Greek: Μπορώ να φάω σπασμένα γυαλιά χωρίς να πάθω τίποτα.
- Etruscan: (NEEDED)
- Latin: Vitrum edere possum; mihi non nocet.
- Esperanto: Mi povas manĝi vitron, ĝi ne damaĝas min.
- French: Je peux manger du verre, ça ne me fait pas de mal.
- Provençal / Occitan: Pòdi manjar de veire, me nafrariá pas.
- Québécois: J'peux manger d'la vitre, ça m'fa pas mal.
- Walloon: Dji pou magnî do vêre, çoula m' freut nén må.
- Champenois: (NEEDED)
- Lorrain: (NEEDED)
- Picard: (NEEDED)
- Corsican: (NEEDED)
- Basque: Kristala jan dezaket, ez dit minik ematen.
- Catalan: Puc menjar vidre que no em fa mal.
- Spanish: Puedo comer vidrio, no me hace daño.
- Aragones: Puedo minchar beire, no me'n fa mal.
- Galician: Eu podo xantar cristais e non cortarme.
- Portuguese: Posso comer vidro, não me faz mal.
- Brazilian Portuguese: Consigo comer vidro. Não me machuca.
- Cabo Verde Creole: M' podê cumê vidru, ca ta maguâ-m'.
- Papiamentu: (NEEDED)
- Italian: Posso mangiare il vetro e non mi fa male.
- Roman: Me posso magna' er vetro, e nun me fa male.
- Sicilian: Puotsu mangiari u vitru, nun mi fa mali.
- Milanese: Sôn bôn de magnà el véder, el me fa minga mal.
- Venetian: Mi posso magnare el vetro, no'l me fa mae.
- Rheto-Romance: (NEEDED)
- Romanian: Pot să mănânc sticlă și ea nu mă rănește.
- Pictish: (NEEDED)
- Breton: (NEEDED)
- Cornish: Mý a yl dybry gwéder hag éf ny wra ow ankenya.
- Welsh: Dw i'n gallu bwyta gwydr, 'dyw e ddim yn gwneud dolur i mi.
- Manx Gaelic: Foddym gee glonney agh cha jean eh gortaghey mee.
- Old Irish (Ogham): ᚛᚛ᚉᚑᚅᚔᚉᚉᚔᚋ ᚔᚈᚔ ᚍᚂᚐᚅᚑ ᚅᚔᚋᚌᚓᚅᚐ᚜
- Old Irish (Latin): Con·iccim ithi nglano. Ním·géna.
- Irish: Is féidir liom gloinne a ithe. Ní dhéanann sí dochar ar bith
dom.
- Scottish Gaelic: S urrainn dhomh gloinne ithe; cha ghoirtich i mi.
- Anglo-Saxon (Runes): ᛁᚳ᛫ᛗᚨᚷ᛫ᚷᛚᚨᛋ᛫ᛖᚩᛏᚪᚾ᛫ᚩᚾᛞ᛫ᚻᛁᛏ᛫ᚾᛖ᛫ᚻᛖᚪᚱᛗᛁᚪᚧ᛫ᛗᛖ᛬
- Anglo-Saxon (Latin): Ic mæg glæs eotan ond hit ne hearmiað me.
- Middle English: Ich canne glas eten and hit hirtiþ me nouȝt.
- English: I can eat glass and it doesn't hurt me.
- English (Braille): ⠊⠀⠉⠁⠝⠀⠑⠁⠞⠀⠛⠇⠁⠎⠎⠀⠁⠝⠙⠀⠊⠞⠀⠙⠕⠑⠎⠝⠞⠀⠓⠥⠗⠞⠀⠍⠑
- Lalland Scots / Doric: Ah can eat gless, it disnae hurt us.
- Glaswegian: (NEEDED)
- Gothic: (NEEDED)
- Old Norse (Runes): ᛖᚴ ᚷᛖᛏ ᛖᛏᛁ ᚧ ᚷᛚᛖᚱ ᛘᚾ ᚦᛖᛋᛋ ᚨᚧ ᚡᛖ ᚱᚧᚨ ᛋᚨᚱ
- Old Norse: Ek get etið gler án þess að verða sár.
- Norsk / Norwegian (Nynorsk): Eg kan eta glas utan å skada meg.
- Norsk / Norwegian (Bokmål): Jeg kan spise glass uten å skade meg.
- Føroyskt / Faroese: (NEEDED)
- Íslenska / Icelandic: Ég get etið gler án þess að meiða mig.
- Svensk / Swedish: Jag kan äta glas utan att skada mig.
- Dansk / Danish: Jeg kan spise glas, det gør ikke ondt på mig.
- Soenderjysk: Æ ka æe glass uhen at det go mæ naue.
- Frysk / Frisian: Ik kin glês ite, it docht me net sear.
- Nederlands / Dutch: Ik kan glas eten. Het doet me geen pijn.
- Afrikaans: Ek kan glas eet, maar dit maak my nie seer nie.
- Lëtzebuergescht / Luxemburgish: Ech kan Glas iessen, daat deet mir
nët wei.
- Deutsch / German: Ich kann Glas essen, ohne mir weh zu tun.
- Ruhrdeutsch: Ich kann Glas verkasematucken, ohne dattet mich wat
jucken tut.
- Sächsisch / Saxon: 'sch kann Glos essn, ohne dass'sch mer wehtue.
- Pfälzisch: Isch konn Glass fresse ohne dasses mer ebbes ausmache dud.
- Schwäbisch / Swabian: I kå Glas frässa, ond des macht mr nix!
- Bayrisch / Bavarian: I koh Glos esa, und es duard ma ned wei.
- Allemannisch: I kaun Gloos essen, es tuat ma ned weh.
- Schwyzerdütsch: Ich chan Glaas ässe, das tuet mir nöd weeh.
- Hungarian: Meg tudom enni az üveget, nem lesz tőle bajom.
- Suomi / Finnish: Voin syödä lasia, se ei vahingoita minua.
- Sami (Northern): Sáhtán borrat lása, dat ii leat bávččas.
- Estonian: Ma võin klaasi süüa, see ei tee mulle midagi.
- Latvian: Es varu ēst stiklu, tas man nekaitē.
- Lithuanian: Aš galiu valgyti stiklą ir jis manęs nežeidžia
- Old Prussian: (NEEDED)
- Sorbian / Lusatian / Wendish: (NEEDED)
- Czech: Mohu jíst sklo, neublíží mi.
- Slovak: Môžem jesť sklo. Nezraní ma.
- Polska / Polish: Mogę jeść szkło i mi nie szkodzi.
- Slovenian: Lahko jem steklo, ne da bi mi škodovalo.
- Croatian: Ja mogu jesti staklo i ne boli me.
- Serbian (Latin): Mogu jesti staklo a da mi ne škodi.
- Serbian (Cyrillic): Могу јести стакло а да ми не шкоди.
- Macedonian: Можам да јадам стакло, а не ме штета.
- Russian: Я могу есть стекло, оно мне не вредит.
- Belarusian (Cyrillic): Я магу есці шкло, яно мне не шкодзіць.
- Belarusian (Lacinka): Ja mahu jeści škło, jano mne ne škodzić.
- Ukrainian: Я можу їсти шкло, й воно мені не пошкодить.
- Bulgarian: Мога да ям стъкло и не ме боли.
- Georgian: მინას ვჭამ და არა მტკივა.
- Armenian: Կրնամ ապակի ուտել և ինծի անհանգիստ չըներ։
- Albanian: Unë mund të ha qelq dhe nuk më gjen gjë.
- Turkish: Cam yiyebilirim, bana zararı dokunmaz.
- Turkish (Ottoman): جام ييه بلورم بڭا ضررى طوقونمز
- Marathi: मी काच खाऊ शकतो, मला ते दुखत नाही.
- Hindi: मैं काँच खा सकता हूँ, मुझे उस से कोई पीडा नहीं होती.
- Urdu(2): میں کانچ کھا سکتا
ہوں اور مجھے تکلیف نہیں ہوتی ۔
- Pashto(2): زه شيشه خوړلې شم، هغه ما نه خوږوي
- Farsi / Persian: .من می توانم بدونِ احساس درد شيشه بخورم
- Arabic(2): أنا قادر على
أكل الزجاج و هذا لا يؤلمني.
- Aramaic: (NEEDED)
- Hebrew(2): אני יכול לאכול
זכוכית וזה לא מזיק לי.
- Yiddish(2): איך קען עסן
גלאָז און עס טוט מיר נישט װײ.
- Ladino: (NEEDED)
- Gǝʼǝz: (NEEDED)
- Amharic: (NEEDED)
- Twi: Metumi awe tumpan, ɜnyɜ me hwee.
- Hausa (Latin): Inā iya taunar gilāshi kuma in gamā
lāfiyā.
- Hausa (Ajami) (2): إِنا إِىَ تَونَر غِلَاشِ كُمَ إِن غَمَا لَافِىَا
- Yoruba(3): Mo lè je̩ dígí, kò ní pa mí lára.
- Malay: Saya boleh makan kaca dan ia tidak mencederakan saya.
- Tagalog: Kaya kong kumain nang bubog at hindi ako masaktan.
- Chamorro: Siña yo' chumocho krestat, ti ha na'lalamen yo'.
- Javanese: Aku isa mangan beling tanpa lara.
- Vietnamese (quốc ngữ): Tôi có thể ăn thủy tinh mà không hại gì.
- Vietnamese (nôm) (4): 些 𣎏 世 咹 水 晶 𦓡 空 𣎏 害 咦
- Mongolian: (NEEDED)
- Tibetan: ཤེལ་སྒོ་ཟ་ནས་ང་ན་གི་མ་རེད།
- Chinese: 我能吞下玻璃而不伤身体。
- Japanese: 私はガラスを食べられます。それは私を傷つけません。
- Korean: 나는 유리를 먹을 수 있어요. 그래도 아프지 않아요
- Thai: ฉันกินกระจกได้ แต่มันไม่ทำให้ฉันเจ็บ
- Hawaiian: Hiki iaʻu ke ʻai i ke aniani; ʻaʻole nō lā au e ʻeha.
- Marquesan: E koʻana e kai i te karahi, mea ʻā, ʻaʻe hauhau.
- Navajo: Tsésǫʼ yishą́ągo bííníshghah dóó doo shił neezgai da.
- Cherokee (and Cree, Ojibwa, Inuktitut, and other Native American
languages): (NEEDED)
- Garifuna: (NEEDED)
- Gullah: (NEEDED)
- Lojban: mi kakne le nu citka le blaci .iku'i le se go'i na xrani mi
- Nórdicg: Ljœr ye caudran créneþ ý jor cẃran.
(Additions, corrections, completions, gratefully accepted.)
For testing purposes, some of these are repeated in a monospace
font . . .
- Euro Symbol: €.
- Greek: Μπορώ να φάω σπασμένα γυαλιά χωρίς να πάθω τίποτα.
- Íslenska / Icelandic: Ég get etið gler án þess að meiða mig.
- Polish: Mogę jeść szkło, i mi nie szkodzi.
- Romanian: Pot să mănânc sticlă și ea nu mă rănește.
- Ukrainian: Я можу їсти шкло, й воно мені не пошкодить.
- Armenian: Կրնամ ապակի ուտել և ինծի անհանգիստ չըներ։
- Georgian: მინას ვჭამ და არა მტკივა.
- Hindi: मैं काँच खा सकता हूँ, मुझे उस से कोई पीडा नहीं होती.
- Hebrew(2): אני יכול לאכול
זכוכית וזה לא מזיק לי.
- Yiddish(2): איך קען עסן גלאָז
און עס טוט מיר נישט װײ.
- Arabic(2): أنا قادر على أكل
الزجاج و هذا لا يؤلمني.
- Japanese: 私はガラスを食べられます。それは私を傷つけません。
- Thai: ฉันกินกระจกได้ แต่มันไม่ทำให้ฉันเจ็บ
Notes:
1. The numbering of the samples is arbitrary, done only to keep track of how
many there are, and can change any time a new entry is added. The arrangement is
also arbitrary but with some attempt to group related examples together. Bug #1:
the (NEEDED) examples shouldn't count. Fix: Fill them in! Bug #2: All languages
not listed are wanted, not just the ones that say (NEEDED).
2. Correct right-to-left display of these languages depends on
the capabilities of your browser. The period should appear on the left. In
the monospace Yiddish example, the Yiddish digraphs should occupy one character
cell. Note: unlike the other RTL examples, the Farsi phrase was entered
"backwards".
3. Yoruba: The third word is Latin letter small 'j' followed by small 'e'
with U+0329, Combining Vertical Line Below. This displays correctly only if your
Unicode font includes the U+0329 glyph and your browser supports combining
diacritical marks. The Indic examples also include combining sequences.
4. Vietnamese Nôm includes Unicode 3.1 Plane 2 characters.
5. Devanagari (used for writing Sanskrit and other Indic languages) requires
complex rendering that most browsers are not capable of; furthermore it is far
from settled how best to encode it in Unicode to achieve effects such as
ligation. CLICK HERE for a lengthy and
illustrated discussion.
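The notes on combining marks and Plane 2 characters can be made concrete with a short inspection script. This is a sketch under stated assumptions: the helper name `describe` is hypothetical, and U+2338F is used only as an example of a Plane 2 ideograph. It relies on Python's standard `unicodedata` module to show that the Yoruba word is three code points, the last of which is a combining mark, and that a Plane 2 character lies above U+FFFF and needs four bytes in UTF-8.

```python
import unicodedata

def describe(text: str):
    """Return (code point, name, combining class) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"),
         unicodedata.combining(ch))
        for ch in text
    ]

# Yoruba: 'j', 'e', then U+0329 COMBINING VERTICAL LINE BELOW.
yoruba_word = "je\u0329"
for code, name, cclass in describe(yoruba_word):
    print(code, name, "combining" if cclass else "base")

# A Plane 2 character (example only): code point above U+FFFF.
plane2_char = "\U0002338F"
print("plane:", ord(plane2_char) >> 16)
print("UTF-8 length:", len(plane2_char.encode("utf-8")))
```

A nonzero combining class means the character attaches to the preceding base letter, which is why correct display depends on both the font and the renderer.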
The "I can eat glass" sentences do not necessarily
show off the orthography of each language to best advantage. In many alphabetic
written languages it is possible to include all (or most) letters (or "special"
characters) in a single (often nonsense) pangram. These were
traditionally used in typewriter instruction; now they are useful for
stress-testing computer fonts and keyboard input methods. Here are a few
examples (SEND MORE):
- English: The quick brown fox jumps over the lazy dog.
- German: Falsches Üben von Xylophonmusik quält jeden größeren Zwerg. (1)
- Swedish: Flygande bäckasiner söka strax hwila på mjuka tuvor.
- Czech: Příliš žluťoučký kůň úpěl ďábelské kódy.
- Slovak: Starý kôň na hŕbe kníh žuje tíško povädnuté ruže, na stĺpe sa
ďateľ učí kvákať novú ódu o živote.
- Russian: В чащах юга жил-был цитрус? Да, но фальшивый экземпляр! ёъ.
- Sami (Northern): Vuol Ruoŧa geđggiid leat máŋga luosa ja čuovžža.
- Hungarian: Árvíztűrő tükörfúrógép.
- Spanish: El pingüino Wenceslao hizo kilómetros bajo exhaustiva lluvia
y frío, añoraba a su querido cachorro.
- French: Les naïfs ægithales hâtifs pondant à Noël où il gèle sont
sûrs d'être déçus et de voir leurs drôles d'œufs abîmés.
- Esperanto: Eĥoŝanĝo ĉiuĵaŭde.
Notes:
1. Other phrases commonly used in Germany include: "Ein wackerer Bayer vertilgt
ja bequem zwo Pfund Kalbshaxe" and, more recently, "Franz jagt im komplett
verwahrlosten Taxi quer durch Bayern", but both lack umlauts and eszett.
Previously, going for the shortest sentence that has all the umlauts and special
characters, I had "Grüße aus Bärenhöfe (und Óechtringen)!" Acute accents are not
used in native German words, so I was surprised to discover "Óechtringen" in the
Deutsche Bundespost Postleitzahlenbuch
(Vorsicht! 2.8MB JPG image). It's a small village in eastern Lower Saxony.
Later, Alex Bochannek reported, "I heard back from the city hall people of the
town. There is no diacritical mark on the O. The name of the town was mentioned
in early documents as 'Ochterdingen' and over time became 'Oechtringen'. The
'dingen' part is probably derived from 'Dingplatz', which means assembly place.
It's pronounced Öchtringen."
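Whether a sentence really covers an alphabet is easy to test mechanically. The sketch below is an illustration, not the author's method; the function name `missing_letters` and the German alphabet string (a to z plus ä, ö, ü, ß) are assumptions. It lower-cases the sentence with `str.lower()` rather than `str.casefold()`, because case folding would turn "ß" into "ss" and hide the very letter being checked.

```python
import string
import unicodedata

def missing_letters(sentence: str, alphabet: str) -> list:
    """Return the letters of `alphabet` that do not occur in `sentence`."""
    # Normalize to NFC so precomposed letters such as 'ä' compare equal.
    text = unicodedata.normalize("NFC", sentence).lower()
    return sorted(set(alphabet) - set(text))

# Assumed alphabets for the check.
english = string.ascii_lowercase
german = string.ascii_lowercase + "äöüß"

print(missing_letters("The quick brown fox jumps over the lazy dog.", english))
print(missing_letters(
    "Falsches Üben von Xylophonmusik quält jeden größeren Zwerg.", german))
print(missing_letters(
    "Franz jagt im komplett verwahrlosten Taxi quer durch Bayern", german))
```

The first two calls should report nothing missing, while the third reports the umlauts and eszett, matching the remark that the "Franz" sentence lacks them.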
Here is the Russian alphabet (uppercase only) coded in
three different ways, which should look identical:
- АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ (Literal UTF-8)
- АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ (Decimal numeric character
reference)
- АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ (Hexadecimal numeric character
reference)
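The equivalence of the three codings can be verified in a few lines. This is a minimal sketch, assuming the alphabet in question is the 32 uppercase letters U+0410 through U+042F (without Ё); the variable names are illustrative. It builds the decimal and hexadecimal numeric character references and uses the standard library's `html.unescape` to confirm that both decode to the same literal string.

```python
import html

# Assumption: uppercase Russian letters U+0410..U+042F (32 letters, no U+0401).
alphabet = "".join(chr(cp) for cp in range(0x0410, 0x0430))

# The same text as decimal and as hexadecimal numeric character references.
decimal_refs = "".join(f"&#{ord(ch)};" for ch in alphabet)
hex_refs = "".join(f"&#x{ord(ch):04X};" for ch in alphabet)

print(alphabet)
print(decimal_refs[:21], "...")   # first three references
print(hex_refs[:24], "...")       # first three references

# A conforming HTML parser resolves both reference forms to the same text.
assert html.unescape(decimal_refs) == alphabet
assert html.unescape(hex_refs) == alphabet
```

Numeric references are pure ASCII, so they survive transports that would damage raw UTF-8 bytes, at the cost of a larger and less readable source.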
In another test, we use HTML language tags to distinguish Bulgarian, Russian,
and Serbian, which have different italic forms for lowercase б, г, д, п, and/or т:
- Bulgarian: [ бгдпт ] [ бгдпт ] Мога да ям стъкло и не ме боли.
- Russian: [ бгдпт ] [ бгдпт ] Я могу есть стекло, это мне не вредит.
- Serbian: [ бгдпт ] [ бгдпт ] Могу јести стакло а да ми не шкоди.
- Credits:
- The "I can eat glass" phrase and the initial collection of translations: Ethan Mollick. Transcription /
conversion to UTF-8: Frank da Cruz. Albanian: Sindi Keesan. Afrikaans:
Johan Fourie. Anglo Saxon: Frank da Cruz. Arabic: Najib Tounsi. Armenian:
Vaçe Kundakçı. Belarusian: Alexey Chernyak. Braille: Frank da Cruz.
Bulgarian: Sindi Keesan, Guentcho Skordev. Cabo Verde Creole: Cláudio Alexandre
Duarte. Chinese: Jack Soo. Cornish: Chris Stephens. Croatian: Marjan Baće.
Czech: Stanislav Pecha, Radovan Garabík. Dutch: Peter Gotink. Esperanto: Franko
Luin, Radovan Garabík. Estonian: Meelis Roos. Farsi/Persian: Payam Elahi.
Finnish: Sampsa Toivanen. French: Luc Carissimo, Anne Colin du Terrail.
Galician: Laura Probaos. Georgian: Giorgi Lebanidze. German: Christoph Päper,
Otto Stolz, Frank da Cruz. Greek: Ariel Glenn, Constantine Stathopoulos,
Siva Nataraja. Hebrew: Jonathan Rosenne. Hausa: Malami Buba, Tom Gewecke.
Hawaiian: na Hauʻoli Motta, Anela de Rego, Kaliko Trapp. Hindi: Shirish
Kalele. Hungarian: András Rácz, Mark Holczhammer. Icelandic: Andrés Magnússon.
Irish: Michael Everson. Italian: Thomas De Bellis. Japanese: Makoto Takahashi.
Korean: Jungshik Shin. Lëtzebuergescht: Stefaan Eeckels. Lithuanian: Gediminas
Grigas. Lojban: Edward Cherlin. Macedonian: Sindi Keesan. Malay: Zarina
Mustapha. Manx: Éanna Ó Brádaigh. Marathi: Shirish Kalele. Marquesan: Kaliko
Trapp. Middle English: Frank da Cruz. Milanese: Marco Cimarosti. Navajo:
Tom Gewecke. Nórdicg:
Yẃlyan Rott. Norwegian: Herman Ranes. Old Irish: Michael Everson. Old Norse:
Andrés Magnússon. Pashto: N.R. Liwal. Pfälzisch: Dr. Johannes Sander. Polish:
Juliusz Chroboczek. Québécois: Laurent Detillieux. Roman: Pierpaolo Bernardi.
Romanian: Juliusz Chroboczek, Ionel Mugurel. Ruhrdeutsch: "Timwi". Russian:
Alexey Chernyak, Serge Nesterovitch. Sami: Anne Colin du Terrail, Luc
Carissimo. Sanskrit: Siva Nataraja / Vincent Ramos. Sächsisch: André Müller.
Schwäbisch: Otto Stolz. Scots: Jonathan Riddell. Serbian: Sindi Keesan, Ranko
Narancic, Boris Daljevic, Szilvia Csorba. Slovak: G. Adam Stanislav, Radovan
Garabík. Slovenian: Albert Kolar. Spanish: Laura Probaos. Swedish: Christian
Rose. Tagalog: Jim Soliven. Tibetan: D. Germano, Tom Gewecke. Thai: Alan Wood's
wife. Turkish: Vaçe Kundakçı, Tom Gewecke, Merlign Olnon. Ukrainian: Michael
Zajac. Urdu: Mustafa Ali. Vietnamese:
Dixon Au, [James] Đỗ Bá Phước 杜伯福. Walloon: Pablo
Saratxaga. Welsh: Geiriadur Prifysgol Cymru (Andrew). Yiddish: Mark David.
- Tools Used to Create This Web Page:
- The UTF8-aware Kermit 95 terminal emulator on
Windows, to a Unix host with the EMACS text editor. Kermit 95
displays UTF-8 and also allows keyboard entry of arbitrary Unicode BMP
characters as 4 hex digits, as shown HERE. Hex codes
for Unicode values can be found in The Unicode Standard
(recommended) and the online code
charts. When submissions arrive by email encoded in some other character set
(Latin-1, Latin-2, KOI, various PC code pages, JEUC, etc), I use the TRANSLATE
command of C-Kermit on the Unix host (where I read my mail) to convert the character set to UTF-8
(I could also use Kermit 95 for this; it has the same TRANSLATE command). That's
it -- no "Web authoring" tools, no locales, no "smart" anything. It's just plain
text, nothing more. By the way, there's nothing special about EMACS -- any text
editor will do, providing it allows entry of arbitrary 8-bit bytes as text,
including the 0x80-0x9F "C1" range. EMACS 21.1 actually supports UTF-8; earlier
versions don't know about it and display the octal codes; either way is OK for
this purpose.
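For readers without Kermit at hand, the same kind of character-set conversion can be sketched in Python. This is not the Kermit TRANSLATE command and makes no claim about how it works internally; the function name `to_utf8` is hypothetical, and the codec names (`latin-1`, `iso8859-2`, `koi8-r`, `euc_jp`) are those of Python's standard codecs. The idea is simply to decode the incoming bytes from their declared character set and re-encode them as UTF-8.

```python
def to_utf8(data: bytes, source_charset: str) -> bytes:
    """Decode bytes from source_charset and re-encode them as UTF-8."""
    return data.decode(source_charset).encode("utf-8")

# Simulated submissions arriving in legacy encodings (illustrative data).
submissions = [
    ("latin-1", "Grüße".encode("latin-1")),
    ("iso8859-2", "szkło".encode("iso8859-2")),
    ("koi8-r", "стекло".encode("koi8-r")),
    ("euc_jp", "ガラス".encode("euc_jp")),
]

for charset, raw in submissions:
    converted = to_utf8(raw, charset)
    print(f"{charset:10} {raw.hex(' ')}  ->  {converted.hex(' ')}")
```

The conversion is only as good as the declared source character set: decoding with the wrong one usually succeeds silently for 8-bit sets and produces the kind of garbled text that a sampler page like this one is meant to expose.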
- Commentary:
- Date: Wed, 27 Feb 2002 13:21:59 +0100
From: "Bruno DEDOMINICIS"
<b.dedominicis@cite-sciences.fr> Subject: Je peux manger du
verre, cela ne me fait pas mal.
I just found out your website and it makes me feel like proposing an
interpretation of the choice of this peculiar phrase.
Glass is transparent and can hurt as everyone knows. The relation between
people and civilisations is sometimes effusional and more often rude. The
concept of breaking frontiers through globalization, in a way, is also an
attempt to deny any difference. Isn't "transparency" the flag of modernity?
Nothing should be hidden any more, authority is obsolete, and the new powers are
supposed to reign through loving and smiling and no more through coercion...
Eating glass without pain sounds like a very nice metaphor of this attempt.
That is, frontiers should become glass transparent first, and be denied by
incorporating them. On the reverse, it shows that through globalization,
frontiers undergo a process of displacement, that is, when they are not any more
speakable, they become repressed from the speech and are therefore incorporated
and might become painful symptoms, as for example what happens when one tries to
eat glass.
The frontiers that used to separate bodies one from another tend to divide
bodies from within and make them suffer.... The chosen phrase then appears as a
denial of the symptom that might result from the destitution of traditional
frontiers.
Best, Bruno De Dominicis, Paris, France
UTF-8 Sampler / The Kermit Project / Columbia University / kermit@columbia.edu / 22 February 2003