E.5. Using Unicode

Visual Studio uses Unicode UTF-16 encoding to represent all characters. Figure E.3 uses C# to display the text "Welcome to Unicode" in eight languages: English, French, German, Japanese, Portuguese, Russian, Spanish and Simplified Chinese. [Note: The Unicode Consortium's Web site contains a link to code charts that list the 16-bit Unicode code values.]

Figure E.3. Windows application demonstrating Unicode encoding.

 1  // Fig. E.3: UnicodeForm.cs
 2  // Unicode encoding demonstration.
 3  using System;
 4  using System.Windows.Forms;
 5
 6  namespace UnicodeDemo
 7  {
 8     public partial class UnicodeForm : Form
 9     {
10        public UnicodeForm()
11        {
12           InitializeComponent();
13        }
14
15        // assign Unicode strings to each Label
16        private void UnicodeForm_Load( object sender, EventArgs e )
17        {
18           // English
19           char[] english = { '\u0057', '\u0065', '\u006C',
20              '\u0063', '\u006F', '\u006D', '\u0065', '\u0020',
21              '\u0074', '\u006F', '\u0020' };
22           englishLabel.Text = new string( english ) +
23              "Unicode" + '\u0021';
24
25           // French
26           char[] french = { '\u0042', '\u0069', '\u0065',
27              '\u006E', '\u0076', '\u0065', '\u006E', '\u0075',
28              '\u0065', '\u0020', '\u0061', '\u0075', '\u0020' };
29           frenchLabel.Text = new string( french ) +
30              "Unicode" + '\u0021';
31
32           // German
33           char[] german = { '\u0057', '\u0069', '\u006C',
34              '\u006B', '\u006F', '\u006D', '\u006D', '\u0065',
35              '\u006E', '\u0020', '\u007A', '\u0075', '\u0020' };
36           germanLabel.Text = new string( german ) +
37              "Unicode" + '\u0021';
38
39           // Japanese
40           char[] japanese = { '\u3078', '\u3087', '\u3045',
41              '\u3053', '\u305D', '\u0021' };
42           japaneseLabel.Text = "Unicode" + new string( japanese );
43
44           // Portuguese
45           char[] portuguese = { '\u0053', '\u0065', '\u006A',
46              '\u0061', '\u0020', '\u0062', '\u0065', '\u006D',
47              '\u0020', '\u0076', '\u0069', '\u006E', '\u0064',
48              '\u006F', '\u0020', '\u0061', '\u0020' };
49           portugueseLabel.Text = new string( portuguese ) +
50              "Unicode" + '\u0021';
51
52           // Russian
53           char[] russian = { '\u0414', '\u043E', '\u0431',
54              '\u0440', '\u043E', '\u0020', '\u043F', '\u043E',
55              '\u0436', '\u0430', '\u043B', '\u043E', '\u0432',
56              '\u0430', '\u0442', '\u044A', '\u0020', '\u0432', '\u0020' };
57           russianLabel.Text = new string( russian ) +
58              "Unicode" + '\u0021';
59
60           // Spanish
61           char[] spanish = { '\u0042', '\u0069', '\u0065',
62              '\u006E', '\u0076', '\u0065', '\u006E', '\u0069',
63              '\u0064', '\u006F', '\u0020', '\u0061', '\u0020' };
64           spanishLabel.Text = new string( spanish ) +
65              "Unicode" + '\u0021';
66
67           // Simplified Chinese
68           char[] chinese = { '\u6B22', '\u8FCE', '\u4F7F',
69              '\u7528', '\u0020' };
70           chineseLabel.Text = new string( chinese ) +
71              "Unicode" + '\u0021';
72        } // end method UnicodeForm_Load
73     } // end class UnicodeForm
74  } // end namespace UnicodeDemo

The first welcome message (lines 19–23) contains the hexadecimal codes for the English text. The Code Charts page on the Unicode Consortium Web site contains a document that lists the code values for the Basic Latin block (or category), which includes the English alphabet. The hexadecimal codes in lines 19–20 equate to "Welcome" and a space character (\u0020). Unicode characters in C# use the format \uyyyy, where yyyy represents the hexadecimal Unicode encoding. For example, the letter "W" (in "Welcome") is denoted by \u0057. The hexadecimal values for the word "to" and a space character appear on line 21, and the word "Unicode" is on line 23. "Unicode" is not encoded because it is a registered trademark and has no equivalent translation in most languages.
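The \uyyyy format can also be tried outside a Windows Form. The following console program is a minimal sketch and is not part of Fig. E.3; the class name EscapeDemo and the helper methods BuildEnglishGreeting and ToEscape are hypothetical names chosen for illustration. It assembles the English greeting from the same escapes as the figure and converts a character back to its escape notation.

```csharp
using System;

public static class EscapeDemo
{
   // Build the English greeting from the same escapes used in Fig. E.3.
   public static string BuildEnglishGreeting()
   {
      char[] english = { '\u0057', '\u0065', '\u006C',
         '\u0063', '\u006F', '\u006D', '\u0065', '\u0020',
         '\u0074', '\u006F', '\u0020' };
      return new string( english ) + "Unicode" + '\u0021';
   }

   // Format a char in \uyyyy notation (hypothetical helper).
   // "X4" pads the hexadecimal value to four uppercase digits.
   public static string ToEscape( char c )
   {
      return "\\u" + ( (int) c ).ToString( "X4" );
   }

   public static void Main()
   {
      Console.WriteLine( BuildEnglishGreeting() ); // Welcome to Unicode!
      Console.WriteLine( ToEscape( 'W' ) );        // \u0057
   }
}
```

Casting a char to int exposes its 16-bit code value, which is the number listed in the Unicode code charts.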

The remaining welcome messages (lines 26–71) contain the hexadecimal codes for the other seven languages. The code values used for the French, German, Portuguese and Spanish text are located in the Basic Latin block. The code values used for the Simplified Chinese text are located in the CJK Unified Ideographs block, those used for the Russian text are located in the Cyrillic block, and those used for the Japanese text are located in the Hiragana block.
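Because each block occupies a contiguous range of code values, a character's block can be found with simple range comparisons. The sketch below is illustrative only: the class BlockDemo and the method BlockOf are hypothetical names, and the ranges are assumed from the published Unicode code charts (Basic Latin 0000-007F, Cyrillic 0400-04FF, Hiragana 3040-309F, CJK Unified Ideographs 4E00-9FFF). It covers only the four blocks used in Fig. E.3.

```csharp
using System;

public static class BlockDemo
{
   // Name the Unicode block of c, for the four blocks used in Fig. E.3.
   // Ranges are assumed from the Unicode code charts; all other
   // characters are reported as "Other".
   public static string BlockOf( char c )
   {
      if ( c <= '\u007F' ) return "Basic Latin";
      if ( c >= '\u0400' && c <= '\u04FF' ) return "Cyrillic";
      if ( c >= '\u3040' && c <= '\u309F' ) return "Hiragana";
      if ( c >= '\u4E00' && c <= '\u9FFF' ) return "CJK Unified Ideographs";
      return "Other";
   }

   public static void Main()
   {
      // one character from each of the English, Russian,
      // Japanese and Chinese arrays in Fig. E.3
      char[] samples = { '\u0057', '\u0414', '\u3053', '\u6B22' };

      foreach ( char c in samples )
         Console.WriteLine( "U+{0:X4}: {1}", (int) c, BlockOf( c ) );
   }
}
```

A fuller version would cover many more blocks; this one only mirrors the classification described above.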

[Note: To render the Asian characters in a Windows application, you may need to install the proper language files on your computer. To do this, open the Regional Options dialog from the Control Panel (Start > Settings > Control Panel). At the bottom of the General tab is a list of languages. Check the Japanese and the Simplified Chinese checkboxes and press Apply. Follow the directions of the install wizard to install the languages. For additional assistance, visit www.unicode.org/help/display_problems.html.]

E.6. Character Ranges
