UTF-16 encodes each Unicode code point above U+FFFF as a surrogate pair: two 16-bit code units that together take up 4 bytes.
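As a rough illustration of how such a pair is derived, the sketch below applies the standard UTF-16 arithmetic to the code point U+20213. The variable names are made up for this example.

```csharp
using System;

// Sketch of the UTF-16 surrogate computation for one code point above U+FFFF.
int codePoint = 0x20213;                        // example value, chosen for illustration
int offset = codePoint - 0x10000;               // leaves a 20-bit value
char high = (char)(0xD800 + (offset >> 10));    // top 10 bits go into the high surrogate
char low = (char)(0xDC00 + (offset & 0x3FF));   // bottom 10 bits go into the low surrogate

Console.WriteLine($"{(int)high:X4} {(int)low:X4}"); // prints "D840 DE13"
```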
You can specify a surrogate pair within a string literal by inserting the character directly into the string (provided that you have a keyboard that can insert the character):
    string myString = "𠈓"; // CJK ideograph
You can also represent the surrogate pair within a string literal using the \Unnnnnnnn syntax (eight hex digits) to specify the Unicode code point, or the \unnnn\unnnn syntax (four hex digits each) to specify the two code units of the encoded surrogate pair.
    string s1 = "\U00020213";   // Code point U+20213
    string s2 = "\uD840\uDE13"; // Surrogate pair
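A quick way to check that both escape forms produce the same string is sketched below. It uses only members of System.Char and System.String; the comments show what the calls should report for this particular code point.

```csharp
using System;

string s1 = "\U00020213";      // code point escape
string s2 = "\uD840\uDE13";    // surrogate pair escape

Console.WriteLine(s1 == s2);                            // True: same two code units
Console.WriteLine(s1.Length);                           // 2: Length counts UTF-16 code units
Console.WriteLine(char.IsSurrogatePair(s1, 0));         // True
Console.WriteLine($"U+{char.ConvertToUtf32(s1, 0):X}"); // U+20213
```

Note that Length reports 2 even though the string displays as a single character.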
Note that because a surrogate pair requires more than 2 bytes, and a System.Char holds only a single 16-bit code unit, you cannot represent a surrogate pair within a single character (System.Char) literal.
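The sketch below shows one way to work within this limit: keep the value in a string and, if needed, inspect the two halves as separate System.Char values. The commented-out line is there only to show the kind of literal the compiler rejects.

```csharp
using System;

// char c = '\U00020213';   // does not compile: the value needs two UTF-16 code units

string s = char.ConvertFromUtf32(0x20213);  // returns a string, not a char
char high = s[0];
char low = s[1];

Console.WriteLine(char.IsHighSurrogate(high)); // True
Console.WriteLine(char.IsLowSurrogate(low));   // True
```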