In .NET, string data is stored in memory as Unicode text encoded as UTF-16: each char is a 2-byte code unit, and characters outside the Basic Multilingual Plane take two code units (a surrogate pair, 4 bytes).
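As a quick illustration of the in-memory representation, the sketch below inspects the UTF-16 code units of two strings. It assumes a project that supports C# top-level statements; the variable names `ascii`, `ideograph`, `isPair` and `codePoint` are only illustrative.

```csharp
using System;

string ascii = "A";                  // U+0041, inside the BMP
string ideograph = "\U00020213";     // U+20213, outside the BMP

// Length counts UTF-16 code units, not characters as a reader sees them.
Console.WriteLine(ascii.Length);     // 1
Console.WriteLine(ideograph.Length); // 2

// The two code units form a surrogate pair that decodes to one code point.
bool isPair = char.IsSurrogatePair(ideograph, 0);
int codePoint = char.ConvertToUtf32(ideograph, 0);

Console.WriteLine($"{(int)ideograph[0]:x4} {(int)ideograph[1]:x4}"); // d840 de13
Console.WriteLine($"U+{codePoint:X}");                               // U+20213
```

Note that `Length` reports 2 for the ideograph even though it is a single character to the reader.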
When you persist string data to a file, however, you must be aware of which encoding is being used. In the example below, we use a StreamWriter to write string data to a file. By default, StreamWriter encodes as UTF-8 and does not write a byte order mark.
    using System.IO;

    string s1 = "A";            // U+0041
    string s2 = "\u00e9";       // U+00E9 accented e
    string s3 = "\u0100";       // U+0100 capital A with macron (bar)
    string s4 = "\U00020213";   // U+20213 CJK ideograph (d840, de13 surrogate pair)

    // No encoding argument, so StreamWriter uses UTF-8.
    using (StreamWriter sw = new StreamWriter(@"C:\Users\Gaurav\Documents\sometext.txt"))
    {
        sw.WriteLine(s1);
        sw.WriteLine(s2);
        sw.WriteLine(s3);
        sw.WriteLine(s4);
    }
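To see what UTF-8 does with each of these four strings, the following sketch encodes them in memory with `Encoding.UTF8.GetBytes` rather than reading the file back. It assumes C# top-level statements; `samples` and `utf8Hex` are illustrative names, and the line terminators that `WriteLine` adds are left out.

```csharp
using System;
using System.Text;

string[] samples = { "A", "\u00e9", "\u0100", "\U00020213" };
string[] utf8Hex = new string[samples.Length];

for (int i = 0; i < samples.Length; i++)
{
    // The same bytes a UTF-8 StreamWriter would emit for the text itself.
    byte[] bytes = Encoding.UTF8.GetBytes(samples[i]);
    utf8Hex[i] = BitConverter.ToString(bytes);
    Console.WriteLine($"{bytes.Length} byte(s): {utf8Hex[i]}");
}
// 1 byte(s): 41
// 2 byte(s): C3-A9
// 2 byte(s): C4-80
// 4 byte(s): F0-A0-88-93
```

UTF-8 is variable-width: the ASCII letter takes one byte, the two Latin letters take two bytes each, and the ideograph outside the BMP takes four.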
We could also explicitly specify UTF-16 (Encoding.Unicode, which is little-endian UTF-16 and lives in the System.Text namespace) when creating the StreamWriter object.
    using (StreamWriter sw = new StreamWriter(@"C:\Users\Gaurav\Documents\sometext.txt", false, Encoding.Unicode))
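For comparison, here is a rough sketch of the bytes produced under `Encoding.Unicode`. It assumes C# top-level statements and writes to a `MemoryStream` instead of a file so that it has no dependency on a particular path; `samples`, `utf16Hex`, `ms` and `writtenHex` are illustrative names.

```csharp
using System;
using System.IO;
using System.Text;

string[] samples = { "A", "\u00e9", "\u0100", "\U00020213" };
string[] utf16Hex = new string[samples.Length];

for (int i = 0; i < samples.Length; i++)
{
    // Little-endian UTF-16: the low byte of each code unit comes first.
    utf16Hex[i] = BitConverter.ToString(Encoding.Unicode.GetBytes(samples[i]));
    Console.WriteLine(utf16Hex[i]);
}
// 41-00
// E9-00
// 00-01
// 40-D8-13-DE

// A StreamWriter created with Encoding.Unicode first writes the byte order mark FF FE.
using var ms = new MemoryStream();
using (var sw = new StreamWriter(ms, Encoding.Unicode, 1024, leaveOpen: true))
{
    sw.Write("A");
}
string writtenHex = BitConverter.ToString(ms.ToArray());
Console.WriteLine(writtenHex); // FF-FE-41-00
```

Here every BMP character costs two bytes and the surrogate pair costs four, so text that is mostly ASCII ends up roughly twice as large as it would be in UTF-8.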

