Use StringInfo to Get Specific Characters From A UTF32 String

We saw that you cannot use the normal string indexer [] to get individual characters from a string containing UTF32 characters, because the indexer returns 16-bit char values and a 4-byte character occupies two of them (a surrogate pair).  Instead, you need to use the System.Globalization.StringInfo class.
In the example below, we first get a list of indexes to each of the three characters in our UTF32 string.  We then extract each character separately, using its index.


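using System.Globalization;   // StringInfo lives here

string s = "A𠈓C";
int n = s.Length;     // 4, because the 4-byte character in the middle takes two chars (a surrogate pair)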
 
// Get locations of text elements
int[] indexes = StringInfo.ParseCombiningCharacters(s);  // 0, 1 and 3
 
// Retrieve single element
string nextChar = StringInfo.GetNextTextElement(s, 0);   // A
nextChar = StringInfo.GetNextTextElement(s, 1);          // 𠈓
nextChar = StringInfo.GetNextTextElement(s, 3);          // C
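
If you want every character rather than one at a time, the same class can also walk the string for you. The following is a minimal sketch, not part of the original example: the class name TextElementDemo and the helper SplitTextElements are made up for illustration, while StringInfo.GetTextElementEnumerator, LengthInTextElements and SubstringByTextElements are existing members of System.Globalization.StringInfo.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class TextElementDemo
{
    // Hypothetical helper (not from the original post): splits a string
    // into text elements, keeping each surrogate pair together.
    public static List<string> SplitTextElements(string s)
    {
        var elements = new List<string>();
        TextElementEnumerator e = StringInfo.GetTextElementEnumerator(s);
        while (e.MoveNext())
        {
            elements.Add(e.GetTextElement());
        }
        return elements;
    }

    public static void Main()
    {
        string s = "A𠈓C";

        // Prints one text element per line
        foreach (string element in SplitTextElements(s))
        {
            Console.WriteLine(element);
        }

        var info = new StringInfo(s);
        Console.WriteLine(s.Length);                   // 4 chars
        Console.WriteLine(info.LengthInTextElements);  // 3 text elements

        // Extract the middle element by text element position, not char index
        Console.WriteLine(info.SubstringByTextElements(1, 1));
    }
}
```

Note that SubstringByTextElements counts in text elements, so the middle character is at position 1 here, whereas GetNextTextElement takes a char index into the string.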