Wednesday, November 12, 2008

How to make delimited strings in C#

In prior popular MS programming languages and within the COM runtime environment, strings were mutable; that is, the program could change the contents and size of a string in memory without having to copy the whole thing.

In the world of .Net, though, strings are immutable, which means that once a string is made, any change to it causes the whole thing to be copied in memory in order to be modified. As you can imagine, this is quite painful, performance-wise. This is why the .Net runtime has a StringBuilder class, which mimics the effects of real mutable strings.
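To make the difference concrete, here is a minimal sketch in C#. The class and method names (ImmutabilityDemo, ConcatCreatesNewString, BuilderMutatesInPlace) are made up for illustration; only string, StringBuilder and object.ReferenceEquals come from the framework. It shows that "changing" a string really hands you a different object, while a StringBuilder is modified in place.

```csharp
using System;
using System.Text;

// Hypothetical demo class; the names are illustrative only.
public static class ImmutabilityDemo
{
    // True when concatenation produced a new object and left the original alone.
    public static bool ConcatCreatesNewString()
    {
        string original = "abc";
        string alias = original;   // second reference to the same string object
        original += "def";         // builds a new string; the old "abc" is untouched
        return !object.ReferenceEquals(original, alias) && alias == "abc";
    }

    // True when Append changed the one and only builder object.
    public static bool BuilderMutatesInPlace()
    {
        StringBuilder sb = new StringBuilder("abc");
        StringBuilder alias = sb;  // second reference to the same builder
        sb.Append("def");          // modifies the builder itself
        return object.ReferenceEquals(sb, alias) && alias.ToString() == "abcdef";
    }
}
```

Both methods should return true, which is the whole point: the string variable ends up pointing somewhere new, the builder does not.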

Now, in mutable string languages, building a delimited string quickly was as simple as doing this:

Dim delimiter = ""
Dim delimitedString = ""

For Each Item In intArray
    delimitedString = delimitedString & delimiter & CStr(Item)

    'Single char assignments are super cheap
    ' because the string has already been allocated.
    delimiter = ","
Next

Do that in VB.Net or C#, and it'll run like poo, because strings are immutable there: every pass through the loop allocates a brand-new string in memory and copies the old contents into it, versus what VB6 did, which was to just change the strings in place...over and over and over again.

Here's how you do the same thing in a .Net language and retain performance:

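if (intArray == null || intArray.Length == 0)
    return string.Empty;
else
{
    // ToString() matters here: new StringBuilder(int) sets the capacity, not the contents.
    StringBuilder sb = new StringBuilder(intArray[0].ToString());

    // Start at 1 because the first item is already in the builder.
    for (int itemCounter = 1; itemCounter < intArray.Length; itemCounter++)
    {
        sb.AppendFormat(", {0}", intArray[itemCounter]);
    }

    return sb.ToString();
}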
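The snippet above assumes it lives inside a method that returns a string. Here is one way that method might look, as a sketch rather than the definitive version. The class name DelimitedStrings, the method name JoinInts and the delimiter parameter are my own inventions for illustration, and I've used chained Append calls instead of AppendFormat just to skip parsing the format string.

```csharp
using System.Text;

// Hypothetical helper; names and the delimiter parameter are illustrative.
public static class DelimitedStrings
{
    public static string JoinInts(int[] values, string delimiter)
    {
        if (values == null || values.Length == 0)
            return string.Empty;

        // Seed the builder with the first item so no leading delimiter is needed.
        StringBuilder sb = new StringBuilder(values[0].ToString());

        // One pass over the remaining items: delimiter first, then the value.
        for (int i = 1; i < values.Length; i++)
        {
            sb.Append(delimiter).Append(values[i]);
        }

        return sb.ToString();
    }
}
```

Calling DelimitedStrings.JoinInts(new int[] { 1, 2, 3 }, ", ") gives back "1, 2, 3", and a null or empty array gives back an empty string.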

In VB:

Return Join(intArray, ", ")


I couldn't tell you what kind of performance the VB Join() method gets (and I have no doubt writing a loop yourself is faster), but you can't argue with the ease of it. And before you say C# can use string.Join(), you have to know that string.Join() requires its input array to already be an array of strings, which would have required me to loop through the int array above and make a string array just to call string.Join(). Which would be stupid. Because you never loop over a data set twice when once will do the trick.
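For comparison, this is roughly what the two-pass string.Join() route looks like. It's a sketch, with the class name TwoPassJoin and the method name JoinInts made up for illustration; Array.ConvertAll and string.Join(string, string[]) are the real framework calls. The first call walks the int array to build a string array, and the second walks the string array to build the result.

```csharp
using System;

// Hypothetical helper showing the two-pass alternative; names are illustrative.
public static class TwoPassJoin
{
    public static string JoinInts(int[] values, string delimiter)
    {
        if (values == null || values.Length == 0)
            return string.Empty;

        // Pass 1: loop over the ints to produce a string array.
        string[] parts = Array.ConvertAll(values, v => v.ToString());

        // Pass 2: string.Join loops over that array to build the final string.
        return string.Join(delimiter, parts);
    }
}
```

It's shorter to write, but it allocates the intermediate string array and touches the data twice, which is exactly the objection above.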


