From .NET Geek's Desk

Thoughts and Findings on .NET
String Interning in C#

We all know that string objects are immutable in C#, i.e. once a string is created it cannot be altered or modified; any operation that appears to change it creates a new instance. Let us take a quick look at the following lines of code:

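using System;

class Program
{
    static void Main(string[] args)
    {
        string s1 = "sankarsan";
        string s2 = "sankarsan";
        // ReferenceEquals compares object identity, not the characters.
        if (object.ReferenceEquals(s1, s2))
        {
            Console.WriteLine("Both s1 and s2 refer to same object");
        }
        else
        {
            Console.WriteLine("s1 and s2 refer to different object");
        }
        Console.Read(); // wait for a key press before exiting
    }
}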

As s1 and s2 are assigned from two separate literals, we might expect them to be two different objects and the output of the program to be "s1 and s2 refer to different object". But that is not the case: the output of the above code is "Both s1 and s2 refer to same object". How can this happen? Let us take a look at the IL code:

IL_0001:  ldstr      "sankarsan"
IL_0006:  stloc.0
IL_0007:  ldstr      "sankarsan"
IL_000c:  stloc.1

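ldstr pushes onto the evaluation stack a reference to a string object representing the literal stored in the metadata, and stloc pops that reference from the evaluation stack and stores it in a local variable.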

Now let us carefully study the documentation of the ldstr opcode on MSDN: http://msdn.microsoft.com/en-us/library/system.reflection.emit.opcodes.ldstr.aspx

The following lines in MSDN need to be carefully noted:

The Common Language Infrastructure (CLI) guarantees that the result of two ldstr instructions referring to two metadata tokens that have the same sequence of characters return precisely the same string object (a process known as "string interning").

The CLR internally maintains a hashtable-like structure called the intern pool, which contains an entry for each unique literal string as the key and the memory location of the string object as the value. When a string literal is loaded, the CLR checks whether an entry for it is already present in the intern pool. If it exists, the CLR returns a reference to that object; otherwise it creates the string object, adds it to the pool and returns the reference. This is string interning. Its basic objective is to reduce memory usage by avoiding duplicate copies of the same string, which is safe because strings are immutable.
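The pool behaviour described above can be observed with a small sketch. It assumes the default JIT-compiled case, with all literals in the same assembly. The class and method names (InternPoolDemo and so on) are made up for illustration, while String.Intern is the framework method that looks a value up in the intern pool and returns the pooled reference.

```csharp
using System;

public static class InternPoolDemo
{
    // Two identical literals: both are loaded with ldstr and share one pooled object.
    public static bool LiteralsShareObject()
    {
        string a = "sankarsan";
        string b = "sankarsan";
        return object.ReferenceEquals(a, b);
    }

    // A string built at run time has the same characters but is a separate object,
    // because only literals are added to the pool automatically.
    public static bool RuntimeStringSharesObject()
    {
        string literal = "sankarsan";
        string built = new string("sankarsan".ToCharArray());
        return object.ReferenceEquals(literal, built);
    }

    // String.Intern returns the pooled reference for the given value.
    public static bool InternedStringSharesObject()
    {
        string literal = "sankarsan";
        string built = new string("sankarsan".ToCharArray());
        return object.ReferenceEquals(literal, string.Intern(built));
    }

    public static void Main()
    {
        Console.WriteLine(LiteralsShareObject());        // True
        Console.WriteLine(RuntimeStringSharesObject());  // False
        Console.WriteLine(InternedStringSharesObject()); // True
    }
}
```

So a string created at run time takes part in the pool only when String.Intern is called on it explicitly.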

But this can have a negative performance impact as well. The additional hashtable lookups have a cost, and moreover the interned strings are not released while the app domain is alive; according to the String.Intern documentation they are unlikely to be released until the CLR itself terminates. So they will occupy memory even if they are no longer used.

We can try to turn off string interning by adding the following attribute to the assembly:

[assembly:CompilationRelaxations(CompilationRelaxations.NoStringInterning)]

But it is up to the CLR whether to honor this attribute; it may or may not do so. If a native image is compiled using Ngen.exe, however, the attribute is taken into account.
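For reference, here is a minimal sketch of where the attribute sits in a source file. The attribute and the enum live in the System.Runtime.CompilerServices namespace; the class name RelaxationDemo is made up for illustration. Since the runtime is free to ignore the hint, no particular output is assumed.

```csharp
using System;
using System.Runtime.CompilerServices;

// Assembly-level attributes go after the using directives and before any type declaration.
[assembly: CompilationRelaxations(CompilationRelaxations.NoStringInterning)]

public static class RelaxationDemo
{
    // Same check as in the first example, wrapped in a method.
    public static bool LiteralsShareObject()
    {
        string s1 = "sankarsan";
        string s2 = "sankarsan";
        return object.ReferenceEquals(s1, s2);
    }

    public static void Main()
    {
        // The result depends on whether the runtime honors the attribute.
        Console.WriteLine(LiteralsShareObject());
    }
}
```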

String interning is not specific to the CLR; it is also present in languages such as Java and Python.

Posted: Thursday, December 25, 2008 1:39 AM by sankarsan

Comments

Abu Ismail said:

Very interesting topic... very good observation too... simplyy thee goood..
# December 26, 2008 12:30 PM

Sunil Punjabi said:

A real insight on memory usage in .Net.
# January 2, 2009 6:53 AM

sankarsan said:

Thanks Abu.

Thanks Sunil.

# January 3, 2009 6:17 AM