Humblecoder

Caution, this blog may be ironically named

Spotting a Memory Leak With WinDBG in .NET


Introduction

Lately I’ve spent a lot of time tracking down a handle leak in a .NET application that called into a DCOM object.  In the process I had a crash course in WinDBG.  Before this I was, frankly, scared of WinDBG: it was just fast scrolling text with a command interface, far from user friendly.  But I have discovered it is extremely powerful and not quite the scary monster I thought it was.

To demonstrate this I’m going to use a fictional ASP.NET app called MemoryMuncher that leaks memory with each page reload to give a quick overview of diagnosing a managed memory leak with WinDBG.

.NET Doesn’t Have Memory Leaks!

It is a common misconception that .NET cannot leak memory because it has a garbage collector.  The most common situation where it does leak memory is registering delegates with an event and never unregistering them.  Consider the following code:

private void DoSomeWork()
{
    SomeClass bigClass = new SomeClass();
    HoldOnToARef.IDoSomethingImportantMaybeAjaxy += bigClass.IHandleSomethingImportant;  
    //Some other code
}

When DoSomeWork returns, the local variable bigClass goes out of scope and that reference to the SomeClass instance is gone.  When the GC next runs it walks from its roots (thread stacks, statics and so on) looking for references to the object, and it will find one!  The one it finds is the link between the HoldOnToARef.IDoSomethingImportantMaybeAjaxy event and the delegate we registered with it, because the delegate holds a reference to its target object.  But we no longer have a reference to bigClass with which to deregister the delegate.  So the object will live until HoldOnToARef goes away, which it won’t in this case as the event is static.

You may be looking at this and thinking “I would never write code like that”, but we all have (maybe not with statics, as that’s an extreme example) with objects that live for a long time, or we have all inherited code like this.
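To see the effect without a debugger, here is a minimal console sketch. The names Publisher, Subscriber and LeakDemo are made up for illustration and are not part of MemoryMuncher. It subscribes an object to a static event, drops every other reference to it, and then uses a WeakReference to ask whether the object survived a forced collection.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical names for illustration only; not part of MemoryMuncher.
public static class Publisher
{
    public static event Action SomethingHappened;
}

public class Subscriber
{
    public void OnSomethingHappened() { }
}

public static class LeakDemo
{
    // Subscribes a fresh object and returns only a weak reference to it,
    // so the static event holds the sole strong reference afterwards.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference SubscribeAndForget()
    {
        Subscriber s = new Subscriber();
        Publisher.SomethingHappened += s.OnSomethingHappened;
        return new WeakReference(s);
    }

    // Recovers the object through the weak reference and removes its handler.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void Unsubscribe(WeakReference weak)
    {
        Subscriber s = weak.Target as Subscriber;
        if (s != null)
        {
            Publisher.SomethingHappened -= s.OnSomethingHappened;
        }
    }

    // Forces a full collection and reports whether the target is still alive.
    public static bool IsAliveAfterCollect(WeakReference weak)
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        return weak.IsAlive;
    }

    public static void Main()
    {
        WeakReference weak = SubscribeAndForget();
        Console.WriteLine("Subscribed, alive after GC: " + IsAliveAfterCollect(weak));

        Unsubscribe(weak);
        Console.WriteLine("Unsubscribed, alive after GC: " + IsAliveAfterCollect(weak));
    }
}
```

While the handler is registered the object is rooted by the static event, so the first check reports it alive. After unsubscribing I would expect the second check to report it collected, although exactly when the GC reclaims an object is up to the runtime.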

Memory Muncher is…errmm….Munching Memory

Every time a page refreshes on Memory Muncher the memory usage of the host process, w3wp.exe, rises by 5MB, and when left alone it does not go back down.  At this rate, on a 32-bit system, the process will run out of virtual address space in roughly 400 page views (assuming the default 2GB of user-mode address space), causing the app to crash and the app pool to recycle.  If it’s a 64-bit system, it will just start to really slow down the server as it grows, so no big deal there :P.
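As a rough sanity check on those numbers, the sketch below redoes the arithmetic using the loop counts from the MemoryMuncher listing at the end of this post (10 SomeClass instances per page load, 500 SomeOtherClass instances each, 1024 bytes per array). The LeakMath class is made up for illustration, and the 2GB address space figure is an assumption about a default 32-bit process.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class LeakMath
{
    // Loop counts taken from the MemoryMuncher listing.
    public const long SomeClassPerPage = 10;
    public const long SomeOtherClassPerSomeClass = 500;
    public const long BytesPerArray = 1024;

    // Assumption: default 2GB of user-mode address space for a 32-bit process.
    public const long AddressSpaceBytes = 2L * 1024 * 1024 * 1024;

    // Payload bytes leaked on every page load (array contents only).
    public static long BytesLeakedPerPage()
    {
        return SomeClassPerPage * SomeOtherClassPerSomeClass * BytesPerArray;
    }

    // Whole page views that fit before the address space is used up.
    public static long PageViewsUntilExhaustion()
    {
        return AddressSpaceBytes / BytesLeakedPerPage();
    }

    public static void Main()
    {
        Console.WriteLine("Leaked per page view: " + BytesLeakedPerPage() + " bytes");
        Console.WriteLine("Page views to fill the address space: " + PageViewsUntilExhaustion());
    }
}
```

This gives 5,120,000 bytes per page view and 419 page views. It only counts the array payloads, so object headers, the lists, the rest of the process and heap fragmentation will all bring the real figure down, which is why "about 400" is a fair estimate.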

The first thing to do is to take a process dump when the memory usage is high.  This is relatively straightforward on Server 2008+ / Vista+: go to Task Manager, find the offending process, right click and select “Create Dump File” (fig1).  If you’re on Server 2003 / XP you need to use the built-in Dr Watson tool or Process Explorer to create a dump.

fig1: The Create Dump File menu option

This should let you select a location to save the dump to or tell you where it has saved it to.

Ready, Steady, WinDBG…

Now we have the dump, let’s get WinDBGing.  WinDBG can be downloaded from here.  You will need the 32-bit or 64-bit version depending on what you want to analyse, and to analyse .NET dumps you need to do it on a machine with .NET installed.

The first thing to do is open the dump and load the SOS extension for .NET.  I normally find it best to start WinDBG elevated on Vista/2008 systems.  The SOS extension brings additional commands to WinDBG for dealing with the CLR and the managed memory model.  It is loaded by typing

.loadby sos mscorwks

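This works for .NET 2.0 through 3.5, where the runtime module is called mscorwks.  On .NET 4.0 and later the runtime module is called clr, so the command becomes .loadby sos clr.  If you’re using .NET 1.x there is a different command to load the extension.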

Next we need the symbols, which make the stack traces readable.  If this is the first time you have used WinDBG, type .symfix at the prompt.  This will set up the symbol path to use the public Microsoft symbol server.  You will need the symbols for your application as well.  The best way to do this is to gather them together in a folder; they need to be the correct version for the build you just took the dump from.  Go to the menu and select File –> Symbol File Path and then browse for the folder you put them in.  Check the reload option in that dialog then hit OK.  You should now be good to go :)

Back to Memory Muncher, and we are ready to start looking for the leak.  The first thing to do is dump the heap.  This is done by entering the command

!dumpheap

This produces a lot of output (adding the -stat switch restricts it to the per-type summary).  For Memory Muncher we ended up with:

image

There is an almost overwhelming amount of information here.  What we have is, from left to right, the method table address, the number of instances of the type, the total amount of memory used by all instances of the type in bytes and finally the type name.  I’ve highlighted the two important ones here, the number of objects for a given type and the amount of memory taken up.  The list is sorted by total memory used, with the biggest consumers at the end.

I have underlined my object that is using the most memory, which seems like a good place to start.  The questions I’m trying to answer at this point are: why do we have so many of these objects alive?  What is the object?  What is keeping it alive?  To start answering these we need a reference to just one of them.  To get this we can use !dumpheap again, but this time filtered by type, by running the following command:

!dumpheap -type MemoryMuncher.SomeOtherClass

This produces:

image

Again an awful lot of text flies past, but the important bit is the first column: this is the address of each instance on the heap.  The other columns are the method table address (this is what the runtime uses to look up type metadata) and the size of the instance in bytes.  At the bottom is a summary giving the number of matching objects and their total size, by type.

We know that the garbage collector is compacting, which generally means the lower the address, the older the object.  This is pertinent because the older objects have probably survived a few garbage collections, and as such are more likely to show us who is holding the reference that causes the leak.

So, at this point we can have a look at what the object is and its current field values by doing !do <address>.  This would help identify individual objects and what state they are in, but since there are so many it would be wiser to see who is holding a reference.  This can be done by running the following command:

!gcroot <address>

This produces:

image

This output shows the object is held by an instance of SomeClass, which in turn is held by a MyCallbackDelegate delegate.  At this point we can find these types in the code and start looking at what MyCallbackDelegate is.  The full Memory Muncher code is listed below and we can see that a handler is registered but never deregistered.  So it should be a simple fix.

using System;
using System.Collections.Generic;

namespace MemoryMuncher
{  
    public delegate void MyCallbackDelegate();  
    public class HoldOnToARef
    {
        public static event MyCallbackDelegate IDoSomethingImportantMaybeAjaxy;  
        public static void DoThatImportantThing()
        {
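            // Copy to a local so that an event with no subscribers (null) does not throw.
            MyCallbackDelegate handler = IDoSomethingImportantMaybeAjaxy;
            if (handler != null) handler();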
        }
    }  
    public class SomeClass
    {
        private List<SomeOtherClass> m_SomeList = new List<SomeOtherClass>();
        public SomeClass()
        {
            for (int i = 0; i < 500; i++)
            {
                m_SomeList.Add(new SomeOtherClass());
            }
        }  
        public void IHandleSomethingImportant()
        {
            foreach (SomeOtherClass aClass in m_SomeList)
            {
                Console.WriteLine(aClass.BigByteArray);
            }
        }
    }  
    public class SomeOtherClass
    {
        public Byte[] BigByteArray;  
        public SomeOtherClass()
        {
            BigByteArray = new Byte[1024];
            for (int i = 0; i < 1024; i++)
            {
                BigByteArray[i] = 0xA0;
            }
        }  
    }  
    public partial class _Default : System.Web.UI.Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
            for (int i = 0; i < 10; i++)
            {
                SomeClass bigClass = new SomeClass();
                HoldOnToARef.IDoSomethingImportantMaybeAjaxy += bigClass.IHandleSomethingImportant;
            }
        }
    }
}
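One possible shape for the fix is sketched below as a small console program rather than a real ASP.NET page. It assumes we can keep hold of the subscribers for the lifetime of a request and deregister them when the request ends. The Subscription wrapper and the HandlerCount helper are hypothetical additions for the demo, and SomeClass is cut down to just its handler.

```csharp
using System;
using System.Collections.Generic;

namespace MemoryMuncherFixSketch
{
    public delegate void MyCallbackDelegate();

    public static class HoldOnToARef
    {
        public static event MyCallbackDelegate IDoSomethingImportantMaybeAjaxy;

        // Demo-only helper: how many handlers are currently registered.
        public static int HandlerCount
        {
            get
            {
                MyCallbackDelegate handlers = IDoSomethingImportantMaybeAjaxy;
                return handlers == null ? 0 : handlers.GetInvocationList().Length;
            }
        }
    }

    // Cut-down stand-in for the real SomeClass.
    public class SomeClass
    {
        public void IHandleSomethingImportant() { }
    }

    // Hypothetical wrapper: subscribes on construction, deregisters on Dispose.
    public sealed class Subscription : IDisposable
    {
        private readonly List<SomeClass> subscribers = new List<SomeClass>();

        public Subscription(int count)
        {
            for (int i = 0; i < count; i++)
            {
                SomeClass bigClass = new SomeClass();
                HoldOnToARef.IDoSomethingImportantMaybeAjaxy += bigClass.IHandleSomethingImportant;
                subscribers.Add(bigClass); // keep the reference so we can deregister later
            }
        }

        public void Dispose()
        {
            foreach (SomeClass bigClass in subscribers)
            {
                HoldOnToARef.IDoSomethingImportantMaybeAjaxy -= bigClass.IHandleSomethingImportant;
            }
            subscribers.Clear();
        }
    }

    public static class Program
    {
        public static void Main()
        {
            using (Subscription request = new Subscription(10))
            {
                Console.WriteLine("During request: " + HoldOnToARef.HandlerCount);
            }
            Console.WriteLine("After request: " + HoldOnToARef.HandlerCount);
        }
    }
}
```

This prints 10 handlers during the "request" and 0 afterwards, so nothing is left rooted by the static event.  In the real page the same idea would mean keeping the SomeClass instances in a field and removing their handlers when the page is finished with them.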

Final Thoughts and Further Reading

This is a fairly noddy example that would have been caught simply by reviewing the code, and as such the dump is a quick study.  In a real world example there might be multiple types with instances in their thousands, or complex object graphs that make !gcroot’s output more cryptic.  Either way the same methodology applies: look for suspicious objects you know should have a fairly short life cycle and dig into why the reference is being held.

This is only a tiny portion of what WinDBG can do and I haven’t even begun to do it justice.  If you want to know more these are great places to start

Also, a general read for all .NET developers with lots of useful background information for this is CLR via C#.

Happy WinDBGing!
