SLaks.Blog

Making the world a better place, one line of code at a time

Dissecting the new .Net Reference Source Browser

Posted on Monday, February 24, 2014, at 6:30:00 PM UTC

The new .Net Reference Source Browser (see my previous post) is an excellent example of one of the less-obvious uses of the new Roslyn toolchain. Kirill Osenkov used Roslyn to recreate Visual Studio's source browser experience in a standalone webapp, allowing people to browse the .Net Reference Source from anywhere. In this post, I will explore how this app is implemented.

The source browser is generated by a conversion tool that uses Roslyn to parse every source file in the codebase and generate a massive collection of HTML files comprising the full browser. Almost all of the work is done statically at build time; the only server-side code is the search engine. This makes it much faster to use, at the cost of consuming hundreds of thousands of files and a couple of gigabytes of disk space.

Generating output files

The converter tool loads every project file in the source release into a Roslyn workspace. It then traverses every file in every assembly found in the workspace to generate the HTML source pages. As I mentioned last time, some of the projects reference assemblies that do not have source available (either because they're written in managed C++ or for licensing reasons). For these projects, the converter uses Roslyn's generated Metadata As Source files (the same ones produced by Visual Studio's Go to Definition feature).

It runs each source file through the same Roslyn syntax highlighter used by VS itself, converting the resulting classification spans to HTML to generate the basic source code.
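To make the span-to-HTML step concrete, here is a minimal sketch. It does not use Roslyn itself: the (Start, Length, Kind) tuples are a hypothetical stand-in for Roslyn's classified spans, the ToHtml helper and the CSS class names are my own inventions, and the spans are assumed to be sorted and non-overlapping. The real converter almost certainly differs in the details.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text;

// Hypothetical helper: wraps each classified span in a <span class="...">
// and HTML-encodes everything, including the unclassified text between spans.
// Assumes the spans are sorted by Start and do not overlap.
static string ToHtml(string source, IEnumerable<(int Start, int Length, string Kind)> spans) {
    var html = new StringBuilder();
    int position = 0;
    foreach (var span in spans) {
        // Emit any unclassified text (whitespace, punctuation) before this span
        html.Append(WebUtility.HtmlEncode(source.Substring(position, span.Start - position)));
        html.Append("<span class=\"").Append(span.Kind).Append("\">")
            .Append(WebUtility.HtmlEncode(source.Substring(span.Start, span.Length)))
            .Append("</span>");
        position = span.Start + span.Length;
    }
    // Emit whatever follows the last span
    html.Append(WebUtility.HtmlEncode(source.Substring(position)));
    return html.ToString();
}

string code = "int x = 1;";
var classified = new[] { (0, 3, "keyword"), (8, 1, "number") };
Console.WriteLine(ToHtml(code, classified));
// prints: <span class="keyword">int</span> x = <span class="number">1</span>;
```

In the real tool, the spans would come from Roslyn's classification service rather than a hand-written array.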

Symbol Navigation

For the navigation features, each member is assigned a unique identifier, based on the standard doc comment ID (the identifiers used in compiled XML doc comment files). These identifiers are then hashed (using the first half of the MD5 to save space), and compiled into a giant index file (separate for each assembly) mapping the hash to the source file defining the member.
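The hashing step might look something like the sketch below. The GetSymbolHash name is hypothetical, and I am assuming that the doc comment ID is encoded as UTF-8 before hashing and that the result is written as lowercase hex; the converter's actual encoding and formatting may differ, so hashes produced by this code are not guaranteed to match the ones on the site.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: hashes a doc comment ID with MD5 and keeps only the
// first half of the digest (8 of its 16 bytes) as 16 lowercase hex digits.
// Assumption: the ID is UTF-8 encoded before hashing.
static string GetSymbolHash(string docCommentId) {
    using (var md5 = MD5.Create()) {
        byte[] digest = md5.ComputeHash(Encoding.UTF8.GetBytes(docCommentId));
        return BitConverter.ToString(digest, 0, 8).Replace("-", "").ToLowerInvariant();
    }
}

// Doc comment IDs use a prefix for the member kind (T: for types, M: for methods, etc.)
Console.WriteLine(GetSymbolHash("M:System.String.Substring(System.Int32)"));
```

Keeping half of the digest gives a 16-character identifier, the same length as the hash in the URLs used by the site.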

The definition of each symbol then gets this hash in its id="", so that adding the hash to the fragment in the URL to the source file will jump to the definition. Some identifiers are the definition for more than one symbol (e.g., type declarations are also the definitions of their default constructors); for these, the identifier is wrapped in an additional <span> tag so that it can carry two separate IDs.

JavaScript code in the index file looks up the hash from its URL fragment and redirects to the actual source file. Thus, you can navigate to a URL like http://referencesource-beta.microsoft.com/mscorlib/a.html#266f59a804f72937 and end up at the actual definition of the symbol with that hash. This results in much shorter URLs than including the full path to the source file in the URL, and also allows other people to build URLs without knowing which file a symbol is defined in (for example, this is how I built Ref12).

The converter runs each source file through the binding phase of the Roslyn compiler to get a semantic model mapping each source token to the symbol it refers to. This model is used to wrap each symbol reference in an HTML hyperlink pointing to its hash in the index file, so that clicking the link triggers the redirect described above.

Similarly, it generates a static HTML file for each symbol containing a list of references to that symbol (each a hyperlink to a line of source). The converter wraps each member definition in a link to this file to power the Find All References feature.

For members defined in the same assembly, it generates links pointing directly to the source file defining the member; this saves the step of redirecting to the index file. For members defined in other assemblies, it links to a.html, so that assemblies aren't tightly coupled to other projects' source code.
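The rule for choosing a link target can be sketched as follows. Everything here is illustrative: GetLinkTarget is a hypothetical name, the file name system/string.cs.html is made up, the hash is just the one from the example URL earlier in this post, and the URL shapes are assumptions modeled on that example rather than the converter's verified output.

```csharp
using System;

// Hypothetical sketch of the link-target rule described above.
static string GetLinkTarget(string currentAssembly, string targetAssembly, string definingFile, string hash) {
    if (currentAssembly == targetAssembly) {
        // Same assembly: link straight to the file that defines the member
        return definingFile + "#" + hash;
    }
    // Other assembly: go through that assembly's a.html index, which redirects by hash
    return "/" + targetAssembly + "/a.html#" + hash;
}

Console.WriteLine(GetLinkTarget("mscorlib", "mscorlib", "system/string.cs.html", "266f59a804f72937"));
// prints: system/string.cs.html#266f59a804f72937
Console.WriteLine(GetLinkTarget("System", "mscorlib", "system/string.cs.html", "266f59a804f72937"));
// prints: /mscorlib/a.html#266f59a804f72937
```

The indirection through a.html means that each assembly's output only needs to know the names of the assemblies it references, not their file layout.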

Search

The search engine is the only piece of server-side code in the entire browser. It's implemented using an ASP.Net Web API project, powered by the D.txt file for each assembly. Typing into the search box sends an AJAX request to http://referencesource-beta.microsoft.com/api/symbols/?symbol=text, which finds all symbols matching the search query and replies with a set of HTML links to the definitions (through the a.html index mentioned above, so that the server doesn't need to know where the results are defined in source).

Occasionally, you might see a search come back with "Index is being rebuilt"; this means that the API AppDomain was unloaded, so it needs to read the index data back into memory before it can perform any searches.

Icons

The icons used in the Reference Source Browser are the same icons used by Visual Studio itself. These are a numbered set of 236 different icons, and can be found at http://referencesource-beta.microsoft.com/content/icons/0.png (through 235.png). The names of these icons come from the StandardGlyphGroup and StandardGlyphItem enums in Microsoft.VisualStudio.Language.Intellisense.dll. The first 190 icons come in groups of 6; one icon for every possible accessibility (including both FamOrAssem, which corresponds to C#'s protected internal, and FamAndAssem, which is not supported by C#). You can run the following code (which requires a reference to Microsoft.VisualStudio.Language.Intellisense) to download all of the icons with the correct names:

using System;
using System.IO;
using System.Net;
using Microsoft.VisualStudio.Language.Intellisense;

const string baseUrl = "http://referencesource-beta.microsoft.com/content/icons/";
string targetDir = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.MyPictures), "Visual Studio Glyphs");
Directory.CreateDirectory(targetDir);

// Reuse a single WebClient for every download, and dispose it when done
using (var client = new WebClient()) {
    foreach (StandardGlyphGroup g in Enum.GetValues(typeof(StandardGlyphGroup))) {
        if (g == StandardGlyphGroup.GlyphGroupUnknown) continue;

        // Values whose names don't start with GlyphGroup are standalone icons
        if (!g.ToString().StartsWith("GlyphGroup")) {
            client.DownloadFile(
                baseUrl + (int)g + ".png",
                Path.Combine(targetDir, (int)g + " - " + g.ToString().Replace("Glyph", "") + ".png"));
            continue;
        }

        // Each group has six consecutive icons, one per StandardGlyphItem
        for (int i = 0; i < 6; i++) {
            int index = (int)g + i;
            client.DownloadFile(
                baseUrl + index + ".png",
                Path.Combine(targetDir, index + " - " + g.ToString().Replace("GlyphGroup", "") + "-" + ((StandardGlyphItem)i).ToString().Replace("GlyphItem", "") + ".png"));
        }
    }
}

Categories: .net, c#, roslyn, reference-source
