This article is hopefully the first in a series in which I document my adventures in attempting to modify Microsoft's Roslyn compiler. My final goals are somewhat diffuse: I sort of know what I'd like to do with Roslyn, but I'm not sure if it's possible. And even if it's possible, I may end up deciding that the problems of maintaining my modified Roslyn make the whole idea infeasible.

I should perhaps mention that I have no intention of trying to get my modifications merged into the open source Roslyn project. What I intend to do with Roslyn is to make some very specialized modifications that would not be of interest generally.

There's lots of information on the internet about Roslyn, but, as far as I've determined, it's all about how to use Roslyn via the existing APIs. I have not been able to find any articles about how to modify Roslyn, perhaps because it's not something any sensible person should try to do.

Building the Roslyn Visual Studio solution

The Roslyn project is hosted on GitHub, in the dotnet/roslyn repository.

I must admit that I'm surprised and concerned that an open source project can function with over 4,000 open issues, but maybe that's just my lack of experience with large open source projects.

Anyway, I forked the project, and then downloaded it to my development PC using the desktop version of GitHub. Roslyn is huge, at least compared to the programs I've worked with.

The Roslyn repository's documentation on building, debugging, and testing on Windows explains how to download Roslyn and build it on a Windows system.

That information was not completely up-to-date when I used it (end of January 2018). For example, the information about adding workloads to the Visual Studio 2017 installation said the Modify menu could be found via an icon with three horizontal lines, but in fact it was on a drop-down menu.

And as of the end of January 2018, it was also necessary to download and install the "Microsoft .Net Core SDK 2.2.0 - Preview".

When you double-click the Roslyn.sln file and launch Visual Studio 2017 for the entire Roslyn project you will get an idea of how huge it is. There are around 178 to 190 projects (the count seems to change depending on when it is taken), around 4.6 million lines of C# and Visual Basic code, and an additional 500,000 lines that are not code.

For my purposes I didn't need to test Roslyn in an alternative instance of Visual Studio, so building the entire Roslyn solution successfully with Visual Studio was all I needed at this point.

However, I did have the major problem that building Roslyn with Visual Studio was very sluggish. Just starting Visual Studio could take about 10 minutes, and a simple build, even if nothing was changed, typically took about three to five minutes. Maybe my development PC isn't up to a project of this size, or maybe it had something to do with the files being under version control (GitHub), or maybe it had something to do with ReSharper.

A test project to run the Roslyn compiler

At this point I created a C# test project in a new Visual Studio solution. To some extent I followed the instructions in this article:

The project was created as a Console project for .Net Framework 4.6.1, i.e., not .Net Core. Here's the entire code, in file Program.cs:

    using System;
    using System.IO;

    // From the Roslyn build output
    using Microsoft.CodeAnalysis;
    using Microsoft.CodeAnalysis.CSharp;
    using Microsoft.CodeAnalysis.Emit;

    // ReSharper disable LocalizableElement

    namespace ModifyRoslynProject1
    {
        /// <summary>
        /// Program to perform a C# compilation using the modified Roslyn compiler. This is based on
        ///
        ///
        /// Copyright (c) Rennie Petersen, All Rights Reserved.
        /// Licensed under the Apache License, Version 2.0.
        /// (Just to be compatible with the Microsoft Roslyn license.)
        /// </summary>
        public static class Program
        {
            public static void Main()
            {
                try
                {
                    Console.WriteLine("Beginning Roslyn compilation.");

                    SyntaxTree syntaxTree = CSharpSyntaxTree.ParseText(File.ReadAllText(@"..\CompilerInput\Test1.cs"));

                    PortableExecutableReference mscorlib = MetadataReference.CreateFromFile(typeof(object).Assembly.Location);

                    CSharpCompilation roslynCompilation = CSharpCompilation.Create("MyCompilation",
                        syntaxTrees: new[] { syntaxTree },
                        references: new[] { mscorlib });

                    // Emitting to file is available through an extension method in the
                    // Microsoft.CodeAnalysis namespace
                    EmitResult emitResult = roslynCompilation.Emit(@"..\CompilerOutput\Test1.exe");

                    // If our compilation failed, we can discover exactly why
                    if (!emitResult.Success)
                    {
                        foreach (Diagnostic roslynDiagnostic in emitResult.Diagnostics)
                            Console.WriteLine(roslynDiagnostic.ToString());
                    }

                    Console.WriteLine("End of Roslyn compilation. Hit any key to terminate this program.");
                    Console.ReadKey();
                }
                catch (Exception e)
                {
                    Console.WriteLine(e.Message);
                    Console.ReadKey();
                }
            }
        }
    }

This requires that you add two references to modules produced by the Roslyn build.

[Screenshot: ModRos 1 Snap1]

The two files Microsoft.CodeAnalysis.dll and Microsoft.CodeAnalysis.CSharp.dll can be found in folder "E:\GitHub\dotnet\roslyn\Binaries\Debug\Dlls", assuming that "E:\GitHub\dotnet\roslyn" is where your downloaded and newly-built copy of the Roslyn solution is located.

The reference to the System.Reflection.Metadata.dll module is a bit more problematic. This is distributed by Microsoft as a NuGet package, and in fact about 100 copies of the .dll file can be found in the Roslyn projects, in several different versions. I'm so old-fashioned that I prefer to avoid use of NuGet when possible, so I found the package in the NuGet library (choosing the version that was most used in the Roslyn solution), used the "Manual download" feature, unpacked the package (it's a .zip file), found the .dll file inside the package, copied it elsewhere, and referenced it directly in the References section of my test project.

Here's what the folder structure for my test project looked like:

[Screenshot: ModRos 1 Snap2]

And here's the Test1.cs file:

    using System;

    namespace ConsoleApplication2
    {
        class Program
        {
            static void Main()
            {
                Console.WriteLine("Hello world!");
                Console.ReadKey();
            }
        }
    }

Now when I executed my test program it used the Roslyn compiler to compile the Test1.cs file, and produce a Test1.exe file in the CompilerOutput folder. And when I double-clicked on Test1.exe, guess what I got?

[Screenshot: ModRos 1 Snap3]
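As an alternative to double-clicking the generated file, the compilation can be emitted to memory and run directly, which makes it quicker to check what the compiler produced. The following is only a minimal sketch, assuming the same .Net Framework setup and the same references as the test project. The names InMemoryRun and CompileAndRun are hypothetical and are not part of Roslyn. It also assumes that the compiled source has a parameterless Main and does not call Console.ReadKey, since that would block.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Reflection;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.Emit;

// Hypothetical helper, not part of Roslyn.
public static class InMemoryRun
{
    // Compiles the given C# source, runs its entry point and returns what it wrote to the console.
    public static string CompileAndRun(string source)
    {
        SyntaxTree tree = CSharpSyntaxTree.ParseText(source);

        // Assumption: .Net Framework, where mscorlib also contains System.Console.
        MetadataReference mscorlib = MetadataReference.CreateFromFile(typeof(object).Assembly.Location);

        CSharpCompilation compilation = CSharpCompilation.Create("InMemoryTest",
            syntaxTrees: new[] { tree },
            references: new[] { mscorlib });

        using (var peStream = new MemoryStream())
        {
            // Emit the compiled image into memory instead of into a file.
            EmitResult result = compilation.Emit(peStream);
            if (!result.Success)
            {
                throw new InvalidOperationException(
                    string.Join(Environment.NewLine, result.Diagnostics.Select(d => d.ToString())));
            }

            Assembly assembly = Assembly.Load(peStream.ToArray());

            // Capture console output while the compiled Main runs.
            TextWriter originalOut = Console.Out;
            using (var captured = new StringWriter())
            {
                Console.SetOut(captured);
                try
                {
                    // Assumption: Main takes no parameters.
                    assembly.EntryPoint.Invoke(null, null);
                }
                finally
                {
                    Console.SetOut(originalOut);
                }
                return captured.ToString();
            }
        }
    }
}
```

Calling InMemoryRun.CompileAndRun with the text of a "Hello world" program should return the line that program writes, so the result can be compared in code rather than by eye.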

OK, so far, so good. Now to do something interesting.

My first modification of the Roslyn compiler

First, in order to avoid the extreme slowness of everything in the Roslyn Visual Studio solution, I added to my test solution the one Roslyn C# project that I was going to modify: "E:\GitHub\dotnet\roslyn\src\Compilers\Core\Portable\CodeAnalysis.csproj". This worked, even though the Roslyn solution uses .Net Standard projects and my test project was a .Net Framework project. I also right-clicked on the solution name, "Modify Roslyn project 1", and specified that my test project was dependent on the CodeAnalysis project, to ensure the correct build order.

For information about the .Net Standard projects, which have somewhat different .csproj files, see here:

I decided to start with something simple. After some difficulty I found the place in Roslyn where it emits the IL instruction that accesses a string constant that has been embedded in the program. It's in the file "E:\GitHub\dotnet\roslyn\src\Compilers\Core\Portable\CodeGen\ILBuilderEmit.cs" and it looked like this:

    internal void EmitStringConstant(string value)
    {
        if (value == null)
        {
            EmitNullConstant();
        }
        else
        {
            EmitOpCode(ILOpCode.Ldstr);
            EmitToken(value);
        }
    }

But not for long. Now it looks like this:

    internal void EmitStringConstant(string value)
    {
        if (value == null)
        {
            EmitNullConstant();
        }
        else
        {
            if (value == "Hello world!")
                value = "Hello universe!";
            EmitOpCode(ILOpCode.Ldstr);
            EmitToken(value);
        }
    }

So now when I ran my test solution it first re-compiled the Microsoft.CodeAnalysis module, then compiled and ran my test program.

And now when I double-clicked on the Test1.exe file I got this:

[Screenshot: ModRos 1 Snap4]

Isn't that awesome?

Actually, it's kind of silly. I don't see much use for a custom C# compiler that automatically converts "Hello world" programs into "Hello universe" programs. And I'm guessing that this kind of conversion can probably be done using the Roslyn API.
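For comparison, here is roughly how the same conversion might look using the public Roslyn API instead of a modified compiler. This is a sketch under stated assumptions, not a tested replacement: HelloRewriter and RewriteDemo are hypothetical names that I made up for illustration, while CSharpSyntaxRewriter, SyntaxFactory and the other types come from Microsoft.CodeAnalysis.CSharp. The rewriter replaces the string literal in the syntax tree before compilation.

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical rewriter, not part of Roslyn: swaps one string literal for another.
public sealed class HelloRewriter : CSharpSyntaxRewriter
{
    public override SyntaxNode VisitLiteralExpression(LiteralExpressionSyntax node)
    {
        // Only touch string literals whose value is exactly "Hello world!".
        if (node.IsKind(SyntaxKind.StringLiteralExpression) &&
            node.Token.ValueText == "Hello world!")
        {
            return SyntaxFactory.LiteralExpression(
                    SyntaxKind.StringLiteralExpression,
                    SyntaxFactory.Literal("Hello universe!"))
                .WithTriviaFrom(node); // keep surrounding whitespace and comments
        }

        return base.VisitLiteralExpression(node);
    }
}

// Hypothetical demo wrapper.
public static class RewriteDemo
{
    // Parses the source, applies the rewriter and returns the rewritten source text.
    public static string Rewrite(string source)
    {
        SyntaxTree tree = CSharpSyntaxTree.ParseText(source);
        SyntaxNode newRoot = new HelloRewriter().Visit(tree.GetRoot());
        return newRoot.ToFullString();
    }

    public static void Main()
    {
        Console.WriteLine(Rewrite("class C { string s = \"Hello world!\"; }"));
    }
}
```

The two approaches are not quite equivalent. The rewriter only sees literals as they are written in the source, whereas the change in EmitStringConstant applies to every string constant the compiler emits, which as far as I know includes constants that the compiler has folded together, such as "Hello " + "world!".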

But this is, in my opinion, a successful "proof of concept" exercise. In my next article I'll try to do something a bit more useful via modifications to Roslyn.
