What is an "anonymized" module?

What I'm doing here is adding a bit of obfuscation to the modules produced by the Roslyn compiler, to make it less obvious how to reverse-engineer a program. Specifically, I change the file name from, for example, Merlinia.CommonClasses.MArrays.dll to Merlinia0633.dll. The same change is made to the internally-recorded AssemblyName and ModuleName properties.

In addition, I "censor" some of the information that comes from the AssemblyInfo.cs file. For example, instead of the assembly description specified there, the recorded text simply says "Merlinia module".
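One way to check the result of both changes is to read the recorded names and description back with reflection. The following is only a sketch: the class and method names are made up for illustration and are not part of my Roslyn modifications, and it uses nothing beyond standard System.Reflection calls.

```csharp
using System;
using System.Reflection;

// Hypothetical helper for inspecting a compiled module; not part of the Roslyn changes.
static class ModuleInspector
{
    // Reads the assembly name recorded in the file's metadata without loading the assembly.
    public static string RecordedAssemblyName(string path)
    {
        return AssemblyName.GetAssemblyName(path).Name;
    }

    // Summarizes the recorded assembly name, module name and description of a loaded assembly.
    public static string Describe(Assembly assembly)
    {
        string description =
            assembly.GetCustomAttribute<AssemblyDescriptionAttribute>()?.Description;

        // ScopeName is the module name stored in metadata, as opposed to the file name.
        return assembly.GetName().Name + " | " +
               assembly.ManifestModule.ScopeName + " | " +
               (description ?? "(no description)");
    }
}
```

For an anonymized library, RecordedAssemblyName() should return something like Merlinia0633 rather than the original name, and Describe() should show "Merlinia module" as the description if the censoring worked.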

Two modules for the price of one!

As well as wanting to (optionally) produce an anonymized module for shipping to customers I still want to produce a "normal" non-anonymized module for use in internal development. I do this by modifying the Roslyn compiler to produce not just one, but two output modules for each compilation, at least when the option to produce anonymized modules is turned on. So when I compile my MArrays library assembly, I get both Merlinia.CommonClasses.MArrays.dll and Merlinia0633.dll from the one Roslyn build.

In the rest of this article I'll omit all the excruciating detail of the previous articles, and just show some of the highlights of how this was done. In the unlikely event that anyone is actually interested in seeing all of the modifications, I've decided to try to publish everything on GitHub a bit later, after I figure out how to do that.

Note: The code shown below is somewhat obsolete. A newer version is available for download - see this article.

src\Compilers\Core\Portable\Compilation\Compilation.cs

This file is in the CodeAnalysis project. In the Visual Studio Solution Explorer it can be found under CodeAnalysis - Compilation.

This is where the code to produce an additional, anonymized, output module is located, in method SerializeToPeStream(), which was at line 2399 in the revision of Roslyn I was working with.

    //Yacks04:
    Stream anonPeStream = null;
    Stream anonSigningInputStream = null;

These two lines were inserted at line 2440, just before a large try-catch block.

    //Yacks04: If requested, also create an "anonymized" PE module
    (anonPeStream, anonSigningInputStream) = CreateAnonymizedModule(moduleBeingBuilt,
        peStream, metadataDiagnostics, metadataOnly, includePrivateMembers, deterministic,
        emitTestCoverageData, privateKeyOpt, cancellationToken);

This call statement was added at line 2506, just before the catch clause.

    //Yacks04: Strong name sign both the non-anonymized module and maybe also the
    // optional anonymized module
    if (!SignPeModule(signingInputStream, peStream, diagnostics))
        return false;
    if (!SignPeModule(anonSigningInputStream, anonPeStream, diagnostics))
        return false;

This code is at line 2539, just before the finally clause. It replaces some code that was moved to an added source file - see below.

    //Yacks04:
    anonPeStream?.Dispose();
    anonSigningInputStream?.Dispose();

This was added at line 2953, just inside the end of the finally clause.

src\Compilers\Core\Portable\Compilation\Compilation.Yacks.cs

As promised above, this is the source file that was added to the CodeAnalysis project, and contains the code called from the above modifications.

    // Copyright (c) Merlinia A/S. All Rights Reserved. Licensed under the Apache License,
    // Version 2.0. (Just to be compatible with the Microsoft Roslyn license.)

    using System;
    using System.Diagnostics;
    using System.IO;
    using System.Security.Cryptography;
    using System.Threading;
    using Microsoft.CodeAnalysis.Emit;
    using Roslyn.Utilities;

    namespace Microsoft.CodeAnalysis
    {
        /// <summary>
        /// This source file contains some code that has been added to the Roslyn Compilation class. But
        /// to keep the inline added code to a minimum the added code is placed in methods in this source
        /// file and called from the main Compilation.cs source file.
        /// </summary>
        partial class Compilation
        {
            // Reference to a data object containing some Yacks-related information. This field should be
            // copied from one instance of Compilation (actually, CSharpCompilation) to the next one for
            // at least some of the situations where CSharpCompilation is recreated due to it being
            // immutable. If this field is null it implies no Yacks-related modifications are in force.
            public Yacks_CompilationData YacksData { get; set; }

            /// <summary>
            /// Method to create an "anonymized" PE module if requested, in addition to the non-anonymized
            /// PE module that has just been created.
            ///
            /// The anonymized module is created in a sub-directory "Anonymized", relative to the location
            /// where the non-anonymized module was created. Finding this location is based on assuming
            /// that the peStream object used to write the non-anonymized module is actually a FileStream,
            /// and that it contains the full path for the disk file.
            /// </summary>
            private (Stream, Stream) CreateAnonymizedModule(CommonPEModuleBuilder moduleBeingBuilt,
                Stream peStream, DiagnosticBag metadataDiagnostics, bool metadataOnly,
                bool includePrivateMembers, bool deterministic, bool emitTestCoverageData,
                RSAParameters? privateKeyOpt, CancellationToken cancellationToken)
            {
                FileStream peFileStream = peStream as FileStream;
                if (YacksData == null || !YacksData._AnonymizeModule || peFileStream == null)
                    return (null, null);

                string peFilePath = Path.GetDirectoryName(peFileStream.Name);
                string anonPeFilePath = Path.Combine(peFilePath, Yacks_CompilationData.CAnonymized);
                Directory.CreateDirectory(anonPeFilePath);  // In case first time

                // Rename library module output (.dll) to anonymous file name, while leaving executable
                // assembly (.exe) file name unchanged
                string peFileName = Path.GetFileName(peFileStream.Name);
                string peFileNameExtension = Path.GetExtension(peFileName).ToUpperInvariant();
                Debug.Assert(peFileNameExtension == ".DLL" || peFileNameExtension == ".EXE");
                string anonPeFileName = peFileNameExtension == ".EXE" ?
                    peFileName : YacksData.GetModuleName() + ".dll";

                EmitStreamProvider anonPeStreamProvider = new SimpleEmitStreamProvider(
                    FileUtilities.CreateFileStreamChecked(File.Create,
                        Path.Combine(anonPeFilePath, anonPeFileName),
                        Yacks_CompilationData.CAnonymized));

                Stream anonPeStream = null;
                Stream anonSigningInputStream = null;

                Func<Stream> getAnonPeStream = () =>
                {
                    Stream selectedStream;
                    (anonPeStream, anonSigningInputStream, selectedStream) =
                        GetPeStream(metadataDiagnostics, anonPeStreamProvider, metadataOnly);
                    return selectedStream;
                };

                YacksData._EmittingAnonymized = true;

                SerializePeToStream(
                    moduleBeingBuilt,
                    metadataDiagnostics,
                    MessageProvider,
                    getAnonPeStream,
                    null,
                    null,
                    null,
                    null,
                    metadataOnly,
                    includePrivateMembers,
                    deterministic,
                    emitTestCoverageData,
                    privateKeyOpt,
                    cancellationToken);

                return (anonPeStream, anonSigningInputStream);
            }

            /// <summary>
            /// Method to strong name sign a PE module. This is copied from the original code in method
            /// Compilation.SerializeToPeStream() in the interests of DRY, since it is now needed twice,
            /// first for the non-anonymized module and maybe also for the optional anonymized module.
            /// </summary>
            private bool SignPeModule(Stream signingInputStream, Stream peStream,
                DiagnosticBag diagnostics)
            {
                if (signingInputStream != null && peStream != null)
                {
                    Debug.Assert(Options.StrongNameProvider != null);

                    try
                    {
                        Options.StrongNameProvider.SignStream(StrongNameKeys, signingInputStream,
                            peStream);
                    }
                    catch (DesktopStrongNameProvider.ClrStrongNameMissingException)
                    {
                        diagnostics.Add(StrongNameKeys.GetError(StrongNameKeys.KeyFilePath,
                            StrongNameKeys.KeyContainer,
                            new CodeAnalysisResourcesLocalizableErrorArgument(
                                nameof(CodeAnalysisResources.AssemblySigningNotSupported)),
                            MessageProvider));
                        return false;
                    }
                    catch (IOException ex)
                    {
                        diagnostics.Add(StrongNameKeys.GetError(StrongNameKeys.KeyFilePath,
                            StrongNameKeys.KeyContainer, ex.Message, MessageProvider));
                        return false;
                    }
                }

                return true;
            }
        }
    }
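To make the file naming rule in CreateAnonymizedModule() easier to follow, here is a stand-alone sketch of just the path calculation. It is an illustration, not the code used in the compiler: the class and method names are invented, the "Anonymized" constant is assumed to match Yacks_CompilationData.CAnonymized, and the anonymous module name is passed in as a parameter instead of coming from YacksData.GetModuleName().

```csharp
using System;
using System.IO;

// Hypothetical stand-alone version of the output path logic; names are illustrative only.
static class AnonymizedPathSketch
{
    // Assumed value of Yacks_CompilationData.CAnonymized.
    const string AnonymizedDirectory = "Anonymized";

    // Given the path of the normal output module, returns the path of the anonymized copy.
    public static string GetAnonymizedPath(string peFilePath, string anonymousModuleName)
    {
        string directory = Path.Combine(
            Path.GetDirectoryName(peFilePath) ?? "", AnonymizedDirectory);
        string fileName = Path.GetFileName(peFilePath);

        // Executables keep their file name, libraries get the anonymous name.
        bool isExecutable = string.Equals(
            Path.GetExtension(fileName), ".exe", StringComparison.OrdinalIgnoreCase);

        return Path.Combine(directory,
            isExecutable ? fileName : anonymousModuleName + ".dll");
    }
}
```

With this sketch, a library written to bin\Merlinia.CommonClasses.MArrays.dll would get an anonymized copy at bin\Anonymized\Merlinia0633.dll, while an executable would keep its own file name inside the Anonymized sub-directory.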

src\Compilers\Core\Portable\PEWriter\MetadataWriter.cs

This source file is also part of the CodeAnalysis project in Roslyn. In the Visual Studio Solution Explorer it can be found under CodeAnalysis - PEWriter.

Several modifications were made here to modify or remove the data emitted to the "anonymized" module. These modifications call methods in an added source file - see below.

        //Yacks04: Anonymize the assembly name if applicable
        string s = module.Name;
        if (EmittingAnonymized())
        {
            Debug.Assert(s.StartsWith("Merlinia"));  // Check no other assembly is involved
            s = module.CommonCompilation.YacksData.GetModuleName();
        }

        metadata.AddAssembly(
            flags: flags,
            hashAlgorithm: sourceAssembly.HashAlgorithm,
            version: sourceAssembly.Identity.Version,
            publicKey: metadata.GetOrAddBlob(sourceAssembly.Identity.PublicKey),
            //name: GetStringHandleForPathAndCheckLength(module.Name, module),
            name: GetStringHandleForPathAndCheckLength(s, module),
            culture: metadata.GetOrAddString(sourceAssembly.Identity.CultureName));
    }

This is at the end of the PopulateAssemblyTableRows() method, which was at line 1962 for the revision of Roslyn that I was working with.

    private void AddCustomAttributeToTable(EntityHandle parentHandle,
        ICustomAttribute customAttribute)
    {
        //Yacks04: Omit emitting the Code Analysis attributes if building an "anonymized" module
        if (OmitCodeAnalysisAttributes(customAttribute))
            return;

        metadata.AddCustomAttribute(
            parent: parentHandle,
            constructor: GetCustomAttributeTypeCodedIndex(customAttribute.Constructor(Context)),
            value: GetCustomAttributeSignatureIndex(customAttribute));
    }

A couple of lines of code were added to the start of the AddCustomAttributeToTable() method, which was at line 2151.

        //Yacks04: Anonymize the module name if applicable
        string s = module.Name;
        if (EmittingAnonymized())
            s = module.CommonCompilation.YacksData.GetModuleName();

        metadata.AddModule(
            generation: this.Generation,
            //moduleName: metadata.GetOrAddString(this.module.ModuleName),
            moduleName: metadata.GetOrAddString(s),
            mvid: mvidHandle,
            encId: metadata.GetOrAddGuid(EncId),
            encBaseId: metadata.GetOrAddGuid(EncBaseId));
    }

This is at the end of the PopulateModuleTableRow() method, which was at line 2676.

    private void SerializeCustomAttributeSignature(ICustomAttribute customAttribute,
        BlobBuilder builder)
    {
        var parameters = customAttribute.Constructor(Context).GetParameters(Context);
        var arguments = customAttribute.GetArguments(Context);
        Debug.Assert(parameters.Length == arguments.Length);

        //Yacks04: Anonymize some of the assembly attributes if building an "anonymized" module
        if (AnonymizeAssemblyAttribute(customAttribute))
        {
            arguments = ImmutableArray.Create(new MetadataConstant(
                arguments[0].Type, Yacks_CompilationData.CMerliniaModule));
        }

This is at the start of the SerializeCustomAttributeSignature() method, which was at line 3378.

src\Compilers\Core\Portable\PEWriter\MetadataWriter.Yacks.cs

This is another source file which has been added to the CodeAnalysis project.

    // Copyright (c) Merlinia A/S. All Rights Reserved. Licensed under the Apache License,
    // Version 2.0. (Just to be compatible with the Microsoft Roslyn license.)

    using System.Collections.Immutable;
    using System.Diagnostics;
    using System.Reflection.Metadata;
    using System.Reflection.Metadata.Ecma335;
    using Microsoft.CodeAnalysis;

    namespace Microsoft.Cci
    {
        /// <summary>
        /// This source file contains some code that has been added to the Roslyn MetadataWriter class.
        /// But to keep the inline added code to a minimum the added code is placed in methods in this
        /// source file and called from the main MetadataWriter.cs source file.
        /// </summary>
        internal partial class MetadataWriter
        {
            /// <summary>
            /// Method to test for a Code Analysis attribute, and if so to indicate that it can be omitted
            /// from the output module if an "anonymized" module is currently being built. This is done on
            /// the assumption that they are not necessary, and because they may contain unobfuscated
            /// method parameter names
            /// </summary>
            private bool OmitCodeAnalysisAttributes(ICustomAttribute customAttribute)
            {
                if (!EmittingAnonymized())
                    return false;

                string attributeName = GetAttributeName(customAttribute);
                if (attributeName == null)
                    return false;

                return attributeName == "SuppressMessageAttribute";
            }

            /// <summary>
            /// Method to test for one of the AssemblyInfo attributes that should be "censored" to avoid
            /// providing information about the purpose of the module. This is only done when building an
            /// "anonymized" module.
            /// </summary>
            private bool AnonymizeAssemblyAttribute(ICustomAttribute customAttribute)
            {
                if (!EmittingAnonymized())
                    return false;

                string attributeName = GetAttributeName(customAttribute);
                if (attributeName == null)
                    return false;

                return attributeName == "AssemblyTitleAttribute" ||
                       attributeName == "AssemblyDescriptionAttribute" ||
                       attributeName == "AssemblyProductAttribute";
            }

            /// <summary>
            /// Method to get and check AttributeData.Name, given an ICustomAttribute object.
            /// </summary>
            /// <returns>AttributeData.Name, or null if something wrong</returns>
            private static string GetAttributeName(ICustomAttribute customAttribute)
            {
                AttributeData attributeData = customAttribute as AttributeData;
                if (attributeData == null)
                    return null;

                Debug.Assert(attributeData.AttributeClass != null &&
                             attributeData.AttributeClass.Name != null);
                return attributeData.AttributeClass.Name;
            }

            /// <summary>
            /// Method to test if Yacks modifications are in effect and if an "anonymized" module is
            /// currently being built.
            /// </summary>
            private bool EmittingAnonymized()
            {
                return module.CommonCompilation.YacksData != null &&
                       module.CommonCompilation.YacksData._EmittingAnonymized;
            }

            /// <summary>
            /// Method to perform some kludgy processing needed to get loading the reference to the static
            /// field "s" in the static objects that implement disguised strings to work.
            /// </summary>
            /// <returns>true = kludgy processing done, false = not disguised string reference</returns>
            private bool ProcessPossibleDisguisedStringReference(int pseudoToken,
                ImmutableArray<byte> generatedIL, ref int localOffset, BlobWriter blobWriter)
            {
                // Test if the pseudo/fake token is actually a (negative) stringy number, exit if not
                if (pseudoToken >= 0)
                    return false;

                Yacks_CompilationData yacksData = module.CommonCompilation.YacksData;
                Debug.Assert(yacksData != null);

                // Replace the ldstr opcode with ldsfld opcode in BlobWriter's data area
                Debug.Assert(ReadByte(generatedIL, localOffset - 1) == (byte)ILOpCode.Ldstr);
                blobWriter.Offset = localOffset - 1;
                blobWriter.WriteByte((byte)ILOpCode.Ldsfld);

                // Look up the fake/pseudo token for the "s" field, convert it to a handle and emit it
                int fakeToken;
                if (!yacksData._StringyNumberToFakeTokenNumber.TryGetValue(-pseudoToken, out fakeToken))
                    Debug.Assert(false, "stringy number not in dictionary");
                blobWriter.WriteInt32(
                    MetadataTokens.GetToken(ResolveEntityHandleFromPseudoToken(fakeToken)));

                localOffset += 4;
                return true;  // No further processing for this opcode
            }
        }
    }
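The opcode patching at the end of ProcessPossibleDisguisedStringReference() may be easier to picture on a plain byte array. The sketch below is a simplified illustration under my own assumptions: it uses a byte[] instead of Roslyn's BlobWriter, the class and method names are invented, and the token is passed in directly instead of being resolved from a pseudo token. The opcode values are the single-byte encodings of ldstr and ldsfld from ECMA-335.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical illustration of the ldstr -> ldsfld patch on a raw IL buffer.
static class LdstrPatchSketch
{
    const byte Ldstr = 0x72;   // ldstr <string token>
    const byte Ldsfld = 0x7E;  // ldsfld <field token>

    // operandOffset is the index of the 4-byte operand that follows the opcode byte.
    // Returns the offset just past the operand, mirroring "localOffset += 4".
    public static int PatchToLdsfld(byte[] il, int operandOffset, int fieldToken)
    {
        if (il[operandOffset - 1] != Ldstr)
            throw new InvalidOperationException("Expected ldstr before the operand.");

        // Overwrite the opcode in place, then the operand (tokens are little-endian in IL).
        il[operandOffset - 1] = Ldsfld;
        BinaryPrimitives.WriteInt32LittleEndian(il.AsSpan(operandOffset, 4), fieldToken);

        return operandOffset + 4;
    }
}
```

Both instructions take a 4-byte operand, so the patched instruction has the same length as the original and nothing after it in the method body needs to move.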

That's all for this article. As mentioned above, I'll look into the possibility of publishing all of my Roslyn modifications on GitHub eventually.
