Generating unique and persisted numbers for C# methods

In previous steps I described how I've modified the Roslyn compiler to generate unique numbers for C# projects (modules / assemblies), for the .NET types in these programs, and for the fields defined in those types. Now it is the turn of the methods defined in the C# classes and structs in the programs.

As usual, an important aspect of this is that for the public methods these unique method numbers are persisted in a LiteDB database, so they remain the same for every compilation. This is necessary to ensure that other programs that reference these public methods via their method numbers will still work unchanged after rebuilding selected modules.

One difference compared to the previous articles is that when it comes to generating and persisting unique numbers for the methods I can't justify what I'm doing by saying that these numbers will someday be used for some useful and altruistic purpose. No, these method numbers are only used for (optionally) anonymizing (obfuscating) the output module, and that's that.

As I've also mentioned previously, for example when I "disguised" the literal strings in the program, I have no illusions that my disguising and anonymizing endeavors will be a serious obstacle for any motivated hacker - I'm only trying to make it slightly less attractive to reverse-engineer my commercial programs.

Still, partly just as an "is it really possible?" exercise, I'm taking method signatures into account, and generating the same method number for two totally unrelated methods if they have different signatures. For example, I have a class with these five methods:

    public bool Get(byte[] byteArray, int bitIndex) { ... }
    public void Set(ref byte[] byteArray, int bitIndex, bool zeroOrOne) { ... }
    public int IndexOf(byte[] byteArray, bool zeroOrOne) { ... }
    public string ToString(byte[] byteArray) { ... }
    public List<int> ToList(byte[] byteArray) { ... }

My modified Roslyn compiler (optionally) renames all five of these methods to M0001(). This works because the overloading rules say that the five methods are different methods even though they all have the same name. Take that, hackers!

Perhaps you noticed that the last two methods have the same signature according to C# rules, which ignore the return type when comparing overloads. The .NET CLR, however, includes the return type as part of a method's signature, so for the CLR these are two different methods:

    public string M0001(byte[] byteArray) { ... }
    public List<int> M0001(byte[] byteArray) { ... }
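To make the difference between the two sets of rules concrete, here is a minimal sketch. It does not use Roslyn or reflection; the tuples and the two key-building helpers (`CSharpKey`, `ClrKey`) are hypothetical stand-ins that simply describe the five renamed methods by return type and parameter types.

```csharp
using System;
using System.Linq;

// Hypothetical descriptions of the five renamed methods (all called M0001).
var methods = new (string ReturnType, string[] Parameters)[]
{
    ("bool",      new[] { "byte[]", "int" }),
    ("void",      new[] { "ref byte[]", "int", "bool" }),
    ("int",       new[] { "byte[]", "bool" }),
    ("string",    new[] { "byte[]" }),
    ("List<int>", new[] { "byte[]" }),
};

// C# overload rules: method name plus parameter types; the return type is ignored.
string CSharpKey((string ReturnType, string[] Parameters) m) =>
    "M0001(" + string.Join(", ", m.Parameters) + ")";

// CLR metadata rules: the return type is also part of the signature.
string ClrKey((string ReturnType, string[] Parameters) m) =>
    m.ReturnType + " " + CSharpKey(m);

int distinctForCSharp = methods.Select(m => CSharpKey(m)).Distinct().Count();
int distinctForClr = methods.Select(m => ClrKey(m)).Distinct().Count();

Console.WriteLine($"Distinct under C# rules:  {distinctForCSharp}");  // 4
Console.WriteLine($"Distinct under CLR rules: {distinctForClr}");     // 5
```

Under the C# view the last two entries collide, which is why such a class could not be written in C# source, while under the CLR view all five remain distinct.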

In the rest of this article I'll just list most (but not all) of the code involved in generating and persisting these method numbers, and in using them to generate anonymized method names in the output module.

Note: The code shown below is somewhat obsolete. A newer version is available for download - see this article.

src\Compilers\Core\Portable\YacksCompilation.cs

See previous articles for previous (and now obsolete) listings of this source file. Here are some of the changes made in this iteration.

Around line 113:

    // This dictionary keeps track of the last non-persisted method number generated for the
    // non-public methods that have a certain signature. This is used to implement method overload
    // renaming, so totally unrelated methods can be given the same obfuscated name as long as
    // they have different signatures. The key is the method signature and the value is the last
    // method number generated for a method with the specified signature.
    private readonly Dictionary<string, int> _nonPersistedMethodNumbers =
        new Dictionary<string, int>();

Around line 385:

    /// <summary>
    /// Method to get a persisted method number for a method defined in a type in the current
    /// project.
    ///
    /// Unlike for fields, this is currently implemented without using any YacksMethodInfo object.
    /// </summary>
    /// <param name="typeInfo">YacksTypeInfo object for the containing type</param>
    /// <param name="methodName">name of the method</param>
    /// <param name="methodSignature">signature for the method</param>
    /// <returns>method number, 1 - 9999, or -1 if something wrong</returns>
    internal int GetOrCreateMethodNumber(YacksTypeInfo typeInfo, string methodName,
                                         string methodSignature)
    {
        // Check the method number dictionaries exist, create them if not
        if (typeInfo.MethodDictionary1 == null)
        {
            typeInfo.MethodDictionary1 = new Dictionary<string, int>();
            typeInfo.MethodDictionary2 = new Dictionary<string, int>();
        }

        // Test for the specified method already defined in a previous compilation, return its
        // number if so
        int methodNumber = typeInfo.GetMethodNumber(methodName, methodSignature);
        if (methodNumber != -1)
            return methodNumber;

        // Generate a persisted method number via the method signature and a dictionary in the
        // YacksTypeInfo object, add it to the dictionary of known method names, write the Yacks
        // metadata for this project to the disk, return method number
        methodNumber = GenerateMethodNumber(typeInfo.MethodDictionary1, methodSignature);
        typeInfo.AddMethodNumber(methodName, methodSignature, methodNumber);
        _yacksProjects.Update(_projectInfo);
        return methodNumber;
    }

    /// <summary>
    /// Method to get a method number that should not be persisted, i.e., for a non-public method,
    /// including a method in a non-public type.
    /// </summary>
    /// <returns>method number, 1 - 9999, or -1 if something wrong</returns>
    internal int GetNonPersistedMethodNumber(string methodSignature)
    {
        return GenerateMethodNumber(_nonPersistedMethodNumbers, methodSignature);
    }

    /// <summary>
    /// Method to generate a persisted or non-persisted method number via a dictionary of the
    /// method signatures vs. the last method number generated for that signature.
    /// </summary>
    private static int GenerateMethodNumber(Dictionary<string, int> methodNumbers,
                                            string methodSignature)
    {
        // Test if this method signature has been encountered before, if not add an entry to the
        // dictionary for this signature and give it method number 1
        int methodNumber;
        if (!methodNumbers.TryGetValue(methodSignature, out methodNumber))
        {
            methodNumbers.Add(methodSignature, 1);
            return 1;
        }

        // This method signature has been encountered before, so increment the method numbers used
        // for this signature by one
        methodNumber += 1;
        methodNumbers[methodSignature] = methodNumber;

        // Check method number hasn't reached max value, which should be totally impossible, would
        // require a project with well over 10000 non-public methods altogether, taking into
        // consideration that method name overloading is being used
        return methodNumber <= CMethodNumberMax ? methodNumber : -1;
    }
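The per-signature counter can be tried in isolation. The following is only a rough stand-alone sketch of the idea, not the code from the modified compiler: the local function `NextNumber` is a hypothetical simplification that leaves out the maximum-value check, and the signature strings are made up for illustration (the real ones come from Roslyn and will look different).

```csharp
using System;
using System.Collections.Generic;

// Last number handed out for each signature (the role played by the dictionaries above).
var lastNumberBySignature = new Dictionary<string, int>();

// Hypothetical simplified counter: first use of a signature yields 1, then 2, 3, ...
int NextNumber(string signature)
{
    lastNumberBySignature.TryGetValue(signature, out int last); // last is 0 if not found
    lastNumberBySignature[signature] = last + 1;
    return last + 1;
}

// Illustrative signature strings for the five methods of the example class.
int get = NextNumber("bool (byte[], int)");
int set = NextNumber("void (ref byte[], int, bool)");
int indexOf = NextNumber("int (byte[], bool)");
int toStr = NextNumber("string (byte[])");
int toList = NextNumber("List<int> (byte[])");

// A sixth method that repeats an already used signature gets the next number.
int another = NextNumber("bool (byte[], int)");

Console.WriteLine($"{get} {set} {indexOf} {toStr} {toList} {another}"); // 1 1 1 1 1 2
```

Because every signature has its own counter, the five unrelated methods all receive number 1, and only a repeated signature forces a higher number.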

src\Compilers\Core\Portable\YacksMetadata-TypeInfo.cs

This file has been listed in several previous steps. In this article I'll just list some of the changes since the last step.

    // The following two Dictionary<> collections, if they exist, implement method numbers for the
    // public methods defined in this C# class or struct.

    // This dictionary keeps track of the last method number generated for the public methods that
    // have a certain signature. This is used to implement method overload renaming, so totally
    // unrelated methods can be given the same obfuscated name as long as they have different
    // signatures. The key is the method signature and the value is the last method number
    // generated for a method with the specified signature.
    public Dictionary<string, int> MethodDictionary1 { get; set; } = null;

    // This dictionary records the method numbers generated for the public methods in this class
    // or struct. The key is the method name + ":" + the method signature, and the value is the
    // method number.
    public Dictionary<string, int> MethodDictionary2 { get; set; } = null;

    /// <summary>
    /// Method to get a persisted method number for a method defined in this class or struct.
    /// </summary>
    /// <param name="methodName">name of the method</param>
    /// <param name="methodSignature">signature for the method</param>
    /// <returns>method number, 1 - 9999, or -1 if something wrong</returns>
    public int GetMethodNumber(string methodName, string methodSignature)
    {
        int methodNumber;
        if (MethodDictionary2.TryGetValue(MethodNamePlusSignature(methodName, methodSignature),
                                          out methodNumber))
            return methodNumber;
        return -1;
    }

    /// <summary>
    /// Method to add a persisted method number for a method defined in this class or struct to
    /// the dictionary that persists this information.
    /// </summary>
    /// <param name="methodName">name of the method</param>
    /// <param name="methodSignature">signature for the method</param>
    /// <param name="methodNumber">method number, 1 - 9999</param>
    public void AddMethodNumber(string methodName, string methodSignature, int methodNumber)
    {
        MethodDictionary2.Add(MethodNamePlusSignature(methodName, methodSignature), methodNumber);
    }

    /// <summary>
    /// Method to create the keys used in MethodDictionary2.
    /// </summary>
    private static string MethodNamePlusSignature(string methodName, string methodSignature)
    {
        return methodName + ":" + methodSignature;
    }
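Why two dictionaries are needed becomes clearer with a small simulation. This is a hedged sketch only: the two local dictionaries stand in for MethodDictionary1 and MethodDictionary2, the local function `GetOrCreate` is a hypothetical condensation of the get-or-create logic, there is no LiteDB involved, and the method names and signature strings are invented for the example.

```csharp
using System;
using System.Collections.Generic;

// Stand-ins for MethodDictionary1 and MethodDictionary2; in the real code these are
// persisted between compilations, here they simply stay in memory.
var lastNumberBySignature = new Dictionary<string, int>();
var numberByNameAndSignature = new Dictionary<string, int>();

// Hypothetical condensed version of the get-or-create logic.
int GetOrCreate(string name, string signature)
{
    string key = name + ":" + signature;
    if (numberByNameAndSignature.TryGetValue(key, out int known))
        return known; // already numbered in an earlier compilation

    lastNumberBySignature.TryGetValue(signature, out int last); // 0 if signature is new
    int number = last + 1;
    lastNumberBySignature[signature] = number;
    numberByNameAndSignature.Add(key, number);
    return number;
}

// First "compilation": two public methods share a signature.
int get1 = GetOrCreate("Get", "bool (byte[], int)");
int test1 = GetOrCreate("Test", "bool (byte[], int)");

// Second "compilation": a new method with the same signature now comes first.
int has2 = GetOrCreate("Has", "bool (byte[], int)");
int get2 = GetOrCreate("Get", "bool (byte[], int)");

Console.WriteLine($"{get1} {test1} {has2} {get2}"); // 1 2 3 1
```

The name-plus-signature dictionary keeps the number for Get stable at 1 even though a new method was processed before it, which is what lets already compiled callers keep working.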

src\Compilers\Core\Portable\PEWriter\MetadataWriter.cs

This source file is part of the CodeAnalysis project in Roslyn. In the Visual Studio Solution Explorer it can be found under CodeAnalysis - PEWriter.

Compared with the previous steps one more method has been modified.

    private void PopulateMethodTableRows(int[] methodBodyOffsets)
    {
        var methodDefs = this.GetMethodDefs();
        metadata.SetCapacity(TableIndex.MethodDef, methodDefs.Count);

        int i = 0;
        foreach (IMethodDefinition methodDef in methodDefs)
        {
            //Yacks10: Anonymize the method name if necessary
            MethodAttributes methodAttributes = GetMethodAttributes(methodDef);
            StringHandle methodNameHandle = AnonymizeMethodName(methodDef, methodAttributes);

            metadata.AddMethodDefinition(
                //attributes: GetMethodAttributes(methodDef),
                attributes: methodAttributes,
                implAttributes: methodDef.GetImplementationAttributes(Context),
                //name: GetStringHandleForNameAndCheckLength(methodDef.Name, methodDef),
                name: methodNameHandle,
                signature: GetMethodSignatureHandle(methodDef),
                bodyOffset: methodBodyOffsets[i],
                parameterList: GetFirstParameterHandle(methodDef));

            i++;
        }
    }

src\Compilers\Core\Portable\PEWriter\MetadataWriter.Yacks.cs

This is a source file which has been added to the CodeAnalysis project. Several methods previously shown have been modified and several additional methods have been added.

    /// <summary>
    /// Method to either "anonymize" a method name (and the parameter list) in a C# class or
    /// struct if applicable and possible, or else to do standard processing to emit the member
    /// definition. (Some of the code in this method is copied from original code in the
    /// MetadataWriter.PopulateMethodTableRows() method.)
    /// </summary>
    private StringHandle AnonymizeMethodName(IMethodDefinition methodDef,
                                             MethodAttributes methodAttributes)
    {
        // Test if applicable to "anonymize" the method name
        (INamedTypeDefinition containingTypeDef, string namespaceName) =
            GetNamespaceName(methodDef);
        if (namespaceName != null && EmittingAnonymized(namespaceName))
        {
            // Test for a constructor method - I don't anonymize their names, even though I think
            // the CLR does accept constructor methods with arbitrary names
            if (methodDef.Name != ".ctor" && methodDef.Name != ".cctor")
            {
                return AnonymizeMethodName(methodDef, methodAttributes, containingTypeDef,
                                           namespaceName);
            }
        }

        // Method name does not get anonymized - do standard processing
        return GetStringHandleForNameAndCheckLength(methodDef.Name, methodDef);
    }

    /// <summary>
    /// Sub-method of above method to "anonymize" a method name. This makes extensive use of
    /// method name overloading based on the method signature, so two totally unrelated methods
    /// can be given the same obfuscated name as long as they have different signatures.
    /// </summary>
    private StringHandle AnonymizeMethodName(IMethodDefinition methodDef,
                                             MethodAttributes methodAttributes,
                                             INamedTypeDefinition containingTypeDef,
                                             string namespaceName)
    {
        // Get the method signature as a string
        string methodSignature = GetMethodSignatureAsString(methodDef);

        // Get the YacksTypeInfo object for the containing type. If this is not available it
        // indicates the method is defined in a non-public type.
        (YacksCompilation yacksCompilation, YacksTypeInfo typeInfo) =
            GetContainingTypeDef(containingTypeDef, namespaceName);

        // Get the persisted Yacks metadata method number for this method if possible, or for
        // non-public methods get a non-persisted method number
        bool isPublic = typeInfo != null &&
                        (methodAttributes & MethodAttributes.Public) == MethodAttributes.Public;
        int methodNumber = isPublic
            ? yacksCompilation.GetOrCreateMethodNumber(typeInfo, methodDef.Name, methodSignature)
            : yacksCompilation.GetNonPersistedMethodNumber(methodSignature);

        return GetHandleForAnonymizedMethodName(methodNumber, isPublic);
    }

    /// <summary>
    /// Method to get the namespace name for a field or method definition. This also returns a
    /// reference to the containing type definition.
    /// </summary>
    private (INamedTypeDefinition containingTypeDef, string namespaceName) GetNamespaceName(
        ITypeDefinitionMember memberDef)
    {
        INamedTypeDefinition containingTypeDef =
            memberDef.ContainingTypeDefinition as INamedTypeDefinition;
        string namespaceName =
            containingTypeDef?.AsNamespaceTypeDefinition(Context)?.NamespaceName;
        return (containingTypeDef, namespaceName);
    }

    /// <summary>
    /// Method to get the YacksTypeInfo object for the type containing a field or method
    /// definition. This also returns a reference to the containing type definition.
    /// </summary>
    private (YacksCompilation yacksCompilation, YacksTypeInfo typeInfo) GetContainingTypeDef(
        INamedTypeDefinition containingTypeDef, string namespaceName)
    {
        YacksCompilation yacksCompilation = module.CommonCompilation._YacksCompilation;
        YacksTypeInfo typeInfo = yacksCompilation.GetTypeInfo(
            FullyQualifiedTypeName(namespaceName, GetMangledName(containingTypeDef)));
        return (yacksCompilation, typeInfo);
    }

    /// <summary>
    /// Method to process emitting references to member names, i.e. field and method names.
    /// Special processing is needed if an anonymized module is being emitted, and it is
    /// referencing fields and methods in another anonymized module. (Some of the code in this
    /// method is copied from original code in the MetadataWriter.PopulateMemberRefTableRows()
    /// method.)
    /// </summary>
    private StringHandle GetMemberReference(ITypeMemberReference memberRef)
    {
        // Getting the namespace name is complicated by the possibility that the field or method is
        // defined in this project, and is a "specialized" field or method
        ITypeReference containingTypeRef1;
        ISymbol memberSymbol;
        IFieldReference fieldReference = memberRef as IFieldReference;
        IMethodReference methodReference = memberRef as IMethodReference;
        Debug.Assert(fieldReference != null || methodReference != null);
        if (fieldReference != null)
        {
            ISpecializedFieldReference specializedFieldReference =
                fieldReference.AsSpecializedFieldReference;
            if (specializedFieldReference != null)
                fieldReference = specializedFieldReference.UnspecializedVersion;
            containingTypeRef1 = fieldReference.GetContainingType(Context);
            memberSymbol = fieldReference as ISymbol;
        }
        else
        {
            ISpecializedMethodReference specializedMethodReference =
                methodReference.AsSpecializedMethodReference;
            if (specializedMethodReference != null)
                methodReference = specializedMethodReference.UnspecializedVersion;
            containingTypeRef1 = methodReference.GetContainingType(Context);
            memberSymbol = methodReference as ISymbol;
        }

        // Some common processing ...
        INamedTypeReference containingTypeRef2 = containingTypeRef1 as INamedTypeReference;
        string namespaceName = containingTypeRef2?.AsNamespaceTypeReference?.NamespaceName;

        // Test if applicable to "anonymize" the member name reference
        if (namespaceName != null && memberSymbol != null && EmittingAnonymized(namespaceName))
        {
            // Get the persisted type info for the containing type from Yacks metadata database if
            // possible
            YacksCompilation yacksCompilation = module.CommonCompilation._YacksCompilation;
            (int moduleNumber, YacksTypeInfo typeInfo) = yacksCompilation.GetTypeInfo(
                memberSymbol.ContainingModule.Name,
                FullyQualifiedTypeName(namespaceName, GetMangledName(containingTypeRef2)));
            if (typeInfo != null)
            {
                // Process fields and methods separately
                if (fieldReference != null)
                {
                    int fieldNumber = yacksCompilation.GetFieldNumber(typeInfo, memberRef.Name);
                    if (fieldNumber != -1)  // Should not be possible, but just in case ...
                        return GetHandleForAnonymizedFieldName(fieldNumber, true);
                }
                else
                {
                    int methodNumber = typeInfo.GetMethodNumber(
                        memberRef.Name, GetMethodSignatureAsString(methodReference));
                    if (methodNumber != -1)  // Should not be possible, but just in case ...
                        return GetHandleForAnonymizedMethodName(methodNumber, true);
                }
            }
        }

        // The member name has not been anonymized - do standard processing
        return GetStringHandleForNameAndCheckLength(memberRef.Name, memberRef);
    }

    /// <summary>
    /// Method to get the method signature as a string. (The Roslyn processing of method
    /// signatures results in a binary blob, and I want something that is human-readable.)
    ///
    /// The returned signature string does not indicate if the method is static or an instance
    /// method - C# does not allow both types of methods with the same signature. Similarly, the
    /// returned signature string does not indicate if "params" has been used, since C# does not
    /// allow two methods where the only difference in the signature is use of "params".
    ///
    /// Some of this code is copied from MetadataWriter.GetMethodSignatureHandleAndBlob() and
    /// MetadataWriter.SerializeReturnValueAndParameters().
    /// </summary>
    private string GetMethodSignatureAsString(IMethodReference methodReference)
    {
        StringBuilder stringBuilder = new StringBuilder();

        // Following based on code in GetMethodSignatureHandleAndBlob():
        // See here, also the comments: https://stackoverflow.com/a/49819265/253938
        ISpecializedMethodReference specializedMethodReference =
            methodReference.AsSpecializedMethodReference;
        if (specializedMethodReference != null)
            methodReference = specializedMethodReference.UnspecializedVersion;
        Debug.Assert((methodReference.CallingConvention & CallingConvention.Generic) != 0 ==
                     (methodReference.GenericParameterCount > 0));

        // Following based on code in SerializeReturnValueAndParameters():
        ISignature methodSignature = methodReference;
        ITypeReference returnType = methodSignature.GetType(Context);
        if (module.IsPlatformType(returnType, PlatformType.SystemTypedReference))
        {
            Debug.Assert(!methodSignature.ReturnValueIsByRef);
            stringBuilder.Append("System_TypedReference");  // Don't know what this implies
        }
        else if (module.IsPlatformType(returnType, PlatformType.SystemVoid))
        {
            Debug.Assert(!methodSignature.ReturnValueIsByRef);
            stringBuilder.Append("void");
        }
        else
        {
            Debug.Assert(methodSignature.RefCustomModifiers.Length == 0 ||
                         methodSignature.ReturnValueIsByRef);
            stringBuilder.Append(returnType.ToString());
        }

        stringBuilder.Append(" (");
        bool isFirst = true;
        AppendParameters(stringBuilder, methodSignature.GetParameters(Context), ref isFirst);
        AppendParameters(stringBuilder, methodReference.ExtraParameters, ref isFirst);
        stringBuilder.Append(")");

        return stringBuilder.ToString();
    }

    /// <summary>
    /// Sub-method of above method to add a collection (possibly empty) of method parameters to
    /// the signature string. If there is a parameter modifier of "ref" or "out" then this is
    /// added by default by the IParameterTypeInformation.ToString() method, based on the value of
    /// symbol.RefKind (the location of which I can't seem to find).
    /// </summary>
    private static void AppendParameters(StringBuilder stringBuilder,
        ImmutableArray<IParameterTypeInformation> methodParameters, ref bool isFirst)
    {
        foreach (IParameterTypeInformation methodParameter in methodParameters)
        {
            if (!isFirst)
                stringBuilder.Append(", ");
            stringBuilder.Append(methodParameter.ToString());
            isFirst = false;
        }
    }

    /// <summary>
    /// Method to format an anonymized method name and then to add it to the PE module #Strings
    /// stream and return the "handle" for the string. An anonymized method name is "Mnnnn" or
    /// "mnnnn", where nnnn is the persisted method number within the type or the non-persisted
    /// method number within the project, respectively.
    /// </summary>
    private StringHandle GetHandleForAnonymizedMethodName(int methodNumber, bool isPublic)
    {
        return metadata.GetOrAddString((isPublic ? "M" : "m") + methodNumber.ToString(
            "D" + YacksCompilation.CFieldNumberLength, CultureInfo.InvariantCulture));
    }
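The name formatting at the end of the listing can be checked on its own. In this small sketch the constant `numberLength` is an assumption standing in for YacksCompilation.CFieldNumberLength (taken to be 4, consistent with the "M0001" names and the 1 - 9999 range mentioned in the article), and `AnonymizedName` is a hypothetical helper that returns the string rather than a metadata handle.

```csharp
using System;
using System.Globalization;

// Assumption: stands in for YacksCompilation.CFieldNumberLength, taken to be 4.
const int numberLength = 4;

// Hypothetical helper: "M" prefix for persisted (public) numbers, "m" for non-persisted ones,
// followed by the number zero-padded via the standard "D" numeric format specifier.
string AnonymizedName(int methodNumber, bool isPublic) =>
    (isPublic ? "M" : "m") +
    methodNumber.ToString("D" + numberLength, CultureInfo.InvariantCulture);

Console.WriteLine(AnonymizedName(1, true));    // M0001
Console.WriteLine(AnonymizedName(42, false));  // m0042
```

Using different prefixes for the two cases keeps a persisted number from ever clashing with a non-persisted one that happens to have the same value and signature.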

Testing

As usual, I used a (slightly modified) library assembly Merlinia.CommonClasses.MArrays to test the modified Roslyn compiler. These screen shots (with highlighting added) show the MethodDef metadata for the non-anonymized and the anonymized versions of the MArrays assembly as displayed by JetBrains dotPeek.

[Screenshot: ModRos 10 Snap1]

[Screenshot: ModRos 10 Snap2]

To test the modified referencing I compiled a program TestDynamicBitArrays, which tests parts of the Merlinia.CommonClasses.MArrays library assembly. Both the non-anonymized and the anonymized versions of the two programs worked.

Here are a couple of screen shots that show a portion of the MemberRef metadata for the non-anonymized and the anonymized versions of the TestDynamicBitArrays program as displayed by JetBrains dotPeek.

[Screenshot: ModRos 10 Snap3]

[Screenshot: ModRos 10 Snap4]
