Rachel Pierson, Work In Progress: Extension Methods

Extension Methods were introduced in the revisions to the C# and VB .Net languages that accompanied the release of Framework version 3.5 in November 2007 (that is, C# 3.0 and VB 9.0 respectively). They allow developers to augment (to extend) existing classes and interfaces with new Methods, without breaking the encapsulation of the type or interface being extended. The developer does not have to create a new type that derives from the one they wish to extend, and needs no access to, or knowledge of, its internal implementation. Used sparingly and intelligently, Extension Methods can be an extremely powerful and useful technique in a developer’s repertoire. This article contains a brief background on the advent of Extension Methods, a discussion of the syntax for implementing them, and some suggestions as to where and when best to use them.

A little background...

It’s no accident that Extension Methods were first included in the same version of the .Net Framework that also introduced LINQ. A full discussion of LINQ and the capabilities that it adds to the .Net languages is a separate topic for another time. In summary, though, and of relevance to the present topic, LINQ (Language Integrated Query) is a syntax extension of the .Net languages that allows types implementing certain existing Interfaces to be queried directly, in a native language syntax that reads very much like the SQL statements that had traditionally only been used within a DBMS such as SQL Server. Using LINQ, developers could for the first time query, sort, and otherwise work with in-memory data collections in this way. The following simple program demonstrates LINQ at work on a collection of type string[], using LINQ’s method-call form rather than its SQL-like query keywords:

using System;
using System.Linq;

namespace Demo
{
    static class Program
    {
        static void Main()
        {
            string[] _CapitalCities = new string[5] { "Edinburgh", "London", "Paris", "Rome", "Stockholm" };

            // OrderByDescending is supplied by System.Linq; the lambda selects the sort key.
            foreach (string city in _CapitalCities.OrderByDescending(c => c))
            {
                Console.WriteLine(city);
            }

            Console.ReadLine();
        }
    }
}

The above program opens a Console window and lists the five cities stored in the variable _CapitalCities in descending alphabetical order. It does this by calling a Method named OrderByDescending on an object of type string[]. If you remove the using System.Linq directive from the above program, you will find that the compiler no longer recognises a Method called OrderByDescending for objects of type string[]. So what’s going on?

The type string[] implements an existing generic interface called IEnumerable<T>, which was defined in a previous version of the .Net Framework. A new class called System.Linq.Enumerable, added to Framework 3.5 by the designers of LINQ, defines an Extension Method for IEnumerable<T> called OrderByDescending. Variables of type string[] therefore appear to acquire the new Method, but only in code files that import the System.Linq namespace with a using directive.
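
One way to see that nothing has really been added to string[] is to call the same Method through ordinary static syntax. The following is a minimal sketch rather than anything taken from the LINQ source; the class name StaticCallDemo and the variable names are my own:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Demo
{
    static class StaticCallDemo
    {
        static void Main()
        {
            string[] capitalCities = { "Edinburgh", "London", "Paris", "Rome", "Stockholm" };

            // Extension Method syntax: reads as though string[] owned the Method.
            IEnumerable<string> viaExtension = capitalCities.OrderByDescending(c => c);

            // The same call written against the static class that actually defines it.
            IEnumerable<string> viaStatic = Enumerable.OrderByDescending(capitalCities, c => c);

            // Both sequences hold the same cities in the same order.
            Console.WriteLine(viaExtension.SequenceEqual(viaStatic)); // True
        }
    }
}
```

The first form is simply a more readable way of writing the second; the compiler resolves both to the same static Method.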

LINQ uses Extension Methods in this way to add new functionality to existing Framework definitions such as IEnumerable<T>, so that objects built on those definitions are instantly usable with the new LINQ syntax. Extension Methods allowed the designers of LINQ to do this without causing breaking changes to designs that preceded LINQ, and which consumed the original, un-extended versions of those existing Framework objects.

The designers had two obvious alternatives. They could have updated all of the Interfaces that they wanted to add new functionality to. That would have been great for LINQ, but it would have broken any existing design that implemented the previous versions of those Interfaces as soon as the design was upgraded to the new version of the Framework, thereby upsetting a lot of developers and system owners. Or, they could have created an entirely new set of Interfaces, based on the originals, but with the extra LINQ goodness included. That would have needlessly added to the complexity of the .Net Framework, which already contained quite enough redundant but kept-for-reasons-of-backward-compatibility features and near-identical object definitions from earlier upgrades and improvements. It would furthermore have rendered LINQ unusable by existing objects that implemented the older, superseded Interfaces.

By using Extension Methods, the LINQ developers were able to add functionality to existing objects in a way that is only apparent within designs that use the new language feature. They avoided unnecessary breaking changes for developers who were merely upgrading their designs to the latest version of the Framework, and who had no interest in using LINQ within their legacy designs.

In the process of their work, the LINQ developers gave us a useful additional tool with which to make much the same subtle enhancements to our own established designs as they had made to the preceding version of the Framework.
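
The trade-off can be sketched with a small, entirely hypothetical interface (IShape, Rectangle, ShapeExtensions and BoundingArea are illustrative names of my own, not Framework types). Adding a member to the interface would break the existing implementer; supplying the same functionality as an Extension Method would not:

```csharp
using System;

namespace HypotheticalShapes
{
    // An interface that shipped in an earlier release and is implemented by consumers.
    public interface IShape
    {
        double Width { get; }
        double Height { get; }
        // Adding 'double BoundingArea();' here would stop Rectangle below from
        // compiling until its author implemented the new member.
    }

    // A consumer's existing implementation, written before the new functionality was planned.
    public class Rectangle : IShape
    {
        public double Width { get; set; }
        public double Height { get; set; }
    }

    // The new functionality, supplied from outside the interface instead.
    public static class ShapeExtensions
    {
        public static double BoundingArea(this IShape shape)
        {
            return shape.Width * shape.Height;
        }
    }

    static class Program
    {
        static void Main()
        {
            IShape shape = new Rectangle { Width = 3, Height = 4 };
            Console.WriteLine(shape.BoundingArea()); // 12
        }
    }
}
```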

Implementing Extension Methods within your own designs...

The syntax for implementing Extension Methods for a given class is demonstrated below. Suppose you have an existing class defined in the following way:

namespace RPPClassLibrary
{
    public class Person 
    {
        public Person()
        {
        }

        public string FirstName { get; set; }
        public string LastName { get; set; }
    }
}

If you wanted to extend the class Person to include a new Method called FullName using Extension Methods, the syntax for doing so would be as follows:

using System;
using RPPClassLibrary;

namespace RPPConsumingAssembly
{
    static class Program
    {
        static void Main()
        {
            Person aRachel = new Person() { FirstName = "Rachel", LastName = "Pierson" };
            Console.WriteLine(aRachel.FullName());
            Console.ReadLine();
        }

    }

    public static class MyExtensions
    {
        public static string FullName(this Person pExistingObject)
        {
            return pExistingObject.FirstName + " " + pExistingObject.LastName;
        }
    }
}

The above scenario, where the developer has free access to the class they are extending, isn’t a particularly good example of when Extension Methods should be used, but it is simple enough to demonstrate what the syntax for creating such a Method looks like. The code takes an existing class, Person, which resides within the namespace RPPClassLibrary, and extends it with the newly-required Method FullName. The new Method is visible only to code within the separate namespace RPPConsumingAssembly (or to code that imports that namespace); the Person class itself is unchanged. This is achieved by adding a new static class within RPPConsumingAssembly called MyExtensions. The Main Method of the class Program within RPPConsumingAssembly demonstrates using the new FullName Extension Method with an object of type Person. The output of the program above is as follows:

Rachel Pierson

To create an Extension Method for any existing class or interface, as the code snippet above demonstrates, you need to create a static class at the top level of a namespace (it cannot be nested inside another class, and cannot be generic), then add a static Method to that class for each Extension Method you wish to create. The static class can live in the namespace that consumes the Extension Methods, as above, or in any other namespace that the consuming code imports with a using directive. It doesn’t matter what you call the static class. It matters only that the first parameter of each Extension Method is prefixed with the this keyword, as shown above, i.e. (this Person pExistingObject), changing the type Person for whichever class or interface you are extending; any further parameters follow it in the usual way. The Methods need not be public, but they must be accessible to the code that calls them. You can include Extension Methods for entirely different existing types within the same top-level static class, since the type of the first parameter of each Method determines which type it applies to.
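
As a rough sketch of those rules, the following program keeps its Extension Methods in a namespace of their own, extends both a class (string) and an interface (IEnumerable<string>) from one static class, and gives each Method a second, ordinary parameter. All of the names here (HypotheticalExtensions, TextExtensions, Truncate, JoinWith) are invented for illustration:

```csharp
using System;
using System.Collections.Generic;
using HypotheticalExtensions; // without this directive the calls in Main do not compile

namespace HypotheticalExtensions
{
    // Any name will do; the class must be static, non-generic and not nested.
    public static class TextExtensions
    {
        // Extends string; maxLength is an ordinary second parameter.
        public static string Truncate(this string text, int maxLength)
        {
            if (text == null) throw new ArgumentNullException("text");
            return text.Length <= maxLength ? text : text.Substring(0, maxLength);
        }

        // Extends an interface, so every implementer (arrays, List<string>, ...) gains the Method.
        public static string JoinWith(this IEnumerable<string> items, string separator)
        {
            return string.Join(separator, items);
        }
    }
}

namespace HypotheticalConsumer
{
    static class Program
    {
        static void Main()
        {
            Console.WriteLine("Stockholm".Truncate(5));                  // Stock
            Console.WriteLine(new[] { "Rome", "Paris" }.JoinWith(", ")); // Rome, Paris
        }
    }
}
```

Note that the string.Join overload used here accepts an IEnumerable<string>, which assumes Framework 4.0 or later.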

When to use Extension Methods...

There is a school of thought that says: “Never!”. The reason for this viewpoint is that extending classes willy-nilly could lead to confusion at a later date, if the extended class is subsequently updated to contain intrinsic Methods with the same signatures as Extension Methods that external consuming classes have defined in the interim. Nothing in the compiler would alert the author of the class being revised to the fact that some consumer had previously extended it in a potentially conflicting way. What actually happens if the developer of an extended class adds a new Method with exactly the same signature as a consumer’s Extension Method is that the new internal Method takes precedence, and the Extension Method is superseded and effectively ignored by the compiler. For example, imagine if the original class in the simple example presented earlier were to be revised at some point so that it looked like this:

using System.Text; // needed for StringBuilder

public class Person
{
    public Person()
    {
    }

    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string MiddleName { get; set; }

    public string FullName()
    { 
        StringBuilder _FullName = new StringBuilder();

        _FullName.AppendFormat("{0} {1} {2}", 
            FirstName ?? "(No First Name)",
            MiddleName ?? "(No Middle Name)",
            LastName ?? "(No Last Name)");

        return _FullName.ToString();
    }
}

In this scenario, the assembly that consumes the Person class (which would still contain a now-obsolete Extension Method with the same name as the new internal Method FullName that has been added to the original Person class) would still compile, and the program would run without errors, but the output would now reflect the internal, subtly-different, definition of the FullName Method:

Rachel (No Middle Name) Pierson

The basic issue that some developers have with Extension Methods is that, by some definitions, they break the key object-oriented design principle of encapsulation. That is, they begin to impact upon the ability of objects to contain and control their own definitions, and to act as black boxes to consumers. This isn’t an entirely fair criticism since, as the above example demonstrates, the original object remains quite untouched by any external consumers implementing their own Extension Methods for their own internal consumption. There’s nothing about such Extension Methods that prevents the developer of the original object from continuing to be in full control of their object’s design, though there is certainly an onus on the developer of any Extension Methods to consider the possibility of unexpected behaviour should any object they have extended subsequently have its design changed.
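
The precedence rule described above can be reproduced in a single file. This is a condensed sketch, not the earlier listings verbatim: the namespace PrecedenceSketch and the “LastName, FirstName” format of the internal Method are assumptions of mine, chosen so that the two results are easy to tell apart:

```csharp
using System;

namespace PrecedenceSketch
{
    public class Person
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }

        // Added later by the class author: same name and parameter list as the consumer's extension.
        public string FullName()
        {
            return LastName + ", " + FirstName;
        }
    }

    public static class MyExtensions
    {
        // The consumer's older Extension Method; it still compiles, but member syntax no longer selects it.
        public static string FullName(this Person person)
        {
            return person.FirstName + " " + person.LastName;
        }
    }

    static class Program
    {
        static void Main()
        {
            Person rachel = new Person { FirstName = "Rachel", LastName = "Pierson" };

            Console.WriteLine(rachel.FullName());             // internal Method wins: Pierson, Rachel
            Console.WriteLine(MyExtensions.FullName(rachel)); // explicit static call: Rachel Pierson
        }
    }
}
```

Calling the Extension Method through its static class, as in the last line, remains possible if the consumer really does need the old behaviour.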

The encapsulation issue aside, as the historical example of the introduction of LINQ demonstrates, there are certain discrete circumstances when it can be very useful to be able to extend the existing definition of an established object, in a way that avoids creating breaking changes for existing consumers of the original object, but that still provides new functionality for new consumers of the same object. Where Extension Methods are most useful, therefore, is when you are reasonably sure that the object you’re extending won’t be subject to further internal change, and the functionality you are adding is specifically useful within the context in which you are adding it.
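
A Framework type such as DateTime is one plausible candidate under those criteria: it cannot be derived from, and its existing members are unlikely to be changed in a conflicting way. The sketch below is only an illustration; the namespace HypotheticalScheduling and the Method IsWeekend are invented names, and whether such a helper belongs in your own design is a judgement call:

```csharp
using System;

namespace HypotheticalScheduling
{
    public static class DateTimeExtensions
    {
        // Context-specific helper added to a type that cannot be modified or inherited from.
        public static bool IsWeekend(this DateTime date)
        {
            return date.DayOfWeek == DayOfWeek.Saturday
                || date.DayOfWeek == DayOfWeek.Sunday;
        }
    }

    static class Program
    {
        static void Main()
        {
            DateTime wednesday = new DateTime(2010, 7, 21);
            DateTime saturday = new DateTime(2010, 7, 24);

            Console.WriteLine(wednesday.IsWeekend()); // False
            Console.WriteLine(saturday.IsWeekend());  // True
        }
    }
}
```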

Wednesday, 21 July 2010
