In recent years, I’ve been
involved in a number of cloud computing projects. Most recently, this included
a very enjoyable project working for a forward-looking games company based
in Glasgow. This blog post is intended to dispel some of the myths that linger
about the various technologies that enable cloud computing projects to work. The
content in this first part is primarily aimed at non-technical managers looking
to get an understanding of what the cloud can do for them. In Part 2, aimed at
a more technical audience, I’ll delve more deeply into the underlying
technologies.
“The Cloud” is one of those
buzzword phrases that’s been bandied around an awful lot. In the process, it’s
had its meaning stretched and diluted a great deal. There’s been a lot of
misinformation about what does and does not constitute a cloud computing
project or platform. Common aspects of the various definitions I’ve encountered
have included:
- Applications that
are web-based.
- The hosting of
those web applications, and the databases that underlie them, on remote
hardware that isn’t located in the same building as the development team.
- Lower hardware
maintenance costs.
- The ability to scale an application as its user base grows.
A difficulty with some of the discussion that has fallen under the “cloud” umbrella is that some or all of these qualities are also found in projects that are not true “cloud” applications, and never will be.
For the avoidance of doubt, when
I speak of cloud computing projects, I am talking specifically about projects
that encapsulate all of the following discrete qualities:
- They are web applications
that are accessible across the open internet, and are designed from the ground
up to be deployed to dedicated cloud-computing platforms. This involves
considering scalability, security and ease of deployment (discussed below) as
primary design goals. It is not simply taking an existing Java EE 6 or ASP.Net
application that was once hosted on internally-managed hardware and deploying
it to a small number of servers in a single data centre.
- Projects where
the hardware to which the above solutions are deployed is not directly managed
by the party that owns/writes the software. That is, an organisation that
deploys a solution ‘to the cloud’ typically doesn’t know or care about where
the physical server upon which their application runs resides, beyond broad
geographical considerations. So, whilst it’s often possible to choose between
“Asia”, “Europe”, “North America”, etc., when deciding roughly where your
application will be hosted, if your hardware management is any more
fine-grained than that, you are not using cloud technologies at all; you’re
simply remotely managing hardware that you are still heavily invested in
maintaining yourself.
- Solutions where you can scale your application to serve a greater number of users quickly and reliably. This typically involves a combination of leaving the management of any physical hardware to the third party from whom you purchase cloud hosting services, and an awareness within the development team of scalability issues as they apply to software design.
In Part 2 of this blog post
I’ll get into some specific technical implementation details involving one
particular set of cloud technologies: Windows
Azure and ASP.Net MVC, in
conjunction with SQL Azure. But
first, let’s have a look at some general design considerations that apply whichever
cloud platform you are using, and that should be clearly understood by
technical and non-technical managers of cloud computing projects alike:
Security
I’ve worked on a wide range of applications, used for purposes from the most
trivial you can think of to mission-critical systems that needed to work every
single time. The diverse range of problems I’ve been involved in solving
includes:
- Automating precision engineering manufacturing processes for producing delicate parts that keep satellites in orbit.
- National power utility infrastructure management.
- DV-cleared national government work.
- A national police project.
- Investment banking applications aimed at anti-money laundering.
- A system for designing custom zombies for use in online games (seriously).
All of which is to say: I fully appreciate the need for security, and I have a
wide enough grounding in applications that required it to make an informed
judgement about whether cloud technologies are sufficiently well-protected for
each of the above purposes. I get it. Really I do. (Hey, there’s nothing more
important than protecting society against the ever-present threat of a zombie
apocalypse, right?)
I suspect that most, if not all, of the public sector and banking organisations
with whom I’ve worked would be horrified at the idea of storing their sensitive
data on hardware they didn’t physically control. (Even though many
organisations in those sectors suffer serious security problems anyway, on
hardware they fully manage in ways they are more comfortable with.) There’s
something falsely comforting to the uninitiated about having physical control
of actual, touchable hardware. It’s the same illusion of security that makes
some people store their life savings under a mattress rather than putting them
in a bank for safekeeping.
As well as the psychological difficulty some organisations and managers have in
letting go of physical hardware, in Europe specifically there are some rather
ill-conceived and as yet legally untested rules concerning the processing of
data outside the EU. So if you operate there, you might be forgiven for
wondering whether you are even allowed to store sensitive customer information
on physical hardware located outside Europe, should you wish to. Like the EU
cookie law, it’s nonsense that will get over itself soon enough. But still,
vague and misguided concerns like these allow people with a predilection to do
so to spread worry and doubt about the security and legality of cloud
technologies they don’t fully understand, used to solve problems they’d rather
would just go away.
Without getting too deeply into the technical details at this juncture: it is
possible to encrypt data to a level where even the most sophisticated state and
non-state actors can’t read it. If desired, you can encrypt all of the data you
store on cloud servers, or just the parts that are particularly sensitive, such
as passwords. Implementation details aside, most of the encryption schemes in
use today rest on the same public key cryptography principle (though new
approaches can be, and are being, developed all the time). It’s the same
process that allows you to safely access your bank account online, and to make
purchases from online retailers without risk of a third party intercepting and
misusing your details. It’s safe: if it weren’t, there would be a lot more
online fraud than there is.
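To make that a little more concrete, here’s a minimal sketch (in C#, since Part 2 will be .NET-flavoured) of encrypting a sensitive field with AES before it’s stored on a cloud server. The names are illustrative, and key management, which is the genuinely hard part, is deliberately out of scope. (For passwords specifically, the usual practice is actually a one-way hash rather than reversible encryption.)

```csharp
using System.IO;
using System.Security.Cryptography;

// Minimal sketch: encrypt a sensitive field before storing it on a
// cloud server. Where the key and IV live (certificate store, key
// vault, etc.) is the important design decision, and is not shown.
public static class FieldEncryptor
{
    public static byte[] Encrypt(string plainText, byte[] key, byte[] iv)
    {
        using (var aes = Aes.Create())
        {
            aes.Key = key;
            aes.IV = iv;

            using (var encryptor = aes.CreateEncryptor())
            using (var ms = new MemoryStream())
            {
                using (var cs = new CryptoStream(ms, encryptor, CryptoStreamMode.Write))
                using (var writer = new StreamWriter(cs))
                {
                    writer.Write(plainText);
                }
                // The ciphertext is what gets persisted to the cloud database.
                return ms.ToArray();
            }
        }
    }
}
```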
Organisations that operate in the cloud include Amazon, Google, Microsoft and
the National Security Agency. So, if anyone ever tries to tell you that you
shouldn’t use a cloud solution purely on the grounds of security, I suggest you
point them at those examples and invite them to come up with a more supportable
rationalisation for their preferred approach.
Scalability
After security, this is probably the most important concern for cloud
applications. Scalability is the ability of a given application to adequately
and reliably serve the needs of its users under a diverse range of conditions.
This involves several discrete design considerations, some or all of which may
affect your project, depending on its nature:
The ability to support many concurrent users
First and foremost, your
application must have the ability to support many thousands of concurrent users
equally as well as supporting individual users in isolation. This design
consideration is very easy to overlook when you’re working on a Proof Of
Concept, where you’re mainly focused on providing features and the only people developers
need to satisfy are the rest of their peers in the development team (hopefully
augmented by some independent testers that will have the luxury of working on a
version of the system that has not yet gone live and where they are
consequently not using the system under stress). To be able to have confidence
that systems work under the stress of heavy concurrent use, it’s important to
test for that specific design goal using appropriate methods. There are various
ways to do so that typically involve using a test harness to simulate such use;
more on the technical implementation details of that in Part 2.
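For a flavour of what such a harness might look like, here’s a minimal sketch in C#. It simply fires a few hundred simultaneous requests at a test deployment and counts the failures; the URL is a placeholder, and a real load test would use a dedicated tool and measure response times and throughput, not just success counts.

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

// Minimal concurrency smoke test against a *test* deployment.
class ConcurrencySmokeTest
{
    static async Task Main()
    {
        using var client = new HttpClient();

        // Start 500 requests without awaiting them one at a time,
        // so they are all in flight concurrently.
        var requests = Enumerable.Range(0, 500)
            .Select(_ => client.GetAsync("https://test.example.com/"))
            .ToArray();

        var responses = await Task.WhenAll(requests);

        int failures = responses.Count(r => !r.IsSuccessStatusCode);
        Console.WriteLine($"{failures} of {responses.Length} requests failed.");
    }
}
```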
Considering the strengths of multi-tenancy vs single
tenancy
Most software written today tends to be built with a single end-user
organisation in mind. If that’s the type of project you’re working on, you can
dispense with this consideration altogether, since it doesn’t affect you. For
some types of application, however, the same basic application gets delivered
to multiple end-user organisations, each of which has its own user base and
subtle usage considerations. In these circumstances, an assessment must be made
of the relative benefits and drawbacks of letting different organisations share
instances of your application (known as multi-tenancy) vs giving each customer
their own instance (known as single tenancy).
There’s no ‘right’ or ‘wrong’ answer that fits every situation. However, some things to consider include: will different customers want to use different versions of your application at the same time? For example, if customer ‘A’ buys version 1 of your application, and some time later customer ‘B’ purchases your latest, improved version 2, are you going to move every customer presently on version 1 up to the latest version for free, to satisfy your newest customer’s desire to buy the latest version? And if so, will your existing customers be happy to make the move?
The answers to these questions will dictate whether you should give everyone
their own instance of your application, or attempt to cater to the needs of
multiple organisations with one shared instance.
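If you do opt for multi-tenancy, one common pattern is to resolve the tenant from the incoming request’s host name, so that a single shared instance can serve several customer organisations. Here’s a minimal sketch, written as ASP.NET Core middleware for brevity; the Tenant and ITenantStore types are illustrative rather than part of any particular framework.

```csharp
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;

// Illustrative tenant lookup: in practice this might be backed by a
// configuration file or a catalogue database.
public record Tenant(string Id, string ConnectionString);

public interface ITenantStore
{
    Tenant? FindByHost(string host);
}

// Identifies the tenant once per request, so the rest of the
// application (including data access) can act on it.
public class TenantResolutionMiddleware
{
    private readonly RequestDelegate _next;
    private readonly ITenantStore _tenants;

    public TenantResolutionMiddleware(RequestDelegate next, ITenantStore tenants)
    {
        _next = next;
        _tenants = tenants;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        // e.g. customer-a.example.com and customer-b.example.com each
        // map to their own tenant record (and hence their own database).
        var tenant = _tenants.FindByHost(context.Request.Host.Host);
        if (tenant is null)
        {
            context.Response.StatusCode = StatusCodes.Status404NotFound;
            return;
        }

        context.Items["Tenant"] = tenant;
        await _next(context);
    }
}
```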
Deployment
As new customers of your
cloud-hosted solution come on board, you’ll need to consider how you are going
to cater for providing them with the service they will be paying for. Whether
you’re going to take a multi- or single- tenancy approach is a separate
consideration. You also need to consider how
you are going to get from the point of a customer requesting the ability to use
your service, and that service being up and running. This typically involves,
but is not necessarily limited to :
- Setting up a
database to contain the end-user organisation’s information.
- Providing an
instance of the web application that is specific to the end-user organisation.
E.g., you might provide the exact same stock management solution to a
supermarket as you do to a company that makes metal parts. If you do, the
supermarket is unlikely to want to direct their customers to
www.shinymetalparts.com to check the price of milk at their local superstore.
- You don’t want to get too deeply into managing physical hardware (not having that headache is one of the advantages that cloud computing is meant to bring you). However, you may still want to take an interest in the general geographical area that your solution will be deployed to. If you acquire a customer that has a large user base in Asia, for reasons of bandwidth management you’re unlikely to want to route all the traffic to that customer’s instance of your solution via the North American cloud hub that you used to develop and test your solution.
Most importantly, as an organisation that provides a cloud-hosted Software as a
Service solution to others, you do not want to waste a great deal of
developers’ time and effort on the above matters at the time of deployment.
Planning and preparation need to be done in advance if deployment is to be
executed efficiently.
Ideally, your salespeople should be able to speak with potential new customers,
and those customers should be up and running with a minimum of fuss as soon as
a contract for service has been signed. You shouldn’t need a DBA to set up the
database, a developer to create a copy of the web application, and a tester to
make sure it all still works as intended, just to supply something to customer
‘B’ that you’ve already supplied to customer ‘A’.
Fortunately, there are approaches to deployment that involve minimal work at
deployment time. I’ll get into the technical details in Part 2, but for now
I’ll just note that there are tools which, provided they’re used correctly,
make the process as simple as running a single script that achieves all of the
above goals.
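As a small taste of Part 2: one of those tools, FluentMigrator, lets you express database schema changes as ordinary C# classes, so standing up a new customer’s database becomes a matter of running the accumulated migrations from a deployment script. A minimal sketch, with an invented table for illustration:

```csharp
using FluentMigrator;

// A FluentMigrator migration: the schema change lives in versioned
// code, so creating a new customer's database is just a matter of
// running all migrations in order from a deployment script.
[Migration(1)]
public class CreateCustomersTable : Migration
{
    public override void Up()
    {
        Create.Table("Customers")
            .WithColumn("Id").AsInt32().PrimaryKey().Identity()
            .WithColumn("Name").AsString(200).NotNullable()
            .WithColumn("CreatedOn").AsDateTime().NotNullable();
    }

    public override void Down()
    {
        Delete.Table("Customers");
    }
}
```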
In Part 2 I’ll discuss in detail how you can use a combination of PowerShell,
NAnt, and FluentMigrator to automate the deployment process. Key to the
success of these is one final piece of the puzzle…
Continuous Integration and
Version Control
The Joel Test has been kicking around for quite a while now, and whilst it is
showing its age a little, many of the savviest developers still ask questions
from it when deciding where to work. (Side note: yes, believe it or not, the
best developers do still get to choose. Even in this economy. Think of the
number of businesses that don’t use the internet or IT in some way; that’s the
number of places good developers can’t find work, which means almost every
organisation is competing with you for the best talent.) There aren’t too many
organisations still operating today, thank goodness, that don’t provide basic
tools like bug tracking and source control. Rather fewer have testing teams
that are completely independent of the development team. Fewer still ensure
quiet conditions for developers, and in my experience almost no organisation
has been capable of doing daily builds, or builds in one step, at will.
The ability to deploy easily
and at will is covered above. Related to that, however, is the consideration of
how you will support multiple versions of your solution, some of which may be
being used by different customers simultaneously. Part of the reason that most
organisations aren’t able to deploy different versions at will is that, as
noted earlier, most software today is simply written for one group of users and
will only ever be used by, and updated for, that one specific group of users.
If that’s the category your project falls into, then you don’t need to read any
further. For those organisations that produce solutions for use by more than
one customer at a time, sooner or later you’re going to have to delve into the
topic of version control and continuous integration.
Continuous Integration is the process of managing the features being developed
by your R&D / development team, and determining which version(s) of your
product will receive the new features and bug fixes that are continually being
developed.
One day your R&D team might be working on Super Duper feature ‘X’ that is
only going to be made available to new customers or ones that upgrade to your
latest version. Another day those same developers might be addressing a
critical bug that’s been discovered, the fix to which will be rolled out to
users of all versions as soon as it’s been through development and testing.
There are tools available
that automate and manage this process as a separate activity to development.
I’ll discuss one of these tools – TeamCity
– in detail in Part 2.