Rachel Pierson, Work In Progress: Tools for Assessing Software Developers

It’s been a while since I last wrote on the subject of how to hire great software developers and weed out any applicants that aren’t experienced enough for the more senior positions within your team. Given the advent of new tools that are available to conduct such interviews, I felt it was worth updating my previous advice on the subject.

Skype is probably the single biggest game-changer in technical recruiting in recent years. Particularly if distance is an issue, using Skype to conduct interviews is a no-brainer.

Previously, phone screens were the de facto best way of carrying out an initial sift of shortlisted candidates. And to be honest they were never that good a predictive indicator. What’s different about Skype is that, provided the candidate in question has an IDE at home (and most experienced developers do), you can use it to quickly screen candidates’ coding ability. There’s nothing like seeing someone actually using an IDE right from your very first ‘meeting’ to get a feel for whether the experience they profess to have on their CV actually translates into meaningful skills that they’re capable of applying to realistic business problems.

Skype allows you and the candidate to see one another. For the hirer, that enables you to get feedback from any non-verbal cues about their interest in the job and aptitude for same. It also allows you to screen-share, so you can see what they’re typing in real time in their IDE. In those respects, Skype is even better than trying to conduct a similar process in person, because you don’t need to crowd around a laptop screen or use a projector to be able to see them at work.

So, by all means don’t rule any interesting CVs out on the mere grounds that the applicant doesn’t have a webcam, a development setup at home, or a fast enough internet connection to facilitate a video call. But if they do have those assets available it makes it much easier to confirm their ability in a matter of minutes, before either party has invested any great amount of time in the process.

The second biggest innovation in recent years, in my opinion, is GitHub. It’s always been desirable for candidates to provide code samples as a means of demonstrating their skill. However, previously you could never be sure that any work submitted was a candidate’s own. Most candidates are honest. Just occasionally, however, you’d identify someone who had provided an impressive ‘code sample’, but who it later transpired couldn’t programme a tenner out of a cash machine. Wherever they had plagiarised such samples from, it was clear that they didn’t actually understand them themselves. (Such antics are quite probably how this guy here got his job.) It’s a waste of both parties’ time if you only discover this fact when you sit down in front of a laptop at interview and ask the candidate to take you through their solution, only to find they can’t explain the first thing about how it works or why certain design choices have been made.

GitHub aids candidates’ credibility by being a freely-available online source control solution that identifies the author of each piece of content submitted. Not only can you freely download any complete solutions that have been placed there, but you can see the individual check-ins that went into producing each solution and the thought processes indicated by the comments associated with them. If you know what you’re looking at, those fine details tell you much more about a candidate than a mere CV full of buzzwords and all the glowing references in the world ever could. And unlike copying a whole solution you didn’t write yourself, forging a convincing history of the individual check-ins that go into making up a complete solution takes far more effort than most plagiarists would bother with.

With GitHub, you can also confirm a demo project’s creation date. This is important. Do you ever get the impression that candidates’ CVs are merely re-wordings of your job spec? This is in some ways understandable, and arises from the fact that the standard advice jobseekers are given is to tailor their CVs to highlight relevant experience. But still, as a hiring manager you sometimes would prefer to see what a candidate felt their own strengths were, before they knew what you were actually looking for. GitHub gives you that insight. If you’re looking for someone that has experience in Technology ‘X’, being able to see that they’ve completed a project using that technology some months before your particular requirement even came up is a pretty convincing demonstration that the candidate actually does know what they’re talking about when it comes to the subject concerned*.

(* That said, outside of specialist contracting roles, where you do expect new hires to hit the ground running from day 1, hiring software developers should rarely if ever merely be about hiring a particular skillset. It’s always better to instead hire for aptitude and attitude, and train for skill when you need to. Because new technologies come up all the time, and it’s no good hiring one-trick ponies that are incapable of keeping up with constantly-emerging technologies. Or, worse still, people that may be gifted as individuals but whose personality problems render them unsuitable for teamwork. You can teach people with the right aptitude and temperament almost any technical skill they need to know. The best ones will be capable of constantly improving themselves. But you can’t teach them not to try and use their one golden hammer to solve every single problem they come across. And you can’t teach them not to be an arrogant control freak that alienates their peers.)

The above are great ways to identify talent. That said, I know from working with a great many talented software developers over the years that a lot of them don’t have the time to work on open source projects on GitHub whilst they’re fitting a family life around being great assets to their existing employer. And some of them live in places where the internet connection is slow, making Skype a difficult option.

So, for people for whom Skype and GitHub aren’t options, there is a Plan ‘B’ you can use. A less-preferable secondary approach that also works is to conduct an initial phone screen using a stock list of questions. I’m loath to suggest an undue correlation between merely knowing the answers to some coding trivia questions and actual meaningful ability as a software developer. One is merely knowledge, the other is a demonstration of actual intelligence. However, there are just some basic things that you should know about any language or technology you profess to be proficient in, and that knowledge can be used as a baseline check if need be.

E.g., for a junior level C# developer, I’d expect them to know:

Q. What is an Abstract Class?

A. One that is only intended to be used as the base for other classes, but that may not be directly instantiated in and of itself.
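A minimal sketch of what that answer looks like in code. The Animal and Dog names are purely illustrative and not part of the question; any base/derived pair would do.

```csharp
using System;

// An abstract class: a base for other classes that cannot be instantiated itself.
public abstract class Animal
{
    public string Name { get; }

    protected Animal(string name)
    {
        Name = name;
    }

    // Abstract member: every concrete subclass must supply an implementation.
    public abstract string Speak();

    // Concrete member: inherited as-is by every subclass.
    public string Introduce()
    {
        return Name + " says " + Speak();
    }
}

public sealed class Dog : Animal
{
    public Dog(string name) : base(name) { }

    public override string Speak()
    {
        return "Woof";
    }
}

public static class AbstractClassDemo
{
    public static void Main()
    {
        // Animal a = new Animal("Generic"); // would not compile: Animal is abstract
        Animal pet = new Dog("Rex");
        Console.WriteLine(pet.Introduce()); // prints "Rex says Woof"
    }
}
```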

Q. Tell me three basic variable types in the common type system?

A. Take your pick from any of the built-in types, e.g. int, bool, string, double or char.

Q. What are the scopes you may use to limit Field/Property visibility, and to what extent do they make these aspects of a class visible?

A. Public, Private, Protected, Internal and Protected Internal.
(NB: I wouldn’t fault anyone for failing to name that last as a distinct scope in its own right, whose limit is a combination of that afforded by ‘Protected’ and ‘Internal’.)
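For the second half of that question (how far each scope reaches), a small sketch may help. The Account and SavingsAccount classes and their members are hypothetical, and everything here is assumed to live in a single assembly, which is why the Internal and Protected Internal members are reachable from the demo code.

```csharp
using System;

public class Account
{
    public string Owner = "Ann";             // any code that can see Account
    private decimal balance = 100m;          // only code inside Account
    protected string Branch = "Leeds";       // Account and classes derived from it
    internal int AuditCount = 0;             // any code in the same assembly
    protected internal string Notes = "n/a"; // same assembly OR derived classes

    public decimal Balance { get { return balance; } }
}

public class SavingsAccount : Account
{
    public string Describe()
    {
        // Branch (protected) and Notes (protected internal) are reachable here
        // because SavingsAccount derives from Account. The private 'balance' is not.
        return Owner + " @ " + Branch + " (" + Notes + ")";
    }
}

public static class AccessDemo
{
    public static void Main()
    {
        var acct = new SavingsAccount();
        Console.WriteLine(acct.Owner);      // public: fine
        Console.WriteLine(acct.AuditCount); // internal: fine, same assembly
        Console.WriteLine(acct.Notes);      // protected internal: fine, same assembly
        // Console.WriteLine(acct.Branch);  // would not compile: protected
        // Console.WriteLine(acct.balance); // would not compile: private
        Console.WriteLine(acct.Describe()); // prints "Ann @ Leeds (n/a)"
    }
}
```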

The key thing is that there are no trick questions here that would require knowledge of obscure parts of the .NET Framework. Candidates may or may not happen to have used certain discrete parts of the 4000-plus namespaces in the .NET Framework, but good developers could easily look up and utilise any part of the Framework if they needed to with only a couple of hours’ research. Asking about the features of a specific namespace is therefore pretty meaningless. The questions above instead just concern basic, core features of the C# language. Anyone that has used C# at all should be reasonably expected to be aware of them.

Questions like these don’t help you identify whether someone is a great developer or not. Seeing how candidates write actual code using a real IDE is the only thing that enables you to do that. These questions are purely intended as a baseline negative check to help you identify any manifestly-unqualified candidates where the other preferred means of confirming ability mentioned earlier are unavailable.

For more senior C# developers, I’d expect them to know more advanced, but still core, features of the language. E.g.:

Q. What is a Delegate?

A. A type-safe pointer to a function (strictly, an object that holds a reference to one or more methods). They’re used in lots of ways, but a typical example would be to assign a handler to an event.
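The event-handler case mentioned in that answer might be sketched as follows. MessageHandler, Publisher and the other names are made up for illustration; in real code you would more often use the built-in EventHandler or Action types than declare your own delegate.

```csharp
using System;

// A delegate type: any method taking a string and returning void matches it.
public delegate void MessageHandler(string message);

public class Publisher
{
    // The event exposes the delegate so that subscribers can attach handlers.
    public event MessageHandler MessageSent;

    public void Send(string message)
    {
        // Copy to a local, then invoke every attached handler (if there are any).
        MessageHandler handler = MessageSent;
        if (handler != null)
        {
            handler(message);
        }
    }
}

public static class DelegateDemo
{
    public static string LastSeen = "";

    private static void Record(string message)
    {
        LastSeen = message;
    }

    public static void Main()
    {
        var publisher = new Publisher();
        publisher.MessageSent += Record;                              // a named method
        publisher.MessageSent += m => Console.WriteLine("Got: " + m); // a lambda

        publisher.Send("hello");     // runs both handlers, in subscription order
        Console.WriteLine(LastSeen); // prints "hello"
    }
}
```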

Q. What’s the difference between a class and a struct?

A. The first is a Reference type, the second is a Value type. This affects where they’re stored in memory, and how assignment between variables of the types concerned works.
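The assignment difference is the part candidates most often get muddled on, so here is a short sketch of it. PointStruct and PointClass are illustrative names only.

```csharp
using System;

public struct PointStruct   // value type
{
    public int X;
}

public class PointClass     // reference type
{
    public int X;
}

public static class ValueVsReferenceDemo
{
    public static void Main()
    {
        var s1 = new PointStruct { X = 1 };
        var s2 = s1;   // copies the whole value; s1 and s2 are now independent
        s2.X = 99;

        var c1 = new PointClass { X = 1 };
        var c2 = c1;   // copies only the reference; both names refer to one object
        c2.X = 99;

        Console.WriteLine(s1.X); // prints 1: the original struct is untouched
        Console.WriteLine(c1.X); // prints 99: the change is visible through c1
    }
}
```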

Q. Can you explain the principles of object oriented design?

A. I’m looking for them to be able to succinctly tell me what Inheritance, Abstraction, Polymorphism and Encapsulation are.

For a Lead Developer or Architect, I’d expect them to be able to speak meaningfully about:

Can you describe some Design Patterns? (E.g., please explain what Singleton is. What is the Decorator pattern? Tell me about a time when you used them.)

What are your thoughts on Inversion of Control / Dependency Injection? What about Test Driven Development? Do you always use them on every solution?* If not, what criteria do you use when deciding whether to expend the additional effort? What are the limitations of IoC? Which of the 22 plus frameworks that presently exist have you encountered on live projects?
(* FWIW, I personally believe that using these presently-fashionable methodologies and techniques on every single project is about as misguided as never using them.)

What is an abstract class?*
(* The observant will notice that this last question is the same question used for junior developers. It’s amazing how many Architects can recite high-level summaries of chapters from the Gang of Four, but have lost touch with how coding actually works in the trenches. It gets more difficult as your career develops to keep in touch with the front line, but my personal belief is that you can only lead great developers if you actually share their pain by hitting a keyboard yourself once in a while. You certainly shouldn’t exhibit any signs of Hero Syndrome or micro-managerial tendencies by needing to be involved in writing every line of code yourself, and you shouldn’t try to do developers’ thinking for them. You need to entrust and empower those you lead by allowing them the freedom to get on with any tasks you delegate to them using their own skill. However, it is important to implement a particular feature yourself every so often, purely to keep your own skills current in an ever-changing technical landscape. Otherwise you only lose touch with emergent technologies. A clear sign that you aren’t getting enough personal keyboard time is when you begin to lose the basic knowledge that even junior developers working under you are expected to possess.)

For any one topic that I consider myself experienced enough to assess others in, I have a list of about 200 such questions that represent basic knowledge I’d expect most people to know at each level. During an initial phone screen, selecting two or three such questions as baseline checks is the next best alternative to using Skype or GitHub to assess whether there’s any potential.

I wouldn’t lose sleep over anyone getting any one individual question wrong. (Especially if they’re honest enough to admit they don’t know a particular fact. The very best people show awareness of things they don’t presently know, whilst less skilled individuals are often paradoxically unaware of their own current limitations. That inability to perceive their own present weaknesses leads to them failing to ever improve. This is known as the Dunning-Kruger Effect.) I still prefer actually seeing a person code using Skype, GitHub or even YouTube to using coding trivia as an initial screening tool, but a phone screen using basic questions to eliminate candidates is the next best option for the initial sift of candidates that invariably apply to almost any openly-advertised technical position. You can apologise to the ones that find it ridiculously easy afterwards, and explain the reasoning behind your using such simple baseline checks.

Skype and GitHub are better options because they represent positive checks for ability, whilst asking baseline questions is merely a negative check to identify the absence of basic knowledge. However, if a candidate can’t answer any of the simple baseline questions appropriate to their level of seniority, that’s clearly someone that you won’t take forward to interview.

For anyone that attends an in-person interview, I’d always recommend seeing them code using an actual IDE. (If you’ve seen them do so via Skype previously, obviously you can skip this step). The best way to do this is to attach a projector to a laptop that’s loaded up with a full IDE and an internet connection, and watch them work. I once had a hiring manager tell me that they used pen and paper coding exercises instead “because they didn’t want the candidate to have access to Intellisense, and all those other ‘cheats’ that a full IDE provides”. No, I don’t understand the logic behind that one either. I found myself wondering if they’d ask a prospective master carpenter to bang in nails wearing a blindfold, and decide from how swollen their thumbs were afterwards which was the ‘best’ at their craft.

Just like when you’re using Skype, you can record candidates’ efforts to build a quick solution using free tools like CamStudio recorder if you like. That approach can be very useful if you work in a large organisation and have a wider selection committee that will need to review the interview later on. It can also feel a little like an unfriendly interrogation, though, so you need to decide what’s right for your own organisational culture. Personally, I’d only record a coding test if there were a need to show the recording to other members of your recruitment panel afterwards. And I would explain to the candidate that the purpose was to save them having to demonstrate their ability multiple times to different people.

It’s important to make clear that the problem you’re asking them to solve constitutes realistic work, but not real work on an actual business problem. The first activity is a meaningful test of their skill. The second would merely represent unpaid work, and that would risk making you look like a freeloader. One problem I’ve seen used in the past and that I thought was a pretty fair baseline check read something like this:

“Design a system that allows you to model shapes as objects. Each shape should be capable of outputting a text description of itself. The description given in each case will be:

‘I am a _________. I have ____ sides, and ____ corners. My colour is ______. Their lengths are _______.’

There will be appropriate Properties in any classes you use to model such shapes to store the information to be supplied in the blanks in the above description.

You can implement this solution using any UI you like. Have specific classes that describe the shapes ‘triangle’, ‘square’, ‘rectangle’ and ‘circle’”

A developer should be able to come up with a simple design that has a base (possibly abstract) class that provides any shared Properties like colour, numSides, etc. They can either implement a Method in that abstract class to allow a string description to be output, or they can override the default ToString method. Classes describing the specific shapes requested should be inherited from this base. Extra points for having the perception to make appropriate properties/fields read only in more specific classes (i.e., you don’t want consumers to be able to create a triangle with four sides). Points too for using inheritance where appropriate (e.g., realising that a square is just a more specific instance of a rectangle). Nothing too taxing, and no trick questions or tasks that would take an unreasonable amount of time. Just a simple problem to allow developers to show that they can clear a FizzBuzz-level bar.
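One possible shape for such an answer is sketched below. It is only one of many reasonable designs, not a model solution. The property names are illustrative, the console stands in for "any UI you like", and treating a circle as having one side (its circumference) and no corners is an assumption the exercise itself doesn't settle.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public abstract class Shape
{
    protected Shape(string colour) { Colour = colour; }

    public string Colour { get; }

    // Read-only for consumers: each concrete shape fixes these itself,
    // so nobody can build a triangle with four sides.
    public abstract string Name { get; }
    public abstract int Sides { get; }
    public abstract int Corners { get; }
    public abstract IReadOnlyList<double> SideLengths { get; }

    // Fills in the blanks of the description given in the exercise.
    public override string ToString()
    {
        string lengths = string.Join(", ",
            SideLengths.Select(l => l.ToString(CultureInfo.InvariantCulture)));
        return string.Format(
            "I am a {0}. I have {1} sides, and {2} corners. My colour is {3}. Their lengths are {4}.",
            Name, Sides, Corners, Colour, lengths);
    }
}

public class Triangle : Shape
{
    private readonly double[] lengths;
    public Triangle(string colour, double a, double b, double c) : base(colour)
    {
        lengths = new[] { a, b, c };
    }
    public override string Name => "triangle";
    public override int Sides => 3;
    public override int Corners => 3;
    public override IReadOnlyList<double> SideLengths => lengths;
}

public class Rectangle : Shape
{
    private readonly double[] lengths;
    public Rectangle(string colour, double width, double height) : base(colour)
    {
        lengths = new[] { width, height, width, height };
    }
    public override string Name => "rectangle";
    public override int Sides => 4;
    public override int Corners => 4;
    public override IReadOnlyList<double> SideLengths => lengths;
}

// A square is just a rectangle whose width and height are equal.
public class Square : Rectangle
{
    public Square(string colour, double side) : base(colour, side, side) { }
    public override string Name => "square";
}

public class Circle : Shape
{
    private readonly double[] lengths;
    public Circle(string colour, double radius) : base(colour)
    {
        lengths = new[] { 2 * Math.PI * radius }; // assumption: one side, the circumference
    }
    public override string Name => "circle";
    public override int Sides => 1;
    public override int Corners => 0;
    public override IReadOnlyList<double> SideLengths => lengths;
}

public static class ShapeDemo
{
    public static void Main()
    {
        var shapes = new Shape[]
        {
            new Triangle("blue", 3, 4, 5), new Rectangle("green", 2, 3),
            new Square("red", 2), new Circle("yellow", 1)
        };
        foreach (Shape shape in shapes) Console.WriteLine(shape);
    }
}
```

What matters in an interview is less the exact class layout than whether the candidate can explain why they chose it, e.g. why the side count is fixed by the subclass rather than passed in by the caller.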

As this is a blog about assessment tools, it’s worth mentioning ‘online’ tests like ProveIT, Brain Bench, and Codility. These ‘tests’ fall into two main categories:

Tests that attempt to assess ability based on being able to instantly-recall knowledge of obscure parts of particular frameworks.

Tests that try to assess an actual ability to write code, but not using an actual IDE.

My opinion on using obscure trivia to assess problem-solving ability is well-documented. I’m with Einstein on this one, who when asked what the speed of sound was once said that:

“[I do not] carry such information in my mind since it is readily available in books. ...The value of a college education is not the learning of many facts but the training of the mind to think.” *

[ * New York Times, 18 May 1921 ]

I don’t consider memorising a lot of obscure and easily-obtainable facts to be a good indicator of programming ability. Nor do I consider not being able to recall such facts at will to be an indicator of a lack of ability. Developers have Google and reference books available on the job. I’m therefore only concerned with testing those aspects of a developer’s ability that those tools can’t provide.

That leaves those online ‘tests’ that attempt to assess coding skill, such as Codility. There’s nothing wrong with the basic idea of getting candidates to write code as a demonstration of their existing ability and potential. However, there’s a big difference between writing code using an actual IDE, and attempting to write code using a web browser (which is how Codility works). In a real IDE, you have Intellisense, code snippets, meaningful object navigation (e.g., if you place the caret on the usage of a class or property in Visual Studio and use the F12 key, it’ll take you to where that class/property is implemented), colour coding of keywords and objects, compilation checking as you type, etc, etc. Codility advocates believe that because that assessment tool has a “compile solution now” button at the bottom of the browser window that amounts to the same thing. It simply doesn’t. Going back to my earlier analogy about inappropriate ways to assess carpentry skills, you’ve merely gone from using a blindfold to asking the candidate to wear sunglasses in a dimly-lit room.

Codility tests run in a web browser

The main problem with Codility et al, however, is simply this. They don’t give you anything that you don’t also get by watching a candidate solve a real problem using a real IDE. Because of this, you invariably find that these tools are preferred by interviewers that don’t possess skills in the language concerned themselves. Such interviewers don’t use an IDE / laptop with a projector approach, because they simply wouldn’t understand what it was they were looking at. By using Codility instead, they’re generally looking for an ‘easy’ way to understand whether a given solution is ‘right’ or ‘wrong’, without having to go to the trouble of understanding why such a value judgement has been arrived at themselves. Good candidates are aware of this, and the best of them will be concerned that if you only understand how good they are because some automagically-marked test tells you what to think, how are you going to be able to fairly assess their performance on the actual job in the absence of such feedback?

Everyone knows that good interviews are a two-way street. Candidates are assessing you and your organisation just as you are assessing them. Sending a signal that you don’t understand what it is that they do can damage your credibility and your employer/manager brand considerably. So, if you’re not technical yourself (and some managers aren’t), I’d generally recommend instead asking one of your existing staff that you trust to be able to make a meaningful assessment of a candidate's ability to accompany you when assessing candidates’ technical fit.

A second problem with Codility, in my opinion, is that solving discrete problems using technology in the real world rarely works in such black and white terms as a solution being ‘more’ or ‘less’ right than other approaches. There are generally a great many ways to satisfy any one problem. Which one(s) is/are ‘correct’ is all about context. Tests that focus on an overly-narrow set of criteria when determining success may not always identify the best candidate, even if they identify someone that produces the fastest solution, or the one that uses the fewest (or most) lines of code to solve a problem. E.g., if someone were to use the line 123 << 4 to get the result 1968 instead of writing 123 * 16, that might be the genius you need to optimise nanoseconds on calculations within the firmware for a graphics card, or they might just be That One Guy that writes unreadable code that produces hard-to-find bugs. (Mostly, though, they’ll just be someone that doesn’t realise low-level arithmetic optimisations like bitwise operators are largely meaningless in languages like C#, where high-level code is converted at compile time into optimised MSIL before being converted into even more optimised machine code specific to the hardware it’s running on.)
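To make the shift-versus-multiply example concrete, here is a small sketch showing that the two forms give the same answer. The method names are invented for illustration, and nothing here measures speed; it only shows that the readable version costs you nothing in correctness.

```csharp
using System;

public static class ShiftDemo
{
    // Shifting left by 4 bits multiplies by 2^4 = 16, but hides the intent.
    public static int TimesSixteenShift(int value)
    {
        return value << 4;
    }

    // Says exactly what it means.
    public static int TimesSixteenMultiply(int value)
    {
        return value * 16;
    }

    public static void Main()
    {
        Console.WriteLine(TimesSixteenShift(123));    // prints 1968
        Console.WriteLine(TimesSixteenMultiply(123)); // prints 1968
    }
}
```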

You can try Codility for yourself, and I'd strongly recommend that you do so if you're considering using it to fairly assess candidates. It's not enough just to get someone else to look at the test for you, unless you ask your chosen guinea pig to work under the exact same time constraints as candidates will be asked to work to. That also means they only get one shot at the test, just like candidates.

In the interests of debunking The Emperor's New Code, when I tested Codility out as an assessment tool I found that I didn't produce a 100% solution myself first time in the time allowed. I therefore felt it'd be unfair to ask candidates to do something that I myself couldn't.

I doubt that many people could produce an 'optimal' result in the timeframe allowed, particularly when you don't get to see the criteria that will be deemed to constitute an 'optimal' solution before submitting your answer. When they only have a short window to think about the problem, candidates will be inclined to focus on providing a solution that works rather than one that shaves milliseconds off of the runtime. And even where candidates do provide an 'optimal' solution, there doesn't seem to be much allowance for readability in the simplistic percentage score returned.

I suspect that most 100% results that users might see from this tool may be best explained by the fact that there are many solutions to the tests published online, and some candidates will be inclined to copy one of those.

This deliberately-obscure and unreadable solution scores 100%


This shorter and more readable solution also scores 100%

My overall conclusion: companies that let computer algorithms select the best people to work for them rather than the other way round may well be disappointed by the results.

Tuesday, 16 July 2013
