Developers and “bad” code

I’m quickly realising that calling code “bad” is usually a vast exaggeration.

Listening to many developers talk to me about code that they consider to be bad, it is becoming clear to me that what they mean by “bad” is not necessarily what the word implies. Instead, many developers seem to mean “I don’t understand this code”. Worse, it’s often not that they don’t understand, but that they don’t want to understand. Here are a couple of examples to explain:

A developer recently told me that he found writing LINQ one-liners to be bad practice*. After quizzing him a little, he cited several examples of LINQ that he did not instantly understand. After writing imperative versions of the same code, I came to the conclusion that the LINQ one-liners were in fact clearer than the procedural code. I relayed this to the developer in question, and indeed, he admitted that he did not find any of the procedural versions easy to understand. In short, this developer baulked at the idea that code had been compressed into one line, and did not consider that this compressed description might be easier to understand than the alternative. The sole reason was that he was not used to working with LINQ, and did not have his brain prepared for understanding LINQ statements.
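LINQ itself is C#, but the same tension exists in any language with higher-order functions. As a hedged illustration in Haskell (my own toy example, not the code from that conversation), compare a declarative one-liner with its imperative-style expansion:

```haskell
-- Declarative one-liner: sum of the squares of the even numbers.
sumEvenSquares :: [Int] -> Int
sumEvenSquares xs = sum (map (^ 2) (filter even xs))

-- The same logic written out as an explicit loop via recursion.
sumEvenSquares' :: [Int] -> Int
sumEvenSquares' [] = 0
sumEvenSquares' (x : xs)
  | even x    = x * x + sumEvenSquares' xs
  | otherwise = sumEvenSquares' xs

main :: IO ()
main = print (sumEvenSquares [1 .. 10], sumEvenSquares' [1 .. 10])
```

Whether the one-liner is clearer depends almost entirely on whether your brain is prepared for `map` and `filter` – which is exactly the point.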

Another developer recently told me that his colleague had written some bad code. He cited several reasons why he considered the code to be bad, which all seemed quite reasonable – indeed, I was convinced that the code was poor. On consulting the other developer, though, my opinion changed. The second developer explained that he too felt that the code was ugly from that point of view, but that if he had implemented it another way, it would have been more ugly from another point of view. His arguments were enough to sway me: his code was not bad; he’d just thought about the problem from another angle. Of course, one could argue that none of the devs (including myself) had thought for long enough about this code. There probably was a solution that solved both sides of the argument neatly, but at some point we have to write code and produce a working product. The key point, though, is that the original complainant had not understood the problem fully, and had therefore declared the code to be “bad” prematurely.

The key to both these problems, though, was a lack of understanding. A developer’s job is to wrap their head around a problem fast, and to understand it from all angles; in these two cases I’m not convinced that happened. In future, I’m going to treat developers telling me about “bad” code with a large pinch of salt. Instead of assuming that the code is actually bad, I will assume that the developer in question has simply not understood the reasoning behind it yet.

A second piece of this puzzle is developers’ hunt for perfection. Few developers will ever tell you that they consider any piece of code to be good – including code that they themselves have written. For any given piece of their code, a developer will typically list several things that could be improved, often in conflicting ways. This contributes heavily to the lack of understanding. A new developer on the code will not only have to understand the original developer’s reasons for designing it in a certain way, but will also have to understand where and why the original developer made ugly implementation decisions. It may simply be that there hasn’t been time yet to clean up the problems, or that there’s a trade-off involved. The fact remains, though, that this adds an extra variable to the lack of understanding.

Ultimately, what this boils down to is that no developer is happy unless they have their head well and truly wrapped around a problem. When first starting to understand another developer’s code, that is never yet the case. The result is that all too often code is declared to be “ugly”, “bad”, “messy”, or any number of other derogatory terms, when what is typically meant is “I don’t understand why this developer did this”.

My gut feeling is that this overuse actually lessens the impact of terms like “terrible” code. These terms should be reserved for code that is actually erroneous, or is inefficient to the point of being in a complexity class it clearly does not need to be in. So please, developers, stop using these terms to brand all code that you read. Instead, make constructive criticism of the code, and try to understand why it was written that way in the first place.

* LINQ is a functional-programming-inspired API that allows developers to write clear, concise “queries” to extract data, instead of complex nested loops.

Obj-C’s type system is too strong

That’s rather a surprising title, isn’t it! Objective-C has one of the weakest type systems of any language. What I’m going to demonstrate, though, is that with the addition of Objective-C’s “block” construct (really closures with a special name), Objective-C’s type system is now not only too weak for my tastes, but also too strong to do useful things!

In short, Objective-C’s type system is broken: not only does it allow lots of incorrect programs that many type systems disallow, but it also disallows a fair number of correct programs that it shouldn’t.

Blocks

Objective-C gained a really useful feature lately – the closure. We can define a closure like so:

// Define a closure that multiplies its argument by a variable 'a'.
- (void)myClosureDefiningMethod
{
    int a = 5;
    int (^timesA)(int x) = ^(int x) { return x * a; };
}

The syntax isn’t the prettiest in the world, but it mirrors C function pointer syntax, so it’s not all bad.

Higher Order Programming

The ability to create functions on the fly like this is really powerful – so much so that whole languages (like Haskell) base their programming style on doing this kind of thing everywhere. Let’s turn to Haskell, then, for inspiration about what kinds of things we might want to do with it.

The standard Haskell library (the Prelude) defines some really stunningly simple things using this technique, and the lovely thing is that they turn out to be quite useful. Let’s look at const, for example:

const :: a -> b -> a
const x y = x

So, we pass const an argument, and what we get back is a new function that ignores its own argument and returns our original one. It’s dead simple, but mega useful.
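To see why something so trivial earns its keep, here’s a small usage sketch (my own examples, not anything from the Prelude docs):

```haskell
main :: IO ()
main = do
  -- Replace every element of a list with a fixed value.
  print (map (const 0) [1, 2, 3 :: Int])    -- [0,0,0]
  -- const never looks at its second argument, so this is perfectly safe.
  print (const 5 (undefined :: Int))        -- 5
```

Any place an API wants a function but you only have a value, const bridges the gap.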

Let’s try to define the same function with Obj-C closures then:

(a (^)(b ignore))constantly(a ret)
{
    return ^(b ignore){ return ret; };
}

This looks great! We have our const function… but wait, I’ve cheated. I’ve not defined the return type of the closure, or the type of constantly’s argument, properly. What I want to be able to say, in typical C weak-typing fashion, is “any type at all”. This, although it wouldn’t specify the type very strongly, would at least allow me to use the function. Unfortunately, neither C nor Obj-C has such a type. The closest you can reasonably get is void *, and that won’t admit a whole swathe of useful types like BOOL, int, float, etc.

Exponentiation Types

A couple of colleagues of mine and I have been staring at an interesting riddle, which I’m guessing exists in the literature somewhere. One of them pointed out that we have sum types, where a + b is the type containing all the values in a and all the values in b, and we have product types, where a × b is the type containing all the values that contain both an a and a b. What we don’t have, though, are exponentiation types. The riddle, then: what is the type a^b?

Bart realised that this type is b -> a. The type contains all functions that map bs onto as. This has some rather nice mathematical properties. We know a couple of rules about exponents from our early maths:

a^b × a^c = a^(b+c)

This gives us a rather nice isomorphism: (b -> a, c -> a) is equivalent to (b + c) -> a. That is, if we have one function that produces an a from bs, another that gives us an a from cs, we can write a function that gives us as, given either a b or a c, and vice versa.
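This isomorphism is easy to witness in Haskell, using Either as the sum type (a small sketch; fromPair and toPair are names I’ve invented):

```haskell
-- a^b * a^c = a^(b+c): a pair of functions into a is the same
-- information as one function out of Either b c.
fromPair :: (b -> a, c -> a) -> (Either b c -> a)
fromPair (f, g) = either f g

toPair :: (Either b c -> a) -> (b -> a, c -> a)
toPair h = (h . Left, h . Right)

main :: IO ()
main = do
  let h = fromPair ((+ 1), (* 2))
  print (h (Left 3), h (Right 3))   -- (4,6)
```

The two functions compose to the identity in both directions, which is exactly what the isomorphism claims.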

Secondly, and perhaps even nicer:
(a^b)^c = a^(b×c)

This gives us a different isomorphism: c -> b -> a is equivalent to (b,c) -> a. Woohoo, we have currying!
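We can witness this one directly too (again a sketch with invented names; the argument order follows the equation above, so these are flipped versions of the Prelude’s curry and uncurry):

```haskell
-- (a^b)^c = a^(b*c): a function taking a pair is the same
-- information as a function that returns a function.
toCurried :: ((b, c) -> a) -> (c -> b -> a)
toCurried f c b = f (b, c)

fromCurried :: (c -> b -> a) -> ((b, c) -> a)
fromCurried g (b, c) = g c b

main :: IO ()
main = print (toCurried fst 'x' (42 :: Int))   -- 42
```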

This seems very close to the Curry–Howard isomorphism, but not quite there. Does anyone know who has discovered this already?

Collecting Non-Memory Resources

A Problem

Let us consider a small problem. We would like a Haskell program to manage resources that are not just memory; for the sake of argument, we will consider GPU resources. This can be done reasonably straightforwardly by using the IO monad to essentially write an imperative program that manages the resources. But doesn’t this defeat the point of functional programming? We lose so many of the benefits we normally get: we no longer get to describe only the result of our program, but have to describe how to get to it too. Not only that, but we’ve lost our wonderful garbage collection system, which lets us easily avoid all those nasty segfaults we see in non-managed languages. So, the problem today is: how do we extend the Haskell garbage collector (preferably without playing with the runtime or compiler) to manage all these resources?

An attempt

Let’s consider only one small subset of GPU resources – a shader. What we would like in our Haskell program is a pure value that represents the shader, which we can call on at a later date. We’d like a function that takes our shader code and produces this pure value, and we’d like the resources on the GPU to be collected when the value is no longer reachable.

import System.IO.Unsafe (unsafePerformIO)
import System.Mem.Weak (addFinalizer)

compile :: String -> String -> Shader
compile vertexShdrSrc fragShdrSrc = s
  where
    s = doCompile s vertexShdrSrc fragShdrSrc

{-# NOINLINE doCompile #-}
doCompile :: Shader -> String -> String -> Shader
doCompile s vertexShdrSrc fragShdrSrc =
  unsafePerformIO $ do
    {- Set up our fancy pants shader stuff here -}
    addFinalizer s ({- Remove resources from the GPU here -} return ())
    return s
What we hope will happen is that we return our shader, s, with a finalizer attached to it. When the garbage collector collects s, it will also collect the resources off the GPU. This all looks rather good, so let’s try using it:

myShader :: Shader
myShader =
  compile "some vertex shader source"
          "some fragment shader source"

The result of evaluating myShader is a constant use of s: the definition of this constant is looked up and replaces it, so myShader is now defined as the right-hand side of s. Unfortunately, there’s now nothing that points at s itself, so it is garbage collected, its finalizer runs, and all our resources are removed from the graphics card.

Conclusion

We’ve tried to find a way of getting automated collection of non-memory resources, but ultimately, not quite got there. I don’t see a way forward from this point, and would love to hear other people’s input on how this sort of management can be done.

Cabal’s default install location

Cabal’s default install location is somewhat controversial – many people seem to like the default of user installs, while many others would prefer that it matched all other software and installed globally. The assumption amongst the community at the moment is that “most” want user installs. I wanted to find out if that’s right. If you’re a Haskeller, please vote; it’ll take a lot less time than voting for a new logo :)

Bottoms

In Haskell, every type has at least one value – ⊥. The empty type is not actually empty, because it has ⊥ in it; sum types contain the sum of the two types plus one extra element, ⊥; and so on.

But we don’t always need to add this extra value. The ⊥ value of a type has one important feature: it’s the least defined value in the type. So let’s investigate the four primitive types:

Empty – as this type has no values, there’s obviously no least defined one, so we definitely need to add an extra value.
() – this type has only one value, so that value clearly is the least defined. Thus (⊥ :: ()) can be defined safely to be ().
Sum types – any value we choose in (A + B) must carry a tag to determine which of type A or B it’s in, and so cannot be the least defined value – if we choose a value in A, it carries enough information to say it’s not in B, and vice versa. Thus for sum types, we must add an extra ⊥.
Product types – assuming that we have bottom values in types A and B, we can define ⊥ for (A × B) as being (⊥ :: A, ⊥ :: B).

One can clearly make at least two choices here: either the choice Haskell makes – add bottom values everywhere – or add bottom values only where they are needed. One could argue convincingly that what Haskell does is very consistent and predictable, but interestingly, the other choice has some nice results.

The functor laws demand that fmap id x = x. A special case of this is that fmap id ⊥ = ⊥. Let’s look at this for pairs: it means that fmap id undefined = undefined, but this isn’t as lazy as we could be – we’d like to be able to return a tuple straight away, without looking at the tuple we’re given at all.

If, however, we choose to add a bottom value to a type only when needed, then bottom for tuples really is (⊥, ⊥), and we’re able to define fmap for tuples as fmap f x = (fst x, f $ snd x) without breaking the functor laws.
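A quick sketch of what this buys us (fmapLazy is my own name for this hypothetical instance):

```haskell
-- fmap for pairs in a world where ⊥ :: (a, b) is (⊥, ⊥):
-- the pair constructor comes back before the argument is inspected.
fmapLazy :: (b -> c) -> (a, b) -> (a, c)
fmapLazy f x = (fst x, f (snd x))

main :: IO ()
main =
  -- Forcing the result to weak head normal form succeeds even when the
  -- argument is undefined; the standard instance, which pattern matches
  -- on the pair, would crash here.
  fmapLazy (+ 1) (undefined :: ((), Int)) `seq`
    putStrLn "got a pair back without forcing the argument"
```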

Any more comments about this are appreciated. What other consequences does this decision have?

Dependencies are Bad

This entry’s going to be relatively short. I wanted to point out something that I find very odd.

In the software industry, we strive to eliminate code duplication. Duplicated code is duplicated effort, duplicated errors, and duplicated complexity. Yet I hear the same thing cropping up over and over again in many different projects – “I’m not going to use xyz, because it adds an extra dependency”.

Are dependencies so bad that we can’t stand to see just one extra one in the aid of reducing code duplication?

My feeling is that the reason people don’t want to do this is that the dependencies are often pretty heavyweight things. What they’re really complaining about is that “xyz” covers too much, and that they only want to use a small part of it.

In the interests of increasing code reuse, then, I’d like to suggest one simple task to everyone: release packages, release many packages, release small packages. Small packages are great, because people won’t see them as great monolithic blocks that they don’t want to depend on. Bottom line: if a package can be split into two in a reasonable way, do it! Oh, and one more thing: for the love of god, do it in an open way!