A Case for Prototypes

Published 28 February 2013

Two days ago I started working on my own implementation of QuickCheck, since none of the implementations I saw for JavaScript did quite what I wanted. So, I sat down to outline the major goals of Claire:

  • Provide the means for property-based testing. For pure functionality.
  • Provide the means for random program generation. For complex, state-dependant functionality.
  • Provide core primitives and combinators for random-data generation. This way domain-specific generators can be easily derived by combining primitives.
  • Be a nice DSL for use in JavaScript.

So I started drafting the initial API that would satisfy those goals (ignoring program generation for now, since I decided to leave that for later), and decided on two basic data types:

-- | Describes a property and its specifications
type Property
  arguments    :: [Gen a]
  classifiers  :: [(a... -> Maybe b)]
  implications :: [(a... -> Bool)]
  invariant    :: (a... -> Bool)

  -- | The DSL
  satisfy  :: @Property => (a... -> Bool) -> Property
  classify :: @Property => (a... -> Maybe b) -> Property
  given    :: @Property => (a... -> Bool) -> Property


-- | Describes a way of randomly generate a series of values
type Gen a
  size   :: Int
  next   :: () -> a

-- | A semantic way of getting a Property than Property.make()
forAll :: Gen a... -> Property

I'll use these types as a basis for the rest of the article. If you don't know Haskell's type annotation syntax, it suffices to say that :: describes the type of the thing on the left, [a] describes a list of a's, and a -> b describes a function that takes a value a, and returns a value b. I've added @a => b for describing a function that, when applied to type a (as this), yields a value of type b. And the ellipsis to denote variadic functions.

Maintaining purity

First of all, I wanted the DSLs of properties to be pure. That means that, even though you'll be writing things like:

var sqrt_p = forAll( Int )
             .satisfy(function(n){ return Math.sqrt(n * n) == n })

The satisfy function should not change the Property instance that it was applied to. Instead, it should always return a new Property instance. Thus, when you write things like:

var all_positive_ints = forAll( Int )
                        .given(function(n) { return n > 0 })

var sqrt_p = all_positive_ints
             .satisfy(function(n) { return Math.sqrt(n * n) == n })

Even though satisfy modifies the invariant field of a Property, all_positive_ints and sqrt_p should be two different data structures. It could be obviously achieved by creating a new object, copying all fields of the previous object in the new one, and changing the invariant field, but prototypes make that both dead simple and efficient (in terms of processing power and memory usage).

So, turns out the implementation for this is just a simple and straight-forward one-line (the actual implementation is simpler due to both LiveScript's syntax and the fact that Property inherits from Boo's Base object):

var derive = require('boo').derive

var Property = {
  /* (...) */
  satisfy: function(f) { return derive(this, { invariant: f }) }
}
note

So, derive here is just a thin abstraction over Object.create that creates a new Object which delegates to another object (this, in the case above), and extends that new object with the given properties.

If you inspect the values of invariant, you'll get:

all_positive_ints.invariant
// (Function) => function (){
//                 throw Error('unimplemented');
//               }

sqrt_p.invariant
// (Function) => function (n){
//                 return Math.sqrt(n * n) === n;
//               }

all_positive_ints.isPrototypeOf(sqrt_p)
// (Boolean) => true

So, this is a cheap trick, and not really new (I've used it on a parser combinator library I was writing ages ago). But it's a cheap trick that gets the work done extremely well, in a simple way (both to write and to reason about) and still manages to be fairly efficient. We couldn't really ask for more ;3

The real deal

Things start getting a lot more awesome when we take a look at how Generators were modelled, specially when we take into account the combinators defined for creating new Generators. Here's the deal:

  • We want generators to follow the pure model Properties do.
  • We want combinators to be expressive, but still easy enough to write.

I could have used functions, in which case those would be relatively easy to achieve due to the higher-order facility in JavaScript, but a Generator actually encodes more than just a way of generating the next value. It also knows how to shrink a value it has generated, and how to display itself to the user in a human-readable and concise way for debugging purposes.

So, I was stuck with objects, and I still had to satisfy the two constraints above. Well, meet the redemption: Prototypes. Prototypes are a really great fit in this case because, even though all generators obey the basic Gen a structural type, they can be based in any kind of generator that the user may come up with. Since we have prototypes rather than classes, it's entirely possible to encode this complex and dynamic hierarchy as simple functions as combinators. Yes, we get all we need from combinators, and still have combinators as plain functions (that yield new Generators).

The `asGenerator' combinator

So, the first and most basic generator is the asGenerator. This is a function that takes any value a, and returns a Generator that will always yield that value (or the result of applying a function, if a is a function). On the other hand, if the function receives a generator as value, it works like the identity function. This allows us to lift a regular primitive value, like a Boolean or a String, into a generator:

var True = asGenerator(true)
True.next() // => (Boolean) true

The core Generator object inherits from Boo's Base object, so it's rather easy to write a function like asGenerator, and in fact all other generators take advantage of this sugar. The following is a slightly simplified version of asGenerator:

function asGenerator(a) {
  return generatorP(a)?   a
       : /* otherwise */  Generator.derive({
                            next: function(){ return a }
                          })
}

In a class based language, specially those that are less expressive, like Java, you would have to encode the asGenerator function as a whole new Class, making the implementation unnecessarily complicated.

The `choice' and `frequency' combinators

On top of the lifting generator, a clear second step is to provide people with a way to alternatively select between different generators to create a new value. These roles are satisfied by the choice combinator, which takes several generators and generates a value using one of them in an uniformly distributed pseudo-random choice. And then there's the frequency combinator, which does almost the same as choice, but uses a weighted distribution instead.

So, what's the big deal here? Ain't those other two cases that you could encode as a class? Well, sure. You can always encode things using something in computer science, but that doesn't mean it'll be easy, simple or straight-forward. And I want the three of those characteristics.

Using these combinators looks like this:

var vowels = choice('a', 'e', 'i', 'o', 'u')
var coins = frequency([3, true], [5, false])

Since both choice and frequency leverage asGenerator, you can pass either simple values or generators over to them. In fact, one of the nice things about using objects as generators is that checking if something is a generator is easy: you just check the conformance to the structural type Gen a.

This is a simplified implementation of choice and frequency:

function choice() {
  var gs = toArray(arguments).map(asGenerator)
  return Generator.derive({
           next: function() { return pickOne(gs).next() }
         })
}

function frequency() {
  var gs = toArray(arguments).reduce(weightedConcat, [])
  return choice.apply(null, gs)

  function weightedConcat(r, x) {
    return r.concat(replicate(x[0], x[1]))
  }
}

And finally, the `repeat' combinator

Once we have all this in place, we can also generate lists of things, and this is easily done by the repeat combinator. In fact, the List core generator is a thin abstraction on the repeat combinator, just so you don't have to type repeat(choice(...)).

Now, this is where the magic of prototypes truly shine. I didn't tell you until now, but all generators need to maintain all of the properties of the previous generator. All of them! This is because other generators might depend on those properties (for example, size). And in fact, the size field plays a huge and important role in the repeat combinator, since it will only generate lists of values up to that length!

Now, enough talk, this is how the repeat combinator looks for reals:

var numbers = repeat(function(){ return randInt(0, 10) })

Pretty straight forward, huh? So is the simplified implementation:

function repeat(g) {
  return asGenerator(g).derive({
           next: function() {
             var range = Array(randInt(0, this.size)).join(0).split(0)
             return range.map(g.next.bind(g))
           }
         })
}

So, now we're taking any kind of Generator, an instance, and making a new kind of Generator that efficiently shares all the properties defined in the previous generator, but also defines its own properties. This is the true beauty of prototypical inheritance: instances inheriting directly from instances.

Conclusion

There's much more to Claire than what I've shown above, but those are out of the scope of this blog post, which was to outline how you can take advantage of prototypical inheritance to create awesome things, since you're already coding in a language that provides it for you. Dynamic delegation chains are sweet for lots of things, these are just some of them.

Perhaps I might write a new blog post on some other aspect of Claire (for instance, shrinking or program generation, when those are done). But I don't know, I've got some other quite hot and sweet topics to talk about as well! :3

Anyways, hope you guys can see now why inheriting directly from objects is a Darn Good Thing™. And why I still think that Function.prototype is an anti-pattern in the language, which should be avoided (although it's undeniably fast).

Acknowledgements

Thanks to Theo for pointing out that abcde aren't vowels ;3