Language-neutral sort block behavior

Developers frequently need to sort collections according to specific sort criteria.

A common and powerful idiom is to supply a two-argument block that, when evaluated with two elements from the collection, returns a value indicating the desired (sorted) order of the two elements. This “sortBlock” is applied to each of the elements in the collection (in some order), and the result is returned as the now-sorted collection.

So far, so good. The problem is in the semantics of how the sort block is applied, and in which order the elements of a given collection are passed to the sort block.

Problem

Smalltalk and Python use different semantics for the sort block, and apply the block in different sequences for the same initial collection.

In Smalltalk, a sortBlock is called with two elements (I’ll call them “arg1” and “arg2”), and is expected to return “true” if arg1 is to be inserted before arg2 and “false” otherwise. In Python, a sortBlock is called again called with two elements, and is expected to return -1 if arg1 is less than arg2, 0 if they are the same, and “1” if arg1 is greater than arg2.

Leaving aside, for the moment, the debate about which behavior is correct, the point is that they are different. The implication of this is that the developer is forced to think about and code for the specific language (in this case Smalltalk and Python) of a particular fragment of code.

Solution

The Zeetix solution is to encapsulate this semantic difference in the SortedCollection class itself, so that the programmer doesn’t have to think about it. The platform can, if desired, support the ability to choose whatever semantics are desired — and that choice is then automatically applied across the board, regardless of the current language binding.

Example Problem
Behavior, the (abstract) superclazz of Metaclazz and Clazz, provides a “subclazzes” method that answers a sorted collection containing the immediate subclazzes of the receiver, ordered according to their “symbol” (the name by which they are referenced in the ecology that contains them).

The Behavior.subclazzes method should answer a SortedCollection with two elements — Clazz and Metaclazz, in that order. Behavior clazz also has two subclazzes, Clazz clazz and Metaclazz clazz, in the same order.

The KernelTest.CoreTest.KernelCoreLoadsTest test case was failing testBehaviorLoads, because the SortedCollection instance returned by Behavior and BehaviorClazz was ordered incorrectly.

My inspection (in the debugger) of the failing test showed that the result was, in fact, an instance of SortedCollection and contained the correct objects. They were, however, in the wrong order. Clearly, SortedCollection was sorting incorrectly.

Example solution

Step 1: Document the failure with new test methods

I cloned the existing “ExampleSetTest” from ZeeTestTest into a new “SortedCollectionTest” in KernelTest.CLDTTest. I added a test method to the SortedCollectionTest that demonstrated the bug:

class SortedCollectionTest(TestCase):

    def doSetUp(self):
        self._empty = self.clazzAt_('Kernel.CLDT.SortedCollection').new()

    def testClazzNameSort(self):
        aSortBlock = self.clazzAt_('Kernel.Core.Clazz').sortBlock()
        self._empty.sortBlock_(aSortBlock)
        self._empty.add_(self.clazzAt_('Kernel.Core.Metaclazz'))
        self._empty.add_(self.clazzAt_('Kernel.Core.Clazz'))
        self.assert_(self._empty.at_(0).symbol() == 'Kernel.Core.Clazz')
        self.assert_(self._empty.at_(1).symbol() == 'Kernel.Core.Metaclazz')

As hoped, this test demonstrated that SortedCollection was adding the two Clazz objects in the wrong order. But why? Was the bug in the sortBlock supplied by Clazz, or in SortedCollection itself?

Since I used Smalltalk semantics when I created Zeetix, the sortBlock method (derived from Smalltalk’s Class class>sortBlock) looks like the following:

class ClazzClazz(BehaviorClazz):
    """Clazz ..."""
    def sortBlock(self):
        """Answers block that sorts clazzes alphabetically by name."""
        answer = self.blockClazz().withCode_(lambda clazzA, clazzB: clazzA.symbol() < clazzB.symbol())
        return answer

This works in Smalltalk (obviously) and I expected it to work in Zeetix. I apologize for the bloated lambda expression, more on that another time.

Clearly, more tests were needed in SortedCollectionTest — specifically, the SortedCollection>>sortBlock_ method:

    def doSetUp(self):
        self._empty = self.clazzAt_('Kernel.CLDT.SortedCollection').new()
        self._alpha = self.clazzAt_('Kernel.CLDT.SortedCollection').fromNativeArray_(['beta', 'gamma', 'phi'])

    def testAlphaSortBlockForward(self):
        aSortBlock = self.clazzAt_('Kernel.Core.Block').withCode_(lambda a, b: a < b)
        self._alpha.sortBlock_(aSortBlock)
        self.assert_(self._alpha.at_(0) == 'beta')
        self.assert_(self._alpha.at_(1) == 'gamma')
        self.assert_(self._alpha.at_(2) == 'phi')

    def testAlphaSortBlockReverse(self):
        aSortBlock = self.clazzAt_('Kernel.Core.Block').withCode_(lambda a, b: a > b)
        self._alpha.sortBlock_(aSortBlock)
        self.assert_(self._alpha.at_(0) == 'phi')
        self.assert_(self._alpha.at_(1) == 'gamma')
        self.assert_(self._alpha.at_(2) == 'beta')

I added “_alpha” just in case the bug had something to do with string comparisons.
Voila! These tests failed as well — the SortedCollection implementation of sortBlock_ was broken.

Step 2: Fix the bug

I was horrified to discover that the implementation of SortedCollection was, well, just plain hosed. I have been coding around it — or just ignoring it — for a long long time. I clearly needed to fix this bug, and then add a LOT more tests to SortedCollectionTest.

There were several problems. First, the original sortBlock_ method was broken. Here is the original code:

    def sortBlock_(self, aTwoArgumentBlock):
        self._sortBlock = aTwoArgumentBlock
        return self

This method changes the sortBlock, but doesn’t apply to it an existing instance. Clearly, SortedCollectionTest needs some new methods:

class SortedCollectionTest(TestCase):
    def doSetUp(self):
        self._empty = self.clazzAt_('Kernel.CLDT.SortedCollection').new()
        self._full = self.clazzAt_('Kernel.CLDT.SortedCollection').fromNativeArray_([2, 4, 6])
        self._alpha = self.clazzAt_('Kernel.CLDT.SortedCollection').fromNativeArray_(['beta', 'gamma', 'phi'])

    def testSortBlock(self):
        aSortBlock = self.clazzAt_('Kernel.Core.Block').withCode_(lambda a, b: a > b)
        self._full.sortBlock_(aSortBlock)
        self.assert_(self._full.at_(0) == 6)
        self.assert_(self._full.at_(1) == 4)
        self.assert_(self._full.at_(2) == 2)

I added “_full” to the test case (initialized in doSetUp), to exercise the behavior with a SortedCollection that has several (in this case, 3) elements. I chose integers to make comparison easy, and chose values ([2, 4, 6]) to provide some “air” in the collection so that inserts and deletes are easy to find and test for.

Now, I was ready to change the sortBlock_ method:

    def sortBlock_(self, aTwoArgumentBlock):
        self._sortBlock = aTwoArgumentBlock
        self._isSorted = False
        self.sort()
        return self

    def sort(self):
        if self._isSorted:
            return self
        if self._sortBlock == None:
            self._implementation.sort()
        else:
            self._implementation.sort(self._sortBlock.value_value_)
        self._isSorted = True
        return self

At least this now attempts to apply the new sortBlock whenever it is added. The sort method simply passes the block (it is a callable) to python’s sort method. Right? Wrong. No joy. But “sort” was now doing the same thing that add_ does (I hadn’t refactored add_ yet), so … add_ must also be broken:

    def add_(self, anObject):
        """Add an anObject to the appropriate position in the receiver."""
        self._implementation.append(anObject)
        if self._sortBlock == None:
            self._implementation.sort()
        else:
            self._implementation.sort(self._sortBlock.value_value_)

More tests, of add_, were clearly in order:

    def testAdd(self):
        self._empty.add_(5)
        self.assert_(self._empty.includes_(5))

    def testAddFirst(self):
        self._full.add_(0)
        self.assert_(self._full.at_(0) == 0)

    def testAddMiddle(self):
        self._full.add_(3)
        self.assert_(self._full.at_(1) == 3)

    def testAddEnd(self):
        self._full.add_(7)
        self.assert_(self._full.at_(3) == 7)

Sadly, these all worked — I still hadn’t found the original bug. However, I now enough tests to safely refactor the original add_ method:

    def add_(self, anObject):
        """Add an anObject to the appropriate position in the receiver."""
        self._isSorted = False
        self._implementation.append(anObject)
        self.sort()

Now, at least add_ and sortBlock_ shared the same sort code. But that code was still not working. Since I was worried about whether strings were behaving differently from integers, I added one more string test:

    def testAlphaSortForward(self):
        self.assert_(self._alpha.at_(0) == 'beta')
        self.assert_(self._alpha.at_(1) == 'gamma')
        self.assert_(self._alpha.at_(2) == 'phi')

Something was wrong with the sort routine itself — what could it be. I finally read the fine manual — and reread it — and reread it again. The python sort function expects different behavior from the two-argument block from Smalltalk! But I was afraid to change the sortBlock on Clazz — if that was wrong, then who knows what else might have been wrong (the platform attempts to use sorted collections in many places).

I needed a way to make a sortBlock, coded with Smalltalk semantics, work with the Python sort function. Here’s the result (please pardon my Lisp):

    def sort(self):
        def apply_with_with_(aTwoArgBlock, arg1, arg2):
            aBoolean = aTwoArgBlock.value_value_(arg2, arg1)
            if aBoolean:
                return 1
            else:
                return -1
        if self._isSorted:
            return self
        if self._sortBlock == None:
            self._implementation.sort()
        else:
            self._implementation.sort(lambda car, cdr: apply_with_with_(self._sortBlock, car, cdr))
        self._isSorted = True
        return self

I wanted to make sure I could see what was really going on, and I had to curry the sortBlock somehow. Local functions to the rescue!

The local apply_with_with_ function, taking three args, applies the sortBlock to arg1 and arg2 and returns -1, 0, or 1 as the Python sort function expects. I use a (Python) lambda to collect the sortBlock and pass along the two arguments from the Python sort iterator. The original version nearly worked — but still no joy. Why? The sort iterator calls the arguments of its two-arg block in a different order from Smalltalk. Hence, please note that the Smalltalk-style sortBlock is called with “arg2” followed by “arg1” — reversed from the lambda binding below.

Finally, it worked!

Step 3: Validate the putative fix

First, I ran KernelTest.CLDTTest.SortedCollectionTest to make sure everything worked:

<ZeeTest.TestSuite: SortedCollectionTest>
0.031000 sec
<run: 9, passed: 9, failed: 0, errors: 0>

Fabulous!

Next, I exercised KernelTest.CoreTest.KernelCoreLoadsTest to ensure that I had truly solved the original problem with Behavior:

<ZeeTest.TestSuite: KernelCoreLoadsTest>
0.000000 sec
<run: 12, passed: 12, failed: 0, errors: 0>

Finally, I ran all the tests to make sure that I broke nothing else:

<ZeeTest.TestSuite: TestCase>
2.672000 sec
<run: 100, passed: 100, failed: 0, errors: 0>

Summary

The Zeetix platform can now apply consistent sortBlock behavior across each language to which it is bound. The test-first process, using ZeeUnit, revealed a number of serious problems in the SortedCollection implementation. The new tests, preserved in KernelTest.CLDTTest.SortedCollectionTest, document the platform behavior and simultaneously ensure that future kernel changes will not inadvertently break that behavior.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s


%d bloggers like this: