Out Of The PARC
Tuesday, May 20, 2003

 
Although I'll be doing software-related stuff, I'll be travelling to XP2003, so I won't be able to post for a while. I'm sure all that XP stimulation will give me great fodder for when I get back...

ian smith at pee ay are see dot com

Thursday, May 15, 2003

 
I have been thinking about social capital and its relationship to software development. It seems to me that part of the appeal of XP (to me and others) has been the perception that "my life is better" when I use XP, especially versus no methodology or code-n-fix. I've always attributed this feeling of happiness with XP to comfort in the artifact. I've felt that it comes from the confidence I have in the testing/stability of the artifact and in the business value the artifact embodies.

Now, perhaps I have a new idea of why I'm happier. First, I've never been able to explain why pair programming and the planning game are as effective as they are. I always put this down to their effect on the artifact more than their effect on people--although, at some other level, I knew that this was important. Perhaps the XP value of communication goes deeper than I (or Kent?) suspected. Maybe these processes create a reciprocating network of trust (social capital) that makes business, even software development, work.

One of the often touted benefits of XP is confidence in estimates. This is usually expressed in terms like this:


Our [business] people understand when they'll be able to get the software and with what features. They trust our estimates and believe that we can deliver because they have experience with us and our process. They also feel that they can change their minds with confidence (within the constraints of our right to estimate features) and steer the car to a destination that we all want.


Maybe that isn't really about software at all? Change software to government policies and estimate to vote and you could be talking about a representative democracy that many people would like to live in. The leaders steer the car, but the people have rights too. The two parties enter into a social contract that makes them responsible to each other and is built on trust.

Just musing.
ian smith at pee ay are see dot com


Wednesday, May 14, 2003

 
If you haven't heard, Josh and Co. have been working on this new spin on XP called (of course) IXP. I saw their presentation at bay XP a few weeks ago and was impressed.

For my money, the thing they have pushed forward the most is the idea of test driven management. I really think that Kent did a great job sorting out the software development side when he dreamed up XP. However, the stuff that Josh and Co. are talking about now seems to really be pushing on the non-bottom-up adoption of XP. It's encouraging that the IXP stuff is even needed!



 
Was reading back through Laurent Bossavit's blog entries for last week. The May 9 entry caused me to realize how addicted I've become to sanity checks in my test code. A typical test method consists of getting the world into a known state (esp since I have a persistent database underneath me), running the operation I'm testing, and then making sure the actual state of the world matches my expectations. Simple enough, right?

But even getting the world into a known state is sometimes tricksy. So littered through my code, between the setup and the operation under test, are lines like
assertEquals("sanity", 3, nContactsBeforeDelete);
and
assertTrue("sanity", model.TEST_PROBE().isChoice());

Having the sanity check increases my confidence that I'm testing the right thing. Labeling it as a sanity check helps communicate the intent and explain the flow of the test -- "I'm not really testing this, but simply confirming that this is true if we've done the setup correctly."
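
Here is a minimal sketch of that flow in plain Java, so it runs without a test framework. ContactBook and its methods are hypothetical stand-ins for the real model, and the hand-rolled assertEquals only mimics the JUnit call shown above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the persistent model under test.
class ContactBook {
    private final List<String> contacts = new ArrayList<>();
    void add(String name) { contacts.add(name); }
    void delete(String name) { contacts.remove(name); }
    int count() { return contacts.size(); }
}

public class SanityCheckExample {
    // Hand-rolled equivalent of JUnit's assertEquals(message, expected, actual).
    static void assertEquals(String message, int expected, int actual) {
        if (expected != actual) {
            throw new AssertionError(message + ": expected " + expected + " but was " + actual);
        }
    }

    static void testDeleteRemovesOneContact() {
        // Setup: get the world into a known state.
        ContactBook book = new ContactBook();
        book.add("alice");
        book.add("bob");
        book.add("carol");

        // Sanity check: not what we are testing, just confirms the setup worked.
        int nContactsBeforeDelete = book.count();
        assertEquals("sanity", 3, nContactsBeforeDelete);

        // Operation under test.
        book.delete("bob");

        // The real assertion.
        assertEquals("delete should remove exactly one contact", 2, book.count());
    }

    public static void main(String[] args) {
        testDeleteRemovesOneContact();
        System.out.println("ok");
    }
}
```

If the "sanity" line fails, the message says straight away that the setup is broken, not the delete.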
Tuesday, May 13, 2003

 
Interesting experience debugging this afternoon. We track down a bug, the bar goes green, all the tests are passing, and we have a working system again. Then my programming buddy notices that there's a parallel bug that we haven't fixed yet.

"Go fix the test to fail," he says. Isn't that the essence of test-driven development?


 
The test suite is, by definition, the set of functionality provided by the present system.

There are four important implications that flow directly from this.

First, as Beck mentions, it is perfectly acceptable to attempt some change that could have pervasive consequences throughout the system. There is no reason to prove to ourselves that such a change should be benign -- we can just test it. If the test suite runs after the change, then we can commit it. Or not. Our choice.

Second, we have a responsibility to write a failing test before fixing a bug. Tests are bets. If we are never losing our bets, we probably aren't playing our cards right. Maybe this is heresy in the XP camp, but I think that if newly-introduced bugs are always caught by the test suite, then we are probably paying too much for our test suite, in the cost of creation and maintenance, relative to the value we are deriving from it. It should be rare that we have a bug in the production code that is not caught by the test suite, but we should expect it to happen. When it does, our first responsibility is to write a failing test that illuminates the bug. This is the main corrective for writing less-than-obsessive tests here in a research environment. And it should be a major difference from the "smoke test" style of testing that Bossavit describes. A bug once fixed stays fixed.
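
A toy illustration of "failing test first", assuming a made-up bug (none of this is from our system). The midpoint functions are hypothetical; the point is only that the same test is red against the buggy code and green after the fix, and then stays in the suite.

```java
import java.util.function.IntBinaryOperator;

public class FailingTestFirst {
    // Hypothetical buggy production code: lo + hi overflows for large ints.
    static int midpointBuggy(int lo, int hi) {
        return (lo + hi) / 2;
    }

    // The fix, written only after the test below has been seen to fail.
    static int midpointFixed(int lo, int hi) {
        return lo + (hi - lo) / 2;
    }

    // The test that illuminates the bug. It takes the implementation as a
    // parameter only so this sketch can run it against both versions.
    static boolean midpointTestPasses(IntBinaryOperator midpoint) {
        int lo = Integer.MAX_VALUE - 2;
        int hi = Integer.MAX_VALUE;
        return midpoint.applyAsInt(lo, hi) == Integer.MAX_VALUE - 1;
    }

    public static void main(String[] args) {
        System.out.println("before fix: " + midpointTestPasses(FailingTestFirst::midpointBuggy));
        System.out.println("after fix:  " + midpointTestPasses(FailingTestFirst::midpointFixed));
    }
}
```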

Third, because the test suite defines ground truth for the current system, there is a shift of responsibility that frees the developer from a vague sense of dread of breaking code he's unfamiliar with. Let's say a system has no test suite, or let's say it has a test suite, but the test suite is not considered the absolute definition of what is and isn't working in a system. In such an environment, if a member of the development team, particularly one who is newer or is more junior, makes a change that breaks some feature X, then he is to blame. He should have known better. His mistake becomes part of his education. Now switch contexts. Same team member, same change, same breakage of feature X, but in an environment where the test suite is ground truth. Now, if the test suite catches the breakage, well and good, and he knows not to commit. But let's say that feature X has crept into the system without adequate tests (bad, bad!). Our team member commits. Feature X breaks. But it didn't really break, because it wasn't ever really in the system. By definition.

Fourth, where one has to be paranoid is in refactoring tests or in removing tests from the system. Since the system's capabilities are defined by the tests, we must take care not to unintentionally lose defined features in refactoring, or in removing what appear to be redundant tests. Without XP, one must carefully ponder serious changes to the production code. With XP, one must carefully ponder serious refactorings or removal of test code.
Monday, May 12, 2003

 
I have to take issue with the May 11 entry in Incipient.oO about "smoke test" vs what he claims an XP-style test should do.

Beck suggests that you keep writing tests as long as there are any reasonable tests that either member of a programming pair can dream up. Elsewhere, he puts forward the notion of tests as bets. These two notions don't quite jibe, and I'm siding with the tests as bets. Writing a test is an educated economic guess. It incurs the cost of writing the test, and the possible future cost of refactoring the tests when the test infrastructure becomes too complex or needs to undergo change. On the other hand, it has the value of encouraging the development of clean production code, and of providing some confidence to the initial developers of some feature that they have succeeded. But a large part of the value is in the fact that the test is automated, and once it's among the tests in the suite, the feature stays implemented. The test functions as a lighthouse to developers of other, future features that they are about to crash on the rocks.

With the notion of tests as bets, or tests as educated economic guesses, software development teams can choose where to lie along the "smoke test" - "test everything" axis. For the software I'm writing today, I believe that we are writing the correct sorts of tests for software that is intended to be proof-of-concept for research ideas. If we were developing code for medical devices or even consumer applications, my guess would be that we should be writing an order of magnitude more tests, becoming downright anal in our test-writing. For our applications, there are even tests that we choose not to write. (Shudder.) There are one or two features in our system that (a) are very painful to cover with an automated test and (b) would be noticed in the first 2 seconds of running the system if they broke. We've decided that the bet isn't worth placing. I bet that the cost of writing the automated test to cover our start-up configuration completely (requiring that our test code be able to fire up multiple JVMs) would never be repaid in saving us from introducing a bug that would screw this up.

Now, if this were the only paradigm hovering around testing, we might expect developers to get lazy, and have beta testing degenerate into the kind of smoke testing Bossavit describes and reproves. The cure for that is "testing as ground truth", but that will have to be the topic of a different post.



 
Okay, I admit it: I'm obsessed. Once I start thinking about an issue about the "right" way to develop some piece of code, I can't quit until I can succinctly describe the boundaries. So here is yet another rant on overriding equals. (Let hashCode tag along at his own risk.)

The Java collections classes fail if equals violates either symmetry or transitivity. Transitivity is the key. Let's say Bar extends Foo extends Object, and both Foo and Bar override equals. All Bars are Foos, but there are Foos that aren't Bars. Because of symmetry, if *any* Foo (foo) that's not a Bar can claim equality to some Bar (bar), then that Bar must claim equality with that Foo. Because of transitivity, bar must also provide the same answer to equals that foo provides for all other objects.

In a nutshell, the rule for overriding equals for some class is: If instances of another class (presumably a superclass) can claim equality to instances of the class you are implementing, then the equals implementation you provide must put all of your instances in the same equivalence classes that the superclass puts them in.

Both of the options spelled out in the previous blog satisfy this constraint. I suspect there are others, but I'll have to think about that further...
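
To make the failure concrete, here is a small sketch of a subclass that breaks the rule. Foo and Bar are the classes from the discussion above; the id and tag fields are my own illustrative assumptions.

```java
class Foo {
    final int id;
    Foo(int id) { this.id = id; }

    // Any Foo (including a Bar) with the same id is considered equal.
    @Override public boolean equals(Object other) {
        return other instanceof Foo && ((Foo) other).id == id;
    }

    @Override public int hashCode() { return Integer.hashCode(id); }
}

class Bar extends Foo {
    final String tag;
    Bar(int id, String tag) { super(id); this.tag = tag; }

    // Deliberately wrong: Foo already claims equality with Bars, so Bar is
    // not free to carve its instances into finer equivalence classes.
    @Override public boolean equals(Object other) {
        return other instanceof Bar
                && super.equals(other)
                && ((Bar) other).tag.equals(tag);
    }
}

public class EqualsSymmetryDemo {
    public static void main(String[] args) {
        Foo foo = new Foo(1);
        Bar bar = new Bar(1, "x");
        System.out.println(foo.equals(bar)); // true
        System.out.println(bar.equals(foo)); // false
    }
}
```

The two answers disagree, so any collection holding a mix of Foos and Bars can behave differently depending on which side of the comparison each object lands on.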

 
An article by the prolific Paul Graham (author of ARC) about Hackers And Painters.

I agree with him a bit on some points and strongly on others. However, I think it's a bit self-aggrandizing to raise Computer Science (or Hacking, or Software Engineering) up to the level of art.

 
For the Mac heads out there, you may want to check out Hydra, which lets you use Rendezvous to do pair programming over the wire. Sweet.

 
I've been thinking a bit about pair programming. In particular, I was wondering if XP's mantra of "all production code to be written in pairs" is really just a degenerate case of some slightly less dogmatic rule:


When the going gets tough, the tough pair program.


In other words, use good judgement about when to pair and when not to pair. If you are sure people can progress without pairing, don't do it and let the resources run at max utilization. On the other hand, when you need to do something nasty, there is a tough design problem, or somebody just doesn't understand, pair for all the reasons that XP advocates advocate. Obviously, if you have people for whom this judgement is difficult, the XP rule is a reasonable approximation.

I think I first heard some form of this from the aspectJ guys but I am really coming to appreciate it lately. (The aspectJ guys pair programmed the tricky parts of the aspectJ compiler, I think.)
In particular, as I have been forced to only work on a project part-time, I've had conversations with other developers who have said, "Well, if you can only work on this project half a day today, maybe we should work on this instead of that because I can do that no problem."

I don't yet have a way to express when to pair and when not to, but it does seem that it could give better resource utilization if done judiciously.

 
Unless I hear some serious moaning, I'm going to keep this blog here on blogger. I could switch to moveable type on my server but that seems a bit premature. My evil plan is simply to (get this) wait until blogger has trackback. I'll upgrade to blogger pro or whatever to get us an RSS feed soon.

Can somebody direct me to a service that offers me a way to simply list a set of RSS feeds that I want and have them aggregated for me? This would make browsing my daily blogs way faster.

ian smith,
iansmith at pee ay are see dot com
Friday, May 09, 2003

 
Ian mentioned that he met someone working on a pricey XML database (I think it is Cerisent), so I wonder what they think of Sleepycat's open Berkeley DB XML. I certainly look forward to never building another wrapper to store and generate XML data using SQL.

 
After my previous rant about testing, I've found a variety of fun articles about testing and the types of testing.

Laurent Bossavit has an entry about testing and the types of testing that might not be obvious.

Laurent links to an entry by Mike Clark about "learning tests." I have to say that I do this now all the time with new APIs.

If you haven't seen it, this is the new tool FIT that is all the rage in the functional testing part of XP. I'm a bit skeptical because I don't know if this goes far enough to overcome the problem that the customer's incentive to write tests (and esp. good tests) isn't as clear as for other parts of XP, such as the planning game. The benefit to the customer of testing is a bit more abstract and a great deal more deferred than with the planning game.
Thursday, May 08, 2003

 
Specific technical nits wrt overriding equals and hashCode in Java. The standard invariants, given Object a and Object b, are
assert a.equals(b) == b.equals(a)
assert !a.equals(b) || ( a.hashCode() == b.hashCode())

If ClassA overrides equals and hashCode, the typical idiom for equals is
public boolean equals(Object other) {
    if (other instanceof ClassA) {
        ... (testing for equivalence)
    }
    return false;
}
We assume that ClassA's hashCode does something reasonable.

Now, if ClassB extends ClassA, is there any useful way to override equals or hashCode? Given objects ClassA a and ClassB b, we require that b.equals(a) == a.equals(b), so ClassB's equals must agree with ClassA's. One suggestion was to use the following in ClassA instead:
public boolean equals(Object other) {
    // null check first: equals(null) must return false, not throw
    if (other != null && other.getClass() == getClass()) {
        ...
    }
    return false;
}
But this never allows subclassed instances to equal instances of the base class. There is no way in this paradigm to allow instances of two different subclasses of some common class to be equal, even though this is the desired semantics at times.

Playing a hot potato game is risky because of the danger of dropping into infinite recursion, a la
public boolean equals(Object other) {
    if (other instanceof ClassA) {
        if (other.getClass() != getClass()) {
            return other.equals(this); //WRONG! possible infinite recursion
        }
        ... //normal equality test
    }
    return false;
}

The problem here (besides planning for the future) is what happens when ClassB doesn't override equals and we compare an instance of ClassA with an instance of ClassB: each side sees a different concrete class and hands the question back to the other. Oops. Seems to me that the only time it's safe to return other.equals(this) is when other.getClass() is provably a proper subclass of this.getClass(). Even in this case, if ClassC and ClassB both extend ClassA, then you'll run into problems if either ClassB or ClassC overrides equals.
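
A runnable sketch of the hot potato going wrong, assuming ClassA compares a single int field (the value field and the overflows helper are my additions for illustration). ClassB inherits equals unchanged, so a ClassA and a ClassB keep passing the comparison back and forth until the stack runs out.

```java
class ClassA {
    final int value;
    ClassA(int value) { this.value = value; }

    @Override public boolean equals(Object other) {
        if (other instanceof ClassA) {
            if (other.getClass() != getClass()) {
                return other.equals(this); // the hot potato
            }
            return ((ClassA) other).value == value; // normal equality test
        }
        return false;
    }

    @Override public int hashCode() { return Integer.hashCode(value); }
}

// Does not override equals, so it inherits the hot-potato version.
class ClassB extends ClassA {
    ClassB(int value) { super(value); }
}

public class HotPotatoDemo {
    // Returns true if x.equals(y) blows the stack instead of answering.
    static boolean overflows(Object x, Object y) {
        try {
            x.equals(y);
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("A vs B overflows: " + overflows(new ClassA(1), new ClassB(1)));
        System.out.println("B vs B overflows: " + overflows(new ClassB(1), new ClassB(1)));
    }
}
```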

Conclusions I'm reaching:
1/ When a class implementor wants to override equals, he must decide whether instances of subclasses should be able to equal instances of the superclass, and equals and hashCode should be written accordingly.
2/ If equals is overridden using the instanceof paradigm, then both equals and hashCode should be made final.
3/ If equals is overridden using the other.getClass()==getClass() paradigm, then subclasses should enforce the same test. This could be enforced in the code, but is probably overly paranoid.

I'm inclined toward option 2.

(To wit,
class ClassA {
    ...
    public final boolean equals(Object other) {
        // equals(null) must return false rather than throw
        if (other == null || other.getClass() != getClass()) return false;
        return equalValues((ClassA) other);
    }
    public boolean equalValues(ClassA other) {
        ...
    }
}
Other classes can override equalValues, but it will only be called by equals when the two instances have the same concrete class.)

We end up with pretty much the same options for hashCode. If equals can be viewed as putting objects into equivalence classes, then hashCode must guarantee that two instances in the same equivalence class have the same hashCode. Assume that ClassA's hashCode is correct. For option 3 above, since all direct instances of ClassA are in different equivalence classes from all instances of ClassB, it is acceptable to override hashCode. But for option 2, I can't think of a time that it might be useful for ClassB to override hashCode -- if there is any ClassA a such that a.equals(b), then b.hashCode() better provide the same semantics as ClassA.hashCode, right? You can't change the hashCode equivalence classes laid down in ClassA without screwing something up. The one case where you might do something usable is if there are ClassB b where !a.equals(b) no matter what a. In that case, b.hashCode() can provide a different value than would have been provided by ClassA's hashCode. This is a really screwy case, so I stick to my original claim: If you follow option 2, make equals and hashCode final.
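
For the getClass() flavor, where overriding hashCode is acceptable, a complete version might look like the sketch below. The x and label fields are assumptions made up for the example; the final equals plus overridable equalValues follows the skeleton shown earlier.

```java
import java.util.HashSet;
import java.util.Set;

class ClassA {
    private final int x;
    ClassA(int x) { this.x = x; }

    // Final: subclasses customize equalValues, never the class check itself.
    @Override public final boolean equals(Object other) {
        if (other == null || other.getClass() != getClass()) return false;
        return equalValues((ClassA) other);
    }

    // Only ever called by equals with an argument of the same concrete class.
    public boolean equalValues(ClassA other) {
        return other.x == x;
    }

    @Override public int hashCode() { return Integer.hashCode(x); }
}

class ClassB extends ClassA {
    private final String label;
    ClassB(int x, String label) { super(x); this.label = label; }

    @Override public boolean equalValues(ClassA other) {
        // The cast is safe because equals has already checked getClass().
        return super.equalValues(other) && ((ClassB) other).label.equals(label);
    }

    // Safe to override: no direct ClassA instance is ever equal to a ClassB.
    @Override public int hashCode() { return 31 * super.hashCode() + label.hashCode(); }
}

public class EqualValuesDemo {
    static int distinctCount() {
        Set<ClassA> set = new HashSet<>();
        set.add(new ClassA(1));
        set.add(new ClassB(1, "a"));
        set.add(new ClassB(1, "a")); // duplicate of the previous element
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount()); // 2
    }
}
```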

 
Interview with Grady Booch. I don't trust him because he says UML and Aspects in the same interview. :-)

He does mention the websphere stuff Jim H has been talking about.

http://www-106.ibm.com/developerworks/ibm/library/i-booch/

 
I think Kent has got this wrong, at least somewhat.

A design that is getting worse may be exactly right for a project that is winding down and the economics of which don't justify the investment to make it get better.

Further, in our line of work, if the idea is getting better, maybe the design doesn't matter?

ian smith at pee ay are see dot com


 
Documentation is a real problem. It's not a technical problem--most of us have swallowed the Kool-Aid on this one already--because we know (?) that executable tests are the best way to make sure a system is documented and that the documentation cannot be wrong.

The problem is the customers. There is a set of expectations among customers, especially those that maybe have had a few programming courses or done programming in the past, about documentation that are hard to sidestep. If someone says "Did you document this code?" I feel ok saying "yes" if I have good tests and they communicate the code's functioning well. However, if they say "Is this code commented?" I have trouble saying "No, and it won't be." Worse is "Is this code commented? Could someone else pick it up and understand it?" My gut level response would be something like:

No, it's not commented and of course someone else could pick it up and understand it. If they are any good they'll know better than to expect much from the comments and they'll trust the code that actually runs. If the person getting this code is really good, he or she probably expects the comments to be misleading, wrong, and completely out of date and thus would never trust them at all.


Of course, that response isn't really what people were looking for. It's not really acceptable to say that. Sigh.

In response to a question about comments in a meeting once, I said, "No, there aren't any comments in the code. None. Zero. The customers don't care about comments; they care about working code and reliability." While the statement was true, it wasn't very smart politics. I'd like to know what I should say.


Tuesday, May 06, 2003

 
There has been a lot of discussion around here lately about the difference between "testing to find bugs" and "testing to document how a system should work."

I have noticed that I feel very differently about the effect of these on other people. In the "finding and preventing bugs" camp I have no trouble exploiting any and all knowledge of the system. So I don't mind saying to someone:

Oh, yeah, that's kinda evil. It uses *blah* which is way down in the bowels and connects it up to a *frak* which never really happens when the user runs the code but it stresses part *grik* to make sure that *grak* can't happen. I know it's ugly but *grak* is really complex and I wanted to make totally sure...


This is true also for tests that are put into the system to demonstrate/squash bugs. Again, I think that any amount of test complexity, infrastructure use/abuse, whatever is ok here. The test's purpose is to keep the software running right, so everything is legal and damn the understandability and communication. This is mitigated somewhat by the fact that in the case of a bug I usually put in a description of what the bug was so at least the person looking at the test has a hope of comprehension.

On the other hand, I do often feel guilty about tests whose purpose is to communicate how the software functions. Until recently, I hadn't really thought through doing this explicitly, e.g. having an explicit software development effort to write tests that specify/document how the system works. I was however doing this unconsciously with my unit tests. Now, my test harness(es)/infrastructure(s) have gotten older, cruftier, more complicated, and (worst) multiplied in number. This leads to guilt because the purpose has been lost.

Taking these together, I think this is a weird way in which XP is hurting me. Because I know the tests are all performing useful work (and passing) I build up a refactoring debt in the test suite that I am loath to pay. It's a totally hidden cost--these are the very things that are making the software work as well as it does. Passing this cost to the customer representative in the planning game also makes me embarrassed, since them paying this debt with their budget serves no purpose for them. I know that I should just be amortizing better and hiding constant payments of test suite refactoring debt throughout the development, but I have trouble making myself do it. Of course, the debt accumulates so when I do end up paying it... sigh.

ian smith at pee ay are see dot com


