Thursday, October 27, 2005

Agile Databases and other paradoxes

So Niall Litchfield’s aside about Sandra Momali’s taste for Agile development has kicked off a bit of a debate. I thought I’d throw in my tuppenceworth, as I find XP a very seductive idea, even though I haven’t had much opportunity to use more than a couple of its techniques. I think it is easy for discussion about Agile development to descend into straightforward bashing of Java heads and developer-vs-DBA scrapping. Which can be fun for the participants but are not particularly enlightening for spectators. Besides there’s more to Agile than Java. In fact, Bruce Tate (author of Better, Faster, Lighter Java) seems ready to dump it in favour of Ruby because Java is insufficiently agile, EJB3.0 notwithstanding.

I hope we can all agree that the problems that Agile seeks to address are valid ones. Applications are all too frequently delivered which are bug ridden, late, not what the customer wanted, not what the customer now needs, and sometimes all of the above simultaneously. Let’s disregard considerations of whether there are other approaches that can also solve these problems and take the Agile approach as a given. The question that needs to be answered is, "Can Agile techniques sensibly and profitably be applied to database development?"1

From the Agile side of the fence the answer is mostly a resounding raspberry. Agile practioners tend to regard regard databases as persistence repositories of last resort. They have a number of strategies to avoid writing SQL. Not just by deferral, using Mock Objects in the development stage, but by making deliberate architectural decisions to not use databases. These range from serialization to files through use of in-memory object stores like Prevayler to hands-off Object-Relational mapping tools like Hibernate. Last year I was talking to a UKOUG member who no longer attended the SIG I help run. He explained that his project was currently being migrated to a new version using Hibernate, Spring, Velocity, etc. I said, that sounds exciting, could someone do a presentation at the SIG? Back came the answer: No. Being Java developers they literally could not see the point in talking to a database user group. The whole point about their approach to application development was to avoid thinking about the database at all.

Some of this resistance may be down to sheer pigheadedness, but (at the risk of seeming to patronise Ron Jeffries) many of these people are actually very bright and aware people. They just don’t think databases can be Agile. Why is this? Let’s look at some of the Agile practices that are made harder by working with an RDBMS.

Incremental Design


Agile starts with rather sparsely-defined requirements and uses techniques such as Test Driven Design to flesh them out. This is not building a prototype, this is working, production quality code. By building real code the Agile practioner can show his user a working application and get meaningful feedback that allows him to refine the requirements. Hence together they build the system that the user actually wants. The anti-pattern for this is Big Design Up Front. Here we spend ages drawing a mesh of boxes called an Entity Relationship Diagram. We show it to a bemused user who nods guardedly. We turn it into a data model but the user still isn’t as excited as we are. Finally the database gets built and then the front end gets built on it and whereupon the user wails, "But that’s not what I meant at all!" The question is, does the fault lie in the database or the process? Are databases inherently BDUF?

Obviously, we can just build one table at a time. But is that enough? Probably not. We still need to normalise our database2. Consequently, any attempt to build our database incrementally means constantly refactoring of our existing tables, to eliminate duplication. Can we do this? Yes we can! Of course it requires that we hide all our tables from public gaze behind an API - PL/SQL packages, views or types, which some people will just hate. The larger question is Tom Kyte's point about the importance of designing for performance right from the start. Does that imply BDUF? Yes, at least some. The other problem is that refactoring an object is a lot simpler than splitting one table filled with twenty million rows into two tables. Our database design can be reasonably plastic in development but once we go into production inertia kicks in. I think here is the killer for Agile practices: database refactoring is hard. Scott Ambler seems to be ploughing a lonely furrow in this area but his thoughts on The Process of Database Refactoring make interesting reading.

User stories


Agile practices tend to re-inforce each other and that’s the case with User Stories and Incremental Design. User stories describe how the customer’s business works. These stories will not mention data storage, except by implication. Customers don’t specify a database, they specify an application, that is, a front end. Customers cannot accept or validate databases. Even if we gave them TOAD and they wrote SQL queries that would not prove the database met any of their requirements, because their requirements are specified in terms of the front end. And without continuous user feedback we cannot be sure that we are coding in the right direction.

Kent Graziano has been working on building a datawarehouse using Agile techniques. He has presented on Agile Methods and DataWarehousing several times this year. When I saw him talk at Open World he said he had got some grief from Agilistas because the ETL section of datawarehouses doesn’t have users. So there cannot be any user stories. But as Kent points out, just because the business user (the customer) doesn’t know or care about the ETL process doesn’t mean that ETL has no users. On Kent's project they put the BI report writer in the user role. Which is a nimble, indeed agile, extension of the practice which doesn’t, as far as I can see, break any tenet.

Common Ownership of Code


When programming an agile practioner drives a path through the code base. If she finds an existing part of the application that blocks her implementation she can change, extend or fix that code. She can do this safely because the existing code has unit tests and the existing application has integration tests. More importantly, she can do this new work because the existing code base is in Java and she knows Java. But if this piece in the code base was in PL/SQL she would be lost: she doesn’t know PL/SQL. Even worse, suppose her change requires a new table? Then she would have to go to the DBA to find out about schemas, tablespaces, etc. This is too slow for the Agile way of working.

Teams of specialists are not Agile. We know this is true. How often have we been trying to progress a project only to be told, “We can’t do that, Gary’s on leave today.” To be fair this does not just apply to databases. I think Agile practices militates against heterogeneous programming environments and would be equally suspicious of (say) extensive use of shell programming. So, unless pretty much the whole team is competent in PL/SQL, using stored procedures just does not fit in an Agile project.

What this boils down to is the matter of where we should put the business logic. Unfortunately this frequently descends into a religious war. Either you believe it’s obvious that business rules belong in the database or it’s obvious to you that they belong in the middle tier. So let’s put it another way: in a client/server application where does the business logic belong? I think a large part of the middle tier proponents would then grudgingly accept the merits of the database as the repository of business logic. In most sites today we probably have heterogenous applications written in a variety of styles, techniques and languages, including many client/server applications. Our best hope of consolidating business rules is to use the database. But, this is an emotional debate and not one I expect either side to cede.

Object Oriented Programming


The alert amongst you will have noticed how often the term object appears in that list of persistence techniques: this is key. Agile programming languages are object-oriented languages. I was recently embroiled in a debate on the Test Driven Design list when I was told this surprising fact. Surprisng because I thought for several years I had been doing Test Driven Design by using Steven Feuerstein’s utPLSQL to build database applications. Apparently there’s more to Test First than merely writing our tests first.

Obviously an RDBMS is not object-oriented. A non-OO language like PL/SQL lacks certain key features supposedly necessary to the full panoply of Agile practices - abstraction, interfaces, reflection. Does this matter? I remain unconvinced. XP originated in the Smalltalk community and pretty much everybody you come across on Agile/Test Driven arenas seems to be an OO programmer. However, this is a self-fulfilling prophecy. Come out as a relational database developer and you get labelled as a handyman with only a hammer in your toolbox.

Use of Tools


Agile practioners make full use of utilities to improve their productivity - smart IDEs, code completion, continuous integration, automated unit testing, test coverage estimnators, refactoring browsers, etc. Compared to this panoply of riches PL/SQL looks very Spartan. There is the aforementioned utPLSQL but not much else. Of course, this is a vicious circle. PL/SQL lacks the tools to support Agile practices so people who want to be Agile don’t use PL/SQL so nobody builds the tools for PL/SQL.

Database development tools tend to focus on browsing the data dictionary and running SQL statements. There is very little around that genuinely supports the developer in building better PL/SQL or helps the DBA build better databases. This is a situation that is only going to get worse, as Oracle Designer slowly fades into the sunset. Paul Dorsey’s BRIM project (AKA The Tool) may be one to watch. Interestingly there is very little open source stuff for PL/SQL developers, compared to the wealth of freely downloadable utilities for Java heads and Pythonistas. I think this illustrates the lack of people who are competent in both an application (i.e. a UI building) language and PL/SQL.

Still, the Agile Manifesto says we have come to value individuals and interactions over processes and tools, so that’s all right then.

Instaneous Feedback


There is an old rule of thumb from the HCI arena that states that a user will regard as “too slow” any task that takes longer than ten seconds; if it takes longer than a minute they will start on another task and come back to the first task later. In Test Driven circles, running a suite of unit tests is such a task. The JUnit list regularly features complaints about how long it takes to run unit tests that require population of a database. This interferes with flow. Hence the popularity of Mock Objects. Hence the drive to isolate the database elements to as small a subset of the application as possible.

I think the database has to cop to this one. If your sole experience of database population is writing tedious XML files for use in DbUnit then you will naturally have a jaundiced view of RDBMS systems. But even people who know how to write SQL must admit that deleting and inserting records takes longer than instantiating some objects (once the JVM has been warmed up!). That’s the price we happily pay for rigour.

Pair Programming


Well, database people are notoriously misanthropic. Especially DBAs. We can still talk to each other.

Conclusion


The Agile argument is databases are slow to build and inherently hard to understand. This is true. But it’s true because Agile is primarily a methodology targeted at building user-facing applications. Its adherents tend to work from the user interface downwards. Of course it’s crucial that we build an application that meets the user’s needs. But the user’s needs extend beyond the functional requirements of the application screens.

That doesn’t mean we cannot apply Agile techniques to database development projects. Certainly I regard Test First to be an invaluable technique in building stored procedures; I’m always nevervous when I’m not coding to a utPLSQL test case. My experience of working on DSDM projects tells me that incremental development - in particular placing working code in front of users early and often - is a very good idea. Kent’s work proves that Agile techniques can be profitably applied even to a juggernaut like a datawarehouse. The pain comes when OO-fixated bean-mongers clash with constraint obsessed databasers. Perhaps if they drank less coffee they wouldn’t be so hyped up all the time.


1. By database development I mean building data structures (i.e. tables) as well as programs (PL/SQL).

2. I learnt yesterday that Codd’s Twelve Rules have been deprecated because "Codd formulated them as a quick and dirty way to counter all the nonsense that was floating at that early time, they are not orthogonal or systematic, and our understanding of RM has progressed considerably since then."

Thursday, October 20, 2005

ToDo Driven Development

Lucas Jellema has just posted a technique for incorporating TODO tags in PL/SQL codes. This was a spooky piece of synchronicity because I had just stumbled across a "sexy new development methodology" called ToDo Driven Development on the SecretGeek blog. Leon Bambrick's implementation is for .Net and I was going to port it PL/SQL but Lucas has saved me the effort. Mind you, ToDo Driven Development has an extension which tracks HACKs as well as TODOs and that's obviously worth having. So I may enhance Lucas's code this evening.

Monday, October 17, 2005

Blogging about a blog about blogging

Jakob "Usability" Nielsen has just published an Alertbox on blogging. It's depressing to discover that this blog fails at least five of his strictures: No Author Photo, The Calendar is the Only Navigation, Irregular Publishing Frequency, Mixing Topics, Having a Domain Name Owned by a Weblog Service. The jury is still out on Forgetting That You Write for Your Future Boss and Nondescript Posting Titles. I'm not sure whether Classic Hits are Buried applies, as I don't know if I have any classic hits; but if I have 'em, they're buried. Sob.

Some of these are easier to fix than others. But I guess I really Must Try Harder.

Politics: Life During Wartime

Tony Blair describes the arguments in favour of his new anti-terrorist legislation as, "compelling". This is because if you disagree you will be arrested and put into a cell until you realise your fears about human rights abuses are groundless. Which brings me to David Mery. I gather this has been slashdotted but I only just heard about it.

Mery is a computer geek who has an unhealthy obsession with Bill Gates's Basic compiler for the Altair. On 28th July this year he was arrested for being in possession of a French accent and a thick jacket in a London tube station. In 1979 who would have thought that, twenty-five years on, policing practice would be based on a Not The Nine O'Clock News sketch?

Eventually Mery was released and his case has been marked No Further Action. But three months on he has still not had his confiscated computers returned. And, of course, being arrested for suspected terrorist activity will pose Mery an interesting hurdle at US Immigration and Border Control, if he ever decides to try to visit the States. Still I suppose he should count himself lucky not to have been shot eight times in the back of the head.

As someone who lives in south London and who uses the tube almost every day of his life, of course I back the police in the struggle against the Jihadists. But I too carry my laptop in a backpack. Does that mean I could be arrested because I meet some apparently arbitrary criteria? Ultimately, the Jihadists are trying to destroy a free society and replace it with one in which the individual's thoughts and actions are constrained by the pronouncements of official interpreters of divinely-given law. But, if we do this to ourselves, it won't matter that the name of the God is different, the Jihadists will still have won.

Wednesday, October 12, 2005

Prioritisation

In his latest article Joel Spolsky describes a workshop to prioritise features for the next version of his FogBugz software package. Like a lot of Joel's approaches it's both radical and plain commonsense. Incidentally it's strange that in our high tech industry all the best ideas seem to hinge on creative use of index cards.

Tuesday, October 11, 2005

Oracle: a suitable career for a young person?

A while back someone posted a pertinent question on the OTN DB forum, Is it Worth to be DBA?. The poster was in their last year of college, studying computing. They were wanted advice on the merits of various different technical jobs: sysadmin, networks, DBA. Alas they never replied my response (how did they want to measure worth: salary, intrinsic job interest, long term career prospects, fringe benefits?) and the thread withered.

Still, the question remains: how should we advise a young person wondering whether to become an Oracle DBA? Obviously being an Oracle practioner has done well by me and many of the readers of this blog. It's been my prime focus of work for thirteen years, and there's still some juice in it yet. For most of that time I have been a developer, designer and latterly architect but my time spent doing DBA work has undoubtedly informed my knowledge of Oracle. It is good for everybody who builds Oracle systems to understand the database from the perspective of a DBA, because in production the application is only going to run as well as the database permits.

Whilst I was pondering this question someone else pops up, asking how they get job as a DBA. I do find this desire to become a DBA slightly puzzling. The work, particularly in production, is generally dull except when it gets exciting, and that's when you remember that "May you live in interesting times!" is a Chinese curse. Being a DBA requires you to know the purpose of 257 parameters, 266 V$ views, 415 DBA% views, let's not forget the the X$ views... There's dry stuff like sizing redo logs, taking backups, applying patches, running installation scripts written by developers who haven't tested them, etc, etc. What precisely is the allure?

Partly it's the mystique of being in the know. The DBA is the gatekeeper, the person with the authority to run your code in production, the one who can tell you why your code is running like a three-legged dog in a vat of cold treacle, the person who laughs when you ask them to make the USERS tablespace bigger because you keep getting this ORA-1555 error. Knowing the SYSDBA password is the database equivalent of the sysadmin's got root? T-shirt. In fact, the DBAs and the sysadmins tend to inhabit the same room (and indeed are quite often the same people) but it is easier for a SQL developer to move into being a DBA than it is to become a *nix administrator.

Also, being a DBA is a glamorous job. It's not really of course, but most of the luminaries of the Oracle technical world are (or seem capable of being) DBAs: Tom Kyte, Jonathon Lewis, Connor MacDonald, Cary Milsap, Mogens Noegaard, the list goes on. But how many App Server gurus can you name? If you're developer, who is there to look up to? Steven Feuerstein and, er, that's it. (I know if you're into Java you could cite Steve Muench and Duncan Mills, but Java with Oracle is a distraction.)

So everybody wants to be a DBA because it's all about the data(base). Does that mean a person leaving university should try to become an Oracle DBA? Almost certainly not. There's a reeason why most DBAs are gnarled, bearded men: we need experience as well as knowledge before we are ready to become a DBA. It's just not a role for a neophyte. Besides, with the increasing commodisation of databases the role of the DBA is going to decrease in importance and sustained interest for the majority of people. Not to mention considerations of whether Oracle is a legacy technology.

If long term career prospects are the main concern I think a fresh graduate with a technical bent should get into networks: database, platforms, operating systems will come and go but everything about computing in the foreseeable future will still require networks. And with the rise of wireless the amount of time spent crawling under desks should be greatly reduced. If salary is their main driver then a graduate should seek to get into management, which mainly requires shedding all their technical knowledge as soon as possible.

However, we all spend the majority of the day, most days of our life at work so it is important to do something that continually engages us. There's no doubt that most Oracle roles are capable of offering stimulating work alongside the humdrum. How much time we spend doing the latter over the former is a measure of our talent. Good people understand the need to do the tedious stuff in order to get the interesting work right, so do it well and quickly; the very best know how to spin the dross into gold.

But for a job that's interesting day after day, that's full of surprise, delight and moments of wonder, that's always worthwhile despite all the poop it entails? Well, there's only parenthood. Nothing else comes close. Still, maybe not the thing to do right after graduation (unless gowns and mortarboards make you hot).

Monday, October 10, 2005

MS Office: Get your re-org boots!

Jakob Neilsen's latest Alertbox draws our attention to the iminent redesign of the MS Office user interface. Not the least of the woes ahead will be the learning of a new acronym: WYGIWYS, What You Get Is What You See. Practice it now, it's a tough one to get right.

Apparently Office 2003 is so full of functionality that many users cannot locate many of the features. Who'd have thunk it? Anyway, to solve this problem Microsoft are rejigging the various menus, toolbars and dropdowns in favour of coherently organised Command Tabs. This sounds mildly useful. More interesting is the WYGIWYS bit, whereby we format the document by choosing our desired result from a gallery of examples. As somebody is frequently driven to apoplectic fury by my inability to switch off some particular embedded Word style object I think any improvement to Word format styles is to be welcomed. Of course want I really want is a codes pane like WordPerfect had but I can't see that coming soon.

What does it all mean? Well there's some learning coming up for those of us who are bound to the corporately-supplied toolset. There's possibly some big yokks for the Open Office and Mac sparts. But just suppose Microsoft pulls it off. A new UI not swiped off Apple/Xerox that delivers real productivity gains to ordinary users: wouldn't that be cool?

Oracle Metalink: Don't shoot the pianist

Laurent Schneider recently posted on his bad Metalink experiences. Let me show you my scars...

My first stand-out bad experience was an iTAR on generating checkboxes in JavaServerPages back in the days of JDeveloper 3.0. After two weeks of hassling I got told there was a palette with a Checkbox icon in the JDeveloper IDE. This much I knew already: it was for Swing programs. I tried to explain about the difference between Java for client/server and web but the iTAR got closed anyway. This was when the Java language support was still done by the database group, who really didn't understand building client-facing applications at all. One of our Java brains figured it out eventually.

The second bad experience was a year or so later. We were having some difficulties with BC4J which dragged on without resolution. The support guy, exasperated, asked why we were using BC4J as "Nobody was using, even in Oracle". I posted that to an ODTUG list, to stir up a response. Oh boy. Steve Muench fired an enraged e-mail back to ODTUG. I had the Support manager on the phone. I expect the poor second-line drone caught it in the shorts (which I do feel bad about. Sorry, whoever you were). Did we get our problem resolved? Again, eventually. Again, this was Java. Oracle really took a long time to sort out its support for Java. But that's how I discovered the OTN forums, so it wasn't all terrible.

The third bad experience, was some kind of database thing, I forget what. The first-liner resolved it to his satisfaction and closed the iTAR. I called back to explain that he had mis-read what I had typed in the iTAR and that his "solution" solved a completely different problem. He said, 'You'll have to open another TAR.' Now that's a lot of typing to do all over again. It took seven words, 'I want to speak to your manager', to get the TAR re-opened, and resolved pretty quickly thereafter.

I've been using Metalink and Support for over ten years now and I think three memorably bad experiences in that time is pretty good going. Most iTARs get solved, and get solved within a reasonable time. Of course, not working in production means that I can take a more relaxed attitude towards rsolution time.

The real problem with Metalink is the interface. When I first started using Support it was all done with phone calls. Which was great: I almost always got someone who was knowledgeable, so most TARs were actually resolved on the phone or else they turned out to be bugs, (As I was using Designer 1.2 on Windows for Workgroups 3.1.1 to generate Forms 4.5 there were a lot of bugs). What that meant was I tended to phone up Supoort to get answers to questions of the "How do I do ...?" variety. Now that iTARs have to be raised through a web interface everything takes much longer. There just isn't the immediacy of a phone call. In fact the iTAR process is so slow it's generally a lot quicker to use the Metalink search engine or Google. As a consequence I never contact support with a "How do I do ... ?" question. They're smart, these Oracle guys.

Of course, the Metalink repository is not without it's problems. Why is it so hard to filter out anything to do with RdB or Apps of any flavour? Why can't I explicitly search for (or exclude) stuff by document type (release installation note, error report, How To bulletin)? Why is the weighting of documents so poor? It used to be that the first result when search for an error number was the OERR document. Now it's quite frequently number 79/100. The answer is quite simple: poor metadata. The Metalink repository needs much better tagging. Of course there must be so many documents in the database you can see why nobody want's to undertake the task.

But then, but then. Only last week I started to raise a TAR because I had failed to find anything helpful in the repository. having typed it all in the TAR workflow suggested six documents that might solve my issue, and the second one did. Result! So the information is there, it's just sometimes so hard to find it.

It's easy to knock Metalink, but who would really want to be without it?