March is Not a Number

Domain Languages

Domain-specific languages promise to be the next big step in software development productivity, significantly reducing the gap between the business domain and the implementation domain. As with most things in our field, the idea of domain languages is not exactly new. Many Unix tools, such as tcl, lex, and yacc, can justly be called domain-specific languages (for text processing, scanner generation, etc.). Unfortunately, creating a new language used to be quite difficult, and working with one mostly meant giving up modern IDE amenities such as syntax highlighting, interactive debugging, and auto-complete. Many people are now working on making these languages easier to create and use, so that creating one or more new languages for each project becomes a viable option.

One major distinction is between embedded DSLs and external DSLs. In short, external DSLs define a new language from the ground up. For example, XSLT is an external language for XML document transformation. Embedded DSLs are built inside a "host" language such as Java or Ruby. There are quite a few discussions about what makes a programming language a good host language. For now I'll simply refer to a higher authority.

Domain-driven Design

Once we talk about things that start with "domain", Eric Evans (the author, not the porn star) is never far away, at least conceptually. Eric has written an excellent book on domain-driven design. Once again, I defer to the higher authority for all the gory details (Eric's site features an excellent summary of the patterns in his book). For my purposes, let it suffice to say that the goal is to develop an expressive model of the problem domain that can be closely reflected in the resulting source code. Coding against this model results in code that uses essentially the same language the domain experts use (albeit with some inevitable syntax clutter).

Easy to Write vs. Easy to Read

One of the main motivations behind these techniques and approaches is to write code that is more expressive about the problem it is trying to solve. Ever wondered what a piece of code is actually doing? Or, worse yet, can you tell a business person where in the code base a specific piece of business logic is "buried"? A lot has been said about "code that is worth reading", to quote Ward Cunningham. Along the same lines, my recent keynote at TheServerSide Java Symposium (re-)emphasized the notion of "coding for humans instead of the machine".

One of my key realizations in this topic area is that you do not have to go out and write a whole new domain language to make headway toward this goal. Just thinking a bit harder and putting special emphasis on the way your code reads often gets you leaps and bounds ahead of hard-to-understand everyday code. And while Java is not the best language to host a domain language, you can get a whole lot further than you would by not trying.

The essence is a mental shift away from code that is easy to write to code that is easy to read. Once you realize that most code is read a lot more often than it is written (insert your favorite statistic stating that X% of total software cost results from maintenance here), it makes sense to put a bit more effort into code that is nice to read. I am not talking about indentation and comments here but about code that is actually a pleasure to read because it is an accurate and elegant representation of the problem that is being solved. Source code is such an expressive medium that it would be a shame not to take more advantage of its capabilities. Let me give you a simple example.

Time and Money

Eric Evans developed an open source Time and Money library that he uses in projects and also in his training classes. I was lucky enough to attend one of his workshops and write some code against the library. While it is easy to come up with something significantly better than java.util.Calendar (e.g. see JodaTime), Eric's library has some nice properties that get to the heart of expressive software.

One exercise in Eric's tutorial asks us to implement the following business rule: payments are due 30 days after the end of the month in which the work was completed. Using java.util.Calendar, one can easily imagine the ugly code that has to wrestle with months that start with 0 (and, in the older java.util.Date, years that are counted from 1900), all of which has rather little to do with the problem at hand. Implemented on top of Eric's Time and Money the code looks like this:

CalendarDate dueDate(CalendarDate workEndDate, int allowableDays) {
  return workEndDate.month().end().plusDays(allowableDays);
}

The first observation is surely that the code reads very much like the requirement in English: "from the work's month end add n days". To me, the most noteworthy piece of this nice line of code is the month() method. One might find it curious that this method actually returns a CalendarInterval rather than a number, as java.util.Calendar would. But come to think of it, the month of March is not a number, so any expression like 3 == date.getMonth() should seem silly. Instead the month of March is a range of days from March 1st to March 31st.
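For contrast, here is a rough sketch of what the same rule might look like against java.util.Calendar. This is not code from Eric's tutorial; the class name CalendarDueDate and the sample dates are made up for illustration, and only standard JDK calls are used.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

class CalendarDueDate {

    // Due date = last day of the month in which the work ended, plus allowableDays.
    static Calendar dueDate(Calendar workEndDate, int allowableDays) {
        // Calendar is mutable, so work on a copy rather than changing the caller's object.
        Calendar due = (Calendar) workEndDate.clone();
        // "End of the month" has to be spelled out as "largest valid day-of-month".
        due.set(Calendar.DAY_OF_MONTH, due.getActualMaximum(Calendar.DAY_OF_MONTH));
        // add() rolls over into the following month(s) as needed.
        due.add(Calendar.DAY_OF_MONTH, allowableDays);
        return due;
    }

    public static void main(String[] args) {
        // Hypothetical sample: work completed on March 20, 2006.
        Calendar workEnd = new GregorianCalendar(2006, Calendar.MARCH, 20);
        Calendar due = dueDate(workEnd, 30);
        // Months are zero-based, hence the "+ 1" when printing.
        System.out.println(due.get(Calendar.YEAR) + "-"
                + (due.get(Calendar.MONTH) + 1) + "-"
                + due.get(Calendar.DAY_OF_MONTH)); // prints 2006-4-30
    }
}
```

The logic is the same, but the reader has to decode field constants, a defensive clone, and the zero-based month before the business rule becomes visible. Note also that Calendar.MARCH is 2, so a check like 3 == month would silently test for April.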

Explicit and Implicit Concepts

Another critical observation is that CalendarDate actually denotes a date and not a specific point in time like java.util.Calendar. Dates do not have hours and minutes and also do not have time zones. The 20th of March is simply that, a day in the year, regardless of where on earth you are. In contrast, if you were using java.util.Calendar you would have to bake in some assumptions to have it represent a date instead of a specific point in time. Most implementations do this by fixing the time to 0:00:00 and selecting a default time zone such as GMT. It gets tricky when these assumptions are not explicitly stated. For example, is a method allowed to look at the hours or minutes of a Calendar that is passed in? Can it assume that they are zero? If it sets a different value, will that be persisted to the database? Code is better off not having to ask such questions.
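A small sketch of how the hidden assumptions can bite, using only the JDK. The class name CalendarAsDate, its helper methods, and the two time zones are my own choices for illustration: two Calendar objects that both claim to mean "March 20th" under the midnight convention turn out to be different points in time.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

class CalendarAsDate {

    // The usual "date-only" convention: midnight of the given day in some zone.
    static Calendar midnight(int year, int month, int day, TimeZone zone) {
        Calendar c = new GregorianCalendar(zone);
        c.clear();               // reset all fields, including hours and milliseconds
        c.set(year, month, day); // leaves the time of day at 0:00:00.000
        return c;
    }

    // Day-of-month of the same instant, as seen from another time zone.
    static int dayOfMonthIn(Calendar c, TimeZone zone) {
        Calendar other = new GregorianCalendar(zone);
        other.setTimeInMillis(c.getTimeInMillis());
        return other.get(Calendar.DAY_OF_MONTH);
    }

    public static void main(String[] args) {
        TimeZone tokyo = TimeZone.getTimeZone("Asia/Tokyo");
        TimeZone losAngeles = TimeZone.getTimeZone("America/Los_Angeles");

        Calendar inTokyo = midnight(2006, Calendar.MARCH, 20, tokyo);
        Calendar inLosAngeles = midnight(2006, Calendar.MARCH, 20, losAngeles);

        // Same "date", different instants.
        System.out.println(inTokyo.getTimeInMillis() == inLosAngeles.getTimeInMillis()); // prints false
        // The Tokyo "March 20th" is still the 19th when read in Los Angeles.
        System.out.println(dayOfMonthIn(inTokyo, losAngeles)); // prints 19
    }
}
```

Nothing in the Calendar type tells a receiving method which of these conventions the caller had in mind.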

The Time and Money library does not hide these concepts but instead makes them explicit. That means that if you want to convert from a CalendarDate to a java.util.Calendar you have to specify all assumptions that are made implicitly by Calendar, such as the time zone. The code to accomplish this conversion looks like this:

Calendar convertToCalendar(CalendarDate date) {
  return date.asTimeInterval(TimeZone.getDefault()).start().asJavaCalendar();
}

You might think: wow, why so much code? But come to think of it, CalendarDate is a date, not a point in time. So first you have to convert the day into a time interval, i.e. the span of time from midnight at the beginning of the date until midnight of the next day. Since we are talking about times now, we need to introduce the concept of time zones. While March 20th is always March 20th when we talk about dates, as soon as we start talking about times and time intervals we need to realize that March 20th starts a lot earlier in Japan than it does in California. Note that because the concept of time zone is critical for this conversion, there is no default: the conversion always takes a time zone argument. You have to make it explicit which time zone you want to use, even if you choose the default time zone. Lastly, if your convention is to default the time portion of Calendar to 0:00:00, you express that by specifying the start() of that time interval to be used for the conversion to Calendar. Having to re-inject the assumptions might mean more typing but it answers all the nagging questions posed above. You can shorten the expression a little:

return date.startAsTimePoint(TimeZone.getDefault()).asJavaCalendar();

Here the notion of taking the starting point and converting it to a time point is accomplished in a single step. I almost prefer the former option as it makes it more explicit what the step-by-step conversion is.
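As an aside, the same explicitness can be sketched with the JDK's java.time API, which was added to Java after Time and Money was written. This is not the Time and Money API; the class name DateToCalendar and the sample values are assumptions made for illustration. LocalDate plays the role of a zone-free date, and the time zone has to be passed in to get a point in time.

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.util.Calendar;
import java.util.GregorianCalendar;

class DateToCalendar {

    // A LocalDate has no time and no zone, so both assumptions must be stated:
    // "start of day" for the time portion, and an explicit zone.
    static Calendar convertToCalendar(LocalDate date, ZoneId zone) {
        ZonedDateTime startOfDay = date.atStartOfDay(zone);
        return GregorianCalendar.from(startOfDay);
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2006, 3, 20); // months are 1-based here
        // Even the default zone has to be requested explicitly.
        Calendar cal = convertToCalendar(date, ZoneId.systemDefault());
        System.out.println(cal.get(Calendar.HOUR_OF_DAY)); // prints 0
    }
}
```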

Make Your Code Worth Reading

I hope this simple example illustrates how a piece of business logic can be implemented in code such that it truly expresses what the business is asking for. No compiler generators, byte code injection, or other magic was needed. All it took was the time to really understand the problem domain and to write code that reflects this understanding. Now that's what I call code that is worth reading.