Domain-specific languages promise to be the next big step in software development productivity, significantly reducing the gap between the business domain and the implementation domain. As with most things in our field, the idea of domain languages is not exactly new. Many Unix tools such as Tcl, lex, and yacc can justly be called domain-specific languages (for text processing, scanner generation, etc.). Unfortunately, in the past creating a new language was generally quite difficult, and working with one mostly meant giving up modern IDE amenities such as syntax highlighting, interactive debugging, and auto-complete. Many people are now working on making it easier to create and work with these languages, so that creating one (or several) new languages for each project becomes a viable option.
One major distinction among DSLs is between embedded and external DSLs. In short, external DSLs define a new language from the ground up. For example, XSLT is an external language for XML document transformation. Embedded DSLs are built inside a "host" language such as Java or Ruby. There are quite a few discussions about what makes a programming language a good host language. For now I'll simply refer to a higher authority.
Once we talk about things that start with "domain", Eric Evans (the author, not the porn star) is never far away, at least conceptually. Eric has written an excellent book on domain-driven design. Once again, I defer to the higher authority for all the gory details (Eric's site features an excellent summary of the patterns in his book). For my purposes let it suffice to say that the goal is to develop an expressive model of the problem domain that can be closely reflected in the resulting source code. Coding against this model results in code that uses essentially the same language the domain experts use (albeit with some inevitable syntax clutter).
One of the main motivations behind these techniques and approaches is to write code that is more expressive as to the problem it is trying to solve. Ever wondered what a piece of code is actually doing? Or, worse yet, can you tell a business person where in the code base a specific piece of business logic is "buried"? A lot has been said about "code that is worth reading" to quote Ward Cunningham. Along the same lines my recent keynote at TheServerSide Java Symposium (re-)emphasized the notion of "coding for humans instead of the machine".
One of my key realizations in this topic area is that you do not have to go out and write a whole new domain language to make headway in achieving this goal. Just thinking a bit harder and putting special emphasis on the way your code reads often gets you leaps and bounds ahead of hard-to-understand everyday code. And while Java is not the best language to host a domain language you can get a whole lot further than you would by not trying.
The essence is a mental shift away from code that is easy to write to code that is easy to read. Once you realize that most code is read a lot more often than it is written (insert your favorite statistic stating that X% of total software cost results from maintenance here), it makes sense to put a bit more effort into code that is nice to read. I am not talking about indentation and comments here but about code that is actually a pleasure to read because it is an accurate and elegant representation of the problem that is being solved. Source code is such an expressive medium that it would be a shame not to take more advantage of its capabilities. Let me give you a simple example.
Eric Evans developed an open source Time and Money library that he uses in projects and also in his training classes. I was lucky enough
to attend one of his workshops and write some code against the library. While it is
easy to come up with something significantly better than java.util.Calendar
(e.g. see JodaTime) Eric's library has some nice properties that get to the heart of expressive software.
One exercise in Eric's tutorial asks us to implement the following business rule:
payments are due 30 days after the end of the month in which the work was completed.
Using java.util.Calendar
one can easily imagine the ugly code that has to wrestle with months that start with
0 (and, in the older java.util.Date, years counted from 1900), all of which has rather little to do with the problem
at hand. Implemented on top of Eric's Time and Money the code looks like this:
CalendarDate dueDate(CalendarDate workEndDate, int allowableDays) { return workEndDate.month().end().plusDays(allowableDays); }
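For contrast, here is a rough sketch of what the same rule might look like against java.util.Calendar. The class name CalendarDueDate is made up for illustration; the Calendar calls themselves are standard JDK ones.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

final class CalendarDueDate {

    // Hypothetical helper: the payment rule written against java.util.Calendar.
    static Calendar dueDate(Calendar workEndDate, int allowableDays) {
        // Calendar is mutable, so copy before changing anything.
        Calendar due = (Calendar) workEndDate.clone();
        // Jump to the last day of the month in which the work ended.
        due.set(Calendar.DAY_OF_MONTH, due.getActualMaximum(Calendar.DAY_OF_MONTH));
        // Add the allowed number of days; add() rolls over into later months and years.
        due.add(Calendar.DAY_OF_MONTH, allowableDays);
        return due;
    }

    public static void main(String[] args) {
        // Months are zero-based: Calendar.MARCH is 2, so writing 3 here would mean April.
        Calendar workEnd = new GregorianCalendar(2006, Calendar.MARCH, 20);
        Calendar due = dueDate(workEnd, 30);
        System.out.printf("%d-%02d-%02d%n",
                due.get(Calendar.YEAR),
                due.get(Calendar.MONTH) + 1,   // convert back to a human month number
                due.get(Calendar.DAY_OF_MONTH));
    }
}
```

Nothing in it mentions a month end or a payment term; the reader has to reconstruct the rule from field arithmetic, unlike the Time and Money one-liner.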
The first observation is surely that the code reads very much like the requirement
in English: "from the work's month end add n days". To me, the most noteworthy piece of this nice line of code is the month()
method. One might find it curious that this method actually returns a CalendarInterval
instead of a number like java.util.Calendar
would. But come to think about it, the month of March is not a number, so any expression
like 3 == date.getMonth()
should seem silly. Instead the month of March is a range of days from March 1st to
March 31st.
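To make the "a month is an interval" idea concrete, here is a minimal sketch built on java.time. MiniDate, MiniInterval, and MonthAsInterval are hypothetical stand-ins invented for this example; they are not the Time and Money classes and only mimic the shape of the calls used above.

```java
import java.time.LocalDate;
import java.time.YearMonth;

// Hypothetical stand-in for a date without time of day or time zone.
final class MiniDate {
    final LocalDate value;

    MiniDate(LocalDate value) { this.value = value; }

    static MiniDate of(int year, int month, int day) {
        return new MiniDate(LocalDate.of(year, month, day));
    }

    // A month is a range of days, so month() returns an interval, not a number.
    MiniInterval month() {
        YearMonth ym = YearMonth.from(value);
        return new MiniInterval(new MiniDate(ym.atDay(1)), new MiniDate(ym.atEndOfMonth()));
    }

    MiniDate plusDays(int days) { return new MiniDate(value.plusDays(days)); }

    @Override public String toString() { return value.toString(); }
}

// Hypothetical stand-in for an interval of whole days, inclusive on both ends.
final class MiniInterval {
    private final MiniDate start;
    private final MiniDate end;

    MiniInterval(MiniDate start, MiniDate end) { this.start = start; this.end = end; }

    MiniDate start() { return start; }
    MiniDate end() { return end; }

    boolean includes(MiniDate d) {
        return !d.value.isBefore(start.value) && !d.value.isAfter(end.value);
    }
}

final class MonthAsInterval {
    // Same shape as the business rule: month end plus the allowed days.
    static MiniDate dueDate(MiniDate workEndDate, int allowableDays) {
        return workEndDate.month().end().plusDays(allowableDays);
    }

    public static void main(String[] args) {
        MiniDate workEnd = MiniDate.of(2006, 3, 20);
        // Asking whether a day falls in the month is a question about a range.
        System.out.println(workEnd.month().includes(MiniDate.of(2006, 3, 1)));
        System.out.println(dueDate(workEnd, 30));
    }
}
```

Because the month is an object with a start and an end, questions like "is this day in March?" become calls on the interval rather than comparisons against a magic number.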
Another critical observation is that CalendarDate
actually denotes a date and not a specific point in time like java.util.Calendar.
Dates do not have hours and minutes and also do not have time zones. The 20th of
March is simply that, a day in the year, regardless of where on earth you are. In
contrast, were you using Java Calendar
you would have to bake in some assumptions to have it represent a date instead of
a specific point in time. Most implementations do this by fixing the time to 0:00:00
and selecting a default time zone such as GMT. It gets tricky when these assumptions
are not explicitly stated. For example, is a method allowed to look at the hours
or minutes of a Calendar
that is passed in? Can it assume that they would also be zero? If it sets a different
value will that be persisted to the database? Code is better off not having to ask
such questions.
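A small sketch of how those hidden assumptions show up in practice, using only standard JDK classes. The class CalendarIsNotADate and its helper midnightOn are names made up for this illustration.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

final class CalendarIsNotADate {

    // Builds "a date" the conventional way: midnight in some chosen time zone.
    static Calendar midnightOn(int year, int month, int day, String zoneId) {
        Calendar c = new GregorianCalendar(TimeZone.getTimeZone(zoneId));
        c.clear();               // reset hours, minutes, seconds, and millis
        c.set(year, month, day);
        return c;
    }

    public static void main(String[] args) {
        Calendar tokyo = midnightOn(2006, Calendar.MARCH, 20, "Asia/Tokyo");
        Calendar california = midnightOn(2006, Calendar.MARCH, 20, "America/Los_Angeles");

        // Same year, month, and day fields, yet two different points in time.
        long hoursApart = (california.getTimeInMillis() - tokyo.getTimeInMillis()) / 3_600_000L;
        System.out.println("Same date, instants " + hoursApart + " hours apart");

        // Any method receiving the Calendar can read or change the time of day.
        tokyo.set(Calendar.HOUR_OF_DAY, 15);
        System.out.println("Hour is now " + tokyo.get(Calendar.HOUR_OF_DAY));
    }
}
```

Both objects claim to be "March 20th", but which one a piece of code receives depends on a convention that appears nowhere in the type.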
The Time and Money library does not hide these concepts but instead makes them explicit.
That means that if you want to convert from a CalendarDate
to a java.util.Calendar
you have to specify all assumptions that are made implicitly by Calendar, such as
the time zone. The code to accomplish this conversion looks like this:
Calendar convertToCalendar(CalendarDate date) { return date.asTimeInterval(TimeZone.getDefault()).start().asJavaCalendar(); }
You might think: wow, why so much code? But come to think about it, CalendarDate
is a date, not a point in time. So first you have to convert the day into a time
interval, e.g. the span of time from midnight on the beginning of the date until midnight
of the next day. Since we are talking times now we need to introduce the concept of
time zones. While March 20th is always March 20th when we talk about dates, as soon
as we start talking about times and time intervals we need to realize that March 20th
starts a lot earlier in Japan than it does in California. Note that because the concept
of time zone is critical for this conversion there is no default constructor. You
have to make it explicit which time zone you want to use, even if you choose the default
time zone. Lastly, if your convention is to default the time portion of Calendar
to 0:00:00 you express that by specifying the start()
of that time interval to be used for the conversion to Calendar.
Having to re-inject the assumptions might mean more typing but it answers all the
nagging questions posed above. You can shorten the expression a little:
return date.startAsTimePoint(TimeZone.getDefault()).asJavaCalendar();
Here the notion of taking the starting point and converting it to a time point is accomplished in a single step. I almost prefer the former option as it makes it more explicit what the step-by-step conversion is.
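The same "name your time zone" discipline can be sketched with the JDK's own java.time classes. This is an analogy, not the Time and Money API: LocalDate plays the role of the date, and the class name DateToCalendar is made up for the example.

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.util.Calendar;
import java.util.GregorianCalendar;

final class DateToCalendar {

    // The caller must name the zone; a LocalDate alone does not identify a point in time.
    static Calendar convertToCalendar(LocalDate date, ZoneId zone) {
        // Step 1: pick the start of that day in the given zone.
        ZonedDateTime startOfDay = date.atStartOfDay(zone);
        // Step 2: hand the resulting point in time to the legacy Calendar type.
        return GregorianCalendar.from(startOfDay);
    }

    public static void main(String[] args) {
        Calendar c = convertToCalendar(LocalDate.of(2006, 3, 20), ZoneId.of("Asia/Tokyo"));
        System.out.println(c.getTimeZone().getID() + " hour=" + c.get(Calendar.HOUR_OF_DAY));

        // Choosing the default zone is still an explicit decision at the call site.
        Calendar local = convertToCalendar(LocalDate.of(2006, 3, 20), ZoneId.systemDefault());
        System.out.println(local.getTimeZone().getID());
    }
}
```

As in the two-step Time and Money version, the zone and the "start of day" convention are both visible in the code instead of being baked into the object.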
I hope this simple example illustrates how a simple piece of business logic can be implemented in code such that it truly expresses what the business is asking for. No compiler generators, byte code injection, or other magic were needed. All that it took was to take the time to really understand the problem domain and to write code that reflects this understanding. Now that's what I call code that is worth reading.