Interesting applications rarely live in isolation. Whether your sales application must interface with your inventory application, your procurement application must connect to an auction site, or your PDA’s PIM must synchronize with the corporate calendar server, it seems like any application can be made better by integrating it with other applications.
All integration solutions have to deal with a few fundamental challenges:
- Networks are unreliable. Integration solutions have to transport data from one computer to another across networks. Compared to a process running on a single computer, distributed computing has to be prepared to deal with a much larger set of possible problems. Often times, two systems to be integrated are separated by continents and data between them has to travel through phone-lines, LAN segments, routers, switches, public networks, and satellite links. Each of these steps can cause delays or interruptions.
- Networks are slow. Sending data across a network is multiple orders of magnitude slower than making a local method call. Designing a widely distributed solution the same way you would approach a single application could have disastrous performance implications.
- Any two applications are different. Integration solutions need to transmit information between systems that use different programming languages, operating platforms, and data formats. An integration solution needs to be able to interface with all these different technologies.
- Change is inevitable. Applications change over time. An integration solution has to keep pace with changes in the applications it connects. Integration solutions can easily get caught in an avalanche effect of changes – if one system changes, all other systems may be affected. An integration solution needs to minimize the dependencies from one system to another by using loose coupling between applications.
Over time, developers have overcome these challenges with four main approaches:
- File Transfer — One application writes a file that another later reads. The applications need to agree on the filename and location, the format of the file, the timing of when it will be written and read, and who will delete the file.
- Shared Database — Multiple applications share the same database schema, located in a single physical database. Because there is no duplicate data storage, no data has to be transferred from one application to the other.
- Remote Procedure Invocation — One application exposes some of its functionality so that it can be accessed remotely by other applications as a remote procedure. The communication occurs real-time and synchronously.
- Messaging — One applications publishes a message to a common message channel. Other applications can read the message from the channel at a later time. The applications must agree on a channel as well as the format of the message. The communication is asynchronous.
While all four approaches solve essentially the same problem, each style has its distinct advantages and disadvantages. In fact, applications may integrate using multiple styles such that each point of integration takes advantage of the style that suits it best.
What is Messaging?
This book is about how to use messaging to integrate applications. A simple way to understand what messaging does is to consider the telephone system. A telephone call is a synchronous form of communication. I can only communicate with the other party if the other party is available at the time I place the call. Voice mail on the other hand, allows asynchronous communication. With voice mail, when the receiver does not answer, the caller can leave him a message; later the receiver (at his convenience) can listen to the messages queued in his mailbox. Voice mail enables the caller to leave a message now so that the receiver can listen to it later, which is lot easier than trying to get the caller and the receiver on the phone at the same time. Voice mail bundles (at least part of) a phone call into a message and queues it for later consumption; this is essentially how messaging works.
Messaging is a technology that enables high-speed, asynchronous, program-to-program communication with reliable delivery. Programs communicate by sending packets of data called messages to each other. Channels, also known as queues, are logical pathways that connect the programs and convey messages. A channel behaves like a collection or array of messages, but one that is magically shared across multiple computers and can be used concurrently by multiple applications. A sender or producer is a program that sends a message by writing the message to a channel. A receiver or consumer is a program that receives a message by reading (and deleting) it from a channel.
The message itself is simply some sort of data structure—such as a string, a byte array, a record, or an object. It can be interpreted simply as data, as the description of a command to be invoked on the receiver, or as the description of an event that occurred in the sender. A message actually contains two parts, a header and a body. The header contains meta-information about the message—who sent it, where it’s going, etc.; this information is used by the messaging system and is mostly (but not always) ignored by the applications using the messages. The body contains the data being transmitted and is ignored by the messaging system. In conversation, when an application developer who is using messaging talks about a message, he’s usually referring to the data in the body of the message.
Asynchronous messaging architectures are powerful, but require us to rethink our development approach. As compared to the other three integration approaches, relatively few developers have had exposure to messaging and message systems. As a result, application developers in general are not as familiar with the idioms and peculiarities of this communications platform.
What is a Messaging System?
Messaging capabilities are typically provided by a separate software system called a messaging system or message-oriented middleware (MOM). A messaging system manages messaging the way a database system manages data persistence. Just as an administrator must populate the database with the schema for an application’s data, an administrator must configure the messaging system with the channels that define the paths of communication between the applications. The messaging system then coordinates and manages the sending and receiving of messages. The primary purpose of a database is to make sure each data record is safely persisted, and likewise the main task of a messaging system is to move messages from the sender’s computer to the receiver’s computer in a reliable fashion.
The reason a messaging system is needed to move messages from one computer to another is that computers and the networks that connect them are inherently unreliable. Just because one application is ready to send a communication does not mean that the other application is ready to receive it. Even if both applications are ready, the network may not be working, or may fail to transmit the data properly. A messaging system overcomes these limitations by repeatedly trying to transmit the message until it succeeds. Under ideal circumstances, the message is transmitted successfully on the first try, but circumstances are often not ideal.
In essence, a message is transmitted in five steps:
- Create — The sender creates the message and populates it with data.
- Send — The sender adds the message to a channel.
- Deliver — The messaging system moves the message from the sender’s computer to the receiver’s computer, making it available to the receiver.
- Receive — The receiver reads the message from the channel.
- Process — The receiver extracts the data from the message.
This diagram illustrates these five transmission steps, which computer performs each, and which steps involve the messaging system:
This diagram also illustrates two important messaging concepts:
- Send and forget — In step 2, the sending application sends the message to the message channel. Once that send is complete, the sender can go on to other work while the messaging system transmits the message in the background. The sender can be confident that the receiver will eventually receive the message and does not have to wait until that happens.
- Store and forward — In step 2, when the sending application sends the message to the message channel, the messaging system stores the message on the sender’s computer, either in memory or on disk. In step 3, the messaging system delivers the message by forwarding it from the sender’s computer to the receiver’s computer, and then stores the message once again on the receiver’s computer. This store-and-forward process may be repeated many times, as the message is moved from one computer to another, until it reaches the receiver’s computer.
The create, send, receive, and process steps may seem like unnecessary overhead. Why not simply deliver the data to the receiver? By wrapping the data as a message and storing it in the messaging system, the applications delegate to the messaging system the responsibility of delivering the data. Because the data is wrapped as an atomic message, delivery can be retried until it succeeds and the receiver can be assured of reliably receiving exactly one copy of the data.
Why Use Messaging?
Now that we know what messaging is, we should ask: Why use messaging? As with any sophisticated solution, there is no one simple answer. The quick answer is that messaging is more immediate than File Transfer, better encapsulated than Shared Database, and more reliable than Remote Procedure Invocation. However, that’s just the beginning of the advantages that can be gained using messaging.
Specific benefits of messaging include:
- Remote Communication. Messaging enables separate applications to communicate and transfer data. Two objects that reside in the same process can simply share the same data in memory. Sending data to another computer is a lot more complicated and requires data to be copied from one computer to another. This means that objects have to "serializable", i.e. they can be converted into a simple byte stream that can be sent across the network. If remote communication is not needed, messaging is not needed; a simpler solution such as concurrent collections or shared memory is sufficient.
- Platform/Language Integration.When connecting multiple computer systems via remote communication, these systems likely use different languages, technologies and platforms, perhaps because they were developed over time by independent teams. Integrating such divergent applications can require a demilitarized zone of middleware to negotiate between the applications, often using the lowest common denominator—such as flat data files with obscure formats. In these circumstances, a messaging system can be a universal translator between the applications that works with each one’s language and platform on its own terms, yet allows them to all communicate through a common messaging paradigm. This universal connectivity is the heart of the Message Bus pattern.
- Asynchronous Communication. Messaging enables a send and forget approach to communication. The sender does not have to wait for the receiver to receive and process the message; it does not even have to wait for the messaging system to deliver the message. The sender only needs to wait for the message to be sent, e.g. for the message to successfully be stored in the channel by the messaging system. Once the message is stored, the sender is then free to perform other work while the message is transmitted in the background. The receiver may want to send an acknowledgement or result back to the sender, which requires another message, whose delivery will need to be detected by a callback mechanism on the sender.
- Variable Timing. With synchronous communication, the caller must wait for the receiver to finish processing the call before the caller can receive the result and continue. In this way, the caller can only make calls as fast as the receiver can perform them. On the other hand, asynchronous communication allows the sender to batch requests to the receiver at its own pace, and for the receiver to consume the requests at its own different pace. This allows both applications to run at maximum throughput and not waste time waiting on each other (at least until the receiver runs out of messages to process).
- Throttling. A problem with remote procedure calls is that too many of them on a single receiver at the same time can overload the receiver. This can cause performance degradation and even cause the receiver to crash. Asynchronous communication enables the receiver to control the rate at which it consumes requests, so as not to become overloaded by too many simultaneous requests. The adverse effect on callers caused by this throttling is minimized because the communication is asynchronous, so the callers are not blocked waiting on the receiver.
- Reliable Communication. Messaging provides reliable delivery that a remote procedure call (RPC) cannot. The reason messaging is more reliable than RPC is that messaging uses a store and forward approach to transmitting messages. The data is packaged as messages, which are atomic, independent units. When the sender sends a message, the messaging system stores the message. It then delivers the message by forwarding it to the receiver’s computer, where it is stored again. Storing the message on the sender’s computer and the receiver’s computer is assumed to be reliable. (To make it even more reliable, the messages can be stored to disk instead of memory; see Guaranteed Delivery.) What is unreliable is forwarding (moving) the message from the sender’s computer to the receiver’s computer, because the receiver or the network may not be running properly. The messaging system overcomes this by resending the message until it succeeds. This automatic retry enables the messaging system to overcome problems with the network such that the sender and receiver don’t have to worry about these details.
- Disconnected Operation. Some applications are specifically designed to run disconnected from the network, yet to synchronize with servers when a network connection is available. Such applications are deployed on platforms like laptop computers, PDA’s, and automobile dashboards. Messaging is ideal for enabling these applications to synchronize—data to be synchronized can be queued as it is created, waiting until the application reconnects to the network.
- Mediation. The messaging system acts as a mediator—as in the Mediator pattern [GoF]—between all of the programs that can send and receive messages. An application can use it as a directory of other applications or services available to integrate with. If an application becomes disconnected from the others, it need only reconnect to the messaging system, not to all of the other messaging applications. The messaging system can be used to provide a high number of distributed connections to a shared resource, such as a database. The messaging system can employ redundant resources to provide high-availability, balance load, reroute around failed network connections, and tune performance and quality of service.
- Thread Management. Asynchronous communication means that one application does not have to block while waiting for another application to perform a task, unless it wants to. Rather than blocking to wait for a reply, the caller can use a callback that will alert the caller when the reply arrives. (See the Request-Reply pattern.) A large number of blocked threads, or threads blocked for a long time, can be problematic. Too many blocked threads may leave the application with too few available threads to perform real work. If an application with some dynamic number of blocked threads crashes, when the application restarts and recovers its former state, re-establishing those threads will be difficult. With callbacks, the only threads that block are a small, known number of listeners waiting for replies. This leaves most threads available for other work and defines a known number of listener threads that can easily be re-established after a crash.
So there are a number of different reasons an application or enterprise may benefit from messaging. Some of these are technical details that application developers relate most readily to, whereas others are strategic decisions that resonate best with enterprise architects. Which of these reasons is most important depends on the current requirements of your particular applications. They’re all good reasons to use messaging, so take advantage of whichever reasons provide the most benefit to you.
Challenges of Asynchronous Messaging
Asynchronous messaging is not the panacea of integration. It resolves many of the challenges of integrating disparate systems in an elegant way but it also introduces new challenges. Some of these challenges are inherent in the asynchronous model while other challenges vary with the specific implementation of a messaging system.
- Complex programming model. Asynchronous messaging requires developers to work with an event-driven programming model. Application logic can no longer be coded in a single method that invokes other methods, but the logic is now split up into a number of event handlers that respond to incoming messages. Such a system is more complex and harder to develop and debug. For example, the equivalent of a simple method call can require a request message and a request channel, a reply message and a reply channel, a correlation identifier and an invalid message queue (as described in Request-Reply).
- Sequence issues. Message channels guarantee message delivery, but they do not guarantee when the message will be delivered. This can cause messages that are sent in sequence to get out of sequence. In situations where messages depend on each other special care has to be taken to re-establish the message sequence.
- Synchronous scenarios. Not all applications can operate in a send and forget mode. If a user is looking for airline tickets, he or she is going to want to see the ticket price right away, not after some undetermined time. Therefore, many messaging systems need to bridge the gap between synchronous and asynchronous solutions. (See the Request-Reply pattern.)
- Performance. Messaging systems do add some overhead to communication. It takes effort to make data into a message and send it, and to receive a message and process it. If you have to transport a huge chunk of data, dividing it into a gazillion small pieces may not be a smart idea. For example, if an integration solution needs to synchronize information between two existing systems, the first step is usually to replicate all relevant information from one system to the other. For such a bulk data replication step, ETL (extract, transform, and load) tools are much more efficient than messaging. Messaging is best suited to keeping the systems in sync after the initial data replication.
- Limited platform support. Many proprietary messaging systems are not available on all platforms. Often times it is easier to FTP a file to another platform than accessing it via a messaging system.
- Vendor lock-in. Many messaging system implementations rely on proprietary protocols. Even common messaging specifications such as JMS do not control the physical implementation of the solution. As a result, different messaging systems usually do not connect to one another. This can leave you with a whole new integration challenge: integrating multiple integration solutions! (See the Messaging Bridge pattern.)
So asynchronous messaging does not solve all problems, and can even create some new ones. Keep these consequences in mind when deciding which problems to solve using messaging.
Thinking Asynchronously
Messaging is an asynchronous technology, which enables delivery to be retried until it succeeds. In contrast, most applications use synchronous function calls; for example: a procedure calling a sub-procedure, one method calling another method, or one procedure invoking another remotely through a remote procedure call (RPC) (such as CORBA and DCOM). Synchronous calls imply that the calling process is halted while the sub-process is executing a function. Even in an RPC scenario, where the called sub-procedure executes in a different process, the caller blocks until the sub-procedure returns control (and the results) to the caller. When using asynchronous messaging, the caller uses a send and forget approach that allows it to continue to execute after it sends the message. As a result, the calling procedure continues to run while the sub-procedure is being invoked.
Asynchronous communication has a number of implications. First, we no longer have a single thread of execution. Multiple threads enable sub-procedures to run concurrently, which can greatly improve performance and help ensure that some sub-processes are making progress even while other sub-processes may be waiting for external results. However, concurrent threads can also make debugging much more difficult. Second, results (if any) arrive via a callback. This enables the caller to perform other tasks and be notified when the result is available, which can improve performance. However, the caller has to be able to process the result even while it is in the middle of other tasks, and it has to be able to use the result to remember the context in which the call was made. Third, asynchronous sub-processes can execute in any order. Again, this enables one sub-procedure to make progress even while another cannot. But it also means that the subprocesses must be able to run independently in any order, and the caller must be able to determine which result came from which sub-process and combine the results together. So asynchronous communication has several advantages but requires rethinking how a procedure uses its sub-procedures.
Distributed Applications vs. Integration
This book is about enterprise integration—how to integrate independent applications so that they can work together. An enterprise application often incorporates an n-tier architecture (a more sophisticated version of a client/server architecture) enabling it to be distributed across several computers. Even though this results in processes on different machines communicating with each other, this is application distribution, not application integration.
Why is an n-tier architecture considered application distribution and not application integration? First, the communicating parts are tightly coupled—they dependent directly on each other, so that one tier cannot function without the others. Second, communication between tiers tends to be synchronous. Third, an application (n-tier or atomic) tends to have human users that will only accept rapid system response.
In contrast, integrated applications are independent applications that can each run by itself, but coordinate with each other in a loosely coupled way. This enables each application to focus on one comprehensive set of functionality and yet delegate to other applications for related functionality. Integrated applications communicating asynchronously don’t have to wait for a response; they can proceed without a response or perform other tasks concurrently until the response is available. Integrated applications tend to have a broad time constraint, such that they can work on other tasks until a result becomes available, and therefore are more patient than most human users waiting real-time for a result.
Commercial Messaging Systems
The apparent benefits of integrating systems using an asynchronous messaging solution have opened up a significant market for software vendors creating messaging middleware and associated tools. We can roughly group the messaging vendors’ products into the following four categories:
- Operating Systems. Messaging has become such a common need that vendors have started to integrate the necessary software infrastructure into the operating system or database platform. For example, the Microsoft Windows 2000 and Windows XP operating systems include the Microsoft Message Queuing (MSMQ) service software. This service is accessible through a number of API’s, including COM components and the System.Messaging namespace, part of the Microsoft .NET platform. Similarly, Oracle offers Oracle AQ as part of its database platform.
- Application Servers. Sun Microsystems first incorporated the Java Messaging Service (JMS) into version 1.2 of the J2EE specification. Since then, virtually all J2EE application servers (such as IBM WebSphere, BEA WebLogic, etc.) provide an implementation for this specification. Also, Sun delivers a JMS reference implementation with the J2EE JDK.
- EAI Suites. Products from these vendors offer proprietary—but functionally rich—suites that encompass messaging, business process automation, workflow, portals, and other functions. Key players in this marketplace are IBM WebSphere MQ, Microsoft BizTalk, TIBCO, WebMethods, SeeBeyond, Vitria, CrossWorlds, and others. Many of these products include JMS as one of the many client API’s they support, while other vendors—such as SonicSoftware and Fiorano—focus primarily on implementing JMS-compliant messaging infrastructures.
- Web Services Toolkits. Web services have garnered a lot of interest in the enterprise integration communities. Standards bodies and consortia are actively working on standardizing reliable message delivery over web services (i.e., WS-Reliability, WS-ReliableMessaging, and ebMS). A growing number of vendors offer tools that implement routing, transformation, and management of web services-based solutions.
The patterns in this book are vendor-independent and apply to most messaging solutions. Unfortunately, each vendor tends to define their own terminology when describing messaging solutions. In this book we have striven to choose pattern names that are technology- and product-neutral, yet descriptive and easy to use conversationally.
Many messaging vendors have incorporated some of this book’s patterns as features of their products, which simplifies applying the patterns and accelerates solution development. Readers who are familiar with a particular vendor’s terminology will most likely recognize many of the concepts in this book. To help these readers map the pattern language to the vendor-specific terminology, the following tables map the most common pattern names to their corresponding product feature names in some of the most widely-used messaging products.
Enterprise Integration Patterns | Java Message Service (JMS) | Microsoft MSMQ | WebSphere MQ |
Message Channel | Destination | MessageQueue | Queue |
Point-to-Point Channel | Queue | MessageQueue | Queue |
Publish-Subscribe Channel | Topic | --- | --- |
Message | Message | Message | Message |
Message Endpoint | MessageProducer, MessageConsumer |
Enterprise Integration Patterns | TIBCO | WebMethods | SeeBeyond | Vitria |
Message Channel | Topic | Intelligent Queue | Channel | |
Point-to-Point Channel | Distributed Queue | Intelligent Queue | Channel | |
Publish-Subscribe Channel | Subject | --- | Intelligent Queue | Pub/Sub Channel |
Message | Message | Document | Event | Event |
Message Endpoint | Publisher, Subscriber | Publisher, Subscriber | Publisher, Subscriber | Publisher, Subscriber |
Pattern Form
This book is structured as a set of patterns organized into a pattern language. Books such as Design Patterns, Pattern Oriented Software Architecture, Core J2EE Patterns, and Patterns of Enterprise Application Architecture have popularized the concept of using patterns to document computer-programming techniques. Christopher Alexander pioneered the concept of patterns and pattern languages in his books A Pattern Language and A Timeless Way of Building. Each pattern represents a decision that the reader must make and the considerations that go into that decision. A pattern language is a web of related patterns where each pattern leads to others, guiding the reader through the decision making process. This approach is a powerful technique for documenting an expert’s knowledge so that it can be readily understood and applied by non-experts.
A pattern language teaches the reader how to solve a limitless variety of problems within a bounded problem space. Because the overall problem that is being solved is different every time, the path through the patterns and how they’re applied is also unique. In this way, this book was written for anyone using any messaging tools for any application, but can be applied specifically for you and the specific application of messaging that you are facing.
Just using the pattern form does not guarantee that a book contains a wealth of knowledge. It is not just enough to simply say, “When you face this problem, apply this solution.” For a reader to truly learn from a pattern, it has to document why the problem is difficult to solve, consider possible solutions that in fact don’t work well, and explain why the solution offered is the best available. Likewise, the patterns need to connect to each other so as to walk the reader from one problem to the next. In this way, the pattern form can be used to teach the reader not just what solutions to apply, but how to solve problems the author could not have predicted. These are goals we strive to accomplish in this book.
Patterns should be prescriptive, meaning that they should tell you what to do. They don’t just describe a problem, and they don’t just describe how to solve it, they tell you what to do to solve it. Each pattern represents a decision the reader must make: “Should I use Messaging?” “Would a Reply Message help me here?” The point of the patterns and the pattern language is to help the reader make decisions that lead to a good solution for his specific problem, even if the authors didn’t have that specific problem in mind, and even if the reader doesn’t have the knowledge and experience to develop that solution on his own.
There is no one universal pattern form; different books use various structures. We used a style that is fairly close to the Alexandrian form, which was first popularized for computer programming in Smalltalk Best Practice Patterns by Kent Beck. We like the Alexandrian form because it results in patterns that are more prose-like. As a result, even though each pattern follows an identical, well-defined structure, the format avoids headings for each individual sub-section, which disrupt the flow of the discussion. To improve navigability, the format uses style elements such as bolding, indentation, and pictures to help the reader identify important sections even at a quick glance.
This pattern language uses the following pattern structure:
- Name – This is an identifier for the pattern that indicates what the pattern does. We chose names that can easily be used in a sentence that describes applying the pattern so that it is easy to reference the pattern’s concept in a conversation between designers.
- Icon – Many patterns are associated with an icon in addition to the pattern name. Because many architects are used to communicating visually by using diagrams, we wanted to provide a visual language in addition to the verbal language. This visual language underlines the composability of the patterns as multiple pattern icons can be combined to describe the solution of a larger, more complex pattern.
- Context – This explains what you might be working on that would make you likely to run into the problem that this pattern solves. The context sets the stage for the problem and often refers to other patterns you may have already applied.
- Problem – This explains the difficulty you are facing, expressed as a question you’re asking yourself, which this pattern solves. You should be able to read the problem statement and quickly determine if this pattern is relevant to your work. We’ve formatted the problem to be one sentence, bold and indented.
- Forces – The forces explore the constraints that make the problem difficult to solve. If it were easy, you wouldn’t need a pattern. They often consider alternative solutions that seem promising but don’t pan out, which helps show the value of the real solution.
- Solution – This is a template that explains what you should do to solve the problem. It is not specific to your particular circumstances, but describes what to do in the variety of circumstances represented by the problem. If you understand a pattern’s problem and solution, you understand the pattern and don’t necessarily need to read the other sections. We’ve formatted the solution to be one sentence, bold and indented.
- Sketch – One of the most appealing properties of the Alexandrian form is that each pattern contains a sketch that illustrates the solution. In many cases, just by looking at the pattern name and the sketch you can understand the essence of the pattern. We tried to maintain this style by inserting a solution picture, or sketch, after the solution statement of each pattern.
- Results – This part expands upon the solution to explain the details of how to apply the solution and how it resolves the forces. It also addresses new challenges that may arise as a result of applying this pattern.
- Next – This section lists other patterns to be considered after applying the current one. Patterns don’t live in isolation; the application of one pattern usually leads you to new problems that are solved by other patterns. This is what makes the collection a pattern language and not just a pattern catalog.
- Sidebars – These sections discuss more detailed technical issues or variations of the pattern. We set these sections visually apart from the remainder of the text so you can easily skip them if they are not be relevant to your particular application of the pattern.
- Examples – A pattern usually includes one or more examples of the pattern being applied or having been applied. An example may be as simple as naming a known use or as detailed as a large segment of sample code. Given the large number of available messaging technologies, we do not expect readers to be familiar with each technology used to implement an example. Therefore, we designed the patterns so that you can safely skip the example without loosing any critical content of the pattern.
The beauty in describing solutions as patterns is that it not only teaches the reader how to solve the specific problems discussed, but also how to create designs that solve problems the authors were not even aware of. As a result, these patterns for messaging describe not only messaging systems that exist today, but may also apply to new ones created well after this book is published.
Diagram Notation
Integration solutions consist of many different pieces—applications, databases, endpoints, channels, messages, routers, etc. If we want to describe an integration solution, we need to define a notation that accommodates all these different components. To our knowledge, there is no widely used, comprehensive notation that is geared towards the description of all aspects of an integration solution. The Unified Modeling Language (UML) does a fine job of describing object-oriented systems with class and interaction diagrams, but it does not contain semantics to describe messaging solutions. The UML Profile for EAI [UMLEAI] enriches the semantics of collaboration diagrams to describe message flows between components. This notation is very useful as a precise visual description of a system that can serve as the basis for code generation as part of a model-driven architecture (MDA). We decided not to adopt this notation for two reasons. First, the UML Profile does not capture all the patterns described in our pattern language. Second, we were not looking to create a precise visual specification, but images that have a certain ‘sketch’ quality to them. We wanted pictures that are able to convey the essence of a pattern to the reader at a quick glance—very much like Alexander’s sketch. That’s why we decided to create our own ‘notation’. Luckily, unlike the more formal notation, ours does not require you to read a large manual. A simple picture should suffice:
This simple picture shows a message being sent to a component over a channel. We use the word component very loosely here—it can indicate an application that is being integrated, an intermediary that transforms or routes the message between applications, or a specific part of an application. Sometimes, we also depict a channel as a three-dimensional pipe if we want to highlight the channel itself. Often times we are more interested in the components and draw the channels as simple lines with arrow heads. The two notations are equivalent. We depict the message as a small tree with a round root and nested, square elements. The tree elements can be shaded or colored to highlight their usage in a particular pattern. Many messaging systems allow messages to contain tree-like data structures, for example XML documents. Also, depicting messages in this way allows us to provide a quick visual description of transformation patterns—it will be easy to show a pattern that adds, re-arranges or removes fields from the message.
When we describe application designs—for example, messaging endpoints or examples written in C# or Java—we do use standard UML class and sequence diagrams to depict the class hierarchy and the interaction between objects. The UML notation is widely accepted as the standard way of describing these types of solutions (if you need a refresher on UML, have a look at [UML]).
Examples and Interludes
We have tried to underline the broad applicability of the patterns by including implementation examples using a variety of integration technologies. The potential downside of this approach is that you may not be familiar with each technology that is being used in an example. That’s why we made sure that reading the examples is strictly optional — all relevant points are discussed in the pattern description. Therefore, you can safely skip the examples without risk of losing out on important detail. Also, where possible, we provided more than one implementation example using different technologies.
When presenting example code we focused on readability over runnability. A code segment can help remove any potential ambiguity left by the solution description and many application developers and architects prefer looking at 30 lines of code as opposed to reading many paragraphs of text. To support this intent we often only show the most relevant methods or classes of a potentially larger solution. We also omitted most forms of error checking to highlight the core function implemented by the code. Most code snippets do not contain in-line comments as the code is explained in the paragraphs before and after the code segment.
Providing a meaningful example for a single integration pattern is challenging. Enterprise integration solutions typically consist of a number of heterogeneous components, spread across multiple systems. Likewise, most integration patterns do not operate in isolation but rely on other patterns to form a meaningful solution. To highlight the collaboration between multiple patterns we included more comprehensive examples as interludes at the end of the major sections of the book. These solutions illustrate many of the trade-offs involved in designing a more comprehensive messaging solution.
All code samples should be treated as illustrative tools only and not as a starting point for development of an integration solution. For example, almost all examples lack any form of error checking or concern for robustness, security, or scalability.
We tried as much as possible to base the examples on software platforms that are available free of charge or as a trial version. In some cases, we used commercial platforms (such as TIBCO ActiveEnterprise or Microsoft BizTalk) to illustrate the difference between developing a solution from scratch and using a commercial tool. We presented those example in such a way that they are educational even if you do not have access to the required run-time platform. For many examples, we use relatively bare-bones messaging frameworks such as JMS or MSMQ. This allows us to be more explicit in the example and focus on the problem at hand instead of distracting from it with all the features a more complex middleware toolset may provide.
The Java examples in this book are based on the JMS 1.1 specification, which is part of the J2EE 1.4 specification. By the time this book is published, most messaging and application server vendors will support JMS 1.1. You can download Sun’s reference implementation of the JMS specification from Sun’s Web site: http://java.sun.com/j2ee.
The Microsoft .NET examples are based on Version 1.1 of the .NET Framework and are written in C#. You can download the .NET Framework SDK from Microsoft’s Web site: http://msdn.microsoft.com/net.
Organization of this Book
The pattern language in this book, as with any pattern language, is a web of patterns referring to each other. At the same time, some patterns are more fundamental than others, forming a hierarchy of big-concept patterns that lead to finer-detailed patterns. The big-concept patterns form the load-baring members of the pattern language. They are the main ones, what we term root patterns, that provide the foundation of the language and support the other patterns.
This book groups patterns into chapters by level-of-abstraction and by topic area. The following diagram shows the root patterns and their relationship to the chapters of the book.
The most fundamental pattern is Messaging; that’s what this whole book is about. It leads to the six root patterns—which are in the Messaging Systems chapter—namely Message Channel, Message, Pipes and Filters, Message Router, Message Translator, and Message Endpoint. In turn, each of these root patterns leads to its own chapter in the book (except Pipes and Filters, which is not specific to messaging but is the basis of the routing and transformation patterns).
The pattern language is divided into eight chapters, which follow the hierarchy described above:
- Chapter 1: Integration Styles – This chapter reviews the different approaches available for integrating applications, including Messaging.
- Chapter 2: Messaging Systems – This chapter reviews the six root messaging patterns, giving an overview of the entire pattern language.
- Chapter 3: Messaging Channels – Applications communicate via channels. Channels define the logical pathways a message can follow. This chapter shows how to determine what channels your applications need.
- Chapter 4: Message Construction – Once you have message channels, you need messages to send on them. This chapter explains the different ways messages can be used and how to take advantage of their special properties.
- Chapter 5: Message Routing – As a messaging topography becomes more complex, senders know less and less about who should receive their messages. Rather, they send the messages to intermediate applications that send them to others until the messages finally find their way to their final destination. This chapter teaches you the responsibilities of these routing applications.
- Chapter 6: Message Transformation – Independently developed applications often don’t agree on messages’ formats, on the form and meaning of supposedly unique identifiers, and even the character encoding to be used. Therefore, intermediate components are needed to convert messages from the form one application produced to that which other applications will consume. This chapter shows how to design these transformer applications.
- Chapter 7: Messaging Endpoints – Many applications were not designed to participate in a messaging solution. As a result, they must be explicitly connected to the messaging system. This section describes a messaging layer in the applications that is responsible for sending and receiving the messages, making your application an endpoint for messages.
- Chapter 8: System Management – Once we have a messaging system in place to integrate our applications, how do we make sure that it’s running correctly and doing what we want? This chapter explores how to test and monitor a running messaging system.
These eight chapters go together to teach you what you need to know about connecting applications using messaging.
Getting Started
With any book that has a lot to teach, it’s hard to know where to start, both for the authors and the readers. Reading all of the pages straight through assures covering the entire subject area, but isn’t the quickest way to get to the issues that are of the most help. Starting with a pattern in the middle of the language can be like starting to watch a movie that’s half over; you see what’s happening but don’t understand what it means.
Luckily, the pattern language is formed around root patterns (as described earlier). These root patterns collectively provide an overview of the pattern language, and individually provide starting points for delving deep into the details of messaging. To get an overall survey of the language without reviewing all of the patterns, start with reviewing the root patterns. To jump into the middle of the language, jump in at a root pattern, a place where the language has finished discussing one major topic and is now starting another.
Chapter 1: Integration Styles provides an overview of the four main application integration techniques and settles on Messaging as being the best overall for many integration opportunities. Read this chapter if you are unfamiliar with issues involved in application integration and the pros and cons of the various approaches that are available. If you just want to know what’s so great about messaging, go straight to that pattern. If you’re already convinced that messaging is the way to go and want to get started with how to use messaging, you can skip the first chapter completely.
Chapter 2: Messaging Systems contains all of this pattern language’s root patterns (except Messaging, which is in the first chapter). For an overview of the pattern language, read (or at least skim) all of the patterns in this chapter. To dive deep on a particular topic, read its root pattern, then go to the patterns mentioned in its next section at the end of the pattern; those next patterns will all be in a chapter named after the root pattern.
The root patterns in this language are:
- Messaging – This is the #1 root pattern for the entire book: What is messaging, what problem does it solve, and how does it solve it?
- Message Channel – What is the structure in a messaging system that conveys messages from the sender to the receiver? How do you know which ones your applications need?
- Message – How does information get communicated from a sender to a receiver?
- Pipes and Filters – How can intermediate steps be performed after a message is sent but before it is received?
- Message Router – If the sender does not know ultimately where the message should go, how can the messaging system get it there?
- Message Translator – If the sender and receiver do not agree on the message format, how can they communicate?
- Message Endpoint – How do the applications that send and receive messages connect to the messaging system?
After the first two chapters, different types of messaging developers may be most interested in different chapters, based on the specifics of how each group uses messaging to perform integration:
- System Administrators may be most interested in Chapter 3: Messaging Channels, the guidelines for what channels to create, and Chapter 8: System Management, guidance on how to maintain a running messaging system.
- Application Developers should look at Chapter 7: Messaging Endpoints to learn how integrate an application with a messaging system, and Chapter 4: Message Construction to learn what messages to send when.
- System Integrators will gain the most from Chapter 5: Message Routing—how to direct messages to the proper receivers—and Chapter 6: Message Transformation—how to convert messages from the sender’s format to the receiver’s.
Keep in mind that when reading a pattern, if you’re in a hurry, start by just reading the problem and solution (the two sentences in bold). This will give you enough information to determine if the pattern is of interest to you right now, and if you already know the pattern. If you do not know the pattern and it sounds interesting, go ahead and read the other parts.
Also remember that this is a pattern language, so the patterns are not necessarily meant to be read in the order they’re presented in the book. The book’s order teaches you about messaging by considering all of the relevant topics in turn and discussing related issues together. To use the patterns to solve a particular problem, start with an appropriate root pattern. Its context explains what patterns need to be applied before this one, even if they’re not the ones immediately preceding this one in the book. Likewise, the next section (the last paragraph of the pattern) describes what patterns to consider applying after this one, even if they’re not the ones immediately following this one in the book. Use the web of interconnected patterns, not the linear list of book pages, to guide you through the material.
Supporting Web Site
Please look for companion information to this book plus related information on enterprise integration at our Web site: www.enterpriseintegrationpatterns.com. You can also e-mail your comments, suggestions and feedback to us at authors@enterpriseintegrationpatterns.com.
Summary
You should now have a good understanding of the following concepts which are fundamental to the material in this book:
- What messaging is
- What a messaging system is
- Why to use messaging
- How asynchronous programming is different
- How application integration is different from application distribution
- What types of commercial products contain messaging systems
You should also have a feel for how this book is going to teach you how to use messaging:
- The role patterns have in structuring the material
- The meaning of the custom notation used in the diagrams
- The purpose and scope of the examples
- The organization of the material
- How to get started learning the material
Now that you understand the basic concepts and how the material will be presented, you are now ready to start learning how to integrate applications using messaging.