Unit of Work


Coordinates the loading and saving of objects to ensure updates are done in the right order.

When you're pulling data in and out of a database, it's important to keep track of what you've changed; otherwise your changes won't be written back to the database. Similarly, you have to insert new objects you create and remove any objects you've deleted.

You could do this as you go along, but if you do that you'll need to open a transaction early. If you've got several things that fit into one transaction, that could lead to a transaction being open for a long time. This is not good, since it will cause locking in the database and hammer your multi-user performance.

Another problem to bear in mind is exactly how you do your updates, especially if there is referential integrity on the database. If you change an order's customer to be a customer you've just created, you need to ensure the customer is inserted into the database before you update the order. You'll need this to get a customer id number to add into the foreign key field for the order. Likewise, if you create a new order with a new customer, you'll need to ensure the customer is inserted before the order.

How it Works

Unit of Work is an object that keeps track of these things. As soon as you start doing something that may affect a database, you create a Unit of Work to keep track of the changes. Every time you create, change, or delete an object you tell the Unit of Work.

The key thing about Unit of Work is that when it comes time to commit, the Unit of Work decides what to do. It carries out the inserts, updates, and deletes in the right order. Application programmers never explicitly call methods to update the database. This way they don't have to keep track of what's changed, nor do they need to worry about how referential integrity affects the order in which they need to do things.

Of course, for this to work the Unit of Work needs to know what objects it should keep track of. Either the caller registers each object, or the object registers itself with the Unit of Work.

With caller registration the user of an object needs to remember to register the object with the Unit of Work for changes. Any objects that aren't registered won't get written out on commit. Although this allows forgetfulness to cause trouble, it does give flexibility in allowing people to make in-memory changes that they don't want written out, although we would argue that this is going to cause far more confusion than would be worthwhile. It's better to make an explicit copy for that purpose.

Figure 1: Having the caller register a changed object
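
As a minimal sketch of caller registration, assume a hypothetical Customer class and a cut-down CallerRegisteredUnitOfWork that tracks only dirty objects and records what it would write instead of talking to a database. None of these names come from the example later in this article; they exist only to show how an unregistered change is silently lost.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain object: it knows nothing about the Unit of Work.
class Customer {
    private final long id;
    private String name;

    Customer(long id, String name) {
        this.id = id;
        this.name = name;
    }

    long getId() { return id; }
    String getName() { return name; }
    void setName(String name) { this.name = name; } // no registration here
}

// Hypothetical, cut-down Unit of Work that only tracks dirty objects.
class CallerRegisteredUnitOfWork {
    private final List<Customer> dirtyObjects = new ArrayList<>();
    private final List<String> log = new ArrayList<>();

    void registerDirty(Customer customer) {
        if (!dirtyObjects.contains(customer)) {
            dirtyObjects.add(customer);
        }
    }

    // Stands in for the real database update: records what would be written.
    void commit() {
        for (Customer customer : dirtyObjects) {
            log.add("U-" + customer.getId());
        }
        dirtyObjects.clear();
    }

    List<String> getLog() { return log; }
}

class CallerRegistrationDemo {
    public static void main(String[] args) {
        CallerRegisteredUnitOfWork uow = new CallerRegisteredUnitOfWork();
        Customer registered = new Customer(1, "Ann");
        Customer forgotten = new Customer(2, "Bob");

        registered.setName("Anne");
        uow.registerDirty(registered); // the caller remembers to register
        forgotten.setName("Rob");      // changed, but never registered

        uow.commit();
        System.out.println(uow.getLog()); // prints [U-1]
    }
}
```

The change to the second customer stays in memory only, which is exactly the forgetfulness risk described above.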

With object registration the onus is removed from the caller. The usual trick here is to place registration methods in all the setters of the object. That way any change to the object forces registration. For this scheme to work the Unit of Work needs to either be passed to the object or to be in a well-known place. Passing the Unit of Work around is tedious, and it's usually no problem to have the Unit of Work present in some kind of session object.

Figure 2: Getting the receiver object to register itself

Even object registration leaves something to remember: the developer of the object needs to add a registration call to every setter. Doing this consistently becomes habitual, but a missed call is still an awkward bug. There are more sophisticated schemes that can help with this. For instance TOPLink allows you to use the Unit of Work to control the read of an object from the database. The object is automatically registered on read. TOPLink also takes a copy of the object when it's read and compares at commit time to see if it's been changed. This adds some overhead to the commit, but frees programmers from remembering to register objects.
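
The copy-and-compare idea can be sketched in a few lines. This is not TOPLink's API; it is a hypothetical SnapshotUnitOfWork, with an invented Account class and a Map standing in for the database, that registers objects on read and detects changes by comparing against the copy it took at read time.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical domain object used only for this sketch.
class Account {
    final long id;
    String owner;

    Account(long id, String owner) {
        this.id = id;
        this.owner = owner;
    }

    Account copy() {
        return new Account(id, owner);
    }

    boolean sameStateAs(Account other) {
        return id == other.id && owner.equals(other.owner);
    }
}

// Hypothetical Unit of Work that registers objects on read and finds
// changes by comparing each object with the copy taken at read time.
class SnapshotUnitOfWork {
    private final Map<Long, Account> database;                       // stands in for a real database
    private final Map<Long, Account> loaded = new LinkedHashMap<>(); // objects handed to callers
    private final Map<Long, Account> snapshots = new HashMap<>();    // state as it was when read

    SnapshotUnitOfWork(Map<Long, Account> database) {
        this.database = database;
    }

    // Reading through the Unit of Work registers the object automatically.
    // Assumes the id exists in the stand-in database.
    Account read(long id) {
        Account existing = loaded.get(id);
        if (existing != null) {
            return existing;
        }
        Account obj = database.get(id).copy();
        loaded.put(id, obj);
        snapshots.put(id, obj.copy());
        return obj;
    }

    // Writes back only the objects whose state differs from their snapshot
    // and returns the ids that were written.
    List<Long> commit() {
        List<Long> written = new ArrayList<>();
        for (Account obj : loaded.values()) {
            if (!obj.sameStateAs(snapshots.get(obj.id))) {
                database.put(obj.id, obj.copy());
                written.add(obj.id);
            }
        }
        loaded.clear();
        snapshots.clear();
        return written;
    }
}
```

The comparison loop in commit() is the overhead mentioned above: every object that was read is checked, whether or not it changed.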

Creating an object is often a special time to consider caller registration. It's not uncommon for people to want to create objects that are only supposed to be transient. A good example of this is in testing domain objects, where the tests will run much faster without database writes. Caller registration can make this apparent. However, there are other solutions, such as providing a transient constructor that doesn't register with the Unit of Work or, better still, providing a null Unit of Work that does nothing on commit.
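
A null Unit of Work might look like the sketch below. The UnitOfWork interface here mirrors the one defined in the example later in this article, DomainObject is reduced to an empty placeholder, and NullUnitOfWork is a hypothetical name.

```java
// Placeholder for the domain object superclass; empty for this sketch.
class DomainObject {
}

// Mirrors the Unit of Work interface from the example in this article.
interface UnitOfWork {
    void registerNew(DomainObject obj);
    void registerClean(DomainObject obj);
    void registerDirty(DomainObject obj);
    void registerRemoved(DomainObject obj);
    void commit();
    void rollback();
}

// Hypothetical null implementation: accepts every registration and
// writes nothing, so domain objects can be exercised without a database.
class NullUnitOfWork implements UnitOfWork {
    public void registerNew(DomainObject obj) { }
    public void registerClean(DomainObject obj) { }
    public void registerDirty(DomainObject obj) { }
    public void registerRemoved(DomainObject obj) { }
    public void commit() { }
    public void rollback() { }
}
```

A test could install an instance of this class wherever the real Unit of Work would normally be found, leaving the domain objects' registration calls untouched.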

In general, getting the update order right is not exactly trivial. To a first approximation, the thing to do is first to write out new objects, then carry out updates to changed rows, then delete anything that needs to be deleted. However, this does not solve the order of creation problem: new rows still have to be inserted in an order that satisfies their foreign keys. For many apps you can hard code the order in which tables are written out from a knowledge of the schema. General purpose mapping tools use a knowledge of the schema to compute the write order.
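
One way a tool could compute an insert order from the schema is a topological sort over foreign key dependencies. The sketch below assumes the schema is available as a map from each table name to the tables it references; the WriteOrder class and the table names are invented for illustration and do not describe any particular mapping tool.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class WriteOrder {
    // dependsOn maps a table to the tables its foreign keys point at.
    // The result lists every table after all the tables it references.
    static List<String> insertOrder(Map<String, List<String>> dependsOn) {
        List<String> ordered = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> inProgress = new HashSet<>();
        for (String table : dependsOn.keySet()) {
            visit(table, dependsOn, done, inProgress, ordered);
        }
        return ordered;
    }

    // Depth-first visit: referenced tables are emitted before the table itself.
    private static void visit(String table, Map<String, List<String>> dependsOn,
                              Set<String> done, Set<String> inProgress, List<String> ordered) {
        if (done.contains(table)) {
            return;
        }
        if (!inProgress.add(table)) {
            throw new IllegalStateException("circular foreign keys involving " + table);
        }
        for (String referenced : dependsOn.getOrDefault(table, Collections.<String>emptyList())) {
            visit(referenced, dependsOn, done, inProgress, ordered);
        }
        inProgress.remove(table);
        done.add(table);
        ordered.add(table);
    }

    public static void main(String[] args) {
        Map<String, List<String>> schema = new LinkedHashMap<>();
        schema.put("orders", Collections.singletonList("customers"));
        schema.put("customers", Collections.<String>emptyList());
        // customers is listed before orders in the printed order
        System.out.println(insertOrder(schema));
    }
}
```

Circular foreign keys have no valid order under this scheme, so the sketch simply rejects them; a real tool would need another strategy, such as deferring one of the constraints.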

When to Use it

Unit of Work provides little value when the mapping between objects and database is simple. As the mapping gets more complicated and there are more objects to keep track of, Unit of Work becomes an increasingly valuable strategy. Indeed this is one that we're inclined to go to fairly quickly, since the overhead of using it is quite small and it can remove a good bit of error-prone duplicate coding. Certainly we would argue that Unit of Work is essential with an indirect mapping strategy. With direct mapping you have more of a choice, particularly if you are using Active Record.

Example: A Simple Unit Of Work Implementation

by David Rice

Here's an implementation of the Unit of Work that, while rather simple, could provide service in a real application with only minor enhancements.

First, we need an interface that defines the Unit of Work:

interface UnitOfWork...
	public void registerNew(DomainObject obj);
	public void registerClean(DomainObject obj);
	public void registerDirty(DomainObject obj);
	public void registerRemoved(DomainObject obj);
	public void commit();
	public void rollback();

The simplest implementation of Unit of Work is a class with three lists that store dirty, new, and removed domain objects.

class SimpleUnitOfWork... 
	private List newObjects = new ArrayList();
	private List dirtyObjects = new ArrayList();
	private List removedObjects = new ArrayList();

The register...() methods will maintain the state of these lists. These methods should also perform some basic assertions such as checking that an id is not null or that a dirty object is not being registered as new. Here's how two of the registration methods might be implemented:

class SimpleUnitOfWork... 
	public void registerNew(DomainObject obj) {
		Assert.notNull(obj.getId());
		Assert.isTrue("can't register a dirty object as new", !dirtyObjects.contains(obj));
		Assert.isTrue("can't register a removed object as new", !removedObjects.contains(obj));
		Assert.isTrue("can't register an object as new twice", !newObjects.contains(obj));
		newObjects.add(obj);
	}
	public void registerDirty(DomainObject obj) {
		Assert.notNull(obj.getId());
		Assert.isTrue("can't register a removed object as dirty", !removedObjects.contains(obj));
		if (!dirtyObjects.contains(obj) && !newObjects.contains(obj)){
			dirtyObjects.add(obj);
		}
	}
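
The two registration methods not shown above could plausibly follow the same shape. The sketch below is an assumption, not part of the original implementation: it uses a stand-in TrackedObject class and a plain exception instead of the Assert helper so that it compiles on its own. The notable choice is in registerRemoved(): an object that was registered as new and then removed has never reached the database, so it is simply forgotten.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the DomainObject superclass.
class TrackedObject {
    private final Long id;

    TrackedObject(Long id) { this.id = id; }

    Long getId() { return id; }
}

// Hypothetical sketch of all four registration methods side by side.
class RegistrationSketch {
    private final List<TrackedObject> newObjects = new ArrayList<>();
    private final List<TrackedObject> dirtyObjects = new ArrayList<>();
    private final List<TrackedObject> removedObjects = new ArrayList<>();

    void registerNew(TrackedObject obj) {
        requireId(obj);
        newObjects.add(obj);
    }

    void registerDirty(TrackedObject obj) {
        requireId(obj);
        if (!dirtyObjects.contains(obj) && !newObjects.contains(obj)) {
            dirtyObjects.add(obj);
        }
    }

    void registerRemoved(TrackedObject obj) {
        requireId(obj);
        if (newObjects.remove(obj)) {
            return; // never inserted, so there is nothing to delete
        }
        dirtyObjects.remove(obj); // a pending update is pointless before a delete
        if (!removedObjects.contains(obj)) {
            removedObjects.add(obj);
        }
    }

    // With only three lists there is nothing to record for a clean object.
    void registerClean(TrackedObject obj) {
        requireId(obj);
    }

    private static void requireId(TrackedObject obj) {
        if (obj.getId() == null) {
            throw new IllegalArgumentException("id must not be null");
        }
    }

    List<TrackedObject> getNewObjects() { return newObjects; }
    List<TrackedObject> getDirtyObjects() { return dirtyObjects; }
    List<TrackedObject> getRemovedObjects() { return removedObjects; }
}
```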

commit() will locate the Database Mapper for each domain object and invoke the appropriate method:

class SimpleUnitOfWork... 
	public void commit() {
		insertNew();
		updateDirty();
		deleteRemoved();
	}
	private void insertNew() {
		for(Iterator objects = newObjects.iterator(); objects.hasNext();){
			DomainObject obj = (DomainObject)objects.next();
			MapperRegistry.getMapper(obj.getClass()).insert(obj);
		}
	}
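
The update and delete steps are not shown above; presumably they loop over their lists in the same way. The following self-contained sketch makes that assumption explicit. Item, RecordingMapper, and CommitSketch are hypothetical names, and a single mapper that logs its calls replaces the mapper registry lookup.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain object for this sketch.
class Item {
    final long id;

    Item(long id) { this.id = id; }
}

// Hypothetical mapper that records calls instead of issuing SQL.
class RecordingMapper {
    final List<String> log = new ArrayList<>();

    void insert(Item item) { log.add("I-" + item.id); }
    void update(Item item) { log.add("U-" + item.id); }
    void delete(Item item) { log.add("D-" + item.id); }
}

// Hypothetical Unit of Work showing all three commit steps.
class CommitSketch {
    final List<Item> newObjects = new ArrayList<>();
    final List<Item> dirtyObjects = new ArrayList<>();
    final List<Item> removedObjects = new ArrayList<>();
    private final RecordingMapper mapper;

    CommitSketch(RecordingMapper mapper) {
        this.mapper = mapper;
    }

    // Inserts first, then updates, then deletes.
    void commit() {
        insertNew();
        updateDirty();
        deleteRemoved();
    }

    private void insertNew() {
        for (Item obj : newObjects) {
            mapper.insert(obj);
        }
    }

    private void updateDirty() {
        for (Item obj : dirtyObjects) {
            mapper.update(obj);
        }
    }

    private void deleteRemoved() {
        for (Item obj : removedObjects) {
            mapper.delete(obj);
        }
    }
}
```

Whatever order the objects were registered in, the mapper sees every insert before any update and every update before any delete.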

For now, rollback() will clear the state lists. Defining a rollback() method in the Unit of Work interface is a bit tricky since such a method might imply transactional in-memory objects, when in fact most implementations will simply clear the state of the current Unit of Work. In a language or environment that provides transactional objects the rollback() method might prove very powerful. It is also feasible that the Unit of Work might coordinate with your mappers, finders, or cache to keep original copies around in order to properly implement rollback(), but this might prove to be both a poor performer and too complex to code and debug. That said, it might be more appropriate to define and implement a clear() method. Regardless of name, invoking this method typically means that the context within which the Unit of Work exists, e.g. a session bean, must also be reset.

class SimpleUnitOfWork... 
	public void rollback() {
		newObjects.clear();
		dirtyObjects.clear();
		removedObjects.clear();
	}

Next, we need to facilitate object registration. The first need is a 'session' object where the context can register the current Unit of Work and objects can locate the current Unit of Work. The easiest way to do this is to use a static instance of ThreadLocal:

class CurrentUnitOfWork... 
	private static ThreadLocal current = new ThreadLocal();
	public static void register(UnitOfWork uow) {
		current.set(uow);
	}
	public static void deregister() {
		current.set(null);
	}
	public static UnitOfWork get() {
		return (UnitOfWork)current.get();
	}
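
The reason ThreadLocal suits this job is that each thread sees only the value it set itself, so concurrent requests cannot pick up each other's Unit of Work. The sketch below demonstrates that property with a hypothetical WorkContext class that stores a plain Object; it is an illustration of ThreadLocal behaviour, not a replacement for CurrentUnitOfWork.

```java
// Hypothetical holder equivalent in spirit to CurrentUnitOfWork.
class WorkContext {
    private static final ThreadLocal<Object> current = new ThreadLocal<>();

    static void register(Object unitOfWork) { current.set(unitOfWork); }
    static void deregister() { current.remove(); }
    static Object get() { return current.get(); }

    // Runs get() on a freshly started thread and returns what that thread saw.
    static Object valueSeenByAnotherThread() {
        final Object[] seen = new Object[1];
        Thread worker = new Thread(() -> seen[0] = get());
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}

class WorkContextDemo {
    public static void main(String[] args) {
        Object unitOfWork = new Object();
        WorkContext.register(unitOfWork);
        System.out.println(WorkContext.get() == unitOfWork);               // prints true
        System.out.println(WorkContext.valueSeenByAnotherThread() == null); // prints true
        WorkContext.deregister();
    }
}
```

Because the value is tied to the thread, it must be cleared when the work finishes if threads are reused, for example in a server's thread pool.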

The second part of object registration is to provide a simple means for domain objects to register with the current Unit of Work. One way to do this is to add to your domain object superclass methods such as markNew() and markDirty() that will locate and register with the Unit of Work.

class DomainObject... 
	protected void markNew() {
		CurrentUnitOfWork.get().registerNew(this);
	}
	protected void markClean() {
		CurrentUnitOfWork.get().registerClean(this);
	}
	protected void markDirty() {
		CurrentUnitOfWork.get().registerDirty(this);
	}
	protected void markRemoved() {
		CurrentUnitOfWork.get().registerRemoved(this);
	}

Once object registration is in place, concrete domain objects need to remember to mark themselves new, dirty, clean, and removed where appropriate.

class TestDomainObject... 
	public static TestDomainObject create(Long id, String name) {
		TestDomainObject obj = new TestDomainObject(id, name);
		obj.markNew();
		return obj;
	}
	public void setName(String name) {
		this.name = name;
		markDirty();
	}

The final piece of the Unit of Work implementation is setting up synchronization between the Unit of Work and its context. That is, the Unit of Work needs to be instantiated, registered, committed, and rolled back. The simplest way to do this is to explicitly code registration and commit. Here's an example test case:

class TestForExample... 
	public void testShowUnitOfWorkInUse() {
		TestMapper mapper = new TestMapper();
		MapperRegistry.registerMapper(mapper, TestDomainObject.class);
		CurrentUnitOfWork.register(new SimpleUnitOfWork());

		TestDomainObject obj1 = TestDomainObject.activate(new Long(123), "obj1");
		TestDomainObject obj2 = TestDomainObject.create(new Long(456), "obj2");
		obj1.setName("newName");
		CurrentUnitOfWork.get().commit();

		assertEquals("I-456,U-123", mapper.getLog());
		CurrentUnitOfWork.deregister();
	}

It's easy to see how one would move registration, deregistration, and perhaps commit and rollback to the setUp() and tearDown() methods of a test case in order to make Unit of Work synchronization transparent to the test developer. Transparent Unit of Work synchronization increases the likelihood of consistent, error-free code. When writing a business application you'll need to take a look at your deployment container in order to provide transparent Unit of Work synchronization. Many possibilities exist: a few examples are hooking into the SessionSynchronization interface of a session bean, code generating a wrapper around your application facade, or using an abstract Unit of Work enabled servlet. Here's what the servlet might look like:

class UnitOfWorkServlet... 
	final protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
		try{
			UnitOfWork uow = new SimpleUnitOfWork();
			CurrentUnitOfWork.register(uow);
			get(request, response);
			uow.commit();
		} finally{
			CurrentUnitOfWork.deregister();
		}
	}
	abstract void get(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException;


© Copyright Martin Fowler, all rights reserved