severoon

October 2011

Java & OO: Complex validation using builders and immutability

All ye non-geeks dare not enter. Ye hath been warn'd, verily.

(FYI, LJ is ~AWFUL~ for editing code. I took my time to do all the highlighting and indenting right, and LJ ruined the formatting and coloring. =( )

The Problem

Have you ever found yourself wanting to create an object that requires complex validation? Your class might have to keep a number of different properties that all must be consistent with each other: property foo is an integer that's only allowed to be between 30 (inclusive) and 60 (inclusive), unless string array bar is non-null, in which case foo is only allowed to be between 0 (inclusive) and 45 (inclusive), etc. You can imagine a set of interrelated properties with arbitrarily complex validation rules.

How is this situation normally handled? In my experience, normal practice is to simply create a value object, store all the values in it with no validation, and deal with data inconsistency problems only when using the data causes a failure. By that time, the data is usually far from where it was created, possibly even having been stored in a database for several days (weeks? months?). By the time the data is checked for validity, the user or system that provided it is long gone and is no longer interacting with your code.

Here's the way it might normally be written:
/** The "defer validation until it's too late" approach. */
public class FooBar {
  protected int foo;
  protected String[] bar;

  public FooBar(int foo, String[] bar) {
    this.foo = foo;
    this.bar = bar;
  }

  public int getFoo() { return foo; }
  public String[] getBar() { return bar; }
}
The other approach I've seen is to use validation that attempts to validate each piece of data at the most granular level possible, perhaps in the setters. This creates a problem when validation rules interact. In the example above, foo's validation rules depend upon bar's value. If we validate foo in its setter, we have no way of knowing if bar has been set to its intended value yet. We could provide a setter that allows interrelated values to be set in a single method call, but you can see how this quickly runs out of control if we have a complex value object with many properties.
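To make the ordering problem concrete, here is a minimal sketch of setter-level validation. The class names SetterValidatedFooBar and SetterOrderDemo are hypothetical, and the sketch assumes the rule as it is implemented in the listings in this post: foo must be in [0, 45] when bar is non-null and in [30, 60] when bar is null.

```java
/** Hypothetical example: validating in setters makes call order significant. */
class SetterValidatedFooBar {
  private int foo = 30;   // valid while bar is null
  private String[] bar;   // null until set

  public void setFoo(int foo) {
    // This check can only see the value bar has *right now*.
    int fooMin = bar != null ? 0 : 30;
    int fooMax = bar != null ? 45 : 60;
    if (foo < fooMin || foo > fooMax) {
      throw new IllegalArgumentException("foo out of range: " + foo);
    }
    this.foo = foo;
  }

  public void setBar(String[] bar) {
    // Changing bar can silently invalidate a foo that was accepted earlier.
    this.bar = bar;
  }

  public int getFoo() { return foo; }
  public String[] getBar() { return bar; }
}

class SetterOrderDemo {
  /** @return true if setting foo=10 before bar is rejected */
  static boolean fooFirstIsRejected() {
    SetterValidatedFooBar fb = new SetterValidatedFooBar();
    try {
      fb.setFoo(10);  // bar is still null, so the allowed range is [30, 60]
      return false;
    } catch (IllegalArgumentException expected) {
      return true;
    }
  }

  /** @return true if the same values are accepted when bar is set first */
  static boolean barFirstIsAccepted() {
    SetterValidatedFooBar fb = new SetterValidatedFooBar();
    fb.setBar(new String[] { "x" });
    fb.setFoo(10);  // bar is non-null, so the allowed range is [0, 45]
    return fb.getFoo() == 10;
  }
}
```

The same pair of values is accepted or rejected depending purely on the order of the two calls, which is exactly the kind of accident we want to rule out.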

Other validation frameworks make similar errors at higher levels of granularity, perhaps allowing validation only at the scope of a particular page in a web app. If it makes sense to collect the values of foo and bar from the user on different pages, though, that breaks down as well. A good approach to validation allows the developer to treat the scope of the validation differently from any scope tied to the user interface. Rather, the scope of a validation should be only defined by the intrinsic scope of the data itself and independent from any details of the application in which it happens to exist.

The Solution

I.
A better way to handle this is to maintain FooBar's immutability and validate its contents upon construction. The goal of this approach is to guarantee that if a FooBar instance exists, it is valid. (New code highlighted.)
import java.util.Arrays;

/** Encapsulated and validated, but still not great. */
public class FooBar {
  private final int foo;
  private final String[] bar;

  /**
   * Constructs a valid {@code FooBar} instance.
   *
   * ...doc specifying preconditions here...
   *
   * @throws IllegalArgumentException if a valid instance cannot be created
   *     based on provided values
   */
  public FooBar(int foo, String[] bar) {
    int fooMin = bar != null ? 0 : 30;
    int fooMax = bar != null ? 45 : 60;
    if (foo < fooMin || foo > fooMax) {
      throw new IllegalArgumentException();
    }
    this.foo = foo;
    this.bar = copy(bar);
  }

  public int getFoo() { return foo; }
  public String[] getBar() { return copy(bar) ; }

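  /** Convenience method that returns a copy of {@code array}, or {@code null} if it is null. */
  private static String[] copy(String[] array) {
    // bar is allowed to be null, so guard against a NullPointerException here
    return array == null ? null : Arrays.copyOf(array, array.length);
  }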
}
Note that, as part of providing strict validation, we must make a defensive copy of bar both on the way in (the constructor) and on the way out (the getter) in order to completely encapsulate the properties of this object. Since strings are immutable in Java, we don't need to worry about copying the individual elements of the array.

For more complex value objects, guaranteeing immutability can be a headache. You can avoid this headache by building up value objects from immutables from the lowest level on up.
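As a rough sketch of what "building up from immutables" can look like, the hypothetical ImmutableBarHolder below stores bar as an unmodifiable list instead of an array. The class and method names are assumptions made for illustration; only the java.util calls are real.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical holder that keeps its property in an unmodifiable list. */
final class ImmutableBarHolder {
  private final List<String> bar;

  ImmutableBarHolder(List<String> bar) {
    // Copy once on the way in, then wrap so nobody can modify the copy.
    this.bar = Collections.unmodifiableList(new ArrayList<String>(bar));
  }

  /** No copy needed on the way out: the list itself rejects modification. */
  List<String> getBar() { return bar; }
}

class ImmutableBarDemo {
  /** @return true if neither the source list nor the getter can alter the holder */
  static boolean callerCannotModify() {
    List<String> source = new ArrayList<String>(Arrays.asList("a", "b"));
    ImmutableBarHolder holder = new ImmutableBarHolder(source);
    source.add("c");  // does not affect the holder's private copy
    try {
      holder.getBar().add("d");
      return false;
    } catch (UnsupportedOperationException expected) {
      return holder.getBar().size() == 2;
    }
  }
}
```

With this shape the getter no longer needs its own defensive copy, which is one less thing to forget.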

We're a little closer. However, we have now introduced a new problem for callers.

II.
Imagine that instead of just two fairly simple properties, this class contained many, many properties with a complex set of rules of validation. Note that the contract of this class requires callers to pass valid properties into the constructor as a precondition. If this object can only be created when all of those values are valid, we're essentially putting all of the responsibility on the caller to make sure the data is valid prior to calling the constructor.

Faced with complicated preconditions that must be met before calling the constructor, client code is likely to just create another mutable value object to contain all of the properties and pass that around instead—essentially, the first incarnation of FooBar up above. We have a correct object here from a design standpoint in that it enforces encapsulation and the validation contract, but in doing so we haven't made it easy on the caller...we've simply encouraged the caller to avoid using this object until absolutely necessary, or seek ways to avoid using it at all.

We can solve this new problem with the Builder design pattern. We capture the validation in one place, and restrict creation of the value object to the builder.
/** New & improved, now with a builder. */
public class FooBar {
  private final int foo;
  private final String[] bar;

  /** Prevent direct instantiation. */
  private FooBar(Builder builder) {
    this.foo = builder.foo;
    this.bar = copy(builder.bar);
  }

  public int getFoo() { return foo; }
  public String[] getBar() { return copy(bar) ; }

  private static String[] copy(String[] array) {
    return array == null ? null : Arrays.copyOf(array, array.length);
  }

  public static class Builder {
    private int foo;
    private String[] bar;

    public Builder withFoo(int foo) {
      this.foo = foo;
      return this;
    }

    public Builder withBar(String[] bar) {
      this.bar = bar;
      return this;
    }

    /** @return {@code true} if a valid {@code FooBar} instance can be built */
    public boolean isValid() {
      int fooMin = bar != null ? 0 : 30;
      int fooMax = bar != null ? 45 : 60;
      return fooMin <= foo && foo <= fooMax;
    }

    public void validate() {
      if (!isValid()) throw new IllegalArgumentException();
    }

    public FooBar build() {
      validate();
      return new FooBar(this);
    }
  }
}
Now we're doing some of the heavy lifting for the caller. Notice that the builder we provide supports call chaining; that is, the with*() methods return the builder itself, allowing a caller a very convenient way to create FooBar instances:
FooBar foobar = new FooBar.Builder().withFoo(30).withBar(null).build();
Furthermore, the extraction of the validation code to the isValid() method gives the caller an easy way to check if calling build() will succeed or throw an exception. A more complex implementation could have isValid() return more than just a boolean, indicating exactly what's wrong (which, in this post, I'll leave as an exercise for the reader).
// get a new builder with some defaults set
FooBar.Builder fbBuilder = new FooBar.Builder().withFoo(1).withBar(null);
do {
  // prompt user for properties
} while (!fbBuilder.isValid());
FooBar validFooBar = fbBuilder.build();
By using the builder, we give callers a place to collect all the various properties of FooBar, check for validity along the way, and create an instance when this process is complete. In this way, the caller may have an arbitrarily complex process of obtaining these properties and simply pass the builder around, and the validation of the resulting value object is completely free of any scope related to that activity. (Looking at this approach in a slightly different light, we could begin to regard the builder as a mutable value object that can produce immutable snapshots whenever needed. That view won't hold for long, though: in the next section, build() will reset the state of our builder every time it is invoked.)
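Coming back to the idea that isValid() could report more than a boolean, one possible shape is a method that returns a list of human-readable problems. This is only a sketch: FooBarRules and violations() are hypothetical names, and the range rule is the one implemented in the listings above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper that explains why a foo/bar combination is invalid. */
class FooBarRules {
  /** @return an unmodifiable list of problems; empty means the values are valid */
  static List<String> violations(int foo, String[] bar) {
    List<String> problems = new ArrayList<String>();
    int fooMin = bar != null ? 0 : 30;
    int fooMax = bar != null ? 45 : 60;
    if (foo < fooMin || foo > fooMax) {
      problems.add("foo must be in [" + fooMin + ", " + fooMax + "] but was " + foo);
    }
    return Collections.unmodifiableList(problems);
  }

  /** The boolean check becomes a thin wrapper over the detailed one. */
  static boolean isValid(int foo, String[] bar) {
    return violations(foo, bar).isEmpty();
  }
}
```

A caller prompting a user could show these messages directly instead of looping blindly until isValid() returns true.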

At this point, I think some of you may raise an objection. Why do FooBar's properties need to be marked final? They're already private, and removing the restriction of marking them final would mean that the builder could simply instantiate an empty FooBar and set its properties directly instead of keeping its own copies locally. As long as the class (or the builder) doesn't mess up and modify/allow the caller to modify those values post-build, and they're validated at build time, what's the advantage of having them marked final?

That's fair. Particularly for classes that have a lot of properties, copying them all in the builder is a lot of boilerplate. Marking the fields final does buy something, though: the compiler rejects any accidental reassignment, and the Java memory model guarantees that other threads see final fields fully initialized once the constructor completes (as long as the reference to the object doesn't escape during construction). I will keep them final here, but this is a call you should make for your own situation. (Also, it's worth pointing out that since annotations were introduced into Java, pretty much all of the boilerplate involved here could be removed.)

One last thing before we move on...note that FooBar's constructor takes its builder. If you've ever created an immutable object with a constructor that takes a million parameters, you understand why this is worth pointing out. If we were dealing with an object that takes 10 ints, for instance, passing them in the wrong order causes bugs that the compiler cannot catch and that are not easy to spot.
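Here is a small illustration of the parameter-order point, using a hypothetical Box class with just two ints. All names are made up for the example.

```java
/** Hypothetical value object; the names are illustrative only. */
final class Box {
  private final int width;
  private final int height;

  /** Positional constructor: new Box(2, 3) and new Box(3, 2) both compile. */
  Box(int width, int height) {
    this.width = width;
    this.height = height;
  }

  /** Builder-taking constructor: there is nothing positional to get wrong. */
  private Box(Builder builder) {
    this(builder.width, builder.height);
  }

  int getWidth() { return width; }
  int getHeight() { return height; }

  static Builder builder() { return new Builder(); }

  static final class Builder {
    private int width;
    private int height;

    private Builder() {}

    Builder withWidth(int width) { this.width = width; return this; }
    Builder withHeight(int height) { this.height = height; return this; }
    Box build() { return new Box(this); }
  }
}
```

With the builder, Box.builder().withHeight(3).withWidth(2) and Box.builder().withWidth(2).withHeight(3) produce the same object, whereas swapping the arguments of the positional constructor silently produces a different one.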

III.
This is nice, but we can do more.

We could create our builder so that callers may reuse it to build multiple instances. Also, notice that in the sample code above the builder is immediately set up with default values even before prompting the user to input values. Our builder should provide sensible defaults already set, and an easy way to reset to that default state whenever needed.
/** Getting there... */
public class FooBar {
  private final int foo;
  private final String[] bar;

  private FooBar(Builder builder) {
    this.foo = builder.foo;
    this.bar = copy(builder.bar);
  }

  public int getFoo() { return foo; }
  public String[] getBar() { return copy(bar) ; }

  private static String[] copy(String[] array) {
    return array == null ? null : Arrays.copyOf(array, array.length);
  }

  /** Easy way to get a new builder. */
  public static Builder builder() { return new Builder(); }

  public static class Builder {
    private int foo;
    private String[] bar;

    /** Construct a new builder based on default properties. */
    private Builder() { reset(); }

    public Builder withFoo(int foo) {
      this.foo = foo;
      return this;
    }

    public Builder withBar(String[] bar) {
      this.bar = bar;
      return this;
    }

    public final boolean isValid() {
      int fooMin = bar != null ? 0 : 30;
      int fooMax = bar != null ? 45 : 60;
      return fooMin <= foo && foo <= fooMax;
    }

    public final void validate() {
      if (!isValid()) throw new IllegalArgumentException();
    }

    public final FooBar build() {
      validate();
      FooBar newbie = new FooBar(this);
      reset();
      return newbie;
    }

    /** Reset this builder to default values. */
    public final void reset() {
      defaults();
      validate();
    }

    /** Set builder properties to sensible defaults. */
    protected void defaults() {
      withFoo(0).withBar(new String[] { "Hello, world!" });
    }
  }
}
Note that several of the builder methods are now marked final. If you guessed that's because I'm getting ready to extract common builder behavior to a superclass in the next step, you're right. If you further guessed that I'm setting things up to use the Template Method design pattern, you are way ahead of the pack. Taking advantage of generics in combination with the Template Method, we can save ourselves a lot of boilerplate.

Take note of a few other things above.
  • the builder constructor is now private, and we provide a static method on FooBar for callers
  • the builder constructor now initializes the builder to default values, making use of the new reset() method
  • the responsibility for setting those defaults has been extracted to its own method, defaults()
  • the externally visible reset() method sets the builder to defaults and validates that those defaults are valid
This last point is desirable because the default properties set by the defaults() method should always allow a caller to immediately build the object without having to check for validity:
FooBar foobar = FooBar.builder().build();
In the case where a builder accidentally sets invalid defaults, the validate() call in reset() throws as soon as a builder is constructed, so any test that creates a builder will detect the problem right away.
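The fail-fast behavior can be seen in isolation with a deliberately broken builder. BadDefaultsBuilder and BadDefaultsDemo are hypothetical names, and the builder is stripped down to the parts that matter here.

```java
/** Hypothetical builder whose defaults violate its own validation rule. */
class BadDefaultsBuilder {
  private int foo;
  private String[] bar;

  /** Same shape as the builder above: construction goes through reset(). */
  BadDefaultsBuilder() { reset(); }

  boolean isValid() {
    int fooMin = bar != null ? 0 : 30;
    int fooMax = bar != null ? 45 : 60;
    return fooMin <= foo && foo <= fooMax;
  }

  final void reset() {
    defaults();
    if (!isValid()) throw new IllegalArgumentException("invalid defaults");
  }

  private void defaults() {
    foo = 50;  // mistake: 50 is only valid when bar is null
    bar = new String[] { "Hello, world!" };
  }
}

class BadDefaultsDemo {
  /** @return true if merely constructing the builder exposes the bad defaults */
  static boolean constructionFails() {
    try {
      new BadDefaultsBuilder();
      return false;
    } catch (IllegalArgumentException expected) {
      return true;
    }
  }
}
```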

It's worth pointing out something about "sensible defaults". Once these are chosen, they become a published part of the builder's API. This means that if one were to change them down the road, the intention ought to be that all calling code that does not override the default values gets the new default values. The takeaway is this: even though it may seem that we're simply providing a nicety with these defaults, as soon as we make it part of the public API of a class, we need to be willing to remain backwards compatible or support a reasonable deprecation process. The more relied-upon our code, the more important it is to maintain the API responsibly.

IV.
Having done all this work, we're now free to pull the boilerplate up to a superclass. We also take the opportunity to define an interface that captures the builder API exposed to callers. Implementers of builders get a separate API in the form of an abstract class. We repeat the complete definition of FooBar below using these new classes:
/** {@code Builder} API visible to callers. */
public interface Builder<T> {
  /** @return {@code true} iff current properties specify a valid {@code T} instance */
  boolean isValid();

  /**
   * @return a {@code T} instance based on current properties
   * @throws IllegalArgumentException if a valid {@code T} instance cannot be built
   */
  T build();

  /** Reset all current properties to valid defaults. */
  void reset();
}


/** An abstract {@link Builder} implementation for easy subclassing. */
public abstract class AbstractBuilder<T> implements Builder<T> {
  /** Construct a new builder with valid defaults. */
  protected AbstractBuilder() { reset(); }

  public final T build() {
    validate();
    T newbie = construct();
    reset();
    return newbie;
  }

  public final void reset() {
    defaults();
    validate();
  }

  /** @throws IllegalArgumentException if <code>{@link #isValid()}==false</code> */
  private final void validate() {
    if (!isValid()) throw new IllegalArgumentException();
  }

  /** Set properties to default values. */
  protected abstract void defaults();

  /** Create a {@code T} instance based on current properties. */
  protected abstract T construct();
}


/** {@code FooBar} that supports validation, complete with builder. */
public class FooBar {
  private final int foo;
  private final String[] bar;

  private FooBar(Builder builder) {
    this.foo = builder.foo;
    this.bar = copy(builder.bar);
  }

  public int getFoo() { return foo; }
  public String[] getBar() { return copy(bar); }

  private static String[] copy(String[] array) {
    return array == null ? null : Arrays.copyOf(array, array.length);
  }

  public static Builder builder() { return new Builder(); }

  public static class Builder extends AbstractBuilder<FooBar> {
    public static final int FOO_DEFAULT = 0;
    public static final String[] BAR_DEFAULT = new String[] { "Hello, world!" };

    private int foo;
    private String[] bar;

    private Builder() {}

    public Builder withFoo(int foo) {
      this.foo = foo;
      return this;
    }

    public Builder withBar(String[] bar) {
      this.bar = bar;
      return this;
    }

    public boolean isValid() {
      int fooMin = bar != null ? 0 : 30;
      int fooMax = bar != null ? 45 : 60;
      return fooMin <= foo && foo <= fooMax;
    }

    protected void defaults() {
      withFoo(FOO_DEFAULT).withBar(BAR_DEFAULT);
    }

    protected FooBar construct() { return new FooBar(this); }
  }
}
Look at all we've accomplished now!

The FooBar class is able to define a builder with a fair amount of ease, and all such classes can reuse the common builder code we've extracted to the superclass. The only methods that need to be defined for FooBar's builder are those that have to deal with the specific aspects of building a FooBar instance. The only methods a caller needs to use that are specific to building a FooBar instance are the public methods defined on the FooBar.Builder API.

I would like to take a moment to point out one subtle problem that may have escaped attention. In the FooBar.Builder above, I have extracted the default properties to constants. This is proper because these default values are part of the builder's public API; since they are intrinsic properties of this builder, they have to be publicly defined somewhere to maintain correct design.

Great, so where's the problem? The astute reader will notice that, since I opted to define the bar property as a string array instead of using a collection, we find ourselves in a bit of a pickle. A caller could conceivably modify the contents of BAR_DEFAULT, breaking our builder's encapsulation! This is one area where our API is "leaky", and it's a problem we could solve had we specified bar as type Collection<String> (or List, or Set, etc., depending on whether its contents need to be ordered, unique, etc.).

I've intentionally used an array in this example code to allow myself to make this point here. In a real system, BAR_DEFAULT would be immutable, and arrays are not. From here forward, example code will show this corrected.
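The leak is easy to demonstrate. In the sketch below, LeakyDefaults and SafeDefaults are hypothetical stand-ins for a builder's constants, one using an array and one using an unmodifiable list.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical constants holder using a mutable array. */
class LeakyDefaults {
  public static final String[] BAR_DEFAULT = { "Hello, world!" };
}

/** Hypothetical constants holder using an unmodifiable list. */
class SafeDefaults {
  public static final List<String> BAR_DEFAULT =
      Collections.unmodifiableList(Arrays.asList("Hello, world!"));
}

class DefaultsLeakDemo {
  /** @return true if any caller can overwrite the array's contents */
  static boolean arrayCanBeCorrupted() {
    String original = LeakyDefaults.BAR_DEFAULT[0];
    LeakyDefaults.BAR_DEFAULT[0] = "corrupted";  // final protects the reference only
    boolean changed = "corrupted".equals(LeakyDefaults.BAR_DEFAULT[0]);
    LeakyDefaults.BAR_DEFAULT[0] = original;     // restore for later callers
    return changed;
  }

  /** @return true if the list refuses modification and keeps its value */
  static boolean listRejectsModification() {
    try {
      SafeDefaults.BAR_DEFAULT.set(0, "corrupted");
      return false;
    } catch (UnsupportedOperationException expected) {
      return "Hello, world!".equals(SafeDefaults.BAR_DEFAULT.get(0));
    }
  }
}
```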

One final note, the construct() method has snuck its way into the AbstractBuilder API. We must introduce it here to defer construction of the generic type to subclasses; it's not possible for AbstractBuilder's build() method to know how to construct an as-yet-unspecified type.
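To show how little a second value object has to write once the superclass exists, here is a sketch of a hypothetical Percentage class reusing the same machinery. Builder and AbstractBuilder are repeated in compact form (package-private, without the docs) so the example stands on its own; Percentage and its 0 to 100 rule are invented for illustration.

```java
interface Builder<T> {
  boolean isValid();
  T build();
  void reset();
}

abstract class AbstractBuilder<T> implements Builder<T> {
  protected AbstractBuilder() { reset(); }

  public final T build() {
    validate();
    T newbie = construct();
    reset();
    return newbie;
  }

  public final void reset() {
    defaults();
    validate();
  }

  private void validate() {
    if (!isValid()) throw new IllegalArgumentException();
  }

  protected abstract void defaults();
  protected abstract T construct();
}

/** Hypothetical second value object: an int constrained to 0..100. */
final class Percentage {
  private final int value;

  private Percentage(Builder builder) { this.value = builder.value; }

  public int getValue() { return value; }

  public static Builder builder() { return new Builder(); }

  /** Only the Percentage-specific parts live here. */
  public static final class Builder extends AbstractBuilder<Percentage> {
    public static final int VALUE_DEFAULT = 0;

    private int value;

    private Builder() {}

    public Builder withValue(int value) {
      this.value = value;
      return this;
    }

    public boolean isValid() { return 0 <= value && value <= 100; }

    protected void defaults() { withValue(VALUE_DEFAULT); }

    protected Percentage construct() { return new Percentage(this); }
  }
}
```

One thing to keep in mind with this design: the superclass constructor calls defaults() before the subclass constructor body runs, so builder fields should not carry their own initializers, or those initializers would overwrite the defaults.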

Where may we go from here? Are there any other conveniences we can provide to callers?

V.
Imagine a situation where there is not one obvious set of default properties. Let's say that there are a couple of different possibilities for "obvious" defaults. To make this example more concrete, imagine if most callers like the defaults we've specified above, but there is also a significant minority of callers that want to start with the following defaults instead:
FooBar.Builder fbBuilder = FooBar.builder().withFoo(30).withBar(null);
With the code we have so far, we'll have a lot of callers writing this boilerplate all over. We can do better simply by adding another static builder method on FooBar that gets a builder with alternative defaults. (You were wondering why I made the builder constructor private and used a static method a ways back, weren't you?)
public class FooBar {
  // ...

  public static Builder builder()  { return new Builder(); }

  public static Builder nullBarBuilder() {
    return new Builder().withFoo(30).withBar(null);
  }

  // ...
}
...but you already know what I'm going to say. Remember above where I extracted the builder defaults to constants to publicly define them? We have no such advantage here. A unit test would have to simply know the defaults for the "null bar" variant of the builder, for instance; they're not publicly defined anywhere.

VI.
We need make no such concession. One of the advantages of this approach is that we can create several builders quite easily. (The entire FooBar is shown below; recall the change of bar from String[] to List<String>.)
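import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Now with more builders! */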
public class FooBar {
  private final int foo;
  private final List<String> bar;

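  private FooBar(Builder builder) {
    this.foo = builder.foo;
    // bar may legitimately be null (see NullBarBuilder), so copy only when present
    this.bar = builder.bar == null
        ? null
        : Collections.unmodifiableList(new ArrayList<String>(builder.bar));
  }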

  public int getFoo() { return foo; }
  
  /** @return {@code bar} as an unmodifiable list */
  public List<String> getBar() { return bar; }

  public static DefaultBuilder builder() { return new DefaultBuilder(); }
  public static NullBarBuilder nullBarBuilder() { return new NullBarBuilder(); }

  public static abstract class Builder extends AbstractBuilder<FooBar> {
    private int foo;
    private List<String> bar;

    private Builder() {}

    public final Builder withFoo(int foo) {
      this.foo = foo;
      return this;
    }

    public final Builder withBar(List<String> bar) {
      this.bar = bar;
      return this;
    }

    protected abstract int defaultFoo();
    protected abstract List<String> defaultBar();

    public final boolean isValid() {
      int fooMin = bar != null ? 0 : 30;
      int fooMax = bar != null ? 45 : 60;
      return fooMin <= foo && foo <= fooMax;
    }

    protected final void defaults() {
      withFoo(defaultFoo()).withBar(defaultBar());
    }

    protected final FooBar construct() { return new FooBar(this); }
  }

  public static class DefaultBuilder extends Builder {
    public static final int FOO_DEFAULT = 0;
    public static final List<String> BAR_DEFAULT = Collections.unmodifiableList(
        Arrays.asList(new String[] { "Hello, world!" }));

    private DefaultBuilder() {}

    protected int defaultFoo() { return FOO_DEFAULT; }
    protected List<String> defaultBar() { return BAR_DEFAULT; }
  }

  public static class NullBarBuilder extends Builder {
    public static final int FOO_DEFAULT = 30;
    public static final List<String> BAR_DEFAULT = null;

    private NullBarBuilder() {}

    protected int defaultFoo() { return FOO_DEFAULT; }
    protected List<String> defaultBar() { return BAR_DEFAULT; }
  }
}
Note that the static builder methods on FooBar return the specific builder types, so as to make clear to callers where their defaults are defined. Since these classes define no methods, though, it would be quite reasonable for all callers to keep references simply as type FooBar.Builder.

Also note that we have created methods in FooBar.Builder that defer obtaining the default values of each property to its subclasses. For such a simple object as FooBar, this is a bit overwrought; later, we'll remove those methods and simply let these specific builders implement the defaults() method directly.

However, I wanted to show this approach here to make clear that the defaultFoo() and defaultBar() method implementations are pure boilerplate. Since the addition of annotations to the Java language, this means it would be quite easy to replace them with simple annotations on the public constants directly. In this way, for a more complex object than FooBar, we could write the defaults() method once in the FooBar.Builder and provide subclasses that are nothing more than annotated constants.
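One way the annotation idea might look is sketched below. Everything here is an assumption for the sake of illustration: @DefaultFor, AnnotatedDefaults, and DefaultsReader are invented names, and the sketch only reads annotated constants into a map rather than wiring them into a builder.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical marker: names the builder property a constant is the default for. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface DefaultFor {
  String value();
}

/** Hypothetical builder subclass body reduced to nothing but annotated constants. */
class AnnotatedDefaults {
  @DefaultFor("foo")
  public static final int FOO_DEFAULT = 0;

  @DefaultFor("bar")
  public static final List<String> BAR_DEFAULT =
      Collections.unmodifiableList(Arrays.asList("Hello, world!"));
}

class DefaultsReader {
  /** Collects every static field marked with {@code @DefaultFor}, keyed by property name. */
  static Map<String, Object> read(Class<?> holder) {
    Map<String, Object> defaults = new HashMap<String, Object>();
    for (Field field : holder.getDeclaredFields()) {
      DefaultFor marker = field.getAnnotation(DefaultFor.class);
      if (marker == null || !Modifier.isStatic(field.getModifiers())) {
        continue;  // not a default constant
      }
      try {
        field.setAccessible(true);
        defaults.put(marker.value(), field.get(null));  // null target: static field
      } catch (IllegalAccessException e) {
        throw new IllegalStateException("cannot read " + field.getName(), e);
      }
    }
    return defaults;
  }
}
```

A shared defaults() implementation could then look up each property by name in that map, at the cost of trading compile-time checks for reflection.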

An aside on the "constants interface" antipattern. If you do find yourself in the situation where it is appropriate to define a class that does nothing more than define a bunch of constants, do not declare it as an interface! Instead, make a proper constants class: mark the class final and give it a private constructor. The practice of implementing a constants interface is a nasty shortcut because the side effect is that the interface becomes a type, yet it is a type that defines no methods and therefore no contract; no caller would ever want to instantiate a bunch of different instances that implement that interface and then refer to those instances polymorphically by the type the interface defines. (The same reasoning applies to a "constants abstract class": interfaces and abstract classes are meant to be extended and treated polymorphically by callers, so constants classes need not apply!) There was never really a good reason to do this, and since static imports were introduced into the Java language, there is really no excuse.
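For reference, a proper constants class is only a few lines. FooBarLimits and its values are hypothetical; the point is the shape.

```java
/** Hypothetical constants class: final, non-instantiable, constants only. */
final class FooBarLimits {
  public static final int FOO_MIN = 0;
  public static final int FOO_MAX = 60;

  /** Private constructor: nobody can instantiate or subclass this class. */
  private FooBarLimits() {
    throw new AssertionError("no instances");
  }
}
```

Callers refer to FooBarLimits.FOO_MAX directly, or use a static import when the class lives in a named package, and no type in the system ever "is a" FooBarLimits.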

VII.
Now we are at a point where you can begin to imagine our validated value object being useful in real world scenarios that require an industrial strength solution. In particular, picture a situation where we have deeply nested hierarchies of different validated value objects akin to a DOM tree and you can really begin to develop an appreciation for this approach.

Let's take a moment now to consider testability. Say you have a service that takes a FooBar instance as a parameter. If we use a simple, unvalidated value object as at the top of this post, think about how much more difficult this service becomes to write...the service that uses a FooBar must validate its state. This validation code now lives in our service, which means that in order to properly test the service, its unit tests must pass in variants of invalid FooBars to make sure the service does the validation properly. Imagine that our service takes one of the more complex value objects with a deeply nested hierarchy of other value objects—suddenly we have an explosion of validations and testing that must go on at this service method!

By moving all of the responsibility of validation into the builders of each argument passed into this service, we create code that is highly testable. It's not daunting at all to write unit tests for FooBar that guarantee the correct operation of its builders. Here's how we test the state provided by the default builder and how it handles various values of foo:
public void testDefaultBuilder_expectedDefaults() {
  assertEquals(0, FooBar.DefaultBuilder.FOO_DEFAULT);
  assertEquals(1, FooBar.DefaultBuilder.BAR_DEFAULT.size());
  assertEquals("Hello, world!", FooBar.DefaultBuilder.BAR_DEFAULT.get(0));
}

public void testDefaultBuilder_defaultFooBar() {
  FooBar foobar = FooBar.builder().build();
  assertEquals(FooBar.DefaultBuilder.FOO_DEFAULT, foobar.getFoo());
  assertEquals(FooBar.DefaultBuilder.BAR_DEFAULT, foobar.getBar());
}

public void testIsValid_validFoo() {
  // the default builder has a non-null bar, so isValid() accepts foo in [0, 45]
  assertIsValid_foo(true, 0, 1, 44, 45);
}

public void testIsValid_invalidFoo() {
  // values just outside [0, 45], the range isValid() accepts for a non-null bar
  assertIsValid_foo(false, -1, 46, 60);
}

private void assertIsValid_foo(boolean valid, int... foos) {
  FooBar.Builder builder = FooBar.builder();
  for (int foo : foos) { assertEquals(valid, builder.withFoo(foo).isValid()); }
}
This is by no means a complete test of FooBar.DefaultBuilder, but it does give an idea of how easy the tests are to write and how accurately we can fix behavior and pinpoint any problems with a failing test. Unlike where we started, with a completely unvalidated object, methods that take a FooBar need not take all the weight of writing onerous validation code, and then writing tests that verify that validation code. And, of course, if multiple methods take an unvalidated object as a parameter, they each need to repeat that validation code and testing. Besides being a lot of wasted effort, it's easy to see that, when FooBar is validated all over the place, those validations would quickly become wildly inconsistent (and test coverage even more so).

And that's the situation if we assume perfect communication. That is to say, we're assuming all the developers writing these methods that take an unvalidated FooBar all have the same understanding of what a valid FooBar is. If you're a working software engineer, you're probably already chuckling at the fantastic optimism of this assumption.

VIII.
Now that I make the claim that this is ready for real world use, I must cover one final case. I have avoided a complication in the code above that will come up from time to time in situations that require complex validation. In the real world, we may find that it is desirable for some properties to never be set in some builders. So at this point we add a new property, baz, that is only settable from the NullBarBuilder.

Note that until now our use of multiple builders only served one purpose, which was to maintain two different sets of reasonable defaults. They could each ultimately construct the same object based on how the caller uses them...they were simply different starting points. Now, however, we are introducing a behavior difference into these builders that makes them fundamentally different. Callers that have gotten a hold of a DefaultBuilder can only build FooBars with baz==null. So, with this change in our approach, we are adding a very important capability into the mix.

One example of where this might be particularly useful: authorization. If the current user that's logged in is an administrator, then we might want to make available a builder that allows all properties of an object to be set. For a user with a lower level of access, a builder with more restrictions and maybe even a more restrictive isValid() method might be called for. Obviously, if we had such a requirement we could not have static methods on FooBar that blithely hand out builder instances of all types; we would need a subsystem somewhere that manages access to them. But that's a topic for another post.

However, we now have a problem. We cannot simply add a withBaz() method on the builder as we did with foo and bar—that would allow callers to set that property even if they were using a DefaultBuilder, and we only want it settable from a NullBarBuilder.

So we must do a little restructuring when the new property is added.
/** Conditionally settable {@code baz}, with one small problem. */
public class FooBar {
  private final int foo;
  private final List<String> bar;
  private final Boolean baz;

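  private FooBar(Builder builder, Boolean baz) {
    this.foo = builder.foo;
    // bar may legitimately be null (see NullBarBuilder), so copy only when present
    this.bar = builder.bar == null
        ? null
        : Collections.unmodifiableList(new ArrayList<String>(builder.bar));
    this.baz = baz;
  }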

  public int getFoo() { return foo; }
  public List<String> getBar() { return bar; }
  public Boolean getBaz() { return baz; }

  public static DefaultBuilder builder() { return new DefaultBuilder(); }
  public static NullBarBuilder nullBarBuilder() { return new NullBarBuilder(); }

  public static abstract class Builder extends AbstractBuilder<FooBar> {
    private int foo;
    private List<String> bar;

    private Builder() {}

    public final Builder withFoo(int foo) {
      this.foo = foo;
      return this;
    }

    public final Builder withBar(List<String> bar) {
      this.bar = bar;
      return this;
    }

    public final boolean isValid() {
      int fooMin = bar != null ? 0 : 30;
      int fooMax = bar != null ? 45 : 60;
      return fooMin <= foo && foo <= fooMax;
    }
  }

  public static class DefaultBuilder extends Builder {
    public static final int FOO_DEFAULT = 0;
    public static final List<String> BAR_DEFAULT = Collections.unmodifiableList(
        Arrays.asList(new String[] { "Hello, world!" }));

    private DefaultBuilder() {}

    protected void defaults() { withFoo(FOO_DEFAULT).withBar(BAR_DEFAULT); }
    protected FooBar construct() { return new FooBar(this, null); }
  }

  public static class NullBarBuilder extends Builder {
    public static final int FOO_DEFAULT = 30;
    public static final List<String> BAR_DEFAULT = null;
    public static final Boolean BAZ_DEFAULT = Boolean.TRUE;

    private Boolean baz;

    private NullBarBuilder() {}

    public NullBarBuilder withBaz(Boolean baz) {
      this.baz = baz;
      return this;
    }

    protected void defaults() {
      withFoo(FOO_DEFAULT).withBar(BAR_DEFAULT).withBaz(BAZ_DEFAULT);  // OOPS!
    }

    protected FooBar construct() { return new FooBar(this, baz); }
  }
}
So, what's changed?
  • the baz property is added to both FooBar and the NullBarBuilder
  • the FooBar constructor now takes baz as a separately settable parameter
  • the different kinds of builders now implement defaults() and construct() explicitly, and we've done away with the methods that return individual default properties at that level as they're no longer needed
This all looks pretty good. And, since withBaz() is only available on the NullBarBuilder, we need comprehensive unit tests covering this property only for that builder; for DefaultBuilder we need add only one unit test verifying that the FooBar instances it builds have baz == null.

There's only one tiny little problem with this code. It doesn't compile.

If you look at the call marked with the OOPS! comment, you'll see why. We take advantage of call chaining in the defaults() method just as a caller would. Since the withBar() method returns a Builder and not a NullBarBuilder, and the Builder class doesn't have the withBaz() method, we have a problem.

Wait, there's a quick fix! We could simply switch the order of the calls and put withBaz() at the head of the line. It returns a NullBarBuilder which has all the following methods! All good, right?

Um, not quite...think about this from the caller's perspective: does it suddenly matter what order methods are called in? What if there are more levels of builders in the hierarchy than just the AbstractBuilder and its immediate implementations? To make use of call chaining, would the properties always have to be set in order of most- to least-specific according to the builder class hierarchy?

Remember that we are trying to enable callers to get a builder and use it as a kind of temporary holder for these various properties. We can't make assumptions about what order these properties will be set...maybe on the UI for this application, the end user inputs foo and only several pages later is baz entered. By this time, the calling code will have been passing around the supertype unless it's resorted to some nasty downcasting.

This is clearly nonsense that could only result from a design that's suddenly gone off the rails. No, this won't do.
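To make the ordering problem concrete, here is a stripped-down sketch. The class names (PlainBuilder, PlainBazBuilder, ChainOrderDemo) are hypothetical stand-ins for illustration only, with validation and defaults left out; they are not the FooBar builders themselves.

```java
// Hypothetical, stripped-down builders; not the FooBar builders from this post.
class PlainBuilder {
  int foo;

  // Returns the supertype, so subtype-only methods are lost after this call.
  PlainBuilder withFoo(int foo) {
    this.foo = foo;
    return this;
  }
}

class PlainBazBuilder extends PlainBuilder {
  Boolean baz;

  PlainBazBuilder withBaz(Boolean baz) {
    this.baz = baz;
    return this;
  }
}

class ChainOrderDemo {
  public static void main(String[] args) {
    // Most-specific method first: compiles, because withBaz() returns
    // PlainBazBuilder, which still has withFoo().
    PlainBuilder ok = new PlainBazBuilder().withBaz(Boolean.TRUE).withFoo(42);

    // Least-specific method first: does NOT compile, because withFoo()
    // returns PlainBuilder, which has no withBaz() method.
    // new PlainBazBuilder().withFoo(42).withBaz(Boolean.TRUE);

    // The only way to keep chaining in that order is a downcast.
    PlainBazBuilder cast =
        ((PlainBazBuilder) new PlainBazBuilder().withFoo(42)).withBaz(Boolean.TRUE);

    System.out.println(ok.foo + " " + cast.foo + " " + cast.baz);  // prints "42 42 true"
  }
}
```

The downcast works, but every caller that sets properties in the "wrong" order has to know the concrete builder type and cast back to it.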

IX.
Instead, the with*() methods ought to return the type of builder that's currently being used.

We accomplish this by using Angelika Langer's getThis() trick in the AbstractBuilder. In short, we add a generic type parameter to AbstractBuilder so that it is now a self-referencing type (similar to the way Enum is defined: Enum<E extends Enum<E>>). This way, AbstractBuilder can define a method that returns the specific type of builder in the method getThis(), which does nothing more than cast a this reference.
public abstract class AbstractBuilder<B extends AbstractBuilder<B,T>,T>
    implements Builder<T> {
  protected AbstractBuilder() { reset(); }

  public final T build() {
    validate();
    T newbie = construct();
    reset();
    return newbie;
  }

  public final void reset() {
    defaults();
    validate();
  }

  private final void validate() {
    if (!isValid()) throw new IllegalArgumentException();
  }

  protected abstract void defaults();
  protected abstract T construct();

  /** @return {@code this} cast to a specific subtype */
  @SuppressWarnings("unchecked")
  protected B getThis() { return (B) this; }
}
Now, in FooBar, we give Builder a self-referencing type parameter of its own and have each with*() method return getThis() instead of this.
/** Conditionally settable {@code baz}, corrected! */
public class FooBar {
  // ...

  public static abstract class Builder<B extends Builder<B>>
      extends AbstractBuilder<B,FooBar> {
    private int foo;
    private List<String> bar;

    private Builder() {}

    public final B withFoo(int foo) {
      this.foo = foo;
      return getThis();
    }

    public final B withBar(List<String> bar) {
      this.bar = bar;
      return getThis();
    }

    public final boolean isValid() {
      int fooMin = bar != null ? 0 : 30;
      int fooMax = bar != null ? 45 : 60;
      return fooMin <= foo && foo <= fooMax;
    }
  }

  public static class DefaultBuilder extends Builder<DefaultBuilder> {
    public static final int FOO_DEFAULT = 0;
    public static final List<String> BAR_DEFAULT = Collections.unmodifiableList(
        Arrays.asList(new String[] { "Hello, world!" }));

    private DefaultBuilder() {}

    protected void defaults() { withFoo(FOO_DEFAULT).withBar(BAR_DEFAULT); }
    protected FooBar construct() { return new FooBar(getThis(), null); }
  }

  public static class NullBarBuilder extends Builder<NullBarBuilder> {
    public static final int FOO_DEFAULT = 30;
    public static final List<String> BAR_DEFAULT = null;
    public static final Boolean BAZ_DEFAULT = Boolean.TRUE;

    private Boolean baz;

    private NullBarBuilder() {}

    public NullBarBuilder withBaz(Boolean baz) {
      this.baz = baz;
      return getThis();
    }

    protected void defaults() {
      withFoo(FOO_DEFAULT).withBar(BAR_DEFAULT).withBaz(BAZ_DEFAULT);
    }

    protected FooBar construct() { return new FooBar(getThis(), baz); }
  }
}
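As a quick check that the self-referencing type parameter really frees callers from worrying about call order, here is a minimal sketch. The names (SelfTypedBuilder, SelfTypedBazBuilder, SelfTypeDemo) are hypothetical and the builders are stripped of validation and defaults; the sketch only demonstrates the getThis() return type, not the full FooBar design.

```java
// Hypothetical, stripped-down builders illustrating the getThis() trick only.
abstract class SelfTypedBuilder<B extends SelfTypedBuilder<B>> {
  int foo;

  // Returns the concrete builder type B rather than the supertype.
  B withFoo(int foo) {
    this.foo = foo;
    return getThis();
  }

  /** @return {@code this} cast to the specific subtype */
  @SuppressWarnings("unchecked")
  B getThis() { return (B) this; }
}

final class SelfTypedBazBuilder extends SelfTypedBuilder<SelfTypedBazBuilder> {
  Boolean baz;

  SelfTypedBazBuilder withBaz(Boolean baz) {
    this.baz = baz;
    return getThis();
  }
}

class SelfTypeDemo {
  public static void main(String[] args) {
    // Supertype method first: withFoo() now returns SelfTypedBazBuilder,
    // so withBaz() is still available. No downcast needed.
    SelfTypedBazBuilder a = new SelfTypedBazBuilder().withFoo(42).withBaz(Boolean.TRUE);

    // Subtype method first works just as well.
    SelfTypedBazBuilder b = new SelfTypedBazBuilder().withBaz(Boolean.FALSE).withFoo(7);

    System.out.println(a.foo + " " + a.baz + " " + b.foo + " " + b.baz);
    // prints "42 true 7 false"
  }
}
```

The one unchecked cast lives in getThis(); it is safe only as long as every concrete builder passes itself as the type argument, which is the convention the builders in this post follow.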
X.
We've covered a lot of ground here, and we've ended with code that has a high power-to-weight ratio.

The approach I've described here provides complex validation that is convenient for callers to use and requires little or no boilerplate even when many different defaults are needed. It provides a high degree of testability. As described in the last section, the test code we need to add as we add more builders is reasonable.

Besides callers that use the builders to build validated value objects, code that receives them as parameters benefits greatly because the arguments passed in are guaranteed to be valid. Since the validation code is moved into the builders and tested there, methods receiving these objects can focus on providing and testing their own logic. It is conceivable that some services will want to further constrain validation of their arguments, but the essential point here is that any extra validation should be specific to that particular service; moreover, it would be perfectly acceptable for a service to define its own validated value object class, if only to encapsulate its own validation and simplify testing. We can build up ever more complex value objects containing other validated value objects without fear. No matter how complex the data structures get, validation rules are clearly defined and remain testable at each level.

This post contains a lot of code. However, I want to highlight that most of the code above is simply reposted from previous sections to provide the relevant context for the changes. To emphasize this, I'll close by comparing our beginning and ending states side-by-side.

First, consider the simple, unvalidated value object with all three properties, foo, bar, and baz.
/** An unvalidated "foobar" value object. */
public class UnvalidatedFooBar
{
  private int foo;
  private List<String> bar;
  private Boolean baz;

  public UnvalidatedFooBar(int foo, List<String> bar, Boolean baz) {
    this.foo = foo;
    this.bar = bar;
    this.baz = baz;
  }
  
  public int getFoo() { return foo; }
  public List<String> getBar() { return bar; }
  public Boolean getBaz() { return baz; }
}
Now the complete code listing for our approach that provides full validation. (Note that I have excluded @Override annotations above for readability, but I now include them here.)
/**
 * The builder API visible to callers.
 *
 * @param <T> the type of object this builder builds
 */
public interface Builder<T> {
  /** @return {@code true} if {@link #build()} would return successfully */
  boolean isValid();

  /**
   * @return a valid object
   * @throws IllegalArgumentException if a valid {@code T} cannot be built
   *     based on current properties
   */
  T build();

  /** Reset this builder to its default state. */
  void reset();
}


/**
 * The builder API visible to implementors.
 *
 * @param <B> the specific type of this builder instance
 * @param <T> the type of object built by this builder
 */
public abstract class AbstractBuilder<B extends AbstractBuilder<B,T>,T>
    implements Builder<T> {
  protected AbstractBuilder() { reset(); }

  @Override
  public final T build() {
    validate();
    T newbie = construct();
    reset();
    return newbie;
  }

  @Override
  public final void reset() {
    defaults();
    validate();
  }

  /** @throws IllegalArgumentException if {@link #isValid()} returns {@code false} */
  private final void validate() {
    if (!isValid()) throw new IllegalArgumentException();
  }

  @SuppressWarnings("unchecked")
  protected B getThis() { return (B) this; }

  protected abstract void defaults();
  protected abstract T construct();
}


/** A validated {@code FooBar}. */
public class FooBar {
  private final int foo;
  private final List<String> bar;
  private final Boolean baz;

  private FooBar(Builder<?> builder, Boolean baz) {
    this.foo = builder.foo;
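    // bar is legitimately null when built by a NullBarBuilder, so guard the copy
    this.bar = builder.bar == null ? null
        : Collections.unmodifiableList(new ArrayList<String>(builder.bar));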
    this.baz = baz;
  }

  public int getFoo() { return foo; }
  public List<String> getBar() { return bar; }
  public Boolean getBaz() { return baz; }

  public static DefaultBuilder builder() { return new DefaultBuilder(); }
  public static NullBarBuilder nullBarBuilder() { return new NullBarBuilder(); }

  /** {@link FooBar}'s common builder functionality. */
  public static abstract class Builder<B extends Builder<B>>
      extends AbstractBuilder<B,FooBar> {
    private int foo;
    private List<String> bar;

    private Builder() {}

    public final B withFoo(int foo) {
      this.foo = foo;
      return getThis();
    }

    public final B withBar(List<String> bar) {
      this.bar = bar;
      return getThis();
    }

    /**
     * @return {@code true} if {@link #build()} can build a valid {@code FooBar}
     *     based on current properties
     */
    @Override
    public final boolean isValid() {
      int fooMin = bar != null ? 0 : 30;
      int fooMax = bar != null ? 45 : 60;
      return fooMin <= foo && foo <= fooMax;
    }
  }

  /** {@link FooBar}'s default builder. */
  public static class DefaultBuilder extends Builder<DefaultBuilder> {
    public static final int FOO_DEFAULT = 0;
    public static final List<String> BAR_DEFAULT = Collections.unmodifiableList(
        Arrays.asList(new String[] { "Hello, world!" }));

    private DefaultBuilder() {}

    @Override
    protected void defaults() { withFoo(FOO_DEFAULT).withBar(BAR_DEFAULT); }

    @Override
    protected FooBar construct() { return new FooBar(getThis(), null); }
  }

  /** {@link FooBar}'s special "null bar" builder. */
  public static class NullBarBuilder extends Builder<NullBarBuilder> {
    public static final int FOO_DEFAULT = 30;
    public static final List<String> BAR_DEFAULT = null;
    public static final Boolean BAZ_DEFAULT = Boolean.TRUE;

    private Boolean baz;

    private NullBarBuilder() {}

    public NullBarBuilder withBaz(Boolean baz) {
      this.baz = baz;
      return getThis();
    }

    @Override
    protected void defaults() {
      withFoo(FOO_DEFAULT).withBar(BAR_DEFAULT).withBaz(BAZ_DEFAULT);
    }

    @Override
    protected FooBar construct() { return new FooBar(getThis(), baz); }
  }
}
There's no denying that the validated FooBar class is substantially longer. However, now that you've gone through this post section by section and seen how we arrived at this final form, and how much other unseen code this implementation obviates, I hope I've convinced you that each line we've added has a positive impact.

The one thing I might change, depending upon the situation, is the FooBar.getBar() method. As written, it returns an internal reference to an unmodifiable list. In some situations, a caller might be surprised and dismayed that attempts to change the contents of the return value result in an exception. This could be alleviated simply by having getBar() return a mutable copy.

This is not without its drawbacks, however. There's no guarantee that bar isn't massive. If this list were 100MB and there were many calls to getBar(), several gigabytes of RAM could quickly get consumed. If only very few of those callers actually had the need to modify the contents, it's much better to let them make their own copy as needed.

If we opt to return a copy, another problem is that it's not clear how callers will use it, making it difficult to know what kind of list to return. One caller might want the speedy access and replacement behaviors of an ArrayList while another requires the behavior of a LinkedList. Rather than copy the contents into a list with the wrong properties, better to leave it to the caller to decide what kind of list they want. (The caller may not even want a list, preferring to copy the contents into a HashSet or something else.) For these reasons, I'd simply provide clear javadoc specifying that the method returns an unmodifiable list.
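Here is a small sketch of what "leave it to the caller" looks like in practice. The getBar() method below is a hypothetical stand-in that only mimics the unmodifiable return value of FooBar.getBar(); the class name and sample data are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Set;

class UnmodifiableBarDemo {
  /** Hypothetical stand-in for FooBar.getBar(): an unmodifiable view of a private copy. */
  static List<String> getBar() {
    return Collections.unmodifiableList(
        new ArrayList<String>(Arrays.asList("a", "b", "a")));
  }

  public static void main(String[] args) {
    List<String> bar = getBar();

    // Attempting to modify the returned list fails at runtime.
    try {
      bar.add("c");
    } catch (UnsupportedOperationException e) {
      System.out.println("bar is read-only");
    }

    // Callers that need to modify the contents make their own copy, and get
    // to choose the data structure that suits them.
    List<String> linked = new LinkedList<String>(bar);
    linked.add("c");
    Set<String> unique = new HashSet<String>(bar);

    // The copies changed (or deduplicated); the original is untouched.
    System.out.println(linked.size() + " " + unique.size() + " " + bar.size());
    // prints "4 2 3"
  }
}
```

Only the callers that actually need a mutable collection pay for the copy, and each one picks the collection type that fits its access pattern.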

The behavior of getBar() is a point worth considering, though, since the Java standard libraries do a very poor job of dealing with immutable data structures (a topic for another post). This line of thinking points to a much larger issue as well. Our considerations with getBar() are made much simpler by the fact that Java strings are already immutable. If this were a collection of some mutable type, we would have a more difficult situation to deal with, as callers could reach into our data structure and begin modifying its contents, possibly undoing our careful validation work. For this reason, I build validated value objects out of immutable types only. In particular, building up hierarchies of validated value objects is a great idea because, provided that each value object does sensible validation that does not depend upon its context (another topic for another post), it is easy to quickly build up cascading validation rules of arbitrary complexity while maintaining complete control over the codebase. With good design this need not become wasteful of resources either. As every functional programmer knows, immutable components can be easily shared; once a FooBar instance is built above, that same instance can be used to build other immutables, shared across threads, etc.
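The mutable-element hazard described above is easy to demonstrate. In this sketch the validation rule (every element must be non-empty) and the class name are hypothetical, chosen only to show how an unmodifiable list of mutable elements can be invalidated after the fact.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class MutableElementDemo {
  /** Hypothetical rule for illustration: every element must be non-empty. */
  static boolean isValid(List<StringBuilder> items) {
    for (StringBuilder item : items) {
      if (item.length() == 0) return false;
    }
    return true;
  }

  public static void main(String[] args) {
    List<StringBuilder> source = new ArrayList<StringBuilder>();
    source.add(new StringBuilder("hello"));

    // Defensive copy plus unmodifiable wrapper, in the style of the FooBar constructor.
    List<StringBuilder> stored =
        Collections.unmodifiableList(new ArrayList<StringBuilder>(source));
    System.out.println(isValid(stored));  // prints "true"

    // The list itself cannot change, but its elements still can.
    stored.get(0).setLength(0);
    System.out.println(isValid(stored));  // prints "false"
  }
}
```

The copy and the wrapper protect the list's structure, not its elements, which is why sticking to immutable element types (such as String) matters here.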

As always, I'd love to hear comments, corrections, questions, etc.

mood: geeky