2015-05-08

After Rails Girls

Rails Girls can be a very chaotic experience which throws you in the middle of an overwhelming pile of technology. But if you survived it and are willing to learn the things properly at your own pace, this post outlines some pointers.

It's easy to get started in making websites, but as soon as you enter the territory of programming logic, the learning curve rises sharply. You should be aware of your motives that why you are getting into software development. If you want to do it just because you've heard it pays well, it's probably not enough. Wanting to create a solution to some need that you and others have is more motivating and helps in focusing your efforts. Also it helps if you enjoy problem solving and figuring out how things work, because that's what you'll be doing every day.


The essence of programming (source)

Here is a roadmap of things you will need to learn sooner or later. The effort required for learning them is expressed using chili peppers. chili can be learned in a few days, chilichili requires some weeks of focused learning, and chilichilichili will take many months.

I want to create a web site or app chili

You will need to learn HTML, which is the language that every web site and web application is made out of. There is a good overview in MDN's Learning the Web guide.

Though you can learn HTML with all the files on just your own computer, eventually you will need a server for hosting your web site so that also others can see it. You can get started with GitHub Pages.

I want my site to look pretty chili

You will need to learn CSS, which is the language for describing the fonts, colors and layouts of web sites. You can learn bits of CSS easily as you go, though learning the intricacies of CSS layouts means at least chilichili worth of fiddling.

I want my site to do things chilichilichili

You will need to learn programming. The exact choice of language doesn't matter much. The hard part is learning to think like a programmer, but after you can program in one language, you can learn another language in a matter of days or weeks.

In order to change a web page while the user is on it, you will have to use JavaScript. In order to generate different HTML when the user enters or reloads a web page, any general purpose programming language can be used, such as Ruby.

I want my site to remember things chilichili

Almost every application uses a database for storing information. You will need to learn the basics of SQL, which is the language for communicating with a relational database (for example PostgreSQL). Another popular kind of database is a document database (for example CouchDB), in which case the query language is usually something other than SQL.

I can't understand the code I wrote last month chilichilichili

Congratulations, you've learned enough programming to be dangerous. Anybody can write code that the computer understands, but it requires skill to write code that other programmers can understand. You should start learning the principles of writing good code. Clean Code: A Handbook of Agile Software Craftsmanship is a good starting point and my site has additional resources.

My code doesn't always work right chilichilichili

To make sure that the code does what you think it should do, you should write automated tests for it. This can be done by learning Test-Driven Development which will guarantee that all the code you write is tested. TDD Tetris Tutorial is a good starting point.

I want to work on a project together with others chili

Version control is used for sharing code changes with other developers. It's also useful when working alone, because it lets you reliably return to earlier versions of your code and also works as a backup. The most popular choice is using the Git version control system and the GitHub repository hosting service.

I have a question chili

The first step in solving a problem is doing a Google search. If you can't find an answer, then for programming related questions you can post your question on Stack Overflow, but first learn how to make an SSCCE to get your code related questions answered. It's also good to visit meetups to get in touch with the local developer community and talk with other developers, especially for questions which have multiple correct answers (e.g. "which tool or language should I use for X").

I want a job chilichilichili

To get a job in technology, more important than your CV or degrees is that you have some projects which you can showcase. Create a complete application, make it available for others to use and put its code up on GitHub. Start a blog and post there regularly things you've learned and what you're thinking about. Attend local meetups regularly and discuss with other developers to get some contacts and to hear about open positions for a junior developer.

2014-12-31

When to Refactor

How to maintain the balance between adding new features and refactoring existing code? Here are some rules of thumb for choosing when to refactor and how much.

Refactoring is the process of improving the code's design without affecting its functionality. Is it possible to over-refactor? I don't think that code can ever be "too clean", and following the four elements of Simple Design should not result in over-engineering. But certainly there is code that needs more cleaning up than other cases, and we rarely have enough time to do all we want. That's why prioritization is needed.

When I refactor, it's usually in one of the following situations.

After getting a test to pass

At unit scale TDD (as opposed to system scale TDD), writing a test and making it pass takes only a few minutes (or you're working in too big steps). After getting a test to pass, it's good to take a moment to look at the code we just wrote and clean it up. Basically it comes down to removing duplication and improving names, i.e. Simple Design. This takes just a minute or two.

This is also a good time to fix any obvious design smells while they are still small. For example Primitive Obsession gets the harder to fix the more widespread it is. This usually takes just a few minutes and at most an hour. Very faint design smells I would leave lying around until they ripen enough for me to know how to fix them - but not too long, so that they begin to rot.

When adding a feature is hard

If the system's design does not make it easy to add a feature I'm currently working on, I would first refactor the system to make adding that feature easier. If this is the second or third instance of a similar feature [1], I would refactor the code to follow the Open-Closed Principle, so that in the future adding similar features will be trivial. This might take from half an hour up to a couple of hours.

When the difficulty of adding a feature hits you right away like a ton of bricks, then it's obvious to do the refactoring first. But what if a difficulty sneaks up on you during implementing the feature? Trying to refactor and implement features at the same time is a road to pain and suffering. Instead, retreat to the last time that all tests passed (either revert/stash your changes or disable the new feature's one failing test), after which you can better focus on keeping the tests green while refactoring.

When our understanding of what would be the correct design improves

When we start developing a program, we have only partial understanding of the problem being solved, but we'll do our best to make the code reflect our current understanding of the problem. As the program grows over months and years, we will learn more and inevitably there will be parts of the code that we would have designed differently, if we only had then known what we know today. This is the original definition of the Technical Debt metaphor and the ability to pay back the debt depends on how clean the code is.

For big refactorings, it is unpractical to block adding new features while the design is being changed. So working towards a new design should be done incrementally at the same time as developing new features. Whenever a developer needs to change code that does not yet conform to the target design, they should refactor that part of the codebase there and then, before implementing the feature at hand. This way it might take many weeks or months for the whole codebase to be refactored, but it is done incrementally in small steps, a class or a method at a time (which should not take more than a couple of hours), so that the software keeps working at all times.

When trying to understand what some piece of code does

If you need to understand some code, even if you're not going to change it, refactoring the code is one means for understanding it better. Extract methods and variables, give them better names and move things around until the code says clearly what it does. You may combine this with writing unit tests, which likewise helps to understand the code.

If the code has good test coverage, you might as well commit the changes you just did, in hopes of the next reader understanding the code faster [2]. But even if the code has no tests, you can do some refactoring to understand it and then throw away your changes - your understanding will remain. If you know that you're going to throw away your changes, you can even do the throwaway refactoring faster with less care. And for complex refactorings, when you're not sure about what sequence of steps would bring you safely to your goal, prodding around the code can help you to get a feel for the correct refactoring sequence.

TL;DR

Refactoring does not have to be, nor should be, its own development phase which takes weeks or months. Instead it can be done incrementally in small steps, interleaved with feature development.


Notes

[1]: If the shape of the code is developing into a direction that you've seen happen many times in the past, it's easy to know how to refactor it already when the second duplicate instance raises its head. But if you're uncertain of what the code should be like, it may be worthwhile to leave the second duplicate be and wait for the third duplicate before creating a generic solution, so that you can clearly see which parts are duplicated and which vary.

[2]: Sometimes I wonder whether a refactoring made the code better, or I just understand it better because of spending time refactoring it.

Not sure if refactoring made code more understandable, or I just understand the code better because I spent hours in it.


This article was first published in the Solita developer blog. There you can find also other articles like this.

2014-12-02

Phase Change Pattern for Mutating Immutable Objects

Here is a design pattern I've found useful for "mutating" immutable objects in non-functional languages, such as Java, which don't have good support for it (unlike for example Scala's copy methods). I shall call it the Phase Change Pattern, because of how it freezes and melts objects to make them immutable and back again mutable, the same way as water can be frozen and melted.

This pattern consist of the following parts:

  • An immutable class with some properties
  • A mutable class with the same properties
  • The mutable class has a freeze() method for converting it to the immutable class
  • The immutable class has a melt() method for converting it to the mutable class
  • Both of the classes have package-private copy constructors that take the other class' instance as parameter, copying all fields (making immutable/mutable copies of the field values when necessary)

The freeze method plays a similar role as the build method in the Builder pattern, but it's named after Ruby's freeze method. The name melt was chosen as a metaphor based on the thermodynamic phase changes. I think it has better connotations than the other alternatives I considered: "build - destroy", "save - edit", "persist - dispersist" (or can "transient" be made a verb?)

Code Example

The following code is taken from the Jumi Test Runner project where I originally invented this pattern.

Usage

The classes can be used like this to create nice immutable classes that may be freely passed around:

SuiteConfiguration config = new SuiteConfigurationBuilder()
        .addToClasspath(Paths.get("something.jar"))
        .addJvmOptions("-ea")
        .freeze();

But the pattern also makes it possible, in a method that takes the immutable object as parameter, to augment it with new values:

config = config.melt()
        .addJvmOptions("-javaagent:extra-agent.jar")
        .freeze();

This is useful in situations where the code that creates the original immutable object does not know all the arguments, but some of the arguments are known only much later by some other code.

Immutable Class

Here is the immutable class. Its default constructor sets all fields to their default values. The copy constructor will need to make immutable copies of all mutable properties (e.g. java.util.List). The copy constructor takes the builder as parameter, making it easier to match the field names than if each property had its own constuctor parameter. The cyclic dependency is a small price to pay for this convenience. There are getters for all properties. Also this class can be made a value object by overriding equals, hashCode and toString.

@Immutable
public class SuiteConfiguration {

    public static final SuiteConfiguration DEFAULTS = new SuiteConfiguration();

    private final List<URI> classpath;
    private final List<String> jvmOptions;
    private final URI workingDirectory;
    private final String includedTestsPattern;
    private final String excludedTestsPattern;

    public SuiteConfiguration() {
        classpath = Collections.emptyList();
        jvmOptions = Collections.emptyList();
        workingDirectory = Paths.get(".").normalize().toUri();
        includedTestsPattern = "glob:**Test.class";
        excludedTestsPattern = "glob:**$*.class";
    }

    SuiteConfiguration(SuiteConfigurationBuilder src) {
        classpath = Immutables.list(src.getClasspath());
        jvmOptions = Immutables.list(src.getJvmOptions());
        workingDirectory = src.getWorkingDirectory();
        includedTestsPattern = src.getIncludedTestsPattern();
        excludedTestsPattern = src.getExcludedTestsPattern();
    }

    public SuiteConfigurationBuilder melt() {
        return new SuiteConfigurationBuilder(this);
    }

    @Override
    public boolean equals(Object that) {
        return EqualsBuilder.reflectionEquals(this, that);
    }

    @Override
    public int hashCode() {
        return HashCodeBuilder.reflectionHashCode(this);
    }

    @Override
    public String toString() {
        return ToStringBuilder.reflectionToString(this, ToStringStyle.SHORT_PREFIX_STYLE);
    }


    // getters

    public List<URI> getClasspath() {
        return classpath;
    }

    public List<String> getJvmOptions() {
        return jvmOptions;
    }

    public URI getWorkingDirectory() {
        return workingDirectory;
    }

    public String getIncludedTestsPattern() {
        return includedTestsPattern;
    }

    public String getExcludedTestsPattern() {
        return excludedTestsPattern;
    }
}

Mutable Class

Here is the mutable class. Its constructors and freeze method are a dual to the immutable class' constructors and melt method. The defaults are written only once, in the immutable class. This class has both getters and setters for all the properties. All the mutator methods return this to enable method chaining.

@NotThreadSafe
public class SuiteConfigurationBuilder {

    private final List<URI> classpath;
    private final List<String> jvmOptions;
    private URI workingDirectory;
    private String includedTestsPattern;
    private String excludedTestsPattern;

    public SuiteConfigurationBuilder() {
        this(SuiteConfiguration.DEFAULTS);
    }

    SuiteConfigurationBuilder(SuiteConfiguration src) {
        classpath = new ArrayList<>(src.getClasspath());
        jvmOptions = new ArrayList<>(src.getJvmOptions());
        workingDirectory = src.getWorkingDirectory();
        includedTestsPattern = src.getIncludedTestsPattern();
        excludedTestsPattern = src.getExcludedTestsPattern();
    }

    public SuiteConfiguration freeze() {
        return new SuiteConfiguration(this);
    }


    // getters and setters

    public List<URI> getClasspath() {
        return classpath;
    }

    public SuiteConfigurationBuilder setClasspath(URI... files) {
        classpath.clear();
        for (URI file : files) {
            addToClasspath(file);
        }
        return this;
    }

    public SuiteConfigurationBuilder addToClasspath(URI file) {
        classpath.add(file);
        return this;
    }

    ...
}

Most of the mutators have been omitted for brevity. The full source code of these classes is in Github.

2014-10-10

Continuous Discussion: Agile, DevOps and Continuous Delivery

Two days ago I participated in a online panel discussing Agile, DevOps and Continuous Delivery. The panel consisted of 11 panelists and was organized by Electric Cloud. Topics discussed include:

  • What are/were your big Agile obstacles?
  • What does DevOps mean to you?
  • What is Continous Delivery? What does it take to get there?

You can watch a recording of the session at Electric Cloud's blog. There is also information about the next online panel that will be arranged.

2014-07-17

Java 8 Functional Interface Naming Guide

People have been complaining about the naming of Java 8's functional interfaces in the java.util.function package. Depending on the types of arguments and return values a function may instead be called a consumer, supplier, predicate or operator, which is further complicated by two-argument functions. The number of interfaces explodes because of specialized interfaces for some primitive types. In total there are 43 interfaces in that package. Compare this to Scala where just three interfaces (with compiler optimizations) cover all the same use cases: Function0, Function1 and Function2.

To make some sense out of this all, I wrote a program that generates the interface names based on the function's argument types and return type. The names it generates match those in the Java library. For this article I also made visualization which you can see below.

Visually

Click to make it bigger.

Textually

The interface name is determined by the following algorithm. The names are described as regular expressions; the generic interfaces have the shortest names, but specialized interfaces for functions taking or returning primitives include the primitive types in the interface name.

  1. Does it return void?
    • Arity 0: Runnable
    • Arity 1: (|Int|Long|Double)Consumer
    • Arity 2:
      • Both arguments are generic: BiConsumer
      • First argument is generic: Obj(Int|Long|Double)Consumer
  2. Does it take no arguments?
    • Arity 0: (|Int|Long|Double|Boolean)Supplier
  3. Does it return boolean?
    • Arity 1: (|Int|Long|Double)Predicate
    • Arity 2: BiPredicate
  4. Do all arguments have the same type as the return value?
    • Arity 1: (|Int|Long|Double)UnaryOperator
    • Arity 2: (|Int|Long|Double)BinaryOperator
  5. Otherwise:
    • Arity 1: (|Int|Long|Double)(|ToInt|ToLong|ToDouble)Function
    • Arity 2: (|ToInt|ToLong|ToDouble)Function

The method names follow. When the return type is a primitive, the method name is a bit longer (this is apparently because the JVM supports method overloading also based on the method return type, but the Java language does not).

  • Runnable: run
  • Consumers: accept
  • Suppliers: get(|AsInt|AsLong|AsDouble|AsBoolean)
  • Predicates: test
  • Functions and operators: apply(|AsInt|AsLong|AsDouble)

I hope that helps some of you in remembering what interface to look for when browsing the API.

2013-07-23

Lambda Expressions Backported to Java 7, 6 and 5

Do you want to use lambda expressions already today, but you are forced to use Java and a stable JRE in production? Now that's possible with Retrolambda, which will take bytecode compiled with Java 8 and convert it to run on Java 7, 6 and 5 runtimes, letting you use lambda expressions and method references on those platforms. It won't give you the improved Java 8 Collections API, but fortunately there are multiple alternative libraries which will benefit from lambda expressions.

Behind the Scenes

A couple of days ago in a café it popped into my head to find out whether somebody had made this already, but after speaking into the air, I did it myself over a weekend.

The original plan of copying the classes from OpenJDK didn't work (LambdaMetafactory depends on some package-private classes and would have required modifications), but I figured out a better way to do it without additional runtime dependencies.

Retrolambda uses a Java agent to find out what bytecode LambdaMetafactory generates dynamically, and saves it as class files, after which it replaces the invokedynamic instructions to instantiate those classes directly. It also changes some private synthetic methods to be package-private, so that normal bytecode can access them without method handles.

After the conversion you'll have just a bunch of normal .class files - but with less typing.

P.S. If you hear about experiences of using Retrolambda for Android development, please leave a comment.

2013-03-02

Refactoring Primitive Obsession

Primitive Obsession means using a programming language's generic type instead of an application-specific domain object. Some examples are using an integer for an ID, a string for an address, a list for an address book etc. Others have explained why to fix it - this article is about how to fix it.

You can see an example of refactoring Primitive Obsession in James Shore's Let's Play TDD episodes 13-18. For a quick overview, you may watch episode #14 at 10-12 min and episode #15 at 0-3 min, to see him plugging in the TaxRate class.

The sooner the Primitive Obsession is fixed, the easier it is. In the above videos it takes just a couple of minutes to plug in the TaxRate class, but the Dollars class takes over half an hour. James does the code changes manually, without automated refactorings. For a big project with rampant Primitive Obsession it will easily take many hours, even days, to fix the problem of a missing core domain type.

Here I'm presenting some tips of using fully automated refactorings to solve Primitive Obsession. I'm using IntelliJ IDEA's Java refactorings, but the ideas should, to some extent, be applicable also to IDEs with inferior refactoring support.

The Example

Let's assume that we have a project that uses lots of "thingies" which are saved in a database. The thingies each have an ID that at the moment is just an integer. To avoid the thingy IDs getting mixed with other kinds of IDs, we create the following value object:

public final class ThingyId {

    private final int id;

    public ThingyId(int id) {
        this.id = id;
    }

    public int toInt() {
        return id;
    }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof ThingyId)) {
            return false;
        }
        ThingyId that = (ThingyId) obj;
        return this.id == that.id;
    }

    @Override
    public int hashCode() {
        return id;
    }

    @Override
    public String toString() {
        return getClass().getSimpleName() + "(" + id + ")";
    }
}

Creating such a class is easy, but putting it to use is not so when the primitive ID is used in a couple of hundred places…

Starting Small

Refactoring Primitive Obsession is quite mechanical, but because it requires cascading changes, it's very easy to mess things up. So it's best to start small and proceed in small steps.

It makes sense to start from a central place from where the change can be propagated to the whole application. For example by starting to use ThingyId inside this one class, without changing its public interface:

(Cannot see the video? Watch it as GIF)

This example refactoring had to be done manually, because the field was mutable (we must update all reads and writes to the field in one step), but the following refactorings can be done with the help of automatic refactorings.

Pushing Arguments Out

When there is a method which wraps one of its arguments into ThingyId, we can propagate it by pushing the act of wrapping outside the method. In IntelliJ IDEA this can be done with the Extract Parameter (Ctrl+Alt+P) refactoring:

(Cannot see the video? Watch it as GIF)

Pushing Return Values

When there is a method which unwraps its return value from ThingyId to int, we can propagate the unwrapping outside the method. There is no built-in refactoring for that, but it can be accomplished by combining Extract Method (Ctrl+Alt+M) and Inline (Ctrl+Alt+N).

First extract a method that does the same as the old method, but does not unwrap ThingyId. Then inline the original method and rename the new method to be the same as the original method.

(Cannot see the video? Watch it as GIF)

Pushing Return Values of Interface Methods

A variation of the previous refactoring is required when the method is part of an interface. IntelliJ IDEA 12 does not support inlining abstract methods (I would like it to ask that which of the implementations to inline), but since IDEA can refactor code that doesn't compile, we can copy and paste the implementation into the interface and then inline it:

(Cannot see the video? Watch it as GIF)

Pushing Arguments In

Instead of trying to refactor a method's arguments from the method caller's side, it's better to go inside the method and use Extract Parameter (Ctrl+Alt+P) as described earlier. Likewise for a method's return value. This leaves us with some redundant code, as can be seen in this example. We'll handle that next.

(Cannot see the video? Watch it as GIF)

Removing Redundancy

By following the above tips you will probably end up with some redundant wrapping and unwrapping such as new ThingyId(thingyId.toInt()) which is the same as thingyId. Changing one such thing manually would be easy, but the problem is that there are potentially tens or hundreds of places to change. In IntelliJ IDEA those can be fixed with one command: Replace Structurally (Ctrl+Shift+M).

In the following example we use the search template "new ThingyId($x$.toInt())" and replacement template "$x$". For extra type safety, the $x$ variable can be defined (under the Edit Variables menu) to be an expression of type ThingyId.

(Cannot see the video? Watch it as GIF)

Updating Constants

When there are constants (or final fields) of the old type, as is common in tests, those can be updated by extracting a new constant of the new ThingyId type, redefining the old constant to be an unwrapping of the new constant, and finally inlining the old constant:

(Cannot see the video? Watch it as GIF)

Finding the Loose Ends

The aforementioned refactorings must be repeated many times until the whole codebase has been migrated. To find out what refactoring to do next, search for the usages of the new type's constructor and its unwrapping method (e.g. ThingyId.toInt()). Use an appropriate refactoring to push that usage one step further. Repeat until all the usages are at the edges of the application (e.g. saving ThingyId to database) and cannot be pushed any further.

And as always, run all your tests after every step. If the tests fail and you cannot fix them within one minute, you're about to enter Refactoring Hell and it's the fastest that you revert your changes to the last time when all tests passed. Reverting is the easiest with IntelliJ IDEA's Local History which shows every time that you ran your tests and whether they passed or failed, letting you revert all your files to that time. The other option is to commit frequently, after every successful change (preferably rebased before pushing), and revert using git reset --hard.


This article was originally published at my company's blog.