Road to a more functional Java with Javaslang - example refactoring

It was a Friday like any other at SoftwareMill: I had implemented some features, prepared a pull request and was waiting
patiently for my teammate to review it. After a few minutes I had my feedback: “Why don’t you use Try from
Javaslang?” Maciek asked. I did, and shared the new version in our #java channel on Slack. Suddenly (it was Friday, right?),
a discussion about additional improvements ignited and my code became the basis for several evolutions. This blog post shows
how a small, relatively simple Java class can evolve from an old-fashioned approach to a more object-oriented one
and then into something (to some extent) functional. Time to start our road to better and cleaner code. All code presented
below is available on our GitHub repo.

Requirements

We need to extract the first image from a blog post so that it can later be used as a thumbnail in our system. Nowadays most
services support sharing their content on Facebook and almost every page has an og:image element among its <meta> tags
that can be used to implement our business case.

Planned flow:

  1. Load a blog post page
  2. Extract the first og:image from the meta tags in the <head> section
  3. Return the URL of this image
  4. If there is no such page, any exception is thrown, or there is no og:image, log a warning and return the URL of a default image as a fallback.

Initial code

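import java.io.IOException;
import java.net.URL;

import javaslang.collection.List;
import lombok.extern.slf4j.Slf4j;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// DEFAULT_IMAGE, the fallback image URL, is not part of this listing.
@Slf4j
public class FacebookImage {

    private final static String FACEBOOK_IMAGE_TAG = "og:image";
    private final static int TEN_SECONDS = 10_000;

    public String extractImageAddressFrom(String pageUrl) {
        Document document;
        try {
            // Fetch and parse the page, giving up after ten seconds
            document = Jsoup.parse(new URL(pageUrl), TEN_SECONDS);
        } catch (IOException e) {
            log.error(
              "Unable to extract og:image from url {}. Problem: {}",
              pageUrl,
              e.getMessage()
            );
            return DEFAULT_IMAGE;
        }
        // Keep only the <meta> elements whose property attribute is og:image
        List<Element> ogImages = List
                .ofAll(document.head().getElementsByTag("meta"))
                .filter(e -> FACEBOOK_IMAGE_TAG.equals(e.attr("property")));
        if (ogImages.isEmpty()) {
            log.warn("No {} found for blog post {}", FACEBOOK_IMAGE_TAG, pageUrl);
            return DEFAULT_IMAGE;
        }
        return ogImages.get(0).attr("content");
    }

}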

As you can see, it is not very complicated. But there is an issue that should strike us immediately: this code is
anything but object-oriented. The FacebookImage class in its current shape would be used in the following way:

FacebookImage facebookImage = new FacebookImage();
String imageAddress = facebookImage.extractImageAddressFrom(blogPostAddress);

So we create an object with no state and put all the logic in its only public method. To highlight where the problem lies,
let’s try to make our method static:

public class FacebookImage {
   public static String extractImageAddressFrom(String pageUrl) {
     // ...
   }
}

Apart from adding the static keyword, we did not have to change anything else. It turns out that we have a stateless thing
that could be renamed to FacebookImageUtils and it would still compile and pass all tests without any other changes in the code!

public class FacebookImageUtils {
   public static String extractImageAddressFrom(String pageUrl) {
      // ...
   }
}

So instead of writing object-oriented code, we have well-hidden procedural code pretending to be something else.

Towards a more object-oriented approach

Our FacebookImage class does not hold any state. We treated it as a dumb container for a single stateless method,
which is unacceptable in a language where real, proud objects should be first-class citizens.

Luckily, making our class great again is not hard. Instead of only returning the state we care about (our image URL),
we should embed it in our class as a field, so FacebookImage can stand proudly next to other objects in our system:

@Slf4j
public class FacebookImage {

    private final static String FACEBOOK_IMAGE_TAG = "og:image";
    private final static int TEN_SECONDS = 10_000;

    private final String url;

    public FacebookImage(String pageUrl) {
        Document document;
        try {
            document = Jsoup.parse(new URL(pageUrl), TEN_SECONDS);
        } catch (IOException e) {
            log.error(
              "Unable to extract og:image from url {}. Problem: {}", 
              pageUrl, 
              e.getMessage()
            );
            url = DEFAULT_IMAGE;
            return;
        }
        List<Element> ogImages = List
            .ofAll(document.head().getElementsByTag("meta"))
            .filter(e -> FACEBOOK_IMAGE_TAG.equals(e.attr("property")));
        if (ogImages.isEmpty()) {
            log.warn("No {} found for blog post {}", FACEBOOK_IMAGE_TAG, pageUrl);
            url = DEFAULT_IMAGE;
        } else {
            url = ogImages.get(0).attr("content");
        }

    }

    public String getUrl() {
        return url;
    }

}

Now an instance of our class actually holds a value:

FacebookImage facebookImage = new FacebookImage(blogPostAddress);
String imageAddress = facebookImage.getUrl();

This looks so much better. Now it is a real object with something inside!

Towards a more functional approach - step one

OK, since we have our class written in a more object-oriented way, it is time to apply some functional programming
concepts available in Javaslang to the FacebookImage internals. Let’s begin with Try. If this concept is new to you,
here is a short description from its Javadoc:

“Try is a monadic container type which represents a computation that may either result in an exception, or return
a successfully computed value. It’s similar to, but semantically different from Either. Instances of Try, are either
an instance of Success or Failure.”
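
To make the idea concrete, here is a minimal hand-rolled sketch of such a container in plain Java. MiniTry and its nested classes are hypothetical names used only for illustration; this is not how Javaslang implements Try, and the real type offers far more operations.

```java
import java.util.concurrent.Callable;
import java.util.function.Function;

// Hand-rolled illustration only; Javaslang's Try has a much richer API.
public abstract class MiniTry<T> {

    // Runs the computation and captures either its value or the exception it threw.
    public static <T> MiniTry<T> of(Callable<T> computation) {
        try {
            return new Success<>(computation.call());
        } catch (Exception e) {
            return new Failure<>(e);
        }
    }

    public abstract boolean isSuccess();

    // Transforms a successful value; a failure passes through untouched.
    public abstract <U> MiniTry<U> map(Function<? super T, ? extends U> mapper);

    // Returns the value on success, or the given fallback on failure.
    public abstract T getOrElse(T fallback);

    private static final class Success<T> extends MiniTry<T> {
        private final T value;

        Success(T value) {
            this.value = value;
        }

        @Override
        public boolean isSuccess() {
            return true;
        }

        @Override
        public <U> MiniTry<U> map(Function<? super T, ? extends U> mapper) {
            // Go through of() so an exception thrown by the mapper also becomes a Failure.
            return MiniTry.<U>of(() -> mapper.apply(value));
        }

        @Override
        public T getOrElse(T fallback) {
            return value;
        }
    }

    private static final class Failure<T> extends MiniTry<T> {
        private final Exception cause;

        Failure(Exception cause) {
            this.cause = cause;
        }

        @Override
        public boolean isSuccess() {
            return false;
        }

        @Override
        public <U> MiniTry<U> map(Function<? super T, ? extends U> mapper) {
            // Nothing to transform: carry the original exception forward.
            return new Failure<>(cause);
        }

        @Override
        public T getOrElse(T fallback) {
            return fallback;
        }
    }
}
```

With this sketch, MiniTry.of(() -> Integer.parseInt(input)).map(n -> n + 1).getOrElse(-1) yields the incremented number for numeric input and -1 otherwise, with no try-catch at the call site.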

So, we will replace the try-catch section with Try.of and use javaslang.collection.List, which provides
stream-like operations such as filter without an explicit stream() call.

@Slf4j
public class FacebookImage {

    private final static String FACEBOOK_IMAGE_TAG = "og:image";
    private final static int TEN_SECONDS = 10_000;

    private final String url;

    public FacebookImage(String pageUrl) {
        Try<String> imageTry = Try.of(() -> {
            Document document = Jsoup.parse(new URL(pageUrl), TEN_SECONDS);
            List<Element> ogImages = List.ofAll(document.head().getElementsByTag("meta"))
                    .filter(e -> FACEBOOK_IMAGE_TAG.equals(e.attr("property")));
            if (ogImages.isEmpty()) {
                log.warn("No {} found for blog post {}", FACEBOOK_IMAGE_TAG, pageUrl);
                return DEFAULT_IMAGE;
            } else {
                return ogImages.get(0).attr("content");    
            }
        });

        url = imageTry
            .onFailure(error -> log.error(
              "Unable to extract og:image from url {}. Problem: {}", 
              pageUrl, 
              error.getMessage()
            ))
            .getOrElse(DEFAULT_IMAGE);
    } 

    public String getUrl() {
        return url;
    }

}

This is definitely a step in the right direction, but we are still far from a fluent, functional execution with actions
coming seamlessly one after another.

Towards a more functional approach - step two

To achieve this we need to go deeper. Luckily, Try is not only a variation of Either for success/failure scenarios.
It also offers a handy mapTry method, which applies a function that may throw and turns any exception into a Failure,
allowing us to chain the steps into a single flow. And when we extract the steps into functions, the entire logic starts
to look really nice and clean:

@Slf4j
public class FacebookImage {

    private final static String FACEBOOK_IMAGE_TAG = "og:image";
    private final static int TEN_SECONDS = 10_000;

    private final String url;

    public FacebookImage(String pageUrl) {

        CheckedSupplier<Document> parseDocument = () -> Jsoup.parse(new URL(pageUrl), TEN_SECONDS);

        CheckedFunction<Document, List<Element>> findElementsWithPropertyTag =
                document -> List.ofAll(document.head().getElementsByTag("meta"));

        CheckedFunction<List<Element>, List<Element>> findElementsWithFacebookImageProperty =
                elements -> elements.filter(e -> FACEBOOK_IMAGE_TAG.equals(e.attr("property")));

        Consumer<List<Element>> warnIfEmpty = elements -> {
            if (elements.isEmpty()) {
                log.warn("No {} found for blog post {}", FACEBOOK_IMAGE_TAG, pageUrl);
            }
        };

        CheckedFunction<List<Element>, Element> findFirst = elements -> elements.get(0);

        Function<Element, String> content = element -> element.attr("content");

        url = Try.of(parseDocument)
            .mapTry(findElementsWithPropertyTag)
            .mapTry(findElementsWithFacebookImageProperty)
            .peek(warnIfEmpty)
            .mapTry(findFirst)
            .toOption()
            .map(content)
            .getOrElse(DEFAULT_IMAGE);
    }

    public String getUrl() {
        return url;
    }

}
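
One detail of the chain above is worth spelling out: when no og:image is present, elements.get(0) throws on the empty list, and the fallback only applies because Try captures that exception. The selection step can also be written without relying on an exception. Below is a sketch using only JDK types; FirstImageSketch, MetaTag and the method names are hypothetical stand-ins for the Jsoup elements, not part of the original code.

```java
import java.util.List;
import java.util.Optional;

public class FirstImageSketch {

    // Hypothetical stand-in for a Jsoup <meta> element: just the two attributes we read.
    public static final class MetaTag {
        public final String property;
        public final String content;

        public MetaTag(String property, String content) {
            this.property = property;
            this.content = content;
        }
    }

    public static final String FACEBOOK_IMAGE_TAG = "og:image";

    // Content of the first og:image tag, or Optional.empty() when there is none.
    public static Optional<String> firstImage(List<MetaTag> metaTags) {
        return metaTags.stream()
                .filter(tag -> FACEBOOK_IMAGE_TAG.equals(tag.property))
                .map(tag -> tag.content)
                .findFirst();
    }

    // Same lookup, with the fallback applied when no og:image tag exists.
    public static String firstImageOrDefault(List<MetaTag> metaTags, String defaultImage) {
        return firstImage(metaTags).orElse(defaultImage);
    }
}
```

Here an empty result is an ordinary value rather than a caught IndexOutOfBoundsException. Javaslang's List offers headOption() for the same purpose, so a similar change should be possible inside the Try chain as well.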

Wrap up

Now, after all these steps, we have ended up with a compact and still readable solution that is much cleaner than the
initial version. Along the road of this refactoring, we have learnt a bit about designing our classes to be more
object-oriented and even a bit more functional. Getting more familiar with Javaslang was not a bad thing either.

This is what our transformation of the FacebookImage class looks like. What are your thoughts about it? Does it appeal
to you, or would you rather stop right after we made the FacebookImage class more object-oriented? Please
share your thoughts, we are eager to see different points of view from other developers.
