Implementing the Java StringBuilder Class

The Java StringBuilder is a class that most Java programmers are familiar with. It is of utmost importance, especially when you want to avoid creating unnecessary string objects when looping through tens or hundreds of thousands of records.

In the Java programming language, Strings are immutable. This means that Java strings cannot be altered once created.

StringBuilder is a class found in several high-level programming languages, including C# and Java. It is a wrapper class that provides a great deal of utility when working with character arrays, the underlying data structure used to store the characters. We will examine the Java StringBuilder class, discuss the reasons for using it, and take a sneak peek at how it works.

Java String Overview

You might be asking: why do we need a StringBuilder when we can just concatenate strings?

To understand the reason for StringBuilders, we first need to know what a String is and how it works in Java.

A string is an object (there is a String class in the JDK) that stores a sequence of characters. Under the hood, all the characters of a string are stored inside an array (a char[] through Java 8; from Java 9 onward a byte[] plus an encoding flag, though the idea is the same).

String aRandomString = "A cool goat";

[Figure: Java strings use character arrays behind the scenes]

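Note that you can also pass a character array to the String constructor. The resulting String has the same contents as the one declared above, although it is a newly allocated object rather than an interned literal.

char[] charArray = {'A', ' ', 'c', 'o', 'o', 'l', ' ', 'g', 'o', 'a', 't'};
String aString = new String(charArray);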

The Java String class is declared final, meaning it cannot be extended. The folks who created the JDK did not want us spawning all sorts of versions of the String class, as that would create chaos and confusion. At least, that is what I think.

For more information on how the String feature works, I recommend reading the following section where we dig slightly deeper into the String class. In the future, I plan to write an entire post/series on the String class, as it is a very important topic that is grossly neglected when learning the basics of Computer Science.

String Behind the Scenes

As mentioned above, Strings use character arrays behind the scenes. The most important point in understanding why we need the StringBuilder class is as follows:

In Java, Strings are immutable objects.

Simply put, once a String is instantiated, it cannot be modified. If you have been programming in Java for some time, you will likely have come across the final keyword; the String class is a final class, and its internal character array is a final field as well. Beginners often think that String is a primitive type because it is used so widely, but that is a huge misconception.

In other words, if I write the following code:

String aStr = "A new String";
String bStr = aStr + " version 2";

Behind the scenes, the String class uses a character array as its underlying data structure.

private final char value[];

Once the array is defined, you cannot increase the character count dynamically, because arrays are by nature fixed-size data structures. To increase or decrease the character count, you need to create a new character array and copy the desired characters into it.

In the snippet above, aStr and bStr are therefore two entirely separate String objects. However, string literals (strings declared without the new keyword) can be interned.
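The "two separate objects" point can be checked directly. Below is a minimal sketch; the class name ImmutabilityDemo and the helper withSuffix are made up for illustration and are not part of the JDK.

```java
class ImmutabilityDemo {
    // Hypothetical helper: concatenation at run time always yields a new String.
    static String withSuffix(String base, String suffix) {
        return base + suffix;
    }

    public static void main(String[] args) {
        String aStr = "A new String";
        String bStr = withSuffix(aStr, " version 2");

        System.out.println(aStr);         // A new String (unchanged)
        System.out.println(bStr);         // A new String version 2
        System.out.println(aStr == bStr); // false: two distinct objects
    }
}
```

The original aStr is never touched; the concatenation allocates a second object that holds the combined characters.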

String Interning in Java

Although this is an entirely different topic, I will touch upon this topic briefly. In the future, I will most likely write an entire post on how string interning works in Java.

String interning enables us to avoid creating new Strings if the value that we assign to our String variable already exists in memory. Here is a simple example.

String aName = "Teemo";
String bName = "Teemo";

// Checking for "referential equality" here
if (aName == bName) {
    System.out.println("Strings are referentially equal");
}

In other words, the two strings above point at the same location in memory. In the example above, string interning happens implicitly, because both values are string literals. We can also choose to intern strings explicitly if needed, although this should be done with due care, because the JVM's string pool is somewhat limited in comparison to the heap memory.

String cName = bName.intern();
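To see the difference between implicit and explicit interning, here is a small sketch. The class name InternDemo and the helper isInterned are hypothetical; String.intern() itself is the real JDK method and takes no arguments.

```java
class InternDemo {
    // Hypothetical helper: true if s is already the pooled instance for its value.
    static boolean isInterned(String s) {
        return s == s.intern();
    }

    public static void main(String[] args) {
        String literal = "Teemo";           // literals are interned implicitly
        String copy = new String("Teemo");  // new always allocates a fresh object
        String interned = copy.intern();    // returns the pooled instance

        System.out.println(literal == copy);      // false
        System.out.println(literal.equals(copy)); // true
        System.out.println(literal == interned);  // true
    }
}
```

This is also why == is the wrong tool for comparing string contents: it only tells you whether two references point at the same object.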

Why Java StringBuilder?

The Java StringBuilder class enables developers to build strings dynamically without creating unnecessary String objects. How does it do that? I want you to personally take 5 minutes to think about this thoroughly.

Okay, if you need a hint, take a look at everything that you have read up until now. After that, figure out a solution to concatenate strings without creating unnecessary string objects.

Answer: Working with character arrays instead of strings.

The downside to this answer, however, is that strings are character array wrappers: they provide additional utility and are therefore generally more user-friendly. So, in most (not all) cases, it is a matter of convenience versus memory usage. We cannot avoid creating additional String objects entirely, especially when building a string in a for loop. However, by using the StringBuilder class, we can reduce the number of String objects created.
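As a rough illustration of the "work with character arrays" answer, the sketch below joins several strings through one pre-sized char[] and creates a single String at the end. The class name CharBufferDemo and the method join are invented for this example; String.getChars is a real JDK method.

```java
class CharBufferDemo {
    // Joins the parts using one char[] buffer, so no intermediate Strings are created.
    static String join(String... parts) {
        int total = 0;
        for (String part : parts) {
            total += part.length();
        }
        char[] buffer = new char[total];
        int position = 0;
        for (String part : parts) {
            // Copy this part's characters into the buffer at the current position.
            part.getChars(0, part.length(), buffer, position);
            position += part.length();
        }
        return new String(buffer); // the only String allocated here
    }

    public static void main(String[] args) {
        System.out.println(join("A ", "cool ", "goat")); // A cool goat
    }
}
```

It works, but you have to track sizes and positions yourself, which is exactly the bookkeeping a StringBuilder hides.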

If we create way too many String objects, the garbage collector has to work overtime. This can be very costly, especially in a mission-critical environment, where a sudden CPU usage spike can create lag or, if memory runs out, even crash the application. In that case, we can reduce the number of String objects created by using a StringBuilder.

The second reason for using StringBuilders is that they are simply faster when we need to append a large number of characters to a string. In the next section, I will demonstrate this through a test case.

Java StringBuilder vs String Concatenation Test Case

[Figure: Java StringBuilder class to the rescue]

In the test case below, we append the numbers 0 through 999,999 to a string in a for loop, first with a StringBuilder and then with plain (non-interned) string concatenation.

FYI, I am running these tests with the following specs, so your results may differ slightly in comparison to mine.

Java Version: Java 9

PC Specifications:

Processor Intel(R) Core(TM) i7-6700 CPU @ 2.60GHz (8 CPUs), ~2.66GHz
Memory 16GB

Below is the source code used to run the test.

        long NANOSECOND_IN_MILLISECOND = 1000000;
        int ONE_MILLION = 1000000;

        long startTime = System.nanoTime();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < ONE_MILLION; i++) {
            sb.append(i);
        }
        String test = sb.toString();

        System.out.println("StringBuilder Test ------------------------------------");
        System.out.println("Time elapsed: " + (System.nanoTime() - startTime) / NANOSECOND_IN_MILLISECOND + " milliseconds");

        startTime = System.nanoTime();

        String test2 = "";
        for (int i = 0; i < ONE_MILLION; i++) {
            test2 += i;
        }

        System.out.println("String Concatenation Test ------------------------------");
        System.out.println("Time elapsed: " + (System.nanoTime() - startTime) / NANOSECOND_IN_MILLISECOND + " milliseconds");

The results might actually be more dramatic than you imagined, especially if you are exploring this topic for the first time. I strongly suggest that you run the test above on your own machine to see the difference in performance between building a String with a StringBuilder and via String concatenation.

StringBuilder Test ------------------------------------
Time elapsed: 40 milliseconds
String Concatenation Test ------------------------------
Time elapsed: 468850 milliseconds

FYI, the time taken via String concatenation is approximately 8 minutes (yes, minutes, not seconds). Hopefully this test case alone will persuade you to use a StringBuilder in areas where there is heavy string concatenation.

NOTE: The test above was done using the built-in Java StringBuilder class, which is, of course, more performant than the simple version implemented later in this post: it has a lot of important optimizations and took far more than an hour to develop. On my PC, my own implementation took 150 milliseconds to finish appending, which is still way faster than 8 minutes.

When to Use StringBuilder Over String Concatenation

Generally, we want to use StringBuilders when performing string concatenation. The exceptions are cases where the number of concatenations is small, or where the code is more readable with String concatenation and the performance hit is minimal. In other words, cases where you are not concatenating something like 10,000 Strings.

General rule of thumb: when concatenating 6 or more strings, use a StringBuilder; when concatenating 5 or fewer, just use the plus operator. This rule is merely a guideline, not something set in stone; it serves to quantify roughly when you should opt for a StringBuilder over string concatenation.
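Here is how that rule of thumb might look in practice. This is only a sketch; the class name ConcatChoiceDemo and both method names are invented for the example.

```java
import java.util.Arrays;
import java.util.List;

class ConcatChoiceDemo {
    // A handful of pieces, known up front: the plus operator reads best.
    static String fullName(String first, String last) {
        return first + " " + last;
    }

    // An unknown (possibly large) number of pieces in a loop: use a StringBuilder.
    static String joinLines(List<String> lines) {
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            sb.append(line).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(fullName("Ada", "Lovelace"));              // Ada Lovelace
        System.out.print(joinLines(Arrays.asList("first", "second"))); // two lines
    }
}
```

The deciding factor is usually the loop: once the number of appends depends on the data, the StringBuilder version is the safer default.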

Examining the Java StringBuilder Class

Now that we know why and when we want to use the StringBuilder class, let’s take a look at how it works. We will be implementing our own simple StringBuilder to get a better understanding of how the Java StringBuilder class works.

Afterwards, I recommend taking a look at the actual Java StringBuilder class implementation. By the end of this tutorial, you will have a better understanding of how it works.

As I mentioned before, Strings are essentially a wrapper around an array of characters that never changes size. In contrast, our StringBuilder needs to resize its array when required, and to do so sensibly, so that we do NOT resize the array every time we append strings/characters.

Initialization

When the StringBuilder class is initialized, users can choose to initialize one of two key parameters of the underlying character array:

  1. The size
  2. The initial String/characters stored inside of the character array

The reason why we need the size is quite obvious: how else will we know when to resize the array? We do not want to wait until the array has already overflowed. We want to check ahead of each append, optionally with a margin (which can be an arbitrary number in the case of a demonstration), in order to prevent errors caused by overflows, primarily the ArrayIndexOutOfBoundsException.
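For comparison, the built-in java.lang.StringBuilder exposes the same two choices through its constructors, and capacity() reports the size of its internal buffer. The class name CapacityDemo and the helper spareRoom below are hypothetical; the StringBuilder calls are real JDK API.

```java
class CapacityDemo {
    // Hypothetical helper: how many more characters fit before a resize.
    static int spareRoom(StringBuilder sb) {
        return sb.capacity() - sb.length();
    }

    public static void main(String[] args) {
        StringBuilder empty = new StringBuilder();        // default capacity
        StringBuilder sized = new StringBuilder(1024);    // caller-chosen capacity
        StringBuilder seeded = new StringBuilder("goat"); // initial contents

        System.out.println(empty.capacity());  // 16
        System.out.println(sized.capacity());  // 1024
        System.out.println(seeded.length());   // 4
        System.out.println(seeded.capacity()); // 20 (16 plus the initial length)
    }
}
```

If you know roughly how long the final string will be, passing that size up front avoids the resizes altogether.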

private static final int BUFFER_MULTIPLIER = 2;
private static final int DEFAULT_BUFFER_SIZE = 16;

private char[] str;
private int size;
// Number of characters in the string so far
private int charCount;

public StringBuilder() {
    this.size = DEFAULT_BUFFER_SIZE; // Default size
    this.str = new char[DEFAULT_BUFFER_SIZE];
}

Before we go into the initialization of the member variables, let’s take a look at what happens if nothing is passed into the StringBuilder. For this case we specify static final default values; in our example, the default size is 16. Now, let’s take a look at the other constructors:

   /**
     * @param size The initial size of the underlying character array.
     */
    public StringBuilder(int size) {
        this.size = size;   
    }

    /**
     * @param str a character array
     * */
    public StringBuilder(char[] str) {
        this.str = str;
    }

    /**
     * @param str A string containing the initial value of the character buffer
     * */
    public StringBuilder(String str) {
        this.str = str.toCharArray();
    }

Fairly straightforward, right? However, I do see a small problem with this approach. Care to take a stab at where the problem lies?


What happens if we pass in a String or character array whose length is greater than the default size of 16? Somewhere down the line, unless we add some defensive code to resize the array, we will be met with an ArrayIndexOutOfBoundsException, which is never fun.

Another issue is that the size variable is not initialized in the char[] and String constructors, and that the int constructor never allocates the character array.

One way to fix the String and char[] constructors would be to make a call to the parameter-less constructor (this()) and then call the append() method on the initial String/char array passed in; the int constructor simply needs to allocate an array of the requested size. Of course there are other, more optimized ways to solve this issue, but this is a simple, relatively painless and easy-to-maintain approach.
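Below is a condensed sketch of what the this() plus append() fix could look like. To keep it self-contained I renamed the class to SimpleStringBuilder (a made-up name, so it does not clash with java.lang.StringBuilder) and folded the resize logic into append(); the capacity() accessor and the minimum size of 1 are my own additions for the demo.

```java
import java.util.Arrays;

class SimpleStringBuilder {
    private static final int BUFFER_MULTIPLIER = 2;
    private static final int DEFAULT_BUFFER_SIZE = 16;

    private char[] str;
    private int size;      // capacity of the underlying array
    private int charCount; // number of characters stored so far

    SimpleStringBuilder() {
        this(DEFAULT_BUFFER_SIZE);
    }

    SimpleStringBuilder(int size) {
        // Assumption: clamp to at least 1, since a size of 0 could never grow by doubling.
        this.size = Math.max(size, 1);
        this.str = new char[this.size];
    }

    SimpleStringBuilder(char[] initial) {
        this(new String(initial));
    }

    SimpleStringBuilder(String initial) {
        this();          // start from a valid default buffer...
        append(initial); // ...and let append() grow it if needed
    }

    SimpleStringBuilder append(String s) {
        // Keep doubling until the new characters fit.
        while (charCount + s.length() > size) {
            size *= BUFFER_MULTIPLIER;
        }
        if (size != str.length) {
            str = Arrays.copyOf(str, size); // allocate the bigger array and copy the old contents
        }
        s.getChars(0, s.length(), str, charCount);
        charCount += s.length();
        return this;
    }

    int capacity() {
        return size;
    }

    @Override
    public String toString() {
        return new String(str, 0, charCount);
    }

    public static void main(String[] args) {
        SimpleStringBuilder sb =
                new SimpleStringBuilder("A string longer than sixteen characters");
        System.out.println(sb);            // A string longer than sixteen characters
        System.out.println(sb.capacity()); // 64 (16 -> 32 -> 64 to fit 39 characters)
    }
}
```

Because every constructor ends up in the int constructor, size and str can never disagree, and an oversized initial value is handled by the same code path as any other append.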

StringBuilder Append Overview

When appending Strings or characters onto a StringBuilder, we need to be mindful of the following member variables:

  1. Size
  2. Character Array

Once we reach a certain threshold, we need to resize the array and update the size of the StringBuilder.

/**
 * @param str The string to append to the string builder
 * @return <code>this</code>. In other words, the <code>StringBuilder</code>, so that we can chain methods
 */
public StringBuilder append(String str) {
    while (resizeRequired(str)) {
        resizeBuffer(str);
    }
    addString(str);
    updateCharCount(str.length());
    return this;
}

The code above almost reads like pseudo code, which is intentional. This is essentially the logic, spelled out in almost pure English. We resize the buffer until we no longer need to. Afterwards, we add the string to the character array, and lastly, update the character count by adding it to the existing count.
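Returning this is what makes method chaining possible. The built-in java.lang.StringBuilder follows the same convention, so a quick sketch with the real class shows what it buys us (the class name ChainingDemo and the method greet are made up for the example):

```java
class ChainingDemo {
    // Each append() returns the same builder, so the calls can be chained.
    static String greet(String name) {
        return new StringBuilder()
                .append("Hello, ")
                .append(name)
                .append('!')
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(greet("Teemo")); // Hello, Teemo!
    }
}
```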

Resizing Character Array

This is arguably the trickiest part in implementing the StringBuilder class. Before resizing, we first need to determine if a resize is required. There are other approaches to the resize check method, but for the sake of clarity, I have implemented one of the simpler approaches:

We only resize if the existing character count plus the length of the new input exceeds the current size of the StringBuilder.

/**
 * @param newInput The new string appended by the consumer
 * @return <code>true</code> if underlying char array needs to be resized
 * */
private boolean resizeRequired(String newInput) {
    return this.charCount + newInput.length() > this.size;
}

We keep resizing until we no longer need to resize. In most, if not all cases, a single resize should suffice. When we resize, we usually want to increase the size of the new character array by a fair bit. In this implementation, we double the size on each resize (BUFFER_MULTIPLIER = 2).

Afterwards, we create a new empty character array that is double the size of the previous array.

Then, we copy all the contents of the old character array into the new character array. The new string itself is added afterwards, in addString().

Once this is done, on the last line, we set the member variable holding the string builder data (this.str) to point at the newly created character array.

/**
 * Resize the underlying character array if the existing char
 * array is about to overflow.
 * @param newInput The new string appended by the consumer
 * */
private void resizeBuffer(String newInput) {
    int oldSize = this.size;
    this.size *= BUFFER_MULTIPLIER; // Update buffer size
    char[] newStr = new char[this.size];
    System.out.printf("Resizing array: Increasing size from %d to %d\n", oldSize, this.size);
    // Copy to new array
    System.arraycopy(this.str, 0, newStr, 0, oldSize);
    // Set new array
    this.str = newStr;
}
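To get a feel for why doubling is a reasonable growth policy, the sketch below counts how many resizes are needed before a given number of characters fits. The class name GrowthDemo and the method resizesNeeded are invented for illustration, and the method assumes a positive initial size.

```java
class GrowthDemo {
    // Counts the doublings needed until `required` characters fit.
    // Assumes initialSize > 0; a size of 0 would never grow.
    static int resizesNeeded(int initialSize, int required) {
        int size = initialSize;
        int resizes = 0;
        while (size < required) {
            size *= 2;
            resizes++;
        }
        return resizes;
    }

    public static void main(String[] args) {
        System.out.println(resizesNeeded(16, 16));        // 0
        System.out.println(resizesNeeded(16, 17));        // 1
        System.out.println(resizesNeeded(16, 1_000_000)); // 16
    }
}
```

Starting from 16, only 16 resizes are needed to hold a million characters, whereas growing by one slot at a time would mean copying the array on nearly every append.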

Afterwards, we add the string to the existing character array.

private void addString(String str) {
    // Copy elements from string to append into the underlying char array.
    System.arraycopy(str.toCharArray(), 0,
                this.str,  this.charCount , str.length());
}

Lastly, we simply update the character count of the StringBuilder.

private void updateCharCount(int charCount) {
    this.charCount += charCount;
}

Conclusion and Source Code

In all the operations above, as you can see, we did not perform any String concatenations. Rather, we worked with character arrays. Doing so ensures that we avoid creating unnecessary String objects.

Click here for the source code. Please note that the source code was written with the intent of only covering the core String building functionality. It does not cover serialization or any of the other features that the actual Java StringBuilder class has.

Hopefully, this information helped you understand how the StringBuilder class in Java works. To sum up the key points: modifying a string results in the creation of a new String object, whereas a StringBuilder accumulates characters in a single buffer and creates only one String, when its toString() method is called.

About the Author Jay

I am a programmer currently living in Seoul, South Korea. I created this blog as an outlet to express what I know / have been learning in text form, both to retain knowledge and to hopefully help the wider community. I am passionate about data structures and algorithms. The back-end and databases are where my heart is.
