06 | switch expression: how to simplify multi-scenario operations

The switch expression feature was first released as a preview in JDK 12. An improved version was released as a second preview in JDK 13. Finally, switch expressions became a standard feature in JDK 14.

No matter which programming language you learn, analyzing, judging, and handling different situations sensibly is an essential basic skill. The if-else statement and the switch statement, for example, both exist to handle different situations. We all know the switch statement, but what is a switch expression? What is the difference between a switch statement and a switch expression?

If you understand the two basic concepts of Java statements and expressions, you may have less trouble. In the Java specification, expressions complete operations on data. The result of an expression can be a number (i * 4); or a variable (i = 4); or nothing (void type).

A Java statement is the most basic executable unit of Java. It is not itself a value or a variable. The telltale symbols of Java statements are the semicolon (single statements) and curly braces (code blocks); if-else statements and assignment statements are examples. Seen this way, the distinction is simple: a switch expression is an expression, and a switch statement is a statement.
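To make the distinction concrete, here is a minimal sketch. The class and method names are invented for illustration and are not part of the case that follows. The first method uses an if-else statement, which has no value of its own, so each branch has to assign a variable; the second uses a conditional expression, which evaluates directly to a value.

```java
public class StatementVsExpression {
    // Statement style: if-else produces no value, so each branch must assign.
    static String parityWithStatement(int i) {
        String parity;
        if (i % 2 == 0) {
            parity = "even";
        } else {
            parity = "odd";
        }
        return parity;
    }

    // Expression style: the conditional expression itself evaluates to a value.
    static String parityWithExpression(int i) {
        return (i % 2 == 0) ? "even" : "odd";
    }

    public static void main(String[] args) {
        int i = 4;
        System.out.println(i * 4); // an expression whose result is a number
        System.out.println(parityWithStatement(i));
        System.out.println(parityWithExpression(i));
    }
}
```

The traditional switch sits on the if-else side of this divide, while the switch expression sits on the side of the conditional expression.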

What does a switch expression look like? Why do we need a switch expression? Let’s learn the switch expression bit by bit through cases and code.

Case study

When explaining or learning the switch statement, the twelve months of the year or the seven days of the week are the demonstration data we often use. In this case, we also use such data to see where the traditional switch statement needs to be improved.

Next, we will discuss a classic question: how do we calculate the number of days in a month in code? There is a familiar jingle for this: “January, March, May, July, August, October, and December always have thirty-one days; April, June, September, and November have exactly thirty; February has twenty-eight days in a common year, and one more in a leap year.”

The following code follows the logic of this jingle to calculate how many days there are in the current month.

package co.ivi.jus.sweexpr.former;

import java.util.Calendar;

class DaysInMonth {
    public static void main(String[] args) {
        Calendar today = Calendar.getInstance();
        int month = today.get(Calendar.MONTH);
        int year = today.get(Calendar.YEAR);

        int daysInMonth;
        switch (month) {
            case Calendar.JANUARY:
            case Calendar.MARCH:
            case Calendar.MAY:
            case Calendar.JULY:
            case Calendar.AUGUST:
            case Calendar.OCTOBER:
            case Calendar.DECEMBER:
                daysInMonth = 31;
                break;
            case Calendar.APRIL:
            case Calendar.JUNE:
            case Calendar.SEPTEMBER:
            case Calendar.NOVEMBER:
                daysInMonth = 30;
                break;
            case Calendar.FEBRUARY:
                if (((year % 4 == 0) && !(year % 100 == 0))
                        || (year % 400 == 0)) {
                    daysInMonth = 29;
                } else {
                    daysInMonth = 28;
                }
                break;
            default:
                throw new RuntimeException(
                    "Calendar in JDK does not work");
        }

        System.out.println(
            "There are " + daysInMonth + " days in this month.");
    }
}

In this code, we use a switch statement. There is nothing wrong with the code itself, but there are at least two places where it is easy to make mistakes.

The first common mistake is the use of the break keyword. In the above code, if one more break keyword is used, the logic of the code will change; similarly, if one less break keyword is used, problems will also occur.

int daysInMonth;
switch (month) {
    case Calendar.JANUARY:
    case Calendar.MARCH:
    case Calendar.MAY:
        break; // WRONG BREAK!!!
    case Calendar.JULY:
    case Calendar.AUGUST:
    case Calendar.OCTOBER:
    case Calendar.DECEMBER:
        daysInMonth = 31;
        break;
    // snipped
}

int daysInMonth;
switch (month) {
    // snipped
    case Calendar.APRIL:
    case Calendar.JUNE:
    case Calendar.SEPTEMBER:
    case Calendar.NOVEMBER:
        daysInMonth = 30;
                         // WRONG, NO BREAK!!!
    case Calendar.FEBRUARY:
        if (((year % 4 == 0) && !(year % 100 == 0))
                || (year % 400 == 0)) {
            daysInMonth = 29;
        } else {
            daysInMonth = 28;
        }
        break;
    // snipped
}

Missing or redundant break statements are so common that the problem is even listed as a common software security weakness. Code that uses switch statements can therefore draw the attention of attackers. Because of both the risk of logic errors and this attention, we need to be very careful when writing such code; when reading it, we also need to check the context of each break statement repeatedly. There is no doubt that this increases the cost of code maintenance and reduces productivity.

Why do we need break in the switch statement at all? The main reason is to let different scenarios share some or all of their code. In the example above, the four scenarios of April, June, September, and November share the code that sets the month to 30 days. This code only needs to be written once, after the November scenario. The three preceding scenarios (April, June, and September) execute the operations that follow them in sequence (fall-through) until they reach the next break statement or the end of the switch statement.
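The fall-through behavior described above can be observed in a small runnable sketch. The class FallThroughDemo and the helper isLeapYear are hypothetical names, and the default branch is simplified to 31 days to keep the example short. The break after the 30-day group is left out on purpose.

```java
import java.util.Calendar;

public class FallThroughDemo {
    // Hypothetical helper: the leap-year rule used throughout this section.
    static boolean isLeapYear(int year) {
        return ((year % 4 == 0) && !(year % 100 == 0)) || (year % 400 == 0);
    }

    // Buggy on purpose: the 30-day group has no break, so control falls
    // through into the FEBRUARY branch and the value is overwritten.
    static int buggyDaysInMonth(int year, int month) {
        int daysInMonth;
        switch (month) {
            case Calendar.APRIL:
            case Calendar.JUNE:
            case Calendar.SEPTEMBER:
            case Calendar.NOVEMBER:
                daysInMonth = 30;
                // missing break
            case Calendar.FEBRUARY:
                daysInMonth = isLeapYear(year) ? 29 : 28;
                break;
            default:
                daysInMonth = 31; // simplification: every other value counts as 31 days
        }
        return daysInMonth;
    }

    public static void main(String[] args) {
        // April should have 30 days, but the result comes from the February branch.
        System.out.println(buggyDaysInMonth(2023, Calendar.APRIL));
    }
}
```

Calling buggyDaysInMonth(2023, Calendar.APRIL) gives 28 rather than 30, because execution continues into the February branch. The compiler accepts this code without complaint, which is what makes the mistake easy to miss.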

We now know that this design does more harm than good. Unfortunately, the initial design of Java adopted it. If we were designing a modern language today, we would want to keep switch statements, and even use them more, but without break statements. However, sharing code snippets between different scenarios is still a real need. Before retiring the break statement, we need to find new rules for sharing code snippets between different scenarios.

The second common mistake lies in the repeated assignment statements. In the above code, the declaration of the daysInMonth local variable is separated from its actual assignment, and the assignment has to be repeated for each scenario. If the variable is given an initial value at its declaration and one branch of the switch statement forgets to assign it, the compiler will not report an error, and the initial value is silently used.

int daysInMonth = 0;
switch (month) {
    // snipped
    case Calendar.APRIL:
    case Calendar.JUNE:
    case Calendar.SEPTEMBER:
    case Calendar.NOVEMBER:
        break; // WRONG, INITIAL daysInMonth value IS USED!!!
    case Calendar.FEBRUARY:
    // snipped
}

In the above example, the initial value is not suitable data; in another example, of course, the default or initial value might be suitable. To determine whether the local variable ends up with a suitable value, we need to read through the entire switch block and make sure that no assignment is missing or redundant. This increases the chance of coding errors and the cost of reading the code.
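A runnable version of this mistake might look like the sketch below. The class name MissingAssignmentDemo is hypothetical, and the leap-year rule is left out to keep the example short. The point is that the initializer satisfies the compiler, so the forgotten assignment is not reported.

```java
import java.util.Calendar;

public class MissingAssignmentDemo {
    // The "= 0" initializer satisfies the compiler's definite-assignment
    // check, so the forgotten assignment in the 30-day branch goes unnoticed.
    // Without the initializer, javac would reject the return statement with
    // "variable daysInMonth might not have been initialized".
    static int daysInMonth(int month) {
        int daysInMonth = 0;
        switch (month) {
            case Calendar.APRIL:
            case Calendar.JUNE:
            case Calendar.SEPTEMBER:
            case Calendar.NOVEMBER:
                break; // assignment forgotten
            case Calendar.FEBRUARY:
                daysInMonth = 28; // leap years ignored in this sketch
                break;
            default:
                daysInMonth = 31;
        }
        return daysInMonth;
    }

    public static void main(String[] args) {
        System.out.println(daysInMonth(Calendar.JUNE)); // prints 0
    }
}
```

Initializing the variable "just to be safe" therefore trades a compile-time error for a silent wrong value at run time.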

So, can the code block for multi-context processing have a value? Or to put it another way, can the code block for multi-context processing be turned into an expression? This idea gave birth to a new feature of the Java language: “switch expression”.

switch expression

What does a switch expression look like? The code below uses a switch expression to improve the code from the case study above. As you read it, keep in mind the problems we ran into earlier. These questions include:

How does the switch expression represent a value so that it can be assigned to a variable?
How do switch expressions share code fragments between different scenarios?
Does the code using switch expressions become simpler, cleaner, and easier to understand?

package co.ivi.jus.swexpr.modern;

import java.util.Calendar;

class DaysInMonth {
    public static void main(String[] args) {
        Calendar today = Calendar.getInstance();
        int month = today.get(Calendar.MONTH);
        int year = today.get(Calendar.YEAR);

        int daysInMonth = switch (month) {
            case Calendar.JANUARY,
                 Calendar.MARCH,
                 Calendar.MAY,
                 Calendar.JULY,
                 Calendar.AUGUST,
                 Calendar.OCTOBER,
                 Calendar.DECEMBER -> 31;
            case Calendar.APRIL,
                 Calendar.JUNE,
                 Calendar.SEPTEMBER,
                 Calendar.NOVEMBER -> 30;
            case Calendar.FEBRUARY -> {
                if (((year % 4 == 0) && !(year % 100 == 0))
                        || (year % 400 == 0)) {
                    yield 29;
                } else {
                    yield 28;
                }
            }
            default -> throw new RuntimeException(
                    "Calendar in JDK does not work");
        };

        System.out.println(
                "There are " + daysInMonth + " days in this month.");
    }
}

**The first change we see is that the switch code block appears on the right side of the assignment operator.** This means that the switch code block represents a value. In other words, this switch block is an expression.

int daysInMonth = switch (month) {
    // snipped
};
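Because a switch expression is an expression, it can appear wherever a value is expected, not only on the right side of an assignment. The following sketch, which uses hypothetical names, returns one switch expression directly from a method and passes another straight to println.

```java
import java.util.Calendar;

public class SwitchAsValueSketch {
    // The switch expression is used directly as the return value.
    static String quarterOf(int month) {
        return switch (month) {
            case Calendar.JANUARY, Calendar.FEBRUARY, Calendar.MARCH -> "Q1";
            case Calendar.APRIL, Calendar.MAY, Calendar.JUNE -> "Q2";
            case Calendar.JULY, Calendar.AUGUST, Calendar.SEPTEMBER -> "Q3";
            case Calendar.OCTOBER, Calendar.NOVEMBER, Calendar.DECEMBER -> "Q4";
            default -> throw new IllegalArgumentException(
                    "Unknown month: " + month);
        };
    }

    public static void main(String[] args) {
        int month = Calendar.MAY;

        // The switch expression is used directly as a method argument,
        // with no temporary variable.
        System.out.println(switch (month) {
            case Calendar.FEBRUARY -> "short month";
            default -> "regular month";
        });

        System.out.println(quarterOf(month));
    }
}
```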

**The second change we see is the merging of multiple scenarios. In other words, one case statement can handle multiple scenarios.** These scenarios, separated by commas, share a code block. In traditional switch code, each case statement handles only one scenario.

case Calendar.JANUARY,
     Calendar.MARCH,
     // snipped

The design of multi-scenario merging meets the need for shared code snippets. Furthermore, since only one case statement is used, there is no need to use a break statement to satisfy this requirement. Therefore, the break statement disappears from the switch expression.

The difference is that in traditional switch code, different case statements can share part of the code fragments; in switch expressions, all code fragments need to be shared. This may seem like a loss, but in fact, the ability to share parts of code creates more confusion for the code’s writers than it does any good. If we need to share some code snippets, we can always find ways to replace them, such as encapsulating the code that needs to be shared into smaller methods. So, we don’t need to worry about switch expressions not supporting shared parts of the code.
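As a sketch of that replacement, the two methods below produce the same text. The class, the method names, and the messages are all invented for illustration. The first shares part of its code through deliberate fall-through; the second moves the shared part into a small helper method and uses arrow-form case statements.

```java
import java.util.Calendar;

public class SharedCodeSketch {
    // Traditional form: FEBRUARY runs its own code, then deliberately
    // falls through to reuse the code of the 30-day group.
    static String describeWithFallThrough(int month) {
        StringBuilder note = new StringBuilder();
        switch (month) {
            case Calendar.FEBRUARY:
                note.append("shortest month; ");
                // intentional fall-through
            case Calendar.APRIL:
            case Calendar.JUNE:
            case Calendar.SEPTEMBER:
            case Calendar.NOVEMBER:
                note.append("fewer than 31 days");
                break;
            default:
                note.append("31 days");
        }
        return note.toString();
    }

    // Hypothetical helper that holds the partially shared code.
    static String shortMonthNote() {
        return "fewer than 31 days";
    }

    // Arrow form: no fall-through; the shared part is an ordinary method call.
    static String describeWithArrows(int month) {
        return switch (month) {
            case Calendar.FEBRUARY -> "shortest month; " + shortMonthNote();
            case Calendar.APRIL, Calendar.JUNE,
                 Calendar.SEPTEMBER, Calendar.NOVEMBER -> shortMonthNote();
            default -> "31 days";
        };
    }

    public static void main(String[] args) {
        for (int month = Calendar.JANUARY; month <= Calendar.DECEMBER; month++) {
            System.out.println(describeWithArrows(month));
        }
    }
}
```

In the second method, each branch states everything it does on its own line, so a reader does not have to trace where execution continues.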

The next change is a new scenario operator, “->”, which is an arrow identifier. This notation is used in case statements, and its general form is “case L ->”. Here, L is one or more scenarios to be matched. If the target variable matches a scenario, the expression or code block on the right side of the operator is executed. If there are two or more scenarios to be matched, separate them with commas.

case Calendar.JANUARY,
     // snipped
     Calendar.DECEMBER -> 31;

In traditional switch code, the general form is “case L:”, which uses a colon identifier. Why not stick with the traditional colon? Mainly to simplify the code. We can still use colon identifiers in switch expressions. In the traditional style, a case statement using a colon identifier matches only one scenario. We will discuss this situation later.

**The next change we see is the value to the right of the arrow identifier. This value is the value of the switch expression in the matching scenario.** Note that **the right side of the arrow identifier can be an expression, a code block, or a throw statement, but nothing else.** If a statement is needed, even a single one, it must be presented in the form of a code block.

case Calendar.JANUARY,
     // snipped
     Calendar.DECEMBER -> { // CORRECT, enclosed with braces.
    yield 31;
}

If the code is not presented in the form of a code block, the compiler reports an error. This is a good constraint: the code block form makes the structure visually clear and reduces coding errors.

case Calendar.JANUARY,
     // snipped
     Calendar.DECEMBER -> // WRONG, not a block.
    yield 31;

In addition, the right side of the arrow identifier must produce a value for the switch expression, which is a strong constraint. A statement that violates this requirement cannot appear in a switch expression. For example, the return statement in the following code tries to exit the method without producing a value for the switch expression. This code will not compile.

int daysInMonth = switch (month) {
    // snipped
    case Calendar.APRIL,
         // snipped
         Calendar.NOVEMBER -> {
        // yield 30;
        return; // WRONG, return outside of enclosing switch expression.
    }
    // snipped
}

The last change we can see is the new keyword “yield”. In most cases, the right side of the arrow identifier in a switch expression is a value or an expression. If one or more statements are required, we use a code block. In that case, we need the new yield statement to produce a value, which becomes the value of the enclosing code block.

For ease of understanding, we can think of the value produced by the yield statement as the return value of the switch expression. Therefore, yield can only be used in switch expressions, not switch statements.

case Calendar.FEBRUARY -> {
    if (((year % 4 == 0) && !(year % 100 == 0))
            || (year % 400 == 0)) {
        yield 29;
    } else {
        yield 28;
    }
}

Actually, there is another change here that we cannot see from the above code. In the switch expression, all scenarios must be listed, no more, no less (this is what we often call exhaustive enumeration).

For example, in the above example, if there is no final default scenario branch, the compiler will report an error. This is a far-reaching improvement that will make the switch expression code more robust and significantly reduce maintenance costs, especially if a scenario branch needs to be added in the future.

int daysInMonth = switch (month) {
    case Calendar.JANUARY,
         // snipped
         Calendar.DECEMBER -> 31;
    case Calendar.APRIL,
         // snipped
         Calendar.NOVEMBER -> 30;
    case Calendar.FEBRUARY -> {
        // snipped
    }
    // WRONG to comment out the default branch, 'switch' expression
    // MUST cover all possible input values.
    //
    // default -> throw new RuntimeException(
    // "Calendar in JDK does not work");
};
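A default branch is not the only way to be exhaustive. The sketch below departs from the Calendar-based code of this section and assumes java.time.Month as the selector, with a hypothetical class name. When the selector is an enum and every constant has a branch, the compiler accepts the switch expression without a default branch.

```java
import java.time.Month;
import java.time.Year;

public class EnumExhaustiveSketch {
    // All twelve Month constants are covered, so no default branch is needed.
    // Removing any constant from the lists below turns this into a compile error.
    static int daysInMonth(int year, Month month) {
        return switch (month) {
            case JANUARY, MARCH, MAY, JULY,
                 AUGUST, OCTOBER, DECEMBER -> 31;
            case APRIL, JUNE, SEPTEMBER, NOVEMBER -> 30;
            case FEBRUARY -> Year.isLeap(year) ? 29 : 28;
        };
    }

    public static void main(String[] args) {
        System.out.println(daysInMonth(2024, Month.FEBRUARY));
        System.out.println(daysInMonth(2023, Month.FEBRUARY));
    }
}
```

With an int selector such as Calendar.MONTH, the compiler cannot know that only twelve values occur, which is why the earlier code needs its default branch.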

Improved switch statement

Through the above interpretation, we know that there are many positive changes in the switch expression. So do these changes affect the switch statement? For example, can we use arrow identifiers in switch statements? As we said before, the yield statement is used to generate a value represented by a switch expression. Therefore, the yield statement can only be used in switch expressions and not in switch statements.

What about other changes? Let’s look at the following piece of code first.

private static int daysInMonth(int year, int month) {
    int daysInMonth = 0;
    switch (month) {
        case Calendar.JANUARY,
             Calendar.MARCH,
             Calendar.MAY,
             Calendar.JULY,
             Calendar.AUGUST,
             Calendar.OCTOBER,
             Calendar.DECEMBER ->
            daysInMonth = 31;
        case Calendar.APRIL,
             Calendar.JUNE,
             Calendar.SEPTEMBER,
             Calendar.NOVEMBER ->
            daysInMonth = 30;
        case Calendar.FEBRUARY -> {
            if (((year % 4 == 0) && !(year % 100 == 0))
                    || (year % 400 == 0)) {
                daysInMonth = 29;
                break;
            }

            daysInMonth = 28;
        }
        // default -> throw new RuntimeException(
        // "Calendar in JDK does not work");
    }

    return daysInMonth;
}

In this code, we see the arrow identifier, the break statement, and the commented out default statement. This is legal, working code. In other words, the switch statement can use arrow identifiers, it can also use break statements, and it is not necessary to list all scenarios. On the surface, the improvements to the switch statement may not seem obvious. In fact, the improvement of the switch statement is mainly reflected in the use of the break statement.

We should also notice that no break statement appears before the next case statement. This means that switch statements using arrow identifiers no longer need break statements to separate one scenario from the next. We can still write a break statement there, but it is no longer necessary.

switch (month) {
    // snipped
    case Calendar.APRIL,
         // snipped
         Calendar.NOVEMBER -> {
        daysInMonth = 30;
        break; // UNNECESSARY, could be removed safely.
    }
    // snipped
}

With or without a break statement, a switch statement using an arrow identifier will not perform the following operations sequentially (fall-through). In this way, the trouble caused by the break statement we talked about earlier disappears.

However, the switch statement using the arrow identifier does not prohibit the break statement, but restores its original meaning: to break out of the code fragment, just like the role it plays in the loop statement.

switch (month) {
    // snipped
    case Calendar.FEBRUARY -> {
        if (((year % 4 == 0) && !(year % 100 == 0))
                || (year % 400 == 0)) {
            daysInMonth = 29;
            break; // BREAK the switch statement
        }

        daysInMonth = 28;
    }
    // snipped
}

Weird switch expression

As we said earlier, switch expressions can also use colon identifiers. In this traditional style, each case statement matches one scenario, and fall-through is supported. As with arrow-form switch expressions, switch expressions using colon identifiers do not support break statements; yield statements are used instead.

This is a strange form of coding that I don’t recommend, but I can give you a brief introduction. The code below is an example where we try to replace arrow identifiers with colon identifiers. You can compare two pieces of code using colon identifiers and arrow identifiers to think about the advantages and disadvantages of the two different forms. There is no doubt that code using arrow identifiers is cleaner.

package co.ivi.jus.sweexpr.legacy;

import java.util.Calendar;

class DaysInMonth {
    public static void main(String[] args) {
        Calendar today = Calendar.getInstance();
        int month = today.get(Calendar.MONTH);
        int year = today.get(Calendar.YEAR);

        int daysInMonth = switch (month) {
            case Calendar.JANUARY:
            case Calendar.MARCH:
            case Calendar.MAY:
            case Calendar.JULY:
            case Calendar.AUGUST:
            case Calendar.OCTOBER:
            case Calendar.DECEMBER:
                yield 31;
            case Calendar.APRIL:
            case Calendar.JUNE:
            case Calendar.SEPTEMBER:
            case Calendar.NOVEMBER:
                yield 30;
            case Calendar.FEBRUARY:
                if (((year % 4 == 0) && !(year % 100 == 0))
                        || (year % 400 == 0)) {
                    yield 29;
                } else {
                    yield 28;
                }
            default:
                throw new RuntimeException(
                        "Calendar in JDK does not work");
        };

        System.out.println(
            "There are " + daysInMonth + " days in this month.");
    }
}

Now that we have switch statements and switch expressions that use arrow identifiers, we no longer recommend the forms that use colon identifiers. Learning and using the arrow forms will make your code cleaner and more robust.

Summary

Okay, let me summarize. In this discussion, we focused on switch expressions and the improved switch statement. We also covered the new concepts and the new keyword introduced by switch expressions, and learned where each of them applies.

New switch forms, statements and expressions, and different scopes of use: these concepts are intertwined, which makes learning and using switch a bit challenging. The introduction of arrow identifiers simplifies the code and improves coding efficiency. However, having so many switch forms also increases the learning burden. To help you quickly master these forms, I have put the different switch forms, and the features they support, in the table below.

Or, you can remember the following summary:

The break statement can only appear in a switch statement, not a switch expression;
The yield statement can only appear in a switch expression, not a switch statement;
A switch expression must exhaustively cover all scenarios, while a switch statement does not need to;
The switch form using colon identifiers supports fall-through between scenarios, while the switch form using arrow identifiers does not;
In the switch form using arrow identifiers, one case statement can list multiple scenarios, while in the traditional colon form each case statement lists one scenario.

Using the switch form with arrow identifiers eliminates the problematic fall-through feature. Therefore, we recommend using the arrow form and phasing out the colon form. Between a switch expression and a switch statement, we should prefer the switch expression. These choices help us simplify code logic, reduce coding errors, and improve productivity.

If you want to enrich your code review checklist, after studying this section, you can add the following:

Can a switch that uses colon identifiers be changed to use arrow identifiers?

Can an assignment made through a switch statement be changed to use a switch expression?

In addition, here are a few of the technical points discussed today that may come up in your interviews. After this section, you should be able to:

Know switch expressions and be able to use switch expressions;
Interview question: Do you know switch expressions? How to deal with statements in switch expressions?
Understand the problems that switch expressions solve and know how to solve them;
Interview question: What are the benefits of using switch expressions?
Understand the expression forms of different switches, be able to understand different expression forms, and give suggestions for improvement.
Interview question: Do you prefer to use arrow identifiers or colon identifiers?