Collectors.toMap error: null pointer & duplicate key

Java 8 streams are widely used in everyday project development, and most developers have run into a pitfall or two along the way. This article covers two pitfalls of Collectors.toMap that I have hit in real projects, so that you can avoid them.


1. Introduction to Collectors.toMap

Collectors.toMap is a Java 8 collector that converts the elements of a stream into a Map. The key and the value of each entry are produced by the key-mapper and value-mapper functions you pass in.

The first problem you may hit with Collectors.toMap is duplicate keys. If two elements map to the same key, the two-argument toMap cannot decide which value to keep, so it throws an IllegalStateException.

The second problem is null values. In Java 8, toMap ultimately calls the Map.merge method, and merge does not accept a null value: it throws a NullPointerException.

2. Problem recurrence, analysis and solutions

1. Collectors.toMap key duplication problem

Reproducing the problem:

    // Assumes BenefitModel is a simple POJO with getBenefitId() and getBenefitName().
    public static void main(String[] args) {
        List<BenefitModel> benefitModelList = new ArrayList<>();
        benefitModelList.add(new BenefitModel("123", "Points Benefit"));
        benefitModelList.add(new BenefitModel("123", "Cash Equity")); // same key as above
        // Throws IllegalStateException: both elements map to the key "123".
        Map<String, String> benefitMap = benefitModelList.stream()
                .collect(Collectors.toMap(BenefitModel::getBenefitId, BenefitModel::getBenefitName));
        System.out.println(JSON.toJSONString(benefitMap));
    }

Run results (Java 8):

    Exception in thread "main" java.lang.IllegalStateException: Duplicate key Points Benefit

Cause analysis:

Looking at the Java 8 source of Collectors.toMap, the two-argument toMap delegates to the four-argument overload. The mergeFunction it passes in is throwingMerger, which throws an IllegalStateException directly and builds the exception message from its first parameter, u. The mapSupplier it passes in is HashMap::new. The accumulator calls Map.merge for every element, so HashMap.merge is what eventually runs.

HashMap.merge applies mergeFunction only when the key already has a non-null value in the map. In the semantics of HashMap.merge, mergeFunction merges values, not keys. For example, to count keys you can use map.merge(key, 1, Integer::sum): the value is set to 1 if the key is absent and incremented by 1 if it is present. The two arguments passed to mergeFunction are oldValue and newValue.
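
As a small illustration of these merge semantics, the following sketch counts word occurrences with map.merge(key, 1, Integer::sum). The class name MergeCountDemo and the sample words are made up for this example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MergeCountDemo {

    // Counts how often each word occurs using Map.merge.
    static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            // Key absent: store 1. Key present: call Integer.sum(oldValue, 1).
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords(Arrays.asList("a", "b", "a", "a"));
        System.out.println(counts.get("a")); // 3
        System.out.println(counts.get("b")); // 1
    }
}
```

Note that the merge function never receives the key itself, only the two values.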

So the two arguments that reach throwingMerger are two values, not a key and a value. The so-called "Duplicate key" reported in the Java 8 error message is actually oldValue, the value already stored under the duplicated key.
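
To make this concrete, here is a minimal sketch that imitates what the Java 8 two-argument toMap does internally: a HashMap is filled through merge with a merger that always throws. It is a simplified imitation written for this article, not the JDK source, and the names ThrowingMergerDemo and duplicateKeyMessage are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

class ThrowingMergerDemo {

    // Imitates the Java 8 default merger: it only sees (oldValue, newValue), never the key.
    static <T> BinaryOperator<T> throwingMerger() {
        return (oldValue, newValue) -> {
            throw new IllegalStateException(String.format("Duplicate key %s", oldValue));
        };
    }

    // Merges two values under the same key and returns the resulting exception message.
    static String duplicateKeyMessage() {
        Map<String, String> map = new HashMap<>();
        BinaryOperator<String> merger = throwingMerger();
        map.merge("123", "Points Benefit", merger); // key absent: value stored, merger not called
        try {
            map.merge("123", "Cash Equity", merger); // key present: merger(oldValue, newValue) runs
            return null;
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(duplicateKeyMessage()); // Duplicate key Points Benefit
    }
}
```

The message names "Points Benefit", which is a value, while the key that actually collided is "123".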

Solution:
  • Ensure that the keys passed to toMap are unique
  • Call the three-argument overload and explicitly specify the merge operation to perform when a key repeats (the merge rule can be customized according to business needs)

With the second approach, the duplicate-key code above becomes the following. (Merge rule: when a key repeats, keep the later value and discard the earlier one.)

    public static void main(String[] args) {
        List<BenefitModel> benefitModelList = new ArrayList<>();
        benefitModelList.add(new BenefitModel("123", "Points Benefit"));
        benefitModelList.add(new BenefitModel("123", "Cash Equity"));
        // The merge function receives the two conflicting values, not keys.
        Map<String, String> map = benefitModelList.stream()
                .collect(Collectors.toMap(BenefitModel::getBenefitId, BenefitModel::getBenefitName,
                                          (oldValue, newValue) -> newValue));
        System.out.println(JSON.toJSONString(map));
    }
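
The merge function can implement other rules as well. The sketch below shows two alternatives: keeping the first value, and joining both values into one string. The Benefit class is a hypothetical stand-in for BenefitModel, and the class and method names are made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class MergeRuleDemo {

    // Hypothetical stand-in for the article's BenefitModel.
    static class Benefit {
        private final String id;
        private final String name;
        Benefit(String id, String name) { this.id = id; this.name = name; }
        String getId() { return id; }
        String getName() { return name; }
    }

    static final List<Benefit> BENEFITS = Arrays.asList(
            new Benefit("123", "Points Benefit"),
            new Benefit("123", "Cash Equity"));

    // Rule 1: on a duplicate key, keep the value that was seen first.
    static Map<String, String> keepFirst() {
        return BENEFITS.stream().collect(Collectors.toMap(
                Benefit::getId, Benefit::getName, (oldValue, newValue) -> oldValue));
    }

    // Rule 2: on a duplicate key, join both values into one string.
    static Map<String, String> joinValues() {
        return BENEFITS.stream().collect(Collectors.toMap(
                Benefit::getId, Benefit::getName,
                (oldValue, newValue) -> oldValue + ", " + newValue));
    }

    public static void main(String[] args) {
        System.out.println(keepFirst().get("123"));  // Points Benefit
        System.out.println(joinValues().get("123")); // Points Benefit, Cash Equity
    }
}
```

Which rule is right depends on the business meaning of a repeated key, so it is worth choosing it deliberately rather than copying "last one wins" everywhere.
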
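Behavior in later JDK versions:

The misleading error message has been fixed in later JDK versions, such as JDK 11: the message now reports the actual key together with the two conflicting values. The two-argument toMap still throws an IllegalStateException on duplicate keys, so a merge function is still needed whenever keys can repeat.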

2. The value of Collectors.toMap is null

Reproducing the problem:

    public static void main(String[] args) {
        List<BenefitModel> benefitModelList = new ArrayList<>();
        benefitModelList.add(new BenefitModel("123", "Points Benefit"));
        benefitModelList.add(new BenefitModel("124", null)); // null benefit name
        // Throws NullPointerException: the value mapper returns null for key "124".
        Map<String, String> benefitMap = benefitModelList.stream()
                .collect(Collectors.toMap(BenefitModel::getBenefitId, BenefitModel::getBenefitName));
        System.out.println(JSON.toJSONString(benefitMap));
    }

Run results:

    Exception in thread "main" java.lang.NullPointerException

Cause analysis:

Again, the answer is in the source code. As described above, the Java 8 toMap ultimately calls the Map.merge method. HashMap.merge checks its value argument for null before using it, and a null value directly throws a NullPointerException.
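
The null check can be observed without any stream at all. The following minimal sketch calls HashMap.merge and HashMap.put directly with a null value; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class MergeNullDemo {

    // Returns true if HashMap.merge rejects a null value with a NullPointerException.
    static boolean mergeRejectsNullValue() {
        Map<String, String> map = new HashMap<>();
        try {
            // The key "124" is not even present yet; the null value alone triggers the check.
            map.merge("124", null, (oldValue, newValue) -> newValue);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Plain put, by contrast, accepts a null value in a HashMap.
    static boolean putAcceptsNullValue() {
        Map<String, String> map = new HashMap<>();
        map.put("124", null);
        return map.containsKey("124") && map.get("124") == null;
    }

    public static void main(String[] args) {
        System.out.println(mergeRejectsNullValue()); // true
        System.out.println(putAcceptsNullValue());   // true
    }
}
```

So the restriction comes from merge, not from HashMap itself, which is why a solution based on put can keep null values.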

Solution:

Option 1: Filter out the elements whose value is null first, and then use Collectors.toMap. Note that the keys of the filtered elements will be missing from the resulting map.

 Map<String, String> map2 = benefitModelList.stream()
                .filter(m -> m.getBenefitName() != null)
                .collect(Collectors.toMap(BenefitModel::getBenefitId, BenefitModel::getBenefitName));

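Option 2: After researching and comparing the alternatives, the best option I found is the following. The idea is exactly the same as filling the map in a manual forEach loop: HashMap.put accepts null values, and a repeated key silently overwrites the earlier value.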

 Map<String, String> map2 = benefitModelList.stream().collect(HashMap::new,
                (m, v) -> m.put(v.getBenefitId(), v.getBenefitName()),
                HashMap::putAll);
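
The behavior of Option 2 with both null values and duplicate keys can be checked with a small sketch like the one below. The Benefit class is a hypothetical stand-in for BenefitModel, and the method name toMapAllowingNulls is made up for this example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;

class CollectIntoHashMapDemo {

    // Hypothetical stand-in for the article's BenefitModel.
    static class Benefit {
        private final String id;
        private final String name;
        Benefit(String id, String name) { this.id = id; this.name = name; }
        String getId() { return id; }
        String getName() { return name; }
    }

    static final List<Benefit> SAMPLE = Arrays.asList(
            new Benefit("123", "Points Benefit"),
            new Benefit("124", null),
            new Benefit("123", "Cash Equity"));

    // Collects with HashMap.put, which accepts null values and overwrites on duplicate keys.
    static HashMap<String, String> toMapAllowingNulls(List<Benefit> benefits) {
        return benefits.stream().collect(
                HashMap::new,
                (map, b) -> map.put(b.getId(), b.getName()),
                HashMap::putAll);
    }

    public static void main(String[] args) {
        HashMap<String, String> map = toMapAllowingNulls(SAMPLE);
        System.out.println(map.size());             // 2
        System.out.println(map.containsKey("124")); // true, stored with a null value
        System.out.println(map.get("123"));         // Cash Equity
    }
}
```

Because duplicates no longer raise an exception here, this approach fits best when "last one wins" is acceptable for repeated keys.
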
Behavior in later JDK versions:

The NullPointerException for null values still occurs in Java 11. Because null values are rare in real data, the problem tends to surface late and takes longer to track down.

3. Summary of using Collectors.toMap

In summary, there are a few things to remember when using Collectors.toMap:

1. Keys must not repeat unless you supply a merge function. Otherwise an IllegalStateException: Duplicate key error is thrown, because a Map can hold only one value per key.

2. Values must not be null, otherwise a NullPointerException is thrown.

After reading this article, search your project code for uses of Collectors.toMap and check whether any of them could hit the pitfalls above. Don’t assume that your business data will never contain duplicate keys or null values. With millions of records, anything can happen.

References:
  • java – Ignore duplicates when producing map using streams – Stack Overflow
  • java – NullPointerException in Collectors.toMap with null entry values – Stack Overflow