Complete guide to Java Stream API in pictures and examples

31 min read February 16, 2025 #java

I started using the Stream API almost immediately after Java 8 was released, because I liked the functional approach to data processing. I wanted to use it everywhere, so I began developing my own library, which brings a similar approach to earlier versions of Java, especially for Android. I was also interested in the stream internals. Over time, I've accumulated enough experience, and now I'm eager to share it.

In this article, along with a description of streams, I provide visual demonstrations of how operators work, as well as examples and self-checking tasks. It also covers new features related to streams in Java 9+.

Self-checks

In this article, especially in the Tasks section, you'll encounter code with an input field. These are self-checking tasks.

Type OK to complete task: /*!interactive answer='OK'*/
Type random number from 0 to 99: /*!interactive answer='42'*/

If the entered answer is correct, the field will become green.

1. Stream#

A Stream is an object for universal data processing. We specify what operations we want to perform without worrying about the implementation details. For example: take elements from a list of employees, select those younger than 40, sort by last name, and place them in a new list. Or, a bit more complex: read all JSON files in the books folder, deserialize them into lists of book objects, process the elements of all these lists, and then group the books by author.

Data can be obtained from sources, which are collections or methods that supply data. For example, a list of files, an array of strings, the range() method for numeric intervals, etc. In other words, a stream takes its elements from an existing source; it is by no means a new data structure. The data is then processed by operators. For example, take only certain elements (filter), transform each element (map), calculate the sum of elements, or combine everything into one object (reduce).

stream.png

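Operators can be divided into two groups: intermediate operators, which process incoming elements and return a stream, so a chain may contain many of them; and terminal operators, which process elements and complete the stream, so a chain may contain only one.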

2. Obtaining a Stream object#

Enough theory for now. It's time to see how to create or obtain a java.util.stream.Stream object.

Here's an example:

public static void main(String[] args) {
    List<String> list = Arrays.stream(args)
        .filter(s -> s.length() <= 2)
        .collect(Collectors.toList());
}

In this example, the source is the Arrays.stream method, which creates a stream from the args array. The intermediate operator filter selects only those strings whose length does not exceed two. The terminal operator collect gathers the resulting elements into a new list.

And another example:

IntStream.of(120, 410, 85, 32, 314, 12)
    .filter(x -> x < 300)
    .map(x -> x + 11)
    .limit(3)
    .forEach(System.out::print);

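There are three intermediate operators: filter selects the elements whose value is less than 300, map adds 11 to each element, and limit restricts the number of elements to 3.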

The terminal operator forEach applies the print function to each incoming element.

In earlier versions of Java, this example would look like this:

int[] arr = {120, 410, 85, 32, 314, 12};
int count = 0;
for (int x : arr) {
    if (x >= 300) continue;
    x += 11;
    count++;
    if (count > 3) break;
    System.out.print(x);
}

As the number of operators increased, the code in earlier versions would become significantly more complex, not to mention that splitting the computation into multiple threads would be extremely difficult with this approach.

3. How Stream works#

Streams have certain characteristics. First, processing doesn't start until a terminal operator is called. list.stream().filter(x -> x > 100); will not take a single element from the list. Second, a stream cannot be reused after it has been processed.

Stream<String> stream = list.stream();
stream.forEach(System.out::println);
stream.filter(s -> s.contains("Stream API"));
stream.forEach(System.out::println);

The code on the second line will execute, but the third line will throw an exception:

java.lang.IllegalStateException: stream has already been operated upon or closed
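
If the same data has to be processed more than once, one common workaround is to keep a Supplier that creates a fresh stream on every call. The sketch below is only an illustration: the class name StreamReuseDemo and its helper methods are made up for this example.

```java
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Stream;

public class StreamReuseDemo {
    // Returns true if touching an already consumed stream throws IllegalStateException.
    static boolean reuseThrows() {
        Stream<String> stream = List.of("a", "b").stream();
        stream.forEach(s -> { });            // the terminal operator consumes the stream
        try {
            stream.filter(s -> s.isEmpty()); // any further operator is rejected
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Each call to source.get() builds a new, unused stream over the same data.
    static long[] countTwice() {
        Supplier<Stream<String>> source =
            () -> Stream.of("Stream API", "Collections", "Stream");
        long all = source.get().count();
        long matching = source.get()
            .filter(s -> s.contains("Stream API"))
            .count();
        return new long[] {all, matching};
    }

    public static void main(String[] args) {
        System.out.println(reuseThrows());
        long[] counts = countTwice();
        System.out.println(counts[0] + " total, " + counts[1] + " matching");
    }
}
```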

From the first characteristic we conclude that processing occurs from the terminal operator to the source. This is indeed the case, and it's convenient. We can use a generated infinite sequence, such as factorials or Fibonacci numbers as a source, but process only a part of it.

Until we attach a terminal operator, the source is not accessed. As soon as the terminal operator forEach appears, it starts requesting elements from the preceding limit operator. In turn, limit refers to map, map to filter, and filter to the source. Then the elements flow in the direct order: source, filter, map, limit, and forEach.
Until the current element has been fully processed by the chain, the next one is not requested. Once 3 elements have passed through the limit operator, it goes into a closed state and will no longer request elements from map. forEach requests the next element, but limit indicates that it can no longer supply elements, so forEach concludes that the elements have run out and stops working.

This approach is called pull iteration, meaning elements are requested from the source as needed. By the way, in RxJava, a push iteration approach is implemented, where the source itself notifies that elements have appeared and need to be processed.
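
The request order can be observed with a small sketch that adds peek calls to the IntStream example from section 2 and records what each stage sees. The class name PullOrderDemo and the log format are made up for this illustration, and the recording relies on the stream being sequential.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.IntStream;

public class PullOrderDemo {
    // Runs the pipeline and records which stage saw which element, in order.
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        IntStream.of(120, 410, 85, 32, 314, 12)
            .peek(x -> log.add("source " + x))   // element leaves the source
            .filter(x -> x < 300)
            .peek(x -> log.add("filter " + x))   // element passed the filter
            .map(x -> x + 11)
            .limit(3)
            .forEach(x -> log.add("forEach " + x));
        return log;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
    }
}
```

Each element travels through the whole chain before the next one is taken from the source, and 314 and 12 never leave the source because limit is already closed by then.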

4. Parallel Streams#

Streams can be sequential or parallel. Sequential streams are executed only in the current thread, while parallel streams use the common pool ForkJoinPool.commonPool(). In this case, the elements are split (if possible) into several groups and processed separately in each thread. Then, at a certain stage, the groups are combined into one to provide the final result.

To obtain a parallel stream, you need to either call the parallelStream() method instead of stream(), or convert a regular stream into a parallel one by calling the intermediate operator parallel.

list.parallelStream()
    .filter(x -> x > 10)
    .map(x -> x * 2)
    .collect(Collectors.toList());

IntStream.range(0, 10)
    .parallel()
    .map(x -> x * 10)
    .sum();

Working with thread-unsafe collections, splitting elements into parts, creating threads, combining parts, all this is hidden in the Stream API implementation. All we need to do is call the right method and ensure that the functions in the operators do not depend on any external factors, otherwise there's a risk of getting an incorrect result or an error.

Here's what you shouldn't do:

final List<Integer> ints = new ArrayList<>();
IntStream.range(0, 1000000)
    .parallel()
    .forEach(i -> ints.add(i));
System.out.println(ints.size());

This is Schrödinger's code. It may execute normally and show 1000000, it may execute and show 869877, or it may fail with an error:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 332 at java.util.ArrayList.add(ArrayList.java:459)

Therefore, developers strongly advise against side effects in lambdas, repeatedly mentioning non-interference in the documentation.
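
A safer way to get the same list is to let the terminal operator build it instead of mutating a shared ArrayList from the lambda. This is a minimal sketch; the class and method names are made up for the example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ParallelCollectDemo {
    // No shared mutable state: each thread fills its own container,
    // and the collector merges the containers at the end.
    static List<Integer> collectInParallel(int n) {
        return IntStream.range(0, n)
            .parallel()
            .boxed()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(collectInParallel(1_000_000).size());
    }
}
```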

5. Streams for primitives#

In addition to object streams Stream<T>, there are special streams for primitive types: IntStream for int, LongStream for long, and DoubleStream for double.

For boolean, byte, short, and char, no special streams were created, but you can use IntStream instead and then cast to the desired type. For float, you will also have to use DoubleStream.

Primitive streams are useful because you don't need to spend time on boxing/unboxing, and they also have a number of special operators that make life easier. We will look at them very soon.
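
As an illustration of the IntStream workaround for char, here is a small sketch built on String.chars(), which returns an IntStream of char values. The class CharStreamDemo and its methods are hypothetical names used only for this example.

```java
public class CharStreamDemo {
    // Stay in IntStream the whole time and rebuild the string at the end.
    static String upperCase(String s) {
        return s.chars()
            .map(Character::toUpperCase)
            .collect(StringBuilder::new,
                     StringBuilder::appendCodePoint,
                     StringBuilder::append)
            .toString();
    }

    // Cast each int back to char when an object stream of Character is needed.
    static long countVowels(String s) {
        return s.chars()
            .mapToObj(c -> (char) c)
            .filter(ch -> "aeiou".indexOf(ch) >= 0)
            .count();
    }

    public static void main(String[] args) {
        System.out.println(upperCase("stream"));
        System.out.println(countVowels("stream api"));
    }
}
```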

6. Stream API operators#

The following are Stream API operators with descriptions, demonstrations, and examples. You can use this as a reference.

6.1. Sources#

empty()#

A stream, like a collection, can be empty, meaning all subsequent operators will have nothing to process.

Stream.empty()
    .forEach(System.out::println);
// No output

of(T value)
of(T... values)#

Stream for one or more specified elements. I often see this construction used:

Arrays.asList(1, 2, 3).stream()
    .forEach(System.out::println);

However, it's redundant. This is simpler:

Stream.of(1, 2, 3)
    .forEach(System.out::println);

ofNullable(T t)#

Introduced in Java 9. Returns an empty stream if null is passed as an argument, otherwise returns a stream of one element.

String str = Math.random() > 0.5 ? "I'm feeling lucky" : null;
Stream.ofNullable(str)
    .forEach(System.out::println);

generate(Supplier s)#

Returns a stream with an infinite sequence of elements generated by the Supplier function s.

Stream.generate(() -> 6)
    .limit(6)
    .forEach(System.out::println);
// 6, 6, 6, 6, 6, 6

Since the stream is infinite, it needs to be limited or used carefully to avoid an infinite loop.


iterate(T seed, UnaryOperator f)#

Returns an infinite stream with elements generated by sequentially applying the function f to the iterated value. The first element will be seed, then f(seed), then f(f(seed)), and so on.

Stream.iterate(2, x -> x + 6)
    .limit(6)
    .forEach(System.out::println);
// 2, 8, 14, 20, 26, 32
Stream.iterate(1, x -> x * 2)
    .limit(6)
    .forEach(System.out::println);
// 1, 2, !interactive answer='4', !interactive answer='8', !interactive answer='16', 32
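
Section 3 mentioned Fibonacci numbers as an infinite source. One possible way to build such a source with iterate is to carry a pair of neighbouring numbers as the iterated value. This is only a sketch; the class name and the long[] pair representation are my own choices for the example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FibonacciDemo {
    // The iterated value is a pair {current, next}; only the first n pairs are ever computed.
    static List<Long> firstFibonacci(int n) {
        return Stream.iterate(new long[] {0, 1}, p -> new long[] {p[1], p[0] + p[1]})
            .limit(n)
            .map(p -> p[0])
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstFibonacci(10));
        // [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
    }
}
```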

iterate(T seed, Predicate hasNext, UnaryOperator f)#

Introduced in Java 9. It is the same as the previous method, but with an additional argument hasNext: if it returns false, the stream terminates. This is very similar to a for loop:

for (i = seed; hasNext(i); i = f(i)) {
}

Thus, with iterate you can now create a finite stream.

Stream.iterate(2, x -> x < 25, x -> x + 6)
    .forEach(System.out::println);
// 2, 8, 14, 20
Stream.iterate(4, x -> x < 100, x -> x * 4)
    .forEach(System.out::println);
// !interactive answer='4', 16, !interactive answer='64'

concat(Stream a, Stream b)#

Combines two streams so that the elements of stream A come first, followed by the elements of stream B.

Stream.concat(
        Stream.of(1, 2, 3),
        Stream.of(4, 5, 6))
    .forEach(System.out::println);
// 1, 2, 3, 4, 5, 6
Stream.concat(
        Stream.of(10),
        Stream.of(/*!interactive answer='4'*/, /*!interactive answer='16'*/))
    .forEach(System.out::println);
// 10, 4, 16

builder()#

Creates a mutable object for adding elements to a stream without using any container for this purpose.

Stream.Builder<Integer> streamBuilder = Stream.<Integer>builder()
        .add(0)
        .add(1);
for (int i = 2; i <= 8; i += 2) {
    streamBuilder.accept(i);
}
streamBuilder
    .add(9)
    .add(10)
    .build()
    .forEach(System.out::println);
// 0, 1, 2, 4, 6, 8, 9, 10

IntStream.range(int startInclusive, int endExclusive)
LongStream.range(long startInclusive, long endExclusive)#

Creates a stream from the numerical range [start..end), i.e., from start (inclusive) to end (exclusive).

IntStream.range(0, 10)
    .forEach(System.out::println);
// 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

LongStream.range(-10L, -5L)
    .forEach(System.out::println);
// -10, -9, -8, -7, -6

IntStream.rangeClosed(int startInclusive, int endInclusive)
LongStream.rangeClosed(long startInclusive, long endInclusive)#

Creates a stream from the numerical range [start..end], i.e., from start (inclusive) to end (inclusive).

IntStream.rangeClosed(0, 5)
    .forEach(System.out::println);
// 0, 1, 2, 3, 4, 5

LongStream.rangeClosed(-8L, -5L)
    .forEach(System.out::println);
// -8, -7, -6, -5

6.2. Intermediate operators#

filter(Predicate predicate)#

Filters the stream, accepting only those elements that satisfy the given condition.

Stream.of(1, 2, 3)
    .filter(x -> x == 10)
    .forEach(System.out::print);
// No output, because the stream will become empty after filtering

Stream.of(120, 410, 85, 32, 314, 12)
    .filter(x -> x > 100)
    .forEach(System.out::println);
// 120, 410, 314
IntStream.range(2, 9)
    .filter(x -> x % /*!interactive answer='3'*/ == 0)
    .forEach(System.out::println);
// 3, 6

map(Function mapper)#

Applies a function to each element and then returns a stream where the elements are the results of the function. map can be used to change the type of elements.

Stream.mapToDouble(ToDoubleFunction mapper)
Stream.mapToInt(ToIntFunction mapper)
Stream.mapToLong(ToLongFunction mapper)
IntStream.mapToObj(IntFunction mapper)
IntStream.mapToLong(IntToLongFunction mapper)
IntStream.mapToDouble(IntToDoubleFunction mapper)

Special operators for converting an object stream to a primitive stream, a primitive stream to an object stream, or a primitive stream of one type to a primitive stream of another type.

Stream.of("3", "4", "5")
    .map(Integer::parseInt)
    .map(x -> x + 10)
    .forEach(System.out::println);
// 13, 14, 15

Stream.of(120, 410, 85, 32, 314, 12)
    .map(x -> x + 11)
    .forEach(System.out::println);
// 131, 421, 96, 43, 325, 23
Stream.of("10", "11", "!interactive answer='32'")
    .map(x -> Integer.parseInt(x, 16))
    .forEach(System.out::println);
// !interactive answer='16', !interactive answer='17', 50

flatMap(Function<T, Stream<R>> mapper)#

One of the most interesting operators. Works like map, but with one difference — you can transform one element into zero, one, or multiple others.

flatMapToDouble(Function mapper)
flatMapToInt(Function mapper)
flatMapToLong(Function mapper)

Like map, these are used to convert to a primitive stream.

To transform one element into zero elements, you need to return null or an empty stream. To transform into one element, you need to return a stream of one element, for example, via Stream.of(x). To transform into multiple elements, you can create a stream with those elements in any way.

Stream.of(2, 3, 0, 1, 3)
    .flatMapToInt(x -> IntStream.range(0, x))
    .forEach(System.out::println);
// 0, 1, 0, 1, 2, 0, 0, 1, 2
Stream.of(1, 2, 3, 4, 5, 6)
    .flatMap(x -> switch (x % /*!interactive answer='3'*/) {
        case 0 -> Stream.of(x, x*x, x*x*/*!interactive answer='2'*/);
        case 1 -> Stream.of(x);
        default -> Stream.empty();
    })
    .forEach(System.out::println);
// 1, 3, 9, 18, 4, 6, 36, 72
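
A more everyday use of flatMap is turning each line of text into its words, where one line may produce several elements or none. The sketch below assumes words are separated by single spaces; the class and method names are made up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class FlatMapWordsDemo {
    // Each line becomes a stream of its words; flatMap glues those streams together.
    static List<String> words(List<String> lines) {
        return lines.stream()
            .flatMap(line -> Arrays.stream(line.split(" ")))
            .filter(w -> !w.isEmpty())   // an empty line yields one empty string, drop it
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(words(List.of("to be", "", "or not")));
        // [to, be, or, not]
    }
}
```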

mapMulti(BiConsumer<T, Consumer<R>> mapper)#

Introduced in Java 16. This operator is similar to flatMap but uses an imperative approach. Now, along with the stream element, a Consumer is also provided, into which you can pass one or more values, or none at all.

Here's how it was with flatMap:

Stream.of(1, 2, 3, 4, 5, 6)
    .flatMap(x -> {
         if (x % 2 == 0) {
             return Stream.of(-x, x);
         }
         return Stream.empty();
     })
    .forEach(System.out::println);
// -2, 2, -4, 4, -6, 6

And here is how you can rewrite it using mapMulti:

Stream.of(1, 2, 3, 4, 5, 6)
    .mapMulti((x, consumer) -> {
         if (x % 2 == 0) {
             consumer.accept(-x);
             consumer.accept(x);
         }
     })
    .forEach(System.out::println);
// -2, 2, -4, 4, -6, 6

mapMultiToDouble(BiConsumer<T, DoubleConsumer> mapper)
mapMultiToInt(BiConsumer<T, IntConsumer> mapper)
mapMultiToLong(BiConsumer<T, LongConsumer> mapper)

Used for converting to a primitive stream.

mapMulti has several advantages over flatMap. First, if you need to skip values (as in the example above, where odd elements were skipped), there will be no overhead for creating an empty stream. Second, the consumer can easily be passed to another method where transformations, including recursive ones, can be performed.

void processSerializable(Serializable ser, Consumer<String> consumer) {
    if (ser instanceof String str) {
        consumer.accept(str);
    } else if (ser instanceof List) {
        for (Serializable s : (List<Serializable>) ser) {
            processSerializable(s, consumer);
        }
    }
}

Serializable arr(Serializable... elements) {
    // List is not statically Serializable, so an explicit cast is required
    return (Serializable) Arrays.asList(elements);
}

Stream.of(arr("A", "B"), "C", "D", arr(arr("E"), "F"), "G")
    .mapMulti(this::processSerializable)
    .forEach(System.out::println);
// A, B, C, D, E, F, G

gather(Gatherer<T, R> gatherer)#

Introduced as a preview feature in Java 22 and finalized in Java 24. gather allows you to create your own intermediate operators. To do this, you need to create a Gatherer object, in which you specify the processing logic. For example:

public <T> Gatherer<T, Void, T> twice() {
    return Gatherer.of((state, element, downstream) -> {
        downstream.push(element);
        downstream.push(element);
        return true;
    });
}
Stream.of("A", "B", "C", "D").gather(twice()).toList()
// A, A, B, B, C, C, D, D

The Gatherers class already has several useful predefined operators:

IntStream.rangeClosed(1, 10)
        .boxed()
        .gather(Gatherers.windowFixed(3))
        .forEach(System.out::println);
// [1, 2, 3]
// [4, 5, 6]
// [7, 8, 9]
// [10]

IntStream.rangeClosed(1, 10)
        .boxed()
        .gather(Gatherers.windowSliding(5))
        .forEach(System.out::println);
// [1, 2, 3, 4, 5]
// [2, 3, 4, 5, 6]
// [3, 4, 5, 6, 7]
// [4, 5, 6, 7, 8]
// [5, 6, 7, 8, 9]
// [6, 7, 8, 9, 10]

limit(long maxSize)#

Limits the stream to maxSize elements.

Stream.of(120, 410, 85, 32, 314, 12)
    .limit(4)
    .forEach(System.out::println);
// 120, 410, 85, 32
Stream.of(120, 410, 85, 32, 314, 12)
    .limit(/*!interactive answer='2'*/)
    .limit(5)
    .forEach(System.out::println);
// 120, 410

Stream.of(19)
    .limit(/*!interactive answer='0'*/)
    .forEach(System.out::println);
// No output

skip(long n)#

Skips n elements of the stream.

Stream.of(5, 10)
    .skip(40)
    .forEach(System.out::println);
// No output

Stream.of(120, 410, 85, 32, 314, 12)
    .skip(2)
    .forEach(System.out::println);
// 85, 32, 314, 12
IntStream.range(0, 10)
    .limit(5)
    .skip(3)
    .forEach(System.out::println);
// !interactive answer='3', !interactive answer='4'

IntStream.range(0, 10)
    .skip(5)
    .limit(3)
    .skip(1)
    .forEach(System.out::println);
// !interactive answer='6', !interactive answer='7'

sorted()
sorted(Comparator comparator)#

Sorts the elements of the stream. This operator works cleverly: if the stream is already marked as sorted, sorting will not be performed. Otherwise, it collects all elements, sorts them, and returns a new stream marked as sorted. See 9.1.

IntStream.range(0, 100000000)
    .sorted()
    .limit(3)
    .forEach(System.out::println);
// 0, 1, 2

IntStream.concat(
        IntStream.range(0, 100000000),
        IntStream.of(-1, -2))
    .sorted()
    .limit(3)
    .forEach(System.out::println);
// Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

Stream.of(120, 410, 85, 32, 314, 12)
    .sorted()
    .forEach(System.out::println);
// 12, 32, 85, 120, 314, 410
Stream.of(120, 410, 85, 32, 314, 12)
    .sorted(Comparator./*!interactive size='7' answer='reverse'*/Order())
    .forEach(System.out::println);
// 410, 314, 120, 85, 32, 12

distinct()#

Removes duplicate elements and returns a stream with unique elements. Like sorted, it checks if the stream already consists of unique elements and, if not, selects unique elements and marks the stream as containing unique elements.

Stream.of(2, 1, 8, 1, 3, 2)
    .distinct()
    .forEach(System.out::println);
// 2, 1, 8, 3
IntStream.concat(
        IntStream.range(2, 5),
        IntStream.range(0, 4))
    .distinct()
    .forEach(System.out::println);
// !interactive answer='2', !interactive answer='3', !interactive answer='4', !interactive answer='0', !interactive answer='1'

peek(Consumer action)#

Performs an action on each element of the stream and returns a stream with the elements of the original stream. It's used to pass an element somewhere without breaking the chain of operators (remember that forEach is a terminal operator and the stream terminates after it), or for debugging.

Stream.of(0, 3, 0, 0, 5)
    .peek(x -> System.out.format("before distinct: %d%n", x))
    .distinct()
    .peek(x -> System.out.format("after distinct: %d%n", x))
    .map(x -> x * x)
    .forEach(x -> System.out.format("after map: %d%n", x));
// before distinct: 0
// after distinct: 0
// after map: 0
// before distinct: 3
// after distinct: 3
// after map: 9
// before distinct: 0
// before distinct: 0
// before distinct: 5
// after distinct: 5
// after map: 25

takeWhile(Predicate predicate)#

Introduced in Java 9. Returns elements as long as they satisfy the condition, i.e., the predicate function returns true. This is similar to limit, but with a condition instead of a number.

Stream.of(1, 2, 3, 4, 2, 5)
    .takeWhile(x -> x < 3)
    .forEach(System.out::println);
// 1, 2
IntStream.range(2, 7)
    .takeWhile(x -> x != /*!interactive answer='5'*/)
    .forEach(System.out::println);
// 2, 3, 4
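
The difference from filter is easy to miss: takeWhile stops at the first element that fails the condition, while filter keeps checking the rest of the stream. A small sketch (class and method names are made up; Java 9 or later is assumed):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class TakeWhileVsFilterDemo {
    // Stops at 3, so the later 2 is never reached.
    static List<Integer> withTakeWhile() {
        return Stream.of(1, 2, 3, 4, 2, 5)
            .takeWhile(x -> x < 3)
            .collect(Collectors.toList());
    }

    // Checks every element, so the later 2 is kept as well.
    static List<Integer> withFilter() {
        return Stream.of(1, 2, 3, 4, 2, 5)
            .filter(x -> x < 3)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(withTakeWhile()); // [1, 2]
        System.out.println(withFilter());    // [1, 2, 2]
    }
}
```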

dropWhile(Predicate predicate)#

Introduced in Java 9. Skips elements as long as they satisfy the condition, then returns the remaining part of the stream. If the predicate returns false for the first element, no elements will be skipped. This operator is similar to skip, but works based on a condition.

Stream.of(1, 2, 3, 4, 2, 5)
    .dropWhile(x -> x >= 3)
    .forEach(System.out::println);
// 1, 2, 3, 4, 2, 5

Stream.of(1, 2, 3, 4, 2, 5)
    .dropWhile(x -> x < 3)
    .forEach(System.out::println);
// 3, 4, 2, 5
IntStream.range(2, 7)
    .dropWhile(x -> x < /*!interactive answer='5'*/)
    .forEach(System.out::println);
// 5, 6

IntStream.of(1, 3, 2, 0, 5, 4)
    .dropWhile(x -> x /*!interactive answer='%'*/ 2 == 1)
    .forEach(System.out::println);
// 2, 0, 5, 4

boxed()#

Converts a primitive stream to an object stream.

DoubleStream.of(0.1, Math.PI)
    .boxed()
    .map(Object::getClass)
    .forEach(System.out::println);
// class java.lang.Double
// class java.lang.Double

6.3. Terminal operators#

void forEach(Consumer action)#

Performs the specified action for each element of the stream.

Stream.of(120, 410, 85, 32, 314, 12)
    .forEach(x -> System.out.format("%s, ", x));
// 120, 410, 85, 32, 314, 12

void forEachOrdered(Consumer action)#

Also performs the specified action for each element of the stream, but ensures the correct order of elements before doing so. Useful for parallel streams when the correct sequence of elements is needed.

IntStream.range(0, 100000)
    .parallel()
    .filter(x -> x % 10000 == 0)
    .map(x -> x / 10000)
    .forEach(System.out::println);
// 5, 6, 7, 3, 4, 8, 0, 9, 1, 2

IntStream.range(0, 100000)
    .parallel()
    .filter(x -> x % 10000 == 0)
    .map(x -> x / 10000)
    .forEachOrdered(System.out::println);
// 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

long count()#

Returns the number of elements in the stream.

long count = IntStream.range(0, 10)
    .flatMap(x -> IntStream.range(0, x))
    .count();
System.out.println(count);
// 45

System.out.println(
    IntStream.rangeClosed(-3, /*!interactive answer='6'*/)
        .count()
);
// 10

System.out.println(
    Stream.of(0, 2, 9, 13, 5, 11)
        ./*!interactive size='3' answer='map'*/(x -> x /*!interactive answer='*'*/ 2)
        .filter(x -> x % 2 == 1)
        .count()
);
// 0

Warning! peek optimization

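If a terminal operator can compute its result without traversing the elements, the preceding peek may be skipped for the sake of optimization. Since Java 9, count() does this when the number of elements can be determined directly from the source.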

long count = IntStream.range(0, 10)
    .peek(System.out::println)
    .count();
System.out.println(count);

In this example peek doesn't affect the stream elements count, thus it'll be skipped and only the result 10 will be printed.
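
If the size can no longer be determined from the source, for example after a filter, the elements have to be traversed and peek runs again. The sketch below counts the peek calls instead of printing; the class name and the always-true filter are only there for the illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

public class PeekCountDemo {
    // Returns how many times peek was invoked while count() was computed.
    static int peekCallsWithFilter() {
        AtomicInteger calls = new AtomicInteger();
        long count = IntStream.range(0, 10)
            .filter(x -> true)                   // the size is no longer known in advance
            .peek(x -> calls.incrementAndGet())
            .count();
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println(peekCallsWithFilter()); // 10
    }
}
```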


R collect(Collector collector)#

One of the most powerful Stream API operators. It allows you to collect all elements into a list, set, or another collection, group elements by some criterion, combine everything into a string, etc. The java.util.stream.Collectors class has many methods for various use cases, which we will explore later. If desired, you can write your own collector by implementing the Collector interface.

List<Integer> list = Stream.of(1, 2, 3)
    .collect(Collectors.toList());
// list: [1, 2, 3]

String s = Stream.of(1, 2, 3)
    .map(String::valueOf)
    .collect(Collectors.joining("-", "<", ">"));
// s: "<1-2-3>"

R collect(Supplier supplier, BiConsumer accumulator, BiConsumer combiner)#

The same as collect(collector), but with parameters broken down for convenience. If you need to quickly perform some operation, there is no need to implement the Collector interface; just pass three lambda expressions.

The supplier should provide new objects (containers), for example, new ArrayList().
The accumulator adds an element to the container.
The combiner is necessary for parallel streams and combines parts of the stream into one.

List<String> list = Stream.of("a", "b", "c", "d")
    .collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
// list: ["a", "b", "c", "d"]
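
To see where the combiner comes in, here is a sketch that runs the three-argument collect on a parallel stream: every thread fills its own StringBuilder, and the combiner appends the partial results in encounter order. The class and method names are made up for the example.

```java
import java.util.List;

public class ThreeArgCollectDemo {
    static String concat(List<String> parts) {
        return parts.parallelStream()
            .collect(StringBuilder::new,      // supplier: a new container per chunk
                     StringBuilder::append,   // accumulator: add one element
                     StringBuilder::append)   // combiner: merge two containers
            .toString();
    }

    public static void main(String[] args) {
        System.out.println(concat(List.of("a", "b", "c", "d"))); // abcd
    }
}
```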

Object[] toArray()#

Returns a non-typed array with the elements of the stream.

A[] toArray(IntFunction<A[]> generator)

Similarly, returns a typed array.

String[] elements = Stream.of("a", "b", "c", "d")
    .toArray(String[]::new);
// elements: ["a", "b", "c", "d"]

List<T> toList()#

Finally added in Java 16. Returns a list, similar to collect(Collectors.toList()). The difference is that the returned list is guaranteed to be unmodifiable. Any addition or removal of elements in the resulting list will result in an UnsupportedOperationException.

List<String> elements = Stream.of("a", "b", "c", "d")
        .map(String::toUpperCase)
        .toList();
// elements: ["A", "B", "C", "D"]
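
The unmodifiable behaviour can be checked with a short sketch. The helper isUnmodifiable is a made-up name, and Collectors.toCollection(ArrayList::new) is used for the mutable case because it states the list type explicitly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ToListDemo {
    // Returns true if adding an element is rejected by the list.
    static boolean isUnmodifiable(List<String> list) {
        try {
            list.add("x");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> fixed = Stream.of("a", "b").toList();
        List<String> mutable = Stream.of("a", "b")
            .collect(Collectors.toCollection(ArrayList::new));
        System.out.println(isUnmodifiable(fixed));   // true
        System.out.println(isUnmodifiable(mutable)); // false
    }
}
```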

T reduce(T identity, BinaryOperator accumulator)
U reduce(U identity, BiFunction accumulator, BinaryOperator combiner)#

Another useful operator. Allows you to transform all elements of the stream into one object. For example, calculate the sum of all elements or find the minimum element.

First, the identity object and the first element of the stream are taken, the accumulator function is applied, and identity becomes its result. Then the process continues for the remaining elements.

int sum = Stream.of(1, 2, 3, 4, 5)
    .reduce(10, (acc, x) -> acc + x);
// sum: 25
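
The three-argument form is useful when the result type differs from the element type. A minimal sketch, with made-up class and method names, that sums the lengths of strings:

```java
import java.util.List;

public class ReduceCombinerDemo {
    static int totalLength(List<String> words) {
        return words.parallelStream()
            .reduce(0,                                // identity: neutral for addition
                    (acc, s) -> acc + s.length(),     // accumulator: Integer + String -> Integer
                    Integer::sum);                    // combiner: merges partial sums
    }

    public static void main(String[] args) {
        System.out.println(totalLength(List.of("a", "bb", "ccc"))); // 6
    }
}
```

Here the identity is 0 on purpose. In a parallel stream the identity may be applied once per chunk, so a non-neutral value such as 10 could be added several times.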

Optional reduce(BinaryOperator accumulator)#

This method differs in that it does not have an initial identity object. The first element of the stream serves as the identity. Since the stream can be empty and the identity object will not be assigned, the result of the function is an Optional, allowing you to handle this situation by returning Optional.empty().

Optional<Integer> result = Stream.<Integer>empty()
    .reduce((acc, x) -> acc + x);
System.out.println(result.isPresent());
// false

Optional<Integer> sum = Stream.of(1, 2, 3, 4, 5)
    .reduce((acc, x) -> acc + x);
System.out.println(sum.get());
// 15
int sum = IntStream.of(2, 4, 6, 8)
    .reduce(/*!interactive answer='5'*/, (acc, x) -> acc + x);
// sum: 25

int product = IntStream.range(0, 10)
    .filter(x -> x++ % 4 == 0)
    .reduce((acc, x) -> acc * x)
    .getAsInt();
// product: !interactive answer='0'

Optional min(Comparator comparator)#

Optional max(Comparator comparator)#

Finds the minimum/maximum element based on the provided comparator. Internally, reduce is called:

reduce((a, b) -> comparator.compare(a, b) <= 0 ? a : b); // min
reduce((a, b) -> comparator.compare(a, b) >= 0 ? a : b); // max

int min = Stream.of(20, 11, 45, 78, 13)
    .min(Integer::compare).get();
// min: 11

int max = Stream.of(20, 11, 45, 78, 13)
    .max(Integer::compare).get();
// max: 78

Optional findAny()#

Returns the first encountered element of the stream. In parallel streams, this can be any element that was in the split part of the sequence.

Optional findFirst()#

Guaranteed to return the first element of the stream, even if the stream is parallel.

If any element is needed, findAny() will work faster for parallel streams.

int anySeq = IntStream.range(4, 65536)
    .findAny()
    .getAsInt();
// anySeq: 4

int firstSeq = IntStream.range(4, 65536)
    .findFirst()
    .getAsInt();
// firstSeq: 4

int anyParallel = IntStream.range(4, 65536)
    .parallel()
    .findAny()
    .getAsInt();
// anyParallel: 32770

int firstParallel = IntStream.range(4, 65536)
    .parallel()
    .findFirst()
    .getAsInt();
// firstParallel: 4

boolean allMatch(Predicate predicate)#

Returns true if all elements of the stream satisfy the predicate condition. If any element is found for which the predicate function returns false, the operator stops looking at the elements and returns false.

boolean result = Stream.of(1, 2, 3, 4, 5)
    .allMatch(x -> x <= 7);
// result: true
boolean result = Stream.of(1, 2, 3, 4, 5)
    .allMatch(x -> x < 3);
// result: false
boolean result = Stream.of(120, 410, 85, 32, 314, 12)
    .allMatch(x -> x % 2 == 0);
// result: !interactive size='4' answer='false'

boolean anyMatch(Predicate predicate)#

Returns true if at least one element of the stream satisfies the predicate condition. If such an element is found, there is no need to continue iterating through the elements, so the result is returned immediately.

boolean result = Stream.of(1, 2, 3, 4, 5)
    .anyMatch(x -> x == 3);
// result: true
boolean result = Stream.of(1, 2, 3, 4, 5)
    .anyMatch(x -> x == 8);
// result: false
boolean result = Stream.of(120, 410, 85, 32, 314, 12)
    .anyMatch(x -> x % 22 == 0);
// result: !interactive size='4' answer='false'

boolean noneMatch(Predicate predicate)#

Returns true if, after iterating through all elements of the stream, none satisfy the predicate condition. If any element is found for which the predicate function returns true, the operator stops iterating through the elements and returns false.

boolean result = Stream.of(1, 2, 3, 4, 5)
    .noneMatch(x -> x == 9);
// result: true
boolean result = Stream.of(1, 2, 3, 4, 5)
    .noneMatch(x -> x == 3);
// result: false
boolean result = Stream.of(120, 410, 86, 32, 314, 12)
    .noneMatch(x -> x % 2 == 1);
// result: !interactive size='4' answer='true'

OptionalDouble average()#

Only for primitive streams. Returns the arithmetic mean of all elements, or OptionalDouble.empty() if the stream is empty.

double result = IntStream.range(2, 16)
    .average()
    .getAsDouble();
// result: 8.5

sum()#

Returns the sum of the elements of a primitive stream. For IntStream, the result will be of type int; for LongStream, long; for DoubleStream, double.

long result = LongStream.range(2, 16)
    .sum();
// result: 119

IntSummaryStatistics summaryStatistics()#

A useful method for primitive streams. Allows you to collect statistics about the numeric sequence of the stream, namely: the number of elements, their sum, arithmetic mean, minimum and maximum element.

LongSummaryStatistics stats = LongStream.range(2, 16)
    .summaryStatistics();
System.out.format("  count: %d%n", stats.getCount());
System.out.format("    sum: %d%n", stats.getSum());
System.out.format("average: %.1f%n", stats.getAverage());
System.out.format("    min: %d%n", stats.getMin());
System.out.format("    max: %d%n", stats.getMax());
//   count: 14
//     sum: 119
// average: 8,5
//     min: 2
//     max: 15

7. Collectors#

7.1. Collectors methods#

toList()#

The most common method. Collects elements into a List.


toSet()#

Collects elements into a Set.


toCollection(Supplier collectionFactory)#

Collects elements into the specified collection. If you need to specify exactly which List, Set, or other collection to use, this method will help.

Deque<Integer> deque = Stream.of(1, 2, 3, 4, 5)
    .collect(Collectors.toCollection(ArrayDeque::new));

Set<Integer> set = Stream.of(1, 2, 3, 4, 5)
    .collect(Collectors.toCollection(LinkedHashSet::new));

toMap(Function keyMapper, Function valueMapper)#

Collects elements into a Map. Each element is transformed into a key and a value based on the results of the keyMapper and valueMapper functions, respectively. If you need to return the same element that came in, you can pass Function.identity().

Map<Integer, Integer> map1 = Stream.of(1, 2, 3, 4, 5)
    .collect(Collectors.toMap(
        Function.identity(),
        Function.identity()
    ));
// {1=1, 2=2, 3=3, 4=4, 5=5}

Map<Integer, String> map2 = Stream.of(1, 2, 3)
    .collect(Collectors.toMap(
        Function.identity(),
        i -> String.format("%d * 2 = %d", i, i * 2)
    ));
// {1="1 * 2 = 2", 2="2 * 2 = 4", 3="3 * 2 = 6"}

Map<Character, String> map3 = Stream.of(50, 54, 55)
    .collect(Collectors.toMap(
        i -> (char) i.intValue(),
        i -> String.format("<%d>", i)
    ));
// {'2'="<50>", '6'="<54>", '7'="<55>"}

toMap(Function keyMapper, Function valueMapper, BinaryOperator mergeFunction)#

Similar to the first version of the method, but in the case where two identical keys are encountered, it allows merging the values.

Map<Integer, String> map4 = Stream.of(50, 55, 69, 20, 19, 52)
    .collect(Collectors.toMap(
        i -> i % 5,
        i -> String.format("<%d>", i),
        (a, b) -> String.join(", ", a, b)
    ));
// {0="<50>, <55>, <20>", 2="<52>", 4="<69>, <19>"}

In this case, for the numbers 50, 55, and 20, the key is the same and equals 0, so the values accumulate. Similarly for 69 and 19.

toMap(Function keyMapper, Function valueMapper, BinaryOperator mergeFunction, Supplier mapFactory)#

The same as above, but allows specifying exactly which Map class to use.

Map<Integer, String> map5 = Stream.of(50, 55, 69, 20, 19, 52)
    .collect(Collectors.toMap(
        i -> i % 5,
        i -> String.format("<%d>", i),
        (a, b) -> String.join(", ", a, b),
        LinkedHashMap::new
    ));
// {0=<50>, <55>, <20>, 4=<69>, <19>, 2=<52>}

The difference between this example and the previous one is that now the order is preserved, thanks to LinkedHashMap.

toConcurrentMap(Function keyMapper, Function valueMapper)
toConcurrentMap(Function keyMapper, Function valueMapper, BinaryOperator mergeFunction)
toConcurrentMap(Function keyMapper, Function valueMapper, BinaryOperator mergeFunction, Supplier mapFactory)#

The same as toMap, but works with ConcurrentMap.
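
A short sketch of the three-argument toConcurrentMap on a parallel stream. The helper name and the sample words are assumptions for illustration. The merge function sums the values when two words share the same first letter.

```java
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ToConcurrentMapExample {
    // Hypothetical helper: total length of the words starting with each letter.
    static ConcurrentMap<Character, Integer> lengthsByFirstLetter(Stream<String> words) {
        return words
            .parallel()
            .collect(Collectors.toConcurrentMap(
                w -> w.charAt(0),   // key: first letter
                String::length,     // value: word length
                Integer::sum));     // merge: add lengths for equal keys
    }

    public static void main(String[] args) {
        ConcurrentMap<Character, Integer> map = lengthsByFirstLetter(
            Stream.of("apple", "avocado", "banana", "cherry", "blueberry"));
        System.out.println(map.get('a')); // 12
        System.out.println(map.get('b')); // 15
        System.out.println(map.get('c')); // 6
    }
}
```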


collectingAndThen(Collector downstream, Function finisher)#

Collects elements using the specified collector and then applies a function to the result.

List<Integer> list = Stream.of(1, 2, 3, 4, 5)
    .collect(Collectors.collectingAndThen(
        Collectors.toList(),
        Collections::unmodifiableList));
System.out.println(list.getClass());
// class java.util.Collections$UnmodifiableRandomAccessList

List<String> list2 = Stream.of("a", "b", "c", "d")
    .collect(Collectors.collectingAndThen(
            Collectors.toMap(Function.identity(), s -> s + s),
            map -> map.entrySet().stream()))
    .map(e -> e.toString())
    .collect(Collectors.collectingAndThen(
            Collectors.toList(),
            Collections::unmodifiableList));
list2.forEach(System.out::println);
// a=aa
// b=bb
// c=cc
// d=dd

joining()
joining(CharSequence delimiter)
joining(CharSequence delimiter, CharSequence prefix, CharSequence suffix)#

Collects elements that implement the CharSequence interface into a single string. Additionally, you can specify a delimiter, as well as a prefix and suffix for the entire sequence.

String s1 = Stream.of("a", "b", "c", "d")
    .collect(Collectors.joining());
System.out.println(s1);
// abcd

String s2 = Stream.of("a", "b", "c", "d")
    .collect(Collectors.joining("-"));
System.out.println(s2);
// a-b-c-d

String s3 = Stream.of("a", "b", "c", "d")
    .collect(Collectors.joining(" -> ", "[ ", " ]"));
System.out.println(s3);
// [ a -> b -> c -> d ]

summingInt(ToIntFunction mapper)#

summingLong(ToLongFunction mapper)#

summingDouble(ToDoubleFunction mapper)#

A collector that converts objects to int/long/double and calculates the sum.

averagingInt(ToIntFunction mapper)#

averagingLong(ToLongFunction mapper)#

averagingDouble(ToDoubleFunction mapper)#

Similarly, but calculates the average value.

summarizingInt(ToIntFunction mapper)#

summarizingLong(ToLongFunction mapper)#

summarizingDouble(ToDoubleFunction mapper)#

Similarly, but with full statistics.

Integer sum = Stream.of("1", "2", "3", "4")
    .collect(Collectors.summingInt(Integer::parseInt));
System.out.println(sum);
// 10

Double average = Stream.of("1", "2", "3", "4")
    .collect(Collectors.averagingInt(Integer::parseInt));
System.out.println(average);
// 2.5

DoubleSummaryStatistics stats = Stream.of("1.1", "2.34", "3.14", "4.04")
    .collect(Collectors.summarizingDouble(Double::parseDouble));
System.out.println(stats);
// DoubleSummaryStatistics{count=4, sum=10.620000, min=1.100000, average=2.655000, max=4.040000}

All these methods, along with a few that follow, are mostly used as downstream collectors for grouping or collectingAndThen. They are rarely used on their own, as in the examples above; those examples only show what each collector returns.
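
As a rough illustration of the downstream role, the sketch below passes summingInt to groupingBy (described later in this section). The helper name and the grouping rule (n / 10) are made up for the example.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DownstreamExample {
    // Hypothetical helper: groups numbers by their tens and sums each group.
    static Map<Integer, Integer> sumByTens(Stream<Integer> numbers) {
        return numbers.collect(Collectors.groupingBy(
            n -> n / 10,                                // 3 -> 0, 12 -> 1, 25 -> 2
            TreeMap::new,                               // sorted keys for stable output
            Collectors.summingInt(Integer::intValue))); // downstream: sum per group
    }

    public static void main(String[] args) {
        System.out.println(sumByTens(Stream.of(3, 7, 12, 18, 25)));
        // {0=10, 1=30, 2=25}
    }
}
```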


counting()#

Counts the number of elements.

Long count = Stream.of("1", "2", "3", "4")
    .collect(Collectors.counting());
System.out.println(count);
// 4

filtering(Predicate predicate, Collector downstream)#

mapping(Function mapper, Collector downstream)#

flatMapping(Function mapper, Collector downstream)#

reducing(BinaryOperator op)
reducing(T identity, BinaryOperator op)
reducing(U identity, Function mapper, BinaryOperator op)#

A special group of collectors that apply the filter, map, flatMap, and reduce operations. filtering and flatMapping were introduced in Java 9.

List<Integer> ints = Stream.of(1, 2, 3, 4, 5, 6)
    .collect(Collectors.filtering(
        x -> x % 2 == 0,
        Collectors.toList()));
// 2, 4, 6

String s1 = Stream.of(1, 2, 3, 4, 5, 6)
    .collect(Collectors.filtering(
        x -> x % 2 == 0,
        Collectors.mapping(
            x -> Integer.toString(x),
            Collectors.joining("-")
        )
    ));
// 2-4-6

String s2 = Stream.of(2, 0, 1, 3, 2)
    .collect(Collectors.flatMapping(
        x -> IntStream.range(0, x).mapToObj(Integer::toString),
        Collectors.joining(", ")
    ));
// 0, 1, 0, 0, 1, 2, 0, 1

int value = Stream.of(1, 2, 3, 4, 5, 6)
    .collect(Collectors.reducing(
        0, (a, b) -> a + b
    ));
// 21

String s3 = Stream.of(1, 2, 3, 4, 5, 6)
    .collect(Collectors.reducing(
        "", x -> Integer.toString(x), (a, b) -> a + b
    ));
// 123456

minBy(Comparator comparator)#

maxBy(Comparator comparator)#

Finds the minimum/maximum element based on the given comparator.

Optional<String> min = Stream.of("ab", "c", "defgh", "ijk", "l")
    .collect(Collectors.minBy(Comparator.comparing(String::length)));
min.ifPresent(System.out::println);
// c

Optional<String> max = Stream.of("ab", "c", "defgh", "ijk", "l")
    .collect(Collectors.maxBy(Comparator.comparing(String::length)));
max.ifPresent(System.out::println);
// defgh

groupingBy(Function classifier)
groupingBy(Function classifier, Collector downstream)
groupingBy(Function classifier, Supplier mapFactory, Collector downstream)#

Groups elements by a criterion and stores the result in a Map. Combined with the aggregating collectors presented above, it lets you collect data flexibly. The Examples section has more on combining collectors.

groupingByConcurrent(Function classifier)
groupingByConcurrent(Function classifier, Collector downstream)
groupingByConcurrent(Function classifier, Supplier mapFactory, Collector downstream)#

A similar set of methods, but stores data in a ConcurrentMap.

Map<Integer, List<String>> map1 = Stream.of(
    "ab", "c", "def", "gh", "ijk", "l", "mnop")
    .collect(Collectors.groupingBy(String::length));
map1.entrySet().forEach(System.out::println);
// 1=[c, l]
// 2=[ab, gh]
// 3=[def, ijk]
// 4=[mnop]

Map<Integer, String> map2 = Stream.of(
    "ab", "c", "def", "gh", "ijk", "l", "mnop")
    .collect(Collectors.groupingBy(
        String::length,
        Collectors.mapping(
            String::toUpperCase,
            Collectors.joining())
    ));
map2.entrySet().forEach(System.out::println);
// 1=CL
// 2=ABGH
// 3=DEFIJK
// 4=MNOP

Map<Integer, List<String>> map3 = Stream.of(
    "ab", "c", "def", "gh", "ijk", "l", "mnop")
    .collect(Collectors.groupingBy(
        String::length,
        LinkedHashMap::new,
        Collectors.mapping(
            String::toUpperCase,
            Collectors.toList())
    ));
map3.entrySet().forEach(System.out::println);
// 2=[AB, GH]
// 1=[C, L]
// 3=[DEF, IJK]
// 4=[MNOP]
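
The concurrent variant is used the same way. The sketch below is a minimal example with an illustrative helper name. It uses counting() as the downstream collector, so the result doesn't depend on the order in which a parallel stream happens to process the elements.

```java
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class GroupingByConcurrentExample {
    // Hypothetical helper: counts words of each length in a parallel stream.
    static ConcurrentMap<Integer, Long> countByLength(Stream<String> words) {
        return words
            .parallel()
            .collect(Collectors.groupingByConcurrent(
                String::length,
                Collectors.counting()));
    }

    public static void main(String[] args) {
        ConcurrentMap<Integer, Long> map = countByLength(
            Stream.of("ab", "c", "def", "gh", "ijk", "l", "mnop"));
        System.out.println(map.get(1)); // 2
        System.out.println(map.get(4)); // 1
    }
}
```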

partitioningBy(Predicate predicate)
partitioningBy(Predicate predicate, Collector downstream)#

Another interesting method. Splits a sequence of elements based on some criterion. Elements that satisfy the given condition go into one part, and the rest go into the other.

Map<Boolean, List<String>> map1 = Stream.of(
    "ab", "c", "def", "gh", "ijk", "l", "mnop")
    .collect(Collectors.partitioningBy(s -> s.length() <= 2));
map1.entrySet().forEach(System.out::println);
// false=[def, ijk, mnop]
// true=[ab, c, gh, l]

Map<Boolean, String> map2 = Stream.of(
    "ab", "c", "def", "gh", "ijk", "l", "mnop")
    .collect(Collectors.partitioningBy(
        s -> s.length() <= 2,
        Collectors.mapping(
            String::toUpperCase,
            Collectors.joining())
    ));
map2.entrySet().forEach(System.out::println);
// false=DEFIJKMNOP
// true=ABCGHL

teeing(Collector downstream1, Collector downstream2, BiFunction merger)#

Introduced in Java 12. If you're familiar with the Unix tee command, you may have already guessed what this collector does. It passes elements to two different collectors, downstream1 and downstream2, and then merges their results with the merger function.

String result = Stream.of(
    "ab", "c", "def", "gh", "ijk", "l", "mnop")
    .collect(Collectors.teeing(
        Collectors.counting(),
        Collectors.filtering(s -> s.length() <= 2, Collectors.toList()),
        (elementsCount, filteredList) ->
            "From the %d elements, only %d were filtered: %s"
                .formatted(elementsCount, filteredList.size(), filteredList)
        ));
System.out.println(result);
// From the 7 elements, only 4 were filtered: [ab, c, gh, l]

record Range(int min, int max) {}
Range range = new Random(42).ints(7)
    .boxed()
    .collect(Collectors.teeing(
        Collectors.minBy(Integer::compare),
        Collectors.maxBy(Integer::compare),
        (min, max) -> new Range(
            min.orElse(Integer.MIN_VALUE),
            max.orElse(Integer.MAX_VALUE)
        )));
System.out.println(range);
// Range[min=-1360544799, max=1325939940]

8. Collector#

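The java.util.stream.Collector interface is used to collect stream elements into a mutable container. It consists of the following methods:

  - supplier() returns a function that creates a new empty container.
  - accumulator() returns a function that adds an element to a container.
  - combiner() returns a function that merges two containers; it's needed for parallel streams.
  - finisher() returns a function that converts the container into the final result.
  - characteristics() returns the set of the collector's characteristics.

Characteristics:

  - CONCURRENT: the accumulator can be called from multiple threads on the same container.
  - UNORDERED: the collector doesn't preserve the encounter order of the elements.
  - IDENTITY_FINISH: the finisher is the identity function and can be skipped.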
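
To make the roles of the four functions more concrete, here is a minimal sketch built with Collector.of. The collector name concatenating is an assumption made for this example; it behaves roughly like Collectors.joining() without a delimiter and is not how the JDK implements it.

```java
import java.util.stream.Collector;
import java.util.stream.Stream;

class CollectorPartsExample {
    // Hypothetical collector: concatenates character sequences into one String.
    static Collector<CharSequence, StringBuilder, String> concatenating() {
        return Collector.of(
            () -> new StringBuilder(),           // supplier: new empty container
            (sb, s) -> sb.append(s),             // accumulator: add one element
            (left, right) -> left.append(right), // combiner: merge two containers
            StringBuilder::toString);            // finisher: container -> result
    }

    public static void main(String[] args) {
        String result = Stream.of("a", "b", "c").collect(concatenating());
        System.out.println(result);
        // abc
    }
}
```

No characteristics are passed here, so the collector is treated as ordered, non-concurrent, and requiring the finisher.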

8.1. Implementing a custom collector#

Before implementing a collector, make sure that the task can't be solved using a combination of standard collectors.
For example, if you need to collect only unique elements into a list, you can first collect the elements into a LinkedHashSet to preserve the order, and then add all elements to an ArrayList. The combination of collectingAndThen with toCollection and a function that passes the obtained Set to the ArrayList constructor does what's intended:

Stream.of(1, 2, 3, 1, 9, 2, 5, 3, 4, 8, 2)
    .collect(Collectors.collectingAndThen(
        Collectors.toCollection(LinkedHashSet::new),
        ArrayList::new));
// 1 2 3 9 5 4 8

However, if the task is to collect unique elements into one part and duplicate elements into another, for example, in a Map<Boolean, List>, then using partitioningBy will not be very elegant:

final Set<Integer> elements = new HashSet<>();
Stream.of(1, 2, 3, 1, 9, 2, 5, 3, 4, 8, 2)
    .collect(Collectors.partitioningBy(elements::add))
    .forEach((isUnique, list) -> System.out.format("%s: %s%n", isUnique ? "unique" : "repetitive", list));

Here you have to create a Set and use it in the collector predicate, which is undesirable. You can turn the lambda into an anonymous class, but that's even worse:

new Predicate<Integer>() {
    final Set<Integer> elements = new HashSet<>();
    @Override
    public boolean test(Integer t) {
        return elements.add(t);
    }
}

There are two ways to create your own collector:

  1. Create a class that implements the Collector interface.
  2. Use the Collector.of factory.

If the collector needs to be generic, the second option still works: create a generic static method and call Collector.of inside it.

Here's the resulting collector:

public static <T> Collector<T, ?, Map<Boolean, List<T>>> partitioningByUniqueness() {
    return Collector.<T, Map.Entry<List<T>, Set<T>>, Map<Boolean, List<T>>>of(
        () -> new AbstractMap.SimpleImmutableEntry<>(
                new ArrayList<T>(), new LinkedHashSet<>()),
        (c, e) -> {
            if (!c.getValue().add(e)) {
                c.getKey().add(e);
            }
        },
        (c1, c2) -> {
            c1.getKey().addAll(c2.getKey());
            for (T e : c2.getValue()) {
                if (!c1.getValue().add(e)) {
                    c1.getKey().add(e);
                }
            }
            return c1;
        },
        c -> {
            Map<Boolean, List<T>> result = new HashMap<>(2);
            result.put(Boolean.FALSE, c.getKey());
            result.put(Boolean.TRUE, new ArrayList<>(c.getValue()));
            return result;
        });
}

Let's break it down.

The Collector interface is declared as:

interface Collector<T, A, R>

The signature of the method returning the collector is:

public static <T> Collector<T, ?, Map<Boolean, List<T>>> partitioningByUniqueness()

It accepts elements of type T and returns a Map<Boolean, List<T>>, similar to partitioningBy. The question mark (wildcard) in the middle parameter indicates that the internal implementation type is not important for the public API. Many methods in the Collectors class contain a wildcard as the container type.

return Collector.<T, Map.Entry<List<T>, Set<T>>, Map<Boolean, List<T>>>of

Here, the type of the container had to be specified. Since Java doesn't have a Pair or Tuple class, two different types can be placed in a Map.Entry.

// supplier
() -> new AbstractMap.SimpleImmutableEntry<>(
        new ArrayList<>(), new LinkedHashSet<>())

The container will be AbstractMap.SimpleImmutableEntry. The key will contain the list of duplicate elements, and the value will contain the set of unique elements.

// accumulator
(c, e) -> {
    if (!c.getValue().add(e)) {
        c.getKey().add(e);
    }
}

Here, if an element cannot be added to the set (because it already exists there), it's added to the list of duplicate elements.

// combiner
(c1, c2) -> {
    c1.getKey().addAll(c2.getKey());
    for (T e : c2.getValue()) {
        if (!c1.getValue().add(e)) {
            c1.getKey().add(e);
        }
    }
    return c1;
}

We need to combine two Map.Entry objects. The lists of duplicate elements can be combined easily, but with unique elements it's not that simple — you need to go through each element and repeat everything that was done in the accumulator function.
By the way, the accumulator lambda can be assigned to a variable, and then the loop can be turned into c2.getValue().forEach(e -> accumulator.accept(c1, e));.

// finisher
c -> {
    Map<Boolean, List<T>> result = new HashMap<>(2);
    result.put(Boolean.FALSE, c.getKey());
    result.put(Boolean.TRUE, new ArrayList<>(c.getValue()));
    return result;
}

Finally, return the result. Unique elements will be in map.get(Boolean.TRUE), and duplicate elements in map.get(Boolean.FALSE).

Map<Boolean, List<Integer>> map;
map = Stream.of(1, 2, 3, 1, 9, 2, 5, 3, 4, 8, 2)
    .collect(partitioningByUniqueness());
// {false=[1, 2, 3, 2], true=[1, 2, 3, 9, 5, 4, 8]}

A good practice is to create collectors that accept another collector and depend on it. For example, you can collect elements not only into a List but also into any other collection (Collectors.toCollection), or into a string (Collectors.joining).

public static <T, D, A> Collector<T, ?, Map<Boolean, D>> partitioningByUniqueness(
        Collector<? super T, A, D> downstream) {
    class Holder<A, B> {
        // Not final: the downstream combiner is allowed to return a new container.
        A unique, repetitive;
        final B set;
        Holder(A unique, A repetitive, B set) {
            this.unique = unique;
            this.repetitive = repetitive;
            this.set = set;
        }
    }
    BiConsumer<A, ? super T> downstreamAccumulator = downstream.accumulator();
    BinaryOperator<A> downstreamCombiner = downstream.combiner();
    BiConsumer<Holder<A, Set<T>>, T> accumulator = (t, element) -> {
        A container = t.set.add(element) ? t.unique : t.repetitive;
        downstreamAccumulator.accept(container, element);
    };
    return Collector.<T, Holder<A, Set<T>>, Map<Boolean, D>>of(
            () -> new Holder<>(
                downstream.supplier().get(),
                downstream.supplier().get(),
                new HashSet<>() ),
            accumulator,
            (t1, t2) -> {
                // Keep the combiner's result: it may be a new container, not t1.repetitive.
                t1.repetitive = downstreamCombiner.apply(t1.repetitive, t2.repetitive);
                t2.set.forEach(e -> accumulator.accept(t1, e));
                return t1;
            },
            t -> {
                Map<Boolean, D> result = new HashMap<>(2);
                result.put(Boolean.FALSE, downstream.finisher().apply(t.repetitive));
                result.put(Boolean.TRUE, downstream.finisher().apply(t.unique));
                t.set.clear();
                return result;
            });
}

The algorithm remains the same, but now you can't collect unique elements directly into the second container, so a separate set has to be created. For convenience, a Holder class is also added, which stores the two containers for unique and duplicate elements, as well as the set itself.

All operations now need to be performed through the passed collector, called downstream. It will provide a container of the required type (downstream.supplier().get()), add an element to this container (downstream.accumulator().accept(container, element)), combine containers and create the final result.

Stream.of(1, 2, 3, 1, 9, 2, 5, 3, 4, 8, 2)
    .map(String::valueOf)
    .collect(partitioningByUniqueness(Collectors.joining("-")))
    .forEach((isUnique, str) -> System.out.format("%s: %s%n", isUnique ? "unique" : "repetitive", str));
// repetitive: 1-2-3-2
// unique: 1-2-3-9-5-4-8

By the way, the first implementation of the method can now be replaced with:

public static <T> Collector<T, ?, Map<Boolean, List<T>>> partitioningByUniqueness() {
    return partitioningByUniqueness(Collectors.toList());
}
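
For the first approach, a class that implements the Collector interface directly, a minimal sketch might look like this. DistinctListCollector is a hypothetical name; it solves the simpler task from the beginning of this section (unique elements collected into a list) so that each interface method stays short.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;
import java.util.stream.Stream;

// Hypothetical collector: gathers distinct elements into a List in first-seen order.
class DistinctListCollector<T> implements Collector<T, Set<T>, List<T>> {

    @Override
    public Supplier<Set<T>> supplier() {
        return LinkedHashSet::new;
    }

    @Override
    public BiConsumer<Set<T>, T> accumulator() {
        return Set::add;
    }

    @Override
    public BinaryOperator<Set<T>> combiner() {
        return (left, right) -> {
            left.addAll(right);
            return left;
        };
    }

    @Override
    public Function<Set<T>, List<T>> finisher() {
        return ArrayList::new;
    }

    @Override
    public Set<Characteristics> characteristics() {
        // The finisher converts Set to List, so IDENTITY_FINISH must not be set.
        return Collections.emptySet();
    }

    public static void main(String[] args) {
        List<Integer> result = Stream.of(1, 2, 3, 1, 9, 2, 5, 3, 4, 8, 2)
            .collect(new DistinctListCollector<Integer>());
        System.out.println(result);
        // [1, 2, 3, 9, 5, 4, 8]
    }
}
```

Compared with Collector.of, the class is more verbose, but each part gets a named method, which may be easier to document and test.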

9. Spliterator#

It's time to dig a little deeper into the Stream API. Stream elements not only need to be iterated, but also split into parts and sent to other threads. Spliterator is responsible for iterating and splitting. It even sounds like Iterator, but with the prefix "Split".

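Interface methods:

  - trySplit() attempts to split off part of the elements into a new spliterator; returns null if splitting isn't possible.
  - tryAdvance(Consumer action) passes the next element to action and returns true, or returns false if no elements remain.
  - estimateSize() returns an estimate of the number of remaining elements, or Long.MAX_VALUE if it's unknown.
  - characteristics() returns the characteristics of the spliterator as a bit mask.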

9.1. Characteristics#

In the sorted and distinct methods, it was mentioned that if a stream is marked as sorted or containing unique elements, the corresponding operations will not be performed. The characteristics of the spliterator influence this behavior.

Of course, characteristics can be changed when executing a chain of operators. For example, the SORTED characteristic is added after the sorted operator, the SIZED characteristic is removed after the filter, etc.
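
A small sketch for checking this yourself, assuming an OpenJDK-based runtime. The helper method has is made up for the example; Stream.spliterator() and Spliterator.hasCharacteristics() are the standard API.

```java
import java.util.Spliterator;
import java.util.stream.Stream;

class CharacteristicsExample {
    // Hypothetical helper: asks the stream's spliterator for a single characteristic.
    static boolean has(Stream<?> stream, int characteristic) {
        return stream.spliterator().hasCharacteristics(characteristic);
    }

    public static void main(String[] args) {
        // A stream over an array knows its size, until a filter is applied.
        System.out.println(has(Stream.of(3, 1, 2), Spliterator.SIZED));
        // true
        System.out.println(has(Stream.of(3, 1, 2).filter(x -> x > 1), Spliterator.SIZED));
        // false

        // SORTED appears only after the sorted operator.
        System.out.println(has(Stream.of(3, 1, 2), Spliterator.SORTED));
        // false
        System.out.println(has(Stream.of(3, 1, 2).sorted(), Spliterator.SORTED));
        // true
    }
}
```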

9.2. Spliterator lifecycle#

To understand when and how a spliterator calls a particular method, let's create a wrapper that logs all calls. To create a stream from a spliterator, the StreamSupport class is used.

long count = StreamSupport.stream(
    Arrays.asList(0, 1, 2, 3).spliterator(), true)
    .count();

list-spliterator.png

The figure shows one possible way a spliterator works. The characteristics method returns ORDERED | SIZED | SUBSIZED everywhere, because order matters in a List, and the sizes of the whole sequence and of every split part are known. The trySplit method divides the sequence in half, but not every part is necessarily sent to a new thread. Even in a parallel stream, the main thread may end up processing everything itself. In this run, however, the new thread managed to process its parts before the main thread did.
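
A logging wrapper of the kind described above could be sketched as follows. The class name LoggingSpliterator and the log format are assumptions; this is not the exact wrapper used to produce the figures. Because the order of calls in a parallel stream varies between runs, no expected output is shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.stream.StreamSupport;

// Hypothetical wrapper: delegates every call and records it with the thread name.
class LoggingSpliterator<T> implements Spliterator<T> {
    private final Spliterator<T> delegate;
    private final List<String> log; // must be thread-safe for parallel streams

    LoggingSpliterator(Spliterator<T> delegate, List<String> log) {
        this.delegate = delegate;
        this.log = log;
    }

    private void log(String message) {
        log.add("[" + Thread.currentThread().getName() + "] " + message);
    }

    @Override
    public boolean tryAdvance(Consumer<? super T> action) {
        boolean advanced = delegate.tryAdvance(action);
        log("tryAdvance: " + advanced);
        return advanced;
    }

    @Override
    public Spliterator<T> trySplit() {
        Spliterator<T> prefix = delegate.trySplit();
        log("trySplit: " + (prefix == null ? "null" : "ok"));
        // Wrap the split-off part too, so its calls are logged as well.
        return prefix == null ? null : new LoggingSpliterator<>(prefix, log);
    }

    @Override
    public long estimateSize() {
        long size = delegate.estimateSize();
        log("estimateSize: " + size);
        return size;
    }

    @Override
    public int characteristics() {
        int c = delegate.characteristics();
        log("characteristics: 0x" + Integer.toHexString(c));
        return c;
    }

    @Override
    public Comparator<? super T> getComparator() {
        return delegate.getComparator();
    }

    public static void main(String[] args) {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        Spliterator<Integer> source = Arrays.asList(0, 1, 2, 3).spliterator();
        long count = StreamSupport.stream(new LoggingSpliterator<>(source, log), true)
            .count();
        log.forEach(System.out::println);
        System.out.println("count: " + count);
    }
}
```

The default forEachRemaining is deliberately left in place: it is implemented through tryAdvance, so every traversed element shows up in the log.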

Spliterator<Integer> s = IntStream.range(0, 4)
    .boxed()
    .collect(Collectors.toSet())
    .spliterator();
long count = StreamSupport.stream(s, true).count();

Here the spliterator has the characteristics SIZED | DISTINCT, but for each part the SIZED characteristic is lost, leaving only DISTINCT, because it's not possible to split a set so that the size of each part is known.
In the case of a Set, there were three calls to trySplit. The first one supposedly divided the elements equally. After two others, each of the parts returned estimateSize: 1. However, in all but one case, the attempt to call tryAdvance was unsuccessful, as it returned false. But on one of the portions, which also returned an estimateSize of 1, there were four successful calls to tryAdvance. This confirms that estimateSize doesn't necessarily have to return the actual number of elements.

Arrays.spliterator(new int[] {0, 1, 2, 3});
Stream.of(0, 1, 2, 3).spliterator();

The situation is similar to the List example, but the characteristics returned ORDERED | SIZED | SUBSIZED | IMMUTABLE.

Stream.of(0, 1, 2, 3).distinct().spliterator();

Here, trySplit returned null, meaning it wasn't possible to split the sequence. The call hierarchy:

[main] characteristics: ORDERED | DISTINCT
[main] estimateSize: 4
[main] trySplit: null
[main] characteristics: ORDERED | DISTINCT
[main] tryAdvance: true
[main] tryAdvance: true
[main] tryAdvance: true
[main] tryAdvance: true
[main] tryAdvance: false
count: 4

Stream.of(0, 1, 2, 3)
    .distinct()
    .map(x -> x + 1)
    .spliterator();

Everything is the same as above, but now, after applying the map operator, the DISTINCT flag has disappeared.

9.3. Implementing a spliterator#

To correctly implement a spliterator, you need to think about how to split the elements and define the characteristics of the stream. Let's write a spliterator that generates a Fibonacci sequence.
To simplify the task, we will know the maximum number of elements to generate. This means we can split the sequence in half and then quickly calculate the required numbers based on the new index.
We only need to determine the characteristics. We have already assumed that the size of the sequence will be known, so the size of each split part will also be known. The order will matter, so the ORDERED flag is necessary. The Fibonacci sequence is also sorted, as each subsequent element is always greater than or equal to the previous one.
But with the DISTINCT flag, it seems we have a problem. The sequence 0 1 1 2 3 has a repeating 1, so we can't have this flag, right? Well, nothing prevents us from calculating the flags automatically. If a part of the sequence doesn't include the initial indices, this flag can be set:

int distinct = (index >= 2) ? DISTINCT : 0;
return ORDERED | distinct | SIZED | SUBSIZED | IMMUTABLE | NONNULL;

Full class implementation:

import java.math.BigInteger;
import java.util.Spliterator;
import java.util.function.Consumer;

public class FibonacciSpliterator implements Spliterator<BigInteger> {

    private final int fence;
    private int index;
    private BigInteger a, b;

    public FibonacciSpliterator(int fence) {
        this(0, fence);
    }

    protected FibonacciSpliterator(int start, int fence) {
        this.index = start;
        this.fence = fence;
        recalculateNumbers(start);
    }

    private void recalculateNumbers(int start) {
        a = fastFibonacciDoubling(start);
        b = fastFibonacciDoubling(start + 1);
    }

    @Override
    public boolean tryAdvance(Consumer<? super BigInteger> action) {
        if (index >= fence) {
            return false;
        }
        action.accept(a);
        BigInteger c = a.add(b);
        a = b;
        b = c;
        index++;
        return true;
    }

    @Override
    public FibonacciSpliterator trySplit() {
        int lo = index;
        int mid = (lo + fence) >>> 1;
        if (lo >= mid) {
            return null;
        }
        index = mid;
        recalculateNumbers(mid);
        return new FibonacciSpliterator(lo, mid);
    }

    @Override
    public long estimateSize() {
        return fence - index;
    }

    @Override
    public int characteristics() {
        int distinct = (index >= 2) ? DISTINCT : 0;
        return ORDERED | distinct | SIZED | SUBSIZED | IMMUTABLE | NONNULL;
    }

    /*
     * https://www.nayuki.io/page/fast-fibonacci-algorithms
     */
    public static BigInteger fastFibonacciDoubling(int n) {
        BigInteger a = BigInteger.ZERO;
        BigInteger b = BigInteger.ONE;
        for (int bit = Integer.highestOneBit(n); bit != 0; bit >>>= 1) {
            BigInteger d = a.multiply(b.shiftLeft(1).subtract(a));
            BigInteger e = a.multiply(a).add(b.multiply(b));
            a = d;
            b = e;
            if ((n & bit) != 0) {
                BigInteger c = a.add(b);
                a = b;
                b = c;
            }
        }
        return a;
    }
}

Here's how the elements of a parallel stream are now split:

StreamSupport.stream(new FibonacciSpliterator(7), true)
    .count();

fibonaccispliterator.png

StreamSupport.stream(new FibonacciSpliterator(500), true)
    .count();

fibonaccispliterator500.png

10. Other ways to create sources#

Creating a stream from a spliterator is the most efficient way to create a stream, but there are other methods as well.

10.1. Stream from an iterator#

By using the Spliterators class, you can convert any iterator into a spliterator. Here's an example of creating a stream from an iterator that generates an infinite Fibonacci sequence.

public class FibonacciIterator implements Iterator<BigInteger> {

    private BigInteger a = BigInteger.ZERO;
    private BigInteger b = BigInteger.ONE;

    @Override
    public boolean hasNext() {
        return true;
    }

    @Override
    public BigInteger next() {
        BigInteger result = a;
        a = b;
        b = result.add(b);
        return result;
    }
}

StreamSupport.stream(
    Spliterators.spliteratorUnknownSize(
        new FibonacciIterator(),
        Spliterator.ORDERED | Spliterator.SORTED),
    false /* is parallel*/)
    .limit(10)
    .forEach(System.out::println);

10.2. Stream.iterate + map#

You can combine two operators, iterate and map, to create the same Fibonacci sequence.

Stream.iterate(
    new BigInteger[] { BigInteger.ZERO, BigInteger.ONE },
    t -> new BigInteger[] { t[1], t[0].add(t[1]) })
    .map(t -> t[0])
    .limit(10)
    .forEach(System.out::println);

For convenience, you can wrap everything in a method and call fibonacciStream().limit(10).forEach(...).
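
Such a method might look like the sketch below. The enclosing class name is arbitrary, and the stream is infinite, so the caller is expected to limit it.

```java
import java.math.BigInteger;
import java.util.stream.Stream;

class FibonacciStreams {
    // Infinite stream of Fibonacci numbers: 0, 1, 1, 2, 3, 5, ...
    static Stream<BigInteger> fibonacciStream() {
        return Stream.iterate(
                new BigInteger[] { BigInteger.ZERO, BigInteger.ONE },
                t -> new BigInteger[] { t[1], t[0].add(t[1]) })
            .map(t -> t[0]);
    }

    public static void main(String[] args) {
        fibonacciStream()
            .limit(10)
            .forEach(System.out::println);
        // 0 1 1 2 3 5 8 13 21 34 (one number per line)
    }
}
```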

10.3. IntStream.range + map#

Another flexible and convenient way to create a stream. If you have data that can be accessed by index, you can create a numeric range using the range operator and then use map or mapToObj to access the data element by element.

IntStream.range(0, 200)
    .mapToObj(i -> fibonacci(i))
    .forEach(System.out::println);

JSONArray arr = ...
IntStream.range(0, arr.length())
    .mapToObj(arr::getJSONObject)
    .map(obj -> ...)
    .forEach(System.out::println);
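
A self-contained variant of the same idea, using a plain array instead of a JSON library. The helper name withIndices is made up for the example; it shows one benefit of this approach, namely that the index stays available inside the mapping function.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class IndexedAccessExample {
    // Hypothetical helper: pairs every element with its index, e.g. "0: in.txt".
    static List<String> withIndices(String[] items) {
        return IntStream.range(0, items.length)
            .mapToObj(i -> i + ": " + items[i])
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(withIndices(new String[] { "in.txt", "out.txt", "log.txt" }));
        // [0: in.txt, 1: out.txt, 2: log.txt]
    }
}
```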

11. Examples#

Before moving on to more real-life examples, it's worth mentioning that if the code is already written without streams and works well, there's no need to hastily rewrite everything. There are also situations where it's not possible to implement a task elegantly using the Stream API. In such cases, accept it and don't force streams where they don't fit.

Given an array of arguments, we need to create a Map where each key corresponds to its value.

String[] arguments = {"-i", "in.txt", "--limit", "40", "-d", "1", "-o", "out.txt"};
Map<String, String> argsMap = new LinkedHashMap<>(arguments.length / 2);
for (int i = 0; i < arguments.length; i += 2) {
    argsMap.put(arguments[i], arguments[i + 1]);
}
argsMap.forEach((key, value) -> System.out.format("%s: %s%n", key, value));
// -i: in.txt
// --limit: 40
// -d: 1
// -o: out.txt

Quick and clear. But for the reverse task, converting a Map of arguments into an array of strings, streams will help:

String[] args = argsMap.entrySet().stream()
    .flatMap(e -> Stream.of(e.getKey(), e.getValue()))
    .toArray(String[]::new);
System.out.println(String.join(" ", args));
// -i in.txt --limit 40 -d 1 -o out.txt

Given a list of students:

List<Student> students = List.of(
    new Student("Alex", Speciality.Physics, 1),
    new Student("Rika", Speciality.Biology, 4),
    new Student("Julia", Speciality.Biology, 2),
    new Student("Steve", Speciality.History, 4),
    new Student("Mike", Speciality.Finance, 1),
    new Student("Hinata", Speciality.Biology, 2),
    new Student("Richard", Speciality.History, 1),
    new Student("Kate", Speciality.Psychology, 2),
    new Student("Sergey", Speciality.ComputerScience, 4),
    new Student("Maximilian", Speciality.ComputerScience, 3),
    new Student("Tim", Speciality.ComputerScience, 5),
    new Student("Ann", Speciality.Psychology, 1)
);

enum Speciality {
    Biology, ComputerScience, Economics, Finance,
    History, Philosophy, Physics, Psychology
}

The Student class has all getters and setters, toString, and equals+hashCode implemented.
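
For reference, a Student class compatible with the examples below might look like this sketch. The field names and the toString format are assumptions inferred from the constructor calls and the printed output; the enum is repeated only so that the sketch compiles on its own.

```java
import java.util.Objects;

// Repeated from the text above so that the sketch is self-contained.
enum Speciality {
    Biology, ComputerScience, Economics, Finance,
    History, Philosophy, Physics, Psychology
}

// Hypothetical implementation; only the members used in the examples matter.
class Student {
    private String name;
    private Speciality speciality;
    private int year;

    public Student(String name, Speciality speciality, int year) {
        this.name = name;
        this.speciality = speciality;
        this.year = year;
    }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public Speciality getSpeciality() { return speciality; }
    public void setSpeciality(Speciality speciality) { this.speciality = speciality; }

    public int getYear() { return year; }
    public void setYear(int year) { this.year = year; }

    @Override
    public String toString() {
        return name + ": " + speciality + " " + year;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Student)) return false;
        Student other = (Student) o;
        return year == other.year
            && speciality == other.speciality
            && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, speciality, year);
    }

    public static void main(String[] args) {
        System.out.println(new Student("Alex", Speciality.Physics, 1));
        // Alex: Physics 1
    }
}
```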

We need to group all students by their year of study.

students.stream()
    .collect(Collectors.groupingBy(Student::getYear))
    .entrySet().forEach(System.out::println);
// 1=[Alex: Physics 1, Mike: Finance 1, Richard: History 1, Ann: Psychology 1]
// 2=[Julia: Biology 2, Hinata: Biology 2, Kate: Psychology 2]
// 3=[Maximilian: ComputerScience 3]
// 4=[Rika: Biology 4, Steve: History 4, Sergey: ComputerScience 4]
// 5=[Tim: ComputerScience 5]

Display, in alphabetical order, the specialties that the students in the list are studying.

students.stream()
    .map(Student::getSpeciality)
    .distinct()
    .sorted(Comparator.comparing(Enum::name))
    .forEach(System.out::println);
// Biology
// ComputerScience
// Finance
// History
// Physics
// Psychology

Display the number of students in each specialty.

students.stream()
    .collect(Collectors.groupingBy(
            Student::getSpeciality, Collectors.counting()))
    .forEach((s, count) -> System.out.println(s + ": " + count));
// Psychology: 2
// Physics: 1
// ComputerScience: 3
// Finance: 1
// Biology: 3
// History: 2

Group students by specialty, maintaining the alphabetical order of specialties, and then by year of study.

Map<Speciality, Map<Integer, List<Student>>> result = students.stream()
    .sorted(Comparator
            .comparing(Student::getSpeciality, Comparator.comparing(Enum::name))
            .thenComparing(Student::getYear)
    )
    .collect(Collectors.groupingBy(
            Student::getSpeciality,
            LinkedHashMap::new,
            Collectors.groupingBy(Student::getYear)));

Now, display this nicely.

result.forEach((s, map) -> {
    System.out.println("-= " + s + " =-");
    map.forEach((year, list) -> System.out.format("%d: %s%n", year, list.stream()
            .map(Student::getName)
            .sorted()
            .collect(Collectors.joining(", ")))
    );
    System.out.println();
});
// -= Biology =-
// 2: Hinata, Julia
// 4: Rika
//
// -= ComputerScience =-
// 3: Maximilian
// 4: Sergey
// 5: Tim
//
// -= Finance =-
// 1: Mike
//
// -= History =-
// 1: Richard
// 4: Steve
//
// -= Physics =-
// 1: Alex
//
// -= Psychology =-
// 1: Ann
// 2: Kate

Check if there are any third-year students among all specialties except physics and computer science.

students.stream()
    .filter(s -> !EnumSet.of(Speciality.ComputerScience, Speciality.Physics)
        .contains(s.getSpeciality()))
    .anyMatch(s -> s.getYear() == 3);
// false

Calculate the number Pi using the Monte Carlo method.

final Random rnd = new Random();
final double r = 1000.0;
final int max = 10000000;
long count = IntStream.range(0, max)
    .mapToObj(i -> rnd.doubles(2).map(x -> x * r).toArray())
    .parallel()
    .filter(arr -> Math.hypot(arr[0], arr[1]) <= r)
    .count();
System.out.println(4.0 * count / max);
// 3.1415344

Display a multiplication table.

IntStream.rangeClosed(2, 9)
    .boxed()
    .flatMap(i -> IntStream.rangeClosed(2, 9)
        .mapToObj(j -> String.format("%d * %d = %d", i, j, i * j))
    )
    .forEach(System.out::println);
// 2 * 2 = 4
// 2 * 3 = 6
// 2 * 4 = 8
// 2 * 5 = 10
// ...
// 9 * 7 = 63
// 9 * 8 = 72
// 9 * 9 = 81

Or a more exotic version, in 4 columns, like in school notebooks.

IntFunction<IntFunction<String>> function = i -> j -> String.format("%d x %2d = %2d", i, j, i * j);
IntFunction<IntFunction<IntFunction<String>>> repeaterX = count -> i -> j ->
        IntStream.range(0, count)
                .mapToObj(delta -> function.apply(i + delta).apply(j))
                .collect(Collectors.joining("\t"));
IntFunction<IntFunction<IntFunction<IntFunction<String>>>> repeaterY = countY -> countX -> i -> j ->
        IntStream.range(0, countY)
                .mapToObj(deltaY -> repeaterX.apply(countX).apply(i).apply(j + deltaY))
                .collect(Collectors.joining("\n"));
IntFunction<String> row = i -> repeaterY.apply(10).apply(4).apply(i).apply(1) + "\n";
IntStream.of(2, 6).mapToObj(row).forEach(System.out::println);

[Image: multable.jpg, the multiplication table printed in four columns]

But of course, this is a joke. No one is forcing you to write such code. 😬
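
For comparison, here is a sketch of the same four-column layout written with ordinary named methods instead of curried IntFunction chains. The class and method names (MultiplicationTable, cell, line, block) are illustrative and not part of the original code; the format string and the 10-row, 4-column layout are taken from it.

```java
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class MultiplicationTable {
    // One cell, e.g. "2 x  1 =  2"
    static String cell(int i, int j) {
        return String.format("%d x %2d = %2d", i, j, i * j);
    }

    // One printed line: `columns` cells for factors first..first+columns-1, tab-separated
    static String line(int first, int columns, int j) {
        return IntStream.range(first, first + columns)
                .mapToObj(i -> cell(i, j))
                .collect(Collectors.joining("\t"));
    }

    // A block of `rows` lines, for multipliers 1..rows
    static String block(int first, int columns, int rows) {
        return IntStream.rangeClosed(1, rows)
                .mapToObj(j -> line(first, columns, j))
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        // Two blocks: factors 2..5 and 6..9, separated by a blank line
        IntStream.of(2, 6)
                .mapToObj(first -> block(first, 4, 10) + "\n")
                .forEach(System.out::println);
    }
}
```

Each helper does one thing, so the stream pipelines stay short and the parameters have names instead of positions in a chain of apply() calls.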

12. Tasks

IntStream.concat(
        IntStream.range(2, /*!interactive answer='6'*/),
        IntStream.rangeClosed(/*!interactive answer='-1'*/, /*!interactive answer='2'*/))
    .forEach(System.out::println);
// 2, 3, 4, 5, -1, 0, 1, 2

IntStream.range(5, 30)
    .limit(12)
    .skip(3)
    .limit(6)
    .skip(2)
    .forEach(System.out::println);
// !interactive answer='10', !interactive answer='11', !interactive answer='12', !interactive answer='13'

IntStream.range(0, 10)
    .skip(2)
    .dropWhile(x -> x < /*!interactive answer='5'*/)
    .limit(/*!interactive answer='3'*/)
    .forEach(System.out::println);
// 5, 6, 7

IntStream.range(0, 10)
    .skip(/*!interactive answer='3'*/)
    .takeWhile(x -> x < /*!interactive answer='5'*/)
    .limit(3)
    .forEach(System.out::println);
// 3, 4

IntStream.range(1, 5)
    .flatMap(i -> IntStream.generate(() -> /*!interactive answer='i'*/)./*!interactive size='5' answer='limit'*/(/*!interactive answer='i'*/))
    .forEach(System.out::println);
// 1, 2, 2, 3, 3, 3, 4, 4, 4, 4

int x = IntStream.range(-2, 2)
    .map(i -> i * /*!interactive answer='5'*/)
    .reduce(10, Integer::sum);
// x: 0

IntStream.range(0, 10)
    .boxed()
    .collect(Collectors./*!interactive size='12' answer='partitioningBy'*/(i -> /*!interactive size='10' answer='i % 2 == 0'*/))
    .entrySet().forEach(System.out::println);
// false=[1, 3, 5, 7, 9]
// true=[0, 2, 4, 6, 8]

IntStream.range(-5, 0)
    .flatMap(i -> IntStream.of(i, /*!interactive answer='-i'*/))
    ./*!interactive size='7' answer='sorted'*/()
    .forEach(System.out::println);
// -5, -4, -3, -2, -1, 1, 2, 3, 4, 5

IntStream.range(-5, 0)
    .flatMap(i -> IntStream.of(i, /*!interactive answer='-i'*/))
    ./*!interactive size='5' answer='boxed'*/()
    .sorted(Comparator.comparing(Math::/*!interactive size='5' answer='abs'*/))
    .forEach(System.out::println);
// -1, 1, -2, 2, -3, 3, -4, 4, -5, 5

IntStream.range(1, 5)
    .flatMap(i -> IntStream.generate(() -> i).limit(i))
    .boxed()
    .collect(Collectors.groupingBy(Function.identity(), Collectors./*!interactive size='8' answer='counting'*/()))
    .entrySet().forEach(System.out::println);
// 1=1
// 2=2
// 3=3
// 4=4

13. Tips and best practices

  1. If you can't solve a problem elegantly with streams, don't solve it with streams.

  2. If you can't solve a problem elegantly with streams, don't solve it with streams!

  3. If the problem is already elegantly solved without streams, everything works, and everyone is satisfied, don't rewrite it with streams!

  4. In most cases, there's no need to store a stream in a variable. Use method chaining instead.

    // Hard to read
    Stream<Integer> stream = list.stream();
    stream = stream.filter(x -> x > 2);
    stream.forEach(System.out::println);
    // Better
    list.stream()
        .filter(x -> x > 2)
        .forEach(System.out::println);
    
  5. Try to filter out unnecessary elements or limit the stream first, and only then perform transformations.

    // Inefficient
    list.stream()
        .sorted()
        .filter(x -> x > 0)
        .forEach(System.out::println);
    // Better
    list.stream()
        .filter(x -> x > 0)
        .sorted()
        .forEach(System.out::println);
    
  6. Don't use parallel streams everywhere. The overhead of splitting the elements, processing them in other threads, and then merging the results can sometimes outweigh the gain from parallelism, leaving the parallel stream slower than a sequential one. Read more about it here: When to use parallel streams.

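  7. When using parallel streams, make sure there are no blocking operations or anything else that could stall element processing. By default, all parallel streams share the common ForkJoinPool, so one blocked task can hold up the others.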

    list.parallelStream()
        .filter(s -> isFileExists(hash(s)))
        ...
    
  8. If somewhere in the model you return a copy of a list or another collection, consider returning a stream instead. For example:

    // Before
    class Model {
        private final List<String> data;
    
        public List<String> getData() {
            return new ArrayList<>(data);
        }
    }
    
    // After
    class Model {
        private final List<String> data;
    
        public Stream<String> dataStream() {
            return data.stream();
        }
    }
    

Now you can get not only a list with model.dataStream().collect(toList()), but also a set or any other collection, and you can filter, sort, and so on before collecting. The original List<String> data remains untouched.
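
A short usage sketch of this idea follows. The constructor and the sample data are assumptions added to make the example self-contained; the article's Model only shows the field and the dataStream() method. ModelDemo is an illustrative name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class Model {
    private final List<String> data;

    // Assumed constructor, not shown in the article
    Model(List<String> data) {
        this.data = new ArrayList<>(data);
    }

    public Stream<String> dataStream() {
        return data.stream();
    }
}

class ModelDemo {
    public static void main(String[] args) {
        Model model = new Model(List.of("pear", "apple", "fig", "apple"));

        // The caller decides which collection it needs
        List<String> asList = model.dataStream().collect(Collectors.toList());
        Set<String> asSet = model.dataStream().collect(Collectors.toSet());

        // Filtering and sorting happen on the stream, not on the model's list
        List<String> longSorted = model.dataStream()
                .filter(s -> s.length() > 3)
                .sorted()
                .collect(Collectors.toList());

        System.out.println(asList);       // [pear, apple, fig, apple]
        System.out.println(asSet.size()); // 3
        System.out.println(longSorted);   // [apple, apple, pear]
    }
}
```

Each call to dataStream() creates a fresh stream, so the method can be called as many times as needed, while a single stream can still be consumed only once.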