From Imperative Programming to Fork/Join to Parallel Streams in
Java 8
Posted by Raoul-Gabriel Urma & Mario Fusco on Feb 21, 2014
Java 8 brings many features that let you write code in a more concise way. For example, instead
of writing code as follows:
Collections.sort(transactions, new Comparator<Transaction>(){
    public int compare(Transaction t1, Transaction t2){
        return t1.getValue().compareTo(t2.getValue());
    }
});
you can now write the following more compact code that does the same thing but reads a lot
closer to the problem statement:
transactions.sort(comparing(Transaction::getValue));
The major features introduced by Java 8 are lambda expressions, method references and the new
Streams API. It is considered the largest language change since the advent of Java 20 years ago.
To find detailed practical examples of how you can benefit from these features refer to the book
Java 8 in Action: Lambdas, Streams and Functional-style programming written by the authors of
this article and Alan Mycroft.
These features enable programmers to write more concise code, and additionally they let
programmers benefit from multi-core architecture. In fact, writing programs that execute gracefully
in parallel is currently the preserve of Java specialists. However, thanks to its new Streams API,
Java 8 changes the game and lets everyone more easily write code that leverages multi-core
architecture.
In this article we will compare different methods to compute the variance of a large data set using:
1. An imperative style
2. The fork/join framework
3. The Streams API
The variance is used in statistics to measure how far a set of numbers is spread out. It can be
calculated by averaging the squared differences from the mean of the set of numbers. For
example, given the numbers 40, 30, 50 and 80 representing the ages of a population, we can
calculate the variance by:
1. calculating the mean: (40 + 30 + 50 + 80) / 4 = 50
2. summing the squared differences from the mean: (40-50)^2 + (30-50)^2 + (50-50)^2 + (80-50)^2 = 1400
3. finally averaging it: 1400 / 4 = 350
http://www.infoq.com/articles/forkjoin-to-parallel-streams[30/09/2014 22:16:07]
Imperative style
A typical imperative implementation of the variance formula is as follows:
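A minimal sketch of such an implementation is shown here. It follows the three steps listed earlier; the class and method names are assumptions chosen for illustration, and the population is assumed to be held in a non-empty double[].

```java
// Illustrative sketch: names are assumptions, not taken from the article.
class ImperativeVariance {

    // Population variance: the mean of the squared differences from the mean.
    static double variance(double[] population) {
        // First pass: compute the mean.
        double sum = 0.0;
        for (double p : population) {
            sum += p;
        }
        double average = sum / population.length;

        // Second pass: accumulate the squared differences from the mean.
        double squaredDiffs = 0.0;
        for (double p : population) {
            squaredDiffs += (p - average) * (p - average);
        }
        return squaredDiffs / population.length;
    }

    public static void main(String[] args) {
        double[] ages = {40, 30, 50, 80};
        System.out.println(variance(ages)); // prints 350.0
    }
}
```

The two loops state exactly how the result is computed, one element at a time, which is what makes this style hard to parallelise without rewriting it.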
Fork/Join framework
However, how would you write this implementation to execute on multiple-core architectures?
Should you use threads? Should they synchronise at some point? The fork/join framework
introduced in Java 7 alleviated some of these difficulties, so let’s try to develop a parallel version
of this algorithm using it.
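public class ForkJoinCalculator extends RecursiveTask<Double> {
    // The fields (numbers, start, end, sequentialCalculator), the THRESHOLD
    // constant and the constructor are not shown in this listing.

    @Override
    protected Double compute() {
        int length = end - start;
        if (length <= THRESHOLD) {
            // Small enough: process this slice sequentially.
            return sequentialCalculator.computeSequentially(numbers, start, end);
        }
        // Fork the left half so that another worker thread can pick it up.
        ForkJoinCalculator leftTask =
            new ForkJoinCalculator(numbers, start, start + length / 2, sequentialCalculator);
        leftTask.fork();
        // Compute the right half in the current thread, then wait for the left half.
        ForkJoinCalculator rightTask =
            new ForkJoinCalculator(numbers, start + length / 2, end, sequentialCalculator);
        Double rightResult = rightTask.compute();
        Double leftResult = leftTask.join();
        return leftResult + rightResult;
    }
}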
Here we develop a RecursiveTask that keeps splitting an array of doubles until the length of a
subarray drops to or below a given threshold. At that point the subarray is processed sequentially
by applying the operation defined by an interface with a single method,
computeSequentially(numbers, start, end).
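The pieces described above can be put together as in the following sketch. The interface name SequentialCalculator, the threshold value, the use of the common pool and the helper methods parallelSum and variance are all assumptions made for illustration; only the compute() method mirrors the listing above.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Assumed name: the operation applied sequentially to one slice of the array.
interface SequentialCalculator {
    double computeSequentially(double[] numbers, int start, int end);
}

class ForkJoinCalculator extends RecursiveTask<Double> {
    static final int THRESHOLD = 10_000; // assumed cut-off for sequential work

    private final double[] numbers;
    private final int start;
    private final int end;
    private final SequentialCalculator sequentialCalculator;

    ForkJoinCalculator(double[] numbers, int start, int end,
                       SequentialCalculator sequentialCalculator) {
        this.numbers = numbers;
        this.start = start;
        this.end = end;
        this.sequentialCalculator = sequentialCalculator;
    }

    @Override
    protected Double compute() {
        int length = end - start;
        if (length <= THRESHOLD) {
            return sequentialCalculator.computeSequentially(numbers, start, end);
        }
        ForkJoinCalculator leftTask =
            new ForkJoinCalculator(numbers, start, start + length / 2, sequentialCalculator);
        leftTask.fork(); // schedule the left half asynchronously
        ForkJoinCalculator rightTask =
            new ForkJoinCalculator(numbers, start + length / 2, end, sequentialCalculator);
        Double rightResult = rightTask.compute(); // right half in this thread
        Double leftResult = leftTask.join();      // wait for the left half
        return leftResult + rightResult;
    }

    // Hypothetical helper: one parallel reduction over the whole array.
    static double parallelSum(double[] numbers, SequentialCalculator calculator) {
        return ForkJoinPool.commonPool()
                .invoke(new ForkJoinCalculator(numbers, 0, numbers.length, calculator));
    }

    // Hypothetical helper: variance as two parallel reductions (mean, then squared differences).
    static double variance(double[] population) {
        double average = parallelSum(population, (n, s, e) -> {
            double total = 0.0;
            for (int i = s; i < e; i++) total += n[i];
            return total;
        }) / population.length;
        double squaredDiffs = parallelSum(population, (n, s, e) -> {
            double total = 0.0;
            for (int i = s; i < e; i++) total += (n[i] - average) * (n[i] - average);
            return total;
        });
        return squaredDiffs / population.length;
    }
}
```

Calling ForkJoinCalculator.variance(population) runs both passes in parallel. Even in this small form, the splitting logic, the threshold and the task bookkeeping take up far more code than the arithmetic itself.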
The bottom line is that, even with the help of the fork/join framework, the parallel version is
significantly harder to write, and eventually debug, than its sequential counterpart.
Parallel Streams
Java 8 lets you achieve this in a different way. Instead of writing how a computation should be
implemented, you describe what it does in broad brush strokes using the Streams API. As a
result, the library can figure out how to implement the computation for you and make use of
various optimisations. This style is called declarative programming. In Java 8 specifically, a
parallel stream is designed to leverage a multi-core architecture. Let’s see how you can use one
to speed up our first attempt at calculating the variance.
We assume that you have some familiarity with streams in this section. However as a refresher, a
Stream<T> is a sequence of elements T that support aggregate operations. You can use these
operations to create a pipeline which represents a computation just like a pipeline of UNIX
commands. A parallel stream is simply a stream that will execute the pipeline in parallel and can
be obtained by calling the method parallel() on a normal stream. To brush up on what a stream is,
refer to the Javadoc documentation.
The good news is that a few numeric operations such as max, min and average are built into the
Java 8 API. They can be accessed through primitive specialisations of a Stream: IntStream
(primitive int-valued elements), LongStream (primitive long-valued elements) and DoubleStream
(primitive double-valued elements). For example, you can create a range of numbers with
IntStream.rangeClosed() and calculate the maximum or minimum element in a stream using the
methods max() and min().
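As a small illustration of these built-in operations, the sketch below applies max(), min() and average() to a range of integers. The class and method names are assumptions for the example; the calls themselves are part of java.util.stream.IntStream.

```java
import java.util.stream.IntStream;

// Illustrative sketch: class and method names are assumptions.
class NumericStreams {

    // max() returns an OptionalInt because the stream may be empty;
    // getAsInt() throws NoSuchElementException in that case.
    static int maxOf(int from, int to) {
        return IntStream.rangeClosed(from, to).max().getAsInt();
    }

    static int minOf(int from, int to) {
        return IntStream.rangeClosed(from, to).min().getAsInt();
    }

    // parallel() asks the library to split the work across the available cores.
    static double averageOf(int from, int to) {
        return IntStream.rangeClosed(from, to).parallel().average().getAsDouble();
    }

    public static void main(String[] args) {
        System.out.println(maxOf(1, 100));     // prints 100
        System.out.println(minOf(1, 100));     // prints 1
        System.out.println(averageOf(1, 100)); // prints 50.5
    }
}
```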
Coming back to our initial problem, we would like to use these operations to calculate the variance
of a large population. The first step is to create a stream from the population array. We can
achieve this using the Arrays.stream() static method:
The next step is to calculate the variance which makes use of the average. Each element of the
population needs first to have the average subtracted from it and the result squared. This can be
viewed as a map operation which transforms each element into another one using a lambda
expression (double p) -> (p - average) * (p - average). Once this is done we can calculate the sum
of all resulting elements by calling the method sum().
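The map and sum steps can be sketched as follows. To keep the example self-contained, the average is passed in as a parameter; the class and method names are assumptions.

```java
import java.util.Arrays;

// Illustrative sketch: class and method names are assumptions.
class SquaredDifferences {

    // Maps each element to its squared distance from the given average,
    // then sums the mapped values.
    static double sumOfSquaredDifferences(double[] population, double average) {
        return Arrays.stream(population)
                .parallel()
                .map(p -> (p - average) * (p - average))
                .sum();
    }
}
```

Dividing the result by population.length gives the variance. For the ages 40, 30, 50 and 80 with an average of 50, the sum is 1400 and the variance is 350.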
But not so fast! Streams can only be consumed once. If we re-use populationStream we will get
an IllegalStateException.
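Reusing a consumed stream fails at runtime, and the usual remedy is to create a fresh stream for each pass. The sketch below shows both; the class and method names are assumptions, and the variable populationStream is named after the one mentioned in the text.

```java
import java.util.Arrays;
import java.util.stream.DoubleStream;

// Illustrative sketch: class and method names are assumptions.
class StreamReuse {

    // Returns true if a second terminal operation on the same stream
    // raises IllegalStateException.
    static boolean reuseFails(double[] population) {
        DoubleStream populationStream = Arrays.stream(population);
        populationStream.average(); // terminal operation: consumes the stream
        try {
            populationStream.sum(); // second terminal operation on the same stream
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Creating a new stream for each pass avoids the problem.
    static double variance(double[] population) {
        double average = Arrays.stream(population).parallel().average().orElse(0.0);
        return Arrays.stream(population).parallel()
                .map(p -> (p - average) * (p - average))
                .sum() / population.length;
    }
}
```

Arrays.stream() does not copy the array, so creating a second stream over the same data is cheap.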
By making use of built-in operations in the Streams API we’ve rewritten our initial imperative style
code in a declarative and concise way which reads almost like the mathematical definition of the
variance. Let’s now explore the performance of the three versions of our implementation.
Benchmark
We wrote the three versions of our variance algorithm in very different styles. The streams version
is the most concise and is written declaratively, which allows the library to decide on an adequate
implementation and leverage the multi-core infrastructure. But how do the three versions
perform? To find out, let’s create a benchmark to see how they compare. We
calculate the variance of a population of 30 million random numbers between 1 and 140. We used
JMH to investigate the performance of each version. JMH is a Java benchmarking harness supported by
OpenJDK. You can run the benchmark yourself by cloning the project from GitHub.
The benchmark was run on a MacBook Pro with a 2.3 GHz quad-core Intel Core i7 and 16 GB of 1600
MHz DDR3 memory. In addition, we used the following version of JDK8:
The results are illustrated in the histogram below. The imperative version took 60ms, the fork/join
version 22ms and the streams version 46ms.
These numbers should be treated with caution. It’s likely that you will get very different
performance if you run the test on a 32-bit JVM for example. However, it is interesting to notice
that adopting a different programming style using the Streams API in Java 8 opens the door for
optimisations behind the scenes that are not possible in a strictly imperative style and in a much
more straightforward way than is possible with fork/join.
Mario Fusco is a senior software engineer at Red Hat working on the development
of the core of Drools, the JBoss rules engine. He has extensive experience as a
Java developer, having been involved in (and often leading) many enterprise level
projects in industries ranging from media companies to the financial sector.
Among his interests are functional programming and domain specific languages.
By leveraging these two passions, he created the open source library lambdaj with
the purpose of providing an internal Java DSL for manipulating collections and of allowing a bit
of functional programming in Java. Twitter: @mariofusco.
Community comments
Excellent article. Seldom do I see someone comparing sequential/parallel together. The
Oracle developers seem focused only on getting something to work, not how well it works.
Perhaps I can shine a light on why the basic F/J is so much faster than the streams version.
The streams do not do parallel processing, they do paraquential,
coopsoft.com/ar/Calamity2Article.html#para
Since F/J is essentially a failure it is necessary to switch to sequential to avoid stack overflows
and out of memory errors.
Yes, I wrote the article. I've been doing parallel applications for several decades and I pointed
out the problems with F/J years ago to no avail.
ed
It turns out you don't actually need to know the average before streaming in order to calculate
the variance. As explained in the wikipedia article Algorithms for calculating variance, there are
mathematical approaches that let you calculate the variance in a single pass, and they can
also be modified to run in parallel. Here is a small snippet that makes use of collect() to do so:
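A sketch along those lines is given here. It is not the commenter's original code: the class and member names are assumptions, the per-element update is Welford's method and the merge step is the parallel variant described in the same Wikipedia article. It returns the population variance (dividing by n), to match the definition used in the article.

```java
import java.util.stream.DoubleStream;

// Illustrative sketch: names are assumptions, not the commenter's code.
class OnePassVariance {

    // Mutable accumulator: running count, mean and M2
    // (sum of squared differences from the current mean).
    static final class Stats {
        long count;
        double mean;
        double m2;

        // Welford's update for a single new value.
        void accept(double x) {
            count++;
            double delta = x - mean;
            mean += delta / count;
            m2 += delta * (x - mean);
        }

        // Merges another partial result, as needed for parallel execution.
        void combine(Stats other) {
            if (other.count == 0) {
                return;
            }
            long total = count + other.count;
            double delta = other.mean - mean;
            m2 += other.m2 + delta * delta * count * other.count / total;
            mean += delta * other.count / total;
            count = total;
        }

        double populationVariance() {
            return count == 0 ? 0.0 : m2 / count;
        }
    }

    // Works on any DoubleStream, sequential or parallel, in a single pass.
    static double variance(DoubleStream values) {
        return values.collect(Stats::new, Stats::accept, Stats::combine)
                .populationVariance();
    }
}
```

For example, OnePassVariance.variance(Arrays.stream(population).parallel()) computes the variance without a separate pass for the average.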
An added benefit of this approach is that the input argument doesn't need to be an array, it
could be an iterator or stream; there is no need to keep all values in memory at once.
The emphasis in the article as I see it is more on parallel computing and readability. I agree
there are better algorithms, however I think you are missing the point.
I did understand that the focus of the article wasn't on how to calculate variance, that was kind
of obvious from the title. :-) And if the article had at least mentioned that the implementation
presented was very naive and pointed out that much better approaches exist, that would have
been fine. I agree that a proper implementation of calculating variance would have made the
article harder to read.
However, as the article currently stands, a reader might believe that the implementations
shown here are suitable for real use. Which sadly is not the case, for reasons listed in the
wikipedia article I link to.
I got different results on my Core i3 laptop; here they are for 30M records:
average :49.99299774874399
Sequential variance : 833.454385192641
Time : 160ms
average :49.992997748747015
Fork variance : 833.4543851928006
Time : 237ms
average :49.99299774874777
Stream variance : 833.4543851928023
Time : 273ms
small correction to stream variance calculation above using collect and results : (variance2
from Markus Krüger really ROCKS!!!)
---------------
average :50.00379441649668
Sequential variance : 833.6599070601363
Time : 158ms
average :50.003794416505926
Fork variance : 833.6599070600581
Time : 236ms
average :50.00379441650572
Stream variance : 833.6599070600663
Time : 274ms
average :50.00379441650572
Stream variance2 : 833.6599070600663
Time : 68ms
---------------
Interesting results, thanks for sharing. However, for a fair comparison of runtimes, you should
modify sequential variance and fork variance as well to use a one-pass algorithm. Here are a
couple of implementations you could use:
            return tmpA;
        }
    }
    VarianceTmp tmp =
        new ForkJoinPool().invoke(new VarianceTask(0, values.length));
    return (tmp.count <= 1) ? 0 : tmp.m2 / (tmp.count - 1);
}
Also note that my previous way of combining means was imprecise. I've replaced
with
in the fork/join code, you should do the same for the stream code.
The preceding code contains a precision bug. Please replace
with
Compensated sum by Paul Sandoz
The reason for some of the differences between the F/J version and the stream version is that
the average operation performs a compensated sum (Kahan summation).
My guess is if you do that then the results will be much closer (my results indicate a parallel
uncompensated sum is about 2x faster than a compensated sum)
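For readers unfamiliar with the technique, a compensated (Kahan) sum can be sketched as below. This is a generic illustration with assumed names, not the JDK's internal implementation.

```java
// Illustrative sketch of Kahan summation: names are assumptions.
class KahanSum {

    // Carries a correction term that recovers the low-order bits
    // lost when a small value is added to a large running total.
    static double sum(double[] values) {
        double sum = 0.0;
        double compensation = 0.0;
        for (double v : values) {
            double y = v - compensation;  // apply the correction from the previous step
            double t = sum + y;           // low-order bits of y may be lost here
            compensation = (t - sum) - y; // recover what was lost (negated)
            sum = t;
        }
        return sum;
    }

    // Plain summation, for comparison.
    static double naiveSum(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum;
    }
}
```

The extra subtractions on every element are the cost of the improved accuracy, which is consistent with the slowdown described in this comment.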
I thought I could optimize the average operation by removing some redundant computation, but
it appears the Kahan summation dominates. See the following issue for details:
bugs.openjdk.java.net/browse/JDK-8035561
Oh, and one can safely ignore the first comment on this thread referring to paraquential, it's
completely wrong.
Not so fast. Paraquential doesn’t just mean a single submission of each Task with invoke(), it
also pertains to sequential processing anywhere parallel processing should be done.