Václav is a programming enthusiast who's constantly seeking ways to make development more effective and enjoyable. He's particularly interested in server-side Java technologies, distributed systems, concurrency, agile methodologies, modern programming languages and DSLs. He works for JetBrains as a senior software developer and a technology evangelist. He is also a board member of the JetBrains Academy. On the side, he leads the GPars project, an open-source concurrency library, and investigates the domains of neural networks, evolutionary programming and data mining. You can check out his blog or follow him on Twitter.

Parallelize your arrays with JSR 166y

08.26.2008

JSR-166y, which I described earlier in the post dedicated to the fork/join functionality, is really amazing in how nicely it shields developers from dealing with threads. I've spent some time experimenting with another very handy capability of JSR-166y: Parallel Arrays.

In brief, they allow you to perform operations like filtering or calculations on the elements of a collection in parallel. What's important is that very little knowledge of concurrent programming is required to use Parallel Arrays. All you need to do is define a handful of predicates and mapping operations and chain them so that they get applied to your collection.

Internally, all the actions are split into small tasks, which are handed off to a thread pool for concurrent processing. So if you need to perform time-consuming operations on a collection of objects, Parallel Arrays will allow you to leverage the full power of your multi-core processor and complete the operation in the shortest possible time. Look at the following piece of code, which prints out the names of all the women in my little collection of people:

// Requires the JSR-166y preview jar on the classpath; friends is assumed to be a Person[] array.

// Create a thread pool and store the collection in a Parallel Array
final ForkJoinPool pool = new ForkJoinPool(2);
final ParallelArray<Person> people = ParallelArray.createFromCopy(friends, pool);

// Create a predicate for filtering
final Ops.Predicate<Person> aWoman = new Ops.Predicate<Person>() {
    public boolean op(final Person friend) {
        return !friend.isMale();
    }
};

// Create an operation to retrieve the name from an element
final Ops.Op<Person, String> retrieveName = new Ops.Op<Person, String>() {
    public String op(final Person friend) {
        return friend.getName();
    }
};

// Perform filtering and mapping concurrently on all elements of the collection
final ParallelArray<String> namesOfWomen =
        people.withFilter(aWoman).withMapping(retrieveName).all();
System.out.println("namesOfWomen = " + namesOfWomen);

 

As you can see, there's no thread manipulation or synchronization involved. It's plain Java code without any tricks.
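
Just for comparison, a hand-rolled version built on a plain ExecutorService might look roughly like the sketch below. The collectNamesOfWomen helper is purely illustrative: it submits one task per element and then blocks on the futures to gather the results.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// A hand-rolled equivalent for comparison: one task per element, results collected by hand
static List<String> collectNamesOfWomen(final Person[] friends) throws Exception {
    final ExecutorService executor = Executors.newFixedThreadPool(2);
    try {
        // Submit one task per person
        final List<Future<String>> futures = new ArrayList<Future<String>>();
        for (final Person friend : friends) {
            futures.add(executor.submit(new Callable<String>() {
                public String call() {
                    return friend.isMale() ? null : friend.getName();
                }
            }));
        }
        // Collect the results, skipping the men
        final List<String> namesOfWomen = new ArrayList<String>();
        for (final Future<String> future : futures) {
            final String name = future.get(); // blocks until the task finishes
            if (name != null) {
                namesOfWomen.add(name);
            }
        }
        return namesOfWomen;
    } finally {
        executor.shutdown();
    }
}

All of that plumbing, from pool creation through task submission to result collection and shutdown, is exactly what Parallel Arrays take off your hands.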

Let's make it more Groovy now

When you combine the power of Parallel Arrays with Groovy closures, you can shorten the code considerably.

 

// Requires the same JSR-166y classes on the classpath (ForkJoinPool, ParallelArray
// and the nested Ops.Predicate and Ops.Op interfaces, imported as Predicate and Op)
final ForkJoinPool pool = new ForkJoinPool(2)
final ParallelArray people = ParallelArray.createFromCopy(friends, pool)

def namesOfWomen =
        people.withFilter({ !it.male } as Predicate).withMapping({ it.name } as Op).all()
println "namesOfWomen = $namesOfWomen"

 

Of course, both my filter and my mapping operation are quick and don't really need parallelization, but if we wanted, for instance, to extract and modify photographs of some people, Parallel Arrays could roughly halve the time needed for that operation on a dual-core processor. Parallel Arrays certainly pay off in situations where the operations demand more computing power.
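
To give a rough idea, such a compute-heavy variant might look like the following sketch. The Photo class, the rawPhotos array and the toGrayscale() method are made up for illustration and stand in for whatever expensive per-element transformation you actually need.

// Illustrative only: Photo, rawPhotos and toGrayscale() are hypothetical placeholders
final ForkJoinPool pool = new ForkJoinPool(2);
final ParallelArray<Photo> photos = ParallelArray.createFromCopy(rawPhotos, pool);

// An expensive per-element transformation
final Ops.Op<Photo, Photo> toGrayscale = new Ops.Op<Photo, Photo>() {
    public Photo op(final Photo photo) {
        return photo.toGrayscale(); // CPU-intensive work, run on the pool's worker threads
    }
};

// The mapping is applied to all elements concurrently
final ParallelArray<Photo> grayscalePhotos = photos.withMapping(toGrayscale).all();

The pattern is exactly the same as before; only the cost of each op() call changes, and that is where the pool's worker threads start to earn their keep.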

Reposted from Vaclav Pech's Weblog

Published at DZone with permission of its author, Václav Pech.