org.infinispan.distexec.mapreduce
Class MapReduceTask<K,V,T,R>

java.lang.Object
  extended by org.infinispan.distexec.AbstractDistributedTask<K,V,T,R>
      extended by org.infinispan.distexec.mapreduce.MapReduceTask<K,V,T,R>

public class MapReduceTask<K,V,T,R>
extends AbstractDistributedTask<K,V,T,R>

MapReduceTask is a distributed task which allows a large scale computation to be transparently parallelized across Infinispan cluster nodes.

Users of MapReduceTask should provide name of the cache whose data is used as input for this task. Infinispan execution environment will instantiate and migrate instances of provided mappers and reducers seamlessly across Infinispan nodes.

Unless otherwise specified using onKeys input keys filter all available key value pairs of a specified cache will be used as input data for this task In a nutshell, map reduce task is executed in following fashion:

 On each Infinispan node:

 mapped = list() 
 for entry in cache.entries: 
    t = mapper.map(entry.key, entry.value)
    mapped.add(t)
 
 r = null 
 for t in mapped: 
    r = reducer.reduce(t, r)
 return r to Infinispan node that invoked the task
 
 On Infinispan node invoking this task: 
 reduced_results = invoke map reduce task on all nodes, retrieve map{address:result} 
 for r in reduced_results.entries: 
    remote_address = r.key 
    remote_reduced_result = r.value
    collator.add(remote_address, remote_reduced_result)

 return collator.collate()
 

Since:
5.0
Author:
Manik Surtani, Vladimir Blagojevic

Constructor Summary
MapReduceTask(Cache<K,V> cache)
           
 
Method Summary
 R collate(Collator<R> mapper)
          Specifies collator to use for this MapReduceTask and returns a result of this task's computation
 Future<R> collateAsynchronously(Collator<R> mapper)
          Specifies collator to use for this MapReduceTask and returns a result of this task's computation asynchronously
 MapReduceTask<K,V,T,R> mappedWith(Mapper<K,V,T> mapper)
          Specifies mapper to use for this MapReduceTask
 MapReduceTask<K,V,T,R> onKeys(K... input)
          Rather than use all available keys as input onKeys allows users to specify a subset of keys as input to this task
 MapReduceTask<K,V,T,R> reducedWith(Reducer<R,T> reducer)
          Specifies reducer to use for this MapReduceTask
 
Methods inherited from class org.infinispan.distexec.AbstractDistributedTask
executionMap, getCacheName, setCancelationPolicy, setExecutionNodeSplittingPolicy, setFailOverPolicy
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

MapReduceTask

public MapReduceTask(Cache<K,V> cache)
Method Detail

onKeys

public MapReduceTask<K,V,T,R> onKeys(K... input)
Rather than use all available keys as input onKeys allows users to specify a subset of keys as input to this task

Parameters:
input - input keys for this task
Returns:
this task

mappedWith

public MapReduceTask<K,V,T,R> mappedWith(Mapper<K,V,T> mapper)
Specifies mapper to use for this MapReduceTask

Parameters:
mapper -
Returns:

reducedWith

public MapReduceTask<K,V,T,R> reducedWith(Reducer<R,T> reducer)
Specifies reducer to use for this MapReduceTask

Parameters:
reducer -
Returns:

collate

public R collate(Collator<R> mapper)
Specifies collator to use for this MapReduceTask and returns a result of this task's computation

Parameters:
mapper -
Returns:

collateAsynchronously

public Future<R> collateAsynchronously(Collator<R> mapper)
Specifies collator to use for this MapReduceTask and returns a result of this task's computation asynchronously

Parameters:
mapper -
Returns:


Copyright © 2011 JBoss, a division of Red Hat. All Rights Reserved.