I want to do something like this:
val myBigRdd2: RDD[_] = ???
myBigRdd1.mapPartitions { dataBlock =>
  // operation involving dataBlock and another RDD,
  // like myBigRdd2.multiply(dataBlock)
  // if myBigRdd2 is a matrix, or something similar
}
Is there a way to pass an RDD to the executors?
I don't think broadcasting myBigRdd2 will work, because it is too big.
Calling collect and grouped on myBigRdd1 won't work either, because the driver will run out of memory.
Is there any other way?
cartesian works, but takes forever.
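To make the cartesian variant concrete, here is a minimal sketch of what it can look like. The data layout is an assumption for illustration only (rows of the left matrix and columns of the right matrix, each keyed by its index), and the names `CartesianSketch`, `dot`, `rows`, and `cols` are hypothetical. It runs in local mode on tiny data.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.{SparkConf, SparkContext}

object CartesianSketch {
  // Dot product of two equal-length vectors.
  def dot(a: Array[Double], b: Array[Double]): Double =
    a.zip(b).map { case (x, y) => x * y }.sum

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("cartesian-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Assumed layout: (rowIndex, row) for the left matrix,
    // (colIndex, column) for the right matrix.
    val rows: RDD[(Int, Array[Double])] =
      sc.parallelize(Seq(0 -> Array(1.0, 2.0), 1 -> Array(3.0, 4.0)))
    val cols: RDD[(Int, Array[Double])] =
      sc.parallelize(Seq(0 -> Array(5.0, 6.0), 1 -> Array(7.0, 8.0)))

    // cartesian pairs every row with every column, so the number of
    // pairs is rows.count * cols.count; this is why it scales badly.
    val product: RDD[((Int, Int), Double)] =
      rows.cartesian(cols).map { case ((i, r), (j, c)) => ((i, j), dot(r, c)) }

    // collect is only acceptable here because the example is tiny.
    product.collect().sortBy(_._1).foreach(println)
    sc.stop()
  }
}
```

Nothing is collected to the driver or broadcast before the final print, so memory is not the problem; the cost is that every element of one RDD is paired with every element of the other.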