So I have a question about RDD persistence. Say I have an RDD persisted with MEMORY_AND_DISK, and enough memory has since been freed that the blocks spilled to disk could now fit in memory. Is it possible to tell Spark to re-evaluate the persisted RDD's memory usage and move those disk blocks back into memory?
Essentially I'm running into an issue where, after I persist an RDD, the entire RDD doesn't end up in memory until I've queried it several times, which makes the first few runs extremely slow. One thing I'm hoping to try is to initially persist the RDD with MEMORY_AND_DISK and then force the disk-resident data back into memory, roughly along these lines.
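Here's a minimal sketch of what I mean, in Scala; the input path and RDD construction are just placeholders. As far as I can tell, Spark won't let you change the storage level of an RDD that already has one, so the only route I've found is unpersist-then-repersist, which recomputes the data from source rather than moving the existing on-disk blocks:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

val sc = new SparkContext(
  new SparkConf().setAppName("repersist-sketch").setMaster("local[*]"))

// Hypothetical input; any RDD would do.
val rdd = sc.textFile("/tmp/input.txt").persist(StorageLevel.MEMORY_AND_DISK)
rdd.count() // first action materializes the cache (some partitions may spill to disk)

// Spark throws if you call persist() with a different level on an
// already-persisted RDD, so the cached copy has to be dropped first...
rdd.unpersist(blocking = true)

// ...and then re-persisted. The next action recomputes the RDD from source;
// it does not move the previously spilled disk blocks into memory.
rdd.persist(StorageLevel.MEMORY_ONLY)
rdd.count()
```

Is there a way to get the disk partitions promoted into memory without paying for that full recomputation?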