My impression of Python's multiprocessing module is that when you create a new process with multiprocessing.Process(), it makes an entire copy of the current program's memory and the child continues working from there.
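That mental model comes from snippets like this (a minimal sketch, assuming Linux and the default fork start method; counter and child are just placeholder names), where the child starts with a copy of the parent's globals:

import multiprocessing

counter = 0  # global state in the parent

def child():
    # Under fork, the child sees whatever value the global had at fork time
    print "child sees counter =", counter

if __name__ == "__main__":
    counter = 42
    p = multiprocessing.Process(target=child)
    p.start()
    p.join()

As I understand it, on a fork-based start the child would print 42, i.e. it inherits the parent's globals. With that in mind, I'm confused by the behaviour of the following script.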
WARNING: This script will allocate a large amount of memory! Run it with caution!
import multiprocessing
import numpy as np
from time import sleep

# Declare a dictionary globally
bigDict = {}

def sharedMemory():
    # Using numpy, store ~1GB of random data
    # (1000 arrays of 125,000 float64s = 10^9 bytes)
    for i in xrange(1000):
        bigDict[i] = np.random.random(125000)
    bigDict[0] = "Known information"

    # In System Monitor, ~1GB of memory is in use at this point
    sleep(5)

    # Start 4 processes - each should get a copy of the 1GB dict
    for _ in xrange(4):
        p = multiprocessing.Process(target=workerProcess)
        p.start()
    print "Done"

def workerProcess():
    # Sleep - only ~1GB of memory is in use, not the expected ~4GB
    sleep(5)
    # Each process can read the dictionary, as though the memory were shared
    print multiprocessing.current_process().pid, bigDict[0]

if __name__ == "__main__":
    sharedMemory()
The above program illustrates my confusion: the dict appears to be shared between the processes automatically, yet I thought I needed a multiprocessing manager to get that behaviour. Could someone explain what is going on?
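For reference, this is the kind of Manager-based sharing I thought was required (a minimal sketch; the worker function and dict contents are just placeholders):

import multiprocessing

def worker(shared):
    # Reads go through a proxy to the manager's server process
    print multiprocessing.current_process().pid, shared[0]

if __name__ == "__main__":
    manager = multiprocessing.Manager()
    shared = manager.dict()  # proxied dict living in the manager process
    shared[0] = "Known information"
    workers = [multiprocessing.Process(target=worker, args=(shared,))
               for _ in xrange(4)]
    for p in workers:
        p.start()
    for p in workers:
        p.join()

Yet in my original script the workers read bigDict directly, with no manager and no extra memory use.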