When I run a Spark job written with PySpark, a JVM is spawned with an -Xmx1g setting that I cannot seem to override. Here is the ps aux output:
/usr/lib/jvm/jre/bin/java -cp /home/ec2-user/miniconda3/lib/python3.6/site-packages/pyspark/conf:/home/****/miniconda3/lib/python3.6/site-packages/pyspark/jars/* -Xmx1g org.apache.spark.deploy.SparkSubmit pyspark-shell
My question is: how do I set this property? I can set the master's memory using SPARK_DAEMON_MEMORY and SPARK_DRIVER_MEMORY, but neither affects the JVM that PySpark spawns.
I already tried JAVA_OPTS, and I also looked through the package's /bin files, but I couldn't figure out where this value is set.
Setting spark.driver.memory and spark.executor.memory in the job context itself didn't help either.
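For reference, this is roughly what the in-job configuration looked like (a minimal sketch; the app name and memory values are illustrative placeholders, not my actual settings):

    from pyspark import SparkConf
    from pyspark.sql import SparkSession

    # Attempted in-job configuration (values are placeholders).
    # When the driver JVM is launched from the Python process itself,
    # these settings did not change the -Xmx1g of the spawned JVM.
    conf = (
        SparkConf()
        .setAppName("my-job")                  # placeholder app name
        .set("spark.driver.memory", "4g")      # had no visible effect
        .set("spark.executor.memory", "4g")
    )

    spark = SparkSession.builder.config(conf=conf).getOrCreate()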
Edit:
After moving to submitting jobs with spark-submit (the code and infrastructure had evolved from a standalone configuration), everything was resolved. Submitting programmatically (configuring everything via SparkConf in the driver) seems to override some of the cluster's setup.
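For completeness, a sketch of the working setup: memory is passed on the spark-submit command line (or in spark-defaults.conf) and the job itself no longer hard-codes it. The script name and memory values below are illustrative:

    # Submitted with something like:
    #   spark-submit --driver-memory 4g --executor-memory 4g my_job.py
    from pyspark.sql import SparkSession

    # No memory settings here; the driver JVM is created by spark-submit,
    # so --driver-memory takes effect before the JVM starts.
    spark = SparkSession.builder.appName("my-job").getOrCreate()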