unix - Spark on YARN memory (physical + virtual) usage
I am struggling to understand how memory management works with Spark on YARN:
My spark-submit has

--executor-memory 48g --num-executors 2

When I run top -p <pids_of_2_yarn_containers/executors>, I see:

VIRT     RES     %MEM
51.059g  0.015t  ~4    (container 1)
51.039g  0.012t  ~3    (container 2)

The total memory of the system is 380g.
And finally, on YARN, when I click on each of the containers' pages I can see:

Resource: 54272 Memory (container 1)
Resource: 54272 Memory (container 2)

Why do the above metrics not add up? I am requesting 48g for each Spark executor, yet YARN shows 54g, while the OS reports 15 GB of physical memory used (RES column in top) and 51g of virtual memory used (VIRT column).
In yarn-site.xml:
yarn.scheduler.minimum-allocation-mb (this value changes based on cluster RAM capacity) - the minimum allocation for every container request at the RM, in MB. Memory requests lower than this won't take effect, and the specified value will be allocated at minimum.
yarn.scheduler.maximum-allocation-mb (this value changes based on cluster RAM capacity) - the maximum allocation for every container request at the RM, in MB. Memory requests higher than this won't take effect and will be capped to this value, so it acts as the maximum container size.
yarn.nodemanager.resource.memory-mb - the amount of physical memory, in MB, that can be allocated for containers.
yarn.nodemanager.vmem-pmem-ratio - the virtual memory (physical + paged memory) upper limit for each map and reduce task is determined by the virtual memory ratio each YARN container is allowed. It is set by this configuration, and the default value is 2.1.
yarn.nodemanager.resource.cpu-vcores - this property controls the maximum sum of cores used by the containers on each node.
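To make the vmem-pmem ratio concrete, here is a minimal sketch that checks the VIRT figures from the question against the virtual memory ceiling of a 54272 MB container. It assumes the default ratio of 2.1 quoted above and that the ceiling is simply container size times ratio; the helper names are hypothetical.

```python
# Figures taken from the question; 2.1 is the default ratio quoted above.
CONTAINER_MB = 54272
VMEM_PMEM_RATIO = 2.1


def vmem_limit_mb(container_mb, ratio=VMEM_PMEM_RATIO):
    """Virtual memory ceiling for one container, in MB (assumed: size * ratio)."""
    return container_mb * ratio


def gib_to_mb(gib):
    """Convert the 'g' values printed by top into MB."""
    return gib * 1024


limit = vmem_limit_mb(CONTAINER_MB)
for name, virt_gib in (("container 1", 51.059), ("container 2", 51.039)):
    virt_mb = gib_to_mb(virt_gib)
    print(f"{name}: VIRT {virt_mb:.0f} MB, limit {limit:.0f} MB, "
          f"within limit: {virt_mb <= limit}")
```

Under these assumptions both executors sit well below the virtual memory ceiling, so the 51g VIRT value on its own would not be a problem for YARN.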
In mapred-site.xml:
mapreduce.map.memory.mb - the maximum memory each map task will use.
mapreduce.reduce.memory.mb - the maximum memory each reduce task will use.
mapreduce.map.java.opts - the JVM heap size for the map task.
mapreduce.reduce.java.opts - the JVM heap size for the reduce task.
Spark settings:
--executor-memory / spark.executor.memory controls the executor heap size, but JVMs can also use some memory off heap, for example for interned strings and direct byte buffers. The value of the spark.yarn.executor.memoryOverhead property is added to the executor memory to determine the full memory request to YARN for each executor. It defaults to max(384, .07 * spark.executor.memory).
The --num-executors command-line flag or the spark.executor.instances configuration property controls the number of executors requested.
So can you specify the values of the parameters mentioned above? That will help to calculate the memory allocations in your case.
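As a rough sketch of that calculation: the container size YARN reports is the executor heap plus the overhead, rounded up by the scheduler. The function below is hypothetical and rests on two assumptions not confirmed for this cluster: that requests are rounded up to a multiple of yarn.scheduler.minimum-allocation-mb, and that this value is 1024 MB. The overhead factor is left as a parameter because it differs between Spark versions.

```python
def executor_container_mb(executor_memory_mb, overhead_factor,
                          min_overhead_mb=384, allocation_step_mb=1024):
    """Estimate the YARN container size for one Spark executor, in MB.

    overhead = max(min_overhead_mb, overhead_factor * executor memory)
    The sum is then rounded up to a multiple of allocation_step_mb
    (assumed here to equal yarn.scheduler.minimum-allocation-mb).
    """
    overhead = max(min_overhead_mb, int(overhead_factor * executor_memory_mb))
    requested = executor_memory_mb + overhead
    steps = -(-requested // allocation_step_mb)  # integer ceiling division
    return steps * allocation_step_mb


heap_mb = 48 * 1024  # --executor-memory 48g
for factor in (0.07, 0.10):
    print(f"factor {factor:.2f}: {executor_container_mb(heap_mb, factor)} MB")
```

With a factor of 0.07 this estimate gives 53248 MB, while 0.10 gives 54272 MB, the figure shown on the YARN container pages. That would be consistent with a Spark release whose default overhead factor is 0.10, though the Spark version is not stated in the question. The RES value in top is lower than either number because it only counts the pages the JVM has actually touched so far.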