PySpark: is there any way to use Unix exported variables in Python code?
I am looking for a way to use Unix exported variables in PySpark code when running in cluster mode. I found a solution using os.getenv, but it does not work for me in cluster mode; in local mode it works fine.
Is there another way to pass the complete set of variables in one go? Passing n number of arguments and reading them individually is a bit of overhead.
For cluster mode you should use the Spark configuration properties named spark.yarn.appMasterEnv.[EnvironmentVariableName] to
pass variables to the driver. You can find more information in the documentation.
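For example, a minimal sketch of this approach (MY_ENV_VAR and your_script.py are hypothetical names; assuming YARN cluster mode):

# Forward the exported variable to the driver (YARN AM) and the executors at submit time:
# spark-submit --master yarn --deploy-mode cluster \
#     --conf spark.yarn.appMasterEnv.MY_ENV_VAR="$MY_ENV_VAR" \
#     --conf spark.executorEnv.MY_ENV_VAR="$MY_ENV_VAR" \
#     your_script.py

import os
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("env-var-example").getOrCreate()

# With spark.yarn.appMasterEnv set, the variable is visible to the driver process in cluster mode
my_value = os.environ.get("MY_ENV_VAR", "some-default")
print(my_value)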
That said, environment variables might not be the best way to deal with parameters. Have you heard of the optparse
library? It ships with standard Python distributions (2.x and 3.x) and allows you to parse parameters in quite an easy way, like:
from optparse import OptionParser

parser = OptionParser()
parser.add_option("-i", "--input-file", default="hdfs:///some/input/file.csv")
parser.add_option("-o", "--output-dir", default="hdfs:///tmp/output")
parser.add_option("-l", "--log-level", default="INFO")
(options, args) = parser.parse_args()
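For example (hypothetical script name and values), you can then pass the arguments after the application file on spark-submit and read the parsed values in the job:

# spark-submit your_script.py --input-file hdfs:///data/in.csv --log-level DEBUG

# inside the script, after parse_args():
input_path = options.input_file   # "hdfs:///data/in.csv"
log_level = options.log_level     # "DEBUG"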