PySpark: is there any way to use Unix exported variables in Python code?


I am looking for a way to use Unix exported variables in PySpark code when running in cluster mode. The solution I found, using os.getenv, is not working for me in cluster mode; in local mode it works fine.

Is there any other way to pass a complete set of variables in one go? Passing n arguments and reading them one by one is a bit of an overhead. A minimal example of what I mean is shown below.
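To be concrete, this is the kind of thing I am doing (MY_CONFIG is just a placeholder name):

import os
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("env-test").getOrCreate()

# Exported on the edge node with: export MY_CONFIG=/data/config.json
# This works in local mode, but in cluster mode the driver runs on a
# different node where the variable is not exported, so it returns None.
print(os.getenv("MY_CONFIG"))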

For cluster mode you should use the Spark configuration property spark.yarn.appMasterEnv.[EnvironmentVariableName] to pass variables to the driver. You can find more information in the documentation.
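As a rough sketch (the variable names MY_CONFIG and LOG_LEVEL and the file my_job.py are just placeholders), you set the variables on spark-submit:

spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.yarn.appMasterEnv.MY_CONFIG=/data/config.json \
  --conf spark.yarn.appMasterEnv.LOG_LEVEL=INFO \
  my_job.py

and then read them in the driver as ordinary environment variables:

import os

# In cluster mode the driver runs inside the YARN application master,
# so values set via spark.yarn.appMasterEnv.* are visible here.
my_config = os.environ.get("MY_CONFIG")
log_level = os.environ.get("LOG_LEVEL", "INFO")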

But are environment variables really the best way to deal with parameters? Have you heard of the optparse library? It ships with the Python distribution (both 2.x and 3.x) and lets you parse parameters quite easily, like this:

from optparse import OptionParser

parser = OptionParser()
parser.add_option("-i", "--input-file", default="hdfs:///some/input/file.csv")
parser.add_option("-o", "--output-dir", default="hdfs:///tmp/output")
parser.add_option("-l", "--log-level", default='info')

(options, args) = parser.parse_args()
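Just to illustrate usage: with spark-submit the options go after the application file, e.g. spark-submit my_job.py --input-file hdfs:///some/other/file.csv --log-level DEBUG (my_job.py is a placeholder name), and the parsed values are then available as options.input_file, options.output_dir and options.log_level.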
