scala - Integrate BigQuery with Spark


How can I connect Spark to Google's BigQuery?

I imagine one could use Spark's JDBC functionality to communicate with BigQuery.

But the only JDBC driver I have found, from Starschema, is old.

If the answer involves JDBC, what should the URL parameter look like?

From the Spark docs:

  rdd.toDF.write.format("jdbc").options(Map(
    "url" -> "jdbc:postgresql:dbserver",
    "dbtable" -> "schema.tablename"
  )).save()

You can use the BigQuery connector for Hadoop (which also works with Spark): https://cloud.google.com/hadoop/bigquery-connector
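As a rough sketch of what using that connector from Spark looks like (this assumes the connector jar is on the classpath and the job runs with Google Cloud credentials; the project id, staging bucket, and output logic are placeholders -- the public `publicdata:samples.shakespeare` table is used as the input):

```scala
// Sketch: read a BigQuery table into a Spark RDD via the Hadoop
// BigQuery connector's InputFormat. Records arrive as (LongWritable, JsonObject).
import com.google.cloud.hadoop.io.bigquery.{BigQueryConfiguration, GsonBigQueryInputFormat}
import com.google.gson.JsonObject
import org.apache.hadoop.io.LongWritable
import org.apache.spark.{SparkConf, SparkContext}

object BigQueryReadSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("bq-read-sketch"))
    val conf = sc.hadoopConfiguration

    // Required connector settings ("my-project" / "my-staging-bucket" are placeholders).
    conf.set(BigQueryConfiguration.PROJECT_ID_KEY, "my-project")
    conf.set(BigQueryConfiguration.GCS_BUCKET_KEY, "my-staging-bucket")

    // Point the input at a table, given as "project:dataset.table".
    BigQueryConfiguration.configureBigQueryInput(conf, "publicdata:samples.shakespeare")

    val rows = sc.newAPIHadoopRDD(
      conf,
      classOf[GsonBigQueryInputFormat],
      classOf[LongWritable],
      classOf[JsonObject])

    // Each value is a Gson JsonObject keyed by the table's column names.
    val wordCounts = rows
      .map { case (_, json) =>
        (json.get("word").getAsString, json.get("word_count").getAsLong)
      }
      .reduceByKey(_ + _)

    wordCounts.take(10).foreach(println)
    sc.stop()
  }
}
```

The connector works by staging an export of the table to GCS (hence the bucket setting) and reading the exported files as Hadoop input splits, so it needs a cluster with both the BigQuery and GCS connectors available.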

If you use Google Cloud Dataproc (https://cloud.google.com/dataproc/) to deploy your Spark cluster, the BigQuery connector (as well as the GCS connector) is automatically deployed and configured out of the box.

But you can also add the connector to an existing Spark deployment, whether it runs on Google Cloud or anywhere else. If the cluster is not deployed on Google Cloud, you will have to configure authentication yourself (using service-account "keyfile" authentication).
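For a cluster outside Google Cloud, the keyfile setup comes down to a handful of Hadoop configuration properties. A minimal sketch, assuming a JSON service-account keyfile (the project id and keyfile path are placeholders; property names follow the connectors' documented settings, so check them against the connector version you deploy):

  # BigQuery connector authentication
  mapred.bq.project.id=my-project
  mapred.bq.auth.service.account.enable=true
  mapred.bq.auth.service.account.json.keyfile=/path/to/keyfile.json

  # GCS connector authentication (used for staging table exports)
  google.cloud.auth.service.account.enable=true
  google.cloud.auth.service.account.json.keyfile=/path/to/keyfile.json

These can go in core-site.xml or be set programmatically on the job's Hadoop configuration; the keyfile must be readable at that path on every worker node.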

[Added] My answer to another question (Dataproc + BigQuery examples - any available?) provides an example of using BigQuery with Spark.

