scala - Integrate BigQuery with Spark


How can I connect Spark to Google's BigQuery?

I imagine one could use Spark's JDBC functionality to communicate with BigQuery.

But the only JDBC driver I found, from Starschema, looks old.

If the answer involves JDBC, what should the url parameter look like?

From the Spark docs:

  rdd.todf.write.format("jdbc").options(map(     "url" -> "jdbc:postgresql:dbserver",     "dbtable" -> "schema.tablename"   )) 

You can use the BigQuery connector for Hadoop (which also works with Spark): https://cloud.google.com/hadoop/bigquery-connector
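
As a minimal sketch of reading a table through the connector's Hadoop InputFormat (the project id and temporary GCS bucket below are placeholders, and the public shakespeare table is just a convenient example; exact class and key names can shift between connector versions, so check the version you deploy):

  import com.google.cloud.hadoop.io.bigquery.{BigQueryConfiguration, GsonBigQueryInputFormat}
  import com.google.gson.JsonObject
  import org.apache.hadoop.io.LongWritable
  import org.apache.spark.{SparkConf, SparkContext}

  val sc = new SparkContext(new SparkConf().setAppName("bigquery-read"))
  val conf = sc.hadoopConfiguration

  // Placeholder project/bucket values -- replace with your own.
  conf.set(BigQueryConfiguration.PROJECT_ID_KEY, "your-project-id")
  conf.set(BigQueryConfiguration.GCS_BUCKET_KEY, "your-temp-bucket")
  // Fully qualified input table id, in "project:dataset.table" form.
  BigQueryConfiguration.configureBigQueryInput(conf, "publicdata:samples.shakespeare")

  // The connector exposes each row as a (row id, JSON object) pair.
  val tableRdd = sc.newAPIHadoopRDD(
    conf,
    classOf[GsonBigQueryInputFormat],
    classOf[LongWritable],
    classOf[JsonObject])

  // Example transformation: sum word counts per word.
  val wordCounts = tableRdd
    .map { case (_, row) => (row.get("word").getAsString, row.get("word_count").getAsLong) }
    .reduceByKey(_ + _)

  wordCounts.take(10).foreach(println)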

If you use Google Cloud Dataproc (https://cloud.google.com/dataproc/) to deploy your Spark cluster, the BigQuery connector (as well as the GCS connector) is automatically deployed and configured out of the box.

But you can also add the connector to an existing Spark deployment, whether it runs on Google Cloud or anywhere else. If your cluster is not deployed on Google Cloud, you'll have to configure authentication yourself (using service-account "keyfile" authentication), roughly as sketched below.
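
A rough sketch of setting that keyfile authentication programmatically on the SparkContext's Hadoop configuration; the property names below are assumptions that vary with the GCS/BigQuery connector version, so verify them against the documentation for your version:

  // Assumed property names for service-account JSON keyfile auth;
  // check the connector version you actually deploy.
  val hadoopConf = sc.hadoopConfiguration
  hadoopConf.set("google.cloud.auth.service.account.enable", "true")
  hadoopConf.set("google.cloud.auth.service.account.json.keyfile", "/path/to/keyfile.json")
  hadoopConf.set("mapred.bq.auth.service.account.enable", "true")
  hadoopConf.set("mapred.bq.auth.service.account.json.keyfile", "/path/to/keyfile.json")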

[Added] My answer to another question (Dataproc + BigQuery examples - any available?) provides an example of using BigQuery with Spark.

