mysql - Spark DataFrame InsertIntoJDBC - TableAlreadyExists Exception
Using Spark 1.4.0, I am trying to insert data from a Spark DataFrame into a MemSQL database (which should be just like interacting with a MySQL database) using insertIntoJDBC(). However, I keep getting a runtime TableAlreadyExists exception.
First I create the MemSQL table like this:
create table if not exists table1 (id int auto_increment primary key, val int);
Then I create a simple DataFrame in Spark and try to insert it into MemSQL like this:
val df = sc.parallelize(Array(123,234)).toDF.toDF("val")
//df: org.apache.spark.sql.DataFrame = [val: int]
df.insertIntoJDBC("jdbc:mysql://172.17.01:3306/test?user=root", "table1", false)
java.lang.RuntimeException: Table table1 already exists.
This solution applies to general JDBC connections, although the answer by @wayne is probably a better solution for MemSQL specifically.
insertIntoJDBC seems to have been deprecated as of 1.4.0, and using it actually calls write.jdbc().
write returns a DataFrameWriter object. If you want to append data to your table, you have to change the save mode of that object to "append".
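As a minimal sketch of the save-mode step, assuming a spark-shell session on Spark 1.4 (which already provides `sc` and the `toDF` implicits), the mode can be set either with a string or with the `SaveMode` enum. The variable names here are only illustrative.

```scala
import org.apache.spark.sql.SaveMode

// Assumes spark-shell, where `sc` and the toDF implicits are already in scope.
val sample = sc.parallelize(Seq((1, 234), (2, 1233))).toDF("id", "val")

// df.write returns a DataFrameWriter; mode(...) sets the save mode on it.
val byString = sample.write.mode("append")
val byEnum   = sample.write.mode(SaveMode.Append)

// Other modes: SaveMode.Overwrite, SaveMode.ErrorIfExists (the default), SaveMode.Ignore.
// Nothing is written until jdbc(...) is called on the writer.
```

With the default mode (ErrorIfExists), writing to a table that already exists fails, which is consistent with the exception in the question.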
Another issue with the example in the question above is that the DataFrame schema didn't match the schema of the target table.
The code below gives a working example from the Spark shell. I am using spark-shell --driver-class-path mysql-connector-java-5.1.36-bin.jar to start my spark-shell session.
import java.util.Properties
val prop = new Properties()
prop.put("user", "root")
prop.put("password", "")
val df = sc.parallelize(Array((1,234), (2,1233))).toDF.toDF("id", "val")
val dfWriter = df.write.mode("append")
dfWriter.jdbc("jdbc:mysql://172.17.01:3306/test", "table1", prop)
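To check that the rows were actually appended, one option is to read the table back over the same JDBC connection. This is only a sketch: it assumes the same spark-shell session (which provides `sqlContext`), and it reuses the URL, table name, and credentials from the example above. `readProps` and `written` are illustrative names.

```scala
import java.util.Properties

// Same connection settings as the write example (assumed; adjust to your setup).
val readProps = new Properties()
readProps.put("user", "root")
readProps.put("password", "")

// sqlContext is provided by spark-shell; read.jdbc loads the table as a DataFrame.
val written = sqlContext.read.jdbc("jdbc:mysql://172.17.01:3306/test", "table1", readProps)
written.printSchema()  // should list the id and val columns of table1
written.show()         // prints the rows currently in table1
```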