Spark/Scala quote issue
I have input data encoded as ISO-8859-1 in a cedilla-delimited file, and the data itself contains double quotes. I am converting the file to UTF-8. When doing so, Spark inserts an escape character and extra quotes around the affected fields. How can I make sure that quotes and escape characters are not added to the output?
Sample input:
xyzÇvib bros crane , big "tonyÇ1961-02-23Ç00:00:00
Sample output:
xyzÇ"vib bros crane , big \"tony"Ç1961-02-23Ç00:00:00
Code:
val inputFormatDataFrame = sparkSession.sqlContext.read
  .format("com.databricks.spark.csv")
  .option("delimiter", delimiter)
  .option("charset", input_format)
  .option("header", "false")
  .option("treatEmptyValuesAsNulls", "true")
  .option("nullValue", " ")
  .option("quote", "")
  .option("quoteMode", "NONE")
  //.option("escape", "\"")
  .option("ignoreLeadingWhiteSpace", "true")
  .option("ignoreTrailingWhiteSpace", "true")
  .option("mode", "FAILFAST")
  .load(input_location)

inputFormatDataFrame.write
  .mode("overwrite")
  .option("delimiter", delimiter)
  .option("charset", "utf-8")
  .csv(output_location)
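A workaround that is often suggested for this situation is to set both the quote and the escape character to the null character (\u0000) on the writer as well as the reader, so Spark neither wraps fields in quotes nor emits escape characters. The sketch below assumes Spark 2.x with the built-in CSV source (the successor to com.databricks.spark.csv); the paths and the delimiter value are placeholders, and the exact behavior should be verified against your Spark version:

```scala
import org.apache.spark.sql.SparkSession

object CsvRecodeNoQuotes {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("csv-recode")
      .master("local[*]")
      .getOrCreate()

    val delimiter = "\u00c7" // cedilla delimiter, as in the sample data

    // Read the ISO-8859-1 file; a null quote character effectively
    // disables quote handling, so embedded double quotes pass through.
    val df = spark.read
      .option("sep", delimiter)
      .option("encoding", "ISO-8859-1")
      .option("header", "false")
      .option("quote", "\u0000")
      .option("mode", "FAILFAST")
      .csv("input_location") // placeholder path

    // Write UTF-8 output with quoting and escaping disabled the same way,
    // so no quotes or backslashes are added around fields.
    df.write
      .mode("overwrite")
      .option("sep", delimiter)
      .option("encoding", "UTF-8")
      .option("quote", "\u0000")
      .option("escape", "\u0000")
      .csv("output_location") // placeholder path

    spark.stop()
  }
}
```

Note that with quoting disabled, any field that itself contains the delimiter can no longer be round-tripped safely, so this only works if Ç never appears inside field values.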