Spark HiveContext: Spark Engine or Hive Engine?
I am trying to understand Spark's HiveContext. When I write a query using a HiveContext, for example

    val sqlContext = new HiveContext(sc)
    sqlContext.sql("SELECT * FROM tableA INNER JOIN tableB ON (a = b)")

is it using the Spark engine or the Hive engine? I believe the query above is executed by the Spark engine. If that is the case, why do we need DataFrames at all? We could blindly copy all of our Hive queries into sqlContext.sql("") and run them without ever using DataFrames.

By DataFrames I mean something like tableA.join(tableB, a === b). We can also perform aggregations using SQL commands. Could someone please clarify the concept? Is there any advantage to using a DataFrame join rather than a sqlContext.sql() join? The join is just an example. :)
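(For concreteness, here is a small sketch of the two styles I am comparing. It assumes tableA and tableB are existing Hive tables with columns a and b; the names are only placeholders.)

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    val sc = new SparkContext(new SparkConf().setAppName("HiveContextJoinExample"))
    val sqlContext = new HiveContext(sc)

    // Style 1: submit the join as a HiveQL string
    val sqlJoin = sqlContext.sql(
      "SELECT * FROM tableA INNER JOIN tableB ON (tableA.a = tableB.b)")

    // Style 2: express the same join with the DataFrame API
    val dfA = sqlContext.table("tableA")
    val dfB = sqlContext.table("tableB")
    val dfJoin = dfA.join(dfB, dfA("a") === dfB("b"), "inner")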
The Spark HiveContext uses the Spark execution engine underneath; see the Spark code. Parser support in Spark is pluggable, and HiveContext uses Spark's HiveQL parser.

Functionally, you can do everything with SQL, and DataFrames are not strictly needed. DataFrames simply provide a convenient way to achieve the same results without the user having to write an SQL statement.
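A quick way to see this for yourself (a sketch under the same assumptions as above: tableA and tableB are Hive tables with columns a and b) is to compare the plans Spark produces for both styles. explain(true) prints the logical and physical plans generated by Spark's Catalyst optimizer for either form of the query.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    val sc = new SparkContext(new SparkConf().setAppName("EngineCheck"))
    val sqlContext = new HiveContext(sc)

    val bySql = sqlContext.sql(
      "SELECT * FROM tableA INNER JOIN tableB ON (tableA.a = tableB.b)")

    val a = sqlContext.table("tableA")
    val b = sqlContext.table("tableB")
    val byApi = a.join(b, a("a") === b("b"))

    // Both calls print Spark's parsed, analyzed, optimized and physical plans;
    // the physical plan shows Spark join operators (e.g. SortMergeJoin or
    // BroadcastHashJoin), not a Hive MapReduce/Tez job.
    bySql.explain(true)
    byApi.explain(true)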