Spark HiveContext: Spark Engine or Hive Engine?


I am trying to understand Spark's HiveContext. When I write a query using HiveContext:

val sqlContext = new HiveContext(sc)
sqlContext.sql("SELECT * FROM tableA INNER JOIN tableB ON (a = b)")

is it using the Spark engine or the Hive engine? I believe the above query is executed by the Spark engine. If that is the case, why do we need DataFrames?

We could blindly copy our Hive queries into sqlContext.sql("...") and run them without using DataFrames at all.

By DataFrames I mean the API style, e.g. tableA.join(tableB, tableA("a") === tableB("b")), where we can also perform aggregations with SQL-like operations. Can someone please clarify the concept? Is there any advantage to using a DataFrame join rather than a sqlContext.sql() join? The join is just an example. :)

The Spark HiveContext uses the Spark execution engine underneath; you can see this in the Spark source code.

Parser support in Spark is pluggable, and HiveContext uses Spark's HiveQL parser.
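As a minimal sketch of that pluggability (assuming Spark 1.x, where the parser was selected via the `spark.sql.dialect` configuration key):

```
// Spark 1.x: a plain SQLContext defaults to the "sql" dialect,
// while HiveContext defaults to "hiveql".
val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)

// Explicitly selecting the HiveQL parser (already the default for HiveContext):
sqlContext.setConf("spark.sql.dialect", "hiveql")
```

Either way, parsing only produces a logical plan; execution still happens on Spark.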

Functionally, you can do everything with SQL, so DataFrames are not strictly needed. DataFrames simply provide a convenient way to achieve the same results, without the user having to write SQL statements.
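To illustrate the equivalence, here is a hedged sketch (assuming a Spark 1.x HiveContext, an existing SparkContext `sc`, and tables named `tableA` and `tableB` with columns `a` and `b` registered in the metastore; all of these names are placeholders from the question, not real tables):

```
import org.apache.spark.sql.hive.HiveContext

val sqlContext = new HiveContext(sc)

// SQL-string version of the join
val viaSql = sqlContext.sql(
  "SELECT * FROM tableA INNER JOIN tableB ON (tableA.a = tableB.b)")

// DataFrame API version of the same join
val tableA = sqlContext.table("tableA")
val tableB = sqlContext.table("tableB")
val viaApi = tableA.join(tableB, tableA("a") === tableB("b"), "inner")

// Both go through the same Catalyst optimizer and Spark execution engine,
// so explain() should print equivalent physical plans for the two queries.
viaSql.explain()
viaApi.explain()
```

The practical difference is ergonomics: the DataFrame API is composable and checked at compile time for method names, while the SQL string is only parsed at runtime.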

