hadoop - PySpark es.query only works when default


In PySpark, why is data only returned from Elasticsearch when es.query is left at its default?

es_query = {"match" : {"key" : "value"}} es_conf = {"es.nodes" : "localhost", "es.resource" : "index/type", "es.query" : json.dumps(es_query)} rdd = sc.newapihadooprdd(inputformatclass="org.elasticsearch.hadoop.mr.esinputformat",keyclass="org.apache.hadoop.io.nullwritable",valueclass="org.elasticsearch.hadoop.mr.linkedmapwritable", conf=es_conf) ... rdd.count() 0 rdd.first() valueerror: rdd empty 

Yet this query (the default) does seem to work:

es_query = {"match_all" : {}} ... rdd.first() (u'2017-01-01 23:59:59) 

* I have tested these queries by running them against Elasticsearch directly, and they work, so something is wrong with Spark/es-hadoop.
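The direct check was roughly along these lines (a sketch, assuming Elasticsearch on localhost:9200 and the same index/type as in es.resource; requests is an assumed dependency, any HTTP client works). Note that the _search endpoint itself requires the top-level "query" wrapper:

    import json
    import requests  # assumed HTTP client

    # Same match query, wrapped in "query" as the _search endpoint requires.
    body = {"query" : {"match" : {"key" : "value"}}}
    resp = requests.get(
        "http://localhost:9200/index/type/_search",  # index/type from es.resource
        data=json.dumps(body),
        headers={"Content-Type": "application/json"},
    )
    print(resp.json()["hits"]["total"])  # non-zero here, unlike the Spark RDD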

