If you want to emulate pushing down a join, you can wrap the IN-list
query in a JDBCRelation directly:

scala> val r_df = spark.read.format("jdbc").option("url",
  "jdbc:h2:/tmp/testdb").option("dbtable", "R").load()
r_df: org.apache.spark.sql.DataFrame = [A: int]

scala> r_df.show
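A minimal sketch of that idea, assuming the table R, column A, and the H2 URL from the session above (the values and the alias r_pushed are made up for illustration): build the IN-list query as a string and hand it to the JDBC source via the "dbtable" option, so the database, not Spark, evaluates the predicate.

```scala
// Hypothetical IN-list values; the thread talks about ~10K of them.
val ids = Seq(1, 2, 3)

// Wrap the IN filter in a derived-table subquery. Spark's JDBC source
// accepts "(SELECT ...) AS alias" as the "dbtable" option, so the whole
// query (including the IN predicate) runs on the database side.
val pushedQuery = s"(SELECT * FROM R WHERE A IN (${ids.mkString(", ")})) AS r_pushed"

println(pushedQuery)

// With a live SparkSession this string would be used as:
//   val r_df = spark.read.format("jdbc")
//     .option("url", "jdbc:h2:/tmp/testdb")
//     .option("dbtable", pushedQuery)
//     .load()
```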
2017-04-06 4:00 GMT+02:00 Michael Segel :
> Just out of curiosity, what would happen if you put your 10K values into a
> temp table and then did a join against it?
The answer is predicate pushdown.
In my case I'm using this kind of query on a JDBC table, and the IN
predicate is executed on the DB in less…
Just out of curiosity, what would happen if you put your 10K values into a
temp table and then did a join against it?
> On Apr 5, 2017, at 4:30 PM, Maciej Bryński wrote:
>
> Hi,
> I'm trying to run queries with many values in the IN operator.
>
> The result is that for more than 10K values the IN operator…