Thanks again all.
Anyway, as Nicola suggested, I took the trench-warfare approach to sort this
out: I used plain jars and worked out their dependencies in the ~/.ivy2/jars
directory with grep -lRi <missing> :)
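
For anyone wanting to repeat the search, here is a rough sketch of it as a shell function. The name find_jars_with and the class com.example.Foo are made up for illustration. It relies on jar (zip) files storing their entry names uncompressed, so grep can see class paths in the raw bytes.

```shell
# Hypothetical helper (not part of Spark or Ivy): print the jars under a
# directory that contain a given class or package name.
find_jars_with() {
    # A class such as com.example.Foo is stored in a jar as
    # com/example/Foo.class, so turn the dots into slashes before searching.
    needle=$(printf '%s' "$1" | tr . /)
    dir=${2:-"$HOME/.ivy2/jars"}
    # -l prints only the matching file names, -i ignores case, -F treats the
    # needle as a fixed string rather than a regular expression.
    find "$dir" -type f -name '*.jar' -exec grep -liF -- "$needle" {} +
}
```

Calling find_jars_with com.example.Foo prints the path of each jar in ~/.ivy2/jars that holds that class; a second argument searches a different directory.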
This now works using jars alone, once the dependencies are resolved (the
newly added ones are those listed after ddhybrid.jar):
${SPARK_HOME}/bin/spark-submit \
--master yarn \
--deploy-mode client \
--conf spark.executor.memoryOverhead=3000 \
--class org.apache.spark.repl.Main \
--name "my own Spark shell on Yarn" "$@" \
--driver-class-path /home/hduser/jars/ddhybrid.jar \
--jars /home/hduser/jars/spark-bigquery-latest.jar,\
/home/hduser/jars/ddhybrid.jar,\
/home/hduser/jars/com.google.http-client_google-http-client-1.24.1.jar,\
/home/hduser/jars/com.google.http-client_google-http-client-jackson2-1.24.1.jar,\
/home/hduser/jars/com.google.cloud.bigdataoss_util-1.9.4.jar,\
/home/hduser/jars/com.google.api-client_google-api-client-1.24.1.jar,\
/home/hduser/jars/com.google.oauth-client_google-oauth-client-1.24.1.jar,\
/home/hduser/jars/com.google.apis_google-api-services-bigquery-v2-rev398-1.24.1.jar,\
/home/hduser/jars/com.google.cloud.bigdataoss_bigquery-connector-0.13.4-hadoop2.jar,\
/home/hduser/jars/spark-bigquery_2.11-0.2.6.jar
Compare that with using the package itself, as before:
${SPARK_HOME}/bin/spark-submit \
--master yarn \
--deploy-mode client \
--conf spark.executor.memoryOverhead=3000 \
--class org.apache.spark.repl.Main \
--name "my own Spark shell on Yarn" "$@" \
--driver-class-path /home/hduser/jars/ddhybrid.jar \
--jars /home/hduser/jars/spark-bigquery-latest.jar,\
/home/hduser/jars/ddhybrid.jar \
--packages com.github.samelamin:spark-bigquery_2.11:0.2.6
As Sean suggested, I think this manual approach may or may not work, and if
any of the jars change, the whole dependency set has to be re-evaluated,
which adds to the complexity.
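
If someone does stay with plain jars, one way to cut down the hand-editing might be to build the --jars value from a directory. This is only a sketch: join_jars is a made-up name, and it assumes the directory holds exactly the jars you want on the classpath and nothing else.

```shell
# Hypothetical helper: print every *.jar in a directory as one
# comma-separated string with no spaces, the form --jars expects.
join_jars() {
    dir=$1
    list=""
    for jar in "$dir"/*.jar; do
        # When nothing matches, the glob is left unexpanded; skip it.
        [ -e "$jar" ] || continue
        # Prepend a comma only once the list already has an entry.
        list="${list:+$list,}$jar"
    done
    printf '%s\n' "$list"
}
```

It would then be used as --jars "$(join_jars /home/hduser/jars)" in the spark-submit call. This does not resolve anything: a jar missing from the directory is still missing at runtime.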
Cheers
On Tue, 20 Oct 2020 at 23:01, Sean Owen <[email protected]> wrote:
> Rather, let --packages (via Ivy) worry about them, because they tell Ivy
> what they need.
> There's no 100% guarantee that conflicting dependencies are resolved in a
> way that works in every single case, which you run into sometimes when
> using incompatible libraries, but yes this is the point of --packages and
> Ivy.
>
> On Tue, Oct 20, 2020 at 4:43 PM Mich Talebzadeh <[email protected]>
> wrote:
>
>> Thanks again all.
>>
>> Hi Sean,
>>
>> As I understood from your statement, you are suggesting just use
>> --packages without worrying about individual jar dependencies?
>>