Will Berkeley created KUDU-2485:
-----------------------------------

             Summary: Enhance Kudu-Spark docs
                 Key: KUDU-2485
                 URL: https://issues.apache.org/jira/browse/KUDU-2485
             Project: Kudu
          Issue Type: Improvement
          Components: spark
    Affects Versions: 1.7.1
            Reporter: Will Berkeley


Users often get confused about the right way to use the Kudu-Spark integration.
The most common dangerous mistake is creating multiple Kudu clients, sometimes
even one per task. It's pretty easy to overwhelm the master this way, e.g.,
with a 2-second batch window and a client per task in a Spark Streaming job.
We should take our current minimal Spark docs and provide better examples and
bigger, louder, redder warnings about making extra Kudu clients. Users should
be directed to use the KuduContext exclusively. When a client is needed, the
client instance inside the KuduContext should be used.
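
As a starting point for the improved docs, here is a minimal sketch of the
pattern described above: one KuduContext created on the driver, with any
direct client access going through the client it already holds. The master
address, table name, and object name are placeholders invented for
illustration, and the sketch assumes the kudu-spark and Spark SQL artifacts
are on the classpath and that a Kudu cluster is reachable.

```scala
import org.apache.kudu.spark.kudu.KuduContext
import org.apache.spark.sql.SparkSession

object SharedKuduContextSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("shared-kudu-context-sketch")
      .getOrCreate()

    // Placeholder values (assumptions); substitute a real master list and table.
    val kuduMasters = "kudu-master.example.com:7051"
    val tableName = "example_table"

    // Create exactly one KuduContext on the driver and reuse it for the whole job.
    val kuduContext = new KuduContext(kuduMasters, spark.sparkContext)

    // When a raw client is needed, borrow the one inside the KuduContext.
    // Avoid calling new KuduClient.KuduClientBuilder(...).build() in job code,
    // and especially inside per-task or per-batch closures.
    val client = kuduContext.syncClient

    if (client.tableExists(tableName)) {
      val table = client.openTable(tableName)
      println(s"$tableName has ${table.getSchema.getColumnCount} columns")
    }

    // The borrowed client is deliberately not closed here, on the assumption
    // that the KuduContext owns its lifecycle.
    spark.stop()
  }
}
```

In a streaming job, the same kuduContext value would be passed to each batch
(for example, kuduContext.upsertRows(df, tableName) on a DataFrame whose
schema matches the table) instead of constructing a client per batch or per
task.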



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
