Hi Pritam,

Since this is a lookup against an external system, with network I/O involved plus the time Cassandra needs to return the results, some backpressure at that operator can be normal. Also note that query performance in Cassandra depends heavily on the data model: how easily the data can be located across the nodes, and how much data has to be read to answer the query. For example, even a filtering query depends on the number of keys it has to find across the nodes, so it needs proper partition keys that let it gather the results quickly, read the data, perform the operation, and so on. It would also be worth measuring how long the query takes on the Cassandra side.
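To see where the time goes, one option is to time each async lookup from the Flink side and compare it with what Cassandra reports. Below is a minimal sketch in plain Java, not the Flink or Cassandra driver API: `LookupLatencyProbe`, `timed` and the fake 20 ms lookup are hypothetical names made up for illustration. In a real job the wrapper would presumably be called from your async function around the actual driver call, and the numbers exported as metrics instead of printed.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

/** Hypothetical helper: records how long each async lookup takes to complete. */
public class LookupLatencyProbe {
    private final AtomicLong count = new AtomicLong();
    private final AtomicLong totalNanos = new AtomicLong();
    private final AtomicLong maxNanos = new AtomicLong();

    /** Wraps an async lookup; timing stops when the future completes (success or failure). */
    public <K, V> CompletableFuture<V> timed(K key, Function<K, CompletableFuture<V>> lookup) {
        long start = System.nanoTime();
        return lookup.apply(key).whenComplete((value, error) -> {
            long elapsed = System.nanoTime() - start;
            count.incrementAndGet();
            totalNanos.addAndGet(elapsed);
            maxNanos.accumulateAndGet(elapsed, Math::max);
        });
    }

    public long count() { return count.get(); }

    public double meanMillis() {
        long n = count.get();
        return n == 0 ? 0.0 : totalNanos.get() / 1e6 / n;
    }

    public double maxMillis() { return maxNanos.get() / 1e6; }

    public static void main(String[] args) {
        LookupLatencyProbe probe = new LookupLatencyProbe();

        // Assumption: stand-in for the real Cassandra query, completing after about 20 ms.
        Function<String, CompletableFuture<String>> fakeLookup = key ->
                CompletableFuture.supplyAsync(
                        () -> "value-for-" + key,
                        CompletableFuture.delayedExecutor(20, TimeUnit.MILLISECONDS));

        // Fire a few lookups concurrently, then wait for all of them.
        CompletableFuture<?>[] pending = new CompletableFuture<?>[5];
        for (int i = 0; i < pending.length; i++) {
            pending[i] = probe.timed("key-" + i, fakeLookup);
        }
        CompletableFuture.allOf(pending).join();

        System.out.printf("lookups=%d mean=%.1f ms max=%.1f ms%n",
                probe.count(), probe.meanMillis(), probe.maxMillis());
    }
}
```

If the latency measured this way is much higher than the query time seen on the Cassandra side, the gap is likely spent in the network or in client-side queueing rather than in the query itself.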
Hope this helps,
Best

On Tue, Jul 18, 2023 at 4:40 AM Shammon FY <zjur...@gmail.com> wrote:

> Hi Pritam,
>
> I'm sorry that I'm not familiar with Cassandra. If your async function is
> always the root cause for backpressure, I think you can check the latency
> for the async request in your function and log some metrics.
>
> By the way, I think you can add a cache in your async function to speed up
> the lookup request, which we always do in lookup join for SQL jobs.
>
> Best,
> Shammon FY
>
> On Mon, Jul 17, 2023 at 10:09 PM Pritam Agarwala <
> pritamagarwala...@gmail.com> wrote:
>
>> Hi Team,
>>
>> Any input on this will be really helpful.
>>
>> Thanks!
>>
>> On Tue, Jul 11, 2023 at 12:04 PM Pritam Agarwala <
>> pritamagarwala...@gmail.com> wrote:
>>
>>> Hi Team,
>>>
>>> I am using "AsyncDataStream.unorderedWait" to connect to Cassandra.
>>> The Cassandra lookup operators are becoming the busy operators and
>>> creating back-pressure, resulting in low throughput.
>>>
>>> The Cassandra lookup is a very simple query. So I increased the capacity
>>> parameter to 80 from 15 and could see a lower busy % on the Cassandra
>>> operators. I am monitoring the Cassandra open connections and connected
>>> host metrics. Couldn't see any change in these metrics.
>>>
>>> How is the capacity parameter related to Cassandra open connections and
>>> hosts? If I increase capacity more, will it have any impact on these
>>> metrics?
>>>
>>> Thanks & Regards,
>>> Pritam
>>>
>>