Does anybody have any questions that we could answer about this proposal?
> On Apr 27, 2023, at 1:24 PM, Francisco Guerrero <frank.guerr...@gmail.com>
> wrote:
>
> Hi folks,
>
> We have updated the Confluence page with the source code for CEP-28.
> There are two repositories with contributions. One is the patch [1]
> for Cassandra Sidecar with the bulk APIs that enable the Cassandra
> Spark Analytics library. The second is a new repository [2] with
> contributions to the Cassandra Spark Analytics code.
>
> We also have a README markdown file that you can follow to give the
> code a try:
>
> https://github.com/frankgh/cassandra-analytics/blob/trunk/cassandra-analytics-core-example/README.md
>
> Best,
> - Francisco
>
> [1] Apache Cassandra Sidecar bulk APIs source code:
> https://github.com/frankgh/cassandra-sidecar/tree/CEP-28-bulk-apis
> [2] Apache Cassandra Spark Analytics source code:
> https://github.com/frankgh/cassandra-analytics
>
>
> On 2023/04/05 15:18:07 Doug Rohrer wrote:
> > Sorry for the delay in responding here - yes, we can add some diagrams
> > to the CEP - I’ll try to get that done by end-of-week.
> >
> > Thanks,
> >
> > Doug
> >
> > > On Mar 28, 2023, at 1:14 PM, J. D. Jordan <jeremiah.jor...@gmail.com>
> > > wrote:
> > >
> > > Maybe some data flow diagrams could be added to the CEP showing some
> > > example operations for read/write?
> > >
> > >> On Mar 28, 2023, at 11:35 AM, Yifan Cai <yc25c...@gmail.com> wrote:
> > >>
> > >> A lot of great discussions!
> > >>
> > >> On the sidecar front, especially the role sidecar plays in terms
> > >> of this CEP, I feel there might be some confusion. Once the code is
> > >> published, we should have clarity.
> > >> Sidecar does not read sstables nor do any coordination for
> > >> analytics queries. It is local to the companion Cassandra instance.
> > >> For bulk read, it takes snapshots and streams sstables to spark
> > >> workers to read. For bulk write, it imports the sstables uploaded
> > >> from spark workers. All commands are existing jmx/nodetool
> > >> functionalities from Cassandra. Sidecar adds the http interface to
> > >> them. It might be an oversimplified description. The complex
> > >> computation is performed in spark clusters only.
> > >>
> > >> In the long run, Cassandra might evolve into a database that does
> > >> both OLTP and OLAP. (Not what this thread aims for)
> > >> At the current stage, Spark is very well suited for analytic
> > >> purposes.
> > >>
> > >> On Tue, Mar 28, 2023 at 9:06 AM Benedict <bened...@apache.org>
> > >> wrote:
> > >>> I disagree with the first claim, as the process has all the
> > >>> information it chooses to utilise about which resources it’s using
> > >>> and what it’s using those resources for.
> > >>>
> > >>> The inability to isolate GC domains is something we cannot
> > >>> address, but also probably not a problem if we were doing
> > >>> everything with memory management as well as we could be.
> > >>>
> > >>> But, not worth derailing this thread for. Today we do very little
> > >>> well on this front within the process, and a separate process is
> > >>> well justified given the state of play.
> > >>>
> > >>>> On 28 Mar 2023, at 16:38, Derek Chen-Becker
> > >>>> <de...@chen-becker.org> wrote:
> > >>>>
> > >>>> On Tue, Mar 28, 2023 at 9:03 AM Joseph Lynch
> > >>>> <joe.e.ly...@gmail.com> wrote:
> > >>>> ...
> > >>>>
> > >>>>> I think we might be underselling how valuable JVM isolation is,
> > >>>>> especially for analytics queries that are going to pass the
> > >>>>> entire dataset through heap somewhat constantly.
> > >>>>
> > >>>> Big +1 here. The JVM simply does not have significant granularity
> > >>>> of control for resource utilization, but this is explicitly a
> > >>>> feature of separate processes. Add in being able to separate GC
> > >>>> domains and you can avoid a lot of noisy neighbor in-VM behavior
> > >>>> for the disparate workloads.
> > >>>>
> > >>>> Cheers,
> > >>>>
> > >>>> Derek
> > >>>>
> > >>>> --
> > >>>> +---------------------------------------------------------------+
> > >>>> | Derek Chen-Becker                                             |
> > >>>> | GPG Key available at https://keybase.io/dchenbecker and       |
> > >>>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> > >>>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7 7F42 AFC5 AFEE 96E4 6ACC   |
> > >>>> +---------------------------------------------------------------+
> > >>>>
> > >
> --
> Francisco Guerrero