Re: [EXTERNAL] Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-10-26 Thread Mick Semb Wever
> microsoft/DiskANN: Graph-structured Indices for Scalable, Fast, Fresh and Filtered Approximate Nearest Neighbor Search (github.com) <https://github.com/microsoft/DiskANN> Thanks, German From: Josh McKenzie

Re: [EXTERNAL] Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-10-24 Thread Benedict
-structured Indices for Scalable, Fast, Fresh and Filtered Approximate Nearest Neighbor Search (github.com) Thanks, German From: Josh McKenzie Sent: Friday, September 22, 2023 7:43 AM To: dev Subject: [EXTERNAL] Re: [DISCUSS] Add JVector as a dependency for CEP-30 I highly doubt

Re: [EXTERNAL] Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-22 Thread German Eichberger via dev
Friday, September 22, 2023 7:43 AM To: dev Subject: [EXTERNAL] Re: [DISCUSS] Add JVector as a dependency for CEP-30 I highly doubt liability works like that in all jurisdictions That's a fantastic point. When speculating there, I overlooked the fact that there are literally dozens of legal juris

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-22 Thread Mike Adamson
> For my understanding, isn’t it gonna be an issue to be copyrighted also to a single person? For the same reasons? This was partly why I asked. I did a random check of libraries that are definite dependencies (netty, guava) and both contain author copyrights. On Fri, 22 Sept 2023, 16:01 Ekaterin

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-22 Thread Ekaterina Dimitrova
For my understanding, isn’t it gonna be an issue to be copyrighted also to a single person? For the same reasons? On Fri, 22 Sep 2023 at 7:59, Mick Semb Wever wrote: >> Just for my understanding on this. Is the issue that the code has a copyright header on it or that it is copyright to a c

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-22 Thread Josh McKenzie
> I highly doubt liability works like that in all jurisdictions That's a fantastic point. When speculating there, I overlooked the fact that there are literally dozens of legal jurisdictions in which this project is used and the foundation operates. As a PMC let's take this to legal. On Fri, Se

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-22 Thread Jeff Jirsa
To do that, the Cassandra PMC can open a legal JIRA and ask for a (durable, concrete) opinion. On Fri, Sep 22, 2023 at 5:59 AM Benedict wrote: > 1. my understanding is that with the former the liability rests on the provider of the lib to ensure it's in compliance with their claims to

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-22 Thread Benedict
> my understanding is that with the former the liability rests on the provider of the lib to ensure it's in compliance with their claims to copyright I highly doubt liability works like that in all jurisdictions, even if it might in some. I can even think of some historic cases related to Linux where

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-22 Thread J. D. Jordan
This Gen AI generated code use thread should probably be its own mailing list DISCUSS thread? It applies to all source code we take in, and accept copyright assignment of, not to jars we depend on and not only to vector related code contributions. On Sep 22, 2023, at 7:29 AM, Josh McKenzie wrote:

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-22 Thread Josh McKenzie
So if we're going to chat about GenAI on this thread here, 2 things: 1. A dependency we pull in != a code contribution (I am not a lawyer but my understanding is that with the former the liability rests on the provider of the lib to ensure it's in compliance with their claims to copyright and it

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-22 Thread Benedict
My reading is quite different, in fact it is quite explicit that e.g. ChatGPT is forbidden from use, whereas AWS CodeWhisperer may be permitted depending on the attribution. I assume you are reading clause 2.1, but this requires that work “would not be [copyrightable] even if produced by a human” wh

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-22 Thread Mick Semb Wever
On Thu, 21 Sept 2023 at 10:41, Benedict wrote: > At some point we have to discuss this, and here’s as good a place as any. There’s a great news article published talking about how generative AI was used to assist in developing the new vector search feature, which is itself really cool. Unfo

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-22 Thread Mick Semb Wever
> Just for my understanding on this. Is the issue that the code has a copyright header on it or that it is copyright to a corporate entity? The potential issue here is about dependence upon one vendor (or commercial actor). If the project is not usable without a specific piece of work (library)

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-22 Thread Mike Adamson
Just for my understanding on this. Is the issue that the code has a copyright header on it or that it is copyright to a corporate entity? On Fri, 22 Sept 2023 at 10:11, Mick Semb Wever wrote: >> Especially for an optional feature with clear alternative implementations, this doesn't bother me a

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-22 Thread Mick Semb Wever
> Especially for an optional feature with clear alternative implementations, this doesn't bother me at all. It's well within ASF policy to include permissively licensed code copyrighted by other people or entities. We should be conscious of the problem if this was a crucial (and evolving)

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-21 Thread Josh McKenzie
Oops; thought I'd already +1'ed earlier in the thread. In case it wasn't clear: +1 on inclusion as-is. On Thu, Sep 21, 2023, at 4:00 PM, Josh McKenzie wrote: > My .02 re: the copyright: the library is licensed ASL v2.0. Who it's originally copyrighted by / to (Jonathan personally, DataStax as

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-21 Thread Josh McKenzie
My .02 re: the copyright: the library is licensed ASL v2.0. Who it's originally copyrighted by / to (Jonathan personally, DataStax as a corporate entity, Santa Claus, my dog :)) doesn't really have any impact on the legalities of our ability to make use of it or the durability or safety of the c

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-21 Thread Mick Semb Wever
> I am confused by your +1 here. You are +1 on including it, but only if the copyright were different? Given DataStax wrote the library I don’t see how that will change? No blocker on including the library. I'm hoping we can address concerns in parallel; I don't want to hold things up. (

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-21 Thread J. D. Jordan
Mick, I am confused by your +1 here. You are +1 on including it, but only if the copyright were different? Given DataStax wrote the library I don’t see how that will change? On Sep 21, 2023, at 3:05 AM, Mick Semb Wever wrote: On Wed, 20 Sept 2023 at 18:31, Mike Adamson wrote

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-21 Thread Benedict
At some point we have to discuss this, and here’s as good a place as any. There’s a great news article published talking about how generative AI was used to assist in developing the new vector search feature, which is itself really cool. Unfortunately it *sounds* like it runs afoul of the ASF legal

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-21 Thread Mick Semb Wever
On Wed, 20 Sept 2023 at 18:31, Mike Adamson wrote: > The original patch for CEP-30 brought several modified Lucene classes in-tree to implement the concurrent HNSW graph used by the vector index. These classes are now being replaced with the io.github.jbellis.jvector library, which contains

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-20 Thread J. D. Jordan
: dev Subject: [EXTERNAL] [DISCUSS] Add JVector as a dependency for CEP-30 The original patch for CEP-30 brought several modified Lucene classes in-tree to implement the concurrent HNSW graph used by

Re: [DISCUSS] Add JVector as a dependency for CEP-30

2023-09-20 Thread German Eichberger via dev
+1. I am biased because DiskANN is from Microsoft Research, but it's a good library/algorithm. From: Mike Adamson Sent: Wednesday, September 20, 2023 8:58 AM To: dev Subject: [EXTERNAL] [DISCUSS] Add JVector as a dependency for CEP-30

[DISCUSS] Add JVector as a dependency for CEP-30

2023-09-20 Thread Mike Adamson
The original patch for CEP-30 brought several modified Lucene classes in-tree to implement the concurrent HNSW graph used by the vector index. These classes are now being replaced with the io.github.jbellis.jvector library, which contains an improved DiskANN implementation for the on-disk graph fo