> DENSE seems to just be an array? So very similar to a frozen list, but with a 
> fixed size?

Reading the doc, I took DENSE to mean ARRAY, but I knew that couldn’t be the 
case. Reading the code, it is a fixed-size array, so the real syntax is “DENSE 
FLOAT32[42]”.

I’m not a fan of the type naming, and I feel a fixed-size array could be useful 
for other cases as well, so I think we can improve here. Personally I prefer 
float[42], text[42], etc.; vector<float, 42> may be closer to our existing 
syntax, but I’m not a fan of it.
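
To make the difference from a frozen list concrete, here is a minimal Python sketch of what a fixed-size type such as float[42] would enforce. FixedVector and make_vector are hypothetical names used only for illustration; this is not Cassandra code and not how the patch implements it. The one assumption is that the declared dimension is checked when a value is built.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class FixedVector:
    """Hypothetical fixed-size float array, e.g. what float[3] might mean."""

    dimension: int
    values: Tuple[float, ...]

    def __post_init__(self) -> None:
        # Unlike a frozen list, the element count is part of the type,
        # so a value with the wrong length is rejected up front.
        if len(self.values) != self.dimension:
            raise ValueError(
                f"expected {self.dimension} elements, got {len(self.values)}"
            )


def make_vector(dimension: int, values: Sequence[float]) -> FixedVector:
    """Coerce the elements to float and build an immutable fixed-size vector."""
    return FixedVector(dimension, tuple(float(v) for v in values))
```

With this sketch, make_vector(3, [1, 2, 3]) succeeds, while make_vector(3, [1, 2]) raises ValueError; a frozen list would accept both.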

> I guess this is an excellent example to explore the minima of what 
> constitutes a CEP

For the ANN change itself, a CEP makes sense. Are we going to depend on 
Lucene’s HNSW or build our own? How do we validate this for correctness? What 
does correctness mean in a distributed context? Is this going to be pluggable 
(there has been a big push recently to offer pluggability)?


> On Apr 26, 2023, at 7:37 AM, Patrick McFadin <pmcfa...@gmail.com> wrote:
> 
> I guess this is an excellent example to explore the minima of what 
> constitutes a CEP. So far, CEPs have been some large changes, so where does 
> something like this fit? (Wait. Did I beat Benedict to a Bike Shed? I think I 
> did.)
> 
> This is a list of everything needed for a CEP:
> 
> Status
> Scope
> Goals
> Approach
> Timeline
> Mailing list / Slack channels
> Related JIRA tickets
> Motivation
> Audience
> Proposed Changes
> New or Changed Public Interfaces
> Compatibility, Deprecation, and Migration Plan
> Test Plan
> Rejected Alternatives
> 
> This is a big enough change to provide information for each element. Going 
> back to the spirit of why we started CEPs, we wanted to avoid a mega-commit 
> without some shaping and agreement before code goes into trunk. I don't have 
> a clear indication of where that line lies. From our own wiki: "It is highly 
> recommended to pursue a CEP for significant user-facing or changes that cut 
> across multiple subsystems." That seems to fit here. Part of my motivation is 
> being clear with potential new contributors by example and encouraging more 
> awesomeness.  
> 
> The changes for operators:
>  - New drivers
>  - New guardrails?
>  - Indexing == storage requirements
> 
> Patrick
> 
> On Tue, Apr 25, 2023 at 10:53 PM Mick Semb Wever <m...@apache.org> wrote:
> I was soooooo happy when I saw this, I know many users are going to be 
> thrilled about it.
> 
> 
> On Wed, 26 Apr 2023 at 05:15, Patrick McFadin <pmcfa...@gmail.com> wrote:
> Not sure if this is what you are saying, Josh, but I believe this needs to be 
> its own CEP. It's a change in CQL syntax and changes how clusters operate. 
> The change needs to be documented and voted on. Jonathan, you know how to 
> find me if you want me to help write it. :) 
> 
> I'd be fine with just a DISCUSS thread to agree to the CQL change, since 
> `DENSE FLOAT32` appears to be a minimal change and the overall patch builds 
> on SAI. As Henrik mentioned, there are other SAI extensions being added 
> without CEPs too. Can you elaborate on how you see this changing how the 
> cluster operates?
> 
> This will be easier to decide once we have a patch to look at, but that 
> depends on a CEP-7 base (e.g. no feature branch exists). If we do want a CEP 
> we need to allow a few weeks to get it through, but that can happen in 
> parallel and maybe drafting up something now will be valuable anyway for an 
> eventual CEP that proposes the more complete features (e.g. 
> cosine_similarity(…)). 
> 
