On Wed, Feb 26, 2025 at 12:57 PM Jarek Potiuk <ja...@potiuk.com> wrote:

> > In the interest of ASF trademarks, I would suggest it be called
> > "apache/tinkerpop" with "Gremlin" naming reserved for operators and the
> > like, as it is now with GremlinOperator. I think this makes sense because
> > it is connecting to TinkerPop-enabled systems via Gremlin. I would
> > similarly suggest that references to "Apache Gremlin" and the like become
> > "Apache TinkerPop".
>
> That's an interesting one - indeed TinkerPop is the PMC/ Framework -
> Gremlin is the language.
>
> I am not sure we are actually using TinkerPop here - because TinkerPop is
> the whole framework - Ahmad, can you explain the relation there - do those
> other systems simply implement Gremlin as a language or do they use TinkerPop
> for something / as a backend?
>

I'm sure Ahmad could answer, but I'll quickly offer my take. I think that in
this case we should prefer "TinkerPop" over "Gremlin" as a top-level name,
particularly because it's prefixed with "Apache" and there is no "Apache
Gremlin" project. Putting those two words that close together tends to be
confusing. Over the years I've lost count of how many times I've asked for
corrections in blog posts. :)

> Because that's a bit of a conceptual difference here. For example in the
> provider we are importing https://pypi.org/project/gremlinpython not
> "tinkerpop" - and it also does not have tinkerpop as dependency.
>

A bit of history goes along with a lot of our naming for what we term
Gremlin Language Variants (GLVs), like gremlinpython, which are variants of
Gremlin natively implemented to allow users to express Gremlin in the
idioms of their own language. They also provide driver connectivity to
compatible servers. TinkerPop inherited most of its language variants,
including gremlinpython, which was the first, from third-party community
developers. As a project, we didn't really have a hand in the naming, so with
those projects already in heavy use we just kind of stuck with it and even
doubled down (like when we built gremlin-go within the ASF).

I think in this case, your project organization under "apache" seems to
lend itself nicely to apache/tinkerpop. I think users will recognize it as
readily as they recognize Gremlin.


>
> I wonder if Gremlin is also a Trademark by Apache ? Maybe we should ask
> tinkerpop PMC what they think about it?
>

Gremlin is not an ASF trademark. That was debated for quite a long time
with trademarks@, along with whether Gremlin the character and his
friends[1] should be protected. In the end, for reasons I don't quite
remember, the ASF didn't think it was necessary.

Anyway, I'm not sure if you noted my earlier post[2], but I'm one of the
original contributors to Apache TinkerPop, from before we brought it to the
ASF, so I'm pretty familiar with our project. :)

[1]
https://github.com/apache/tinkerpop/blob/master/docs/static/images/tinkerpop3-splash.png
[2] https://lists.apache.org/thread/9hf4t8hyk944fyo4q3nygczyo5xhk18y


>
>
> J.
>
>
> On Wed, Feb 26, 2025 at 4:55 PM Stephen Mallette <spmalle...@apache.org>
> wrote:
>
> >
> >
> > On 2025/02/26 12:38:02 Jarek Potiuk wrote:
> > > Yeah. `apache/gremlin` seems like a better option then. Does anyone have
> > > anything against it?
> >
> > In the interest of ASF trademarks, I would suggest it be called
> > "apache/tinkerpop" with "Gremlin" naming reserved for operators and the
> > like, as it is now with GremlinOperator. I think this makes sense because
> > it is connecting to TinkerPop-enabled systems via Gremlin. I would
> > similarly suggest that references to "Apache Gremlin" and the like become
> > "Apache TinkerPop".
> >
> > > I think we are pretty happy with accepting "other
> > > apache" projects as providers, so I see no issue with Gremlin - knowing
> > > that we can always reach out to our friendly Apache Community in case
> of
> > > any issues. So - unless we hear any "opposition" in a few days, I
> > > think it would make sense if you start a `[LAZY CONSENSUS]` thread -
> > > without a need for a `[VOTE]` thread.
> > >
> > > One thing though that I would love to have - is to also have an integration
> > > test if possible (we had it with apache.kafka for example) - those are
> > > tests that could run **some** graphdb database locally (via docker-compose)
> > > and run very rudimentary checks against a "real" database, not a mocked
> > > call. That would make it more robust.
> > >
> > > More about integration tests, how to build, run, test them and
> integrate
> > > them in our CI can be found here:
> > >
> >
> https://github.com/apache/airflow/blob/main/contributing-docs/testing/integration_tests.rst
> > > - happy to help if you are stuck with it.
> > >
> > > J.
> > >
> > >
> > > On Wed, Feb 26, 2025 at 1:25 PM Ahmad Farhan <
> ahmad.farhan9...@gmail.com
> > >
> > > wrote:
> > >
> > > > I pushed changes to move the provider into the “apache” directory.
> > After
> > > > updating the class references across the project, I re-tested and all
> > tests
> > > > passed.
> > > >
> > > > Regarding the use of Gremlin (or another graph query language like
> > Cypher
> > > > and SPARQL) for a common package approach, here are my thoughts on
> the
> > pros
> > > > and cons:
> > > >
> > > > pros (I can see only one):
> > > >
> > > >    - Gremlin has been widely adopted by different cloud vendors (e.g.
> > Azure
> > > >    Cosmos DB with Apache Gremlin and AWS Neptune) as well as in
> > self-hosted
> > > >    environments.
> > > >
> > > > cons:
> > > >
> > > >    - Gremlin, Cypher (native for Neo4j) and SPARQL each have their
> own
> > > >    drivers for executing queries.
> > > >    - To achieve a common abstraction, a wrapper around each driver
> > would be
> > > >    required. Each driver has its own connection parameters,
> underlying
> > > >    protocols, and may need method overrides for compatibility with
> > > > different
> > > >    Python versions.
> > > >    - Not all vendors support every query language; for instance,
> > Gremlin
> > > >    for Neo4j has been deprecated in recent releases, while Cosmos DB
> > does
> > > > not
> > > >    support Cypher or SPARQL.
> > > >
> > > > While it would be ideal to have a unified graph query language and
> > driver
> > > > that works seamlessly across different vendors, such a solution does
> > not
> > > > exist at the moment. In my opinion, implementing provider-specific
> > > > solutions for each query language (Gremlin, Cypher, SPARQL) is more
> > > > realistic and practical given the current landscape.
> > > >
> > > > Happy to discuss further or answer any questions!
> > > >
> > > > Farhan
> > > >
> > > > On Mon, Feb 24, 2025 at 11:33 AM Ahmad Farhan <
> > ahmad.farhan9...@gmail.com>
> > > > wrote:
> > > >
> > > > > I have worked with two different graph database vendors—Azure
> Cosmos
> > DB
> > > > > and Neo4j. During our migration to Neo4j, we discovered that using
> > the
> > > > > Gremlin language wasn’t possible; we were forced to rewrite all our
> > > > queries
> > > > > into Cypher, which is the native language for Neo4j and, in my
> > > > experience,
> > > > > much simpler for querying.
> > > > >
> > > > > This situation highlights a key challenge for a common abstraction:
> > the
> > > > > underlying query languages and connection/authentication mechanisms
> > vary
> > > > > significantly. Gremlin is not only different from Cypher in syntax
> > but is
> > > > > also deprecated for Neo4j (see
> > > > > https://tinkerpop.apache.org/docs/3.7.3/reference/#neo4j-gremlin).
> > > > >
> > > > > The question would be how can the common approach accommodate these
> > > > > different query languages?
> > > > >
> > > > > On Fri, Feb 21, 2025 at 7:36 PM Jarek Potiuk <ja...@potiuk.com>
> > wrote:
> > > > >
> > > > >> Without deep looking at the code I love the idea - it's very
> > similar to
> > > > >> what we have for common.sql and common.io - and soon
> > common.messaging
> > > > - I
> > > > >> also - long time ago - suggested common.dataframe that someone
> could
> > > > >> submit
> > > > >> using Apache Ibis:
> > > > >> https://lists.apache.org/thread/qx3yh6h0l6jb0kh3fz9q95b3x5b4001l
> -
> > > > >> similarly I believe there was an idea about common.llm ...
> > > > >>
> > > > >> I think the "common" pattern is a great one for Airflow, to build
> > on top
> > > > >> of
> > > > >> "other giants" who build those common abstractions that you can
> > easily
> > > > >> switch between different implementations of various data access
> > layers.
> > > > >>
> > > > >> My suggestion and question - would be however (not very strong on
> > it, I
> > > > >> would love to hear what others think, I know it's been somewhat
> > > > >> contentious
> > > > >> when I started the ibis discussion) - would be to make it
> > > > "common.graph",
> > > > >> "common.dataframe" - instead of "apache.gremlin" or "apache.ibis"
> -
> > just
> > > > >> to
> > > > >> stress that those are not implementations of particular service
> but
> > > > >> opinionated choice of particular technology to do "common"
> > operations.
> > > > >> This
> > > > >> is what essentially "common.io" is . - it should be named
> "fsspec"
> > > > >> provider
> > > > >> if we were to name it by the "library" that implemented it.
> > > > >>
> > > > >> J.
> > > > >>
> > > > >>
> > > > >> On Fri, Feb 21, 2025 at 8:22 PM Ahmad Farhan <
> > > > ahmad.farhan9...@gmail.com>
> > > > >> wrote:
> > > > >>
> > > > >> > Hi Everyone,
> > > > >> >
> > > > >> > I’ve created a draft PR (
> > https://github.com/apache/airflow/pull/46977
> > > > )
> > > > >> to
> > > > >> > introduce and discuss a new provider for using Gremlin—the graph
> > > > >> traversal
> > > > >> > language of Apache TinkerPop (more details here:
> > > > >> > https://tinkerpop.apache.org/gremlin.html). Gremlin is
> supported
> > by
> > > > >> > various
> > > > >> > graph database vendors such as Azure Cosmos DB and Amazon
> Neptune.
> > > > >> > Previously, I had to develop a custom hook to query data from
> > Azure
> > > > >> Cosmos
> > > > >> > DB using Apache Gremlin.
> > > > >> >
> > > > >> > I managed to create a provider and run it locally on the main
> > branch.
> > > > >> > However, I ran into the BaseHook issue (
> > > > >> > https://github.com/apache/airflow/issues/45233) on that branch,
> > so I
> > > > >> ended
> > > > >> > up testing it fully on the v2-10-test branch. The PR should be
> > > > complete,
> > > > >> > but I’ve kept it as a draft for now while we discuss the
> provider.
> > > > >> >
> > > > > >> > I’m a new contributor, so I’m especially eager to hear your feedback.
> > > > > >> > Comments on the PR are very welcome, and please feel free to reach out
> > > > > >> > with any questions via email or Slack.
> > > > >> >
> > > > >> > Thanks,
> > > > >> > Ahmad Farhan
> > > > >> >
> > > > >>
> > > > >
> > > >
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > For additional commands, e-mail: dev-h...@airflow.apache.org
> >
> >
>
