Re: Hiring: Distributed Systems Engineers (USA) - Azure Storage | C++ | Scale

2025-07-03 Thread Julian Hyde
Vishwajeet, Does this position involve working on Arrow? Your posting doesn’t say that it does. Unless a posting is for a job that explicitly requires Arrow skills and where working on Arrow is a significant fraction of the job, posting is an abuse of this list. Julian > On Jul 1, 2025, a

Re: Query on stuck Arrow Flight Client while interacting from local workstation (mac)

2025-05-07 Thread Julian Hyde
This thread is getting rather long. I think you should open a case. That’s a permanent place that you can attach screenshots, stack traces. And it will also be found by anyone who has the same problem in future. Julian > On May 7, 2025, at 2:51 AM, David Li wrote: > > Thanks for reporting b

Re: [DISCUSS] Donation of a User-Defined Function Framework for Apache Arrow

2024-06-28 Thread Julian Hyde
In some ways, the problem of a UDF framework is larger than Arrow. UDFs need to give the same results, and execute efficiently, regardless of the platform (e.g. Arrow), hosting language, and UDF language. At SIGMOD there was a paper from TU Berlin that addresses this problem: "Query Compilation

Re: [VOTE] Move Arrow DataFusion Subproject to new Top Level Apache Project

2024-03-03 Thread Julian Hyde
+1 (binding) > On Mar 2, 2024, at 2:28 PM, Dewey Dunnington > wrote: > > +1 (binding) > >> On Sat, Mar 2, 2024 at 8:08 AM vin jake wrote: >> >> +1 (binding) >> >>> On Fri, Mar 1, 2024 at 7:33 PM Andrew Lamb wrote: >>> >>> Hello, >>> >>> As we have discussed[1][2] I would like to vote o

Re: [DISCUSS] Move sqlparser-rs back into DataFusion project?

2024-02-26 Thread Julian Hyde
I am torn on this. On one hand, I am a big fan of components that are standalone - have no more dependencies than necessary, and are self-evidently standalone. So, I think that re-absorbing sqlparser-rs back into DataFusion would not be a good step. It would reduce the perception that it is stand

Re: [DISCUSS] Status and future of @ApacheArrow Twitter account

2024-01-29 Thread Julian Hyde
The easiest thing is to share the Twitter credentials with any PMC member who is interested in sending tweets (which is usually a very small number). To answer Antoine’s point. I have found Twitter an extremely effective way for an open-source project to communicate with the “exo-community” — pe

Re: [DISCUSS] Linear Formula Types

2024-01-07 Thread Julian Hyde
If the DB layer above Arrow supports it, I would define a (non-stored) calculated column. Given celsius_percent between 0 and 1, I would define fahrenheit as (32 + celsius_percent * 180). A good query optimizer would convert the condition 'where fahrenheit > 122' into 'where celsius_percent > 0.5'.
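A minimal sketch of that predicate rewrite, for illustration only. It takes celsius_percent to mean Celsius divided by 100, so one unit spans 180 Fahrenheit degrees; that reading, and the names LinearColumn and push_down_greater_than, are assumptions and not part of Arrow or any optimizer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearColumn:
    """Calculated column defined as offset + slope * base (hypothetical helper)."""
    base: str
    offset: float
    slope: float


def push_down_greater_than(col: LinearColumn, threshold: float) -> tuple[str, str, float]:
    """Rewrite `calculated > threshold` as a predicate on the stored column."""
    if col.slope == 0:
        raise ValueError("a constant column cannot be inverted")
    bound = (threshold - col.offset) / col.slope
    # Dividing by a negative slope flips the direction of the comparison.
    op = ">" if col.slope > 0 else "<"
    return (col.base, op, bound)


# Assumption: celsius_percent is Celsius / 100, hence a slope of 180.
fahrenheit = LinearColumn(base="celsius_percent", offset=32.0, slope=180.0)
rewritten = push_down_greater_than(fahrenheit, 122.0)
print(rewritten)
```

The point of the rewrite is that the filter can then be evaluated against the stored column directly, without materializing the calculated one.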

Re: Call for Presentations, Community Over Code 2023

2023-05-10 Thread Julian Hyde
Follow the unsubscribe instructions: https://arrow.apache.org/community/#mailing-lists On Wed, May 10, 2023 at 10:27 AM Gil Cohen wrote: > > How do I unsubscribe? > > On Wed, 10 May 2023 at 0:36 Rich Bowen wrote: > > > (Note: You are receiving this because you are subscribed to the dev@ > > list

Re: Apache Arrow PGP Key

2023-03-20 Thread Julian Hyde
I think we’re confusing two concepts: signing each others’ keys and adding them to the KEYS file. It is reasonable that we, as a community, extend the web of trust by mutual signing. Let’s suppose Wes and I have signed each other’s keys. Someone from the Pandas community, who knows Wes, downlo

Re: Pre-release feedback for 'nanoarrow'

2023-01-02 Thread Julian Hyde
Can you make sure that it adheres to ASF branding guidelines? As an ASF project its name should be “Apache Nanoarrow” and it should define itself in terms of its relationship with “Apache Arrow”. Julian > On Jan 2, 2023, at 8:28 AM, Dewey Dunnington > wrote: > > Hi all, > > Following a di

Re: Parser for ExecPlans

2022-11-03 Thread Julian Hyde
When people design a language to represent a data structure, they often do a poor job with literals (i.e. the constant values for each data type). And that causes problems with operator overloading. I recommend that you give each data type its own literal format, so you can distinguish, say, a s

Re: [RUST][Go][proposal] Arrow Intermediate Representation to facilitate the transformation of row-oriented data sources into Arrow columnar representation

2022-07-28 Thread Julian Hyde
If the 'row-oriented format' is an API rather than a physical data representation then it can be implemented via coroutines and could therefore have less scattered patterns of read/write access. By 'coroutines' I'm being rather imprecise, but I hope you get the general idea. An asynchronous API (w
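One way to picture a row-oriented API backed by a coroutine, as a rough sketch: the caller hands over one row at a time, while the coroutine owns the columnar buffers. The name columnar_sink and the dict-of-lists layout are illustrative assumptions, not an Arrow interface.

```python
def columnar_sink(names):
    """Coroutine: accepts one row per send() and appends it to per-column lists."""
    columns = {name: [] for name in names}
    while True:
        # Hand the current columns back to the caller and wait for the next row.
        row = yield columns
        for name in names:
            columns[name].append(row[name])


sink = columnar_sink(["id", "city"])
next(sink)  # advance to the first yield so the coroutine can receive rows
sink.send({"id": 1, "city": "Oslo"})
columns = sink.send({"id": 2, "city": "Lima"})
print(columns)
```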

Re: [Rust] [DataFusion] Discuss moving Python bindings back to Apache Arrow

2022-07-15 Thread Julian Hyde
Have significant changes been made since January? If not, IP clearance may not be required. The code as of January is still kosher Arrow IP, even if it’s been deleted from git. Julian > On Jul 15, 2022, at 7:02 PM, Andy Grove wrote: > > datafusion-python was donated to the Apache Arrow proj

Re: Apache Arrow and "Native"-themed mascotry

2022-07-09 Thread Julian Hyde
Walter, I am very sympathetic to your concerns about the Apache brand but your email contains an untrue statement that needs to be corrected. You say that "Arrow’s project name is an unfortunate example of ASF’s stereotyping", but stereotyping implies intent: that the people who named the Arrow p

Re: Adding Apache Arrow to the registry of Digital Public Goods

2022-03-25 Thread Julian Hyde
Good idea. I posted on ComDev. It would be interesting to know what Fineract's experience was, and whether other projects have considered this. Julian [1] https://lists.apache.org/thread/kgg6ml3n5ddr4ndbhnsfxc4ynn41djss On Fri, Mar 25, 2022 at 10:08 AM Wes McKinney wrote: > > As some research g

Re: [FlightSQL] Higher-level facade API to increase adoption/audience? Or does this belong as a personal project

2022-03-15 Thread Julian Hyde
e are in the >> process >>>> of being contributed already. Which as you noted there is power in >>>> standards, so I expect this avenue to see heavy use. >>>> 2. For clients that can handle it and want to go through the trouble, >>>> consumi

Re: [FlightSQL] Higher-level facade API to increase adoption/audience? Or does this belong as a personal project

2022-03-14 Thread Julian Hyde
When I read “language-agnostic standard for data access” I cringed a little. (See [1].) Sure, it’s fun to create a new standard. But if your standard is successful, there will need to be a huge amount of work changing existing code to use your standard. That effort might even be the difference betw

Re: Managing usage of the @ApacheArrow Twitter handle and other social media

2022-02-01 Thread Julian Hyde
In my opinion, any PMC member should be allowed to use the Twitter account without any other checks, balances, or friction. They know that they are speaking for the project, and only for the project. They are PMC members so we trust them to do the right thing. If committers and other non-PMC co

Re: [DISCUSS] Deprecate user@ in favor for github issues/discussions

2021-09-29 Thread Julian Hyde
I'm not for or against this proposal. I took a few minutes to browse the archives [1]. It seems to me that the user@ list is working extremely well. People get answers quickly, problems are converted into JIRA cases, and the discussion often references existing information sources. I want to than

Re: Temporal Arithmetic

2021-09-23 Thread Julian Hyde
I wouldn’t discuss the algorithm on this list. I’d just commit to being compatible with Postgres, and write a bunch of tests based on Postgres’ observed behavior. > On Sep 23, 2021, at 5:12 AM, Phillip Cloud wrote: > > Hi all, > > I wanted to draw some attention to ARROW-11090 [1] in an eff

Re: [DISCUSS] Developing an "Arrow Compute IR [Intermediate Representation]" to decouple language front ends from Arrow-native compute engines

2021-08-12 Thread Julian Hyde
lease let me know. We might need to figure out how to >>>>> accommodate "many cooks" by setting up the ComputeIR project somewhere >>>>> separate from the format/ directory to permit it to exist in a >>>>> Work-In-Progress status for a period of tim

Re: [DISCUSS] Developing an "Arrow Compute IR [Intermediate Representation]" to decouple language front ends from Arrow-native compute engines

2021-08-05 Thread Julian Hyde
Wes, Thanks for this. I’ve added comments to the doc and to the PR. The biggest surprise is that this language does full relational operations. I was expecting that it would do fragments of the operations. Consider join. A distributed hybrid hash join needs to partition rows into output buffers
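To make the partitioning step concrete, here is a small sketch of routing rows to output buffers by a hash of the join key. It covers only that fragment of a hash join; partition_rows and the dict-based rows are hypothetical and not taken from the Compute IR proposal.

```python
import zlib
from collections import defaultdict


def partition_rows(rows, key, num_partitions):
    """Route each row to an output buffer chosen by a hash of its join key."""
    buffers = defaultdict(list)
    for row in rows:
        # crc32 is stable across processes, unlike Python's salted str hash.
        bucket = zlib.crc32(str(row[key]).encode("utf-8")) % num_partitions
        buffers[bucket].append(row)
    return buffers


orders = [
    {"cust": "a", "amt": 10},
    {"cust": "b", "amt": 5},
    {"cust": "a", "amt": 7},
]
buffers = partition_rows(orders, key="cust", num_partitions=4)
```

Rows that share a key always land in the same buffer, which is what lets each partition be joined independently.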

Re: [STRAW POLL] (How) should Arrow define storage for "Instant"s

2021-06-28 Thread Julian Hyde
D (2nd choice E if we’re doing ranked-choice voting) Julian > On Jun 24, 2021, at 12:24 PM, Weston Pace wrote: > > The discussion in [1] led to the following question. Before we > proceed on a vote it was decided we should do a straw poll to settle > on an approach (which can then be voted o

Re: [Discuss] If and how we should integrate geospatial data (specs) in Arrow

2021-06-25 Thread Julian Hyde
Cc += geospatial@. I think allowing WKB and WKT is sufficient. Perhaps Geometry could be a composite type (WKT, SRID) or (WKB, SRID). SRID (spatial reference identifier) is almost always needed to qualify a geometry value. It is analogous to how TimeZone is needed (implicitly or explicitly) to
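A sketch of the composite (WKT, SRID) idea, using a plain Python dataclass instead of an Arrow type. The class and function names are illustrative assumptions; SRID 4326 denotes WGS 84.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Geometry:
    """A geometry value qualified by its spatial reference identifier."""
    wkt: str   # Well-Known Text, e.g. "POINT (30 10)"
    srid: int  # e.g. 4326 for WGS 84


def require_same_srid(a: Geometry, b: Geometry) -> int:
    """Return the shared SRID, or raise if the two values are not comparable."""
    if a.srid != b.srid:
        raise ValueError(f"SRID mismatch: {a.srid} vs {b.srid}")
    return a.srid


oslo = Geometry(wkt="POINT (10.75 59.91)", srid=4326)
lima = Geometry(wkt="POINT (-77.04 -12.05)", srid=4326)
shared = require_same_srid(oslo, lima)
```

Carrying the SRID alongside the geometry lets an operation refuse to combine values from different reference systems, much as a time zone qualifies a timestamp.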

Re: [VOTE] Clarify meaning of timestamp without time zone to equal the concept of "LocalDateTime"

2021-06-25 Thread Julian Hyde
+1 > On Jun 25, 2021, at 10:36 AM, Antoine Pitrou wrote: > > > Le 24/06/2021 à 21:16, Weston Pace a écrit : >> The discussion in [1] led to the following proposal which I would like >> to submit for a vote. >> --- >> Arrow allows a timestamp column to omit the time zone property. This >> has c

Re: [Format][Important] Needed clarification of timezone-less timestamps

2021-06-22 Thread Julian Hyde
amp without time zone > * > https://docs.google.com/document/d/1wDAuxEDVo3YxZx20fGUGqQxi3aoss7TJ-TzOUjaoZk8/edit?usp=sharing > > # Proposal: Arrow should define how an “Instant” is stored > * > https://docs.google.com/document/d/1xEKRhs-GUSMwjMhgmQdnCNMXwZrA10226AcXRoP8g9E/edit?usp=sh

Re: [Format][Important] Needed clarification of timezone-less timestamps

2021-06-22 Thread Julian Hyde
My proposal is that Arrow should support three different kinds of date-times: zoneless, zoned, and instant. (Not necessarily with those names.) All three kinds occur frequently in the industry. Many systems only have two, and users of those systems have figured out how to make do. (For example,
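The three kinds can be illustrated with Python's standard datetime module. This is only an analogy under the assumption that a fixed UTC offset stands in for a real time zone; it says nothing about how Arrow stores any of them.

```python
from datetime import datetime, timedelta, timezone

# Zoneless: a wall-clock reading with no zone attached.
zoneless = datetime(2021, 6, 22, 9, 30)

# Zoned: the same wall-clock reading qualified by an offset (fixed UTC-7 here).
pacific = timezone(timedelta(hours=-7))
zoned = datetime(2021, 6, 22, 9, 30, tzinfo=pacific)

# Instant: a point on the time-line, held as whole milliseconds since the epoch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
instant_ms = (zoned - EPOCH) // timedelta(milliseconds=1)

print(zoneless.tzinfo, zoned.utcoffset(), instant_ms)
```

A zoned value can always be reduced to an instant, but a zoneless value cannot until someone supplies a zone.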

Re: [Format] Timestamp timezone semantics?

2021-06-04 Thread Julian Hyde
me objects — if you call access > datetime.hour on a timezone-less datetime.datetime, it will return the > same result no matter where in the world you are. > > On Thu, Jun 3, 2021 at 1:19 PM Julian Hyde wrote: >> >> It seems that Arrow’s timestamp type can either have no

Re: [Format] Timestamp timezone semantics?

2021-06-03 Thread Julian Hyde
It seems that Arrow’s timestamp type can either have no time zone or be UTC. I think that is a flawed design, because it doesn’t catch user errors. Suppose you want to find the number of milliseconds between two timestamps. If the first has a timezone and the second is implicitly UTC, then you can
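For contrast, here is how a library that treats zone-less values as having no zone at all catches this kind of mix-up. The sketch uses Python's standard datetime module; millis_between is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def millis_between(start, end):
    """Whole milliseconds from start to end."""
    return (end - start) // timedelta(milliseconds=1)


aware = datetime(2021, 6, 3, 12, 0, tzinfo=timezone.utc)
naive = datetime(2021, 6, 3, 12, 0)

try:
    millis_between(naive, aware)
    mixed_allowed = True
except TypeError:
    # Python refuses to subtract a zone-less value from a zoned one.
    mixed_allowed = False

same_kind = millis_between(aware, aware + timedelta(seconds=2))
print(mixed_allowed, same_kind)
```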

Re: [Format] Timestamp timezone semantics?

2021-06-03 Thread Julian Hyde
My answer to Antoine’s question would not be “kind of”, it would be “no”. In a system such as Joda-time, which I claim is the only system that Arrow should be considering, a timestamp-without-timezone does not have an implicit time zone of UTC. It has no time zone. > On Jun 3, 2021, at 8:52 AM

Re: [Format] Timestamp timezone semantics?

2021-06-02 Thread Julian Hyde
> On Jun 2, 2021, at 1:56 PM, Micah Kornfield wrote: > > > At least in bigquery we do the following mapping: > SQL TIMESTAMP -> Arrow Timestamp with "UTC" timezone > SQL DATETIME -> Arrow Timestamp without a time-zone. BigQuery was one of the systems I had in mind when I said "naming is a lit

Re: [Format] Timestamp timezone semantics?

2021-06-02 Thread Julian Hyde
Good time libraries support all. E.g. Jodatime [1] has * Instant - an instantaneous point on the time-line * DateTime - full date and time with time-zone * LocalDateTime - date-time without a time-zone The SQL world isn't quite as much of a mess as Adam makes it out to be. The SQL standard define

Re: Long title on github page

2021-05-17 Thread Julian Hyde
wth. We have a data format, yes, >> but we are also creating a computational platform to go hand-in-hand >> with the data format to make it easier to build fast applications that >> use the data format. So the description needs to capture both of these >> ideas. >> >>

Re: Long title on github page

2021-05-17 Thread Julian Hyde
I think that the “cross-language development platform for” is noise. (I’m sure that JPEG developers think that JPEG is a “cross-language development platform” too. But it isn’t. It is an image format.) “Apache Arrow is a data format for efficient in-memory processing.” I’ll note that in marketing

Re: [C++][DISCUSS] Implementing interpreted (non-compiled) tests for compute functions

2021-05-14 Thread Julian Hyde
Do any of these compute functions have analogs in other implementations of Arrow (e.g. Rust)? I believe that as much as possible of Arrow’s compute functionality should be cross-language. Perhaps there are language-specific differences in how functions are invoked, but the basic functiona

Re: [Discuss] Storing metadata about the "sortedness" of data

2021-05-11 Thread Julian Hyde
Note that Calcite’s Statistic interface is heavily simplified, designed to be really simple for people to implement when they write their first table adapter. There are more advanced forms of metadata, such as RelMdDistribution [1] and Collation [2]. Since Arrow data sets will typically consist

Re: [RUST] Proposal for more frequent Rust Arrow release process

2021-05-03 Thread Julian Hyde
dious than necessarily onerous. Generating a signed > > > > tarball, seems like it should take ~5 minutes or less with the proper > > > > tooling? Verification is more heavy weight but again with the proper > > > > tooling and a good system for testing out mo

Re: [DISCUSS] Host DataFusion website on GitHub pages

2021-05-03 Thread Julian Hyde
M, Wes McKinney wrote: > > What would be the advantages of this versus publishing a website to > arrow.apache.org/datafusion? If the project is actually part of Apache > Arrow, I would be worried about having different base URLs altogether > for different subprojects > > On

Re: [DISCUSS] Host DataFusion website on GitHub pages

2021-05-03 Thread Julian Hyde
Would this web site be served from an apache.org domain? > On May 3, 2021, at 7:34 AM, Andy Grove wrote: > > Based on a quick reading of ASF documentation, I don't think we need to > vote on creating a website, but I do think that the user guide should be > published from th

Re: [RUST] Proposal for more frequent Rust Arrow release process

2021-05-01 Thread Julian Hyde
e votes last for at least 72 hours this does seem like a lot of > overhead every two weeks, but it seems that is something for Rust > maintainers to decide and adjust. > > -Micah > > On Saturday, May 1, 2021, Julian Hyde wrote: > > > (Removing user@ from cc. I think t

Re: [RUST] Proposal for more frequent Rust Arrow release process

2021-05-01 Thread Julian Hyde
(Removing user@ from cc. I think this is mainly a dev@ issue.) I believe there are some tensions between this process and the Apache process. In particular, Apache releases tend to be a signed source distribution (tarball) that at least three PMC members download and verify. I totally understand w

Re: [Gandiva] Replacing the LRU cache in gandiva

2021-04-21 Thread Julian Hyde
e: > > Julian, How do you plan to use Gandiva in Apache Calcite? > > On Tue, Apr 20, 2021 at 9:57 PM Julian Hyde wrote: > >> We would love to use Gandiva in Apache Calcite [1] but we are blocked >> because the JAR on Maven Central doesn't work on macOS, Linux or >

Re: [Gandiva] Replacing the LRU cache in gandiva

2021-04-20 Thread Julian Hyde
We would love to use Gandiva in Apache Calcite [1] but we are blocked because the JAR on Maven Central doesn't work on macOS, Linux or Windows [2] and there seems to be no interest in fixing the problem. So I doubt whether anyone is using Gandiva in production (unless they have built the artifacts

Re: Rust sync meeting

2021-04-08 Thread Julian Hyde
Julia's needs with respect > > > > > to releasing. > > > > > > > > > > The discussions that we have had recently have centered around > > > > > communication questions. > > > > > > > > > > * Mailing list disc

Re: Rust sync meeting

2021-04-08 Thread Julian Hyde
Antoine, I need to correct your assertion > we develop on the side every day when we submit PRs from forks; > it's just a matter of how much complexity is being submitted at once Intuitively, there seems to be a continuum between a PR developed within a project to a major feature/codebase devel

Re: [Rust] Contributing to Apache Arrow

2021-03-07 Thread Julian Hyde
We have the exact same problem in Apache Calcite. People get the impression that “contributor” is some kind of achievement within the Apache hierarchy - it’s not, it’s just a JIRA concept - and it creates friction for people who want to contribute. (After all, I think we want people to log a JIR

Arrow papers

2021-02-07 Thread Julian Hyde
A couple of interesting Arrow-related papers have appeared at conferences recently: Integrating Lightweight Compression Capabilities into Apache Arrow [1] Magpie: Python at Speed and Scale using Cloud Backends [2] I’m sharing them so that people are aware of the evolving state-of-the-art. Julian

Re: [DISCUSS] Rotating the PMC Chair

2020-09-29 Thread Julian Hyde
certain > matters, but overall IMHO we've had a generally healthy dynamic in our > governance. > > On Tue, Sep 29, 2020 at 2:12 AM Julian Hyde wrote: >> >> There has been some discussion in the Arrow PMC about rotating the PMC >> Chair (also known as the project VP

Re: [Rust] Arrow SQL Adapters/Connectors

2020-09-29 Thread Julian Hyde
ODBC and JDBC do not specify a wire protocol. So, while the client APIs are definitely row-based, any particular driver could use a protocol that is based on Arrow data. There is immense investment in ODBC and JDBC drivers, and they handle complex cases such as connection pooling, statement can
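A rough sketch of that separation: the client sees rows, while the data underneath arrives as columns. A dict of equal-length lists stands in for an Arrow record batch here, and iter_rows is a hypothetical name, not a JDBC, ODBC, or Arrow call.

```python
def iter_rows(batch):
    """Yield one dict per row from a dict of equal-length columns."""
    names = list(batch)
    columns = [batch[name] for name in names]
    # zip(*columns) walks the columns in lockstep, producing one row at a time.
    for values in zip(*columns):
        yield dict(zip(names, values))


# Columnar payload as it might arrive over the wire (illustrative data).
batch = {"id": [1, 2, 3], "name": ["ann", "bo", "cy"]}
rows = list(iter_rows(batch))
print(rows[0])
```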

[DISCUSS] Rotating the PMC Chair

2020-09-29 Thread Julian Hyde
There has been some discussion in the Arrow PMC about rotating the PMC Chair (also known as the project VP) every year. I wanted to raise the topic here for discussion among Arrow committers and within the broader Arrow community. Quite a few Apache projects have adopted a policy where they choose

Re: Some interesting VLDB reading on vectorized query evaluation relevant to Gandiva, other items

2018-09-28 Thread Julian Hyde
An excellent paper, thanks for sharing. (It’s worth reading every single one of the references.) I wonder whether Timo Kersten is related to Martin. > On Sep 27, 2018, at 9:44 AM, Wes McKinney wrote: > > http://www.vldb.org/pvldb/vol11/p2209-kersten.pdf

Re: [DISCUSS] Standardize Java style

2018-08-28 Thread Julian Hyde
My two cents: it’s much, much more important to have a standard style (enforced automatically) than what that style is. People should come into this expecting to compromise their personal preferences. > On Aug 28, 2018, at 10:29 AM, Bryan Cutler wrote: > > Sounds good Li. I just wanted to make

Re: New u...@arrow.apache.org mailing list

2018-08-23 Thread Julian Hyde
Thanks! I think you should also announce on twitter. There may be many followers who feel that dev@ is too heavy for them but user@ would be just right. Julian > On Aug 23, 2018, at 9:13 AM, Wes McKinney wrote: > > hi all, > > We have just created a user-oriented mailing list. Please subscr

Re: [DISCUSS] Moving forward on the Arrow-Parquet C++ monorepo project

2018-08-19 Thread Julian Hyde
The votes to grant commit access that you refer to are votes to appoint committers or PMC members. Those votes are conducted in private to prevent embarrassment in case the vote fails, or if the vote passes and the individual declines the offer. I don’t see any such potential embarrassment here

Re: Progress on Arrow RPC a.k.a. Arrow Flight

2018-08-16 Thread Julian Hyde
If your use case is SQL RPC, then you are getting close to Avatica's territory. Avatica[1] is a protocol for implementing language-independent JDBC and ODBC stacks. Now, I agree that many ODBC implementations are inefficient. Some ODBC stacks make more round trips than necessary, and do more copyi

Re: [VOTE] Accept donation of Gandiva to Apache Arrow

2018-08-16 Thread Julian Hyde
+1 On Thu, Aug 16, 2018 at 8:56 AM Wes McKinney wrote: > > Dear all, > > The developers of Gandiva, an LLVM-based vectorized expression > evaluation engine for Arrow columnar memory, are proposing to donate > the project to Apache Arrow at some point in the near future, as has > been discussed on

Re: Increasing transparency of corporate support for Apache Arrow development

2018-08-16 Thread Julian Hyde
This is a tough one. I think we need to strike a delicate balance: we should thank companies for being benefactors, but should not put up with bragging (or as Ted puts it, genital comparisons). In Calcite, we allow committers to show their company affiliations[1]. I was initially concerned, but it

Re: [ANNOUNCE] Apache Arrow 0.10.0 released

2018-08-07 Thread Julian Hyde
.10.0.tar.gz.sha256 >> Second issue: I'm not sure what the issue is there. FWIW, they are not >> there for the 0.9.0 release either, and I don't see them in the main >> parquet project either: >> http://apache.cs.utah.edu/parquet/apache-parquet-1.10.0/ >>

Re: [ANNOUNCE] Apache Arrow 0.10.0 released

2018-08-07 Thread Julian Hyde
Congratulations! One thing: on http://arrow.apache.org/install/ there were the checksums (.tar.gz.asc and .tar.gz.sha512), but I couldn’t find a link to the mirrors with the source tarball (i.e. https://www.apache.org/dyn/closer.cgi/arrow/arrow-0.10.0/

Re: [DISCUSS] Solutions for improving the Arrow-Parquet C++ development morass

2018-07-31 Thread Julian Hyde
A controlled fork doesn’t sound like a terrible option. Copy the code from parquet into arrow, and for a limited period of time it would be the primary. When that period is over, the code in parquet becomes the primary. During the period during which arrow has the primary, the parquet release m

Re: [DISCUSS] Solutions for improving the Arrow-Parquet C++ development morass

2018-07-30 Thread Julian Hyde
I'm not going to comment on the design of the parquet-cpp module and whether it is “closer” to parquet or arrow. But I do think Wes’s proposal is consistent with Apache policy. PMCs make releases and govern communities; they don’t exist to manage code bases, except as a means to the end of crea

Re: Arrow stickers

2018-07-10 Thread Julian Hyde
Thanks for driving this. Can you put the word “apache” in there (in smaller font if you like). That way, if you have the logo on slide 1 of your presentation, you’ve already done your duty to mention the Apache brand. Julian > On Jul 9, 2018, at 19:07, Kelly Stirman wrote: > > Hi everyone!

Re: Housing longer-term Arrow development, design, and roadmap documents

2018-06-26 Thread Julian Hyde
bly good job curating JIRA, but it would > be helpful to have some kind of high level narrative about the > different areas of the project. > > On Tue, Jun 26, 2018 at 1:21 PM, Julian Hyde wrote: >> I have a bias against wikis of all kinds. If left to their own devices, they >>

Re: Housing longer-term Arrow development, design, and roadmap documents

2018-06-26 Thread Julian Hyde
I have a bias against wikis of all kinds. If left to their own devices, they tend to become an unstructured mess. Of course, the lack of structure is what makes them useful for what Wes is proposing: gathering knowledge and organizing it as it evolves. But someone will need to play the “librari

Re: Gandiva Initiative

2018-06-22 Thread Julian Hyde
This is exciting. We have wanted to build an Arrow adapter in Calcite for some time and have a prototype (see https://issues.apache.org/jira/browse/CALCITE-2173 ) but I hope that we can use Gandiva. I know that Gandiva has Java bindings, but w

Re: Arrow stickers

2018-06-13 Thread Julian Hyde
gt;>>> have an "official" logo either. Could the design also be applied >> for >>> this >>>>>> and added to the site, etc? IBM has a design team that might be >> able >>> to >>>>>> help out. Does it make sense to

Re: Arrow stickers

2018-06-02 Thread Julian Hyde
Stickers are a great idea. Ask on comdev. Here’s a recent thread I found on the topic. https://lists.apache.org/thread.html/6a73bcc86929d0bd2d4bffb8d9b30f9a5590e872cfc2a884a6de9c5d@%3Cdev.community.apache.org%3E I recall someone (maybe Sharan from comdev) saying that any project can get free st

Re: What do people think about a one day get together?

2018-04-09 Thread Julian Hyde
+1 The Arrow community would benefit greatly from a conference/unconference. Remember not to schedule it too close to ApacheCon. Julian > On Apr 9, 2018, at 10:18 AM, Jacques Nadeau wrote: > > Hey all, given that several people are busy in June, let's wait until the > fall. I'll take a look

Re: Taking some time off Arrow maintenance / development in April

2018-03-28 Thread Julian Hyde
+1 Great move. Your presence looms large over this project, so stepping back will give others the chance to take the lead. I know (from other projects) how oppressive it is to be the main “go to” person in a project. It irks me, for instance, when people address comments in PRs directly to me

Re: [VOTE] Accept donation of Arrow Go implementation

2018-03-07 Thread Julian Hyde
+1 > On Mar 6, 2018, at 3:49 PM, Kouhei Sutou wrote: > > +1 > > In > "Re: [VOTE] Accept donation of Arrow Go implementation" on Tue, 6 Mar 2018 > 15:46:31 -0500, > Li Jin wrote: > >> +1 >> >> On Tue, Mar 6, 2018 at 3:31 PM, Uwe L. Korn wrote: >> >>> +1 >>> >>> On Tue, Mar 6, 2018, at

Re: [DISCUSS] Expanding Arrow interval type metadata, changing Java memory representation

2017-11-08 Thread Julian Hyde
ve a timedelta64[UNIT] type (which results from any arithmetic > between timestamp values) > > On Wed, Nov 8, 2017 at 5:38 PM, Julian Hyde wrote: >> I don't know many examples of interval being used in the real world. >> But here's the kind of thing: the policy is

Re: [DISCUSS] Expanding Arrow interval type metadata, changing Java memory representation

2017-11-08 Thread Julian Hyde
implement it. Otherwise, we have >> huge amounts of type bloat (which means nothing will fully implement the >> spec and be able to interoperate). >> >> On Sat, Nov 4, 2017 at 3:46 PM, Julian Hyde wrote: >> >>> As I understand it, the proposal is to have both

Re: JDBC Adapter for Apache-Arrow

2017-11-07 Thread Julian Hyde
approach of adding an interface and making it part of Arrow, > so any specific JDBC driver can implement that interface to directly expose > Arrow objects without having to create JDBC objects in the first place. One > such implementation could be for Avatica itself what Julian was sugge

Re: [DISCUSS] Expanding Arrow interval type metadata, changing Java memory representation

2017-11-04 Thread Julian Hyde
As I understand it, the proposal is to have both an interval data type[1] and a timedelta type[2]. The interval is compatible with the SQL standard (but not Postgres) and can be implemented with a single numeric value representing a particular time unit (year, month, day, hour, minute, second,

Re: JDBC Adapter for Apache-Arrow

2017-11-01 Thread Julian Hyde
http://lmgtfy.com/?q=unsubscribe+apache+arrow > On Oct 31, 2017, at 5:20 PM, 丁锦祥 wrote: > > unsubscribe > > On Tue, Oct 31, 2017 at 4:28 PM, Julian Hyde wrote: > >> Yeah, I agree, it should be an interface

Re: JDBC Adapter for Apache-Arrow

2017-10-31 Thread Julian Hyde
other Dev >> folks, on the JDBC adapter aspect. >> >> -Atul >> >> -Original Message- >> From: Julian Hyde [mailto:jh...@apache.org] >> Sent: Tuesday, October 31, 2017 11:12 AM >> To: dev@arrow.apache.org >> Subject: Re: JDBC Adapter for Apa

Re: JDBC Adapter for Apache-Arrow

2017-10-31 Thread Julian Hyde
to figure out the underlying DB schema and define Arrow columnar > schema. Also underlying database in this case would be any relational DB and > hence would be persisted to the disk, but the Arrow objects being in-memory > can be ephemeral. > > Please correct me if I am missing any

Re: JDBC Adapter for Apache-Arrow

2017-10-30 Thread Julian Hyde
How about writing an Arrow adapter for Calcite? I think it amounts to the same thing - you would inherit Calcite’s SQL parser and Avatica JDBC stack. Would this database be ephemeral (i.e. would the data go away when you close the connection)? If not, how would you know where to load the data f

Re: [DISCUSS] Updating Arrow's "elevator pitch" on web properties

2017-10-27 Thread Julian Hyde
es of the project > > 3. We are building computation and messaging libraries to be > companions to the columnar format and memory management > > 4. We support many languages (I added "currently" to imply that we are > not closed to new languages) > > - Wes >

Re: [DISCUSS] Updating Arrow's "elevator pitch" on web properties

2017-10-22 Thread Julian Hyde
s >>> of applications architected around the mantra of zero-copy. With new >>> architectures designed to leverage non-volatile memory on the horizon, >>> this grows more important with each passing day. >>> >>> - Wes >>> >>> On Sun, Oct 22, 2017 at

Re: [DISCUSS] Updating Arrow's "elevator pitch" on web properties

2017-10-21 Thread Julian Hyde
Your proposed version is definitely an improvement. > "Apache Arrow is a cross-language development platform for in-memory > structured data access and analytics. It specifies a standardized > language-independent columnar memory format for flat and hierarchical > data, with support for zero-copy

Fwd: [DISCUSS] Storage-class memory ecosystem program

2017-10-19 Thread Julian Hyde
This thread on general@incubator may be of interest to Arrow. Julian > Begin forwarded message: > > From: "Gang(Gary) Wang" > Subject: [DISCUSS] Storage-class memory ecosystem program > Date: October 19, 2017 at 11:55:46 AM PDT > To: gene...@incubator.apache.org > Cc: d...@mnemonic.incubator.

Re: [DRAFT] Apache Arrow October 2017 Board Report

2017-10-06 Thread Julian Hyde
Looks very healthy. The board frequently says they don't like seeing mailing list activity stats unless they make a significant point. Maybe remove that section, and note in "Health report" that emails are up 30% and subscribers are up 10%. On Fri, Oct 6, 2017 at 1:32 PM, Wes McKinney wrote: > H

Re: [ANNOUNCE] New Arrow committers: Phillip Cloud and Bryan Cutler

2017-10-03 Thread Julian Hyde
Congratulations and welcome, Phillip and Bryan! > On Oct 3, 2017, at 5:27 AM, Wes McKinney wrote: > > On behalf of the Arrow PMC, I'm pleased to announce that Phillip Cloud > and Bryan Cutler have been invited to be Arrow committers. > > We are grateful for your contributions to the project and

Re: [DISCUSS] Publishing Arrow development artifacts more frequently for alpha stage components

2017-09-08 Thread Julian Hyde
e requirement)? > > On Fri, Sep 8, 2017 at 4:58 PM, Julian Hyde wrote: >> >>> On Sep 7, 2017, at 7:06 PM, Wes McKinney wrote: >>> >>> I personally don't have a problem with subcomponents publishing >>> artifacts to package managers outside of t

Re: [DISCUSS] Publishing Arrow development artifacts more frequently for alpha stage components

2017-09-08 Thread Julian Hyde
> On Sep 7, 2017, at 7:06 PM, Wes McKinney wrote: > > I personally don't have a problem with subcomponents publishing > artifacts to package managers outside of the primary Apache project > votes and releases, so long as they clearly signal that these package > builds are for development and not

Re: Apache Arrow at JupyterCon

2017-08-30 Thread Julian Hyde
Thanks for sharing. Can we tweet those videos as well? I see that https://twitter.com/apachearrow only tweeted your slides. > On Aug 26, 2017, at 1:11 PM, Wes McKinney wrote: > > hi all, > > In case folks here are interested, I gave a keynote this week at > J

Arrow Plasma Object Store - IP clearance

2017-08-07 Thread Julian Hyde
The vote for IP clearance of the Plasma Object Store on the Incubator list has passed[1]. We can now proceed with a release. Julian [1] https://s.apache.org/arrow-plasma-object-store-clearance-result

Fwd: [IP CLEARANCE] Arrow Plasma Object Store

2017-08-02 Thread Julian Hyde
FYI: I started a vote on the Plasma IP. > Begin forwarded message: > > From: Julian Hyde > Subject: [IP CLEARANCE] Arrow Plasma Object Store > Date: August 2, 2017 at 12:12:27 PM PDT > To: gene...@incubator.apache.org > > Apache Arrow is receiving a code for t

Avro Arrow

2017-07-30 Thread Julian Hyde
This news item made me chuckle. What sounds like an interesting mash-up of two Apache projects ended up in a Canadian lake in the 1950s. http://www.torontosun.com/2017/07/28/search-for-long-missing-avro-arrow-models-gets-underway

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-27 Thread Julian Hyde
hat mean we need to bump the whole project to 2.x? > As more languages come into the fold, this could happen more and more > often. How would people interpret a fast escalating major version > number? > > I am curious how Avro or Thrift have addressed this issue. > > - Wes > >

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-26 Thread Julian Hyde
y is really important so that if any >> systems know how to parse or emit Arrow 1.x data, but aren't >> necessarily using the libraries provided by the project, they can have >> some assurance that we aren't going to break the Flatbuffers or the >> arrangement o

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-26 Thread Julian Hyde
nt so that if any > systems know how to parse or emit Arrow 1.x data, but aren't > necessarily using the libraries provided by the project, they can have > some assurance that we aren't going to break the Flatbuffers or the > arrangement of bytes in a record batch on the wire

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-26 Thread Julian Hyde
1.0 is a Big Deal because, under semantic versioning, there is a commitment to not change public APIs. If it weren’t for that, 1.0 would have vague marketing connotations of robustness, adoption etc. but otherwise be no different from another release. So, if API and data format lifecycle and co
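The semantic-versioning commitment mentioned above can be made concrete with a small sketch. It assumes plain `MAJOR.MINOR.PATCH` version strings; the helper names `parse` and `api_compatible` are hypothetical and are not part of any Arrow library.

```python
def parse(version):
    """Split a 'MAJOR.MINOR.PATCH' string into a tuple of three ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def api_compatible(built_against, installed):
    """Return True if code built against one version may rely on another.

    Under semantic versioning, 0.x releases promise nothing, so only an
    exact match is treated as safe. From 1.0.0 onward, any later release
    with the same major version must keep the public API intact.
    """
    built = parse(built_against)
    have = parse(installed)
    if built[0] == 0 or have[0] == 0:
        return built == have
    return built[0] == have[0] and have >= built
```

For example, `api_compatible("1.0.0", "1.3.2")` is true, while `api_compatible("0.5.0", "0.6.0")` is false, which is the practical difference a 1.0 release makes.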

Re: [VOTE] Accept contribution of Plasma Object Store

2017-07-20 Thread Julian Hyde
+1 > On Jul 20, 2017, at 3:07 PM, Bryan Cutler wrote: > > +1 sounds great! > > On Thu, Jul 20, 2017 at 11:14 AM, Wes McKinney wrote: > >> Dear all, >> >> The Plasma Object Store provides a server process, reference C++ client, >> and >> Python binding for managing a collection of binary "obj

Re: Adding a Map logical type to the Arrow metadata

2017-07-19 Thread Julian Hyde
s very useful; the metadata for Map should have > a field indicating whether or not the keys are sorted within each map > value > > - Wes > > On Wed, Jul 19, 2017 at 1:37 PM, Julian Hyde wrote: >> List<Struct<K, V>> isn’t the only physical representation that makes sense. >> Becaus

Re: Adding a Map logical type to the Arrow metadata

2017-07-19 Thread Julian Hyde
List<Struct<K, V>> isn’t the only physical representation that makes sense. Because it doesn’t take advantage of the fact that (a) keys can be re-ordered, (b) keys are unique. So, another viable physical representation would be Struct<List<K>, List<V>>, with the keys sorted. If keys are constant width and in contiguou
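A minimal sketch of the two layouts contrasted above, using plain Python lists rather than Arrow arrays. It assumes the two representations are a list of key/value structs and a struct holding a key list and a value list; all function names are hypothetical. The point it illustrates is that sorted, unique keys allow a binary search instead of a linear scan.

```python
from bisect import bisect_left


def lookup_list_of_structs(entries, key):
    """Layout A: one list of (key, value) pairs; lookup is a linear scan."""
    for k, v in entries:
        if k == key:
            return v
    return None


def to_sorted_parallel(entries):
    """Convert layout A to layout B: parallel key and value lists, keys sorted."""
    ordered = sorted(entries, key=lambda pair: pair[0])
    keys = [k for k, _ in ordered]
    values = [v for _, v in ordered]
    return keys, values


def lookup_sorted_parallel(keys, values, key):
    """Layout B: binary search over the sorted keys, then index the values."""
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return values[i]
    return None
```

Both lookups return the same answers; the second does so in logarithmic rather than linear time per map value, at the cost of sorting the keys when the map is built.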

Re: Branching for Arrow releases

2017-05-05 Thread Julian Hyde
e when we come to > it. > > Thanks > Wes > > On Fri, May 5, 2017 at 12:51 PM, Julian Hyde wrote: > >> I’m fine with either proposal (holding off commits during the release >> vote, or rebasing master afterwards). >> >> I agree with Julien that it’s re

Re: Branching for Arrow releases

2017-05-05 Thread Julian Hyde
I’m fine with either proposal (holding off commits during the release vote, or rebasing master afterwards). I agree with Julien that it’s really nice to have a simple, linear history (with releases on the master branch) and since Arrow is a fairly low-volume project we’re lucky we can do that.

[jira] [Commented] (ARROW-690) Only send JIRA updates to iss...@arrow.apache.org

2017-03-22 Thread Julian Hyde (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15936642#comment-15936642 ] Julian Hyde commented on ARROW-690: --- Presumably the creation email would continue t
