Re: [DISCUSS] HadoopInputFormat based IOs

2017-06-05 Thread Stephen Sisk
That summary phrase looks good to me, thanks for writing it up.

S

On Thu, Jun 1, 2017 at 3:08 AM Ismaël Mejía wrote:
> Stephen I agree with you that the most important thing is not to lose
> functionality for the integration tests, so it is important to keep at
> least one of the two (Cassandra

Re: [DISCUSS] HadoopInputFormat based IOs

2017-06-01 Thread Ismaël Mejía
Stephen, I agree with you that the most important thing is not to lose functionality for the integration tests, so it is important to keep at least one of the two (Cassandra or Elasticsearch) to do a real integration test for HIFIO. Your proposal of making the ITs for the native IOs paralleliz

Re: [DISCUSS] HadoopInputFormat based IOs

2017-05-30 Thread Stephen Sisk
Ah, thanks for clarifying, Ismaël. I think you would agree that we need to have integration testing of HIFIO. Cassandra and ES are currently the only ITs for HIFIO. If we want to write ITs for HIFIO that don't rely on ES/Cassandra with the idea that we'd remove ES/Cassandra, I could be okay with th

RE: [DISCUSS] HadoopInputFormat based IOs

2017-05-30 Thread Seshadri Raghunathan
+1. I think this is a good way to streamline HIFIO and native IOs.

Regards,
Seshadri
408 601 7548

-----Original Message-----
From: Ismaël Mejía [mailto:ieme...@gmail.com]
Sent: Tuesday, May 30, 2017 1:47 PM
To: dev@beam.apache.org
Subject: Re: [DISCUSS] HadoopInputFormat based IOs

The whole

Re: [DISCUSS] HadoopInputFormat based IOs

2017-05-30 Thread Ismaël Mejía
The whole goal of this discussion is to define what we should do when someone wants to add a new IO that uses HIFIO. The consensus so far, following the PR comments and this thread, is that it should be discouraged and that those contributions be included as documentation on the website, and that we shou

Re: [DISCUSS] HadoopInputFormat based IOs

2017-05-30 Thread Stephen Sisk
Great, I'm glad to hear that. I filed BEAM-2388 to track the work (currently unassigned).

> today we have Cassandra and Elasticsearch5 examples based on HIF that will be clearly redundant once we have the native versions, so they should maybe moved into the proposed website section

Can you clarify

Re: [DISCUSS] HadoopInputFormat based IOs

2017-05-30 Thread Ismaël Mejía
I agree 100% with Stephen's points. I think that including a 'discoverability' section for these IOs that are shared by multiple data stores is a great step, in particular for the HIF ones. I would like us to define concretely what we would do with the HIFIO-based implementations of IOs once thei

Re: [DISCUSS] HadoopInputFormat based IOs

2017-05-23 Thread Stephen Sisk
Hey, thanks for bringing this up! It's definitely an interesting question and I can see both sides of the argument. I can see the appeal of HIFIO wrapper IOs as stop-gaps, and if they have good test coverage, it does ensure that the HIFIO route is working. If we have good IT coverage, it also mean

[DISCUSS] HadoopInputFormat based IOs

2017-05-23 Thread Ismaël Mejía
Hello, I bring this subject to the mailing list to hear everybody’s opinion on it. The recent inclusion of HadoopInputFormatIO (HIFIO) gave Beam users the option to ‘easily’ include data stores that support the Hadoop-based partitioning scheme. There are currently examples of how to use i