date:20170622

Re: [DISCUSS] Bridge beam metrics to underlying runners to support metrics reporters?

2017-06-22 Thread Cody Innowhere

Hi JB, Glad to hear that. Still, I'm thinking about adding support for Meters & Histograms (maybe extending Distribution). As the discussion mentions, the problem is that a Meter/Histogram cannot be updated directly in the current way because its internal data decays over time. Do you plan to refactor curre

Re: Looking for a good "write-here-if-fails" pattern

2017-06-22 Thread Kenneth Knowles

Using provenance to explain bad data in a general manner requires deep support from your data processing engine and is still a research topic (for example, https://blog.acolyer.org/2017/02/01/explaining-outputs-in-modern-data-analytics/) so I wouldn't go down that path. I expect that putting in the

Re: SQL in Stream Computing: MERGE or INSERT?

2017-06-22 Thread James

Hi Tyler, I think UPSERT is a good alternative: as concise as INSERT and with valid semantics. It's just that users seem to rarely use UPSERT either (maybe because there's no UPDATE in batch big data processing). By *"INSERT will behave differently in batch & stream processing"* I mean, if we use the "IN

Re: SQL in Stream Computing: MERGE or INSERT?

2017-06-22 Thread James

Hi Jesse, Yeah, I know the insert...select grammar. In my scenario, each of the value columns is calculated separately (possibly from different data sources), so insert...select might not be sufficient. On Thu, Jun 22, 2017 at 10:35 PM, Jesse Anderson wrote: > If I'm understanding correctly, Hive does that

Re: Fwd: [Report] Eagle - June 2017

2017-06-22 Thread 上海_中台研发部_数据平台部_基础数据部_唐觊隽

I am working on an alert engine based on Apache Beam. I can help as a volunteer. -Original Message- From: Jyotirmoy Sundi [mailto:sundi...@gmail.com] Sent: June 22, 2017 10:23 To: JingsongLee; dev@beam.apache.org Subject: Re: Fwd: [Report] Eagle - June 2017 Would like to help have worked on we beam apps On Wed, Jun

Re: Bundling multiple TestPipeline tests into one pipeline

2017-06-22 Thread Eugene Kirpichov

Another advantage of the "custom runner" approach is that we can convert existing ValidatesRunner test classes one by one, switching them from @RunWith(JUnit4.class) to @RunWith(BundledTestPipelines.class) or whatever (and making other necessary changes). On Thu, Jun 22, 2017 at 3:48 PM Kenneth Knowles

Re: Bundling multiple TestPipeline tests into one pipeline

2017-06-22 Thread Kenneth Knowles

This is a great idea! Your suggestion to do it via a JUnit test runner makes it very concrete. Kenn On Thu, Jun 22, 2017 at 3:27 PM, Eugene Kirpichov < kirpic...@google.com.invalid> wrote: > Hi folks and especially runner developers, > > https://issues.apache.org/jira/browse/BEAM-2506 - quoting

Bundling multiple TestPipeline tests into one pipeline

2017-06-22 Thread Eugene Kirpichov

Hi folks and especially runner developers, https://issues.apache.org/jira/browse/BEAM-2506 - quoting from there: Currently ValidatesRunner test suites run 1 pipeline per unit test. That's a lot of small pipelines, and consumes a lot of resources especially in case of a pretty heavyweight runner l

Re: [DISCUSS] Apache Beam 2.1.0 release next week ?

2017-06-22 Thread Ahmet Altay

+1 For Python, there are 2 hard blocking issues (and 2 nice to haves) all tagged as blocking 2.1.0 [1]. Ahmet [1] https://issues.apache.org/jira/browse/BEAM-2497?jql=project%20%3D%20BEAM%20AND%20status%20in%20(Open%2C%20Reopened)%20AND%20resolution%20%3D%20Unresolved%20AND%20fixVersion%20%3D%202

Re: reading from s3 file in aws

2017-06-22 Thread Lukasz Cwik

Filed BEAM-2500 as a feature request. On Thu, Jun 22, 2017 at 9:00 AM, tarush grover wrote: > Hi All, > > Can we add a module s3-file-system in beam to directly support and have > integration with s3? > > Regards, > Tarush > > On Thu, 22 Jun 2017 at 9:21 PM, Lukasz Cwik > wrote: > > > You want

Re: [DISCUSS] Apache Beam 2.1.0 release next week ?

2017-06-22 Thread Davor Bonaci

+1 On Thu, Jun 22, 2017 at 5:42 AM, Etienne Chauchot wrote: > Besides, there are some minor fixes/enhancements that are lacking in Spark > > For info, below are the ones raised by the Nexmark test suite: > > https://issues.apache.org/jira/browse/BEAM-2499 > > https://issues.apache.org/jira/browse/BEAM-211

Re: reading from s3 file in aws

2017-06-22 Thread tarush grover

Hi All, Can we add a module s3-file-system in Beam to directly support and integrate with S3? Regards, Tarush On Thu, 22 Jun 2017 at 9:21 PM, Lukasz Cwik wrote: > You want to depend on the Hadoop File System module[1] and configure > HadoopFileSystemOptions[2] with an S3 configuration[3]

Re: reading from s3 file in aws

2017-06-22 Thread Lukasz Cwik

You want to depend on the Hadoop File System module[1] and configure HadoopFileSystemOptions[2] with an S3 configuration[3]. 1: https://github.com/apache/beam/tree/master/sdks/java/io/hadoop-file-system 2: https://github.com/apache/beam/blob/master/sdks/java/io/hadoop-file-system/src/main/java/org/

Re: [DISCUSS] Apache Beam 2.1.0 release next week ?

2017-06-22 Thread Etienne Chauchot

Besides, there are some minor fixes/enhancements that are lacking in Spark. For info, below are the ones raised by the Nexmark test suite: https://issues.apache.org/jira/browse/BEAM-2499 https://issues.apache.org/jira/browse/BEAM-2112 https://issues.apache.org/jira/browse/BEAM-2409 https://issues.apache

Re: SQL in Stream Computing: MERGE or INSERT?

2017-06-22 Thread Tyler Akidau

Calcite appears to have UPSERT support, can we just use that instead? Also, I don't understand your statement that "INSERT will behave differently in batch & stream processing". Can you explain further? -Tyler On Thu, Jun 22, 2017 at 7:35 AM J

Re: SQL in Stream Computing: MERGE or INSERT?

2017-06-22 Thread Jesse Anderson

If I'm understanding correctly, Hive does that with an insert into followed by a select statement that does the aggregation. https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-InsertingdataintoHiveTablesfromqueries On Thu, Jun 22, 2017 at 1:32 AM James wrote: >

Re: [DISCUSS] Bridge beam metrics to underlying runners to support metrics reporters?

2017-06-22 Thread Jean-Baptiste Onofré

Hi, Agree with Aviem, and yes, actually I'm working on a generic metric sink. I created a Jira about that. I'm off today, I will send some details asap. Regards JB On Jun 22, 2017, at 15:16, Aviem Zur wrote: >Hi Cody, > >Some of the runners have their own metrics sink, for example Spark >r

Re: [DISCUSS] Bridge beam metrics to underlying runners to support metrics reporters?

2017-06-22 Thread Aviem Zur

Hi Cody, Some of the runners have their own metrics sink, for example Spark runner uses Spark's metrics sink which you can configure to send the metrics to backends such as Graphite. There have been ideas floating around for a Beam metrics sink extension which will allow users to send Beam metric

Re: [DISCUSS] Apache Beam 2.1.0 release next week ?

2017-06-22 Thread Aviem Zur

+1 There are important bug fixes that need to be released. On Thu, Jun 22, 2017 at 11:42 AM Etienne Chauchot wrote: > +1 on Ismaël's words, but not a blocking point indeed, maybe more a nice > to have. > > > On 22/06/2017 at 06:59, Ismaël Mejía wrote: > > Thanks JB for keeping the time based rel

[DISCUSS] Bridge beam metrics to underlying runners to support metrics reporters?

2017-06-22 Thread Cody Innowhere

Hi guys, Currently metrics are implemented in runners/core as CounterCell, GaugeCell, DistributionCell, etc. If we want to send metrics to external systems via metrics reporter, we would have to define another set of metrics, say, codahale metrics, and update codahale metrics periodically with beam

Re: [DISCUSS] Apache Beam 2.1.0 release next week ?

2017-06-22 Thread Etienne Chauchot

+1 on Ismaël's words, but not a blocking point indeed, maybe more a nice to have. On 22/06/2017 at 06:59, Ismaël Mejía wrote: Thanks JB for keeping the time based release agenda. I really don't have any blocker but I would like to have the hadoop version alignment PR merged before this one and

Re: Reduced Availability from 17.6. - 24.6

2017-06-22 Thread Etienne Chauchot

Enjoy, Aljoscha! On 17/06/2017 at 07:03, Aljoscha Krettek wrote: Hi, I’ll be on vacation next week, just in case anyone is wondering why I’m not responding. :-) Best, Aljoscha

SQL in Stream Computing: MERGE or INSERT?

2017-06-22 Thread James

Hi team, I am thinking about a problem related to SQL and stream computing, and would like to hear your opinions. In stream computing, there is a typical case like this: *We want to calculate a big wide result table, which has one rowkey and ten value columns:* *create table result (* *rowkey varchar(127

Jenkins build became unstable: beam_Release_NightlySnapshot #455

2017-06-22 Thread Apache Jenkins Server

See

24 matches
