From what I found, with the windowing and BigQuery partitioning (per
day - 1,545 partitions) the insert can take 5 hours, whereas with no
partitions it takes about 12 minutes.
I have 13,843,080 records, 6.76 GB.
Any ideas how to get the partitioned insert to work faster?
Is there a way to get the
I am using windowing for the partitioning of the table; maybe that has
to do with it?
On Tue, Sep 12, 2017 at 11:25 PM, Reuven Lax wrote:
> Ok, something is going wrong then. It appears that your job created over
> 14,000 BigQuery load jobs, which is not expected (and probably why things
> were so slow).
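For reference, a minimal sketch of the setup being described, assuming
the Beam Java SDK with daily event-time windows in front of the BigQuery
sink (variable and transform names are placeholders, not taken from the
actual job):

    import com.google.api.services.bigquery.model.TableRow;
    import org.apache.beam.sdk.transforms.windowing.CalendarWindows;
    import org.apache.beam.sdk.transforms.windowing.Window;
    import org.apache.beam.sdk.values.PCollection;

    // Window the rows into one-day windows; each window is then routed
    // to the matching day partition of the destination table.
    PCollection<TableRow> daily = rows.apply(
        "DailyWindows", Window.<TableRow>into(CalendarWindows.days(1)));

With ~1,545 distinct days the sink should target ~1,545 partitions, so
the ~14,000 load jobs mentioned above point at something multiplying the
work.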
Any idea how I can debug it or find the issue?
On Tue, Sep 12, 2017 at 11:25 PM, Reuven Lax wrote:
> Ok, something is going wrong then. It appears that your job created over
> 14,000 BigQuery load jobs, which is not expected (and probably why things
> were so slow).
>
> On Tue, Sep 12, 2017 at 8:50 AM, Chaim Turkel wrote:
Ok, something is going wrong then. It appears that your job created over
14,000 BigQuery load jobs, which is not expected (and probably why things
were so slow).
On Tue, Sep 12, 2017 at 8:50 AM, Chaim Turkel wrote:
> No, that specific job created only 2 tables.
>
> On Tue, Sep 12, 2017 at 4:36 PM, Reuven Lax wrote:
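One way to verify the load-job count independently - a hedged sketch
using the google-cloud-bigquery Java client (not Beam; assumes default
project credentials are available in the environment):

    import com.google.cloud.bigquery.BigQuery;
    import com.google.cloud.bigquery.BigQueryOptions;
    import com.google.cloud.bigquery.Job;
    import com.google.cloud.bigquery.JobConfiguration;

    public class CountLoadJobs {
      public static void main(String[] args) {
        BigQuery bq = BigQueryOptions.getDefaultInstance().getService();
        long loads = 0;
        // Page through recent jobs and count the LOAD ones.
        for (Job job :
            bq.listJobs(BigQuery.JobListOption.pageSize(500)).iterateAll()) {
          JobConfiguration conf = job.getConfiguration();
          if (conf != null && conf.getType() == JobConfiguration.Type.LOAD) {
            loads++;
          }
        }
        System.out.println("load jobs seen: " + loads);
      }
    }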
Filed https://issues.apache.org/jira/browse/BEAM-2948
On Tue, Sep 12, 2017 at 2:10 AM, Pawel Bartoszek wrote:
> Hi,
>
> I am running a Flink v1.2.1 job on an EMR cluster using Beam v2.0.0.
> When I try to restore the job from a savepoint on one task manager I get
> the exception *Unable to find registrar for s3n*.
No, that specific job created only 2 tables.
On Tue, Sep 12, 2017 at 4:36 PM, Reuven Lax wrote:
> It looks like your job is creating about 14,45 distinct BigQuery tables.
> Does that sound correct to you?
>
> Reuven
>
> On Tue, Sep 12, 2017 at 6:22 AM, Chaim Turkel wrote:
>
>> the job id is 2017-09-12_02_57_55-5233544151932101752
Congrats, and thanks!
-Tyler
On Tue, Sep 12, 2017 at 5:49 AM Etienne Chauchot
wrote:
> Great work guys!
>
> Etienne
>
>
> On 11/09/2017 at 23:51, Mingmin Xu wrote:
> > Now it's merged to master. Thanks to everyone!
> >
> > Mingmin
> >
> > On Thu, Sep 7, 2017 at 10:09 AM, Ahmet Altay
> > wrote:
Hi,
I am running a Flink v1.2.1 job on an EMR cluster using Beam v2.0.0.
When I try to restore the job from a savepoint on one task manager I get
the exception *Unable to find registrar for s3n*.
The job can write files to S3 acting as a sink, so S3 access works except
when restoring from a savepoint.
I a
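The "registrar" in that error is Beam's FileSystems registrar for the
s3n scheme. A minimal sketch of the usual workaround, assuming the
beam-sdks-java-io-hadoop-file-system module is on the classpath so the
Hadoop registrar can serve s3n (the exact wiring here is an assumption,
not a confirmed fix):

    import java.util.Collections;
    import org.apache.beam.sdk.io.FileSystems;
    import org.apache.beam.sdk.io.hdfs.HadoopFileSystemOptions;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;
    import org.apache.hadoop.conf.Configuration;

    // Map the s3n scheme to Hadoop's native S3 filesystem and hand the
    // configuration to Beam, so FileSystems can resolve s3n:// paths.
    Configuration conf = new Configuration();
    conf.set("fs.s3n.impl",
        "org.apache.hadoop.fs.s3native.NativeS3FileSystem");
    // Assumption: the registrar keys off the default filesystem scheme.
    conf.set("fs.defaultFS", "s3n://<your-bucket>/");

    HadoopFileSystemOptions options =
        PipelineOptionsFactory.as(HadoopFileSystemOptions.class);
    options.setHdfsConfiguration(Collections.singletonList(conf));
    FileSystems.setDefaultPipelineOptions(options);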
It looks like your job is creating about 14,45 distinct BigQuery tables.
Does that sound correct to you?
Reuven
On Tue, Sep 12, 2017 at 6:22 AM, Chaim Turkel wrote:
> the job id is 2017-09-12_02_57_55-5233544151932101752
> As you can see, the majority of the time is spent inserting into BigQuery.
> Is there any way to parallelize this?
the job id is 2017-09-12_02_57_55-5233544151932101752
As you can see, the majority of the time is spent inserting into BigQuery.
Is there any way to parallelize this?
My feeling for the windowing is that writing should be done per window
(my window is daily), or at least it should be configurable.
chaim
On
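On writing per window - a hedged sketch of routing each element to its
window's day partition with a BigQuery table decorator (the table spec
and schema variable are placeholders; dispositions may need adjusting):

    import com.google.api.services.bigquery.model.TableRow;
    import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
    import org.apache.beam.sdk.io.gcp.bigquery.TableDestination;
    import org.apache.beam.sdk.transforms.SerializableFunction;
    import org.apache.beam.sdk.transforms.windowing.IntervalWindow;
    import org.apache.beam.sdk.values.ValueInSingleWindow;
    import org.joda.time.format.DateTimeFormat;

    daily.apply(BigQueryIO.writeTableRows()
        .to(new SerializableFunction<ValueInSingleWindow<TableRow>,
                                     TableDestination>() {
          @Override
          public TableDestination apply(ValueInSingleWindow<TableRow> v) {
            IntervalWindow w = (IntervalWindow) v.getWindow();
            // "$yyyyMMdd" is BigQuery's partition decorator syntax.
            String day =
                DateTimeFormat.forPattern("yyyyMMdd").print(w.start());
            return new TableDestination(
                "my-project:my_dataset.my_table$" + day,
                "partition for " + day);
          }
        })
        .withSchema(schema)
        .withWriteDisposition(
            BigQueryIO.Write.WriteDisposition.WRITE_APPEND));

In principle that bounds the work to roughly one load per day partition,
which is why a count near 14,000 looks off.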
So the problem is you are running on Dataflow, and it's taking longer than
you think it should? If you provide the Dataflow job id we can help you
debug why it's taking 30 minutes. (and as an aside, if this turns into a
Dataflow debugging session we should move it off of the Beam list and onto
a Dataflow
Great work guys!
Etienne
On 11/09/2017 at 23:51, Mingmin Xu wrote:
Now it's merged to master. Thanks to everyone!
Mingmin
On Thu, Sep 7, 2017 at 10:09 AM, Ahmet Altay
wrote:
+1 Thanks to all contributors/reviewers!
On Thu, Sep 7, 2017 at 9:55 AM, Kai Jiang wrote:
+1 looking forward
Is there a way around this? My time for 13 GB is now close to 30
minutes, while it should be around 15 minutes.
Do I need to chunk the data into windows myself, and run them in
parallel?
chaim
On Sun, Sep 10, 2017 at 6:32 PM, Reuven Lax wrote:
> In that case I can say unequivocally that Dataflow (in batch
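On chunking manually - if you do want to fan the write out yourself, a
hedged sketch using Beam's Partition transform (the shard count, table
spec, and names are arbitrary placeholders):

    import com.google.api.services.bigquery.model.TableRow;
    import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
    import org.apache.beam.sdk.transforms.Partition;
    import org.apache.beam.sdk.values.PCollectionList;

    // Split the rows into 4 shards and attach an independent BigQuery
    // write to each, so the load stages can run in parallel.
    PCollectionList<TableRow> shards = rows.apply(
        Partition.of(4, new Partition.PartitionFn<TableRow>() {
          @Override
          public int partitionFor(TableRow row, int numPartitions) {
            // Mask the sign bit so the shard index is never negative.
            return (row.hashCode() & Integer.MAX_VALUE) % numPartitions;
          }
        }));
    for (int i = 0; i < shards.size(); i++) {
      shards.get(i).apply("WriteShard" + i,
          BigQueryIO.writeTableRows().to(tableSpec).withSchema(schema));
    }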