Hey Oliver, 

If you're experiencing an issue, I recommend posting to the BigQuery public 
issue tracker <https://code.google.com/p/google-bigquery/issues/list>, 
since an old thread like this probably won't have much activity, and the 
public issue tracker is a more responsive way to report an issue. 

Best wishes,

Nick

On Wednesday, September 9, 2015 at 10:32:08 AM UTC-4, Oliver Urs Lenz wrote:
>
> I can confirm that more than two years later, this is still an issue. :-(
>
> On Monday, May 6, 2013 at 10:28:27 AM UTC+2, Mike wrote:
>>
>> Great - thanks Arie. Any idea when this will be ready? Even a rough 
>> estimate would be appreciated, e.g. 1 month, 6 months, 1 year?
>>
>> On Friday, May 3, 2013 6:30:34 AM UTC+10, Arie Ozarov wrote:
>>>
>>>
>>>
>>> On Wednesday, May 1, 2013 3:39:06 PM UTC-7, Jason Collins wrote:
>>>>
>>>> On reflection, I suspect it has more to do with Map-Reduce task retries 
>>>> than some race condition.
>>>
>>> Correct. This is not an issue for backup/restore, but it is a known 
>>> issue for BigQuery imports.
>>> We plan to eliminate duplicates at the MR level. 
>>>
>>>
>>>> j
>>>>
>>>> On Tuesday, 30 April 2013 22:59:53 UTC-6, Jason Collins wrote:
>>>>>
>>>>> We have seen the same phenomenon. 
>>>>>
>>>>> It's likely due to some kind of race condition in the backup tool 
>>>>> itself, but is not a problem there because when restoring, one of the 
>>>>> dups will just overwrite the other. But it does become a problem once 
>>>>> ingested into BigQuery.
>>>>>
>>>>> j
>>>>>
>>>>> On Monday, 29 April 2013 20:10:34 UTC-6, Mike wrote:
>>>>>>
>>>>>> Hi there
>>>>>>
>>>>>> I've noticed there may be duplicate records in the Backup data that 
>>>>>> AppEngine produces.
>>>>>>
>>>>>> I can verify this because I'm loading the Backups into BigQuery. When 
>>>>>> I search one of my tables, I can see the duplicates:
>>>>>>
>>>>>> SELECT __key__.id AS X_id, COUNT(__key__.id) AS X_count, created
>>>>>> FROM [TableId]
>>>>>> GROUP BY X_id, created
>>>>>> HAVING X_count > 1
>>>>>> ORDER BY created DESC;
>>>>>>
>>>>>> This shows there are 5,807 duplicates in a table of ~2 million 
>>>>>> entries (~0.2%)
>>>>>>
>>>>>> I can give Google employees access to our BigQuery and Google Storage 
>>>>>> accounts if that helps track down the issue.
>>>>>>
>>>>>> Cheers
>>>>>> Mike
>>>>>>
>>>>>
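
For anyone who needs a stopgap, the behaviour described in the quoted messages (a restore lets one duplicate overwrite the other, while a BigQuery import keeps both) can be sketched in a few lines of Python. This is only an illustration under assumptions, not how the backup tool or the planned MR-level fix works: the rows are assumed to be plain dicts already read from the backup, and the names `find_duplicate_keys`, `dedupe_last_wins` and the `id` field are hypothetical.

```python
from collections import Counter


def find_duplicate_keys(rows, key_field="id"):
    """Return {key: count} for every key that appears more than once."""
    counts = Counter(row[key_field] for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


def dedupe_last_wins(rows, key_field="id"):
    """Keep one row per key; a later row overwrites an earlier one,
    which mimics what the thread says happens on restore."""
    by_key = {}
    for row in rows:
        by_key[row[key_field]] = row
    return list(by_key.values())


# Hypothetical sample data: the entity with id 1 was emitted twice.
rows = [
    {"id": 1, "created": "2013-04-29"},
    {"id": 2, "created": "2013-04-29"},
    {"id": 1, "created": "2013-04-29"},
]

duplicates = find_duplicate_keys(rows)   # keys seen more than once
unique_rows = dedupe_last_wins(rows)     # one row per key
```

Running something like this over the exported rows before loading them would avoid the duplicates in BigQuery, at the cost of holding one row per key in memory, so it is only practical for modest table sizes.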
