Hi,

In theory, it *should* just be our script writing to the output CSV file.

However, I wanted it to be robust - e.g. in case somebody spins up two copies 
of this script running concurrently.

I suppose the timing would have to be pretty unlucky to hit a race condition 
there, right?

As in, somebody would have to open the new file and write to it somewhere 
in between the check line (os.path.getsize) and the following line 
(writeheader).

However, you're saying the only way to be completely safe is some kind of file 
locking?
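
For reference, a rough sketch of what the locking route might look like. It assumes a POSIX system (the fcntl module is not available on Windows), Python 3, and that every writer takes the same advisory lock; the name append_rows_locked and its parameters are made up for illustration:

```python
import csv
import fcntl  # POSIX only; assumption: the script does not run on Windows
import os


def append_rows_locked(path, fieldnames, rows):
    """Append rows to a CSV, writing the header only if the file is empty.

    Hypothetical helper: the size check and the header write happen while
    holding an exclusive advisory lock, so cooperating writers cannot
    interleave between the two steps.
    """
    with open(path, "a", newline="") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until other holders release
        try:
            # Move to the end explicitly so tell() reflects the current size.
            f.seek(0, os.SEEK_END)
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            if f.tell() == 0:
                writer.writeheader()
            writer.writerows(rows)
            f.flush()  # push buffered data out before releasing the lock
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

The lock is advisory, so it only protects against other processes that also call flock on the same file; a program that ignores the lock can still write at any time.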

Another person (Zachary Ware) suggested using .tell() on the file as well - I 
suppose that's similar enough to using os.path.getsize(), right?

But basically, I can call .tell() or os.path.getsize() on the file to see if 
it's zero, and then just call writeheader() on the following line.
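
A minimal sketch of that check, assuming Python 3 and a csv.DictWriter; the function name append_rows and its parameters are hypothetical. It does nothing about the race discussed above, it only shows the check-then-write order:

```python
import csv
import os


def append_rows(path, fieldnames, rows):
    """Append rows to a CSV, writing the header first if the file is empty."""
    # 'a' creates the file if it does not exist and appends otherwise.
    with open(path, "a", newline="") as f:
        # Seek to the end explicitly; the position reported right after
        # opening in append mode is not guaranteed to be the file size.
        f.seek(0, os.SEEK_END)
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if f.tell() == 0:
            writer.writeheader()
        writer.writerows(rows)
```

Calling it twice on the same path should leave a single header line followed by the rows from both calls.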

In the future, we may move to storing results in something like SQLite or 
MongoDB and outputting a CSV directly from there.

Cheers,
Victor

On Wednesday, 30 October 2013 13:55:53 UTC+11, Joseph L. Casale  wrote:
> > Like Victor says, that opens him up to race conditions.
> 
> Slim chance, it's no more possible than it happening in the time try/except
> takes to recover an alternative procedure.
> 
> with open('in_file') as in_file, open('out_file', 'ab') as outfile_file:
>     if os.path.getsize('out_file'):
>         print('file not empty')
>     else:
>         #write header
>         print('file was empty')
> 
> And if that's still not acceptable (you did say new) then open the out_file
> 'r+' and seek and read to check for a header.
> 
> But if your file is not new and lacks a header, then what?
> 
> jlc

-- 
https://mail.python.org/mailman/listinfo/python-list