> On 17/3/10, cocoa-dev-requ...@lists.apple.com wrote:
> 
>> Do you mean "more than one application simultaneously on more than one 
>> physical computer over NFS/AFP/SMB" ?  Don't do that.
> 
> When did that become the official policy, Ben?

The short answer is in 10.6.

The longer answer is that there are two classes of problems, one for NFS and 
one for AFP/SMB.  

For NFS, the protocol does not reliably allow for clients to maintain their 
local file caches coherently with the server in real time.  Only the newest NFS 
servers even respect file cache coherency around byte range file locks **at 
all**.  Unfortunately, the latest protocol doesn't mesh well with SQLite or 
most other existing incrementally updated file formats.  Many deployed NFS 
servers and clients only provide "close to open" coherency.  File 
modification timestamps on NFS servers may (or may not) provide accuracy better 
than 1.0 seconds.  And so forth.  There's also no good way to perform a cache 
bypassing file read on OS X or selectively evict ranges of files from the local 
file cache by hand.  We churned on this for a while with various teams, and 
there wasn't a good solution for multiple machines simultaneously accessing an 
SQLite db file (or most other incrementally updated binary file formats).  By 
"good" I mean a solution that worked reliably and didn't make other more 
important things work less well.  

NFS is just not a free Oracle.  Software that wants real time distributed cache 
coherency needs to use IPC and manage the problem itself.  It is trivial to 
write a program that writes to a file on NFS and sends the same update to its 
clone on another machine via IPC and for its clone to verify that NFS cache 
coherency indeed fails regularly (e.g. file read bytes != IPC read bytes).  
This is what I mean by real time distributed cache coherency.
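
A minimal sketch of that test program, in Python for brevity.  The function 
names and the 12-byte offset/length header are my own assumptions for 
illustration, not anything from an Apple tool.  One machine calls 
publish_update(); its clone on the other machine calls check_update() with its 
own descriptor for the same file on the NFS mount.

```python
import os
import socket
import struct

def publish_update(fd, channel, payload):
    """Append payload to the shared file and send the same bytes over IPC."""
    offset = os.lseek(fd, 0, os.SEEK_END)
    os.write(fd, payload)
    os.fsync(fd)  # push the data toward the server before announcing it
    # Hypothetical wire format: 8-byte offset, 4-byte length, then the payload.
    channel.sendall(struct.pack("!QI", offset, len(payload)) + payload)

def recv_exact(channel, count):
    """Read exactly count bytes from the IPC channel."""
    chunks = []
    while count > 0:
        chunk = channel.recv(count)
        if not chunk:
            raise ConnectionError("peer closed the channel")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)

def check_update(fd, channel):
    """Return True if the file shows the same bytes the peer sent over IPC."""
    offset, length = struct.unpack("!QI", recv_exact(channel, 12))
    expected = recv_exact(channel, length)
    seen = os.pread(fd, length, offset)  # may be served from a stale local cache
    return seen == expected
```

On a local file system check_update() should always return True.  Across two 
NFS clients, a False result is the coherency failure described above.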

For AFP & SMB, the problem is different.  These FS do not support POSIX 
advisory byte range locks at all.  They only support mandatory locks.  
Consequently, they never cache data read from files with any existing locks at 
all.  No file caching means all the I/O is slow.  Painfully slow.  AFP over 
Airport without file caching is bad.  The I/O throughput on a file free of 
locks on AFP is close to 100x better than a file with a single byte locked that 
isn't even anywhere near where you are reading.  For nearly all Mac OS X 
customers (sadly not you), achieving a near 100x performance boost when 
accessing database files on an AFP or SMB mount (like their home directory in 
educational deployments) is pretty huge. 
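
For anyone who wants to see the "single byte locked" case, here is a rough 
sketch in Python using fcntl.lockf(), which takes POSIX byte range locks.  The 
helper names are mine and purely illustrative.  On a local file system the 
lock is advisory and the read is unaffected; the claim above is that on an AFP 
or SMB mount the same lock is enough to turn off client caching for the file.

```python
import fcntl
import os

def lock_one_byte(fd, offset):
    """Take a non-blocking exclusive byte range lock covering a single byte."""
    # fd must be open for writing to take an exclusive lock.
    fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, offset, os.SEEK_SET)

def unlock_one_byte(fd, offset):
    """Release the one-byte lock taken by lock_one_byte()."""
    fcntl.lockf(fd, fcntl.LOCK_UN, 1, offset, os.SEEK_SET)

def read_block(fd, length, offset):
    """Read from a region that is nowhere near the locked byte."""
    return os.pread(fd, length, offset)
```

Timing read_block() in a loop with and without the lock held, on the mount you 
care about, is one way to measure the difference on your own hardware.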

So we focused on making the experience that could work well work even better.  
10.6 is a significantly better network FS client: Apple applications like 
Mail no longer slam the servers with byte range lock requests all the time 
(good for NFS), and on AFP they also get to use local file caching.

To address both sets of problems on all network FS, we enforce a single 
exclusive lock on the server for the duration the application has the database 
open.  Closing the database connection (or logging out) allows another machine 
to take its turn.  This behavior was supposed to be opt in for Core Data apps, 
but on 10.6.2 it is not.
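
To illustrate the idea only (this is not how Core Data or our SQLite build 
implements it), here is a sketch in Python that holds one whole-file exclusive 
lock for as long as a connection is open.  The ExclusiveStore class and the 
".lock" sidecar file are assumptions made up for the example, and flock() 
semantics over a given network file system would need checking.

```python
import fcntl
import os
import sqlite3

class ExclusiveStore:
    """Hold one exclusive lock for as long as the database is open."""

    def __init__(self, db_path):
        # Hypothetical sidecar lock file next to the database.
        self._lock_fd = os.open(db_path + ".lock", os.O_RDWR | os.O_CREAT, 0o644)
        try:
            # Fail immediately if another opener already holds the lock.
            fcntl.flock(self._lock_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            os.close(self._lock_fd)
            raise
        try:
            self.conn = sqlite3.connect(db_path)
        except Exception:
            os.close(self._lock_fd)  # closing the descriptor drops the lock
            raise

    def close(self):
        """Close the connection, then let another opener take its turn."""
        self.conn.close()
        fcntl.flock(self._lock_fd, fcntl.LOCK_UN)
        os.close(self._lock_fd)
```

A second ExclusiveStore for the same path raises OSError until the first one 
is closed, which is the "take its turn" behavior described above.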

> I'm doing that with some success.  For the past three years, my 
> client's point of sale system has been connecting from five 
> machines to a single database on a Synology network drive over 
> afp.  I had to write some code to periodically get the latest 
> values from the database during idle time.  That was a little 
> complicated but it's working well now.

It can work technically on AFP.  However, the distributed cache coherency 
problem is avoided by these network FS because they don't do any file caching 
on files with locks.  Your server setup and networking hardware are pretty 
sophisticated compared to most, so the performance is adequate.  As an engineer, 
I wish AFP over VPN over Airport were the less common deployment scenario, but 
sadly it is not.

> There are mysterious but harmless optimistic locking errors once 
> in a while where no attributes appear to have changed -- just to 
> keep me on my toes, along with an occasional real data collision 
> (two staff members carelessly editing the same object) but we've 
> had no real issues in a year or so.

Those mysterious updates are probably version numbers being bumped because the 
object's participation in a to-many relationship, either outgoing or incoming, 
changed.

> However, 10.6.2 has a bug where only one machine can connect to 
> a Core Data store (it's likely a sqlite-level connection issue -- 
> but I'm not sure).

ADC did pass your bug report along to us, and it is a backward binary 
compatibility issue, and a regression from 10.6.0.  It will be fixed in a 
future update.  You'll get the 10.5 performance characteristics, however.

> So, for a while we were down to one 
> machine.  I eventually had to roll the five machines back to 
> Leopard.  That remains a PR nightmare.

I'm sorry about that.  You should follow up with ADC or Evangelism directly for 
status updates regarding your bug.

- Ben
