The lack of support for multiple X displays is not the real problem:
even with it we would not get a cross-platform solution that also covers
native Windows (i.e. without an X server) and Mac.
I can imagine a possible solution along the lines of separate processes
communicating over the network using something like the lyxserver protocol.
An ad-hoc outline of the communication: when someone presses a key or
makes a mouse action that would change the contents of the document, the
local program broadcasts its intention to change something. The other
instances confirm that they pass control and block local changes. Once
all confirmations have arrived, the actual change is made and broadcast
to all listeners. That is two full round trips per keystroke, which is
not nice but should work. Once that is set up, one could go on and
optimize things a bit...
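
To make the outline a bit more concrete, here is a minimal sketch in
Python. It is not LyX code and does not use the lyxserver protocol: the
peers are plain objects that call each other directly, which stands in
for the network. All names (Peer, connect, on_intent, on_change,
on_abort, type_text) are made up for illustration, and the abort step
for a refused request is my assumption, since the outline does not say
what happens when a confirmation is missing.

```python
class Peer:
    """One editor instance holding its own copy of the document."""

    def __init__(self, name):
        self.name = name
        self.text = ""
        self.locked_by = None  # name of the peer currently allowed to edit
        self.peers = []        # the other instances (stand-in for the network)

    # --- messages received from other instances ---

    def on_intent(self, sender):
        """Round trip 1: pass control to sender unless someone else has it."""
        if self.locked_by not in (None, sender):
            return False
        self.locked_by = sender  # local changes are blocked from now on
        return True

    def on_change(self, sender, pos, inserted):
        """Round trip 2: apply the change and release control."""
        self.text = self.text[:pos] + inserted + self.text[pos:]
        self.locked_by = None

    def on_abort(self, sender):
        """Assumed extra message: sender gave up, release its lock."""
        if self.locked_by == sender:
            self.locked_by = None

    # --- local user action ---

    def type_text(self, pos, inserted):
        """Try to insert text at pos; return True if the change went through."""
        if self.locked_by is not None:
            return False  # someone else is in the middle of a change
        self.locked_by = self.name
        # Round trip 1: broadcast the intention, collect confirmations.
        confirmations = [p.on_intent(self.name) for p in self.peers]
        if not all(confirmations):
            for p in self.peers:
                p.on_abort(self.name)
            self.locked_by = None
            return False
        # Round trip 2: make the change locally and broadcast it.
        self.on_change(self.name, pos, inserted)
        for p in self.peers:
            p.on_change(self.name, pos, inserted)
        return True


def connect(*peers):
    """Make every peer aware of all the others."""
    for p in peers:
        p.peers = [q for q in peers if q is not p]


if __name__ == "__main__":
    a, b = Peer("a"), Peer("b")
    connect(a, b)
    a.type_text(0, "Hello")
    b.type_text(5, " world")
    print(a.text == b.text, repr(a.text))
```

Each successful call to type_text costs exactly the two round trips
mentioned above. A real implementation would also need timeouts and a
tie-break for two instances announcing an intention at the same moment,
which this sketch leaves out.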
I don't think it is very complicated if one is allowed to use suitable
tools.
Or perhaps copy (parts of) the solution used by Google Wave, as the
protocol/API is open?
/Christian