More opxforms…
Originally shared by Ian Bicking
Huh, after getting some more hands-on experience with Operational Transformations (ala Etherpad) I’m a little confused by why the whole diamond-rewriting-composition part is as complicated as it is. The problem is out-of-order transformations being received, and everyone has to come to a consistent document. Transformations are rewritten to handle this conflict gracefully, given potentially quite a lot of these conflicts (like when two people are actively typing in a document).
But couldn’t we give every version a number, with all versions ordered? We assign the clients an order as well. Versions then are MAJOR.MINOR, where the minor version is the client’s order, and the major version is the increment of the last major version. Any client can then produce a version that fits into a global sequence of versions.
We’ll assume that while timing is unreliable, ultimately all messages get through. So then we have the case:
1. A client produces/shares a change, giving the previous document version and a new document version, and the diff that produces the change.
2. A client receives a change that works from the version it already has; it can then simply apply the diff.
3. A client receives a change that works from a previous version of the document (i.e., the client that produced the change hadn’t received some local change, or two entirely separate clients produced conflicting changes).
OK, so 3 is really the tricky part. What then do we do? Let’s say we have client A and B (ordered alphabetically). We start with document 6.A. Client A applies a change creating 7.A, we’ll call this change Patch1. It receives a change transforming 6.A into 7.B; we’ll call that Patch2. The new version will be 7.B (because it is the latest version). It sees that 7.A is therefore an obsolete change, and it creates 8.A that re-applies the change. Client B ultimately receives that patch from 6.A to 7.A and, using the same algorithm, can also produce 8.A.
Which I guess is the same as OT. The transformation of patches is that 6.A+Patch1=7.A, and 6.A+Patch2=7.B. We then define the operations:
6.A+Patch2+transpose(Patch1, Patch2)=8.A, 6.A+Patch1+transpose(Patch2, Path1)=8.A
transpose() is not too terribly hard a function to produce, though it will effect user experience when conflicts arise. It must be deterministic, and must satisfy this basic equivalence.
I can’t decide if the ordering of clients matters. I think it does? That is, I think without that ordering we may not be able to make those two orders of execution reliable when conflicts arise (i.e., the same edit has to win on both clients). Maybe this isn’t as hard as I thought?