On Thu, Oct 8, 2015 at 1:56 PM, Konstantin Boudnik <c...@apache.org> wrote:
> On Thu, Oct 08, 2015 at 12:46PM, Dmitriy Setrakyan wrote:
> > On Thu, Oct 8, 2015 at 12:28 PM, Konstantin Boudnik <c...@apache.org>
> > wrote:
> >
> > > This conversation reminds me of the situation with Spark and akka
> > > that I just ran into. Or rather with Akka and the way they designed
> > > the remote execution.
> > > The situation is actually _completely_ ridiculous. I stood up a
> > > small Spark cluster and then tried to submit a job into it, which
> > > had some Spark dependencies. The way the job is written it pulls
> > > the dependencies automatically from the maven repo. To my horror,
> > > the job was crashing because local and remote serialIDs of the
> > > classes differed, although the dependency versions were the same.
> > > The root cause is this: the versions are compiled with the same
> > > version of JDK (like JDK7) or something, but one is Open and the
> > > other one is Oracle's.
> > >
> > > I think this is a very shaky way of designing the software for
> > > distributed environments and it badly complicates the operation and
> > > integration of the clusters. It clearly shows the lack practical
> > > experience beyond the academic ivory towers on the account of Akka
> > > guys. RPC, while not without its own issues, allows to get around
> > > such problems with ease.
> > >
> > > I guess what I am saying: aren't we trying to find an even more
> > > complex solution for already pretty tough problem?
> > >
> >
> > I think that the problem you are describing is not the same. What we
> > are solving here is, for example, ability to run Ignite with IBM
> > WebSphere on the client side and OpenJDK on the server side.
> >
> > This issue has little to do with dependencies, and mostly with
> > removing a legacy restriction from the project about matching JDK
> > versions.
>
> The problem is the same: the use of dynamic dependencies just
> illustrates it clearly. Different JDKs are producing different
> serial.vers. of the classes and it will come and haunt you one way or
> another. The manifestation of the problem could be different, but you
> can count that the problem will be there for you on any heterogeneous
> cluster.
>

In the upcoming 1.5 release, this will only apply to compute grid and not
to data grid. I think we should print out a warning, but not disallow the
cluster startup, like we do now.

> Cos
>
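
To make the serial version point concrete, here is a minimal sketch in plain Java (it is not Ignite or Akka code, and the class names SerialUidDemo, Pinned and Unpinned are made up for illustration). When a Serializable class does not declare serialVersionUID, the runtime computes a default from the class details, and the Serializable javadoc warns that this default can vary with the compiler implementation. Declaring the field explicitly pins the value regardless of which JDK built the class.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerialUidDemo {
    // Hypothetical class with an explicit serialVersionUID: the value is
    // fixed in source and does not depend on how the class was compiled.
    static class Pinned implements Serializable {
        private static final long serialVersionUID = 1L;
        int value;
    }

    // Hypothetical class without an explicit serialVersionUID: the runtime
    // derives a default from the class structure, which may differ between
    // compiler implementations.
    static class Unpinned implements Serializable {
        int value;
    }

    // Returns the serialVersionUID the serialization runtime will use
    // for the given Serializable class.
    static long uidOf(Class<?> cls) {
        return ObjectStreamClass.lookup(cls).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("Pinned:   " + uidOf(Pinned.class));
        System.out.println("Unpinned: " + uidOf(Unpinned.class));
    }
}
```

Running this on both sides of a mixed cluster and comparing the printed value for Unpinned is one way to check whether two builds would disagree. Whether they actually do for a given pair of JDKs depends on the class and the compilers involved.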