Gary Dusbabek <gdusba...@gmail.com> writes:
>
> I looked into doing this when I was first learning the code and had an
> experience similar to yours. At the time there wasn't much interest
> in seeing it through to fruition, but maybe times have changed.

any lack of interest in solving these problems just means that people
haven't stumbled on these problems yet :-)

...but eventually they will (and people like Ran Tavory and the Hector
team have already stumbled across these hurdles and had to devote time
to creating some workarounds).

> If I were to attempt it again I would do it in this order:
> 1. Make the config customizable.

Would it be good enough if you had a CassandraConfig object and some
ways to create it? Either directly or through:

    CassandraConfig config = CassandraConfig.parseFile(...);

and then some:

    Cassandra cassandra = Cassandra.createInstance(config);

or even

    Cassandra cassandra = new Cassandra(config);

> 2. Make the services re-entrant (You should be able to start, stop,
> then start again without problems).

you mean restart an instance or be able to throw away your instance
and create a new one? for me, being able to restart a stopped instance
isn't really that important because it would work fine for me to create
a new instance (possibly with the same config, using the same
files/dirs and ports). you may have good reasons to be able to restart
a stopped Cassandra instance though. (But I suspect we more or less
want the same thing).

> 3. Get rid of the singletons. This will involve coming up with a
> smart way to couple instances of the services with each other.

indeed. but I hope nobody falls for the temptation of introducing
Spring or something similar to do the wiring in the Cassandra code.
(what people do in their own projects is their problem, but Cassandra
should not require you to adopt additional mammoth frameworks).

> 4. Integrate the storage port into how we canonically identify a node
> (it's just hostname now).

hmm, I see your point, but I am not sure I understand the consequences
fully.

> 5. While you're at it, figure out how to get JMX to bind to something
> other than 0.0.0.0.
> (I hear it is possible, see
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6425769)

I have limited experience with JMX so I'll pass on commenting on this.

>> there are other valid reasons for wanting to embed Cassandra besides
>> unit testing. for instance, if you are writing an application that
>> depends on Cassandra and you want the option of packaging it as a single
>> binary for single node experimentation, development and demo purposes.
>
> I'd kind of like to see this too, although I admit that from the
> pragmatic standpoint of running a Cassandra server, it represents a
> whole lot of change for what amounts to very little tangible benefit.

while the benefit may be hard to articulate, I think it is significant.
any time you can embed a "server" in your binary you can make life a
lot easier for casual users and for testing. almost all server projects
I have done in the past 7-8 years have been like this: I make it
possible to embed the server so that people can build and distribute
prototypes, or they can use the exact same binary to either use an
external (distributed) instance or just create an internal instance for
simpler use-cases (by config).

compare to Hudson. it is distributed as a WAR so you can load it into
your web server. but most people just want it up and running with as
little hassle as possible on a single node, so being able to fire it up
from the command line and rely on the embedded web server is very
attractive compared to fooling around with Jetty, Tomcat or worse. if
Hudson had required me to manually set up and manage a number of
services, I would probably not have bothered using it. (not sure if
that example is very clear, but hey... :-)

> From a development standpoint, the biggest benefit I see would be that
> we could write unit tests for small clusters that run on a single
> node.

yeah, it is critical for unit testing. right now we are forced to do
testing in a rather clumsy fashion.
it is a big step backward from, for instance, the way I do testing with
Apache Derby (which has hairy lifecycle management, but it is
embeddable).

> One interesting thing that this would make possible is the ability to
> have a node with >1 tokens in a single JVM. Useful, who knows? But
> it is interesting because I think it would make Cassandra more elastic
> (and could theoretically help with the hot-node problem when using
> OPP).

(there are some usage scenarios using OSGi to run multiple Cassandra
instances in the same JVM that come to mind, but I haven't really given
this a lot of (any) detailed thought)

-Bjørn
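
To make points 1-3 of the thread a bit more concrete, here is a rough sketch of what a singleton-free, embeddable API might look like. Everything in it is hypothetical: the CassandraConfig and Cassandra classes only borrow the names floated in the message above and do not exist in the Cassandra codebase, the fields (data directory, storage port) are assumptions, and start() and stop() only flip a flag instead of starting real services. The point is only that each instance carries its own state, so two can coexist in one JVM and a stopped one can be thrown away and recreated from the same config.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical immutable config holder; not an existing Cassandra class. */
final class CassandraConfig {
    final String dataDir;   // assumed setting, for illustration only
    final int storagePort;  // assumed setting, for illustration only

    CassandraConfig(String dataDir, int storagePort) {
        this.dataDir = Objects.requireNonNull(dataDir);
        this.storagePort = storagePort;
    }
}

/** Hypothetical embeddable instance: all state is per-object, no statics. */
final class Cassandra {
    private final CassandraConfig config;
    private final AtomicBoolean running = new AtomicBoolean(false);

    private Cassandra(CassandraConfig config) {
        this.config = config;
    }

    /** Factory in the style suggested in the thread. */
    static Cassandra createInstance(CassandraConfig config) {
        return new Cassandra(Objects.requireNonNull(config));
    }

    /** Placeholder for starting services; rejects a double start. */
    void start() {
        if (!running.compareAndSet(false, true)) {
            throw new IllegalStateException("instance already running");
        }
    }

    /** Placeholder for stopping services; safe to call more than once. */
    void stop() {
        running.set(false);
    }

    boolean isRunning() {
        return running.get();
    }

    CassandraConfig config() {
        return config;
    }
}

class EmbeddedDemo {
    public static void main(String[] args) {
        // Two independent instances in one JVM, distinguished by storage port.
        Cassandra a = Cassandra.createInstance(new CassandraConfig("/tmp/node-a", 7000));
        Cassandra b = Cassandra.createInstance(new CassandraConfig("/tmp/node-b", 7001));
        a.start();
        b.start();

        // Throw away a stopped instance and create a new one from the same config.
        a.stop();
        Cassandra a2 = Cassandra.createInstance(a.config());
        a2.start();

        System.out.println(a.isRunning() + " " + a2.isRunning() + " " + b.isRunning());
    }
}
```

In this sketch a stopped instance can also be started again, which covers the "re-entrant" reading of point 2, though a real implementation would have to release ports, threads and file handles for either approach to work.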