Hello, I am currently looking into the new Mahout (DRM) framework.
I find myself wondering why, on the one hand, so much thought, effort and design complexity is invested in abstracting engines, contexts and algebraic operations, while on the other hand even the abstract interfaces are defined in such a way that everything has to be read from or written to files (on HDFS).

I am considering implementing reading from and writing to a NoSQL database. Initially I assumed it would be enough to implement my own ReaderWriter, but I am now realizing that I would have to re-implement, or hack around by deriving my own versions of, large(?) portions of the framework, including my own variants of CheckpointedDrm, DistributedEngine and whatnot.

Is this because abstracting away the storage type would introduce even more complexity, or because there are aspects of the design that absolutely require reading from and writing to (seq)files only?

Kind regards,
Reinis
