Mark,

Thanks for your suggestion. It's really not a good idea to store one
file in multiple columns in one row; the heap space problem still
exists. I took your advice and stored the file across multiple rows,
and it works: I can even store a single 2 GB file.
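
For anyone reading this later, here is a rough sketch of the
row-per-chunk layout, not my actual code. It assumes a plain Python
dict standing in for the column family (row key -> columns); the
names store_file and read_file, the "<file_id>:<index>" key scheme,
and the metadata row are all made up for illustration. With a real
client, the dict assignments and lookups would become insert and get
calls.

```python
import io


def store_file(store, file_id, stream, chunk_size=1024 * 1024):
    """Write a stream into `store` as one row per chunk.

    `store` is a dict used as a stand-in for a column family
    (hypothetical layout: row key -> dict of columns).
    Returns the number of chunk rows written.
    """
    index = 0
    size = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # One row per chunk, so no single row grows with the file size.
        store[f"{file_id}:{index}"] = {"data": chunk}
        index += 1
        size += len(chunk)
    # A small metadata row tells readers how many chunk rows to fetch.
    store[file_id] = {"chunk_count": index, "size": size}
    return index


def read_file(store, file_id):
    """Yield the chunks of a stored file in order."""
    meta = store[file_id]
    for i in range(meta["chunk_count"]):
        yield store[f"{file_id}:{i}"]["data"]


if __name__ == "__main__":
    cf = {}
    payload = b"x" * 2500
    rows = store_file(cf, "demo", io.BytesIO(payload), chunk_size=1000)
    restored = b"".join(read_file(cf, "demo"))
    print(rows, restored == payload)
```

Only one chunk is held in memory at a time on both the write and the
read side, which is the point of splitting by row.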



On Mon, Apr 26, 2010 at 6:12 PM, Mark Robson <mar...@gmail.com> wrote:
> On 26 April 2010 00:57, Shuge Lee <shuge....@gmail.com> wrote:
>>
>> In Python:
>>
>> keyspace.columnfamily[key][column] = value
>>
>> files.video[uuid.uuid4()]['name'] = 'foo.flv'
>> files.video[uuid.uuid4()]['path'] = '/var/files/foo.flv'
>
> Hi.
> Storing the filename in the database will not solve the file storage
> problem. Cassandra is a distributed database, and a file stored locally will
> not be available on other client nodes.
> If you're using Cassandra at all, that probably implies that you have lots
> of client nodes. A non-redundant NFS server (for example) would not offer
> high availability, so would be inadequate for the OP's situation.
> Storing files *IN* Cassandra is very useful because you can then retrieve
> them from anywhere with high availability.
> However, as others have discussed, they should be split across multiple
> columns, or if very big, multiple rows.
> I prefer to split by row because this scales better to very large files.
> During compaction, as is well noted, Cassandra needs the entire row in
> memory, which will cause a FAIL once you have files more than a few gigs.
> Mark



-- 
Best Regards

Jeff Zhang
