Ning Li wrote:
1 is good. But for 2:
  - Won't it have a security concern as well? Or is this not a general
local cache?

A client-side RAM cache would be filled through the same security mechanisms as all other filesystem accesses.

  - You are referring to caching in RAM, not caching in local FS,
right? In general, a Lucene index size could be quite large. We may
have to cache a lot of data to reach a reasonable hit ratio...

Lucene on a local disk benefits significantly from the local filesystem's RAM cache (aka the kernel's buffer cache). HDFS has no such local RAM cache outside of the stream's buffer. The cache would need to be no larger than the kernel's buffer cache to get an equivalent hit ratio. And if you're accessing a remote index then you shouldn't also need a large buffer cache.
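To make the idea concrete, here is a minimal sketch of the kind of client-side RAM cache discussed above: a fixed-size LRU cache of file blocks sitting in front of remote reads. It is not Hadoop or Lucene code. `BlockCache`, `fetch_block`, the block size, and the capacity are all hypothetical names and values chosen for illustration, and `fetch_block` stands in for whatever positioned read the filesystem client provides.

```python
from collections import OrderedDict


class BlockCache:
    """LRU cache of fixed-size blocks, keyed by (path, block index).

    Illustrative only: fetch_block is a hypothetical callable
    (path, offset, length) -> bytes standing in for a remote read.
    """

    def __init__(self, fetch_block, block_size=64 * 1024,
                 capacity_bytes=64 * 1024 * 1024):
        self.fetch_block = fetch_block
        self.block_size = block_size
        self.max_blocks = max(1, capacity_bytes // block_size)
        self.blocks = OrderedDict()  # least recently used entry first
        self.hits = 0
        self.misses = 0

    def _block(self, path, index):
        key = (path, index)
        if key in self.blocks:
            self.blocks.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.blocks[key]
        self.misses += 1
        data = self.fetch_block(path, index * self.block_size, self.block_size)
        self.blocks[key] = data
        if len(self.blocks) > self.max_blocks:
            self.blocks.popitem(last=False)  # evict least recently used
        return data

    def read(self, path, offset, length):
        """Return up to `length` bytes starting at `offset`."""
        out = bytearray()
        end = offset + length
        while offset < end:
            index, start = divmod(offset, self.block_size)
            chunk = self._block(path, index)[start:start + (end - offset)]
            if not chunk:
                break  # read ran past the end of the file
            out += chunk
            offset += len(chunk)
        return bytes(out)
```

Tracking `hits` and `misses` makes it easy to compare the hit ratio at different capacities against what the kernel's buffer cache would give for the same access pattern on a local disk.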

Doug
