Since this has been a topic for a while, I will just throw out an idea to see how fast you guys can shoot it down.
A cache object is stored as a series of fragments. If we subdivided each fragment into "chunks", we could have 64 chunks per fragment and represent them with a bitmap in a single uint64_t. A set bit would indicate valid data in that chunk. Partial content would be written only in chunk units, and only for chunks that are completely covered by the incoming data. For the default fragment size of 1M, each chunk would be 16K, which seems a reasonable value.

The bitmaps would be stored along with the fragment offset table in the alternate info header. This would keep them out of the directory while making them available when serving, because the alternate data is loaded before that point. Range validity checks could also be done without additional disk I/O: you can't tell whether a range is valid for an object until the alternate is determined anyway, and by then the bitmaps have already been loaded with the alternate. We could either serve only if the requested range was completely covered, or generate a synthetic range request to cover the parts that were not already in the cache. This would mean that files smaller than one fragment would not have partial content cached, but I think that's acceptable as the advantages of partial caching are only for larger objects.
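
To make the bitmap arithmetic concrete, here is a rough C++ sketch of the per-fragment bookkeeping. It is only an illustration of the idea, not existing cache code: all of the names (`FRAGMENT_SIZE`, `chunk_mask`, `complete_chunks`, `touched_chunks`, `range_is_cached`) are made up for this example, offsets are assumed to be relative to the start of one fragment, and ranges are half-open `[begin, end)`.

```cpp
#include <cassert>
#include <cstdint>

// Assumed values from the proposal: 1M fragments, 64 chunks per fragment.
constexpr uint64_t FRAGMENT_SIZE       = 1ULL << 20;
constexpr unsigned CHUNKS_PER_FRAGMENT = 64;
constexpr uint64_t CHUNK_SIZE          = FRAGMENT_SIZE / CHUNKS_PER_FRAGMENT; // 16K

// Bitmap with bits first..last (inclusive) set.
inline uint64_t chunk_mask(unsigned first, unsigned last) {
  assert(first <= last && last < CHUNKS_PER_FRAGMENT);
  unsigned width = last - first + 1;
  // Shifting a 64-bit value by 64 is undefined, so handle the full mask separately.
  uint64_t bits = (width == 64) ? ~0ULL : ((1ULL << width) - 1);
  return bits << first;
}

// Chunks that lie completely inside [begin, end): the only ones safe to write.
inline uint64_t complete_chunks(uint64_t begin, uint64_t end) {
  assert(begin < end && end <= FRAGMENT_SIZE);
  uint64_t first     = (begin + CHUNK_SIZE - 1) / CHUNK_SIZE; // round up
  uint64_t last_excl = end / CHUNK_SIZE;                      // round down
  if (first >= last_excl) {
    return 0; // the data does not cover any whole chunk
  }
  return chunk_mask(static_cast<unsigned>(first), static_cast<unsigned>(last_excl - 1));
}

// Chunks that overlap [begin, end) at all: what a read of that range needs.
inline uint64_t touched_chunks(uint64_t begin, uint64_t end) {
  assert(begin < end && end <= FRAGMENT_SIZE);
  return chunk_mask(static_cast<unsigned>(begin / CHUNK_SIZE),
                    static_cast<unsigned>((end - 1) / CHUNK_SIZE));
}

// True if every chunk needed to serve [begin, end) is marked valid in the bitmap.
inline bool range_is_cached(uint64_t bitmap, uint64_t begin, uint64_t end) {
  uint64_t need = touched_chunks(begin, end);
  return (bitmap & need) == need;
}
```

On a write, the fragment's bitmap would be updated with `bitmap |= complete_chunks(begin, end)`. On a range request, `range_is_cached` needs only the bitmap from the alternate header, and `touched_chunks(begin, end) & ~bitmap` gives the chunks a synthetic range request would have to fetch.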