Ah, okay.
Then, I suppose that an approach between 1 and 2 makes some sense to me: add an
option to disable creating the marker on object deletion/removal. I think this
alone isn't the best solution but it seems to at least add a mode where
creating the marker is more controlled.
As an aside, ar
Hello Aldrin,
It's not either/or: the directory marker is created every time it is
necessary, for example when CreateDir() is called.
Regards
Antoine.
On 15/07/2024 at 19:20, Aldrin wrote:
Thanks Antoine!
Preserving the property across multiple clients (and presumably across
independent ses
And to clarify, by "other clients" I mean "other remote clients on other
systems concurrently accessing the same data."
I still think that many clients on a single system could use a local filesystem
to gate directory-based operations more efficiently (since a local filesystem
is optimized for t
Thanks Antoine!
Preserving the property across multiple clients (and presumably across
independent sessions of the same client) is the part that I was missing.
From the link you shared, I saw an AWS page discussing the use of folders in
the S3 console [1]. Their approach is to create the mark
No, because these markers also communicate the information to other
implementations of S3 abstractions.
An example of this is: https://docs.cyberduck.io/protocols/s3/#folders
Regards
Antoine.
On 13/07/2024 at 07:15, Aldrin wrote:
...then I still expect the directory /foo to exist
Rig
> ...then I still expect the directory /foo to exist
Right, but if that is the sole purpose of empty directory markers, I'm curious
if there was an attempt at keeping track of the prefixes/directories locally?
# --
# Aldrin
https://github.com/drin/
https://gitla
I wonder why S3 (object storage) operates based on file system semantics.
Python users are usually data scientists. They might not be familiar with
the differences between object storage and file storage. Furthermore, I
think there are a lot of pyarrow users.
> Avoiding file by file operations so
>I think my question is still relevant: no matter what semantics
`S3FileSystem` is trying to provide, I'm still not sure how the placeholder
object helps. I assume it's for listing objects, but what else?
If I have a local filesystem and I delete a file /foo/bar then I still
expect the directory /
But I think the issue being addressed [1] is essentially, "`delete_file`
shouldn't create additional files/directories in S3."
I think discussion about the semantics at large is interesting but may be a
digression? Also, I think there are varying degrees of "filesystem semantics"
that are even
Many people
are familiar with object stores these days. You could create a new
abstraction `ObjectStore` which is very similar to `FileSystem` except the
semantics are object store semantics and not filesystem semantics.
FWIW in the Arrow Rust ecosystem we only provide an object store
abstractio
> The markers are necessary to offer file system semantics on top of object
> stores. You will get a ton of subtle bugs otherwise.
Yes, object stores and filesystems are different. If you expect your
filesystem to act like a filesystem then these things need to be done in
order to avoid these bug
Hello!
This may be naive, but why does the empty directory marker need to exist on the
S3 side at all? If a local directory is created (because filesystem semantics),
then I am not sure why a fake object needs to exist on the object-store side.
# --
# Aldrin
h
Hi,
The markers are necessary to offer file system semantics on top of object
stores. You will get a ton of subtle bugs otherwise.
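
To illustrate the kind of bug meant here, a minimal sketch in which a plain
Python dict stands in for a bucket. The helper names `dir_exists` and
`delete_file` and the `keep_marker` flag are hypothetical, chosen only for
this example; they are not the Arrow or S3 API, and the real S3FileSystem
logic may differ.

```python
# Toy model only: a dict stands in for a bucket.
# None of these helpers are Arrow or S3 APIs (hypothetical names).

def dir_exists(bucket, path):
    """A 'directory' exists if any key (object or marker) starts with 'path/'."""
    prefix = path.rstrip("/") + "/"
    return any(key.startswith(prefix) for key in bucket)


def delete_file(bucket, key, keep_marker):
    """Delete an object; optionally write a zero-byte marker for its parent."""
    del bucket[key]
    parent = key.rpartition("/")[0]
    if keep_marker and parent:
        bucket[parent + "/"] = b""


# Same starting state, deleted with and without the marker.
without_marker = {"foo/bar": b"data"}
delete_file(without_marker, "foo/bar", keep_marker=False)

with_marker = {"foo/bar": b"data"}
delete_file(with_marker, "foo/bar", keep_marker=True)

print(dir_exists(without_marker, "foo"))  # False: /foo vanished with its last file
print(dir_exists(with_marker, "foo"))     # True: the marker keeps /foo visible
```

Without the marker, deleting the last object under a prefix makes the
"directory" disappear for every client that lists the bucket, which is not
what a filesystem user would expect.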
If instead of arrow::FileSystem, Arrow offered an arrow::ObjectStore
interface that wraps local filesystems and object stores with object-store
semantics (i.e. no con
Hello.
Thank you for your feedback!!
> In which situation does this make a sizable difference in number of
> requests?
The issue I am addressing does not completely resolve the problem, but
there is also the problem caused by *EnsureParentExists* as described in
[2].
*The 42,129 requests with t
Hi,
On 12/07/2024 at 12:21, Hyunseok Seo wrote:
*### Why Maintain Empty Directory Markers?*
From what I understand, object stores like S3 do not have a concept of
directories. The motivation behind maintaining these markers could be to
manage the object store as if it were a traditional fi
Hello, community!
I am currently working on addressing the issue described in [[C++] Add
option to not create parent directory with S3 delete_file](
https://github.com/apache/arrow/issues/36275). In this process, I have
found it necessary to gather feedback on how to best resolve this issue.
Below