[ceph-users] Re: How radosgw considers that the file upload is done?

2024-06-12 Thread Daniel Gryniewicz

On 6/12/24 5:43 AM, Szabo, Istvan (Agoda) wrote:

Hi,

I'm wondering how radosgw knows that a transaction is done and that the
connection between the client and the gateway didn't break.

For example, here is one request:

2024-06-12T16:26:03.386+0700 7fa34c7f0700  1 beast: 0x7fa5bc776750: 1.1.1.1 - - 
[2024-06-12T16:26:03.386063+0700] "PUT /bucket/0/2/966394.delta HTTP/1.1" 200 238 - 
"User-Agent: APN/1.0 Hortonworks/1.0 HDP/3.1.0.0-78, Hadoop 3.2.2, aws-sdk-java/1.11.563 
Linux/5.15.0-101-generic OpenJDK_64-Bit_Server_VM/11.0.18+10-post-Debian-1deb10u1 java/11.0.18 
scala/2.12.15 vendor/Debian com.amazonaws.services.s3.transfer.TransferManager/1.11.563" -
2024-06-12T16:26:03.386+0700 7fa4e9ffb700  1 == req done req=0x7fa5a4572750 
op status=0 http_status=200 latency=737ns ==

What I can see here is:
req done
op status=0

I guess that even if the connection between the client and the gateway broke,
the request would still be marked "req done", but what is "op status"? Is that
the value I'm actually looking for? Would it be different if the connection broke?

Thank you



op status will effectively be the error returned from the recv() system call.
So probably something like -ECONNRESET, which is -104.
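
If you want to pick those out of the beast log in bulk, something along these
lines works against the "req done" log format shown above (a rough sketch;
adjust the regex to your own log settings):

import re
import sys

# Scan an RGW log for "req done" lines and flag non-zero op status values,
# e.g. -104 (-ECONNRESET) when the client dropped the connection.
REQ_DONE = re.compile(r"req done.*op status=(-?\d+) http_status=(\d+)")

for line in sys.stdin:
    m = REQ_DONE.search(line)
    if not m:
        continue
    op_status, http_status = int(m.group(1)), int(m.group(2))
    if op_status != 0:
        print(f"op_status={op_status} http_status={http_status}: {line.strip()}")

Run it as e.g. python3 scan.py < radosgw.log (the file name is just an example).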


Daniel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph rgw zone create fails EINVAL

2024-06-26 Thread Daniel Gryniewicz

On 6/25/24 3:21 PM, Matthew Vernon wrote:

On 24/06/2024 21:18, Matthew Vernon wrote:

2024-06-24T17:33:26.880065+00:00 moss-be2001 ceph-mgr[129346]: [rgw 
ERROR root] Non-zero return from ['radosgw-admin', '-k', 
'/var/lib/ceph/mgr/ceph-moss-be2001.qvwcaq/keyring', '-n', 
'mgr.moss-be2001.qvwcaq', 'realm', 'pull', '--url', 
'https://apus.svc.eqiad.wmnet:443', '--access-key', 'REDACTED', 
'--secret', 'REDACTED', '--rgw-realm', 'apus']: request failed: (5) 
Input/output error


EIO is an odd sort of error [doesn't sound very network-y], and I 
don't think I see any corresponding request in the radosgw logs in the 
primary zone. From the CLI outside the container I can do e.g. curl 
https://apus.svc.eqiad.wmnet/ just fine, are there other things worth 
checking here? Could it matter that the mgr node isn't an rgw?


...the answer turned out to be "container image lacked the relevant CA 
details to validate the TLS of the other end".




Also, for the record, radosgw-admin logs do not end up in the same log 
file as RGW's logs.  Each invocation of radosgw-admin makes its own log
file for the run of that command.  (This is because radosgw-admin is 
really a stripped down version of RGW itself, and it does not 
communicate with the running RGWs, but connects to the Ceph cluster 
directly.)  They're generally small, and frequently empty, but should 
have error messages in them on failure.


Daniel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph rgw zone create fails EINVAL

2024-06-27 Thread Daniel Gryniewicz

I would guess that it probably does, but I don't know for sure.

Daniel

On 6/26/24 10:04 AM, Adam King wrote:
Interesting. Given this is coming from a radosgw-admin call being done 
from within the rgw mgr module, I wonder if a  radosgw-admin log file is 
ending up in the active mgr container when this happens.


On Wed, Jun 26, 2024 at 9:04 AM Daniel Gryniewicz <d...@redhat.com> wrote:


On 6/25/24 3:21 PM, Matthew Vernon wrote:
 > On 24/06/2024 21:18, Matthew Vernon wrote:
 >
 >> 2024-06-24T17:33:26.880065+00:00 moss-be2001 ceph-mgr[129346]: [rgw
 >> ERROR root] Non-zero return from ['radosgw-admin', '-k',
 >> '/var/lib/ceph/mgr/ceph-moss-be2001.qvwcaq/keyring', '-n',
 >> 'mgr.moss-be2001.qvwcaq', 'realm', 'pull', '--url',
 >> 'https://apus.svc.eqiad.wmnet:443', '--access-key', 'REDACTED',
 >> '--secret', 'REDACTED', '--rgw-realm', 'apus']: request failed: (5)
 >> Input/output error
 >>
 >> EIO is an odd sort of error [doesn't sound very network-y], and I
 >> don't think I see any corresponding request in the radosgw logs in the
 >> primary zone. From the CLI outside the container I can do e.g. curl
 >> https://apus.svc.eqiad.wmnet/ just fine, are there other things worth
 >> checking here? Could it matter that the mgr node isn't an rgw?
 >
 > ...the answer turned out to be "container image lacked the relevant CA
 > details to validate the TLS of the other end".
 >

Also, for the record, radosgw-admin logs do not end up in the same log
file as RGW's logs.  Each invocation of radosgw-admin makes its own log
file for the run of that command.  (This is because radosgw-admin is
really a stripped down version of RGW itself, and it does not
communicate with the running RGWs, but connects to the Ceph cluster
directly.)  They're generally small, and frequently empty, but should
have error messages in them on failure.

Daniel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: What is the specific meaning "total_time" in RGW ops log

2021-01-04 Thread Daniel Gryniewicz
total_time is calculated from the top of process_request() until the 
bottom of process_request().  I know that's not hugely helpful, but it's 
accurate.


This means it starts after the front-end passes the request off, and 
counts until after a response is sent to the client.  I'm not sure if it 
includes reading data from the client or not; some amount of metadata 
work at least has been done before total_time is started.
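
If it helps, here's a rough sketch for summarizing that field across many
requests. It assumes the ops log records are JSON, one per line, with a
total_time field (the field name comes from your mail; the exact record layout
and units depend on your version, so treat this as a starting point only):

import json
import statistics
import sys

# Collect total_time values from JSON-formatted RGW ops log records.
times = []
for line in sys.stdin:
    line = line.strip().rstrip(",")
    if not line or line in ("[", "]"):
        continue
    try:
        rec = json.loads(line)
    except json.JSONDecodeError:
        continue
    if "total_time" in rec:
        times.append(float(rec["total_time"]))

if times:
    print(f"requests={len(times)} min={min(times)} "
          f"median={statistics.median(times)} max={max(times)}")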


Daniel

On 12/23/20 10:50 PM, opengers wrote:

In other words, I want to figure out when "total_time" starts being measured
and when it ends.

On Thu, Dec 24, 2020 at 11:14 AM, opengers wrote:


Hello everyone. I enabled the rgw ops log by setting "rgw_enable_ops_log =
true", and there is a "total_time" field in the rgw ops log.

I want to figure out whether "total_time" includes the period of time during
which rgw returns the response to the client.


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: radosgw process crashes multiple times an hour

2021-01-28 Thread Daniel Gryniewicz
It looks like your radosgw is using a different version of librados.  In 
the backtrace, the top useful line begins:


librados::v14_2_0

when it should be v15.2.0, like the ceph::buffer in the same line.

Is there an old librados lying around that didn't get cleaned up somehow?
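
One quick way to check is to look at what the radosgw binary actually resolves
librados to; a small sketch (assumes a Linux box with ldd available and radosgw
in $PATH):

import shutil
import subprocess

# Show which librados/libradosgw shared objects the radosgw binary resolves to,
# so a stale copy left in the library path stands out.
radosgw = shutil.which("radosgw") or "/usr/bin/radosgw"
out = subprocess.run(["ldd", radosgw], capture_output=True, text=True, check=True)
for line in out.stdout.splitlines():
    if "librados" in line:
        print(line.strip())

Looking at /proc/<pid>/maps of the running radosgw process shows what is
actually loaded, which is even more direct.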

Daniel



On 1/28/21 7:27 AM, Andrei Mikhailovsky wrote:

Hello,

I am experiencing very frequent crashes of the radosgw service. It happens 
multiple times every hour. As an example, over the last 12 hours we've had 35 
crashes. Has anyone experienced similar behaviour of the radosgw octopus 
release service? More info below:

Radosgw service is running on two Ubuntu servers. I have tried upgrading OS on 
one of the servers to Ubuntu 20.04 with latest updates. The second server is 
still running Ubuntu 18.04. Both services crash occasionally, but the service 
which is running on Ubuntu 20.04 crashes far more often it seems. The ceph 
cluster itself is pretty old and was initially setup around 2013. The cluster 
was updated pretty regularly with every major release. Currently, I've got 
Octopus 15.2.8 running on all osd, mon, mgr and radosgw servers.

Crash Backtrace:

ceph crash info 
2021-01-28T11:36:48.912771Z_08f80efd-c0ad-4551-88ce-905ca9cd3aa8 |less
{
"backtrace": [
"(()+0x46210) [0x7f815a49a210]",
"(gsignal()+0xcb) [0x7f815a49a18b]",
"(abort()+0x12b) [0x7f815a479859]",
"(()+0x9e951) [0x7f8150ee9951]",
"(()+0xaa47c) [0x7f8150ef547c]",
"(()+0xaa4e7) [0x7f8150ef54e7]",
"(()+0xaa799) [0x7f8150ef5799]",
"(()+0x344ba) [0x7f815a1404ba]",
"(()+0x71e04) [0x7f815a17de04]",
"(librados::v14_2_0::IoCtx::nobjects_begin(librados::v14_2_0::ObjectCursor const&, 
ceph::buffer::v15_2_0::list const&)+0x5d) [0x7f815a18c7bd]",
"(RGWSI_RADOS::Pool::List::init(std::__cxx11::basic_string, 
std::allocator > const&, RGWAccessListFilter*)+0x115) [0x7f815b0d9935]",
"(RGWSI_SysObj_Core::pool_list_objects_init(rgw_pool const&, std::__cxx11::basic_string, 
std::allocator > const&, std::__cxx11::basic_string, std::allocator 
> const&, RGWSI_SysObj::Pool::ListCtx*)+0x255) [0x7f815abd7035]",
"(RGWSI_MetaBackend_SObj::list_init(RGWSI_MetaBackend::Context*, std::__cxx11::basic_string, std::allocator > const&)+0x206) [0x7f815b0ccfe6]",
"(RGWMetadataHandler_GenericMetaBE::list_keys_init(std::__cxx11::basic_string, std::allocator > const&, void**)+0x41) [0x7f815ad23201]",
"(RGWMetadataManager::list_keys_init(std::__cxx11::basic_string, 
std::allocator > const&, std::__cxx11::basic_string, 
std::allocator > const&, void**)+0x71) [0x7f815ad254d1]",
"(AsyncMetadataList::_send_request()+0x9b) [0x7f815b13c70b]",
"(RGWAsyncRadosProcessor::handle_request(RGWAsyncRadosRequest*)+0x25) 
[0x7f815ae60f25]",
"(RGWAsyncRadosProcessor::RGWWQ::_process(RGWAsyncRadosRequest*, 
ThreadPool::TPHandle&)+0x11) [0x7f815ae69401]",
"(ThreadPool::worker(ThreadPool::WorkThread*)+0x5bb) [0x7f81517b072b]",
"(ThreadPool::WorkThread::entry()+0x15) [0x7f81517b17f5]",
"(()+0x9609) [0x7f815130d609]",
"(clone()+0x43) [0x7f815a576293]"
],
"ceph_version": "15.2.8",
"crash_id": "2021-01-28T11:36:48.912771Z_08f80efd-c0ad-4551-88ce-905ca9cd3aa8",
"entity_name": "client.radosgw1.gateway",
"os_id": "ubuntu",
"os_name": "Ubuntu",
"os_version": "20.04.1 LTS (Focal Fossa)",
"os_version_id": "20.04",
"process_name": "radosgw",
"stack_sig": "347474f09a756104ac2bb99d80e0c1fba3e9dc6f26e4ef68fe55946c103b274a",
"timestamp": "2021-01-28T11:36:48.912771Z",
"utsname_hostname": "arh-ibstorage1-ib",
"utsname_machine": "x86_64",
"utsname_release": "5.4.0-64-generic",
"utsname_sysname": "Linux",
"utsname_version": "#72-Ubuntu SMP Fri Jan 15 10:27:54 UTC 2021"
}





radosgw.log file (file names were redacted):


-25> 2021-01-28T11:36:48.794+ 7f8043fff700 1 civetweb: 0x7f814c0cf010: 176.35.173.88 - - 
[28/Jan/2021:11:36:48 +] "PUT /-u115134.JPG HTTP/1.1" 400 460 - -
-24> 2021-01-28T11:36:48.814+ 7f80437fe700 1 == starting new request 
req=0x7f80437f5780 =
-23> 2021-01-28T11:36:48.814+ 7f80437fe700 2 req 5169 0s initializing for 
trans_id = tx01431-006012a1d0-31197b5c-default
-22> 2021-01-28T11:36:48.814+ 7f80437fe700 2 req 5169 0s getting op 1
-21> 2021-01-28T11:36:48.814+ 7f80437fe700 2 req 5169 0s s3:put_obj 
verifying requester
-20> 2021-01-28T11:36:48.814+ 7f80437fe700 2 req 5169 0s s3:put_obj 
normalizing buckets and tenants
-19> 2021-01-28T11:36:48.814+ 7f80437fe700 2 req 5169 0s s3:put_obj init 
permissions
-18> 2021-01-28T11:36:48.814+ 7f80437fe700 0 req 5169 0s NOTICE: invalid 
dest placement: default-placement/REDUCED_REDUNDANCY
-17> 2021-01-28T11:36:48.814+ 7f80437fe700 1 op->ERRORHANDLER: err_no=-22 
new_err_no=-22
-16> 2021-01-28T11:36:48.814+ 7f80437fe700 2 req 5169 0s s3:put_obj op 
status=0
-15> 2021-01-28T11:36:48.814+ 7f80437fe700 2 req 5169 0s s3:put_obj http 
status=400
-14> 2021-01-28T11:36:48.814+ 7f80437fe700 1 == req done 
req=0x7f80437f5780 op status=0 http_status=400 latency=0s ==
-13> 2021-01-28T11:36:48.822+000

[ceph-users] Re: NFS version 4.0

2021-02-04 Thread Daniel Gryniewicz
The preference for 4.1 and later is because 4.0 has a much less useful 
graceful restart (which is used for HA/failover as well).  Ganesha 
itself supports 4.0 perfectly fine, and it should work fine with Ceph, 
but HA setups will be much more difficult, and will be limited in 
functionality.


Daniel

On 2/4/21 3:27 AM, Jens Hyllegaard (Soft Design A/S) wrote:

Hi.

We are trying to set up an NFS server using ceph which needs to be accessed by 
an IBM System i.
As far as I can tell the IBM System i only supports nfs v. 4.
Looking at the nfs-ganesha deployments it seems that these only support 4.1 or 
4.2. I have tried editing the configuration file to support 4.0 and it seems to 
work.
Is there a reason that it currently only supports 4.1 and 4.2?

I can of course edit the configuration file, but I would have to do that after 
any deployment or upgrade of the nfs servers.

Regards

Jens Hyllegaard
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: NFS Ganesha NFSv3

2021-02-24 Thread Daniel Gryniewicz
I've never used cephadm, sorry.  When I last ran containerized Ganesha, 
I was using docker directly.


Daniel

On 9/23/20 1:58 PM, Gabriel Medve wrote:

Hi

Thanks for the reply.

cephadm runs ceph containers automatically. How do I set privileged mode
on a ceph container?


--


El 23/9/20 a las 13:24, Daniel Gryniewicz escribió:
NFSv3 needs privileges to connect to the portmapper.  Try running 
your docker container in privileged mode, and see if that helps.


Daniel

On 9/23/20 11:42 AM, Gabriel Medve wrote:

Hi,

I have Ceph 15.2.5 running in Docker, and I configured nfs-ganesha
with NFS version 3, but I cannot mount it.
If I configure Ganesha with NFS version 4 I can mount it without
problems, but I need version 3.


The error is mount.nfs: Protocol not supported

Can you help me?

Thanks.


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: NFS Ganesha NFSv3

2021-02-24 Thread Daniel Gryniewicz
I'm not sure what to add.  NFSv3 uses a random port for the server, and 
uses a service named portmapper so that clients can find the port of the 
server.  Connecting to the portmapper requires a privileged container. 
With docker, this is done with the --privileged option.  I don't know 
how to do it with other systems.


NFSv4 doesn't use the portmapper; it uses a well-known port, so it can be
run in a non-privileged container just fine, as long as you're not using
FSAL VFS (which needs a privileged container for another reason).


Daniel

On 2/23/21 9:56 PM, louis_...@outlook.com wrote:

Hi Daniel,

Can you give me more details? I have the same issue using an NFSv3 mount.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to sizing nfs-ganesha.

2021-03-22 Thread Daniel Gryniewicz

Hi.

Unfortunately, there isn't a good guide for sizing Ganesha.  It's pretty
lightweight, and the machines it needs are generally smaller than
what Ceph needs, so you probably won't have much of a problem.


The scaling of Ganesha is in 2 factors, based on the workload involved: 
the CPU usage scales with the number of clients, and the memory scales 
with the size of the working set of files.


The number of clients is controlled by the RPC_Max_Connections parameter 
(default 1024), and the max number of parallel operations is controlled 
by the RPC_Ioq_ThrdMax parameter, both in the NFS_CORE_PARAM block.  You 
probably won't need to change these, unless you have a *lot* of clients. 
 With these settings, any decent server-class multi-core CPU can keep 
up with demand.


Memory usage is controlled by several parameters, controlling 2 separate 
caches: the handle cache, and the dirent cache.


The handle cache is global to the machine, and there is one handle per 
file/directory/symlink/etc.  When this cache is full, entries in it will 
start to be re-used, and performance will degrade.  This is controlled 
by the Entries_HWMark parameter in the MDCACHE block, and defaults to 
100,000.  This is deliberately sized low, so that small deployments can 
be made in containers or on small systems.  If you have a large data 
set, you will definitely need to raise this.  Memory per handle is 
fairly small, in the mid 10s of k, so this can be raised a lot.  The 
handle cache is the largest user of memory on a Ganesha system.


The dirent cache is per-directory, and it makes a very large difference 
in directory listing performance.  Dirents are stored in chunks of 1000, 
and the number of chunks saved per directory is controlled by the 
Dir_Chunk parameter in the MDCACHE block.  It defaults to 128, which is 
again low.  If you have large directories that are listed commonly (or 
are listed by multiple clients at once) you probably want to raise it. 
Dirent memory is generally small, dominated by the size of the filename 
in the dirent, but keep in mind the dirent cache is per-directory, so if 
you have lots of directories in your working set, this could take a 
significant amount of memory.


Ganesha does not do any data caching, only metadata caching, so its
memory doesn't need to scale with the amount of data being used.
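
To put very rough numbers on the above, here is a back-of-envelope sketch; the
per-handle and per-dirent byte counts are my own assumptions (the handle figure
is just "mid 10s of k"), so substitute your own:

# Back-of-envelope estimate of Ganesha metadata cache memory.
# All byte counts below are assumptions, not measured values.
HANDLE_BYTES = 30 * 1024   # "mid 10s of k" per cached handle (assumption)
DIRENT_BYTES = 512         # rough per-dirent cost, dominated by filename (assumption)

entries_hwmark = 500_000   # MDCACHE.Entries_HWMark you plan to set
dir_chunk = 128            # MDCACHE.Dir_Chunk (each chunk holds 1000 dirents)
hot_directories = 100      # directories in the working set

handle_cache = entries_hwmark * HANDLE_BYTES
dirent_cache = hot_directories * dir_chunk * 1000 * DIRENT_BYTES  # upper bound

print(f"handle cache ~ {handle_cache / 2**30:.1f} GiB")
print(f"dirent cache ~ {dirent_cache / 2**30:.1f} GiB (if every chunk fills)")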


In general, Ganesha is a lightweight daemon, since it's primarily a 
translator, and it will use far fewer resources than the equivalent
CephFS MDS or RGW serving the same workload.


Daniel

On 3/20/21 5:01 AM, Quang Lê wrote:

Hi guys,

I'm using manila-openstack to provide a filesystem service with a
CephFS backend. My design uses nfs-ganesha as the gateway that the VMs in
OpenStack use to mount CephFS. I am having problems with sizing the
ganesha servers.

Can anyone suggest what the hardware requirements of a ganesha
server are, or what parameters need to be considered when sizing a
ganesha server?

My simple topology: https://i.imgur.com/xrYqxAh.png

Thank you guys.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Pacific unable to configure NFS-Ganesha

2021-04-05 Thread Daniel Gryniewicz
In order to enable NFS via Ganesha, you will need either an RGW or a 
CephFS.  Within the context of a Ceph deployment, Ganesha cannot export
anything of its own; it just exports either RGW or CephFS.


Daniel

On 4/5/21 1:43 PM, Robert Sander wrote:

Hi,

I have a test cluster now running on Pacific with the cephadm
orchestrator and upstream container images.

In the Dashboard on the services tab I created a new service for NFS.
The containers got deployed.

But when I go to the NFS tab and try to create a new NFS share the
Dashboard only returns a 500 error:

Apr 05 19:38:49 ceph01 bash[35064]: debug 2021-04-05T17:38:49.146+ 
7f64468d1700  0 [dashboard ERROR exception] Internal Server Error
Apr 05 19:38:49 ceph01 bash[35064]: Traceback (most recent call last):
Apr 05 19:38:49 ceph01 bash[35064]:   File 
"/usr/share/ceph/mgr/dashboard/services/exception.py", line 46, in 
dashboard_exception_handler
Apr 05 19:38:49 ceph01 bash[35064]: return handler(*args, **kwargs)
Apr 05 19:38:49 ceph01 bash[35064]:   File 
"/lib/python3.6/site-packages/cherrypy/_cpdispatch.py", line 54, in __call__
Apr 05 19:38:49 ceph01 bash[35064]: return self.callable(*self.args, 
**self.kwargs)
Apr 05 19:38:49 ceph01 bash[35064]:   File 
"/usr/share/ceph/mgr/dashboard/controllers/__init__.py", line 694, in inner
Apr 05 19:38:49 ceph01 bash[35064]: ret = func(*args, **kwargs)
Apr 05 19:38:49 ceph01 bash[35064]:   File 
"/usr/share/ceph/mgr/dashboard/controllers/nfsganesha.py", line 265, in fsals
Apr 05 19:38:49 ceph01 bash[35064]: return Ganesha.fsals_available()
Apr 05 19:38:49 ceph01 bash[35064]:   File 
"/usr/share/ceph/mgr/dashboard/services/ganesha.py", line 154, in 
fsals_available
Apr 05 19:38:49 ceph01 bash[35064]: if 
RgwClient.admin_instance().is_service_online() and \
Apr 05 19:38:49 ceph01 bash[35064]:   File 
"/usr/share/ceph/mgr/dashboard/services/rgw_client.py", line 301, in 
admin_instance
Apr 05 19:38:49 ceph01 bash[35064]: return 
RgwClient.instance(daemon_name=daemon_name)
Apr 05 19:38:49 ceph01 bash[35064]:   File 
"/usr/share/ceph/mgr/dashboard/services/rgw_client.py", line 241, in instance
Apr 05 19:38:49 ceph01 bash[35064]: RgwClient._daemons = _get_daemons()
Apr 05 19:38:49 ceph01 bash[35064]:   File 
"/usr/share/ceph/mgr/dashboard/services/rgw_client.py", line 53, in _get_daemons
Apr 05 19:38:49 ceph01 bash[35064]: raise NoRgwDaemonsException
Apr 05 19:38:49 ceph01 bash[35064]: 
dashboard.services.rgw_client.NoRgwDaemonsException: No RGW service is running.
Apr 05 19:38:49 ceph01 bash[35064]: debug 2021-04-05T17:38:49.150+ 
7f64468d1700  0 [dashboard ERROR request] [:::10.0.44.42:39898] [GET] [500] 
[0.030s] [admin] [513.0B] /ui-api/nfs-ganesha/fsals
Apr 05 19:38:49 ceph01 bash[35064]: debug 2021-04-05T17:38:49.150+ 7f64468d1700  0 [dashboard ERROR request] [b'{"status": 
"500 Internal Server Error", "detail": "The server encountered an unexpected condition which prevented it from 
fulfilling the request.", "request_id": "e89b8519-352f-4e44-a364-6e6faf9dc533"}

']

I have no radosgateways in that cluster (currently). There are the pools
for radosgw (.rgw.root etc) but no running instance.

Regards


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RGW segmentation fault on Pacific 16.2.1 with multipart upload

2021-05-13 Thread Daniel Gryniewicz

This tracker:
https://tracker.ceph.com/issues/50556

and this PR:
https://github.com/ceph/ceph/pull/41288

Daniel

On 5/12/21 7:00 AM, Daniel Iwan wrote:

Hi
I have started to see segfaults during multipart upload to one of the
buckets
File is about 60MB in size
Upload of the same file to a brand new bucket works OK

Command used
aws --profile=tester --endpoint=$HOST_S3_API --region="" s3 cp
./pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack
s3://tester-bucket/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack

For some reason log shows upload to  tester-bucket-2 ???
Bucket tester-bucket-2 is owned by the same user TESTER.

I'm using Ceph 16.2.1 (recently upgraded from Octopus).
Installed with cephadm in Docker
OS Ubuntu 18.04.5 LTS

Logs show as below

May 11 11:00:46 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:46.891+ 7ffb0e25e700  1 == starting new request
req=0x7ffa8e15d620 =
May 11 11:00:46 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:46.907+ 7ffb0b258700  1 == req done
req=0x7ffa8e15d620 op status=0 http_status=200 latency=0.011999841s ==
May 11 11:00:46 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:46.907+ 7ffb0b258700  1 beast: 0x7ffa8e15d620:
11.1.150.14 - TESTER [11/May/2021:11:00:46.891 +] "POST
/tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploads
HTTP/1.1" 200 296 - "aws-cli/2.1.23 Python/3.7.3
Linux/4.19.128-microsoft-standard exe/x86_64.ubuntu.18 p
May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:47.055+ 7ffb09254700  1 == starting new request
req=0x7ffa8e15d620 =
May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:47.355+ 7ffb51ae5700  1 == starting new request
req=0x7ffa8e0dc620 =
May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:47.355+ 7ffb4eadf700  1 == starting new request
req=0x7ffa8e05b620 =
May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:47.355+ 7ffb46acf700  1 == starting new request
req=0x7ffa8df59620 =
May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:47.355+ 7ffb44acb700  1 == starting new request
req=0x7ffa8ded8620 =
May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:47.355+ 7ffb3dabd700  1 == starting new request
req=0x7ffa8dfda620 =
May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:47.359+ 7ffb1d27c700  1 == starting new request
req=0x7ffa8de57620 =
May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:47.359+ 7ffb22a87700  1 == starting new request
req=0x7ffa8ddd6620 =
May 11 11:00:48 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:48.275+ 7ffb2d29c700  1 == req done
req=0x7ffa8e15d620 op status=0 http_status=200 latency=1.219983697s ==
May 11 11:00:48 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:48.275+ 7ffb2d29c700  1 beast: 0x7ffa8e15d620:
11.1.150.14 - TESTER [11/May/2021:11:00:47.055 +] "PUT
/tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF&partNumber=8
HTTP/1.1" 200 2485288 - "aws-cli/2.1.23 Python/3.7.3 Linux
May 11 11:00:54 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:54.695+ 7ffad89f3700  1 == req done
req=0x7ffa8ddd6620 op status=0 http_status=200 latency=7.335902214s ==
May 11 11:00:54 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:54.695+ 7ffad89f3700  1 beast: 0x7ffa8ddd6620:
11.1.150.14 - TESTER [11/May/2021:11:00:47.359 +] "PUT
/tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF&partNumber=6
HTTP/1.1" 200 8388608 - "aws-cli/2.1.23 Python/3.7.3 Linux
May 11 11:00:56 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:56.871+ 7ffb11a65700  1 == req done
req=0x7ffa8e0dc620 op status=0 http_status=200 latency=9.515872955s ==
May 11 11:00:56 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:56.871+ 7ffb11a65700  1 beast: 0x7ffa8e0dc620:
11.1.150.14 - TESTER [11/May/2021:11:00:47.355 +] "PUT
/tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF&partNumber=7
HTTP/1.1" 200 8388608 - "aws-cli/2.1.23 Python/3.7.3 Linux
May 11 11:00:59 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:59.491+ 7ffac89d3700  1 == req done
req=0x7ffa8dfda620 op status=0 http_status=200 latency=12.135838509s ==
May 11 11:00:59 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:59.491+ 7ffac89d3700  1 beast: 0x7ffa8dfda620:
11.1.150.14 - TESTER [11/May/2021:11:00:47.355 +] "PUT
/tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF&partNumber=2
HTTP/1.1" 200 8388608 - "aws-cli/2.1.23 Python/3.7.3 Linux
May 11 11:01:02 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:01:02.891+ 7ffb68312700  1 == req done
req=0x7ffa8e05b62

[ceph-users] Re: Pacific: RadosGW crashing on multipart uploads.

2021-07-01 Thread Daniel Gryniewicz

That's this one:

https://github.com/ceph/ceph/pull/41893

Daniel

On 6/29/21 5:35 PM, Chu, Vincent wrote:

Hi, I'm running into an issue with RadosGW where multipart uploads crash, but 
only on buckets with a hyphen, period or underscore in the bucket name and with 
a bucket policy applied. We've tested this in pacific 16.2.3 and pacific 16.2.4.


Anyone run into this before?


ubuntu@ubuntu:~/ubuntu$ aws --endpoint http://placeholder.com:7480 s3 cp 
ubuntu.iso s3://bucket.test

upload failed: ./ubuntu.iso to s3://bucket.test/ubuntu.iso Connection was closed before 
we received a valid response from endpoint URL: 
"http://placeholder.com:7480/bucket.test/ubuntu.iso?uploads";.



Here is the crash log.

-12> 2021-06-29T20:44:10.940+ 7fae1f4ec700  1 == starting new 
request req=0x7fadf8998620 =
-11> 2021-06-29T20:44:10.940+ 7fae1f4ec700  2 req 2403 0.0s 
initializing for trans_id = tx00963-0060db861a-17e77ee-default
-10> 2021-06-29T20:44:10.940+ 7fae1f4ec700  2 req 2403 0.0s 
getting op 4
 -9> 2021-06-29T20:44:10.940+ 7fae1f4ec700  2 req 2403 0.0s 
s3:init_multipart verifying requester
 -8> 2021-06-29T20:44:10.948+ 7fae1f4ec700  2 req 2403 0.008000608s 
s3:init_multipart normalizing buckets and tenants
 -7> 2021-06-29T20:44:10.948+ 7fae1f4ec700  2 req 2403 0.008000608s 
s3:init_multipart init permissions
 -6> 2021-06-29T20:44:10.954+ 7faedf66c700  0 Supplied principal is 
discarded: arn:aws:iam::default:user
 -5> 2021-06-29T20:44:10.954+ 7faedf66c700  2 req 2403 0.014001064s 
s3:init_multipart recalculating target
 -4> 2021-06-29T20:44:10.954+ 7faedf66c700  2 req 2403 0.014001064s 
s3:init_multipart reading permissions
 -3> 2021-06-29T20:44:10.954+ 7faedf66c700  2 req 2403 0.014001064s 
s3:init_multipart init op
 -2> 2021-06-29T20:44:10.954+ 7faedf66c700  2 req 2403 0.014001064s 
s3:init_multipart verifying op mask
 -1> 2021-06-29T20:44:10.955+ 7faedf66c700  2 req 2403 0.015001140s 
s3:init_multipart verifying op permissions
  0> 2021-06-29T20:44:10.964+ 7faedf66c700 -1 *** Caught signal 
(Segmentation fault) **
  in thread 7faedf66c700 thread_name:radosgw

  ceph version 16.2.3 (381b476cb3900f9a92eb95d03b4850b953cfd79a) pacific 
(stable)
  1: /lib64/libpthread.so.0(+0x12b20) [0x7faf2dd05b20]
  2: (rgw_bucket::rgw_bucket(rgw_bucket const&)+0x23) [0x7faf38b4d083]
  3: (rgw::sal::RGWObject::get_obj() const+0x20) [0x7faf38b7bcf0]
  4: (RGWInitMultipart::verify_permission(optional_yield)+0x6c) [0x7faf38e6608c]
  5: (rgw_process_authenticated(RGWHandler_REST*, RGWOp*&, RGWRequest*, 
req_state*, optional_yield, bool)+0x86a) [0x7faf38b2db1a]
  6: (process_request(rgw::sal::RGWRadosStore*, RGWREST*, RGWRequest*, std::__cxx11::basic_string, std::allocator > const&, rgw::auth::StrategyRegistry const&, RGWRestfulIO*, 
OpsLogSocket*, optional_yield, rgw::dmclock::Scheduler*, std::__cxx11::basic_string, 
std::allocator >*, std::chrono::duration >*, int*)+0x26dd) 
[0x7faf38b3232d]
  7: /lib64/libradosgw.so.2(+0x4a1c0b) [0x7faf38a83c0b]
  8: /lib64/libradosgw.so.2(+0x4a36a4) [0x7faf38a856a4]
  9: /lib64/libradosgw.so.2(+0x4a390e) [0x7faf38a8590e]
  10: make_fcontext()
  NOTE: a copy of the executable, or `objdump -rdS ` is needed to 
interpret this.




--

Vincent Chu

A-4: Advanced Research in Cyber Systems

Los Alamos National Laboratory
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Radosgw bucket listing limited to 10001 object ?

2021-07-20 Thread Daniel Gryniewicz
That's probably this one: https://tracker.ceph.com/issues/49892  Looks 
like we forgot to mark it for backport.  I've done that now, so it 
should be in the next Pacific.


Daniel

On 7/20/21 11:28 AM, [AR] Guillaume CephML wrote:

Hi all,

Context :
We are moving a customer users/buckets/objects from another Object 
storage to Ceph.
This customer has 2 users: "test" and "prod"; the "test" user has
53069 buckets, the “prod” user has 285291 buckets.
Ceph is in 16.2.5 (installed in 16.2.4 and upgraded via cephadm to 
16.2.5).

We copied all buckets/objects for the “test" user.
I want to list buckets for this user to check if there is no missing bucket.
I tried many thing :
- S3 API (via awscli or boto3 lib): returns only 1000 buckets
- Ceph dashboard: show only 10001 buckets
- radosgw-admin --rgw-realm=THE_REALM bucket list: returns 10001 buckets

I did not find an option to allow pagination for this or anything about this 
limit in the documentation (I may have missed it).

Do you know how can I get the full bucket list ?

Thank you,
—
Guillaume


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Radosgw bucket listing limited to 10001 object ?

2021-07-21 Thread Daniel Gryniewicz



On 7/20/21 5:23 PM, [AR] Guillaume CephML wrote:

Hello,


On 20 Jul 2021, at 17:48, Daniel Gryniewicz  wrote:
That's probably this one: https://tracker.ceph.com/issues/49892  Looks like we 
forgot to mark it for backport.  I've done that now, so it should be in the 
next Pacific.


I’m not sure it is related, as it seems the fix is about listing objects inside 
buckets not about the bucket list itself (but I started reading Ceph code quite 
recently, so I’m not sure I understand it well for now ;-)).

Do you have an ETA for next Pacific release ?



I don't have an ETA, sorry.  That's above my pay grade.

Daniel

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: nfs and showmount

2021-10-04 Thread Daniel Gryniewicz
showmount uses the MNT protocol, which is only part of NFSv3.  NFSv4 
mounts a pseudoroot, under which actual exports are exposed, so the 
NFSv4 equivalent is to mount /, and then list it.
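
For example, a minimal sketch of that "mount / and list it" approach (run as
root; the server name is a placeholder):

import os
import subprocess
import tempfile

SERVER = "ganesha.example.com"   # placeholder for your Ganesha host

# Mount the NFSv4 pseudoroot and list the exports visible under it.
mnt = tempfile.mkdtemp(prefix="nfs4-pseudo-")
subprocess.run(["mount", "-t", "nfs", "-o", "vers=4.1", f"{SERVER}:/", mnt], check=True)
try:
    print("exports under the pseudoroot:", os.listdir(mnt))
finally:
    subprocess.run(["umount", mnt], check=True)
    os.rmdir(mnt)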


In general, NFSv4 should be used in preference to NFSv3 whenever possible.

Daniel

On 10/4/21 9:10 AM, Fyodor Ustinov wrote:

Hi!

Yes. You're right. Ganesha does. But Ceph doesn't use all of Ganesha's
functionality.
In the Ceph dashboard there is no way to enable NFSv3, only NFSv4.

- Original Message -

From: "Marc" 
To: "Fyodor Ustinov" 
Cc: "ceph-users" 
Sent: Monday, 4 October, 2021 15:33:43
Subject: RE: nfs and showmount



Afaik ceph uses nfs-ganesha, and ganesha supports nfs 3 and 4 and other
protocols.


Hi!

I think ceph only supports NFSv4?


- Original Message -

Sent: Monday, 4 October, 2021 12:44:38
Subject: RE: nfs and showmount



I can remember asking the same some time ago. I think it has to do with the
version of nfs you are using.


-Original Message-
From: Fyodor Ustinov 
Sent: Monday, 4 October 2021 11:32
To: ceph-users 
Subject: [ceph-users] nfs and showmount

Hi!

As I understand it - the built-in NFS server does not support the
command "showmount -e"?

WBR,
 Fyodor.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Etag change of a parent object

2023-12-13 Thread Daniel Gryniewicz



On 12/13/23 05:27, Janne Johansson wrote:

On Wed, 13 Dec 2023 at 10:57, Rok Jaklič wrote:


Hi,

shouldn't the etag of a "parent" object change when "child" objects are added
in S3?

Example:
1. I add an object to test bucket: "example/" - size 0
 "example/" has an etag XYZ1
2. I add an object to test bucket: "example/test1.txt" - size 12
 "example/test1.txt" has an etag XYZ2
 "example/" has an etag XYZ1 ... should this change?

I understand that object storage is not hierarchical by design and that objects
are "not connected" by anything other than the bucket name.



So, if they are not connected, then the first 0-sized object did not change.
If it doesn't change, it should not have a different ETag. Simple as that.



Expanding a bit: etag is related to the contents of the object only.  It 
has nothing to do with any other object.  As such, it will never change.
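
A quick way to convince yourself with boto3 (a sketch; the endpoint and bucket
are placeholders):

import boto3

# The ETag of the zero-byte "example/" object depends only on its own content,
# so adding "example/test1.txt" does not change it.
s3 = boto3.client("s3", endpoint_url="http://rgw.example.com:8080")  # placeholder
bucket = "test"

s3.put_object(Bucket=bucket, Key="example/", Body=b"")
before = s3.head_object(Bucket=bucket, Key="example/")["ETag"]

s3.put_object(Bucket=bucket, Key="example/test1.txt", Body=b"hello world\n")
after = s3.head_object(Bucket=bucket, Key="example/")["ETag"]

print(before, after, "unchanged" if before == after else "changed")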


Daniel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [EXTERNAL] Re: RGW Bucket Notifications and MultiPart Uploads

2022-07-20 Thread Daniel Gryniewicz
Seems like the notification for a multipart upload should look different 
to a normal upload?


Daniel

On 7/20/22 08:53, Yehuda Sadeh-Weinraub wrote:

Can maybe leverage one of the other calls to check for upload completion:
list multipart uploads and/or list parts. The latter should work if you
have the upload id at hand.

Yehuda

On Wed, Jul 20, 2022, 8:40 AM Casey Bodley  wrote:


On Wed, Jul 20, 2022 at 12:57 AM Yuval Lifshitz 
wrote:


yes, that would work. you would get a "404" until the object is fully
uploaded.


just note that you won't always get 404 before multipart complete,
because multipart uploads can overwrite existing objects
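
For reference, a minimal boto3 sketch of what Yehuda suggests (bucket, key and
endpoint are placeholders; and as Casey notes, a 404 on the key itself is only
meaningful if the object didn't already exist):

import boto3

s3 = boto3.client("s3", endpoint_url="http://rgw.example.com:8080")  # placeholder
bucket, key = "my-bucket", "big/object.bin"                          # placeholders

# An upload is complete once it no longer shows up in the bucket's
# in-progress multipart uploads.
resp = s3.list_multipart_uploads(Bucket=bucket, Prefix=key)
in_progress = [u for u in resp.get("Uploads", []) if u["Key"] == key]

if in_progress:
    upload_id = in_progress[0]["UploadId"]
    parts = s3.list_parts(Bucket=bucket, Key=key, UploadId=upload_id)
    print(f"still uploading, {len(parts.get('Parts', []))} parts so far")
else:
    print("no in-progress upload for this key")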

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: binary file cannot execute in cephfs directory

2022-08-23 Thread Daniel Gryniewicz

Does the mount have the "noexec" option on it?
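
A quick way to check, for example (pass the directory you copied the binary into):

import os
import sys

# Print the mount options of the filesystem holding the given path and
# report whether noexec is set.
path = os.path.realpath(sys.argv[1])
best = None
with open("/proc/mounts") as f:
    for line in f:
        _, mountpoint, fstype, options, *_ = line.split()
        if path.startswith(mountpoint) and (best is None or len(mountpoint) > len(best[0])):
            best = (mountpoint, fstype, options)

mountpoint, fstype, options = best
print(f"{path} is on {mountpoint} ({fstype}): {options}")
print("noexec IS set" if "noexec" in options.split(",") else "noexec is not set")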

Daniel

On 8/22/22 21:02, zxcs wrote:

In case someone is missing the picture, here is the text copied below:


ld@***ceph dir**$ ls -lrth
total 13M
-rwxr-xr-x 1 ld ld 13M Nov 29 2021 cmake-3.22
lrwxrwxrwx 1 ld ld 10 Jul 26 10:03 cmake -> cmake-3.22
-rwxrwxr-x 1 ld ld 25 Aug 19 15:52 test.sh


ld@***ceph dir**$ ./cmake-3.22
bash: ./cmake-3.22: Permission denied



On Aug 23, 2022, at 08:57, zxcs wrote:

Hi, experts,


We are using CephFS 15.2.13. After mounting ceph on one node, I copied a binary
into the ceph dir, see below (cmake-3.22 is a binary),

but when I run `./cmake-3.22` it reports permission denied. Why? The file has "x"
permission, and "ld" is the owner of the binary file.

Could anyone please help explain what is going on here? Thanks a ton!!!



Thanks

Xiong
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: s3 compatible interface

2023-03-01 Thread Daniel Gryniewicz
We're actually writing this for RGW right now.  It'll be a bit before 
it's productized, but it's in the works.


Daniel

On 2/28/23 14:13, Fox, Kevin M wrote:

Minio no longer lets you read / write from the posix side. Only through minio 
itself. :(

Haven't found a replacement yet. If you do, please let me know.

Thanks,
Kevin


From: Robert Sander 
Sent: Tuesday, February 28, 2023 9:37 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: s3 compatible interface



On 28.02.23 16:31, Marc wrote:


Anyone know of an S3-compatible interface that I can just run, which reads/writes
files from a local file system and not from object storage?


Have a look at Minio:

https://min.io/product/overview#architecture

Regards
--
Robert Sander
Heinlein Support GmbH
Linux: Akademie - Support - Hosting
http://www.heinlein-support.de/

Tel: 030-405051-43
Fax: 030-405051-19

Mandatory disclosures per §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Managing director: Peer Heinlein  -- Registered office: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Minimum client version for Quincy

2023-03-03 Thread Daniel Gryniewicz
I can't speak for RBD, but for RGW, as long as you upgrade all the RGWs 
themselves, clients will be fine, since they speak S3 to the RGWs, not 
RADOS.


Daniel

On 3/3/23 04:29, Massimo Sgaravatto wrote:

Dear all
I am going to update a ceph cluster (where I am using only rbd and rgw,
i.e. I didn't deploy cephfs) from Octopus to Quincy

Before doing that I would like to understand if some old nautilus clients
(that I can't update for several reasons) will still be able to connect

In general: I am not able to find this information in the documentation of
any ceph release

Should I refer to get-require-min-compat-client ?

Now in my Octopus cluster I see:

[root@ceph-mon-01 ~]# ceph osd get-require-min-compat-client
luminous


but I have the feeling that this value is simply the one I set a while ago
to support the upmap feature

Thanks, Massimo
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: s3 compatible interface

2023-03-06 Thread Daniel Gryniewicz

On 3/3/23 13:53, Kai Stian Olstad wrote:

On Wed, Mar 01, 2023 at 08:39:56AM -0500, Daniel Gryniewicz wrote:
We're actually writing this for RGW right now.  It'll be a bit before 
it's productized, but it's in the works.


Just curious, what is the use cases for this feature?
S3 against CephFS?



Local FS for development use, and distributed FS (initial target is
GPFS) for production.  There are no current plans to make it work against
CephFS, although I would imagine it will work fine.  But if you have a 
Ceph cluster, you're much better off using standard RGW on RADOS.


Daniel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: s3 compatible interface

2023-03-06 Thread Daniel Gryniewicz
As far as I know, we have no plans to productize (or even test) on 
CephFS.  It should work, but CephFS isn't pure POSIX, so there may be 
issues.


Daniel

On 3/6/23 11:57, Fox, Kevin M wrote:

+1. If I know radosgw on top of cephfs is a thing, I may change some plans. Is 
that the planned route?

Thanks,
Kevin


From: Daniel Gryniewicz 
Sent: Monday, March 6, 2023 6:21 AM
To: Kai Stian Olstad
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: s3 compatible interface



On 3/3/23 13:53, Kai Stian Olstad wrote:

On Wed, Mar 01, 2023 at 08:39:56AM -0500, Daniel Gryniewicz wrote:

We're actually writing this for RGW right now.  It'll be a bit before
it's productized, but it's in the works.


Just curious, what is the use cases for this feature?
S3 against CephFS?



Local FS for development use, and distributed FS (initial target is
GPFS) for production.  There are no current plans to make it work against
CephFS, although I would imagine it will work fine.  But if you have a
Ceph cluster, you're much better off using standard RGW on RADOS.

Daniel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: s3 compatible interface

2023-03-22 Thread Daniel Gryniewicz
Yes, the POSIXDriver will support that.  If you want NFS access, we'd 
suggest you use Ganesha's FSAL_RGW to access through RGW (because 
multipart uploads are not fun), but it will work.


Daniel

On 3/21/23 15:48, Fox, Kevin M wrote:

Will either the file store or the posix/gpfs filter support the underlying 
files changing underneath so you can access the files either through s3 or by 
other out of band means (smb, nfs, etc)?

Thanks,
Kevin


From: Matt Benjamin 
Sent: Monday, March 20, 2023 5:27 PM
To: Chris MacNaughton
Cc: ceph-users@ceph.io; Kyle Bader
Subject: [ceph-users] Re: s3 compatible interface



Hi Chris,

This looks useful.  Note for this thread:  this *looks like* it's using the
zipper dbstore backend?  Yes, that's coming in Reef.  We think of dbstore
as mostly the zipper reference driver, but it can be useful as a standalone
setup, potentially.

But there's now a prototype of a posix file filter that can be stacked on
dbstore (or rados, I guess)--not yet merged, and iiuc post-Reef.  That's
the project Daniel was describing.  The posix/gpfs filter is aiming for
being thin and fast and horizontally scalable.

The s3gw project that Clyso and folks were writing about is distinct from
both of these.  I *think* it's truthful to say that s3gw is its own
thing--a hybrid backing store with objects in files, but also metadata
atomicity from an embedded db--plus interesting orchestration.

Matt

On Mon, Mar 20, 2023 at 3:45 PM Chris MacNaughton <
chris.macnaugh...@canonical.com> wrote:


On 3/20/23 12:02, Frank Schilder wrote:

Hi Marc,

I'm also interested in an S3 service that uses a file system as a back-end. I looked at the
documentation of https://github.com/aquarist-labs/s3gw and have to say that it doesn't make much
sense to me. I don't see this kind of gateway anywhere there. What I see is a build of a rados
gateway that can be pointed at a ceph cluster. That's not a gateway to an FS.

Did I misunderstand your actual request or can you point me to the part of the 
documentation where it says how to spin up an S3 interface using a file system 
for user data?

The only thing I found is https://s3gw-docs.readthedocs.io/en/latest/helm-charts/#local-storage,
but it sounds to me that this is not where the user data will be going.

Thanks for any hints and best regards,


for testing you can try: https://github.com/aquarist-labs/s3gw

Yes indeed, that looks like it can be used with a simple fs backend.

Hey,

(Re-sending this email from a mailing-list subscribed email)

I was playing around with RadosGW's file backend (coming in Reef, zipper)
a few months back and ended up making this docker container that just works
to setup things:
https://github.com/ChrisMacNaughton/ceph-rgw-docker; published (still,
maybe for a while?) at https://hub.docker.com/r/iceyec/ceph-rgw-zipper

Chris
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




--

Matt Benjamin

[ceph-users] Re: Can I delete rgw log entries?

2023-04-20 Thread Daniel Gryniewicz

On 4/20/23 10:38, Casey Bodley wrote:

On Sun, Apr 16, 2023 at 11:47 PM Richard Bade  wrote:


Hi Everyone,
I've been having trouble finding an answer to this question. Basically
I'm wanting to know if stuff in the .log pool is actively used for
anything or if it's just logs that can be deleted.
In particular I was wondering about sync logs.
In my particular situation I have had some tests of zone sync setup,
but now I've removed the secondary zone and pools. My primary zone is
filled with thousands of logs like this:
data_log.71
data.full-sync.index.e2cf2c3e-7870-4fc4-8ab9-d78a17263b4f.47
meta.full-sync.index.7
datalog.sync-status.shard.e2cf2c3e-7870-4fc4-8ab9-d78a17263b4f.13
bucket.sync-status.f3113d30-ecd3-4873-8537-aa006e54b884:{bucketname}:default.623958784.455

I assume that because I'm not doing any sync anymore I can delete all
the sync related logs? Is anyone able to confirm this?


yes


What about if the sync is running? Are these being written and read
from and therefore must be left alone?


right. while a multisite configuration is operating, the replication
logs will be trimmed in the background. in addition to the replication
logs, the log pool also contains sync status objects. these track the
progress of replication, and removing those objects would generally
cause sync to start over from the beginning


It seems like these are more of a status than just a log and that
deleting them might confuse the sync process. If so, does that mean
that the log pool is not just output that can be removed as needed?
Are there perhaps other things in there that need to stay?


the log pool is used by several subsystems like multisite sync,
garbage collection, bucket notifications, and lifecycle. those
features won't work reliably if you delete their rados objects



Also, to be clear (in case you were confused), these logs are not data 
to be read by admins (like "log files") but structured data that 
represents changes to be used by syncing (like "log structured 
filesystem").  So deleting logs while sync is running will break sync.


Daniel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Deleting millions of objects

2023-05-17 Thread Daniel Gryniewicz

multi delete is inherently limited to 1000 per operation by AWS S3:

https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html

This is a hard-coded limit in RGW as well, currently.  You will need to 
batch your deletes in groups of 1000.  radosgw-admin has a 
"--purge-objects" option to "bucket rm", if you want to delete whole 
buckets at a time.
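
If you're scripting it yourself, the batching is straightforward with boto3,
for example (endpoint, bucket and prefix are placeholders):

import boto3

s3 = boto3.client("s3", endpoint_url="http://rgw.example.com:8080")  # placeholder
bucket, prefix = "archive", "veeam/"                                 # placeholders

# Delete everything under the prefix in batches of 1000, the per-request
# cap for multi-object delete.
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket, Prefix=prefix,
                               PaginationConfig={"PageSize": 1000}):
    objects = [{"Key": o["Key"]} for o in page.get("Contents", [])]
    if not objects:
        continue
    resp = s3.delete_objects(Bucket=bucket, Delete={"Objects": objects, "Quiet": True})
    print(f"deleted {len(objects)}, errors: {len(resp.get('Errors', []))}")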


Daniel

On 5/17/23 05:58, Robert Hish wrote:


I think this is capped at 1000 by the config setting. I've used the aws
and s3cmd clients to delete more than 1000 objects at a time and it
works even with the config setting capped at 1000. But it is a bit slow.

#> ceph config help rgw_delete_multi_obj_max_num

rgw_delete_multi_obj_max_num - Max number of objects in a single multi-
object delete request
   (int, advanced)
   Default: 1000
   Can update at runtime: true
   Services: [rgw]

On Wed, 2023-05-17 at 10:51 +0200, Rok Jaklič wrote:

Hi,

I would like to delete millions of objects in RGW instance with:
mc rm --recursive --force ceph/archive/veeam

but it seems it allows only 1000 (or 1002 exactly) removals per
command.

How can I delete/remove all objects with some prefix?

Kind regards,
Rok
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Deleting millions of objects

2023-05-18 Thread Daniel Gryniewicz
Since 1000 is the hard coded limit in AWS, maybe you need to set 
something on the client as well?  "client.rgw" should work for setting 
the config in RGW.


Daniel

On 5/18/23 03:01, Rok Jaklič wrote:

Thx for the input.

I tried several config sets e.g.:
ceph config set client.radosgw.mon2 rgw_delete_multi_obj_max_num 1
ceph config set client.radosgw.mon1 rgw_delete_multi_obj_max_num 1
ceph config set client.rgw rgw_delete_multi_obj_max_num 1

where client.radosgw.mon2 is the same as in ceph.conf but without success.

It also seems from
https://github.com/ceph/ceph/blob/8c4f52415bddba65e654f3a4f7ba37d98446d202/src/rgw/rgw_op.cc#L7131
that it should check config setting, but for some reason it is not working.

---

For now I ended up with spawning up to 100 background processes (more than
that it fills up our FE queue and we get response timeouts) with:
mc rm --recursive --force ceph/archive/veeam &

Regards,
Rok

On Thu, May 18, 2023 at 3:47 AM Szabo, Istvan (Agoda) <
istvan.sz...@agoda.com> wrote:


If it works I'd be amazed. We have this slow and limited delete issue
also. What we've done is run multiple deletes on the same bucket from
multiple servers via s3cmd.

Istvan Szabo
Staff Infrastructure Engineer
---
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com
---

On 2023. May 17., at 20:14, Joachim Kraftmayer - ceph ambassador <
joachim.kraftma...@clyso.com> wrote:



Hi Rok,

try this:


rgw_delete_multi_obj_max_num - Max number of objects in a single
multi-object delete request
  (int, advanced)
  Default: 1000
  Can update at runtime: true
  Services: [rgw]


config set   


WHO: client. or client.rgw

KEY: rgw_delete_multi_obj_max_num

VALUE: 1

Regards, Joachim

___
ceph ambassador DACH
ceph consultant since 2012

Clyso GmbH - Premier Ceph Foundation Member

https://www.clyso.com/

Am 17.05.23 um 14:24 schrieb Rok Jaklič:

thx.


I tried with:

ceph config set mon rgw_delete_multi_obj_max_num 1

ceph config set client rgw_delete_multi_obj_max_num 1

ceph config set global rgw_delete_multi_obj_max_num 1


but still only 1000 objects get deleted.


Is the target something different?


On Wed, May 17, 2023 at 11:58 AM Robert Hish 

wrote:


I think this is capped at 1000 by the config setting. Ive used the aws

and s3cmd clients to delete more than 1000 objects at a time and it

works even with the config setting capped at 1000. But it is a bit slow.


#> ceph config help rgw_delete_multi_obj_max_num


rgw_delete_multi_obj_max_num - Max number of objects in a single multi-

object delete request

   (int, advanced)

   Default: 1000

   Can update at runtime: true

   Services: [rgw]


On Wed, 2023-05-17 at 10:51 +0200, Rok Jaklič wrote:

Hi,


I would like to delete millions of objects in RGW instance with:

mc rm --recursive --force ceph/archive/veeam


but it seems it allows only 1000 (or 1002 exactly) removals per

command.


How can I delete/remove all objects with some prefix?


Kind regards,

Rok

___

ceph-users mailing list -- ceph-users@ceph.io

To unsubscribe send an email to ceph-users-le...@ceph.io


___

ceph-users mailing list -- ceph-users@ceph.io

To unsubscribe send an email to ceph-users-le...@ceph.io


___

ceph-users mailing list -- ceph-users@ceph.io

To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: NFS Ganesha Active Active Question

2021-11-01 Thread Daniel Gryniewicz
You can fail from one running Ganesha to another, using something like 
ctdb or pacemaker/corosync.  This is how some other clustered 
filesysytem (e.g. Gluster) use Ganesha.  This is not how the Ceph 
community has decided to implement HA with Ganesha, so it will be a more 
manual setup for you, but it can be done.


Daniel

On 10/31/21 1:47 PM, Xiaolong Jiang wrote:

Hi Maged

Yeah, it requires cloud integration to quickly fail over the IP.  For
me, I probably need to have a standby server, and once I detect that the
instance is dead, ask cephadm to schedule Ganesha there and attach the
IP to the new server.



On Oct 31, 2021, at 10:40 AM, Maged Mokhtar  wrote:



Hi Xiaolong

The grace period is 90 sec, the failover process should be automated 
and should run quicker than this, maybe like 15-30 sec ( not too quick 
to avoid false alarms ), this will make client io resume after a small 
pause.


/Maged

On 31/10/2021 17:37, Xiaolong Jiang wrote:

Hi Maged ,

Thank you for the response. That helps a lot!

Looks like I have to spin up a new server quickly and float the ip to 
the new server. If I spin up the server after about 20 mins, I guess 
IO will recover after that but the previous state will be gone since 
it passed the grace period?



On Oct 31, 2021, at 4:51 AM, Maged Mokhtar  wrote:




On 31/10/2021 05:29, Xiaolong Jiang wrote:

Hi Experts.

I am a bit confused about ganesha active-active setup.

We can set up multiple ganesha servers on top of cephfs and clients 
can point to different ganesh server to serve the traffic. that can 
scale out the traffic.


From client side, is it using DNS round robin directly connecting 
to ganesha server ?
Is it possible to front all ganesha servers with a load balancer so the 
client only connects to the load balancer IP and writes can be load 
balanced across all ganesha servers?


My current feeling is we probably have to use DNS way and specific 
client read/write request can only go to same ganesha server for 
the session.


--
Best regards,
Xiaolong Jiang

Senior Software Engineer at Netflix
Columbia University

___
Dev mailing list --d...@ceph.io
To unsubscribe send an email todev-le...@ceph.io



Load balancing ganesha means some clients are being served by a 
gateway and other clients by other gateways, so we distribute the 
clients and their load on the different gateways but each client 
remains on a specific gateway, you cannot have a single client load 
balance on several gateways.


A good way to distribute clients on the gateways is via round robin 
DNS, but you do not have to; you can distribute IPs manually among your 
clients if you want, but DNS automates the process in a scalable way.
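
Something like this in the zone (names and addresses are just an example)
is enough for round robin:

   nfs.example.com.  300  IN  A  192.0.2.11
   nfs.example.com.  300  IN  A  192.0.2.12
   nfs.example.com.  300  IN  A  192.0.2.13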


One note about high availability, currently you cannot failover 
clients to another ganesha gateway in case of failure, but if you 
bring the failed gateway back online quickly enough, the client 
connections will resume. So to support HA in case of a host server 
failure, the ganesha gateways are implemented as containers so you 
can start the failed container on a new host server.


/Maged



___
Dev mailing list -- d...@ceph.io
To unsubscribe send an email to dev-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rgw crash when use swift api

2022-05-31 Thread Daniel Gryniewicz
This is caused by an object that does not yet have a bucket associated 
with it.  It doesn't happen in S3, because S3 doesn't set_atomic() that 
early, and it's fixed on main by the objctx removal (which is too 
complicated for backport).  Can you open a tracker for this, so that we 
can get a fix into quincy (and older versions, which probably have this 
issue too)?


Daniel

On 5/19/22 20:47, zhou-jie...@gmo.jp wrote:

Hi

My ceph version is 17.2.0 also rgw is the same version.
When I use the swift API, if I set X-Container-Meta-Web-Index: on a container
and then access it, it will cause rgw to crash.
This is the log :

#swift post -m 'web-index:index.html' okamura-static-web
# curl http:// 
*/swift/v1/AUTH_9c4a3213fc124afcb369c3b8246cc543/okamura-static-web/ -v
*   Trying *:80...
* TCP_NODELAY set
* Connected to dev-swift-vip.i1.dev.v6.internal-gmo (*) port 80 (#0)

GET /swift/v1/AUTH_9c4a3213fc124afcb369c3b8246cc543/okamura-static-web/ HTTP/1.1
Host: dev-swift-vip.i1.dev.v6.internal-gmo
User-Agent: curl/7.68.0
Accept: */*


* Empty reply from server
* Connection #0 to host dev-swift-vip.i1.dev.v6.internal-gmo left intact
curl: (52) Empty reply from server

The first time it returns an empty reply;
the next request gets connection refused.

#curl http://*** -v
*   Trying ***:80...
* TCP_NODELAY set
* connect to 172.22.35.94 port 80 failed: Connection refused
* Failed to connect to dev-swift-vip.i1.dev.v6.internal-gmo port 80: Connection 
refused
* Closing connection 0
curl: (7) Failed to connect to dev-swift-vip.i1.dev.v6.internal-gmo port 80: 
Connection refused


in ceph rgw log. We will see thread is crashed

-60> 2022-05-20T00:21:19.739+ 7fe264a1c700  5 lifecycle: schedule life 
cycle next start time: Sat May 21 00:00:00 2022
-59> 2022-05-20T00:21:20.294+ 7fe2471e1700 20 HTTP_ACCEPT=*/*
-58> 2022-05-20T00:21:20.294+ 7fe2471e1700 20 HTTP_HOST=***
-57> 2022-05-20T00:21:20.294+ 7fe2471e1700 20 
HTTP_USER_AGENT=curl/7.68.0
-56> 2022-05-20T00:21:20.294+ 7fe2471e1700 20 HTTP_VERSION=1.1
-55> 2022-05-20T00:21:20.294+ 7fe2471e1700 20 REMOTE_ADDR=***
-54> 2022-05-20T00:21:20.294+ 7fe2471e1700 20 REQUEST_METHOD=GET
-53> 2022-05-20T00:21:20.294+ 7fe2471e1700 20 
REQUEST_URI=/swift/v1/AUTH_9c4a3213fc124afcb369c3b8246cc543/okamura-static-web/
-52> 2022-05-20T00:21:20.294+ 7fe2471e1700 20 
SCRIPT_URI=/swift/v1/AUTH_9c4a3213fc124afcb369c3b8246cc543/okamura-static-web/
-51> 2022-05-20T00:21:20.294+ 7fe2471e1700 20 SERVER_PORT=80
-50> 2022-05-20T00:21:20.294+ 7fe2471e1700  1 == starting new 
request req=0x7fe2983db650 =
-49> 2022-05-20T00:21:20.294+ 7fe2471e1700  2 req 14058840826806468654 
0.0s initializing for trans_id = 
tx0c31b093ebb8f002e-006286df00-d605-dev-c3j1
-48> 2022-05-20T00:21:20.294+ 7fe2471e1700 10 req 14058840826806468654 
0.0s rgw api priority: s3=8 s3website=7
-47> 2022-05-20T00:21:20.294+ 7fe2471e1700 10 req 14058840826806468654 
0.0s host=dev-swift-vip.i1.dev.v6.internal-gmo
-46> 2022-05-20T00:21:20.294+ 7fe2471e1700 20 req 14058840826806468654 
0.0s subdomain= domain= in_hosted_domain=0 in_hosted_domain_s3website=0
-45> 2022-05-20T00:21:20.294+ 7fe2471e1700 20 req 14058840826806468654 
0.0s final domain/bucket subdomain= domain= in_hosted_domain=0 
in_hosted_domain_s3website=0 s->info.domain= 
s->info.request_uri=/swift/v1/AUTH_9c4a3213fc124afcb369c3b8246cc543/okamura-static-web/
-44> 2022-05-20T00:21:20.294+ 7fe2471e1700 10 req 14058840826806468654 
0.0s ver=v1 first=okamura-static-web req=
-43> 2022-05-20T00:21:20.294+ 7fe2471e1700 10 req 14058840826806468654 
0.0s handler=28RGWHandler_REST_Bucket_SWIFT
-42> 2022-05-20T00:21:20.294+ 7fe2471e1700  2 req 14058840826806468654 
0.0s getting op 0
-41> 2022-05-20T00:21:20.294+ 7fe2471e1700 20 req 14058840826806468654 
0.0s get_system_obj_state: rctx=0x7fe2983da680 
obj=dev-c3j1.rgw.log:script.prerequest. state=0x560db724e9a0 s->prefetch_data=0
-40> 2022-05-20T00:21:20.294+ 7fe2471e1700 10 req 14058840826806468654 
0.0s cache get: name=dev-c3j1.rgw.log++script.prerequest. : hit (negative 
entry)
-39> 2022-05-20T00:21:20.294+ 7fe2471e1700 10 req 14058840826806468654 
0.0s swift:list_bucket scheduling with throttler client=3 cost=1
-38> 2022-05-20T00:21:20.294+ 7fe2471e1700 10 req 14058840826806468654 
0.0s swift:list_bucket op=28RGWListBucket_ObjStore_SWIFT
-37> 2022-05-20T00:21:20.294+ 7fe2471e1700  2 req 14058840826806468654 
0.0s swift:list_bucket verifying requester
-36> 2022-05-20T00:21:20.294+ 7fe2471e1700 20 req 14058840826806468654 
0.0s swift:list_bucket rgw::auth::swift::DefaultStrategy: trying 
rgw::auth::swift::TempURLEngine
-35> 2022-05-20T00:21:20.294+ 7fe2471e1700 20 req 14058

[ceph-users] Re: rfc: Accounts in RGW

2022-06-16 Thread Daniel Gryniewicz

On 6/15/22 14:06, Casey Bodley wrote:

(oops, i had cc'ed this to the old ceph-users list)

On Wed, Jun 15, 2022 at 1:56 PM Casey Bodley  wrote:


On Mon, May 11, 2020 at 10:20 AM Abhishek Lekshmanan  wrote:



The basic premise is for an account to be a container for users, and
also related functionality like roles & groups. This would converge
similar to the AWS concept of an account, where the AWS account can
further create iam users/roles or groups. Every account can have a root
user or user(s) with permissions to administer creation of users and
allot quotas within an account. These can be implemented with a new
account cap. IAM set of apis already have a huge subset of functionality
to summarize accounts and inspect/create users/roles or groups. Every
account would also store the membership of its users/groups and roles,
(similar to user's buckets) though we'd ideally limit to < 10k
users/roles or groups per account.

In order to deal with the currently used tenants which namespace
buckets, but also currently stand in for the account id in the policy
language & ARNs, we'd have a tenant_id attribute in the account, which
if set will prevent cross tenant users being added. Though this is not
enforced when the tenant id isn't set, accounts without this field set
can potentially add users across tenants, so this is one of the cases
where we expect the account owner to know what they are doing.
We'd transition away from <tenant>:<user> in the Policy principal to
<account-id>:<user>, so if users with different tenants are in the same account
we'd expect the user to change their policies to reuse the account ids.

In terms of regular operations IO costs, the user info would have an account id
attribute, and if non empty we'd have to read the Account root user policies and
/or public access configuration, though other attributes like list of 
users/roles
and groups would only be read for necessary IAM/admin apis.

Quotas
~~
For quotas we can implement one of the following ways
- a user_bytes/buckets quota, which would be alloted to every user
- a total account quota, in which case it is the responsibility of the account
   user to allot a quota upon user creation

Though for operations themselves it is the user quota that comes into play.

APIs

- creating an account itself should be available via the admin tooling/apis
- Ideally creation of a root user under an account would still have to be
   explicitly, though we could consider adding this to the account creation
   process itself to simplify things.
- For further user creation and management, we could start implementing to the
   iam set of apis in the future, though currently we already have admin apis 
for
   user creation and the like, and we could allow the user with account caps to
   do these operations

Deviations
~~
Some apis like list buckets in AWS list all the buckets in the user account and
not the specific iam user, we'd probably still list only the user buckets,
though we could consider this for the account root user.

Wrt to the openstack swift apis, we'd still keep the current user_id -> swift
account id mapping, so no breakage is expected wrt end user apis, so the
account stats and related apis would be similar to the older version where
it is still user's summary that is displayed

Comments on if this is the right direction?

--
Abhishek
___
Dev mailing list -- d...@ceph.io
To unsubscribe send an email to dev-le...@ceph.io



this project has been revived in
https://github.com/ceph/ceph/pull/46373 and we've been talking through
the design in our weekly refactoring meeting

Abhishek shared a good summary of the design above. the only major
changes we've made are in its interaction with swift tenants:
- accounts will be strictly namespaced by tenant, so an account can't
mix users from different tenants
- add a unique account ID, separate from the account name, for use in
IAM policy. use a specific, documented format to disambiguate account
IDs from tenant names

the account features we're planning to start with are:
- radosgw-admin commands and /admin/ APIs to add/remove/list the users
and roles under an account
- support for IAM principals like ACCOUNTID/username,
ACCOUNTID/rolename and ACCOUNTID/* in addition to tenant/...
- ListAllMyBuckets lists all buckets under the user's account, not
only those owned by the user
- account quotas that limit objects/bytes, on top of existing user/bucket quotas


A thought just occurred to me.  The impetus of this work was to have 
tenant quota.  Since we're not having one account/tenant anymore, how do 
we get tenant quota out of this work?





eventually we'd like to add:
- IAM APIs for account and user management by 'account root users'
without global admin caps
- support for groups under account

i'd love to hear feedback from the community - what kind of account
functionality would you most like to see?


___
ceph-users mai

[ceph-users] Re: lifecycle config minimum time

2022-06-23 Thread Daniel Gryniewicz

Lifecycle only runs once per day, so you cannot set times less than a day.

Daniel

On 6/21/22 05:04, farhad kh wrote:

I want to set an LC rule for incomplete multipart uploads, but I cannot find
documentation saying whether minutes or hours can be used for the time.
How can I set an LC time of less than a day?
  
<LifecycleConfiguration>
  <Rule>
    <ID>Abort incomplete multipart upload after 1 day</ID>
    <Status>Enabled</Status>
    <AbortIncompleteMultipartUpload>
      <DaysAfterInitiation>1</DaysAfterInitiation>
    </AbortIncompleteMultipartUpload>
  </Rule>
</LifecycleConfiguration>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cephfs - NFS Ganesha

2020-05-15 Thread Daniel Gryniewicz
It sounds like you're putting the FSAL_CEPH config in another file in 
/etc/ganesha.  Ganesha only loads one file: /etc/ganesha/ganesha.conf - 
other files need to be included in that file with the %include command. 
For a simple config like yours, just use the single 
/etc/ganesha/ganesha.conf file.
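
For example, the main file can pull in per-export snippets (the path is just
an illustration):

   # /etc/ganesha/ganesha.conf
   %include "/etc/ganesha/exports/cephfs.conf"

but for one CephFS export it is simpler to put the EXPORT block directly in
ganesha.conf.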


Daniel

On 5/15/20 4:59 AM, Amudhan P wrote:

Hi Rafael,

I have used the config you provided but I am still not able to mount NFS. I
don't see any error in the log messages.

Output from ganesha.log
---
15/05/2020 08:50:43 : epoch 5ebe57e3 : strgcntrl : ganesha.nfsd-8732[main]
main :MAIN :EVENT :ganesha.nfsd Starting: Ganesha Version 2.6.0
15/05/2020 08:50:43 : epoch 5ebe57e3 : strgcntrl : ganesha.nfsd-8738[main]
nfs_set_param_from_conf :NFS STARTUP :EVENT :Configuration file
successfully parsed
15/05/2020 08:50:43 : epoch 5ebe57e3 : strgcntrl : ganesha.nfsd-8738[main]
init_server_pkgs :NFS STARTUP :EVENT :Initializing ID Mapper.
15/05/2020 08:50:43 : epoch 5ebe57e3 : strgcntrl : ganesha.nfsd-8738[main]
init_server_pkgs :NFS STARTUP :EVENT :ID Mapper successfully initialized.
15/05/2020 08:50:43 : epoch 5ebe57e3 : strgcntrl : ganesha.nfsd-8738[main]
lower_my_caps :NFS STARTUP :EVENT :CAP_SYS_RESOURCE was successfully
removed for proper quota
  management in FSAL
15/05/2020 08:50:43 : epoch 5ebe57e3 : strgcntrl : ganesha.nfsd-8738[main]
lower_my_caps :NFS STARTUP :EVENT :currenty set capabilities are: =
cap_chown,cap_dac_overrid
e,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_
raw,cap_ipc_lock,cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_time,cap_sys_tty
_config,cap_mknod,cap_lease,cap_audit_write,cap_audit_control,cap_setfcap+ep
15/05/2020 08:50:43 : epoch 5ebe57e3 : strgcntrl : ganesha.nfsd-8738[main]
nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90
15/05/2020 08:50:43 : epoch 5ebe57e3 : strgcntrl : ganesha.nfsd-8738[main]
nfs_Init_svc :DISP :CRIT :Cannot acquire credentials for principal nfs
15/05/2020 08:50:43 : epoch 5ebe57e3 : strgcntrl : ganesha.nfsd-8738[main]
nfs_Init_admin_thread :NFS CB :EVENT :Admin thread initialized
15/05/2020 08:50:43 : epoch 5ebe57e3 : strgcntrl : ganesha.nfsd-8738[main]
nfs_rpc_cb_init_ccache :NFS STARTUP :EVENT :Callback creds directory
(/var/run/ganesha) alrea
dy exists
15/05/2020 08:50:43 : epoch 5ebe57e3 : strgcntrl : ganesha.nfsd-8738[main]
nfs_rpc_cb_init_ccache :NFS STARTUP :WARN
:gssd_refresh_krb5_machine_credential failed (-1765
328160:0)
15/05/2020 08:50:43 : epoch 5ebe57e3 : strgcntrl : ganesha.nfsd-8738[main]
nfs_Start_threads :THREAD :EVENT :Starting delayed executor.
15/05/2020 08:50:43 : epoch 5ebe57e3 : strgcntrl : ganesha.nfsd-8738[main]
nfs_Start_threads :THREAD :EVENT :9P/TCP dispatcher thread was started
successfully
15/05/2020 08:50:43 : epoch 5ebe57e3 : strgcntrl :
ganesha.nfsd-8738[_9p_disp] _9p_dispatcher_thread :9P DISP :EVENT :9P
dispatcher started
15/05/2020 08:50:43 : epoch 5ebe57e3 : strgcntrl : ganesha.nfsd-8738[main]
nfs_Start_threads :THREAD :EVENT :gsh_dbusthread was started successfully
15/05/2020 08:50:43 : epoch 5ebe57e3 : strgcntrl : ganesha.nfsd-8738[main]
nfs_Start_threads :THREAD :EVENT :admin thread was started successfully
15/05/2020 08:50:43 : epoch 5ebe57e3 : strgcntrl : ganesha.nfsd-8738[main]
nfs_Start_threads :THREAD :EVENT :reaper thread was started successfully
15/05/2020 08:50:43 : epoch 5ebe57e3 : strgcntrl : ganesha.nfsd-8738[main]
nfs_Start_threads :THREAD :EVENT :General fridge was started successfully
15/05/2020 08:50:43 : epoch 5ebe57e3 : strgcntrl : ganesha.nfsd-8738[main]
nfs_start :NFS STARTUP :EVENT
:-
15/05/2020 08:50:43 : epoch 5ebe57e3 : strgcntrl : ganesha.nfsd-8738[main]
nfs_start :NFS STARTUP :EVENT : NFS SERVER INITIALIZED
15/05/2020 08:50:43 : epoch 5ebe57e3 : strgcntrl : ganesha.nfsd-8738[main]
nfs_start :NFS STARTUP :EVENT
:-
15/05/2020 08:52:13 : epoch 5ebe57e3 : strgcntrl :
ganesha.nfsd-8738[reaper] nfs_lift_grace_locked :STATE :EVENT :NFS Server
Now NOT IN GRACE

Regards
Amudhan P

On Fri, May 15, 2020 at 1:01 PM Rafael Lopez 
wrote:


Hello Amudhan,

The only ceph specific thing required in the ganesha config is to add the
FSAL block to your export, everything else is standard ganesha config as
far as I know. eg: this would export the root dir of your cephfs as
nfs-server:/cephfs
EXPORT
{
 Export_ID = 100;
 Path = /;
 Pseudo = /cephfs;
 FSAL {
 Name = CEPH;
 User_Id = cephfs_cephx_user;
 }
 CLIENT {
 Clients =  1.2.3.4;
 Access_type = RW;
 }
}

This will rely on ceph config in /etc/ceph/ceph.conf containing typical
cluster client conn

[ceph-users] Re: Nfs-ganesha rpm still has samba package dependency

2020-05-29 Thread Daniel Gryniewicz

You need to disable _MSPAC_SUPPORT to get rid of this dep.
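
If you build the packages yourself, that is roughly (source path is only an
example):

   cmake -D_MSPAC_SUPPORT=OFF -DCMAKE_BUILD_TYPE=Release /path/to/nfs-ganesha/src
   make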


Daniel

On 11/17/19 5:55 AM, Marc Roos wrote:



==
  Package Arch Version
   RepositorySize

==
Installing:
  nfs-ganesha x86_64   2.8.1.2-0.1.el7
   CentOS7-custom   680 k
  nfs-ganesha-cephx86_64   2.8.1.2-0.1.el7
   CentOS7-custom30 k
  nfs-ganesha-mem x86_64   2.8.1.2-0.1.el7
   CentOS7-custom30 k
  nfs-ganesha-rgw x86_64   2.8.1.2-0.1.el7
   CentOS7-custom21 k
  nfs-ganesha-vfs x86_64   2.8.1.2-0.1.el7
   CentOS7-custom44 k
  nfs-ganesha-xfs x86_64   2.8.1.2-0.1.el7
   CentOS7-custom42 k
Installing for dependencies:
  libldb  x86_64   1.4.2-1.el7
   CentOS7  144 k
  libntirpc   x86_64   1.8.0-0.1.el7
   CentOS7-custom   113 k
  libtalloc   x86_64   2.1.14-1.el7
   CentOS7   32 k
  libtdb  x86_64   1.3.16-1.el7
   CentOS7   48 k
  libtevent   x86_64   0.9.37-1.el7
   CentOS7   40 k
  libwbclient x86_64   4.9.1-6.el7
   CentOS7  110 k
  samba-client-libs   x86_64   4.9.1-6.el7
   CentOS7  4.9 M
  samba-commonnoarch   4.9.1-6.el7
   CentOS7  209 k
  samba-common-libs   x86_64   4.9.1-6.el7
   CentOS7  170 k

Transaction Summary

==
Install  6 Packages (+9 Dependent packages)

Total download size: 6.6 M
Installed size: 23 M
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Deploy nfs from cephadm

2020-06-03 Thread Daniel Gryniewicz
rados_connect() is used by the recovery and/or grace code.  It's 
configured separately from CephFS, so its errors are unrelated to 
CephFS issues.
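
For reference, the recovery backend takes its cluster credentials from its
own block in ganesha.conf, something along these lines (pool and user names
are only examples):

   RADOS_KV {
       ceph_conf = "/etc/ceph/ceph.conf";
       userid = "nfs-ganesha";
       pool = "nfs-ganesha";
   }

A -13 (EACCES) from rados_connect() usually means that cephx user or its
keyring is missing or lacks caps, independent of whether the CephFS export
itself works.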


Daniel

On 6/3/20 8:54 AM, Simon Sutter wrote:

Hello,


Thank you very much.

I was a bit worried about all the other messages, especially those two from a 
different container (started before the right one?):


Jun 03 08:22:23 testnode1 bash[3169]: rados_connect: -13
Jun 03 08:22:23 testnode1 bash[3169]: Can't connect to cluster: -13


Nevertheless, installing and starting rpcbind did it for me.

After I configured the ganesha service in 
"/var/lib/ceph/{UUID}/nfs.cephnfs.testnode1/etc/ganesha/ganesha.conf" I was 
able to start and access cephfs from an NFS client.

Maybe a hint in the docs, to install rpcbind, or to enable just nfsv4 would 
also help others.



Best Regards,

Simon



Von: Michael Fritch 
Gesendet: Dienstag, 2. Juni 2020 22:33:46
An: ceph-users@ceph.io
Betreff: [ceph-users] Re: Deploy nfs from cephadm

Hi,

Do you have a running rpcbind service?
$ systemctl status rpcbind

NFSv3 requires rpcbind, but this dependency will be removed in a later release 
of Octopus. I've updated the tracker with more detail.

Hope this helps,
-Mike


John Zachary Dover wrote:

I've created a docs tracker ticket for this issue:

https://tracker.ceph.com/issues/45819






Zac
Ceph Docs

  On Wed, Jun 3, 2020 at 12:34 AM Simon Sutter wrote:

  Sorry, always the wrong button...

  So I ran the command:
  ceph orch apply nfs cephnfs cephfs.backuptest.data

  And there is now a non-working container:
  ceph orch ps:
  nfs.cephnfs.testnode1testnode1  error  6m ago 71m
docker.io/ceph/ceph:v15  


  journalctl tells me this:

  Jun 02 15:17:45 testnode1 systemd[1]: Starting Ceph nfs.cephnfs.testnode1
  for 915cdf28-8f66-11ea-bb83-ac1f6b4cd516...
  Jun 02 15:17:45 testnode1 podman[63413]: Error: no container with name or
  ID ceph-915cdf28-8f66-11ea-bb83-ac1f6b4cd516-nfs.cephnfs.testnode1 found:
  no such container
  Jun 02 15:17:45 testnode1 systemd[1]: Started Ceph nfs.cephnfs.testnode1
  for 915cdf28-8f66-11ea-bb83-ac1f6b4cd516.
  Jun 02 15:17:45 testnode1 podman[63434]: 2020-06-02 15:17:45.867685349
  +0200 CEST m=+0.080338785 container create
  7290cc21b0e2498876773f1ef2a2be24abf62e9ed058d60e79c8f3d3d3e9e0d3 (image=
  docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-1>
  Jun 02 15:17:46 testnode1 podman[63434]: 2020-06-02 15:17:46.196760186
  +0200 CEST m=+0.409413617 container init
  7290cc21b0e2498876773f1ef2a2be24abf62e9ed058d60e79c8f3d3d3e9e0d3 (image=
  docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-11e>
  Jun 02 15:17:46 testnode1 podman[63434]: 2020-06-02 15:17:46.211149759
  +0200 CEST m=+0.423803191 container start
  7290cc21b0e2498876773f1ef2a2be24abf62e9ed058d60e79c8f3d3d3e9e0d3 (image=
  docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-11>
  Jun 02 15:17:46 testnode1 podman[63434]: 2020-06-02 15:17:46.21122888
  +0200 CEST m=+0.423882373 container attach
  7290cc21b0e2498876773f1ef2a2be24abf62e9ed058d60e79c8f3d3d3e9e0d3 (image=
  docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-11>
  Jun 02 15:17:46 testnode1 bash[63432]: rados_connect: -13
  Jun 02 15:17:46 testnode1 bash[63432]: Can't connect to cluster: -13
  Jun 02 15:17:46 testnode1 podman[63434]: 2020-06-02 15:17:46.300445833
  +0200 CEST m=+0.513099326 container died
  7290cc21b0e2498876773f1ef2a2be24abf62e9ed058d60e79c8f3d3d3e9e0d3 (image=
  docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-11e>
  Jun 02 15:17:46 testnode1 podman[63434]: 2020-06-02 15:17:46.391730251
  +0200 CEST m=+0.604383723 container remove
  7290cc21b0e2498876773f1ef2a2be24abf62e9ed058d60e79c8f3d3d3e9e0d3 (image=
  docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-1>
  Jun 02 15:17:46 testnode1 podman[63531]: 2020-06-02 15:17:46.496154808
  +0200 CEST m=+0.085374929 container create
  aab4e893b162a643181cea3a9f5f687aae236eb2f4a7f6fad27d503d1fdee893 (image=
  docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-1>
  Jun 02 15:17:46 testnode1 podman[63531]: 2020-06-02 15:17:46.81399203
  +0200 CEST m=+0.403212198 container init
  aab4e893b162a643181cea3a9f5f687aae236eb2f4a7f6fad27d503d1fdee893 (image=
  docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-11ea>
  Jun 02 15:17:46 testnode1 podman[63531]: 2020-06-02 15:17:46.828546918
  +0200 CEST m=+0.417767036 container start
  aab4e893b162a643181cea3a9f5f687aae236eb2f4a7f6fad27d503d1fdee893 (image=
  docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-11>
  Jun 02 15:17:46 testnode1 podman[63531]: 2020-06-02 15:17:46.828661425
  +0200 CEST m=+0.417881609 container attach
  aab4e893b162a643181cea3a9f5f687aae236eb2f4a7f6fad27d503d1fdee893 (image=
  docker.io/ceph/ceph:v15, name=ceph-915cdf28-8f66-1>
  Jun 02 15:17:46 testnode1 bash[63432]: 02/06/2020 13:17:46 : epoch
  5ed6517a : 

[ceph-users] Re: RGW bucket sync

2020-09-09 Thread Daniel Gryniewicz
Basically same thing that happens when you overwrite any object.  New 
data is sent from the client, and a new Head is created pointing at it. 
The old head is removed, and the data marked for garbage collection if 
it's unused (which it won't be, in this case, since another Head points 
at it).
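
For example, a copy done server-side with either of (bucket and object names
are placeholders):

   aws s3 cp s3://bucket1/file s3://bucket2/file
   s3cmd cp s3://bucket1/file s3://bucket2/file

is a single CopyObject call to RGW, so essentially only a new head is written
and the tail data stays shared until one of the copies is overwritten.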


Daniel

On 9/9/20 4:55 AM, Eugen Block wrote:

I think rgw will make a header that points to the original data only, so
you are right in that there is no huge data copy operation.


Alright, that would explain it. But what happens when I overwrite the 
object in bucket1 with different content? Because I'm still able to get 
the original content from bucket2/file although it's overwritten in 
bucket1.



Zitat von Janne Johansson :


Den ons 9 sep. 2020 kl 10:06 skrev Eugen Block :


Hi *,

I'm wondering about what actually happens in the ceph cluster if I
copy/sync the content of one bucket into a different bucket.

How does this work? It seems as if there's (almost) no client traffic
(except for the cp command, of course) to recreate the file in the
second bucket, as if the OSDs are directly instructed to create copies
of the objects.



I think rgw will make a header that points to the original data only, so
you are right in that there is no huge data copy operation.

--
May the most significant bit of your life be positive.



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: NFS Ganesha NFSv3

2020-09-23 Thread Daniel Gryniewicz
NFSv3 needs privileges to connect to the portmapper.  Try running your 
docker container in privileged mode, and see if that helps.
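
Something along these lines (image name is only a placeholder), together with
rpcbind running on the host, is usually enough for the v3 side:

   docker run --privileged --network host \
       -v /etc/ganesha:/etc/ganesha \
       my-ganesha-image

NFSv3 needs the portmapper/mount protocol registrations, while NFSv4 only
needs port 2049, which is why v4 works without this.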


Daniel

On 9/23/20 11:42 AM, Gabriel Medve wrote:

Hi,

I have Ceph 15.2.5 running in a Docker container. I configured NFS Ganesha 
with NFS version 3 but I cannot mount it.
If I configure Ganesha with NFS version 4 I can mount it without problems, 
but I need version 3.


The error is mount.nfs: Protocol not supported

Can help me?

Thanks.


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Nfs-ganesha 2.6 upgrade to 2.7

2019-09-26 Thread Daniel Gryniewicz
Ganesha itself has no dependencies on samba (and there aren't any on
my system, when I build).  These must be being pulled in by something
else that Ganesha does use.

Daniel

On Thu, Sep 26, 2019 at 11:21 AM Marc Roos  wrote:
>
>
> Is it really necessary to have these dependencies in nfs-ganesha 2.7
>
> Dep-Install samba-client-libs-4.8.3-4.el7.x86_64  @CentOS7
> Dep-Install samba-common-4.8.3-4.el7.noarch   @CentOS7
> Dep-Install samba-common-libs-4.8.3-4.el7.x86_64  @CentOS7
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Nfs-ganesha 2.6 upgrade to 2.7

2019-09-27 Thread Daniel Gryniewicz
Sounds like someone turned on MSPAC support, which is off by default. 
It should probably be left off.


Daniel

On 9/26/19 1:19 PM, Marc Roos wrote:


Yes I think this one libntirpc. In 2.6 this samba dependency was not
there.

-Original Message-
From: Daniel Gryniewicz [mailto:d...@redhat.com]
Sent: donderdag 26 september 2019 19:07
To: Marc Roos
Cc: ceph-users
Subject: Re: [ceph-users] Nfs-ganesha 2.6 upgrade to 2.7

Ganesha itself has no dependencies on samba (and there aren't any on my
system, when I build).  These must be being pulled in by something else
that Ganesha does use.

Daniel

On Thu, Sep 26, 2019 at 11:21 AM Marc Roos 
wrote:



Is it really necessary to have these dependencies in nfs-ganesha 2.7

 Dep-Install samba-client-libs-4.8.3-4.el7.x86_64  @CentOS7
 Dep-Install samba-common-4.8.3-4.el7.noarch   @CentOS7
 Dep-Install samba-common-libs-4.8.3-4.el7.x86_64  @CentOS7

___
ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
email to ceph-users-le...@ceph.io





___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: File listing with browser

2019-10-16 Thread Daniel Gryniewicz
S3 is not a browser friendly protocol.  There isn't a way to get
user-friendly output via the browser alone, you need some form of
client that speaks the S3 REST protocol.  The most commonly used one
by us is s3cmd, which is a command line utility.  A quick google
search finds some web-based clients, such as this one:
https://github.com/rufuspollock/s3-bucket-listing  I've never used it,
so I cannot say how well it works.
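
For example, once s3cmd is pointed at the RGW endpoint with a user's keys
(values below are placeholders), listing is just:

   s3cmd --host=rgw.example.com --host-bucket=rgw.example.com \
         --access_key=AKIA... --secret_key=... ls s3://bucket/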

Daniel

On Wed, Oct 16, 2019 at 8:49 AM  wrote:
>
> How do I set up Ceph so that a user can access objects from one pool from a browser?
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: nfs ganesha rgw write errors

2019-11-18 Thread Daniel Gryniewicz

On 11/17/19 1:42 PM, Marc Roos wrote:
  


Hi Daniel,

I am able to mount the buckets with your config, however when I try to
write something, my logs get a lot of these errors:

svc_732] nfs4_Errno_verbose :NFS4 :CRIT :Error I/O error in
nfs4_write_cb converted to NFS4ERR_IO but was set non-retryable

Any chance you know how to resolve this?

  



Sounds like the RGW user configured in Ganesha doesn't have permissions 
to write to the buckets in question.
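
A quick check is to look at the user configured in the RGW FSAL block of your
export (the uid below is a placeholder):

   radosgw-admin user info --uid=ganesha-user

and make sure that user owns, or has been granted write access to (bucket
policy/ACL), the buckets you are exporting.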


Daniel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: radosgw lifecycle seems work strangely

2020-02-26 Thread Daniel Gryniewicz
Lifecycle is designed to run once per day.  There's a lot of resource 
optimization that's done based on this assumption to reduce the overhead 
of lifecycle on the cluster.   One of these is that it only builds the 
list of objects to handle the first time it's run in that day.  So, in 
this case, you ran it manually, and it built the list of objects, and 
processed them.  The next time you run it the same day, it processes the 
same list, finds nothing to do, and exits.  Objects added since the 
lifecycle run for the day was started will be processed the next day.
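
For testing only (not something to leave set in production, and the option
name is from memory), the daily cadence can be shortened so that a "day" is
treated as N seconds:

   ceph config set client.rgw rgw_lc_debug_interval 10

which makes it much easier to watch radosgw-admin lc list / lc process pick
up newly uploaded objects.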



Daniel

On 2/26/20 7:18 AM, quexian da wrote:

ceph version 14.2.5 (ad5bd132e1492173c85fda2cc863152730b16a92) nautilus
(stable)

I made a bucket named "test_lc" and ran `s3cmd expire
  --expiry-date=2019-01-01 s3://test_lc` to set the lifecycle (2019-01-01 is
earlier than current date so every object will be removed).

Then I ran `radosgw-admin lc process`, the objects got deleted as expected,
and the status from `radosgw-admin lc list` is "completed". However, if I
upload some objects, and ran  `radosgw-admin lc process` again, the objects
were not deleted.

Could you please tell me what the reason is and what I should do in this
case? Thanks in advance!
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: multi-node NFS Ganesha + libcephfs caching

2020-03-24 Thread Daniel Gryniewicz




On 3/23/20 4:31 PM, Maged Mokhtar wrote:


On 23/03/2020 20:50, Jeff Layton wrote:

On Mon, 2020-03-23 at 15:49 +0200, Maged Mokhtar wrote:

Hello all,

For multi-node NFS Ganesha over CephFS, is it OK to leave libcephfs 
write caching on, or should it be configured off for failover ?



You can do libcephfs write caching, as the caps would need to be
recalled for any competing access. What you really want to avoid is any
sort of caching at the ganesha daemon layer.


Hi Jeff,

Thanks for your reply. I meant caching by libcepfs used within the 
ganesha ceph fsal plugin, which i am not sure from your reply if this is 
what you refer to as ganesha daemon layer (or does the later mean the 
internal mdcache in ganesha). I really appreciate if you can clarify 
this point.


Caching in libcephfs is fine, it's caching above the FSAL layer that you 
should avoid.




I really have doubts that it is safe to leave write caching in the 
plugin and have safe failover, yet i see comments in the conf file such as:

# The libcephfs client will aggressively cache information while it
# can, so there is little benefit to ganesha actively caching the same
# objects.

Or is it up to the NFS client to issue cache syncs and re-submit writes 
if it detects failover ?


Correct.  During failover, NFS will go into its Grace period, which 
blocks new state, and allows the NFS clients to re-acquire the state 
(opens, locks, delegations, etc.).  This includes re-sending any 
non-committed writes (commits will cause the data to be saved to the 
cluster, not just the libcephfs cache).  Once this is all done, normal 
operation proceeds.  It should be safe, even with caching in libcephfs.
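
For reference, the window involved is Ganesha's grace period, tunable in the
NFSV4 block (values below are just the usual defaults as I recall them):

   NFSV4 {
       Lease_Lifetime = 60;
       Grace_Period = 90;
   }

Clients reclaim their state during that window once the restarted or
replacement instance is up.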


Daniel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: multi-node NFS Ganesha + libcephfs caching

2020-03-24 Thread Daniel Gryniewicz



On 3/24/20 8:19 AM, Maged Mokhtar wrote:


On 24/03/2020 13:35, Daniel Gryniewicz wrote:



On 3/23/20 4:31 PM, Maged Mokhtar wrote:


On 23/03/2020 20:50, Jeff Layton wrote:

On Mon, 2020-03-23 at 15:49 +0200, Maged Mokhtar wrote:

Hello all,

For multi-node NFS Ganesha over CephFS, is it OK to leave libcephfs 
write caching on, or should it be configured off for failover ?



You can do libcephfs write caching, as the caps would need to be
recalled for any competing access. What you really want to avoid is any
sort of caching at the ganesha daemon layer.


Hi Jeff,

Thanks for your reply. I meant caching by libcepfs used within the 
ganesha ceph fsal plugin, which i am not sure from your reply if this 
is what you refer to as ganesha daemon layer (or does the later mean 
the internal mdcache in ganesha). I really appreciate if you can 
clarify this point.


Caching in libcephfs is fine, it's caching above the FSAL layer that 
you should avoid.




I really have doubts that it is safe to leave write caching in the 
plugin and have safe failover, yet i see comments in the conf file 
such as:

# The libcephfs client will aggressively cache information while it
# can, so there is little benefit to ganesha actively caching the same
# objects.

Or is it up to the NFS client to issue cache syncs and re-submit 
writes if it detects failover ?


Correct.  During failover, NFS will go into it's Grace period, which 
blocks new state,  and allow the NFS clients to re-acquire the state 
(opens, locks, delegations, etc.).  This includes re-sending any 
non-committed writes (commits will cause the data to be saved to the 
cluster, not just the libcephfs cache).  Once this is all done, normal 
operation proceeds.  It should be safe, even with caching in libcephfs.


Daniel

Thanks Daniel for the clarification, so it is the responsibility of the 
client to re-send writes. Two questions so I can understand this better:


-If this is handled at the client..why on the gateway it is ok to cache 
at the FSAL layer but not above ?


In principle, it's fine above.  However, that requires a level of 
coordination that's not there right now.  The libcephfs cache is 
integrated with the CAPs system, and knows when it can cache and when it 
needs to flush.  There's work to do to get that up to the higher layers.




-At what level/layer on the client does this get handled: NFS client 
layer (which will detect failover), filesystem layer, page cache...?


The NFS client layer, interacting with the VFS/page cache.  (NFS is the 
filesystem in this case, so technically the filesystem layer.)


Daniel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: multi-node NFS Ganesha + libcephfs caching

2020-03-24 Thread Daniel Gryniewicz



On 3/24/20 1:16 PM, Maged Mokhtar wrote:


On 24/03/2020 16:48, Maged Mokhtar wrote:


On 24/03/2020 15:14, Daniel Gryniewicz wrote:



On 3/24/20 8:19 AM, Maged Mokhtar wrote:


On 24/03/2020 13:35, Daniel Gryniewicz wrote:



On 3/23/20 4:31 PM, Maged Mokhtar wrote:


On 23/03/2020 20:50, Jeff Layton wrote:

On Mon, 2020-03-23 at 15:49 +0200, Maged Mokhtar wrote:

Hello all,

For multi-node NFS Ganesha over CephFS, is it OK to leave 
libcephfs write caching on, or should it be configured off for 
failover ?



You can do libcephfs write caching, as the caps would need to be
recalled for any competing access. What you really want to avoid 
is any

sort of caching at the ganesha daemon layer.


Hi Jeff,

Thanks for your reply. I meant caching by libcepfs used within the 
ganesha ceph fsal plugin, which i am not sure from your reply if 
this is what you refer to as ganesha daemon layer (or does the 
later mean the internal mdcache in ganesha). I really appreciate 
if you can clarify this point.


Caching in libcephfs is fine, it's caching above the FSAL layer 
that you should avoid.




I really have doubts that it is safe to leave write caching in the 
plugin and have safe failover, yet i see comments in the conf file 
such as:

# The libcephfs client will aggressively cache information while it
# can, so there is little benefit to ganesha actively caching the 
same

# objects.

Or is it up to the NFS client to issue cache syncs and re-submit 
writes if it detects failover ?


Correct.  During failover, NFS will go into it's Grace period, 
which blocks new state,  and allow the NFS clients to re-acquire 
the state (opens, locks, delegations, etc.). This includes 
re-sending any non-committed writes (commits will cause the data to 
be saved to the cluster, not just the libcephfs cache).  Once this 
is all done, normal operation proceeds.  It should be safe, even 
with caching in libcephfs.


Daniel

Thanks Daniel for the clarification..so it is the responsibility of 
the client tor re-send writes...2 questions so i can understand this 
better:


-If this is handled at the client..why on the gateway it is ok to 
cache at the FSAL layer but not above ?


In principle, it's fine above.  However, that requires a level of 
coordination that's not there right now.  The libcephfs cache is 
integrated with the CAPs system, and knows when it can cache and when 
it needs to flush.  There's work to do to get that up to the higher 
layers.




-At what level/layer on the client does this get handled: NFS client 
layer (which will detect failover), filesystem layer, page cache...?


The NFS client layer, interacting with the VFS/page cache.  (NFS is 
the filesystem in this case, so technically the filesystem layer.)


Daniel



Thank you so much for the clarification..

Maged
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


One more thing: for non-Linux clients, specifically VMWare, their NFS 
client may not behave the same, correct ?  In the iSCSI domain, VMWare 
does not have any kind of buffer/page cache, which is probably to 
support failover among ESXi nodes. Should I test this, or am I on the 
wrong track? /Maged





This behavior is a requirement of the spec.  All compliant NFS 
implementations behave this way.  If you don't have a client side cache, 
then you have to do only stable writes (each write is sync'd to the 
backing store).  This is slower, but it's safe.  If VMWare doesn't do 
this, then they *will* lose data if the server ever crashes, and it will 
be their exclusive fault.


Daniel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io