On Mon, Jul 25, 2016 at 1:07 PM, David Gossage <[email protected]> wrote:
>
> On Mon, Jul 25, 2016 at 1:00 PM, David Gossage <[email protected]> wrote:
>>
>> On Mon, Jul 25, 2016 at 9:58 AM, Krutika Dhananjay <[email protected]> wrote:
>>>
>>> OK, could you try the following:
>>>
>>> i. Set network.remote-dio to off
>>> # gluster volume set <VOL> network.remote-dio off
>>>
>>> ii. Set performance.strict-o-direct to on
>>> # gluster volume set <VOL> performance.strict-o-direct on
>>>
>>> iii. Stop the affected vm(s) and start again
>>>
>>> and tell me if you notice any improvement?
>>>
>> Not sure if helpful, but on the gluster mount it creates (even though it won't attach to the data center) I get this error in the brick log when running the following:
>>
>> dd if=/dev/zero of=/rhev/data-center/mnt/glusterSD/192.168.71.10\:_glustershard/5b8a4477-4d87-43a1-aa52-b664b1bd9e08/images/test oflag=direct count=100 bs=1M
>> dd: error writing ‘/rhev/data-center/mnt/glusterSD/192.168.71.10:_glustershard/5b8a4477-4d87-43a1-aa52-b664b1bd9e08/images/test’: Invalid argument
>> dd: closing output file ‘/rhev/data-center/mnt/glusterSD/192.168.71.10:_glustershard/5b8a4477-4d87-43a1-aa52-b664b1bd9e08/images/test’: Invalid argument
>>
>> [2016-07-25 18:20:19.393121] E [MSGID: 113039] [posix.c:2939:posix_open] 0-glustershard-posix: open on /gluster2/brick1/1/.glusterfs/02/f4/02f4783b-2799-46d9-b787-53e4ccd9a052, flags: 16385 [Invalid argument]
>> [2016-07-25 18:20:19.393204] E [MSGID: 115070] [server-rpc-fops.c:1568:server_open_cbk] 0-glustershard-server: 120: OPEN /5b8a4477-4d87-43a1-aa52-b664b1bd9e08/images/test (02f4783b-2799-46d9-b787-53e4ccd9a052) ==> (Invalid argument) [Invalid argument]
>>
>> and this in /var/log/glusterfs/rhev-data-center-mnt-glusterSD-192.168.71.10\:_glustershard.log:
>>
>> [2016-07-25 18:20:19.393275] E [MSGID: 114031] [client-rpc-fops.c:466:client3_3_open_cbk] 0-glustershard-client-0: remote operation failed. Path: /5b8a4477-4d87-43a1-aa52-b664b1bd9e08/images/test (02f4783b-2799-46d9-b787-53e4ccd9a052) [Invalid argument]
>> [2016-07-25 18:20:19.393270] E [MSGID: 114031] [client-rpc-fops.c:466:client3_3_open_cbk] 0-glustershard-client-1: remote operation failed. Path: /5b8a4477-4d87-43a1-aa52-b664b1bd9e08/images/test (02f4783b-2799-46d9-b787-53e4ccd9a052) [Invalid argument]
>> [2016-07-25 18:20:19.393317] E [MSGID: 114031] [client-rpc-fops.c:466:client3_3_open_cbk] 0-glustershard-client-2: remote operation failed. Path: /5b8a4477-4d87-43a1-aa52-b664b1bd9e08/images/test (02f4783b-2799-46d9-b787-53e4ccd9a052) [Invalid argument]
>> [2016-07-25 18:20:19.393357] W [fuse-bridge.c:2311:fuse_writev_cbk] 0-glusterfs-fuse: 117: WRITE => -1 gfid=02f4783b-2799-46d9-b787-53e4ccd9a052 fd=0x7f5fec0ba08c (Invalid argument)
>> [2016-07-25 18:20:19.393389] W [fuse-bridge.c:2311:fuse_writev_cbk] 0-glusterfs-fuse: 118: WRITE => -1 gfid=02f4783b-2799-46d9-b787-53e4ccd9a052 fd=0x7f5fec0ba08c (Invalid argument)
>> [2016-07-25 18:20:19.393611] W [fuse-bridge.c:2311:fuse_writev_cbk] 0-glusterfs-fuse: 119: WRITE => -1 gfid=02f4783b-2799-46d9-b787-53e4ccd9a052 fd=0x7f5fec0ba08c (Invalid argument)
>> [2016-07-25 18:20:19.393708] W [fuse-bridge.c:2311:fuse_writev_cbk] 0-glusterfs-fuse: 120: WRITE => -1 gfid=02f4783b-2799-46d9-b787-53e4ccd9a052 fd=0x7f5fec0ba08c (Invalid argument)
>> [2016-07-25 18:20:19.393771] W [fuse-bridge.c:2311:fuse_writev_cbk] 0-glusterfs-fuse: 121: WRITE => -1 gfid=02f4783b-2799-46d9-b787-53e4ccd9a052 fd=0x7f5fec0ba08c (Invalid argument)
>> [2016-07-25 18:20:19.393840] W [fuse-bridge.c:2311:fuse_writev_cbk] 0-glusterfs-fuse: 122: WRITE => -1 gfid=02f4783b-2799-46d9-b787-53e4ccd9a052 fd=0x7f5fec0ba08c (Invalid argument)
>> [2016-07-25 18:20:19.393914] W [fuse-bridge.c:2311:fuse_writev_cbk] 0-glusterfs-fuse: 123: WRITE => -1 gfid=02f4783b-2799-46d9-b787-53e4ccd9a052 fd=0x7f5fec0ba08c (Invalid argument)
>> [2016-07-25 18:20:19.393982] W [fuse-bridge.c:2311:fuse_writev_cbk] 0-glusterfs-fuse: 124: WRITE => -1 gfid=02f4783b-2799-46d9-b787-53e4ccd9a052 fd=0x7f5fec0ba08c (Invalid argument)
>> [2016-07-25 18:20:19.394045] W [fuse-bridge.c:709:fuse_truncate_cbk] 0-glusterfs-fuse: 125: FTRUNCATE() ERR => -1 (Invalid argument)
>> [2016-07-25 18:20:19.394338] W [fuse-bridge.c:1290:fuse_err_cbk] 0-glusterfs-fuse: 126: FLUSH() ERR => -1 (Invalid argument)
>>
>> The previous install I had the issue with is still on gluster 3.7.11.
>>
>> My test install of oVirt 3.6.7 and gluster 3.7.13, with 3 bricks on a local disk, right now isn't allowing me to add the gluster storage at all. I keep getting some type of UI error:
>>
>> 2016-07-25 12:49:09,277 ERROR [org.ovirt.engine.ui.frontend.server.gwt.OvirtRemoteLoggingService] (default task-33) [] Permutation name: 430985F23DFC1C8BE1C7FDD91EDAA785
>> 2016-07-25 12:49:09,277 ERROR [org.ovirt.engine.ui.frontend.server.gwt.OvirtRemoteLoggingService] (default task-33) [] Uncaught exception: : java.lang.ClassCastException
>> at Unknown.ps(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@3837)
>> at Unknown.ts(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@20)
>> at Unknown.vs(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@18)
>> at Unknown.iJf(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@19)
>> at Unknown.Xab(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@48)
>> at Unknown.P8o(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@4447)
>> at Unknown.jQr(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@21)
>> at Unknown.A8o(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@51)
>> at Unknown.u8o(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@101)
>> at Unknown.Eap(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@10718)
>> at Unknown.p8n(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@161)
>> at Unknown.Cao(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@31)
>> at Unknown.Bap(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@10469)
>> at Unknown.kRn(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@49)
>> at Unknown.nRn(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@438)
>> at Unknown.eVn(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@40)
>> at Unknown.hVn(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@25827)
>> at Unknown.MTn(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@25)
>> at Unknown.PTn(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@24052)
>> at Unknown.KJe(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@21125)
>> at Unknown.Izk(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@10384)
>> at Unknown.P3(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@137)
>> at Unknown.g4(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@8271)
>> at Unknown.<anonymous>(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@65)
>> at Unknown._t(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@29)
>> at Unknown.du(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@57)
>> at Unknown.<anonymous>(https://ccengine2.carouselchecks.local/ovirt-engine/webadmin/430985F23DFC1C8BE1C7FDD91EDAA785.cache.html@54)
>>
> If I add it from the storage tab it creates the storage domain, but it won't attach to a data center:
>
> Error while executing action Attach Storage Domain: AcquireHostIdFailure
>
> engine.log:
>
> 2016-07-25 13:04:45,186 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStoragePoolVDSCommand] (default task-90) [4e0e7cbd] Failed in 'CreateStoragePoolVDS' method
> 2016-07-25 13:04:45,211 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-90) [4e0e7cbd] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM local command failed: Cannot acquire host id: (u'5b8a4477-4d87-43a1-aa52-b664b1bd9e08', SanlockException(1, 'Sanlock lockspace add failure', 'Operation not permitted'))
> 2016-07-25 13:04:45,211 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStoragePoolVDSCommand] (default task-90) [4e0e7cbd] Command 'org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStoragePoolVDSCommand' return value 'StatusOnlyReturnForXmlRpc [status=StatusForXmlRpc [code=661, message=Cannot acquire host id: (u'5b8a4477-4d87-43a1-aa52-b664b1bd9e08', SanlockException(1, 'Sanlock lockspace add failure', 'Operation not permitted'))]]'
> 2016-07-25 13:04:45,211 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStoragePoolVDSCommand] (default task-90) [4e0e7cbd] HostName = local
> 2016-07-25 13:04:45,212 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStoragePoolVDSCommand] (default task-90) [4e0e7cbd] Command 'CreateStoragePoolVDSCommand(HostName = local, CreateStoragePoolVDSCommandParameters:{runAsync='true', hostId='b4d03420-3de8-45b8-a671-45bbe7c05e06', storagePoolId='7fe4f6ec-71aa-485b-8bba-958e493b66eb', storagePoolName='NewDefault', masterDomainId='5b8a4477-4d87-43a1-aa52-b664b1bd9e08', domainsIdList='[5b8a4477-4d87-43a1-aa52-b664b1bd9e08]', masterVersion='4'})' execution failed: VDSGenericException: VDSErrorException: Failed to CreateStoragePoolVDS, error = Cannot acquire host id: (u'5b8a4477-4d87-43a1-aa52-b664b1bd9e08', SanlockException(1, 'Sanlock lockspace add failure', 'Operation not permitted')), code = 661
> 2016-07-25 13:04:45,212 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStoragePoolVDSCommand] (default task-90) [4e0e7cbd] FINISH, CreateStoragePoolVDSCommand, log id: 2ed8b2b6
> 2016-07-25 13:04:45,212 ERROR [org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand] (default task-90) [4e0e7cbd] Command 'org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to CreateStoragePoolVDS, error = Cannot acquire host id: (u'5b8a4477-4d87-43a1-aa52-b664b1bd9e08', SanlockException(1, 'Sanlock lockspace add failure', 'Operation not permitted')), code = 661 (Failed with error AcquireHostIdFailure and code 661)
> 2016-07-25 13:04:45,220 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-90) [4e0e7cbd] Correlation ID: 4f77f0e0, Job ID: 6aae65f2-ff61-4bec-a513-18b31828442b, Call Stack: null, Custom Event ID: -1, Message: Failed to attach Storage Domains to Data Center NewDefault. (User: admin@internal)
> 2016-07-25 13:04:45,228 INFO [org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand] (default task-90) [4e0e7cbd] Lock freed to object 'EngineLock:{exclusiveLocks='[5b8a4477-4d87-43a1-aa52-b664b1bd9e08=<STORAGE, ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'
> 2016-07-25 13:04:45,229 INFO [org.ovirt.engine.core.bll.storage.AttachStorageDomainToPoolCommand] (default task-90) [4e0e7cbd] Command [id=d08f24d6-f0f9-4df8-aa34-3718ab44f454]: Compensating DELETED_OR_UPDATED_ENTITY of org.ovirt.engine.core.common.businessentities.StoragePool; snapshot: id=7fe4f6ec-71aa-485b-8bba-958e493b66eb.
> 2016-07-25 13:04:45,231 INFO [org.ovirt.engine.core.bll.storage.AttachStorageDomainToPoolCommand] (default task-90) [4e0e7cbd] Command [id=d08f24d6-f0f9-4df8-aa34-3718ab44f454]: Compensating NEW_ENTITY_ID of org.ovirt.engine.core.common.businessentities.StoragePoolIsoMap; snapshot: StoragePoolIsoMapId:{storagePoolId='7fe4f6ec-71aa-485b-8bba-958e493b66eb', storageId='5b8a4477-4d87-43a1-aa52-b664b1bd9e08'}.
> 2016-07-25 13:04:45,231 INFO [org.ovirt.engine.core.bll.storage.AttachStorageDomainToPoolCommand] (default task-90) [4e0e7cbd] Command [id=d08f24d6-f0f9-4df8-aa34-3718ab44f454]: Compensating DELETED_OR_UPDATED_ENTITY of org.ovirt.engine.core.common.businessentities.StorageDomainStatic; snapshot: id=5b8a4477-4d87-43a1-aa52-b664b1bd9e08.
> 2016-07-25 13:04:45,245 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-90) [4e0e7cbd] Correlation ID: 6cae9150, Job ID: 6aae65f2-ff61-4bec-a513-18b31828442b, Call Stack: null, Custom Event ID: -1, Message: Failed to attach Storage Domain newone to Data Center NewDefault. (User: admin@internal)
> 2016-07-25 13:04:45,253 WARN [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (default task-90) [4e0e7cbd] Trying to release exclusive lock which does not exist, lock key: '5b8a4477-4d87-43a1-aa52-b664b1bd9e08STORAGE'
> 2016-07-25 13:04:45,253 INFO [org.ovirt.engine.core.bll.storage.AttachStorageDomainToPoolCommand] (default task-90) [4e0e7cbd] Lock freed to object 'EngineLock:{exclusiveLocks='[5b8a4477-4d87-43a1-aa52-b664b1bd9e08=<STORAGE, ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'
>
>>> -Krutika
>>>
>>> On Mon, Jul 25, 2016 at 4:57 PM, Samuli Heinonen <[email protected]> wrote:
>>>>
>>>> Hi,
>>>>
>>>> > On 25 Jul 2016, at 12:34, David Gossage <[email protected]> wrote:
>>>> >
>>>> > On Mon, Jul 25, 2016 at 1:01 AM, Krutika Dhananjay <[email protected]> wrote:
>>>> > Hi,
>>>> >
>>>> > Thanks for the logs. So I have identified one issue from the logs, for which the fix is this: http://review.gluster.org/#/c/14669/. Because of a bug in the code, ENOENT was getting converted to EPERM and being propagated up the stack, causing the reads to bail out early with 'Operation not permitted' errors.
>>>> > I still need to find out two things:
>>>> > i) why there was a readv() sent on a non-existent (ENOENT) file (this is important since some of the other users have not faced or reported this issue on gluster-users with 3.7.13)
>>>> > ii) need to see if there's a way to work around this issue.
>>>> >
>>>> > Do you mind sharing the steps needed to be executed to run into this issue? This is so that we can apply our patches, test and ensure they fix the problem.
>>>>
>>>> Unfortunately I can't test this right away, nor give exact steps for how to test it. This is just a theory, but please correct me if you see some mistakes.
>>>>
>>>> oVirt uses cache=none settings for VMs by default, which requires direct I/O. oVirt also uses dd with iflag=direct to check that storage has direct I/O enabled. Problems exist with GlusterFS with sharding enabled and bricks running on ZFS on Linux. Everything seems to be fine with GlusterFS 3.7.11, and problems exist at least with versions .12 and .13. There have been some posts saying that GlusterFS 3.8.x is also affected.
>>>>
>>>> Steps to reproduce:
>>>> 1. Sharded file is created with GlusterFS 3.7.11. Everything works ok.
>>>> 2. GlusterFS is upgraded to 3.7.12+
>>>> 3. Sharded file cannot be read or written with direct I/O enabled. (I.e. oVirt checks the storage connection with the command "dd if=/rhev/data-center/00000001-0001-0001-0001-0000000002b6/mastersd/dom_md/inbox iflag=direct,fullblock count=1 bs=1024000")
>>>>
>>>> Please let me know if you need more information.
>>>>
>>>> -samuli
>>>>
>>>> > Well, after the upgrade of gluster all I did was start the ovirt hosts up, which launched and started their ha-agent and broker processes. I don't believe I started getting any errors till it mounted GLUSTER1. I had enabled sharding but had no sharded disk images yet. Not sure if the check for shards would have caused that. Unfortunately I can't just update this cluster and try to see what caused it, as it has some VMs users expect to be available in a few hours.
>>>> >
>>>> > I can see if I can get my test setup to recreate it. I think I'll need to de-activate the data center so I can detach the storage that's on xfs and attach the one that's over zfs with sharding enabled. My test is 3 bricks on the same local machine, with 3 different volumes, but I think I'm running into a sanlock issue or something, as it won't mount more than one volume that was created locally.
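The numeric values scattered through the logs above decode to ordinary open(2) flags and errno codes; a minimal sketch of the decoding (Python is used purely for illustration, and the flag values assume Linux/x86 constants):

```python
import errno
import os

def decode_open_flags(flags):
    """Best-effort decode of a numeric open(2) flags word into O_* names."""
    names = []
    for name in dir(os):
        value = getattr(os, name)
        if name.startswith("O_") and isinstance(value, int) and value:
            if flags & value == value:
                names.append(name)
    return sorted(names)

# The brick log reports "open on ... flags: 16385 [Invalid argument]".
# On Linux/x86, 16385 = 16384 + 1 = O_DIRECT | O_WRONLY, i.e. the FUSE
# client asked the brick for a direct-I/O write, which the brick rejected.
print(decode_open_flags(16385))

# The engine log's "SanlockException(1, ..., 'Operation not permitted')"
# carries a plain errno: 1 is EPERM; dd's "Invalid argument" is EINVAL (22).
print(errno.errorcode[1], "=", os.strerror(1))
print(errno.errorcode[22], "=", os.strerror(22))
```

This is consistent with the thread's diagnosis: the `dd ... oflag=direct` test forces an O_DIRECT open, which is exactly the code path that started failing after the upgrade.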
>>>> >
>>>> > -Krutika
>>>> >
>>>> > On Fri, Jul 22, 2016 at 7:17 PM, David Gossage <[email protected]> wrote:
>>>> > Trimmed out the logs to just about when I was shutting down the ovirt servers for updates, which was 14:30 UTC 2016-07-09.
>>>> >
>>>> > Pre-update settings were:
>>>> >
>>>> > Volume Name: GLUSTER1
>>>> > Type: Replicate
>>>> > Volume ID: 167b8e57-28c3-447a-95cc-8410cbdf3f7f
>>>> > Status: Started
>>>> > Number of Bricks: 1 x 3 = 3
>>>> > Transport-type: tcp
>>>> > Bricks:
>>>> > Brick1: ccgl1.gl.local:/gluster1/BRICK1/1
>>>> > Brick2: ccgl2.gl.local:/gluster1/BRICK1/1
>>>> > Brick3: ccgl3.gl.local:/gluster1/BRICK1/1
>>>> > Options Reconfigured:
>>>> > performance.readdir-ahead: on
>>>> > storage.owner-uid: 36
>>>> > storage.owner-gid: 36
>>>> > performance.quick-read: off
>>>> > performance.read-ahead: off
>>>> > performance.io-cache: off
>>>> > performance.stat-prefetch: off
>>>> > cluster.eager-lock: enable
>>>> > network.remote-dio: enable
>>>> > cluster.quorum-type: auto
>>>> > cluster.server-quorum-type: server
>>>> > server.allow-insecure: on
>>>> > cluster.self-heal-window-size: 1024
>>>> > cluster.background-self-heal-count: 16
>>>> > performance.strict-write-ordering: off
>>>> > nfs.disable: on
>>>> > nfs.addr-namelookup: off
>>>> > nfs.enable-ino32: off
>>>> >
>>>> > At the time of the updates, ccgl3 was offline from a bad nic on the server, but it had been that way for about a week with no issues in the volume.
>>>> >
>>>> > Shortly after the update I added these settings to enable sharding, but did not as of yet have any VM images sharded:
>>>> > features.shard-block-size: 64MB
>>>> > features.shard: on
>>>> >
>>>> > David Gossage
>>>> > Carousel Checks Inc. | System Administrator
>>>> > Office 708.613.2284
>>>> >
>>>> > On Fri, Jul 22, 2016 at 5:00 AM, Krutika Dhananjay <[email protected]> wrote:
>>>> > Hi David,
>>>> >
>>>> > Could you also share the brick logs from the affected volume? They're located at /var/log/glusterfs/bricks/<hyphenated-path-to-the-brick-directory>.log.
>>>> >
>>>> > Also, could you share the volume configuration (output of `gluster volume info <VOL>`) for the affected volume(s) AND at the time you actually saw this issue?
>>>> >
>>>> > -Krutika
>>>> >
>>>> > On Thu, Jul 21, 2016 at 11:23 PM, David Gossage <[email protected]> wrote:
>>>> > On Thu, Jul 21, 2016 at 11:47 AM, Scott <[email protected]> wrote:
>>>> > Hi David,
>>>> >
>>>> > My backend storage is ZFS.
>>>> >
>>>> > I thought about moving from FUSE to NFS mounts for my Gluster volumes to help test. But since I use hosted engine this would be a real pain. It's difficult to modify the storage domain type/path in hosted-engine.conf, and I don't want to go through the process of re-deploying hosted engine.
>>>> >
>>>> > I found this:
>>>> >
>>>> > https://bugzilla.redhat.com/show_bug.cgi?id=1347553
>>>> >
>>>> > Not sure if related.
>>>> >
>>>> > But I also have a zfs backend; another user on the gluster mailing list had issues with a zfs backend, although she used proxmox and got it working by changing the disk to writeback cache, I think it was.
>>>> >
>>>> > I also use hosted engine, but I run my gluster volume for HE on an LVM separate from zfs, on xfs, and if I recall it did not have the issues my gluster on zfs did. I'm wondering now if the issue was zfs settings.
>>>> >
>>>> > Hopefully I should have a test machine up soon that I can play around with more.
>>>> >
>>>> > Scott
>>>> >
>>>> > On Thu, Jul 21, 2016 at 11:36 AM David Gossage <[email protected]> wrote:
>>>> > What back end storage do you run gluster on? xfs/zfs/ext4 etc?
>>>> >
>>>> > David Gossage
>>>> > Carousel Checks Inc. | System Administrator
>>>> > Office 708.613.2284
>>>> >
>>>> > On Thu, Jul 21, 2016 at 8:18 AM, Scott <[email protected]> wrote:
>>>> > I get similar problems with oVirt 4.0.1 and hosted engine. After upgrading all my hosts to Gluster 3.7.13 (client and server), I get the following:
>>>> >
>>>> > $ sudo hosted-engine --set-maintenance --mode=none
>>>> > Traceback (most recent call last):
>>>> >   File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
>>>> >     "__main__", fname, loader, pkg_name)
>>>> >   File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
>>>> >     exec code in run_globals
>>>> >   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/set_maintenance.py", line 73, in <module>
>>>> >     if not maintenance.set_mode(sys.argv[1]):
>>>> >   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/set_maintenance.py", line 61, in set_mode
>>>> >     value=m_global,
>>>> >   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 259, in set_maintenance_mode
>>>> >     str(value))
>>>> >   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 204, in set_global_md_flag
>>>> >     all_stats = broker.get_stats_from_storage(service)
>>>> >   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 232, in get_stats_from_storage
>>>> >     result = self._checked_communicate(request)
>>>> >   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 260, in _checked_communicate
>>>> >     .format(message or response))
>>>> > ovirt_hosted_engine_ha.lib.exceptions.RequestError: Request failed: failed to read metadata: [Errno 1] Operation not permitted
>>>> >
>>>> > If I only upgrade one host, then things will continue to work, but my nodes are constantly healing shards. My logs are also flooded with:
>>>> >
>>>> > [2016-07-21 13:15:14.137734] W [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 274714: READ => -1 gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0041d0 (Operation not permitted)
>>>> > The message "W [MSGID: 114031] [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-0: remote operation failed [Operation not permitted]" repeated 6 times between [2016-07-21 13:13:24.134985] and [2016-07-21 13:15:04.132226]
>>>> > The message "W [MSGID: 114031] [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-1: remote operation failed [Operation not permitted]" repeated 8 times between [2016-07-21 13:13:34.133116] and [2016-07-21 13:15:14.137178]
>>>> > The message "W [MSGID: 114031] [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-2: remote operation failed [Operation not permitted]" repeated 7 times between [2016-07-21 13:13:24.135071] and [2016-07-21 13:15:14.137666]
>>>> > [2016-07-21 13:15:24.134647] W [MSGID: 114031] [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-0: remote operation failed [Operation not permitted]
>>>> > [2016-07-21 13:15:24.134764] W [MSGID: 114031] [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-2: remote operation failed [Operation not permitted]
>>>> > [2016-07-21 13:15:24.134793] W [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 274741: READ => -1 gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0038f4 (Operation not permitted)
>>>> > [2016-07-21 13:15:34.135413] W [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 274756: READ => -1 gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0041d0 (Operation not permitted)
>>>> > [2016-07-21 13:15:44.141062] W [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 274818: READ => -1 gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0038f4 (Operation not permitted)
>>>> > [2016-07-21 13:15:54.133582] W [MSGID: 114031] [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-data-client-1: remote operation failed [Operation not permitted]
>>>> > [2016-07-21 13:15:54.133629] W [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 274853: READ => -1 gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0036d8 (Operation not permitted)
>>>> > [2016-07-21 13:16:04.133666] W [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 274879: READ => -1 gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0041d0 (Operation not permitted)
>>>> > [2016-07-21 13:16:14.134954] W [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 274894: READ => -1 gfid=441f2789-f6b1-4918-a280-1b9905a11429 fd=0x7f19bc0036d8 (Operation not permitted)
>>>> >
>>>> > Scott
>>>> >
>>>> > On Thu, Jul 21, 2016 at 6:57 AM Frank Rothenstein <[email protected]> wrote:
>>>> > Hey David,
>>>> >
>>>> > I have the very same problem on my test cluster, despite running ovirt 4.0. If you access your volumes via NFS all is fine; the problem is FUSE. I stayed on 3.7.13 but have no solution yet; for now I use NFS.
>>>> >
>>>> > Frank
>>>> >
>>>> > On Thursday, 21.07.2016, at 04:28 -0500, David Gossage wrote:
>>>> >> Anyone running one of the recent 3.6.x lines and gluster using 3.7.13? I am looking to upgrade gluster from 3.7.11 -> 3.7.13 for some bug fixes, but have been told by users on the gluster mailing list that, due to some gluster changes, I'd need to change the disk parameters to use writeback cache. Something to do with aio support being removed.
>>>> >>
>>>> >> I believe this could be done with custom parameters? But I believe storage tests are done using dd, and would they fail with the current settings then? On my last upgrade to 3.7.13 I had to roll back to 3.7.11 due to stability issues where gluster storage would go into a down state and always show N/A as space available/used, even though the hosts still saw the storage and VMs were running on it on all 3 hosts.
>>>> >>
>>>> >> Saw a lot of messages like these that went away once the gluster rollback finished:
>>>> >>
>>>> >> [2016-07-09 15:27:46.935694] I [fuse-bridge.c:4083:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.22 kernel 7.22
>>>> >> [2016-07-09 15:27:49.555466] W [MSGID: 114031] [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-1: remote operation failed [Operation not permitted]
>>>> >> [2016-07-09 15:27:49.556574] W [MSGID: 114031] [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-0: remote operation failed [Operation not permitted]
>>>> >> [2016-07-09 15:27:49.556659] W [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 80: READ => -1 gfid=deb61291-5176-4b81-8315-3f1cf8e3534d fd=0x7f5224002f68 (Operation not permitted)
>>>> >> [2016-07-09 15:27:59.612477] W [MSGID: 114031] [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-1: remote operation failed [Operation not permitted]
>>>> >> [2016-07-09 15:27:59.613700] W [MSGID: 114031] [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-0: remote operation failed [Operation not permitted]
>>>> >> [2016-07-09 15:27:59.613781] W [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 168: READ => -1 gfid=deb61291-5176-4b81-8315-3f1cf8e3534d fd=0x7f5224002f68 (Operation not permitted)
>>>> >>
>>>> >> David Gossage
>>>> >> Carousel Checks Inc. | System Administrator
>>>> >> Office 708.613.2284
>>>> >> _______________________________________________
>>>> >> Users mailing list
>>>> >> [email protected]
>>>> >> http://lists.ovirt.org/mailman/listinfo/users
>>>> >
>>>> > ______________________________________________________________________________
>>>> > BODDEN-KLINIKEN Ribnitz-Damgarten GmbH
>>>> > Sandhufe 2
>>>> > 18311 Ribnitz-Damgarten
>>>> >
>>>> > Telefon: 03821-700-0
>>>> > Fax: 03821-700-240
>>>> >
>>>> > E-Mail: [email protected]  Internet: http://www.bodden-kliniken.de
>>>> >
>>>> > Sitz: Ribnitz-Damgarten, Amtsgericht: Stralsund, HRB 2919, Steuer-Nr.: 079/133/40188
>>>> > Aufsichtsratsvorsitzende: Carmen Schröter, Geschäftsführer: Dr. Falko Milski
>>>> >
>>>> > The content of this e-mail is intended solely for the named addressee. If you are not the intended recipient of this e-mail or their representative, please note that any form of publication, reproduction, or forwarding of its content is prohibited. Please inform the sender immediately and delete the e-mail.
>>>> >
>>>> > Bodden-Kliniken Ribnitz-Damgarten GmbH 2016
>>>> > *** Virus-free thanks to Kerio Mail Server and Sophos Antivirus ***
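The dd-based direct-I/O check that comes up repeatedly in this thread can be approximated without going through VDSM. A rough, hypothetical sketch (not oVirt's actual code), assuming a Linux host where O_DIRECT requires block-aligned buffers and offsets:

```python
import contextlib
import errno
import mmap
import os
import tempfile

def probe_direct_io(directory):
    """Attempt one 4 KiB page-aligned O_DIRECT write in `directory`.

    Returns True if the filesystem accepted direct I/O, False if it
    refused it with EINVAL (the failure mode seen on the gluster mounts
    above); any other error is re-raised. Roughly what oVirt's
    `dd oflag=direct` liveness check exercises.
    """
    path = os.path.join(directory, "dio-probe.tmp")
    buf = mmap.mmap(-1, 4096)  # anonymous mmap => page-aligned buffer
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DIRECT, 0o600)
    except OSError as e:
        if e.errno == errno.EINVAL:  # fs rejects O_DIRECT at open()
            return False
        raise
    try:
        os.write(fd, buf)  # size and offset are block-aligned
        return True
    except OSError as e:
        if e.errno == errno.EINVAL:  # flag accepted, but write refused
            return False
        raise
    finally:
        os.close(fd)
        with contextlib.suppress(OSError):
            os.unlink(path)
        buf.close()

print(probe_direct_io(tempfile.gettempdir()))
```

Run against the FUSE mount of an affected volume, a probe like this should reproduce the EINVAL that dd reports, while a healthy mount (or one with the settings Krutika suggested) would return True.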
_______________________________________________
Users mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/users

