Dear Amit,

That is not exactly the issue we're facing; I have two clusters: an old
4.2 cluster and a fresh 4.3 cluster. The DC is set to 4.2 because of this.

I wish to bring the 4.3 cluster online and migrate everything to it,
then remove the 4.2 cluster.

This should be a supported path, imho.

Otherwise I could upgrade the current 4.2 cluster, although it gets
decommissioned anyway, but then I have the same problem: hosts upgraded
to 4.3 cannot connect to the NFS storage or be activated until I upgrade
them all and set the cluster and DC to 4.3.

I could detach the NFS domain for a bit, but we use this mount for
backup disks of several VMs, and I'd rather not migrate everything in a
hurry or do without these backups for a longer time.

Regards,

Jorick Astrego

On 2/8/20 11:51 AM, Amit Bawer wrote:
>  I doubt you can use 4.3.8 nodes with a 4.2 cluster without
> upgrading it first. But maybe members of this list could say differently.
>
> On Friday, February 7, 2020, Jorick Astrego <[email protected]
> <mailto:[email protected]>> wrote:
>
>
>     On 2/6/20 6:22 PM, Amit Bawer wrote:
>>
>>
>>     On Thu, Feb 6, 2020 at 2:54 PM Jorick Astrego <[email protected]
>>     <mailto:[email protected]>> wrote:
>>
>>
>>         On 2/6/20 1:44 PM, Amit Bawer wrote:
>>>
>>>
>>>         On Thu, Feb 6, 2020 at 1:07 PM Jorick Astrego
>>>         <[email protected] <mailto:[email protected]>> wrote:
>>>
>>>             Here you go, this is from the activation I just did a
>>>             couple of minutes ago.
>>>
>>>         I was hoping to see how it was first connected to the host,
>>>         but the log doesn't go that far back. Anyway, the storage
>>>         domain type is set from the engine, and as far as I saw,
>>>         vdsm never tries to guess it.
>>
>>         I put the host in maintenance and activated it again, this
>>         should give you some more info. See attached log.
>>
>>>         Could you query the engine db about the misbehaving domain
>>>         and paste the results?
>>>
>>>         # su - postgres
>>>         Last login: Thu Feb  6 07:17:52 EST 2020 on pts/0
>>>         -bash-4.2$
>>>         LD_LIBRARY_PATH=/opt/rh/rh-postgresql10/root/lib64/  
>>>         /opt/rh/rh-postgresql10/root/usr/bin/psql engine
>>>         psql (10.6)
>>>         Type "help" for help.
>>>         engine=# select * from storage_domain_static where id =
>>>         'f5d2f7c6-093f-46d6-a844-224d92db5ef9' ;
>>
>>
>>             engine=# select * from storage_domain_static where id =
>>             'f5d2f7c6-093f-46d6-a844-224d92db5ef9' ;
>>
>>             id                                    | f5d2f7c6-093f-46d6-a844-224d92db5ef9
>>             storage                               | b8b456f0-27c3-49b9-b5e9-9fa81fb3cdaa
>>             storage_name                          | backupnfs
>>             storage_domain_type                   | 1
>>             storage_type                          | 1
>>             storage_domain_format_type            | 4
>>             _create_date                          | 2018-01-19 13:31:25.899738+01
>>             _update_date                          | 2019-02-14 14:36:22.3171+01
>>             recoverable                           | t
>>             last_time_used_as_master              | 1530772724454
>>             storage_description                   |
>>             storage_comment                       |
>>             wipe_after_delete                     | f
>>             warning_low_space_indicator           | 10
>>             critical_space_action_blocker         | 5
>>             first_metadata_device                 |
>>             vg_metadata_device                    |
>>             discard_after_delete                  | f
>>             backup                                | f
>>             warning_low_confirmed_space_indicator | 0
>>             block_size                            | 512
>>             (1 row)
>>
>>
>>
>>     Thanks for sharing,
>>
>>     The storage_type in the db is indeed NFS (1), and the
>>     storage_domain_format_type is 4. For oVirt 4.3 the default
>>     storage_domain_format_type is 5, and a datacenter upgrade is
>>     usually required for the 4.2 to 4.3 migration, which may not be
>>     possible in your current setup since you have 4.2 nodes using
>>     this storage as well.
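The format/level relationship can be sketched as a small lookup (the defaults below are my reading of the oVirt docs, not authoritative; verify against your engine version):

```python
# Hedged sketch: default storage-domain format per datacenter
# compatibility level (values assumed from oVirt documentation;
# the V5 format arrived with the 4.3 level).
DEFAULT_SD_FORMAT = {"4.1": 4, "4.2": 4, "4.3": 5}

def expects_newer_format(dc_level, current_format):
    """True if a DC at dc_level defaults to a newer domain format."""
    return DEFAULT_SD_FORMAT.get(dc_level, 5) > current_format

# The backupnfs domain above reports format 4, so a 4.3 DC would
# expect an upgrade to V5, while a 4.2 DC would not:
print(expects_newer_format("4.3", 4))  # prints True
print(expects_newer_format("4.2", 4))  # prints False
```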
>>
>>     Regarding the repeating monitor failure for the SD:
>>
>>     2020-02-05 14:17:54,190+0000 WARN  (monitor/f5d2f7c)
>>     [storage.LVM] Reloading VGs failed
>>     (vgs=[u'f5d2f7c6-093f-46d6-a844-224d92db5ef9'] rc=5 out=[]
>>     err=['  Volume group "f5d2f7c6-093f-46d6-a844-224d92db5ef9" not
>>     found', '  Cannot process volume group
>>     f5d2f7c6-093f-46d6-a844-224d92db5ef9']) (lvm:470)
>>
>>     This error means that the monitor first tried to query the SD as
>>     a VG and failed; this is expected for the fallback code that
>>     looks for a domain missing from the SD cache:
>>
>>     def _findUnfetchedDomain(self, sdUUID):
>>         ...
>>         for mod in (blockSD, glusterSD, localFsSD, nfsSD):
>>             try:
>>                 return mod.findDomain(sdUUID)
>>             except se.StorageDomainDoesNotExist:
>>                 pass
>>             except Exception:
>>                 self.log.error(
>>                     "Error while looking for domain `%s`",
>>                     sdUUID, exc_info=True)
>>         raise se.StorageDomainDoesNotExist(sdUUID)
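As a standalone illustration, the probing order can be sketched like this (class and backend names here are illustrative stand-ins, not vdsm's real modules); it shows why the earlier block/VG miss is logged but harmless when the domain is actually NFS:

```python
# Minimal sketch of the "try each backend in turn" fallback above.
class StorageDomainDoesNotExist(Exception):
    pass

def make_backend(name, known):
    """Return a finder that only recognizes the UUIDs in `known`."""
    def find(sd_uuid):
        if sd_uuid in known:
            return name
        raise StorageDomainDoesNotExist(sd_uuid)
    return find

def find_unfetched_domain(sd_uuid, backends):
    # Misses from earlier backends are swallowed, mirroring the vdsm
    # loop; only a miss from every backend raises to the caller.
    for find in backends:
        try:
            return find(sd_uuid)
        except StorageDomainDoesNotExist:
            pass
    raise StorageDomainDoesNotExist(sd_uuid)

uuid = "f5d2f7c6-093f-46d6-a844-224d92db5ef9"
backends = [make_backend("blockSD", set()),   # VG probe misses first
            make_backend("nfsSD", {uuid})]
print(find_unfetched_domain(uuid, backends))  # prints nfsSD
```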
>>
>>     2020-02-05 14:17:54,201+0000 ERROR (monitor/f5d2f7c)
>>     [storage.Monitor] Setting up monitor for
>>     f5d2f7c6-093f-46d6-a844-224d92db5ef9 failed (monitor:330)
>>     Traceback (most recent call last):
>>       File
>>     "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line
>>     327, in _setupLoop
>>         self._setupMonitor()
>>       File
>>     "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line
>>     349, in _setupMonitor
>>         self._produceDomain()
>>       File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line
>>     159, in wrapper
>>         value = meth(self, *a, **kw)
>>       File
>>     "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line
>>     367, in _produceDomain
>>         self.domain = sdCache.produce(self.sdUUID)
>>       File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py",
>>     line 110, in produce
>>         domain.getRealDomain()
>>       File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py",
>>     line 51, in getRealDomain
>>         return self._cache._realProduce(self._sdUUID)
>>       File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py",
>>     line 134, in _realProduce
>>         domain = self._findDomain(sdUUID)
>>       File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py",
>>     line 151, in _findDomain
>>         return findMethod(sdUUID)
>>       File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py",
>>     line 176, in _findUnfetchedDomain
>>         raise se.StorageDomainDoesNotExist(sdUUID)
>>     StorageDomainDoesNotExist: Storage domain does not exist:
>>     (u'f5d2f7c6-093f-46d6-a844-224d92db5ef9',)
>>
>>     This part of the error means it failed to find the domain as any
>>     possible type, including NFS.
>>
>>     Are you able to create a new NFS storage domain on the same
>>     storage server (on another export path, so as not to harm the
>>     existing one)?
>>     If you succeed in connecting to it from the 4.3 datacenter, it
>>     could mean the V4 format is the issue;
>>     otherwise it could mean that 4.3 requires different NFS
>>     settings.
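To test the export by hand before involving engine, one could mount it directly from a 4.3 host; below is a small helper to build the command (the server path is a placeholder, and the option string is an assumption, not vdsm's exact default):

```python
def mount_command(server_export, mountpoint,
                  opts="soft,nosharecache,timeo=100,retrans=3"):
    # Build a mount invocation roughly like the one vdsm issues for
    # NFS domains; adjust opts (e.g. nfsvers=) to match your setup.
    return ["mount", "-t", "nfs", "-o", opts, server_export, mountpoint]

cmd = mount_command("nfs.example.com:/export/backupnfs-test", "/mnt/probe")
print(" ".join(cmd))
# Run the printed command as root on the 4.3 host; if it fails,
# check journalctl / dmesg for the underlying NFS error.
```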
>
>     Well, this will be a problem either way: when I add a new NFS
>     domain it will not be storage_domain_format_type 5, as the DC is
>     still on 4.2.
>
>     Also, when I do add a type 5 NFS domain, won't the 4.2 nodes try
>     to mount it, fail, and then become non-responsive, taking the
>     whole running cluster down?
>
>
>     Regards,
>
>     Jorick Astrego
>
>
>
>
>
>
>
>     Met vriendelijke groet, With kind regards,
>
>     Jorick Astrego
>     *
>     Netbulae Virtualization Experts *
>     ------------------------------------------------------------------------
>     Tel: 053 20 30 270        [email protected] <mailto:[email protected]>
>     Staalsteden 4-3A
>       KvK 08198180
>     Fax: 053 20 30 271        www.netbulae.eu <http://www.netbulae.eu>        
> 7547
>     TA Enschede       BTW NL821234584B01
>
>
>     ------------------------------------------------------------------------
>




Met vriendelijke groet, With kind regards,

Jorick Astrego

Netbulae Virtualization Experts 

----------------

        Tel: 053 20 30 270      [email protected]        Staalsteden 4-3A        
KvK 08198180
        Fax: 053 20 30 271      www.netbulae.eu         7547 TA Enschede        
BTW NL821234584B01

----------------

_______________________________________________
Users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/[email protected]/message/GGBJYUDTJ3SQEKJJDNBBEENO3AGOANBK/
