Dear Dietmar,

I am very interested in helping you with this geo-replication issue,
since we also have a geo-replication setup that is crucial for our
backup procedure. I just had a quick look at it, and for the moment I
can only suggest the following.

Regarding your question whether there is any suitable setting in the
gluster environment which would influence the speed of the processing
(current settings attached): you could try

gluster volume geo-replication mvol1 gl-slave-05-int::svol config sync_jobs 9

in order to increase the number of rsync processes.
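
To see what is currently set before changing anything, the session
configuration can be listed (a sketch; use the master volume and slave
spec of your own session, which looks like mvol1 gl-slave-01-int::svol1
in the status output below):

# list all configured options of the geo-replication session
gluster volume geo-replication mvol1 gl-slave-01-int::svol1 config

# print only the current sync_jobs value
gluster volume geo-replication mvol1 gl-slave-01-int::svol1 config sync_jobs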

Furthermore, taken from
https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.5/html/administration_guide/recommended_practices3


Performance Tuning

When the following option is set, it has been observed that there is
an increase in geo-replication performance. On the slave volume, run
the following command:

# gluster volume set SLAVE_VOL batch-fsync-delay-usec 0
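
If useful, the current value of that option on the slave volume can be
checked first (a sketch; replace svol1 with the actual slave volume
name):

# run on a slave node; prints the current value of the option
gluster volume get svol1 batch-fsync-delay-usec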

Also, can you verify that the changelog files are actually being consumed?
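
A simple way to check this (just a sketch, using the session and brick
directory names from your mail; the brick subdirectory differs per
node) is to see whether the number of pending changelogs decreases over
time:

# count the changelogs still waiting to be processed; repeat after a few minutes
ls /var/lib/misc/gluster/gsyncd/mvol1_gl-slave-01-int_svol1/brick1-mvol1/.processing | wc -l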


Regards,

Felix

On 03/03/2021 17:28, Dietmar Putz wrote:

Hi,

I'm having a problem with geo-replication. A short summary...
About two months ago I added two further nodes to a distributed
replicated volume. For that purpose I stopped the geo-replication,
added two nodes on mvol and svol, and started a rebalance process on
both sides. Once the rebalance process was finished I started the
geo-replication again.

After a few days, and besides some Unicode errors, the status of the
newly added brick changed from hybrid crawl to history crawl. Since
then there has been no progress; no files or directories have been
created on svol for a couple of days.

Looking for a possible reason, I noticed that there was no
/var/log/glusterfs/geo-replication-slaves/mvol1_gl-slave-01-int_svol1
directory on the newly added slave nodes.
Obviously I had forgotten to add the new svol node IP addresses to
/etc/hosts on all masters. After fixing that I ran the '... execute
gsec_create' and '... create push-pem force' commands again, and the
corresponding directories were created. Geo-replication started
normally, all active sessions were in history crawl (as shown below),
and for a short while some data was transferred to svol. But for about
a week nothing has changed on svol, 0 bytes transferred.
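
For completeness, the commands I re-ran were essentially the usual key
distribution and session re-creation, roughly like this (sketched from
memory, so the exact invocation may have differed slightly):

# on the primary master node: regenerate the common secret pem keys
gluster system:: execute gsec_create

# re-create the session and push the pem keys to the slave nodes
gluster volume geo-replication mvol1 gl-slave-01-int::svol1 create push-pem force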

Meanwhile I have deleted (without reset-sync-time) and recreated the
geo-rep session. The current status is as shown below, but without any
last_synced date.
An entry like "last_synced_entry": 1609283145 is still visible in
/var/lib/glusterd/geo-replication/mvol1_gl-slave-01-int_svol1/*status,
and changelog files are continuously created in
/var/lib/misc/gluster/gsyncd/mvol1_gl-slave-01-int_svol1/<brick>/.processing.
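
To be precise about "deleted (without reset-sync-time)", this is
roughly what I ran (again sketched from memory):

# stop the session and delete it, keeping the stored sync time
gluster volume geo-replication mvol1 gl-slave-01-int::svol1 stop
gluster volume geo-replication mvol1 gl-slave-01-int::svol1 delete

# a 'delete reset-sync-time' would instead have discarded the sync time
# and forced the next session to sync everything from the beginning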


A short time ago I changed log_level to DEBUG for a moment.
Unfortunately I got an 'EOFError: Ran out of input' in gsyncd.log, and
the rebuild of .processing started from the beginning.
But one of the first very long lines in gsyncd.log looks like:

[2021-03-03 11:59:39.503881] D [repce(worker
/brick1/mvol1):215:__call__] RepceClient: call
9163:139944064358208:1614772779.4982471 history_getchanges ->
['/var/lib/misc/gluster/gsyncd/mvol1_gl-slave-01-int_svol1/brick1-mvol1/.history/.processing/CHANGELOG.1609280278',...

1609280278 means Tuesday, December 29, 2020, 10:17:58 PM and would
roughly match the last_synced date.
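
(In case it is useful, the epoch suffix of the changelog file names can
be converted with GNU date:)

# print the changelog epoch timestamp as a human-readable UTC date
date -u -d @1609280278
# -> Tue Dec 29 22:17:58 UTC 2020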

However, I have nearly 300k files in <brick>/.history/.processing, and
from the log/trace it seems that every file in
<brick>/.history/.processing gets processed and transferred to
<brick>/.processing.
My questions so far...
First of all, is everything still ok with this geo-replication?
Do I have to wait until all changelog files in
<brick>/.history/.processing are processed before transfers to svol start?
What happens if any other error appears in geo-replication while these
changelog files are processed, i.e. while the crawl status is history
crawl? Does the entire process start from the beginning? Would a
checkpoint be helpful for future decisions (see the checkpoint sketch
below)?
Is there any suitable setting in the gluster environment which would
influence the speed of the processing (current settings attached)?
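
Regarding the checkpoint question above, what I have in mind is roughly
the documented checkpoint mechanism, sketched here (not yet tried on
this session):

# set a checkpoint at the current time for the session
gluster volume geo-replication mvol1 gl-slave-01-int::svol1 config checkpoint now

# 'status detail' should later show whether the checkpoint was completed
gluster volume geo-replication mvol1 gl-slave-01-int::svol1 status detail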


I hope someone can help...

best regards
dietmar



[ 15:17:47 ] - root@gl-master-01
/var/lib/misc/gluster/gsyncd/mvol1_gl-slave-01-int_svol1/brick1-mvol1/.history
$ls .processing/ | wc -l
294669

[ 12:56:31 ] - root@gl-master-01  ~ $gluster volume geo-replication mvol1 gl-slave-01-int::svol1 status

MASTER NODE         MASTER VOL    MASTER BRICK     SLAVE USER    SLAVE                     SLAVE NODE         STATUS     CRAWL STATUS     LAST_SYNCED
----------------------------------------------------------------------------------------------------------------------------------------------------
gl-master-01-int    mvol1         /brick1/mvol1    root          gl-slave-01-int::svol1    gl-slave-05-int    Active     History Crawl    2020-12-29 23:00:48
gl-master-01-int    mvol1         /brick2/mvol1    root          gl-slave-01-int::svol1    gl-slave-03-int    Active     History Crawl    2020-12-29 23:05:45
gl-master-05-int    mvol1         /brick1/mvol1    root          gl-slave-01-int::svol1    gl-slave-03-int    Active     History Crawl    2021-02-20 17:38:38
gl-master-06-int    mvol1         /brick1/mvol1    root          gl-slave-01-int::svol1    gl-slave-06-int    Passive    N/A              N/A
gl-master-03-int    mvol1         /brick1/mvol1    root          gl-slave-01-int::svol1    gl-slave-05-int    Passive    N/A              N/A
gl-master-03-int    mvol1         /brick2/mvol1    root          gl-slave-01-int::svol1    gl-slave-04-int    Active     History Crawl    2020-12-29 23:07:34
gl-master-04-int    mvol1         /brick1/mvol1    root          gl-slave-01-int::svol1    gl-slave-06-int    Active     History Crawl    2020-12-29 23:07:22
gl-master-04-int    mvol1         /brick2/mvol1    root          gl-slave-01-int::svol1    gl-slave-01-int    Passive    N/A              N/A
gl-master-02-int    mvol1         /brick1/mvol1    root          gl-slave-01-int::svol1    gl-slave-01-int    Passive    N/A              N/A
gl-master-02-int    mvol1         /brick2/mvol1    root          gl-slave-01-int::svol1    gl-slave-06-int    Passive    N/A              N/A
[ 13:14:47 ] - root@gl-master-01  ~ $


________



Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
[email protected]
https://lists.gluster.org/mailman/listinfo/gluster-users