Re: backup_calculate_cluster_size does not consider source

2019-11-06 Thread Vladimir Sementsov-Ogievskiy
06.11.2019 16:52, Max Reitz wrote: > On 06.11.19 14:34, Dietmar Maurer wrote: >> >>> On 6 November 2019 14:17 Max Reitz wrote: >>> >>> >>> On 06.11.19 14:09, Dietmar Maurer wrote: > Let me elaborate: Yes, a cluster size generally means that it is most > “efficient” to access the storage

Re: backup_calculate_cluster_size does not consider source

2019-11-06 Thread Max Reitz
On 06.11.19 14:34, Dietmar Maurer wrote: > >> On 6 November 2019 14:17 Max Reitz wrote: >> >> >> On 06.11.19 14:09, Dietmar Maurer wrote: Let me elaborate: Yes, a cluster size generally means that it is most “efficient” to access the storage at that size. But there’s a tradeoff.

Re: backup_calculate_cluster_size does not consider source

2019-11-06 Thread Dietmar Maurer
> On 6 November 2019 14:17 Max Reitz wrote:
>
>
> On 06.11.19 14:09, Dietmar Maurer wrote:
> >> Let me elaborate: Yes, a cluster size generally means that it is most
> >> “efficient” to access the storage at that size. But there’s a tradeoff.
> >> At some point, reading the data takes suffi

Re: backup_calculate_cluster_size does not consider source

2019-11-06 Thread Max Reitz
On 06.11.19 14:09, Dietmar Maurer wrote:
>> Let me elaborate: Yes, a cluster size generally means that it is most
>> “efficient” to access the storage at that size. But there’s a tradeoff.
>> At some point, reading the data takes sufficiently long that reading a
>> bit of metadata doesn’t matter

Re: backup_calculate_cluster_size does not consider source

2019-11-06 Thread Dietmar Maurer
> Let me elaborate: Yes, a cluster size generally means that it is most
> “efficient” to access the storage at that size. But there’s a tradeoff.
> At some point, reading the data takes sufficiently long that reading a
> bit of metadata doesn’t matter anymore (usually, that is).
Any network stor

Re: backup_calculate_cluster_size does not consider source

2019-11-06 Thread Max Reitz
On 06.11.19 12:22, Max Reitz wrote:
> On 06.11.19 12:18, Dietmar Maurer wrote:
>>> And if it issues a smaller request, there is no way for a guest device
>>> to tell it “OK, here’s your data, but note we have a whole 4 MB chunk
>>> around it, maybe you’d like to take that as well...?”
>>>
>>> I und

Re: backup_calculate_cluster_size does not consider source

2019-11-06 Thread Max Reitz
On 06.11.19 12:18, Dietmar Maurer wrote:
>> And if it issues a smaller request, there is no way for a guest device
>> to tell it “OK, here’s your data, but note we have a whole 4 MB chunk
>> around it, maybe you’d like to take that as well...?”
>>
>> I understand wanting to increase the backup buff

Re: backup_calculate_cluster_size does not consider source

2019-11-06 Thread Dietmar Maurer
> And if it issues a smaller request, there is no way for a guest device
> to tell it “OK, here’s your data, but note we have a whole 4 MB chunk
> around it, maybe you’d like to take that as well...?”
>
> I understand wanting to increase the backup buffer size, but I don’t
> quite understand why w

Re: backup_calculate_cluster_size does not consider source

2019-11-06 Thread Max Reitz
On 06.11.19 11:34, Wolfgang Bumiller wrote: > On Wed, Nov 06, 2019 at 10:37:04AM +0100, Max Reitz wrote: >> On 06.11.19 09:32, Stefan Hajnoczi wrote: >>> On Tue, Nov 05, 2019 at 11:02:44AM +0100, Dietmar Maurer wrote: Example: Backup from ceph disk (rbd_cache=false) to local disk: ba

Re: backup_calculate_cluster_size does not consider source

2019-11-06 Thread Max Reitz
On 06.11.19 11:18, Dietmar Maurer wrote:
>> The thing is, it just seems unnecessary to me to take the source cluster
>> size into account in general. It seems weird that a medium only allows
>> 4 MB reads, because, well, guests aren’t going to take that into account.
>
> Maybe it is strange, but

Re: backup_calculate_cluster_size does not consider source

2019-11-06 Thread Wolfgang Bumiller
On Wed, Nov 06, 2019 at 10:37:04AM +0100, Max Reitz wrote:
> On 06.11.19 09:32, Stefan Hajnoczi wrote:
> > On Tue, Nov 05, 2019 at 11:02:44AM +0100, Dietmar Maurer wrote:
> >> Example: Backup from ceph disk (rbd_cache=false) to local disk:
> >>
> >> backup_calculate_cluster_size returns 64K (correc

Re: backup_calculate_cluster_size does not consider source

2019-11-06 Thread Dietmar Maurer
> The thing is, it just seems unnecessary to me to take the source cluster
> size into account in general. It seems weird that a medium only allows
> 4 MB reads, because, well, guests aren’t going to take that into account.
Maybe it is strange, but it is quite obvious that there is an optimal clu

Re: backup_calculate_cluster_size does not consider source

2019-11-06 Thread Max Reitz
On 06.11.19 09:32, Stefan Hajnoczi wrote:
> On Tue, Nov 05, 2019 at 11:02:44AM +0100, Dietmar Maurer wrote:
>> Example: Backup from ceph disk (rbd_cache=false) to local disk:
>>
>> backup_calculate_cluster_size returns 64K (correct for my local .raw image)
>>
>> Then the backup job starts to read 6

Re: backup_calculate_cluster_size does not consider source

2019-11-06 Thread Stefan Hajnoczi
On Tue, Nov 05, 2019 at 11:02:44AM +0100, Dietmar Maurer wrote:
> Example: Backup from ceph disk (rbd_cache=false) to local disk:
>
> backup_calculate_cluster_size returns 64K (correct for my local .raw image)
>
> Then the backup job starts to read 64K blocks from ceph.
>
> But ceph always reads

backup_calculate_cluster_size does not consider source

2019-11-05 Thread Dietmar Maurer
Example: Backup from ceph disk (rbd_cache=false) to local disk:
backup_calculate_cluster_size returns 64K (correct for my local .raw image)
Then the backup job starts to read 64K blocks from ceph.
But ceph always reads 4M block, so this is incredibly slow and produces way too much network traffi
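
To put rough numbers on the report above, here is a small Python sketch of the arithmetic. It is not QEMU code: the function names are made up for illustration, and it assumes, as the post states, that the source fetches a whole 4M chunk for every request that touches it, with requests aligned to their own power-of-two size. The second function shows one possible policy (use the larger of the two sizes); whether the backup job should do this at all is what the replies in this thread debate.

```python
# Illustrative sketch only; none of these names exist in QEMU.
KIB = 1024
MIB = 1024 * KIB


def read_amplification(request_size: int, source_chunk_size: int) -> float:
    """Bytes fetched from the source per byte requested.

    Assumption: every request pulls in each source chunk it touches in
    full, and requests are aligned to their own (power-of-two) size.
    """
    chunks_touched = -(-request_size // source_chunk_size)  # ceiling division
    return chunks_touched * source_chunk_size / request_size


def pick_copy_size(target_cluster_size: int, source_chunk_size: int) -> int:
    """Hypothetical policy: copy in units of the larger of the two sizes."""
    return max(target_cluster_size, source_chunk_size)


target_cluster = 64 * KIB  # value reported for the local .raw target
source_chunk = 4 * MIB     # chunk size the post attributes to ceph

# Under the stated assumption, each 64K read costs a full 4M fetch.
print(read_amplification(target_cluster, source_chunk))

# With the hypothetical policy, each chunk is fetched once.
copy_size = pick_copy_size(target_cluster, source_chunk)
print(copy_size, read_amplification(copy_size, source_chunk))
```

Under these assumptions the first figure is the ratio 4M / 64K, which is the extra network traffic the post complains about; a copy size equal to the source chunk size brings the ratio down to 1.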