[ 
https://issues.apache.org/jira/browse/SOLR-15500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17396272#comment-17396272
 ] 

Jason Gerlowski commented on SOLR-15500:
----------------------------------------

Thanks for offering to help out, Sayan.  Maybe we can get into specifics a bit.

(Sorry for the late reply - quite behind on mail - hopefully your offer still 
stands.)

You may already be aware of this, but Solr 8.9 recently added support for 
"incremental backups" (see 
[here|https://issues.apache.org/jira/browse/SOLR-15086] or 
[here|https://cwiki.apache.org/confluence/display/SOLR/SIP-12%3A+Incremental+Backup+and+Restore]).
  This work changed the backup format so that multiple backups can now live 
side-by-side in the same location. Repeated backups of the same collection are 
smart enough to skip uploading files that haven't changed since a previous 
backup uploaded them.  This brings big efficiency improvements for that 
common backup use case, which is cool.
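
For illustration only, the "skip files that haven't changed" idea can be 
sketched roughly as below.  This is not Solr's actual implementation: the 
function names, the name-to-checksum map, and the use of SHA-256 are all 
assumptions made for the sketch.

```python
import hashlib
import tempfile
from pathlib import Path


def checksum(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def files_to_upload(index_dir: Path, already_stored: dict[str, str]) -> list[str]:
    """Return names of index files that are missing from, or differ in, the backup location.

    `already_stored` is a hypothetical map of file name -> checksum describing
    what earlier backups left in the shared location.
    """
    pending = []
    for f in sorted(index_dir.iterdir()):
        # Upload only when no earlier backup stored an identical file.
        if f.is_file() and already_stored.get(f.name) != checksum(f):
            pending.append(f.name)
    return pending


def demo() -> list[str]:
    """Build a tiny fake index directory and report which files still need uploading."""
    with tempfile.TemporaryDirectory() as tmp:
        index_dir = Path(tmp)
        (index_dir / "_0.cfs").write_bytes(b"segment zero")
        (index_dir / "_1.cfs").write_bytes(b"segment one")
        # Pretend an earlier backup already stored _0.cfs unchanged.
        stored = {"_0.cfs": checksum(index_dir / "_0.cfs")}
        return files_to_upload(index_dir, stored)


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is only that the comparison works per file, so 
whatever earlier backups stored has to remain individually identifiable.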

But it does complicate your compression use case somewhat, unfortunately.  
Compression is easy to imagine on Solr's legacy backup structure, but with the 
new file structure Solr scans the files already present in the backup location 
to identify which index files were uploaded by a previous backup, so those 
files presumably need to stay individually visible.  Were you 
aware of the new format changes by chance, and if so, did you have any ideas 
how that might be handled?  I guess we would just leave the uncompressed files 
around after creating a zip/tarball of them?  Or maybe compression is only 
something we'd support in the legacy backup file format?

Had you thought at all about how you'd do the compression?

> Compressed Backup
> -----------------
>
>                 Key: SOLR-15500
>                 URL: https://issues.apache.org/jira/browse/SOLR-15500
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Sayan Das
>            Priority: Major
>
> Right now in BliBli, we do dirty hacks to compress backups from the backup 
> scheduler VMs. It would be great if we can improve collection BACKUP command 
> with some expert flag which can compress the backup.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

