Hello Zsolt, On Wed, Jan 4, 2023 at 10:00 PM Zsolt Kozak <koza...@gmail.com> wrote:
> Hello, > > I have some problems with backuping Kubernetes PVCs with Bacula Kubernetes > Plugin. > > I am using the latest 13.0.1 Bacula from the community builds on Debian > Bullseye hosts. > > Backuping only the Kubernetes objects except Persistent Volume Claims > (PVC) works like a charm. I've installed the Kubernetes plugin and the > latest Bacula File Daemon on the master node (control plane) of our > Kubernetes cluster. Bacula can access the Kubernetes cluster and backup > every single object as YAML files. > > The interesting part comes with trying to backup a PVC... > > First of all I could build my own Bacula Backup Proxy Pod Image from the > source and it's deployed into our local Docker image repository (repo). The > Bacula File Daemon is configured properly I guess. Backup process started > and the following things happened. > You mentioned you could run a kubernetes backup of all resources successfully, thus the Bacula File Daemon should be ok. > 1. Bacula File Daemon deployed Bacula Backup Proxy Pod Image into the > Kubernetes cluster, so Bacula-backup container pod started. > 2. I got into the pod and I could see the Baculatar application started and > running. > 3. The k8s_backend application started on the Bacula File Daemon host > (kubernetes.server) in 2 instances. > 4. From the Bacula-backup pod I could check that Baculatar could connect > to the k8s_backend at the default 9104 port (kubernetes.server:9104). > All fine so far! > 5. I checked the console messages of the job with Bat that Bacula File > Daemon started to process the configured PVC, started to write a pvc.tar > but nothing happened. > 6. After default 600 sec, after timeout the job was cancelled. > Ok, so we have a problem. > 7. It may be important that Bacula File Daemon could not delete the > Bacula-backup pod. (It could create it but could not delete it.) > This is a design decision. If there is a failure with the bacula-backup pod, it is not removed by Bacula. It requires the kubernetes admin to manually remove it. > Could you please tell me what's wrong? > > > Here are some log parts. (I've changed some sensitive data.) > > > Bacula File Daemon configuration: > > FileSet { > Name = "Kubernetes Set" > Include { > Options { > signature = SHA512 > compression = GZIP > Verify = pins3 > } > Plugin = "kubernetes: \ > debug=1 \ > baculaimage=repo/bacula-backup:04jan23 \ > namespace=namespace \ > pvcdata \ > pluginhost=kubernetes.server \ > timeout=120 \ > verify_ssl=0 \ > fdcertfile=/etc/bacula/certs/bacula-backup.cert \ > fdkeyfile=/etc/bacula/certs/bacula-backup.key" > } > } > > > > Bacula File Daemon debug log (parts): > > > DEBUG:[baculak8s/jobs/estimation_job.py:134 in processing_loop] processing > get_annotated_namespaced_pods_data:namespace:nrfound:0 > DEBUG:[baculak8s/plugins/kubernetes_plugin.py:319 in > list_pvcdata_for_namespace] list pvcdata for namespace:namespace > pvcfilter=True estimate=False > DEBUG:[baculak8s/plugins/k8sbackend/pvcdata.py:108 in > pvcdata_list_namespaced] pvcfilter: True > DEBUG:[baculak8s/plugins/k8sbackend/pvcdata.py:112 in > pvcdata_list_namespaced] found:some-claim > DEBUG:[baculak8s/plugins/k8sbackend/pvcdata.py:127 in > pvcdata_list_namespaced] add pvc: {'name': 'some-claim', 'node_name': None, > 'storage_class_name': 'nfs-client', 'capacity': '2Gi', 'fi': > <baculak8s.entities.file_info.FileInfo object at 0x7ffaa55bfcc0>} > DEBUG:[baculak8s/jobs/estimation_job.py:165 in processing_loop] processing > list_pvcdata_for_namespace:namespace:nrfound:1 > DEBUG:[baculak8s/jobs/estimation_job.py:172 in processing_loop] > PVCDATA:some-claim:{'name': 'some-claim', 'node_name': 'node1', > 'storage_class_name': 'nfs-client', 'capacity': '2Gi', 'fi': > <baculak8s.entities.file_info.FileInfo object at 0x7ffaa55bfcc0>} > DEBUG:[baculak8s/io/log.py:110 in save_sent_packet] Sent Packet > I000041 > Start backup volume claim: some-claim > > DEBUG:[baculak8s/jobs/job_pod_bacula.py:298 in prepare_bacula_pod] > prepare_bacula_pod:token=xx88M5oggQJ....4YDbSwBRxTOhT namespace=namespace > DEBUG:[baculak8s/jobs/job_pod_bacula.py:136 in prepare_pod_yaml] pvcdata: > {'name': 'some-claim', 'node_name': 'node1', 'storage_class_name': > 'nfs-client', 'capacity': '2Gi', 'fi': > <baculak8s.entities.file_info.FileInfo object at 0x7ffaa55bfcc0>} > DEBUG:[baculak8s/plugins/k8sbackend/baculabackup.py:102 in > prepare_backup_pod_yaml] host:kubernetes.server port:9104 > namespace:namespace image:repo/bacula-backup:04jan23 > job:KubernetesBackup.2023-01-04_21.05.03_10:410706 > DEBUG:[baculak8s/io/log.py:110 in save_sent_packet] Sent Packet > I000149 > Prepare Bacula Pod on: node1 with: repo/bacula-backup:04jan23 > <IfNotPresent> kubernetes.server:9104 > > DEBUG:[baculak8s/jobs/job_pod_bacula.py:198 in prepare_connection_server] > prepare_connection_server:New ConnectionServer: 0.0.0.0:9104 > DEBUG:[baculak8s/util/sslserver.py:180 in listen] > ConnectionServer:Listening... > DEBUG:[baculak8s/jobs/job_pod_bacula.py:307 in prepare_bacula_pod] > prepare_bacula_pod:start pod > INFO:[baculak8s/plugins/kubernetes_plugin.py:771 in backup_pod_isready] > backup_pod_status:isReady: False / 0 > INFO:[baculak8s/plugins/kubernetes_plugin.py:771 in backup_pod_isready] > backup_pod_status:isReady: True / 1 > DEBUG:[baculak8s/jobs/estimation_job.py:183 in _estimate_file] {'name': > 'some-claim', 'node_name': 'node1', 'storage_class_name': 'nfs-client', > 'capacity': '2Gi', 'fi': <baculak8s.entities.file_info.FileInfo object at > 0x7ffaa55bfcc0>} > DEBUG:[baculak8s/jobs/estimation_job.py:190 in _estimate_file] file_info: > {FileInfo > name:/@kubernetes/namespaces/namespace/persistentvolumeclaims/some-claim.tar > namespace:None type:F objtype:pvcdata cached:False} > DEBUG:[baculak8s/io/log.py:110 in save_sent_packet] Sent Packet > C000079 > > FNAME:/@kubernetes/namespaces/namespace/persistentvolumeclaims/some-claim.tar > > > DEBUG:[baculak8s/io/log.py:110 in save_sent_packet] Sent Packet > C000040 > TSTAMP:1672861077 1672861077 1672861077 > > DEBUG:[baculak8s/io/log.py:110 in save_sent_packet] Sent Packet > C000031 > STAT:F 2147483648 0 0 100640 1 > > DEBUG:[baculak8s/io/log.py:110 in save_sent_packet] Sent Packet > F000000 > (EOD PACKET) > > DEBUG:[baculak8s/jobs/backup_job.py:77 in __backup_pvcdata] > backup_pvcdata:data recv > DEBUG:[baculak8s/io/log.py:110 in save_sent_packet] Sent Packet > C000005 > DATA > > DEBUG:[baculak8s/util/sslserver.py:193 in handle_connection] > ConnectionServer:Connection from: ('192.168.XX.YY', 10541) > DEBUG:[baculak8s/util/sslserver.py:145 in gethello] ['Hello', > 'KubernetesBackup.2023-01-04_21.05.03_10', '410706'] > DEBUG:[baculak8s/util/token.py:57 in check_auth_data] AUTH_DATA:Token: > xx88M5oggQJuGsPbtD........ohQjeU7PkA4YDbSwBRxTOhT > DEBUG:[baculak8s/util/token.py:59 in check_auth_data] > RECV_TOKEN_DATA:Token: xx88M5oggQJuGsPbtD....ohQjeU7PkA4YDbSwBRxTOhT > DEBUG:[baculak8s/util/sslserver.py:105 in authenticate] > ConnectionServer:Authenticated > > .... after timeout > > DEBUG:[baculak8s/jobs/job_pod_bacula.py:121 in handle_pod_data_recv] > handle_pod_data_recv:EOT > DEBUG:[baculak8s/util/sslserver.py:201 in handle_connection] > ConnectionServer:Finish - disconnect. > DEBUG:[baculak8s/jobs/backup_job.py:85 in __backup_pvcdata] > backup_pvcdata:logs recv > > > > > Job messages: > > bacula-dir No prior or suitable Full backup found in catalog for the > current FileSet. Doing FULL backup. > The FileSet "Kubernetes Set" was modified on 2023-01-04 20:20:41, this is > after the last successful backup on 2023-01-04 19:19:49. > bacula-sd Ready to append to end of Volume "Full-XXX" size=3,838,161,002 > bacula-fd Connected to Storage at bacula.server:9103 with TLS > bacula-sd Volume "Full-XXXX" previously written, moving to end of data. > bacula-dir Connected to Client "bacula-fd" at kubernetes.server:9102 with > TLS > Using Device "FileStorageEeyoreFull" to write. > Connected to Storage "InternalStorageFull" at bacula.server:9103 with TLS > Start Backup JobId 410830, Job=KubernetesBackup.2023-01-04_21.05.03_10 > bacula-fd kubernetes: Prepare Bacula Pod on: node with: > repo/bacula-backup:04jan23 kubernetes.server:9104 > kubernetes: Processing namespace: namespace > kubernetes: Start backup volume claim: some-claim > kubernetes: Connected to Kubernetes 1.25 - v1.25.4. > bacula-dir > Error: Bacula Enterprise bacula-dir 13.0.1 (05Aug22): > Build OS: x86_64-pc-linux-gnu-bacula-enterprise debian 11.2 > JobId: 410830 > Job: KubernetesBackup.2023-01-04_21.05.03_10 > Backup Level: Full (upgraded from Differential) > Client: "bacula-fd" 13.0.1 (05Aug22) > x86_64-pc-linux-gnu-bacula-enterprise,debian,10.11 > FileSet: "Kubernetes Set" 2023-01-04 20:20:41 > Pool: "Full-Pool" (From Job FullPool override) > Catalog: "MyCatalog" (From Client resource) > Storage: "InternalStorageFull" (From Pool resource) > Scheduled time: 04-Jan-2023 21:05:03 > Start time: 04-Jan-2023 21:27:04 > End time: 04-Jan-2023 21:29:06 > Elapsed time: 2 mins 2 secs > Priority: 10 > FD Files Written: 23 > SD Files Written: 0 > FD Bytes Written: 52,784 (52.78 KB) > SD Bytes Written: 0 (0 B) > Rate: 0.4 KB/s > Software Compression: 100.0% 1.0:1 > Comm Line Compression: 5.6% 1.1:1 > Snapshot/VSS: no > Encryption: yes > Accurate: yes > Volume name(s): Full-XXXX > Volume Session Id: 43 > Volume Session Time: 1672853724 > Last Volume Bytes: 3,838,244,105 (3.838 GB) > Non-fatal FD errors: 3 > SD Errors: 0 > FD termination status: OK > SD termination status: SD despooling Attributes > Termination: *** Backup Error *** > Fatal error: catreq.c:680 Restore object create error. > bacula-fd > Error: kubernetes: PTCOMM cannot get packet header from backend. > bacula-dir Fatal error: sql_create.c:1273 Create db Object record INSERT > INTO RestoreObject > (ObjectName,PluginName,RestoreObject,ObjectLength,ObjectFullLength,ObjectIndex,ObjectType,ObjectCompression,FileIndex,JobId) > > VALUES ('RestoreOptions','kubernetes: \n debug=1 \n > baculaimage=repo/bacula-backup:04jan23 \n namespace=namespace > \n pvcdata \n > pluginhost=kubernetes.server \n timeout=120 \n > verify_ssl=0 \n fdcertfile=/etc/bacula/certs/bacula-backup.cert > \n > fdkeyfile=/etc/bacula/certs/bacula-backup.key','# Plugin configuration > file\n# Version 1\nOptPrompt=\"K8S config > file\"\nOptDefault=\"*None*\"\nconfig=@STR@\n\n > OptPrompt=\"K8S API server > URL/Host\"\nOptDefault=\"*None*\"\nhost=@STR@\n\nOptPrompt=\"K8S > Bearertoken\"\nOptDefault=\"*None*\"\ntoken=@STR@\n\nOptPrompt=\"K8S API > server cert verification\"\n > OptDefault=\"True\"\nverify_ssl=@BOOL@\n\nOptPrompt=\"Custom CA Certs > file to use\"\nOptDefault=\"*None*\"\nssl_ca_cert=@STR@\n\nOptPrompt=\"Output > format when saving to file (JSON, YAML)\"\n > OptDefault=\"RAW\"\noutputformat=@STR@\n\nOptPrompt=\"The address for > listen to incoming backup pod > data\"\nOptDefault=\"*FDAddress*\"\nfdaddress=@STR@\n\n > OptPrompt=\"The port for opening socket for > listen\"\nOptDefault=\"9104\"\nfdport=@INT32@\n\nOptPrompt=\"The endpoint > address for backup pod to connect\"\n > OptDefault=\"*FDAddress*\"\npluginhost=@STR@\n\nOptPrompt=\"The endpoint > port to connect\"\nOptDefault=\"9104\"\n > pluginport=@INT32@\n\n',859,859,0,27,0,1,410830) failed. ERR=Data too > long for column 'PluginName' at row 1 > It seems to be an issue related to the Catalog update. Are you using Bacula with a MySQL or a PostgreSQL Catalog? Can you check if you have a "RestoreObject" in the bacula database and share the table information using the following query for a MySQL database? SELECT * FROM `INFORMATION_SCHEMA`.`TABLES`WHERE TABLE_NAME LIKE 'table1' For Postgresql, "\dt+ table" should work. It seems to me we would need a bug report for this issue. Best regards, Ana > bacula-sd Sending spooled attrs to the Director. Despooling 8,214 bytes ... > bacula-fd > Error: kubernetes: Error closing backend. Err=Child exited with code 1 > Fatal error: kubernetes: Wrong backend response to JobEnd command. > bacula-sd Elapsed time=00:02:02, Transfer rate=659 Bytes/second > bacula-fd > Error: kubernetes: PTCOMM cannot get packet header from backend. > > Error: kubernetes: Cannot successfully start bacula-backup pod in expected > time! > > Error: kubernetes: Job already running in 'namespace' namespace. Check > logs or delete bacula-backup Pod manually. > > > Best regards, > Zsolt > > _______________________________________________ > Bacula-users mailing list > Bacula-users@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/bacula-users >
_______________________________________________ Bacula-users mailing list Bacula-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bacula-users