---
sidebar_position: 90
---

# Troubleshooting
This guide helps you diagnose and resolve common issues with the Barman Cloud plugin.
:::important
We are continuously improving the integration between CloudNativePG and the
Barman Cloud plugin as it moves toward greater stability and maturity. For this
reason, we recommend using the latest available version of both components. See
the Requirements section for details.
:::
## Viewing Logs
To troubleshoot effectively, you’ll often need to review logs from multiple sources:
:::note
The following commands assume you installed the CloudNativePG operator in
the default cnpg-system namespace. If you installed it in a different
namespace, adjust the commands accordingly.
:::
```bash
# View operator logs (includes plugin interaction logs)
kubectl logs -n cnpg-system deployment/cnpg-controller-manager -f

# View sidecar container logs (Barman Cloud operations)
kubectl logs -n <namespace> <cluster-pod-name> -c plugin-barman-cloud -f

# View plugin manager logs
kubectl logs -n cnpg-system deployment/barman-cloud -f

# View all containers in a pod
kubectl logs -n <namespace> <cluster-pod-name> --all-containers=true

# View previous container logs (if container restarted)
kubectl logs -n <namespace> <cluster-pod-name> -c plugin-barman-cloud --previous
```
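The operator emits structured JSON log lines, so they can be filtered for error-level entries. A minimal sketch, assuming `jq` is installed locally and the conventional `level`/`msg` field names used by the operator's logger:

```bash
# Filter operator logs for error-level entries.
# Assumptions: JSON log lines with "level" and "msg" fields; non-JSON lines are skipped.
kubectl logs -n cnpg-system deployment/cnpg-controller-manager --tail=500 \
  | jq -rR 'fromjson? | select(.level == "error") | .msg'
```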
## Common Issues

### Plugin Installation Issues

#### Plugin pods not starting
Symptoms:

- Plugin pods stuck in `CrashLoopBackOff` or `Error`
- Plugin deployment not ready
Possible causes and solutions:
- **Certificate issues**

  ```bash
  # Check if cert-manager is installed and running
  kubectl get pods -n cert-manager

  # Check if the plugin certificate is created
  kubectl get certificates -n cnpg-system
  ```

  If cert-manager is not installed, install it first:

  ```bash
  # Note: other installation methods for cert-manager are available
  kubectl apply -f \
    https://github.com/cert-manager/cert-manager/releases/latest/download/cert-manager.yaml
  ```

  If you are using your own certificates without cert-manager, you will need to verify the entire certificate chain yourself.

- **Image pull errors**

  ```bash
  # Check pod events for image pull errors
  kubectl describe pod -n cnpg-system -l app.kubernetes.io/name=barman-cloud
  ```

  Verify the image exists and you have proper credentials if using a private registry.

- **Resource constraints**

  ```bash
  # Check node resources
  kubectl top nodes
  kubectl describe nodes
  ```

  Make sure your cluster has sufficient CPU and memory resources.
### Backup Failures

#### Quick Backup Troubleshooting Checklist
When a backup fails, follow these steps in order:
1. **Check backup status:**

   ```bash
   kubectl get backups.postgresql.cnpg.io -n <namespace>
   ```

2. **Get error details and target pod:**

   ```bash
   kubectl describe backups.postgresql.cnpg.io \
     -n <namespace> <backup-name>
   kubectl get backups.postgresql.cnpg.io \
     -n <namespace> <backup-name> \
     -o jsonpath='{.status.instanceID.podName}'
   ```

3. **Check the target pod's sidecar logs:**

   ```bash
   TARGET_POD=$(kubectl get backups.postgresql.cnpg.io \
     -n <namespace> <backup-name> \
     -o jsonpath='{.status.instanceID.podName}')
   kubectl logs \
     -n <namespace> $TARGET_POD -c plugin-barman-cloud --tail=100 | grep -E "ERROR|FATAL|panic"
   ```

4. **Check cluster events:**

   ```bash
   kubectl get events -n <namespace> \
     --field-selector involvedObject.name=<cluster-name> \
     --sort-by='.lastTimestamp'
   ```

5. **Verify plugin is running:**

   ```bash
   kubectl get pods \
     -n cnpg-system -l app.kubernetes.io/name=barman-cloud
   ```

6. **Check operator logs:**

   ```bash
   kubectl logs \
     -n cnpg-system deployment/cnpg-controller-manager \
     --tail=100 | grep -i "backup\|plugin"
   ```

7. **Check plugin manager logs:**

   ```bash
   kubectl logs \
     -n cnpg-system deployment/barman-cloud --tail=100
   ```
#### Backup job fails immediately

Symptoms:

- Backup pods terminate with error
- No backup files appear in object storage
- Backup shows `failed` phase with various error messages
Common failure modes and solutions:
- **"requested plugin is not available" errors**

  ```
  ERROR: requested plugin is not available: barman
  ERROR: requested plugin is not available: barman-cloud
  ERROR: requested plugin is not available: barman-cloud.cloudnative-pg.io
  ```

  **Cause:** The plugin name in the Cluster configuration doesn't match the deployed plugin, or the plugin isn't registered.

  **Solution:**

  a. Check plugin registration status:

  ```bash
  # If you have kubectl-cnpg plugin installed (v1.27.0+)
  kubectl cnpg status -n <namespace> <cluster-name>
  ```

  Look for the "Plugins status" section:

  ```
  Plugins status
  Name                            Version  Status  Reported Operator Capabilities
  ----                            -------  ------  ------------------------------
  barman-cloud.cloudnative-pg.io  0.6.0    N/A     Reconciler Hooks, Lifecycle Service
  ```

  :::tip
  If the Plugins status section is missing:

  - Install or update kubectl-cnpg plugin to the latest version
  - Ensure CloudNativePG operator is v1.27.0 or later
  :::

  b. Verify correct plugin name in Cluster spec:

  ```yaml
  apiVersion: postgresql.cnpg.io/v1
  kind: Cluster
  spec:
    plugins:
      - name: barman-cloud.cloudnative-pg.io
        parameters:
          barmanObjectStore: <your-objectstore-name>
  ```

  c. Check plugin deployment is running:

  ```bash
  kubectl get deployment -n cnpg-system barman-cloud
  ```

- **"rpc error: code = Unknown desc = panic caught: assignment to entry in nil map" errors**

  **Cause:** Configuration issue, often a typo or missing required field in the ObjectStore configuration.

  **Solution:**

  - Check the sidecar container logs for detailed error messages:

    ```bash
    kubectl logs -n <namespace> <cluster-pod> -c plugin-barman-cloud
    ```

  - Verify your ObjectStore configuration has all required fields
  - Common issues include:
    - Missing or incorrect secret references
    - Typos in configuration parameters
    - Missing required environment variables in secrets
General debugging steps:

1. **Check backup status and identify the target instance**

   ```bash
   # List all backups and their status
   kubectl get backups.postgresql.cnpg.io -n <namespace>

   # Using kubectl-cnpg plugin (if installed)
   kubectl cnpg backup list -n <namespace>

   # Get detailed backup information including error messages and target instance
   kubectl describe backups.postgresql.cnpg.io -n <namespace> <backup-name>

   # Extract the target pod name from a failed backup
   kubectl get backups.postgresql.cnpg.io -n <namespace> <backup-name> -o jsonpath='{.status.instanceID.podName}'

   # Or get more details including the target pod, method, phase and error
   kubectl get backups.postgresql.cnpg.io -n <namespace> <backup-name> -o jsonpath='Pod: {.status.instanceID.podName}{"\n"}Method: {.status.method}{"\n"}Phase: {.status.phase}{"\n"}Error: {.status.error}{"\n"}'

   # Check the cluster status for backup-related information
   kubectl cnpg status <cluster-name> -n <namespace> --verbose
   ```

2. **Check sidecar logs on the backup target pod**

   ```bash
   # First, identify which pod was the backup target (from step 1)
   TARGET_POD=$(kubectl get backups.postgresql.cnpg.io -n <namespace> <backup-name> -o jsonpath='{.status.instanceID.podName}')
   echo "Backup target pod: $TARGET_POD"

   # Check the sidecar logs on the specific target pod
   kubectl logs -n <namespace> $TARGET_POD -c plugin-barman-cloud --tail=100

   # Follow the logs in real-time to see ongoing issues
   kubectl logs -n <namespace> $TARGET_POD -c plugin-barman-cloud -f

   # Check for specific errors in the target pod around the backup time
   kubectl logs -n <namespace> $TARGET_POD -c plugin-barman-cloud --since=10m | grep -E "ERROR|FATAL|panic|failed"

   # Alternative: List all cluster pods and their roles
   kubectl get pods -n <namespace> -l cnpg.io/cluster=<cluster-name> \
     -o custom-columns=NAME:.metadata.name,ROLE:.metadata.labels.cnpg\\.io/instanceRole,INSTANCE:.metadata.labels.cnpg\\.io/instanceName

   # Check sidecar logs on ALL cluster pods for any errors (if target is unclear)
   for pod in $(kubectl get pods -n <namespace> -l cnpg.io/cluster=<cluster-name> -o name); do
     echo "=== Checking $pod ==="
     kubectl logs -n <namespace> $pod -c plugin-barman-cloud --tail=20 | grep -i error || echo "No errors found"
   done
   ```

3. **Check events for backup-related issues**

   ```bash
   # Check events for the cluster
   kubectl get events -n <namespace> --field-selector involvedObject.name=<cluster-name>

   # Check events for failed backups
   kubectl get events -n <namespace> --field-selector involvedObject.kind=Backup

   # Get all recent events in the namespace
   kubectl get events -n <namespace> --sort-by='.lastTimestamp' | tail -20
   ```

4. **Verify ObjectStore configuration**

   ```bash
   # Check the ObjectStore resource
   kubectl get objectstores.barmancloud.cnpg.io -n <namespace> <objectstore-name> -o yaml

   # Verify the secret exists and has correct keys
   kubectl get secret -n <namespace> <secret-name> -o yaml
   ```

5. **Common error messages and solutions** (a reference configuration sketch follows this list):

   - "AccessDenied" or "403 Forbidden": Check cloud credentials and bucket permissions
   - "NoSuchBucket": Verify the bucket exists and the endpoint URL is correct
   - "Connection timeout": Check network connectivity and firewall rules
   - "SSL certificate problem": For self-signed certificates, check CA bundle configuration
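For credential and certificate errors, it often helps to compare your setup against a known-good shape. The following is a minimal sketch of an S3-style credentials secret and ObjectStore; the resource names, secret keys, bucket path, and endpoint are placeholders, and the `spec.configuration` fields are assumed to follow the `barmanObjectStore` layout used by CloudNativePG, so verify them against the Object Store Configuration page for your provider and plugin version:

```yaml
# Hypothetical example: adjust names, keys, and endpoint for your environment
apiVersion: v1
kind: Secret
metadata:
  name: backup-creds
stringData:
  ACCESS_KEY_ID: <access-key>
  ACCESS_SECRET_KEY: <secret-key>
---
apiVersion: barmancloud.cnpg.io/v1
kind: ObjectStore
metadata:
  name: my-objectstore
spec:
  configuration:
    destinationPath: s3://my-bucket/backups
    endpointURL: https://s3.example.com
    s3Credentials:
      accessKeyId:
        name: backup-creds      # must match the secret name above
        key: ACCESS_KEY_ID      # must match a key in that secret
      secretAccessKey:
        name: backup-creds
        key: ACCESS_SECRET_KEY
    # For self-signed certificates, reference a CA bundle secret
    # (assumes the barmanObjectStore-style endpointCA field)
    endpointCA:
      name: my-ca-secret
      key: ca.crt
```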
#### Backup performance issues

Symptoms:

- Backups take extremely long
- Backups time out

Plugin-specific considerations:

- **Check ObjectStore parallelism settings**
  - Adjust `maxParallel` in the ObjectStore configuration (see the sketch after this list)
  - Monitor sidecar container resource usage during backups
- **Verify plugin resource allocation**
  - Check if the sidecar container has sufficient CPU/memory
  - Review plugin container logs for resource-related warnings
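As a reference point, here is a minimal excerpt showing where these knobs usually live, assuming the `barmanObjectStore`-style `wal.maxParallel` and `data.jobs` fields under the ObjectStore's `spec.configuration`; check the ObjectStore reference for your plugin version:

```yaml
# Hypothetical excerpt of an ObjectStore spec.configuration section
wal:
  maxParallel: 4   # WAL files transferred in parallel
data:
  jobs: 2          # parallel upload jobs for base backups
```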
:::tip
For Barman-specific features like compression, encryption, and performance tuning, refer to the Barman documentation.
:::
### WAL Archiving Issues

#### WAL archiving through plugin stops working
Symptoms:
- WAL files accumulating on primary
- Cluster warnings about WAL archiving
- Plugin sidecar logs show WAL archive errors
Plugin-specific debugging:

1. **Check plugin sidecar logs for WAL archiving errors** (an end-to-end archiving test is sketched after this list)

   ```bash
   # Check recent WAL archive operations in sidecar
   kubectl logs -n <namespace> <primary-pod> -c plugin-barman-cloud --tail=50 | grep -i wal
   ```

2. **Verify the plugin is handling `archive_command`**

   ```bash
   # The archive_command should be routing through the plugin
   kubectl exec -n <namespace> <primary-pod> -c postgres -- psql -U postgres -c "SHOW archive_command;"
   ```

3. **Check ObjectStore configuration for WAL settings**

   - Ensure ObjectStore has proper WAL retention settings
   - Verify credentials have permissions for WAL operations
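To exercise the archiving path end to end, you can force a WAL segment switch and then check both PostgreSQL's archiver statistics and the sidecar logs. A minimal sketch using standard PostgreSQL functions and the `pg_stat_archiver` view:

```bash
# Force a WAL segment switch so a new file must be archived
kubectl exec -n <namespace> <primary-pod> -c postgres -- \
  psql -U postgres -Atc "SELECT pg_switch_wal();"

# pg_stat_archiver tracks the last archived and last failed WAL segments
kubectl exec -n <namespace> <primary-pod> -c postgres -- \
  psql -U postgres -c "SELECT archived_count, last_archived_wal, last_failed_wal, last_failed_time FROM pg_stat_archiver;"

# Then look for the corresponding archive attempt in the sidecar logs
kubectl logs -n <namespace> <primary-pod> -c plugin-barman-cloud --since=2m | grep -i wal
```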
### Restore Issues

#### Restore fails during recovery
Symptoms:
- New cluster stuck in recovery mode
- Plugin sidecar shows restore errors
- PostgreSQL won't start
Plugin-specific debugging:

1. **Check plugin sidecar logs during restore**

   ```bash
   # Check the sidecar logs on the recovering cluster pods
   kubectl logs -n <namespace> <cluster-pod-name> -c plugin-barman-cloud --tail=100

   # Look for restore-related errors
   kubectl logs -n <namespace> <cluster-pod-name> -c plugin-barman-cloud | grep -E "restore|recovery|ERROR"
   ```

2. **Verify plugin can access backups** (a recovery progress check is sketched after this list)

   ```bash
   # Check if ObjectStore is properly configured for restore
   kubectl get objectstores.barmancloud.cnpg.io -n <namespace> <objectstore-name> -o yaml

   # Check PostgreSQL recovery logs
   kubectl logs -n <namespace> <cluster-pod> -c postgres | grep -i recovery
   ```
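A quick way to tell whether the instance is still replaying WAL, and how far replay has progressed, using standard PostgreSQL functions against the recovering pod:

```bash
# Is the instance still in recovery, and how far has replay progressed?
kubectl exec -n <namespace> <cluster-pod> -c postgres -- \
  psql -U postgres -Atc "SELECT pg_is_in_recovery(), pg_last_wal_replay_lsn(), pg_last_xact_replay_timestamp();"
```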
:::tip
For detailed Barman restore operations and troubleshooting, refer to the Barman documentation.
:::
#### Point-in-time recovery (PITR) configuration issues
Symptoms:
- PITR target time not reached
- Plugin sidecar shows WAL access errors
- Recovery stops before target
Plugin-specific configuration:

1. **Verify plugin configuration for PITR** (a target verification query is sketched after this list)

   ```yaml
   apiVersion: postgresql.cnpg.io/v1
   kind: Cluster
   spec:
     plugins:
       - name: barman-cloud.cloudnative-pg.io
         parameters:
           barmanObjectStore: <objectstore-name>
     bootstrap:
       recovery:
         recoveryTarget:
           targetTime: "2024-01-15 10:30:00"
           targetTimezone: "UTC"
   ```

2. **Check plugin sidecar for WAL access**

   ```bash
   # Check sidecar logs during recovery for WAL-related errors
   kubectl logs -n <namespace> <cluster-pod> -c plugin-barman-cloud | grep -i wal
   ```
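It can also help to confirm that the requested target actually reached PostgreSQL; on supported PostgreSQL versions, recovery parameters are visible through `pg_settings` on the recovering instance:

```bash
# Confirm the recovery target reached PostgreSQL's configuration
kubectl exec -n <namespace> <cluster-pod> -c postgres -- \
  psql -U postgres -Atc "SELECT name, setting FROM pg_settings WHERE name LIKE 'recovery_target%';"
```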
:::note
For detailed PITR configuration and WAL management, see the Barman PITR documentation.
:::
### Plugin Configuration Issues

#### Plugin cannot connect to object storage
Symptoms:
- Plugin sidecar logs show connection errors
- Backups fail with authentication or network errors
- ObjectStore resource shows errors
Plugin-specific solutions:

1. **Verify ObjectStore CRD configuration**

   ```bash
   # Check ObjectStore resource status
   kubectl get objectstores.barmancloud.cnpg.io -n <namespace> <objectstore-name> -o yaml

   # Verify the secret exists and has correct keys for your provider
   kubectl get secret -n <namespace> <secret-name> -o jsonpath='{.data}' | jq 'keys'
   ```

2. **Check plugin sidecar connectivity**

   ```bash
   # Check sidecar logs for connection errors
   kubectl logs -n <namespace> <cluster-pod> -c plugin-barman-cloud | grep -E "connection|timeout|SSL|certificate"
   ```

3. **Provider-specific configuration**

   - See Object Store Configuration for provider-specific settings
   - Ensure `endpointURL` and `s3UsePathStyle` match your storage type
   - Verify network policies allow egress to your storage provider (a quick connectivity test is sketched after this list)
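To rule out network policy or DNS problems, you can run a throwaway pod in the same namespace and probe the storage endpoint. A minimal sketch, assuming the namespace can pull the public `curlimages/curl` image and `<endpoint-url>` is your object store endpoint:

```bash
# Launch a temporary pod and attempt an HTTP/TLS connection to the endpoint
kubectl run -n <namespace> storage-netcheck --rm -it --restart=Never \
  --image=curlimages/curl -- \
  curl -sv --max-time 10 -o /dev/null <endpoint-url>
```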
## Diagnostic Commands

### Using kubectl-cnpg plugin
The kubectl-cnpg plugin provides enhanced debugging capabilities. Make sure you have it installed and updated:
```bash
# Install or update kubectl-cnpg plugin
kubectl krew install cnpg
# Or download directly from: https://github.com/cloudnative-pg/cloudnative-pg/releases

# Check plugin status (requires CNPG 1.27.0+)
kubectl cnpg status <cluster-name> -n <namespace>

# View cluster status in detail
kubectl cnpg status <cluster-name> -n <namespace> --verbose

# Check backup status
kubectl cnpg backup list -n <namespace>

# View plugin capabilities
kubectl cnpg plugin list -n <namespace>
```
## Getting Help
If you continue to experience issues:
1. **Check the documentation**

   - Review the Installation Guide
   - Check Object Store Configuration for provider-specific settings
   - Review Usage Examples for correct configuration patterns

2. **Gather diagnostic information**

   ```bash
   # Create a diagnostic bundle (⚠️ sanitize these before sharing!)
   kubectl get objectstores.barmancloud.cnpg.io -A -o yaml > /tmp/objectstores.yaml
   kubectl get clusters.postgresql.cnpg.io -A -o yaml > /tmp/clusters.yaml
   kubectl logs -n cnpg-system deployment/barman-cloud --tail=1000 > /tmp/plugin.log
   ```

3. **Community support**

   - CloudNativePG Slack: #cloudnativepg-users
   - GitHub Issues: plugin-barman-cloud

4. **When reporting issues, include:**

   - CloudNativePG version
   - Barman Cloud Plugin version
   - Kubernetes version
   - Cloud provider and region
   - Relevant configuration (⚠️ sanitize/redact sensitive information)
   - Error messages and logs
   - Steps to reproduce
## Known Issues and Limitations

### Current Known Issues

- **WAL overwrite protection**: Unlike the in-tree Barman archiver, the plugin doesn't prevent WAL overwrites when multiple clusters share the same name and object store path (#263)
- **Migration compatibility**: After migrating from in-tree backup to the plugin, the `kubectl cnpg backup` command syntax has changed (#353):

  ```bash
  # Old command (in-tree, no longer works after migration)
  kubectl cnpg backup -n <namespace> <cluster-name> --method=barmanObjectStore

  # New command (plugin-based)
  kubectl cnpg backup -n <namespace> <cluster-name> --method=plugin --plugin-name=barman-cloud.cloudnative-pg.io
  ```
### Plugin Limitations
- **Installation method**: Currently only supports manifest and Kustomize installation (#351 - Helm chart requested)
- **Sidecar resource sharing**: The plugin sidecar container shares pod resources with PostgreSQL
- **Plugin restart behavior**: Restarting the sidecar container requires restarting the entire PostgreSQL pod (a rolling-restart example follows this list)
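For example, after upgrading the plugin you can roll the cluster pods so the sidecars come back with the new image. A minimal sketch using the kubectl-cnpg plugin; a plain rollout of the instance pods achieves the same effect:

```bash
# Trigger a rolling restart of the cluster's PostgreSQL pods,
# which also recreates the plugin-barman-cloud sidecar containers
kubectl cnpg restart <cluster-name> -n <namespace>
```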
### Compatibility Matrix
| Plugin Version | CloudNativePG Version | Kubernetes Version | Notes |
|---|---|---|---|
| 0.6.x | 1.26.x, 1.27.x (recommended) | 1.28+ | CNPG 1.27.0+ provides enhanced plugin status reporting |
| 0.5.x | 1.25.x, 1.26.x | 1.27+ | Limited plugin diagnostics |
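To check your deployment against this matrix, inspecting the deployed images is usually enough. A minimal sketch, assuming the default deployment names used elsewhere in this guide:

```bash
# Operator image (CloudNativePG version)
kubectl get deployment -n cnpg-system cnpg-controller-manager \
  -o jsonpath='{.spec.template.spec.containers[0].image}{"\n"}'

# Plugin image (Barman Cloud plugin version)
kubectl get deployment -n cnpg-system barman-cloud \
  -o jsonpath='{.spec.template.spec.containers[0].image}{"\n"}'

# Kubernetes server version
kubectl version
```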
:::tip
Always check the Release Notes for version-specific known issues and fixes.
:::