mirror of
https://github.com/cloudnative-pg/plugin-barman-cloud.git
synced 2026-01-11 21:23:12 +01:00
docs(web): add troubleshooting guide and improve navigation
- Create troubleshooting page with debugging steps and known issues - Add hero subtitle and quick links to landing page - Document kubectl-cnpg backup command syntax changes - Include migration notes for users switching from in-tree backup Signed-off-by: Jeff Mealo <jmealo@protonmail.com>
This commit is contained in:
parent
eced5ea2c6
commit
8aabd2860b
@ -23,9 +23,14 @@ for detailed instructions.
|
|||||||
|
|
||||||
To use the Barman Cloud Plugin, you need:
|
To use the Barman Cloud Plugin, you need:
|
||||||
|
|
||||||
- [CloudNativePG](https://cloudnative-pg.io) version **1.26** <!-- ADD WHEN 1.27 IS OUT "or later" -->
|
- [CloudNativePG](https://cloudnative-pg.io) version **1.26** or later
|
||||||
|
- **Version 1.27.0 or later is strongly recommended** for enhanced plugin error and status reporting
|
||||||
|
- See the [upgrade guide](https://cloudnative-pg.io/documentation/current/installation_upgrade) if you need to upgrade
|
||||||
- [cert-manager](https://cert-manager.io/) to enable TLS communication between
|
- [cert-manager](https://cert-manager.io/) to enable TLS communication between
|
||||||
the plugin and the operator
|
the plugin and the operator
|
||||||
|
- [kubectl-cnpg plugin](https://cloudnative-pg.io/documentation/current/kubectl-plugin/) (recommended)
|
||||||
|
- Provides enhanced debugging and status checking capabilities
|
||||||
|
- Install via `kubectl krew install cnpg` or download from [releases](https://github.com/cloudnative-pg/cloudnative-pg/releases)
|
||||||
|
|
||||||
## Key Features
|
## Key Features
|
||||||
|
|
||||||
|
|||||||
491
web/docs/troubleshooting.md
Normal file
491
web/docs/troubleshooting.md
Normal file
@ -0,0 +1,491 @@
|
|||||||
|
---
|
||||||
|
sidebar_position: 50
|
||||||
|
---
|
||||||
|
|
||||||
|
# Troubleshooting
|
||||||
|
|
||||||
|
<!-- SPDX-License-Identifier: CC-BY-4.0 -->
|
||||||
|
|
||||||
|
This guide helps you diagnose and resolve common issues with the Barman Cloud Plugin.
|
||||||
|
|
||||||
|
## Before You Begin
|
||||||
|
|
||||||
|
### Recommended Upgrades
|
||||||
|
|
||||||
|
:::important
|
||||||
|
**CloudNativePG 1.27.0** offers significantly improved error and status reporting for plugins. If you're experiencing issues, we strongly recommend upgrading to version 1.27.0 or later for better diagnostics.
|
||||||
|
|
||||||
|
- **Upgrade CloudNativePG**: Follow the [official upgrade guide](https://cloudnative-pg.io/documentation/current/installation_upgrade)
|
||||||
|
- **Update kubectl-cnpg plugin**: Install or update the kubectl plugin for better debugging capabilities. See the [kubectl plugin documentation](https://cloudnative-pg.io/documentation/current/kubectl-plugin/)
|
||||||
|
:::
|
||||||
|
|
||||||
|
### Viewing Logs
|
||||||
|
|
||||||
|
To effectively troubleshoot issues, you need to check logs from multiple sources:
|
||||||
|
|
||||||
|
:::note
|
||||||
|
The following commands assume you've installed the CloudNativePG operator in the default `cnpg-system` namespace. If you've installed it in a different namespace, adjust the commands accordingly.
|
||||||
|
:::
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# View operator logs (contains plugin interaction logs)
|
||||||
|
# Assumes operator is installed in the default cnpg-system namespace
|
||||||
|
kubectl logs -n cnpg-system deployment/cnpg-controller-manager -f
|
||||||
|
|
||||||
|
# View sidecar container logs (barman-cloud operations)
|
||||||
|
kubectl logs -n <namespace> <cluster-pod-name> -c plugin-barman-cloud -f
|
||||||
|
|
||||||
|
# View plugin manager logs
|
||||||
|
kubectl logs -n cnpg-system deployment/barman-cloud -f
|
||||||
|
|
||||||
|
# View all containers in a pod
|
||||||
|
kubectl logs -n <namespace> <cluster-pod-name> --all-containers=true
|
||||||
|
|
||||||
|
# View previous container logs (if container restarted)
|
||||||
|
kubectl logs -n <namespace> <cluster-pod-name> -c plugin-barman-cloud --previous
|
||||||
|
```
|
||||||
|
|
||||||
|
## Common Issues
|
||||||
|
|
||||||
|
### Plugin Installation Issues
|
||||||
|
|
||||||
|
#### Plugin pods not starting
|
||||||
|
|
||||||
|
**Symptoms:**
|
||||||
|
- Plugin pods are in `CrashLoopBackOff` or `Error` state
|
||||||
|
- Plugin deployment is not ready
|
||||||
|
|
||||||
|
**Possible causes and solutions:**
|
||||||
|
|
||||||
|
1. **Certificate issues**
|
||||||
|
```bash
|
||||||
|
# Check if cert-manager is installed and running
|
||||||
|
kubectl get pods -n cert-manager
|
||||||
|
|
||||||
|
# Check if the plugin certificate is created
|
||||||
|
kubectl get certificates -n cnpg-system
|
||||||
|
```
|
||||||
|
|
||||||
|
If cert-manager is not installed, install it first:
|
||||||
|
```bash
|
||||||
|
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/latest/download/cert-manager.yaml
|
||||||
|
```
|
||||||
|
|
||||||
|
2. **Image pull errors**
|
||||||
|
```bash
|
||||||
|
# Check pod events for image pull errors
|
||||||
|
kubectl describe pod -n cnpg-system -l app.kubernetes.io/name=barman-cloud
|
||||||
|
```
|
||||||
|
|
||||||
|
Verify the image exists and you have proper credentials if using a private registry.
|
||||||
|
|
||||||
|
3. **Resource constraints**
|
||||||
|
```bash
|
||||||
|
# Check node resources
|
||||||
|
kubectl top nodes
|
||||||
|
kubectl describe nodes
|
||||||
|
```
|
||||||
|
|
||||||
|
Ensure your cluster has sufficient CPU and memory resources.
|
||||||
|
|
||||||
|
### Backup Failures
|
||||||
|
|
||||||
|
#### Quick Backup Troubleshooting Checklist
|
||||||
|
|
||||||
|
When a backup fails, follow these steps in order:
|
||||||
|
|
||||||
|
1. **Check backup status**: `kubectl get backups.postgresql.cnpg.io -n <namespace>`
|
||||||
|
2. **Get error details and target pod**:
|
||||||
|
```bash
|
||||||
|
kubectl describe backups.postgresql.cnpg.io -n <namespace> <backup-name>
|
||||||
|
# Or extract just the target pod name
|
||||||
|
kubectl get backups.postgresql.cnpg.io -n <namespace> <backup-name> -o jsonpath='{.status.instanceID.podName}'
|
||||||
|
```
|
||||||
|
3. **Check the specific target pod's sidecar logs**:
|
||||||
|
```bash
|
||||||
|
TARGET_POD=$(kubectl get backups.postgresql.cnpg.io -n <namespace> <backup-name> -o jsonpath='{.status.instanceID.podName}')
|
||||||
|
kubectl logs -n <namespace> $TARGET_POD -c plugin-barman-cloud --tail=100 | grep -E "ERROR|FATAL|panic"
|
||||||
|
```
|
||||||
|
4. **Check cluster events**: `kubectl get events -n <namespace> --field-selector involvedObject.name=<cluster-name> --sort-by='.lastTimestamp'`
|
||||||
|
5. **Verify plugin is running**: `kubectl get pods -n cnpg-system -l app.kubernetes.io/name=barman-cloud`
|
||||||
|
6. **Check operator logs**: `kubectl logs -n cnpg-system deployment/cnpg-controller-manager --tail=100 | grep -i "backup\|plugin"`
|
||||||
|
7. **Check plugin manager logs**: `kubectl logs -n cnpg-system deployment/barman-cloud --tail=100`
|
||||||
|
|
||||||
|
#### Backup job fails immediately
|
||||||
|
|
||||||
|
**Symptoms:**
|
||||||
|
- Backup pods terminate with error
|
||||||
|
- No backup files appear in object storage
|
||||||
|
- Backup shows `failed` phase with various error messages
|
||||||
|
|
||||||
|
**Common failure modes and solutions:**
|
||||||
|
|
||||||
|
1. **"requested plugin is not available" errors**
|
||||||
|
```
|
||||||
|
ERROR: requested plugin is not available: barman
|
||||||
|
ERROR: requested plugin is not available: barman-cloud
|
||||||
|
ERROR: requested plugin is not available: barman-cloud.cloudnative-pg.io
|
||||||
|
```
|
||||||
|
|
||||||
|
**Cause:** The plugin name in the Cluster configuration doesn't match the deployed plugin or the plugin isn't registered
|
||||||
|
|
||||||
|
**Solution:**
|
||||||
|
|
||||||
|
a. **Check plugin registration status**:
|
||||||
|
```bash
|
||||||
|
# If you have kubectl-cnpg plugin installed (v1.27.0+)
|
||||||
|
kubectl cnpg status -n <namespace> <cluster-name>
|
||||||
|
```
|
||||||
|
|
||||||
|
Look for the "Plugins status" section:
|
||||||
|
```
|
||||||
|
Plugins status
|
||||||
|
Name Version Status Reported Operator Capabilities
|
||||||
|
---- ------- ------ ------------------------------
|
||||||
|
barman-cloud.cloudnative-pg.io 0.6.0 N/A Reconciler Hooks, Lifecycle Service
|
||||||
|
```
|
||||||
|
|
||||||
|
:::tip
|
||||||
|
If the Plugins status section is missing:
|
||||||
|
- Install or update kubectl-cnpg plugin to the latest version
|
||||||
|
- Ensure CloudNativePG operator is v1.27.0 or later
|
||||||
|
:::
|
||||||
|
|
||||||
|
b. **Verify correct plugin name in Cluster spec**:
|
||||||
|
```yaml
|
||||||
|
apiVersion: postgresql.cnpg.io/v1
|
||||||
|
kind: Cluster
|
||||||
|
spec:
|
||||||
|
plugins:
|
||||||
|
- name: barman-cloud.cloudnative-pg.io
|
||||||
|
parameters:
|
||||||
|
barmanObjectStore: <your-objectstore-name>
|
||||||
|
```
|
||||||
|
|
||||||
|
c. **Check plugin deployment is running**:
|
||||||
|
```bash
|
||||||
|
kubectl get deployment -n cnpg-system barman-cloud
|
||||||
|
```
|
||||||
|
|
||||||
|
2. **"rpc error: code = Unknown desc = panic caught: assignment to entry in nil map" errors**
|
||||||
|
|
||||||
|
**Cause:** Configuration issue, often a typo or missing required field in the ObjectStore configuration
|
||||||
|
|
||||||
|
**Solution:**
|
||||||
|
- Check the sidecar container logs for detailed error messages:
|
||||||
|
```bash
|
||||||
|
kubectl logs -n <namespace> <cluster-pod> -c plugin-barman-cloud
|
||||||
|
```
|
||||||
|
- Verify your ObjectStore configuration has all required fields
|
||||||
|
- Common issues include:
|
||||||
|
- Missing or incorrect secret references
|
||||||
|
- Typos in configuration parameters
|
||||||
|
- Missing required environment variables in secrets
|
||||||
|
|
||||||
|
**General debugging steps:**
|
||||||
|
|
||||||
|
1. **Check backup status and identify the target instance**
|
||||||
|
```bash
|
||||||
|
# List all backups and their status
|
||||||
|
kubectl get backups.postgresql.cnpg.io -n <namespace>
|
||||||
|
|
||||||
|
# Using kubectl-cnpg plugin (if installed)
|
||||||
|
kubectl cnpg backup list -n <namespace>
|
||||||
|
|
||||||
|
# Get detailed backup information including error messages and target instance
|
||||||
|
kubectl describe backups.postgresql.cnpg.io -n <namespace> <backup-name>
|
||||||
|
|
||||||
|
# Extract the target pod name from a failed backup
|
||||||
|
kubectl get backups.postgresql.cnpg.io -n <namespace> <backup-name> -o jsonpath='{.status.instanceID.podName}'
|
||||||
|
|
||||||
|
# Or get more details including the target pod, method, phase and error
|
||||||
|
kubectl get backups.postgresql.cnpg.io -n <namespace> <backup-name> -o jsonpath='Pod: {.status.instanceID.podName}{"\n"}Method: {.status.method}{"\n"}Phase: {.status.phase}{"\n"}Error: {.status.error}{"\n"}'
|
||||||
|
|
||||||
|
# Check the cluster status for backup-related information
|
||||||
|
kubectl cnpg status <cluster-name> -n <namespace> --verbose
|
||||||
|
```
|
||||||
|
|
||||||
|
2. **Check sidecar logs on the backup target pod**
|
||||||
|
```bash
|
||||||
|
# First, identify which pod was the backup target (from step 1)
|
||||||
|
TARGET_POD=$(kubectl get backups.postgresql.cnpg.io -n <namespace> <backup-name> -o jsonpath='{.status.instanceID.podName}')
|
||||||
|
echo "Backup target pod: $TARGET_POD"
|
||||||
|
|
||||||
|
# Check the sidecar logs on the specific target pod
|
||||||
|
kubectl logs -n <namespace> $TARGET_POD -c plugin-barman-cloud --tail=100
|
||||||
|
|
||||||
|
# Follow the logs in real-time to see ongoing issues
|
||||||
|
kubectl logs -n <namespace> $TARGET_POD -c plugin-barman-cloud -f
|
||||||
|
|
||||||
|
# Check for specific errors in the target pod around the backup time
|
||||||
|
kubectl logs -n <namespace> $TARGET_POD -c plugin-barman-cloud --since=10m | grep -E "ERROR|FATAL|panic|failed"
|
||||||
|
|
||||||
|
# Alternative: List all cluster pods and their roles
|
||||||
|
kubectl get pods -n <namespace> -l cnpg.io/cluster=<cluster-name> \
|
||||||
|
-o custom-columns=NAME:.metadata.name,ROLE:.metadata.labels.cnpg\\.io/instanceRole,INSTANCE:.metadata.labels.cnpg\\.io/instanceName
|
||||||
|
|
||||||
|
# Check sidecar logs on ALL cluster pods for any errors (if target is unclear)
|
||||||
|
for pod in $(kubectl get pods -n <namespace> -l cnpg.io/cluster=<cluster-name> -o name); do
|
||||||
|
echo "=== Checking $pod ==="
|
||||||
|
kubectl logs -n <namespace> $pod -c plugin-barman-cloud --tail=20 | grep -i error || echo "No errors found"
|
||||||
|
done
|
||||||
|
```
|
||||||
|
|
||||||
|
3. **Check events for backup-related issues**
|
||||||
|
```bash
|
||||||
|
# Check events for the cluster
|
||||||
|
kubectl get events -n <namespace> --field-selector involvedObject.name=<cluster-name>
|
||||||
|
|
||||||
|
# Check events for failed backups
|
||||||
|
kubectl get events -n <namespace> --field-selector involvedObject.kind=Backup
|
||||||
|
|
||||||
|
# Get all recent events in the namespace
|
||||||
|
kubectl get events -n <namespace> --sort-by='.lastTimestamp' | tail -20
|
||||||
|
```
|
||||||
|
|
||||||
|
4. **Verify ObjectStore configuration**
|
||||||
|
```bash
|
||||||
|
# Check the ObjectStore resource
|
||||||
|
kubectl get objectstores.barmancloud.cnpg.io -n <namespace> <objectstore-name> -o yaml
|
||||||
|
|
||||||
|
# Verify the secret exists and has correct keys
|
||||||
|
kubectl get secret -n <namespace> <secret-name> -o yaml
|
||||||
|
```
|
||||||
|
|
||||||
|
5. **Common error messages and solutions:**
|
||||||
|
|
||||||
|
- **"AccessDenied" or "403 Forbidden"**: Check cloud credentials and bucket permissions
|
||||||
|
- **"NoSuchBucket"**: Verify the bucket exists and the endpoint URL is correct
|
||||||
|
- **"Connection timeout"**: Check network connectivity and firewall rules
|
||||||
|
- **"SSL certificate problem"**: For self-signed certificates, check CA bundle configuration
|
||||||
|
|
||||||
|
#### Backup performance issues
|
||||||
|
|
||||||
|
**Symptoms:**
|
||||||
|
- Backups take extremely long
|
||||||
|
- Backups timeout
|
||||||
|
|
||||||
|
**Plugin-specific considerations:**
|
||||||
|
|
||||||
|
1. **Check ObjectStore parallelism settings**
|
||||||
|
- Adjust `maxParallel` in ObjectStore configuration
|
||||||
|
- Monitor sidecar container resource usage during backups
|
||||||
|
|
||||||
|
2. **Verify plugin resource allocation**
|
||||||
|
- Check if the sidecar container has sufficient CPU/memory
|
||||||
|
- Review plugin container logs for resource-related warnings
|
||||||
|
|
||||||
|
:::tip
|
||||||
|
For Barman-specific features like compression, encryption, and performance tuning, refer to the [Barman documentation](https://docs.pgbarman.org/latest/).
|
||||||
|
:::
|
||||||
|
|
||||||
|
### WAL Archiving Issues
|
||||||
|
|
||||||
|
#### WAL archiving through plugin stops working
|
||||||
|
|
||||||
|
**Symptoms:**
|
||||||
|
- WAL files accumulating on primary
|
||||||
|
- Cluster warnings about WAL archiving
|
||||||
|
- Plugin sidecar logs show WAL archive errors
|
||||||
|
|
||||||
|
**Plugin-specific debugging:**
|
||||||
|
|
||||||
|
1. **Check plugin sidecar logs for WAL archiving errors**
|
||||||
|
```bash
|
||||||
|
# Check recent WAL archive operations in sidecar
|
||||||
|
kubectl logs -n <namespace> <primary-pod> -c plugin-barman-cloud --tail=50 | grep -i wal
|
||||||
|
```
|
||||||
|
|
||||||
|
2. **Verify the plugin is handling archive_command**
|
||||||
|
```bash
|
||||||
|
# The archive_command should be routing through the plugin
|
||||||
|
kubectl exec -n <namespace> <primary-pod> -c postgres -- psql -U postgres -c "SHOW archive_command;"
|
||||||
|
```
|
||||||
|
|
||||||
|
3. **Check ObjectStore configuration for WAL settings**
|
||||||
|
- Ensure ObjectStore has proper WAL retention settings
|
||||||
|
- Verify credentials have permissions for WAL operations
|
||||||
|
|
||||||
|
### Restore Issues
|
||||||
|
|
||||||
|
#### Restore fails during recovery
|
||||||
|
|
||||||
|
**Symptoms:**
|
||||||
|
- New cluster stuck in recovery mode
|
||||||
|
- Plugin sidecar shows restore errors
|
||||||
|
- PostgreSQL won't start
|
||||||
|
|
||||||
|
**Plugin-specific debugging:**
|
||||||
|
|
||||||
|
1. **Check plugin sidecar logs during restore**
|
||||||
|
```bash
|
||||||
|
# Check the sidecar logs on the recovering cluster pods
|
||||||
|
kubectl logs -n <namespace> <cluster-pod-name> -c plugin-barman-cloud --tail=100
|
||||||
|
|
||||||
|
# Look for restore-related errors
|
||||||
|
kubectl logs -n <namespace> <cluster-pod-name> -c plugin-barman-cloud | grep -E "restore|recovery|ERROR"
|
||||||
|
```
|
||||||
|
|
||||||
|
2. **Verify plugin can access backups**
|
||||||
|
```bash
|
||||||
|
# Check if ObjectStore is properly configured for restore
|
||||||
|
kubectl get objectstores.barmancloud.cnpg.io -n <namespace> <objectstore-name> -o yaml
|
||||||
|
|
||||||
|
# Check PostgreSQL recovery logs
|
||||||
|
kubectl logs -n <namespace> <cluster-pod> -c postgres | grep -i recovery
|
||||||
|
```
|
||||||
|
|
||||||
|
:::tip
|
||||||
|
For detailed Barman restore operations and troubleshooting, refer to the [Barman documentation](https://docs.pgbarman.org/latest/barman-cloud-restore.html).
|
||||||
|
:::
|
||||||
|
|
||||||
|
#### Point-in-time recovery (PITR) configuration issues
|
||||||
|
|
||||||
|
**Symptoms:**
|
||||||
|
- PITR target time not reached
|
||||||
|
- Plugin sidecar shows WAL access errors
|
||||||
|
- Recovery stops before target
|
||||||
|
|
||||||
|
**Plugin-specific configuration:**
|
||||||
|
|
||||||
|
1. **Verify plugin configuration for PITR**
|
||||||
|
```yaml
|
||||||
|
apiVersion: postgresql.cnpg.io/v1
|
||||||
|
kind: Cluster
|
||||||
|
spec:
|
||||||
|
plugins:
|
||||||
|
- name: barman-cloud.cloudnative-pg.io
|
||||||
|
parameters:
|
||||||
|
barmanObjectStore: <objectstore-name>
|
||||||
|
bootstrap:
|
||||||
|
recovery:
|
||||||
|
recoveryTarget:
|
||||||
|
targetTime: "2024-01-15 10:30:00"
|
||||||
|
targetTimezone: "UTC"
|
||||||
|
```
|
||||||
|
|
||||||
|
2. **Check plugin sidecar for WAL access**
|
||||||
|
```bash
|
||||||
|
# Check sidecar logs during recovery for WAL-related errors
|
||||||
|
kubectl logs -n <namespace> <cluster-pod> -c plugin-barman-cloud | grep -i wal
|
||||||
|
```
|
||||||
|
|
||||||
|
:::note
|
||||||
|
For detailed PITR configuration and WAL management, see the [Barman PITR documentation](https://docs.pgbarman.org/latest/).
|
||||||
|
:::
|
||||||
|
|
||||||
|
### Plugin Configuration Issues
|
||||||
|
|
||||||
|
#### Plugin cannot connect to object storage
|
||||||
|
|
||||||
|
**Symptoms:**
|
||||||
|
- Plugin sidecar logs show connection errors
|
||||||
|
- Backups fail with authentication or network errors
|
||||||
|
- ObjectStore resource shows errors
|
||||||
|
|
||||||
|
**Plugin-specific solutions:**
|
||||||
|
|
||||||
|
1. **Verify ObjectStore CRD configuration**
|
||||||
|
```bash
|
||||||
|
# Check ObjectStore resource status
|
||||||
|
kubectl get objectstores.barmancloud.cnpg.io -n <namespace> <objectstore-name> -o yaml
|
||||||
|
|
||||||
|
# Verify the secret exists and has correct keys for your provider
|
||||||
|
kubectl get secret -n <namespace> <secret-name> -o jsonpath='{.data}' | jq 'keys'
|
||||||
|
```
|
||||||
|
|
||||||
|
2. **Check plugin sidecar connectivity**
|
||||||
|
```bash
|
||||||
|
# Check sidecar logs for connection errors
|
||||||
|
kubectl logs -n <namespace> <cluster-pod> -c plugin-barman-cloud | grep -E "connection|timeout|SSL|certificate"
|
||||||
|
```
|
||||||
|
|
||||||
|
3. **Provider-specific configuration**
|
||||||
|
- See [Object Store Configuration](object_stores.md) for provider-specific settings
|
||||||
|
- Ensure `endpointURL` and `s3UsePathStyle` match your storage type
|
||||||
|
- Verify network policies allow egress to your storage provider
|
||||||
|
|
||||||
|
## Diagnostic Commands
|
||||||
|
|
||||||
|
### Using kubectl-cnpg plugin
|
||||||
|
|
||||||
|
The kubectl-cnpg plugin provides enhanced debugging capabilities. Make sure you have it installed and updated:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Install or update kubectl-cnpg plugin
|
||||||
|
kubectl krew install cnpg
|
||||||
|
# Or download directly from: https://github.com/cloudnative-pg/cloudnative-pg/releases
|
||||||
|
|
||||||
|
# Check plugin status (requires CNPG 1.27.0+)
|
||||||
|
kubectl cnpg status <cluster-name> -n <namespace>
|
||||||
|
|
||||||
|
# View cluster status in detail
|
||||||
|
kubectl cnpg status <cluster-name> -n <namespace> --verbose
|
||||||
|
|
||||||
|
# Check backup status
|
||||||
|
kubectl cnpg backup list -n <namespace>
|
||||||
|
|
||||||
|
# View plugin capabilities
|
||||||
|
kubectl cnpg plugin list -n <namespace>
|
||||||
|
```
|
||||||
|
|
||||||
|
## Getting Help
|
||||||
|
|
||||||
|
If you continue to experience issues:
|
||||||
|
|
||||||
|
1. **Check the documentation**
|
||||||
|
- Review the [Installation Guide](installation.mdx)
|
||||||
|
- Check [Object Store Configuration](object_stores.md) for provider-specific settings
|
||||||
|
- Review [Usage Examples](usage.md) for correct configuration patterns
|
||||||
|
|
||||||
|
2. **Gather diagnostic information**
|
||||||
|
```bash
|
||||||
|
# Create a diagnostic bundle (⚠️sanitize these before sharing!)
|
||||||
|
kubectl get objectstores.barmancloud.cnpg.io -A -o yaml > /tmp/objectstores.yaml
|
||||||
|
kubectl get clusters.postgresql.cnpg.io -A -o yaml > /tmp/clusters.yaml
|
||||||
|
kubectl logs -n cnpg-system deployment/barman-cloud --tail=1000 > /tmp/plugin.log
|
||||||
|
```
|
||||||
|
|
||||||
|
3. **Community support**
|
||||||
|
- CloudNativePG Slack: [#cloudnativepg-users](https://cloud-native.slack.com/messages/cloudnativepg-users)
|
||||||
|
- GitHub Issues: [plugin-barman-cloud](https://github.com/cloudnative-pg/plugin-barman-cloud/issues)
|
||||||
|
|
||||||
|
4. **When reporting issues, include:**
|
||||||
|
- CloudNativePG version
|
||||||
|
- Barman Cloud Plugin version
|
||||||
|
- Kubernetes version
|
||||||
|
- Cloud provider and region
|
||||||
|
- Relevant configuration (⚠️sanitize/redact sensitive information)
|
||||||
|
- Error messages and logs
|
||||||
|
- Steps to reproduce
|
||||||
|
|
||||||
|
## Known Issues and Limitations
|
||||||
|
|
||||||
|
### Current Known Issues
|
||||||
|
|
||||||
|
1. **WAL overwrite protection**: Unlike the in-tree Barman archiver, the plugin doesn't prevent WAL overwrites when multiple clusters share the same name and object store path ([#263](https://github.com/cloudnative-pg/plugin-barman-cloud/issues/263))
|
||||||
|
2. **Migration compatibility**: After migrating from in-tree backup to the plugin, the `kubectl cnpg backup` command syntax has changed ([#353](https://github.com/cloudnative-pg/plugin-barman-cloud/issues/353)):
|
||||||
|
```bash
|
||||||
|
# Old command (in-tree, no longer works after migration)
|
||||||
|
kubectl cnpg backup -n <namespace> <cluster-name> --method=barmanObjectStore
|
||||||
|
|
||||||
|
# New command (plugin-based)
|
||||||
|
kubectl cnpg backup -n <namespace> <cluster-name> --method=plugin --plugin-name=barman-cloud.cloudnative-pg.io
|
||||||
|
```
|
||||||
|
|
||||||
|
### Plugin Limitations
|
||||||
|
|
||||||
|
1. **Installation method**: Currently only supports manifest and Kustomize installation ([#351](https://github.com/cloudnative-pg/plugin-barman-cloud/issues/351) - Helm chart requested)
|
||||||
|
2. **Sidecar resource sharing**: The plugin sidecar container shares pod resources with PostgreSQL
|
||||||
|
3. **Plugin restart behavior**: Restarting the sidecar container requires restarting the entire PostgreSQL pod
|
||||||
|
|
||||||
|
### Compatibility Matrix
|
||||||
|
|
||||||
|
| Plugin Version | CloudNativePG Version | Kubernetes Version | Notes |
|
||||||
|
|---------------|----------------------|-------------------|--------|
|
||||||
|
| 0.6.x | 1.26.x, **1.27.x** (recommended) | 1.28+ | CNPG 1.27.0+ provides enhanced plugin status reporting |
|
||||||
|
| 0.5.x | 1.25.x, 1.26.x | 1.27+ | Limited plugin diagnostics |
|
||||||
|
|
||||||
|
:::tip
|
||||||
|
Always check the [Release Notes](https://github.com/cloudnative-pg/plugin-barman-cloud/releases) for version-specific known issues and fixes.
|
||||||
|
:::
|
||||||
@ -81,8 +81,34 @@ This configuration enables both WAL archiving and data directory backups.
|
|||||||
|
|
||||||
## Performing a Base Backup
|
## Performing a Base Backup
|
||||||
|
|
||||||
Once WAL archiving is enabled, the cluster is ready for backups. To issue an
|
Once WAL archiving is enabled, the cluster is ready for backups.
|
||||||
on-demand backup, use the following configuration with the plugin method:
|
|
||||||
|
### Using kubectl-cnpg plugin
|
||||||
|
|
||||||
|
The quickest way to create an on-demand backup is using the kubectl-cnpg plugin:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
kubectl cnpg backup -n <namespace> <cluster-name> \
|
||||||
|
--method=plugin \
|
||||||
|
--plugin-name=barman-cloud.cloudnative-pg.io
|
||||||
|
```
|
||||||
|
|
||||||
|
:::note Migration from in-tree backup
|
||||||
|
If you're migrating from the in-tree backup system, note that the command has changed from:
|
||||||
|
```bash
|
||||||
|
# Old command (in-tree backup)
|
||||||
|
kubectl cnpg backup -n <namespace> <cluster-name> --method=barmanObjectStore
|
||||||
|
```
|
||||||
|
to:
|
||||||
|
```bash
|
||||||
|
# New command (plugin backup)
|
||||||
|
kubectl cnpg backup -n <namespace> <cluster-name> --method=plugin --plugin-name=barman-cloud.cloudnative-pg.io
|
||||||
|
```
|
||||||
|
:::
|
||||||
|
|
||||||
|
### Using YAML manifests
|
||||||
|
|
||||||
|
Alternatively, you can create a backup using a YAML manifest:
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
apiVersion: postgresql.cnpg.io/v1
|
apiVersion: postgresql.cnpg.io/v1
|
||||||
|
|||||||
@ -1,5 +1,7 @@
|
|||||||
import type {ComponentProps, ComponentType, ReactElement} from "react";
|
import type {ComponentProps, ComponentType, ReactElement} from "react";
|
||||||
import clsx from "clsx";
|
import clsx from "clsx";
|
||||||
|
|
||||||
|
import Link from "@docusaurus/Link";
|
||||||
import styles from "@site/src/components/HomepageFeatures/styles.module.css";
|
import styles from "@site/src/components/HomepageFeatures/styles.module.css";
|
||||||
import Heading from "@theme/Heading";
|
import Heading from "@theme/Heading";
|
||||||
|
|
||||||
@ -25,25 +27,52 @@ function Feature({title, Svg, description}: FeatureItem): ReactElement<FeatureIt
|
|||||||
|
|
||||||
export function FeatureList(): ReactElement<null> {
|
export function FeatureList(): ReactElement<null> {
|
||||||
return (
|
return (
|
||||||
<div className="row">
|
<>
|
||||||
<Feature
|
<div className="row">
|
||||||
title={'Backup your clusters'}
|
<Feature
|
||||||
description={"Securely backup your CloudNativePG clusters to object storage with configurable retention " +
|
title={'Backup your clusters'}
|
||||||
"policies and compression options"}
|
description={"Securely backup your CloudNativePG clusters to object storage with configurable retention " +
|
||||||
Svg={require('@site/static/img/undraw_going-up_g8av.svg').default}
|
"policies and compression options"}
|
||||||
|
Svg={require('@site/static/img/undraw_going-up_g8av.svg').default}
|
||||||
|
|
||||||
/>
|
/>
|
||||||
<Feature
|
<Feature
|
||||||
title={'Restore to any point in time'}
|
title={'Restore to any point in time'}
|
||||||
description={"Perform flexible restores to any point in time using a combination of " +
|
description={"Perform flexible restores to any point in time using a combination of " +
|
||||||
"base backups and WAL archives."}
|
"base backups and WAL archives."}
|
||||||
Svg={require('@site/static/img/undraw_season-change_ohe6.svg').default}
|
Svg={require('@site/static/img/undraw_season-change_ohe6.svg').default}
|
||||||
/>
|
/>
|
||||||
<Feature
|
<Feature
|
||||||
title={'Cloud-native architecture'}
|
title={'Cloud-native architecture'}
|
||||||
description={"Seamlessly integrate with all major cloud providers and on-premises object storage solutions."}
|
description={"Seamlessly integrate with all major cloud providers and on-premises object storage solutions."}
|
||||||
Svg={require('@site/static/img/undraw_maintenance_rjtm.svg').default}
|
Svg={require('@site/static/img/undraw_maintenance_rjtm.svg').default}
|
||||||
/>
|
/>
|
||||||
</div>
|
</div>
|
||||||
|
<div className={clsx('row', styles.quickLinksSection)}>
|
||||||
|
<div className="col col--12">
|
||||||
|
<Heading as="h2" className={styles.quickLinksTitle}>Quick Links</Heading>
|
||||||
|
<div className={styles.quickLinksContainer}>
|
||||||
|
<Link to="/docs/installation" className="button button--primary">
|
||||||
|
Installation Guide
|
||||||
|
</Link>
|
||||||
|
<Link to="/docs/concepts" className="button button--primary">
|
||||||
|
Core Concepts
|
||||||
|
</Link>
|
||||||
|
<Link to="/docs/usage" className="button button--primary">
|
||||||
|
Usage Examples
|
||||||
|
</Link>
|
||||||
|
<Link to="/docs/object_stores" className="button button--primary">
|
||||||
|
Object Store Setup
|
||||||
|
</Link>
|
||||||
|
<Link to="/docs/migration" className="button button--primary">
|
||||||
|
Migration Guide
|
||||||
|
</Link>
|
||||||
|
<Link to="/docs/troubleshooting" className="button button--primary">
|
||||||
|
Troubleshooting
|
||||||
|
</Link>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</>
|
||||||
)
|
)
|
||||||
}
|
}
|
||||||
|
|||||||
@ -9,3 +9,19 @@
|
|||||||
height: 200px;
|
height: 200px;
|
||||||
width: 200px;
|
width: 200px;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
.quickLinksSection {
|
||||||
|
margin-top: 3rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
.quickLinksTitle {
|
||||||
|
text-align: center;
|
||||||
|
margin-bottom: 2rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
.quickLinksContainer {
|
||||||
|
display: flex;
|
||||||
|
justify-content: center;
|
||||||
|
gap: 1rem;
|
||||||
|
flex-wrap: wrap;
|
||||||
|
}
|
||||||
|
|||||||
@ -16,6 +16,16 @@ function HomepageHeader(): ReactElement<null> {
|
|||||||
<Heading as="h1" className="hero__title">
|
<Heading as="h1" className="hero__title">
|
||||||
{siteConfig.title}
|
{siteConfig.title}
|
||||||
</Heading>
|
</Heading>
|
||||||
|
<p className="hero__subtitle">
|
||||||
|
Enable online continuous physical backups of PostgreSQL clusters to object storage
|
||||||
|
</p>
|
||||||
|
<div className={styles.buttons}>
|
||||||
|
<Link
|
||||||
|
className="button button--secondary button--lg"
|
||||||
|
to="/docs/intro">
|
||||||
|
Get Started with the Documentation →
|
||||||
|
</Link>
|
||||||
|
</div>
|
||||||
</div>
|
</div>
|
||||||
</header>
|
</header>
|
||||||
);
|
);
|
||||||
|
|||||||
@ -23,9 +23,14 @@ for detailed instructions.
|
|||||||
|
|
||||||
To use the Barman Cloud Plugin, you need:
|
To use the Barman Cloud Plugin, you need:
|
||||||
|
|
||||||
- [CloudNativePG](https://cloudnative-pg.io) version **1.26** <!-- ADD WHEN 1.27 IS OUT "or later" -->
|
- [CloudNativePG](https://cloudnative-pg.io) version **1.26** or later
|
||||||
|
- **Version 1.27.0 or later is strongly recommended** for enhanced plugin error and status reporting
|
||||||
|
- See the [upgrade guide](https://cloudnative-pg.io/documentation/current/installation_upgrade) if you need to upgrade
|
||||||
- [cert-manager](https://cert-manager.io/) to enable TLS communication between
|
- [cert-manager](https://cert-manager.io/) to enable TLS communication between
|
||||||
the plugin and the operator
|
the plugin and the operator
|
||||||
|
- [kubectl-cnpg plugin](https://cloudnative-pg.io/documentation/current/kubectl-plugin/) (recommended)
|
||||||
|
- Provides enhanced debugging and status checking capabilities
|
||||||
|
- Install via `kubectl krew install cnpg` or download from [releases](https://github.com/cloudnative-pg/cloudnative-pg/releases)
|
||||||
|
|
||||||
## Key Features
|
## Key Features
|
||||||
|
|
||||||
|
|||||||
491
web/versioned_docs/version-0.6.0/troubleshooting.md
Normal file
491
web/versioned_docs/version-0.6.0/troubleshooting.md
Normal file
@ -0,0 +1,491 @@
|
|||||||
|
---
|
||||||
|
sidebar_position: 50
|
||||||
|
---
|
||||||
|
|
||||||
|
# Troubleshooting
|
||||||
|
|
||||||
|
<!-- SPDX-License-Identifier: CC-BY-4.0 -->
|
||||||
|
|
||||||
|
This guide helps you diagnose and resolve common issues with the Barman Cloud Plugin.
|
||||||
|
|
||||||
|
## Before You Begin
|
||||||
|
|
||||||
|
### Recommended Upgrades
|
||||||
|
|
||||||
|
:::important
|
||||||
|
**CloudNativePG 1.27.0** offers significantly improved error and status reporting for plugins. If you're experiencing issues, we strongly recommend upgrading to version 1.27.0 or later for better diagnostics.
|
||||||
|
|
||||||
|
- **Upgrade CloudNativePG**: Follow the [official upgrade guide](https://cloudnative-pg.io/documentation/current/installation_upgrade)
|
||||||
|
- **Update kubectl-cnpg plugin**: Install or update the kubectl plugin for better debugging capabilities. See the [kubectl plugin documentation](https://cloudnative-pg.io/documentation/current/kubectl-plugin/)
|
||||||
|
:::
|
||||||
|
|
||||||
|
### Viewing Logs
|
||||||
|
|
||||||
|
To effectively troubleshoot issues, you need to check logs from multiple sources:
|
||||||
|
|
||||||
|
:::note
|
||||||
|
The following commands assume you've installed the CloudNativePG operator in the default `cnpg-system` namespace. If you've installed it in a different namespace, adjust the commands accordingly.
|
||||||
|
:::
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# View operator logs (contains plugin interaction logs)
|
||||||
|
# Assumes operator is installed in the default cnpg-system namespace
|
||||||
|
kubectl logs -n cnpg-system deployment/cnpg-controller-manager -f
|
||||||
|
|
||||||
|
# View sidecar container logs (barman-cloud operations)
|
||||||
|
kubectl logs -n <namespace> <cluster-pod-name> -c plugin-barman-cloud -f
|
||||||
|
|
||||||
|
# View plugin manager logs
|
||||||
|
kubectl logs -n cnpg-system deployment/barman-cloud -f
|
||||||
|
|
||||||
|
# View all containers in a pod
|
||||||
|
kubectl logs -n <namespace> <cluster-pod-name> --all-containers=true
|
||||||
|
|
||||||
|
# View previous container logs (if container restarted)
|
||||||
|
kubectl logs -n <namespace> <cluster-pod-name> -c plugin-barman-cloud --previous
|
||||||
|
```
|
||||||
|
|
||||||
|
## Common Issues
|
||||||
|
|
||||||
|
### Plugin Installation Issues
|
||||||
|
|
||||||
|
#### Plugin pods not starting
|
||||||
|
|
||||||
|
**Symptoms:**
|
||||||
|
- Plugin pods are in `CrashLoopBackOff` or `Error` state
|
||||||
|
- Plugin deployment is not ready
|
||||||
|
|
||||||
|
**Possible causes and solutions:**
|
||||||
|
|
||||||
|
1. **Certificate issues**
|
||||||
|
```bash
|
||||||
|
# Check if cert-manager is installed and running
|
||||||
|
kubectl get pods -n cert-manager
|
||||||
|
|
||||||
|
# Check if the plugin certificate is created
|
||||||
|
kubectl get certificates -n cnpg-system
|
||||||
|
```
|
||||||
|
|
||||||
|
If cert-manager is not installed, install it first:
|
||||||
|
```bash
|
||||||
|
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/latest/download/cert-manager.yaml
|
||||||
|
```
|
||||||
|
|
||||||
|
2. **Image pull errors**
|
||||||
|
```bash
|
||||||
|
# Check pod events for image pull errors
|
||||||
|
kubectl describe pod -n cnpg-system -l app.kubernetes.io/name=barman-cloud
|
||||||
|
```
|
||||||
|
|
||||||
|
Verify the image exists and you have proper credentials if using a private registry.
|
||||||
|
|
||||||
|
3. **Resource constraints**
|
||||||
|
```bash
|
||||||
|
# Check node resources
|
||||||
|
kubectl top nodes
|
||||||
|
kubectl describe nodes
|
||||||
|
```
|
||||||
|
|
||||||
|
Ensure your cluster has sufficient CPU and memory resources.
|
||||||
|
|
||||||
|
### Backup Failures
|
||||||
|
|
||||||
|
#### Quick Backup Troubleshooting Checklist
|
||||||
|
|
||||||
|
When a backup fails, follow these steps in order:
|
||||||
|
|
||||||
|
1. **Check backup status**: `kubectl get backups.postgresql.cnpg.io -n <namespace>`
|
||||||
|
2. **Get error details and target pod**:
|
||||||
|
```bash
|
||||||
|
kubectl describe backups.postgresql.cnpg.io -n <namespace> <backup-name>
|
||||||
|
# Or extract just the target pod name
|
||||||
|
kubectl get backups.postgresql.cnpg.io -n <namespace> <backup-name> -o jsonpath='{.status.instanceID.podName}'
|
||||||
|
```
|
||||||
|
3. **Check the specific target pod's sidecar logs**:
|
||||||
|
```bash
|
||||||
|
TARGET_POD=$(kubectl get backups.postgresql.cnpg.io -n <namespace> <backup-name> -o jsonpath='{.status.instanceID.podName}')
|
||||||
|
kubectl logs -n <namespace> $TARGET_POD -c plugin-barman-cloud --tail=100 | grep -E "ERROR|FATAL|panic"
|
||||||
|
```
|
||||||
|
4. **Check cluster events**: `kubectl get events -n <namespace> --field-selector involvedObject.name=<cluster-name> --sort-by='.lastTimestamp'`
|
||||||
|
5. **Verify plugin is running**: `kubectl get pods -n cnpg-system -l app.kubernetes.io/name=barman-cloud`
|
||||||
|
6. **Check operator logs**: `kubectl logs -n cnpg-system deployment/cnpg-controller-manager --tail=100 | grep -i "backup\|plugin"`
|
||||||
|
7. **Check plugin manager logs**: `kubectl logs -n cnpg-system deployment/barman-cloud --tail=100`
|
||||||
|
|
||||||
|
#### Backup job fails immediately
|
||||||
|
|
||||||
|
**Symptoms:**
|
||||||
|
- Backup pods terminate with error
|
||||||
|
- No backup files appear in object storage
|
||||||
|
- Backup shows `failed` phase with various error messages
|
||||||
|
|
||||||
|
**Common failure modes and solutions:**
|
||||||
|
|
||||||
|
1. **"requested plugin is not available" errors**
|
||||||
|
```
|
||||||
|
ERROR: requested plugin is not available: barman
|
||||||
|
ERROR: requested plugin is not available: barman-cloud
|
||||||
|
ERROR: requested plugin is not available: barman-cloud.cloudnative-pg.io
|
||||||
|
```
|
||||||
|
|
||||||
|
**Cause:** The plugin name in the Cluster configuration doesn't match the deployed plugin or the plugin isn't registered
|
||||||
|
|
||||||
|
**Solution:**
|
||||||
|
|
||||||
|
a. **Check plugin registration status**:
|
||||||
|
```bash
|
||||||
|
# If you have kubectl-cnpg plugin installed (v1.27.0+)
|
||||||
|
kubectl cnpg status -n <namespace> <cluster-name>
|
||||||
|
```
|
||||||
|
|
||||||
|
Look for the "Plugins status" section:
|
||||||
|
```
|
||||||
|
Plugins status
|
||||||
|
Name Version Status Reported Operator Capabilities
|
||||||
|
---- ------- ------ ------------------------------
|
||||||
|
barman-cloud.cloudnative-pg.io 0.6.0 N/A Reconciler Hooks, Lifecycle Service
|
||||||
|
```
|
||||||
|
|
||||||
|
:::tip
|
||||||
|
If the Plugins status section is missing:
|
||||||
|
- Install or update kubectl-cnpg plugin to the latest version
|
||||||
|
- Ensure CloudNativePG operator is v1.27.0 or later
|
||||||
|
:::
|
||||||
|
|
||||||
|
b. **Verify correct plugin name in Cluster spec**:
|
||||||
|
```yaml
|
||||||
|
apiVersion: postgresql.cnpg.io/v1
|
||||||
|
kind: Cluster
|
||||||
|
spec:
|
||||||
|
plugins:
|
||||||
|
- name: barman-cloud.cloudnative-pg.io
|
||||||
|
parameters:
|
||||||
|
barmanObjectStore: <your-objectstore-name>
|
||||||
|
```
|
||||||
|
|
||||||
|
c. **Check plugin deployment is running**:
|
||||||
|
```bash
|
||||||
|
kubectl get deployment -n cnpg-system barman-cloud
|
||||||
|
```
|
||||||
|
|
||||||
|
2. **"rpc error: code = Unknown desc = panic caught: assignment to entry in nil map" errors**
|
||||||
|
|
||||||
|
**Cause:** Configuration issue, often a typo or missing required field in the ObjectStore configuration
|
||||||
|
|
||||||
|
**Solution:**
|
||||||
|
- Check the sidecar container logs for detailed error messages:
|
||||||
|
```bash
|
||||||
|
kubectl logs -n <namespace> <cluster-pod> -c plugin-barman-cloud
|
||||||
|
```
|
||||||
|
- Verify your ObjectStore configuration has all required fields
|
||||||
|
- Common issues include:
|
||||||
|
- Missing or incorrect secret references
|
||||||
|
- Typos in configuration parameters
|
||||||
|
- Missing required environment variables in secrets
|
||||||
|
|
||||||
|
**General debugging steps:**
|
||||||
|
|
||||||
|
1. **Check backup status and identify the target instance**
|
||||||
|
```bash
|
||||||
|
# List all backups and their status
|
||||||
|
kubectl get backups.postgresql.cnpg.io -n <namespace>
|
||||||
|
|
||||||
|
# Using kubectl-cnpg plugin (if installed)
|
||||||
|
kubectl cnpg backup list -n <namespace>
|
||||||
|
|
||||||
|
# Get detailed backup information including error messages and target instance
|
||||||
|
kubectl describe backups.postgresql.cnpg.io -n <namespace> <backup-name>
|
||||||
|
|
||||||
|
# Extract the target pod name from a failed backup
|
||||||
|
kubectl get backups.postgresql.cnpg.io -n <namespace> <backup-name> -o jsonpath='{.status.instanceID.podName}'
|
||||||
|
|
||||||
|
# Or get more details including the target pod, method, phase and error
|
||||||
|
kubectl get backups.postgresql.cnpg.io -n <namespace> <backup-name> -o jsonpath='Pod: {.status.instanceID.podName}{"\n"}Method: {.status.method}{"\n"}Phase: {.status.phase}{"\n"}Error: {.status.error}{"\n"}'
|
||||||
|
|
||||||
|
# Check the cluster status for backup-related information
|
||||||
|
kubectl cnpg status <cluster-name> -n <namespace> --verbose
|
||||||
|
```
|
||||||
|
|
||||||
|
2. **Check sidecar logs on the backup target pod**
|
||||||
|
```bash
|
||||||
|
# First, identify which pod was the backup target (from step 1)
|
||||||
|
TARGET_POD=$(kubectl get backups.postgresql.cnpg.io -n <namespace> <backup-name> -o jsonpath='{.status.instanceID.podName}')
|
||||||
|
echo "Backup target pod: $TARGET_POD"
|
||||||
|
|
||||||
|
# Check the sidecar logs on the specific target pod
|
||||||
|
kubectl logs -n <namespace> $TARGET_POD -c plugin-barman-cloud --tail=100
|
||||||
|
|
||||||
|
# Follow the logs in real-time to see ongoing issues
|
||||||
|
kubectl logs -n <namespace> $TARGET_POD -c plugin-barman-cloud -f
|
||||||
|
|
||||||
|
# Check for specific errors in the target pod around the backup time
|
||||||
|
kubectl logs -n <namespace> $TARGET_POD -c plugin-barman-cloud --since=10m | grep -E "ERROR|FATAL|panic|failed"
|
||||||
|
|
||||||
|
# Alternative: List all cluster pods and their roles
|
||||||
|
kubectl get pods -n <namespace> -l cnpg.io/cluster=<cluster-name> \
|
||||||
|
-o custom-columns=NAME:.metadata.name,ROLE:.metadata.labels.cnpg\\.io/instanceRole,INSTANCE:.metadata.labels.cnpg\\.io/instanceName
|
||||||
|
|
||||||
|
# Check sidecar logs on ALL cluster pods for any errors (if target is unclear)
|
||||||
|
for pod in $(kubectl get pods -n <namespace> -l cnpg.io/cluster=<cluster-name> -o name); do
|
||||||
|
echo "=== Checking $pod ==="
|
||||||
|
kubectl logs -n <namespace> $pod -c plugin-barman-cloud --tail=20 | grep -i error || echo "No errors found"
|
||||||
|
done
|
||||||
|
```
|
||||||
|
|
||||||
|
3. **Check events for backup-related issues**
|
||||||
|
```bash
|
||||||
|
# Check events for the cluster
|
||||||
|
kubectl get events -n <namespace> --field-selector involvedObject.name=<cluster-name>
|
||||||
|
|
||||||
|
# Check events for failed backups
|
||||||
|
kubectl get events -n <namespace> --field-selector involvedObject.kind=Backup
|
||||||
|
|
||||||
|
# Get all recent events in the namespace
|
||||||
|
kubectl get events -n <namespace> --sort-by='.lastTimestamp' | tail -20
|
||||||
|
```
|
||||||
|
|
||||||
|
4. **Verify ObjectStore configuration**
|
||||||
|
```bash
|
||||||
|
# Check the ObjectStore resource
|
||||||
|
kubectl get objectstores.barmancloud.cnpg.io -n <namespace> <objectstore-name> -o yaml
|
||||||
|
|
||||||
|
# Verify the secret exists and has correct keys
|
||||||
|
kubectl get secret -n <namespace> <secret-name> -o yaml
|
||||||
|
```
|
||||||
|
|
||||||
|
5. **Common error messages and solutions:**
|
||||||
|
|
||||||
|
- **"AccessDenied" or "403 Forbidden"**: Check cloud credentials and bucket permissions
|
||||||
|
- **"NoSuchBucket"**: Verify the bucket exists and the endpoint URL is correct
|
||||||
|
- **"Connection timeout"**: Check network connectivity and firewall rules
|
||||||
|
- **"SSL certificate problem"**: For self-signed certificates, check CA bundle configuration
|
||||||
|
|
||||||
|
#### Backup performance issues
|
||||||
|
|
||||||
|
**Symptoms:**
|
||||||
|
- Backups take extremely long
|
||||||
|
- Backups timeout
|
||||||
|
|
||||||
|
**Plugin-specific considerations:**
|
||||||
|
|
||||||
|
1. **Check ObjectStore parallelism settings**
|
||||||
|
- Adjust `maxParallel` in ObjectStore configuration
|
||||||
|
- Monitor sidecar container resource usage during backups
|
||||||
|
|
||||||
|
2. **Verify plugin resource allocation**
|
||||||
|
- Check if the sidecar container has sufficient CPU/memory
|
||||||
|
- Review plugin container logs for resource-related warnings
|
||||||
|
|
||||||
|
:::tip
|
||||||
|
For Barman-specific features like compression, encryption, and performance tuning, refer to the [Barman documentation](https://docs.pgbarman.org/latest/).
|
||||||
|
:::
|
||||||
|
|
||||||
|
### WAL Archiving Issues
|
||||||
|
|
||||||
|
#### WAL archiving through plugin stops working
|
||||||
|
|
||||||
|
**Symptoms:**
|
||||||
|
- WAL files accumulating on primary
|
||||||
|
- Cluster warnings about WAL archiving
|
||||||
|
- Plugin sidecar logs show WAL archive errors
|
||||||
|
|
||||||
|
**Plugin-specific debugging:**
|
||||||
|
|
||||||
|
1. **Check plugin sidecar logs for WAL archiving errors**
|
||||||
|
```bash
|
||||||
|
# Check recent WAL archive operations in sidecar
|
||||||
|
kubectl logs -n <namespace> <primary-pod> -c plugin-barman-cloud --tail=50 | grep -i wal
|
||||||
|
```
|
||||||
|
|
||||||
|
2. **Verify the plugin is handling archive_command**
|
||||||
|
```bash
|
||||||
|
# The archive_command should be routing through the plugin
|
||||||
|
kubectl exec -n <namespace> <primary-pod> -c postgres -- psql -U postgres -c "SHOW archive_command;"
|
||||||
|
```
|
||||||
|
|
||||||
|
3. **Check ObjectStore configuration for WAL settings**
|
||||||
|
- Ensure ObjectStore has proper WAL retention settings
|
||||||
|
- Verify credentials have permissions for WAL operations
|
||||||
|
|
||||||
|
### Restore Issues
|
||||||
|
|
||||||
|
#### Restore fails during recovery
|
||||||
|
|
||||||
|
**Symptoms:**
|
||||||
|
- New cluster stuck in recovery mode
|
||||||
|
- Plugin sidecar shows restore errors
|
||||||
|
- PostgreSQL won't start
|
||||||
|
|
||||||
|
**Plugin-specific debugging:**
|
||||||
|
|
||||||
|
1. **Check plugin sidecar logs during restore**
|
||||||
|
```bash
|
||||||
|
# Check the sidecar logs on the recovering cluster pods
|
||||||
|
kubectl logs -n <namespace> <cluster-pod-name> -c plugin-barman-cloud --tail=100
|
||||||
|
|
||||||
|
# Look for restore-related errors
|
||||||
|
kubectl logs -n <namespace> <cluster-pod-name> -c plugin-barman-cloud | grep -E "restore|recovery|ERROR"
|
||||||
|
```
|
||||||
|
|
||||||
|
2. **Verify plugin can access backups**
|
||||||
|
```bash
|
||||||
|
# Check if ObjectStore is properly configured for restore
|
||||||
|
kubectl get objectstores.barmancloud.cnpg.io -n <namespace> <objectstore-name> -o yaml
|
||||||
|
|
||||||
|
# Check PostgreSQL recovery logs
|
||||||
|
kubectl logs -n <namespace> <cluster-pod> -c postgres | grep -i recovery
|
||||||
|
```
|
||||||
|
|
||||||
|
:::tip
|
||||||
|
For detailed Barman restore operations and troubleshooting, refer to the [Barman documentation](https://docs.pgbarman.org/latest/barman-cloud-restore.html).
|
||||||
|
:::
|
||||||
|
|
||||||
|
#### Point-in-time recovery (PITR) configuration issues
|
||||||
|
|
||||||
|
**Symptoms:**
|
||||||
|
- PITR target time not reached
|
||||||
|
- Plugin sidecar shows WAL access errors
|
||||||
|
- Recovery stops before target
|
||||||
|
|
||||||
|
**Plugin-specific configuration:**
|
||||||
|
|
||||||
|
1. **Verify plugin configuration for PITR**
|
||||||
|
```yaml
|
||||||
|
apiVersion: postgresql.cnpg.io/v1
|
||||||
|
kind: Cluster
|
||||||
|
spec:
|
||||||
|
plugins:
|
||||||
|
- name: barman-cloud.cloudnative-pg.io
|
||||||
|
parameters:
|
||||||
|
barmanObjectStore: <objectstore-name>
|
||||||
|
bootstrap:
|
||||||
|
recovery:
|
||||||
|
recoveryTarget:
|
||||||
|
targetTime: "2024-01-15 10:30:00"
|
||||||
|
targetTimezone: "UTC"
|
||||||
|
```
|
||||||
|
|
||||||
|
2. **Check plugin sidecar for WAL access**
|
||||||
|
```bash
|
||||||
|
# Check sidecar logs during recovery for WAL-related errors
|
||||||
|
kubectl logs -n <namespace> <cluster-pod> -c plugin-barman-cloud | grep -i wal
|
||||||
|
```
|
||||||
|
|
||||||
|
:::note
|
||||||
|
For detailed PITR configuration and WAL management, see the [Barman PITR documentation](https://docs.pgbarman.org/latest/).
|
||||||
|
:::
|
||||||
|
|
||||||
|
### Plugin Configuration Issues
|
||||||
|
|
||||||
|
#### Plugin cannot connect to object storage
|
||||||
|
|
||||||
|
**Symptoms:**
|
||||||
|
- Plugin sidecar logs show connection errors
|
||||||
|
- Backups fail with authentication or network errors
|
||||||
|
- ObjectStore resource shows errors
|
||||||
|
|
||||||
|
**Plugin-specific solutions:**
|
||||||
|
|
||||||
|
1. **Verify ObjectStore CRD configuration**
|
||||||
|
```bash
|
||||||
|
# Check ObjectStore resource status
|
||||||
|
kubectl get objectstores.barmancloud.cnpg.io -n <namespace> <objectstore-name> -o yaml
|
||||||
|
|
||||||
|
# Verify the secret exists and has correct keys for your provider
|
||||||
|
kubectl get secret -n <namespace> <secret-name> -o jsonpath='{.data}' | jq 'keys'
|
||||||
|
```
|
||||||
|
|
||||||
|
2. **Check plugin sidecar connectivity**
|
||||||
|
```bash
|
||||||
|
# Check sidecar logs for connection errors
|
||||||
|
kubectl logs -n <namespace> <cluster-pod> -c plugin-barman-cloud | grep -E "connection|timeout|SSL|certificate"
|
||||||
|
```
|
||||||
|
|
||||||
|
3. **Provider-specific configuration**
|
||||||
|
- See [Object Store Configuration](object_stores.md) for provider-specific settings
|
||||||
|
- Ensure `endpointURL` and `s3UsePathStyle` match your storage type
|
||||||
|
- Verify network policies allow egress to your storage provider
|
||||||
|
|
||||||
|
## Diagnostic Commands
|
||||||
|
|
||||||
|
### Using kubectl-cnpg plugin
|
||||||
|
|
||||||
|
The kubectl-cnpg plugin provides enhanced debugging capabilities. Make sure you have it installed and updated:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Install or update kubectl-cnpg plugin
|
||||||
|
kubectl krew install cnpg
|
||||||
|
# Or download directly from: https://github.com/cloudnative-pg/cloudnative-pg/releases
|
||||||
|
|
||||||
|
# Check plugin status (requires CNPG 1.27.0+)
|
||||||
|
kubectl cnpg status <cluster-name> -n <namespace>
|
||||||
|
|
||||||
|
# View cluster status in detail
|
||||||
|
kubectl cnpg status <cluster-name> -n <namespace> --verbose
|
||||||
|
|
||||||
|
# Check backup status
|
||||||
|
kubectl cnpg backup list -n <namespace>
|
||||||
|
|
||||||
|
# View plugin capabilities
|
||||||
|
kubectl cnpg plugin list -n <namespace>
|
||||||
|
```
|
||||||
|
|
||||||
|
## Getting Help
|
||||||
|
|
||||||
|
If you continue to experience issues:
|
||||||
|
|
||||||
|
1. **Check the documentation**
|
||||||
|
- Review the [Installation Guide](installation.mdx)
|
||||||
|
- Check [Object Store Configuration](object_stores.md) for provider-specific settings
|
||||||
|
- Review [Usage Examples](usage.md) for correct configuration patterns
|
||||||
|
|
||||||
|
2. **Gather diagnostic information**
|
||||||
|
```bash
|
||||||
|
# Create a diagnostic bundle (⚠️sanitize these before sharing!)
|
||||||
|
kubectl get objectstores.barmancloud.cnpg.io -A -o yaml > /tmp/objectstores.yaml
|
||||||
|
kubectl get clusters.postgresql.cnpg.io -A -o yaml > /tmp/clusters.yaml
|
||||||
|
kubectl logs -n cnpg-system deployment/barman-cloud --tail=1000 > /tmp/plugin.log
|
||||||
|
```
|
||||||
|
|
||||||
|
3. **Community support**
|
||||||
|
- CloudNativePG Slack: [#cloudnativepg-users](https://cloud-native.slack.com/messages/cloudnativepg-users)
|
||||||
|
- GitHub Issues: [plugin-barman-cloud](https://github.com/cloudnative-pg/plugin-barman-cloud/issues)
|
||||||
|
|
||||||
|
4. **When reporting issues, include:**
|
||||||
|
- CloudNativePG version
|
||||||
|
- Barman Cloud Plugin version
|
||||||
|
- Kubernetes version
|
||||||
|
- Cloud provider and region
|
||||||
|
- Relevant configuration (⚠️sanitize/redact sensitive information)
|
||||||
|
- Error messages and logs
|
||||||
|
- Steps to reproduce
|
||||||
|
|
||||||
|
## Known Issues and Limitations
|
||||||
|
|
||||||
|
### Current Known Issues
|
||||||
|
|
||||||
|
1. **WAL overwrite protection**: Unlike the in-tree Barman archiver, the plugin doesn't prevent WAL overwrites when multiple clusters share the same name and object store path ([#263](https://github.com/cloudnative-pg/plugin-barman-cloud/issues/263))
|
||||||
|
2. **Migration compatibility**: After migrating from in-tree backup to the plugin, the `kubectl cnpg backup` command syntax has changed ([#353](https://github.com/cloudnative-pg/plugin-barman-cloud/issues/353)):
|
||||||
|
```bash
|
||||||
|
# Old command (in-tree, no longer works after migration)
|
||||||
|
kubectl cnpg backup -n <namespace> <cluster-name> --method=barmanObjectStore
|
||||||
|
|
||||||
|
# New command (plugin-based)
|
||||||
|
kubectl cnpg backup -n <namespace> <cluster-name> --method=plugin --plugin-name=barman-cloud.cloudnative-pg.io
|
||||||
|
```
|
||||||
|
|
||||||
|
### Plugin Limitations
|
||||||
|
|
||||||
|
1. **Installation method**: Currently only supports manifest and Kustomize installation ([#351](https://github.com/cloudnative-pg/plugin-barman-cloud/issues/351) - Helm chart requested)
|
||||||
|
2. **Sidecar resource sharing**: The plugin sidecar container shares pod resources with PostgreSQL
|
||||||
|
3. **Plugin restart behavior**: Restarting the sidecar container requires restarting the entire PostgreSQL pod
|
||||||
|
|
||||||
|
### Compatibility Matrix
|
||||||
|
|
||||||
|
| Plugin Version | CloudNativePG Version | Kubernetes Version | Notes |
|
||||||
|
|---------------|----------------------|-------------------|--------|
|
||||||
|
| 0.6.x | 1.26.x, **1.27.x** (recommended) | 1.28+ | CNPG 1.27.0+ provides enhanced plugin status reporting |
|
||||||
|
| 0.5.x | 1.25.x, 1.26.x | 1.27+ | Limited plugin diagnostics |
|
||||||
|
|
||||||
|
:::tip
|
||||||
|
Always check the [Release Notes](https://github.com/cloudnative-pg/plugin-barman-cloud/releases) for version-specific known issues and fixes.
|
||||||
|
:::
|
||||||
Loading…
Reference in New Issue
Block a user