Backup Linux to S3

This post describes a Bash script that archives a specified directory and securely uploads it to an S3 bucket.

Overview

The script is designed to:

  • Resolve the supplied directory to its canonical absolute path.
  • Create a compressed archive using tar and generate a safe, filename-friendly archive name.
  • Construct an S3 object key based on the target directory.
  • Securely sign and transfer the archive to the specified S3 endpoint using curl.
  • Clean up local artifacts after a successful upload.
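
For example, assuming a run on 2025-04-24 at 18:04 against the hypothetical directory /var/www/html, the script would produce:

   $ backup_to_s3 /var/www/html
   # Local archive:  /tmp/grrg-var-www-html-20250424-1804.tar.gz
   # S3 object key:  html/grrg-var-www-html-20250424-1804.tar.gz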

Our implementation applies practices such as strict error propagation (set -euo pipefail), clear function abstractions, detailed debug logging (when enabled), and credential placeholders that keep secrets out of the source.
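
As a quick, standalone illustration of the error-propagation settings (a toy snippet, not part of the backup script):

#!/bin/bash
set -euo pipefail
# Without pipefail, this pipeline would report the exit status of 'true' (0).
# With pipefail it reports 1 (from 'false'), so 'set -e' aborts right here.
false | true
echo "never reached"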

Best Practices Applied

  1. Robust Error Handling:
    The script is configured with set -euo pipefail so it aborts on the first failing command, treats unset variables as errors, and propagates failures through pipelines.
  2. Canonical Path Resolution:
    Instead of relying solely on the current working directory, we convert the input to a canonical absolute path using realpath (or an equivalent fallback). This guarantees consistent outcomes regardless of how the input path is specified.
  3. Secure Credential Management:
    The script ships with placeholders for the S3 credentials and endpoint details; in a production setting, load these from secure storage (e.g., environment variables or a secrets manager such as Azure Key Vault) rather than hardcoding them, as sketched after this list.
  4. Modular Design:
    The script is broken into well-defined functions (for debugging, path resolution, archiving, uploading, and cleanup), which enhances readability and maintainability.
  5. Clean Output for Integration:
    Debug and progress information is sent to standard error (stderr) to avoid interfering with standard output. This makes the script suitable for integration with other automated processes.
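
As a minimal sketch of the credential handling encouraged in point 3, the hardcoded assignments can be replaced with environment-variable lookups that fail fast when a value is missing. The variable names BACKUP_S3_KEY and BACKUP_S3_SECRET below are illustrative, not part of the script:

# Read credentials from the environment instead of hardcoding them.
# Under 'set -u', ${VAR:?msg} aborts the script if VAR is unset or empty.
s3Key="${BACKUP_S3_KEY:?BACKUP_S3_KEY is not set}"
s3Secret="${BACKUP_S3_SECRET:?BACKUP_S3_SECRET is not set}"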

Complete Script

The following script encapsulates these design principles. Replace the placeholder values <YOUR_S3_KEY>, <YOUR_S3_SECRET>, <YOUR_S3_HOST>, and <YOUR_S3_BUCKET> with your actual credentials and configuration.

#!/bin/bash
set -euo pipefail

# Set DEBUG=1 to enable detailed debug output; 0 disables debug messages.
DEBUG=1

# Timestamp for naming (e.g. 20250424-1804)
now="$(date +'%Y%m%d-%H%M')"

# AWS S3 Credentials – replace these with your actual credentials.
s3Key="<YOUR_S3_KEY>"
s3Secret="<YOUR_S3_SECRET>"

# S3 Configuration – replace placeholders with your S3 host and bucket name.
host="<YOUR_S3_HOST>"        # e.g., s3.amazonaws.com or your designated endpoint.
bucket="<YOUR_S3_BUCKET>"      # e.g., my-backup-bucket.
contentType="application/x-tar"

#-----------------------------------------------------
# Function: debug
#-----------------------------------------------------
# Outputs debugging information to stderr.
debug() {
    if [[ ${DEBUG:-0} -eq 1 ]]; then
        echo "[DEBUG]" "$@" >&2
    fi
}

#-----------------------------------------------------
# Function: resolve_path
#-----------------------------------------------------
# Resolves the supplied path to its canonical absolute path.
resolve_path() {
    local input="$1"
    if command -v realpath >/dev/null 2>&1; then
        realpath "$input"
    elif command -v readlink >/dev/null 2>&1 && readlink -f "$input" >/dev/null 2>&1; then
        readlink -f "$input"
    else
        # Fallback: use pwd with parameter expansion.
        echo "$(cd "$(dirname "$input")" && pwd)/$(basename "$input")"
    fi
}

#-----------------------------------------------------
# Function: create_archive
#-----------------------------------------------------
# Archives the provided directory into a tar.gz file and saves it in /tmp.
# The archive filename is derived from the canonical directory path,
# replacing '/' with '-' to produce a safe filename.
create_archive() {
    local dir="$1"
    # Convert the directory path to its absolute canonical form.
    local abs_dir
    abs_dir=$(resolve_path "$dir")
    debug "Absolute path for directory: $abs_dir"

    # Generate a safe archive name: drop the leading '/' and replace the
    # remaining '/' characters with '-' (avoids a stray leading hyphen).
    local safe_dir
    safe_dir="${abs_dir#/}"
    safe_dir="${safe_dir//\//-}"
    debug "Safe directory name: $safe_dir"

    local filename="/tmp/grrg-${safe_dir}-${now}.tar.gz"
    debug "Creating archive: $filename from directory: $abs_dir"

    # Use tar with the -C flag so only the directory’s basename is stored in the archive.
    tar -czf "$filename" -C "$(dirname "$abs_dir")" "$(basename "$abs_dir")"
    debug "Archive created at: $filename"

    if [[ -f "$filename" ]]; then
        debug "Archive file details:" 
        ls -l "$filename" >&2
    else
        echo "Error: Archive $filename not created." >&2
        exit 1
    fi
    echo "$filename"
}

#-----------------------------------------------------
# Function: upload_to_s3
#-----------------------------------------------------
# Uploads the created archive file to S3 using curl.
# Constructs the S3 object key from the target directory’s last folder name
# and the archive’s basename.
upload_to_s3() {
    local filename="$1"
    local target_dir="$2"

    # Validate that the archive file exists.
    if [[ ! -f "$filename" ]]; then
        echo "Error: Archive file '$filename' does not exist." >&2
        exit 1
    fi

    local archive_basename
    archive_basename=$(basename "$filename")
    local target_folder
    target_folder=$(basename "$target_dir")
    # S3 object key format: <target_folder>/<archive_basename>
    local s3_file="${target_folder}/${archive_basename}"

    local resource="/${bucket}/${s3_file}"
    local dateValue
    dateValue=$(date -R)
    local stringToSign="PUT\n\n${contentType}\n${dateValue}\n${resource}"
    local signature
    signature=$(echo -en "$stringToSign" | openssl sha1 -hmac "$s3Secret" -binary | base64)

    debug "S3 Upload Info:"
    debug "Archive filename: $filename"
    debug "S3 Resource: $resource"
    debug "Date: $dateValue"
    debug "StringToSign:" 
    debug "$stringToSign"
    debug "Signature: $signature"

    echo "Uploading $filename to S3..." >&2
    # --fail makes curl exit non-zero on HTTP errors (e.g., 403) so the
    # error handler below is triggered; -T already implies a PUT request.
    curl --fail --progress-bar -L -T "$filename" \
         -H "Host: ${bucket}.${host}" \
         -H "Date: ${dateValue}" \
         -H "Content-Type: ${contentType}" \
         -H "Authorization: AWS ${s3Key}:${signature}" \
         "https://${bucket}.${host}/${s3_file}" || {
             echo "Error during file upload." >&2
             exit 1
         }
    echo "Uploaded successfully!" >&2
}

#-----------------------------------------------------
# Function: cleanup
#-----------------------------------------------------
# Removes the local archive file after successful upload.
cleanup() {
    local filename="$1"
    echo "Removing local archive: $filename" >&2
    rm -f "$filename"
    debug "Local archive removed."
}

#-----------------------------------------------------
# Main Execution
#-----------------------------------------------------
if [[ $# -ne 1 ]]; then
    echo "Usage: $0 <directory>" >&2
    exit 1
fi

input_dir="$1"
abs_input_dir=$(resolve_path "$input_dir")
if [[ ! -d "$abs_input_dir" ]]; then
    echo "Error: '$abs_input_dir' is not a valid directory." >&2
    exit 1
fi

debug "Starting backup for directory: $abs_input_dir"

archive_file=$(create_archive "$abs_input_dir")
upload_to_s3 "$archive_file" "$abs_input_dir"
cleanup "$archive_file"

echo "Backup and upload process completed!" >&2

How to Deploy and Use

  1. Installation:
    Save the script as /usr/local/bin/backup_to_s3, and then make it executable with the following command:
   sudo chmod +x /usr/local/bin/backup_to_s3
  2. Configuration:
    Replace the placeholders in the script with your actual S3 credentials and configuration values. Ideally, store secrets securely and avoid hardcoding them in the source file.
  3. Execution:
    Run the script by specifying a single directory. The script accepts both relative and absolute paths (converting relative paths to canonical absolute paths using realpath or a fallback). For example:
   sudo backup_to_s3 /path/to/your/directory


or

   cd /path/to/your
   sudo backup_to_s3 directory
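
Once the script works interactively, it can be scheduled. A hypothetical root crontab entry (the env-file and log paths below are illustrative):

   # Run the backup nightly at 02:00, sourcing credentials from a root-only
   # env file and appending all output to a log file.
   0 2 * * * . /etc/backup_to_s3.env && /usr/local/bin/backup_to_s3 /var/www >> /var/log/backup_to_s3.log 2>&1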

Conclusion

With strict error handling, canonical path resolution, and clean separation of output streams, the script is a dependable building block for production backup workflows where reliability matters.

If you have any questions or improvements, please leave a comment or get in touch. We hope this solution enhances your backup strategy and provides a foundation for further automation.
