
2026 macOS Sequoia Unattended rsync/SFTP Hangs: launchd, cron, openrsync, and ssh-agent Decision Matrix

Engineers routinely prove that rsync over SSH to a remote Mac works from Terminal, then ship the same command inside launchd or cron and watch it wedge forever. macOS 15 Sequoia compounds the problem because /usr/bin/rsync commonly executes openrsync, which community threads associate with stalls after hundreds or a little over a thousand files in high-churn trees. The stall does not produce a polite error code: CPU stays low, logs stay quiet, and operators restart by hand. In parallel, unattended jobs lack a TTY, so any passphrase-protected key without a preloaded ssh-agent session blocks exactly where an interactive user would have been prompted. This article separates those failure classes, gives a matrix of first actions, and links openrsync selection, checksum gates, concurrent SFTP, and atomic releases so you can treat unattended sync as a contract instead of folklore.

Operations teams often respond by buying bandwidth or blaming the remote host's disk. Neither helps when the client never finishes listing files, or when SSH never completes user authentication because nothing in the job can answer a prompt. A cheaper first step is to print the environment the daemon actually sees: a PATH without Homebrew prefixes, a missing SSH_AUTH_SOCK, and a working directory that differs from the one on the engineer's laptop. Once SSH itself is deterministic (layer L0 in the sections below), align the rsync implementations on both ends. When the remote side still runs a different rsync feature set, --rsync-path pins the remote binary and removes silent protocol skew. If stalls persist, add --protocol=28 as a documented compatibility shim while you schedule installation of GNU rsync from Homebrew or MacPorts on both the client and the server.

Finally, connect unattended reliability to release hygiene. A script that restarts openrsync nightly but still writes into a live download directory simply publishes corruption faster. Stage uploads, verify checksums, then flip a symlink. The closing section contrasts self-built remote Mac maintenance with SFTPMAC hosted bare-metal Mac ingress where directory models and observability defaults are productized, which matters when your on-call rotation is already saturated.


Pain points: unattended is not copy-paste from Terminal

Interactive success, daemon failure. Terminal sessions inherit login shell PATH additions, agent sockets, and sometimes Touch ID-backed key access. launchd plist jobs and cron entries inherit almost none of that unless you encode it explicitly. The failure signature is a hang after TCP connects or during incremental listing, not an immediate "permission denied" from the remote Mac.

openrsync stalls on large trees. Reports cluster around transfers that progress through many files then stop between files without error. That pattern differs from WAN congestion, which tends to throttle gradually, and differs from disk full, which eventually surfaces I/O errors. Mitigations include protocol flags, rsync binary alignment, and splitting very large trees into multiple jobs with independent logs.

SFTP batch waits forever. sftp -b in non-interactive mode cannot answer prompts. A permissions mismatch or a relative path typo can wedge the session silently unless verbose logging and stderr redirection are configured in the plist.
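As a minimal sketch of a batch that cannot wait on a prompt, the helper below writes an sftp batch file that always ends with an explicit bye. The function name write_sftp_batch and the paths in the usage comment are hypothetical, and the sketch assumes file names without spaces or quoting needs.

```shell
#!/bin/bash
# Hypothetical helper: write an sftp batch file that changes into the
# remote directory, uploads each file, and ends with an explicit bye.
write_sftp_batch() {
  local batch="$1" remote_dir="$2"
  shift 2
  {
    printf 'cd %s\n' "$remote_dir"
    local f
    for f in "$@"; do
      printf 'put %s\n' "$f"
    done
    printf 'bye\n'
  } > "$batch"
}

# Intended invocation (not executed here; host and log path are placeholders):
#   sftp -v -o BatchMode=yes -b "$batch" "$host" >>"$log" 2>&1
```

With -b, sftp aborts the batch when a command fails, so redirecting stderr to a log is what turns a silent wedge into a readable failure.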

Decoupled from atomic publishing. Fixing rsync without fixing release mechanics means partial files become visible. Pair this article with staged directories and checksum gates so success means verified bytes, not merely exit code zero.
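One way to make "verified bytes" concrete is to gate a symlink flip on a digest comparison. The sketch below is illustrative only: sha256_of, flip_symlink, and publish_if_verified are hypothetical names, the directory layout is an assumption, and the mv flags are assumed to behave as documented for GNU mv (-T) and BSD/macOS mv (-h).

```shell
#!/bin/bash
# Print the SHA-256 of a file using whichever tool is installed.
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{print $1}'
  else
    shasum -a 256 "$1" | awk '{print $1}'
  fi
}

# Repoint LINK at TARGET by renaming a temporary symlink over it.
flip_symlink() {
  local target="$1" link="$2"
  local tmp="${link}.tmp.$$"
  ln -s "$target" "$tmp"
  # GNU mv needs -T and BSD/macOS mv needs -h so the rename replaces the
  # old link instead of moving the new link into the directory it points at.
  mv -T "$tmp" "$link" 2>/dev/null || mv -h "$tmp" "$link"
}

# Publish only when the staged file matches the expected digest.
publish_if_verified() {
  local staged_dir="$1" file="$2" expected="$3" link="$4"
  local actual
  actual="$(sha256_of "${staged_dir}/${file}")"
  if [ "$actual" != "$expected" ]; then
    echo "checksum mismatch for ${file}; not publishing" >&2
    return 1
  fi
  flip_symlink "$staged_dir" "$link"
}
```

A mismatch leaves the live pointer untouched, so a partial or corrupted upload stays invisible to consumers.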

Layered symptoms: environment versus protocol stack

L0 proves SSH with BatchMode=yes under the same user context as the job. Failures here precede rsync tuning. L1 distinguishes list-building stalls from mid-file stalls. L2 addresses openrsync interoperability with the remote rsync implementation, including explicit --rsync-path and optional --protocol=28.
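The L0 check can be sketched as a small probe that runs before any rsync tuning. The function name probe_ssh and the SSH_BIN override are assumptions added so the probe can be exercised without a real host; the ssh options themselves are standard.

```shell
#!/bin/bash
# L0 probe: fail fast if non-interactive SSH cannot authenticate.
# SSH_BIN is a hypothetical override that defaults to the real ssh client.
probe_ssh() {
  local host="$1"
  # BatchMode=yes disables prompts; ConnectTimeout bounds the TCP connect.
  "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=10 "$host" true
}
```

Call it from the same wrapper the job uses, with the job's own target, and stop the run on a non-zero status so an authentication failure is never mistaken for an rsync stall.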

For launchd, validate UserName (honored only for LaunchDaemons), WorkingDirectory, StandardOutPath, and StandardErrorPath. The choice between a LaunchDaemon and a LaunchAgent also changes keychain access patterns. Prefer dedicated CI keys without passphrases for unattended paths, and restrict them to upload-only accounts on the remote Mac.
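A minimal sketch of a plist that states those keys explicitly, generated from shell so it can live beside the wrapper script. The function name write_sync_plist, the label, the paths, and the 02:00 schedule are all hypothetical; the values are not XML-escaped, so the sketch assumes they contain no special characters.

```shell
#!/bin/bash
# Hypothetical generator for a launchd job definition with explicit
# PATH, working directory, and log paths.
write_sync_plist() {
  local out="$1" label="$2" script="$3" log_dir="$4"
  cat > "$out" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key><string>${label}</string>
  <key>ProgramArguments</key>
  <array><string>/bin/bash</string><string>${script}</string></array>
  <key>EnvironmentVariables</key>
  <dict>
    <key>PATH</key>
    <string>/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin</string>
  </dict>
  <key>WorkingDirectory</key><string>$(dirname "$script")</string>
  <key>StandardOutPath</key><string>${log_dir}/${label}.out.log</string>
  <key>StandardErrorPath</key><string>${log_dir}/${label}.err.log</string>
  <key>StartCalendarInterval</key>
  <dict><key>Hour</key><integer>2</integer><key>Minute</key><integer>0</integer></dict>
</dict>
</plist>
EOF
}
```

For a LaunchDaemon you would add a UserName key naming the service account; on a Mac, plutil -lint can check the generated file before it is loaded.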

Document how GitHub Actions self-hosted runners differ yet again: they add another PATH layer and another home directory. The same wrapper script should print its identity at the top so triage stays factual instead of anecdotal.
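A sketch of such an identity banner follows. The function name print_job_identity and the list of secret-looking name fragments are assumptions; the match is case-sensitive and does not handle multi-line values, so extend it to suit your environment.

```shell
#!/bin/bash
# Hypothetical banner: print who and where the job runs, plus a sorted
# environment with secret-looking values masked.
print_job_identity() {
  echo "== identity =="
  id
  echo "pwd=$(pwd)"
  echo "SSH_AUTH_SOCK=${SSH_AUTH_SOCK:-<unset>}"
  echo "== env (redacted) =="
  # Mask values of variables whose names contain TOKEN, SECRET, PASSWORD, or KEY.
  env | sort | sed -E 's/^([^=]*(TOKEN|SECRET|PASSWORD|KEY)[^=]*)=.*/\1=<redacted>/'
}

print_job_identity
```

Running the same banner from Terminal, launchd, cron, and a runner gives four directly comparable outputs, which is usually enough to spot the missing PATH entry or agent socket.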

When multiple teams share one Mac ingress, combine this runbook with concurrency caps so retries from one team do not stampede sshd while another team's job is mid-checksum.

Security reviewers sometimes ask whether exporting a broader PATH in plist XML weakens integrity guarantees. The pragmatic answer is that daemons already inherit an implicit policy whether you document it or not. Making PATH explicit is therefore a security improvement because it removes accidental reliance on whatever a human engineer typed last Friday. Pair explicit PATH with file permissions on the wrapper script itself so only the service account can execute it.

Capacity planning also benefits from honest accounting of retry storms. If openrsync occasionally needs three attempts to finish a snapshot, your concurrency model must reserve headroom for those attempts instead of assuming one-shot success. Otherwise a benign statistical tail becomes an outage when three pipelines align on the same bad minute.

Quantified baselines: measure stall seconds, not vibes

Before changing flags, capture a twenty-four hour baseline: start time, end time, file counts, bytes moved, exit codes, and seconds without byte progress. If progress idles longer than roughly three hundred seconds, wrap the command in a timeout and alert. Note that macOS does not bundle the timeout utility; it comes from GNU coreutils, so either install it or use a shell watchdog. Correlate stalls with RTT and loss on the WAN path so you do not misclassify L2 stalls as L0 network flaps.
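For hosts without a timeout binary, a wall-clock watchdog can be sketched in plain bash. This is a simplification of the idea above: it bounds total run time rather than measuring idle seconds without byte progress, and the function name run_with_deadline is hypothetical.

```shell
#!/bin/bash
# Hypothetical watchdog: run a command and send SIGTERM if it exceeds
# a wall-clock deadline given in seconds.
run_with_deadline() {
  local seconds="$1"
  shift
  "$@" &
  local pid=$!
  # The watchdog sleeps, then terminates the command if it is still alive.
  ( sleep "$seconds"; kill -TERM "$pid" 2>/dev/null ) >/dev/null 2>&1 &
  local watchdog=$!
  local rc=0
  wait "$pid" || rc=$?
  # Command finished (or was killed): stop the watchdog and reap it.
  kill "$watchdog" 2>/dev/null || true
  wait "$watchdog" 2>/dev/null || true
  return "$rc"
}
```

A command ended by the watchdog reports status 143 (128 plus SIGTERM), which gives alerting a stable value to distinguish a deadline kill from an ordinary rsync error.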

For CI, define success as three explicit phases: transport complete, checksum verified, atomic pointer switched. Give each phase its own exit contract so flaky openrsync retries show up as measurable retry rates instead of binary pass-fail noise.
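The per-phase exit contract could look like the sketch below. The function name run_phase and the specific codes (10, 20, 30) are illustrative assumptions, not values from any existing pipeline.

```shell
#!/bin/bash
# Hypothetical exit contract: run one phase and map any failure to a
# phase-specific exit code so dashboards can count failures per phase.
run_phase() {
  local name="$1" code="$2"
  shift 2
  if "$@"; then
    echo "phase=${name} status=ok"
  else
    echo "phase=${name} status=failed" >&2
    return "$code"
  fi
}

# Illustrative wiring (commands are placeholders, not executed here):
#   run_phase transport 10 ./upload.sh   || exit $?
#   run_phase checksum  20 ./verify.sh   || exit $?
#   run_phase switch    30 ./flip.sh     || exit $?
```

Because each phase owns one code, a run that ends with 10 is a transport problem by construction, and retry rates can be charted per phase.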

When executives ask whether the Mac or the network is broken, answer with distributions, not single anecdotes. Histograms of stall durations after protocol alignment show whether you are chasing a rare tail or a systemic regression tied to a macOS minor upgrade.

Revisit baselines after every Sequoia point release because Apple adjusts bundled tooling behavior across updates. A job that survived several 15.x point releases can still shift when openrsync defaults change.

When you present numbers to finance, translate stall minutes into developer wait minutes. A ten minute nightly stall across twenty engineers is not a networking footnote; it is hundreds of dollars of attention tax every week. Framing the problem in money often unlocks the modest infrastructure budget needed to pin GNU rsync consistently.

Also chart seasonality. End-of-quarter release windows concentrate uploads and amplify latent concurrency bugs. A matrix that works in February may still fail in December unless you rehearse peak shapes with synthetic load.

Finally, keep a one-page escalation ladder pinned in the incident channel topic: first owner checks plist logs, second owner checks SSH BatchMode probes, third owner checks rsync versions, fourth owner engages the network team about NAT. Without that ladder, well-meaning responders parallelize conflicting experiments and extend downtime.

Decision matrix: first actions

| Symptom | Likely root cause | First action | Risk or rollback |
|---|---|---|---|
| Only daemons hang | PATH, missing agent socket, no TTY | Export PATH explicitly; preload keys; fail fast with BatchMode | Over-broad env exports can collide across plists; isolate per job |
| Stalls mid-tree | openrsync pairing skew | Add --protocol=28; set --rsync-path to GNU rsync | Wrong remote path fails fast; test in staging |
| SFTP batch silent | Prompt or permission | Run with -v; split batches; explicit bye | Verbose logs may leak paths; scrub archives |
| Flaky after OS update | Toolchain drift | Re-pin rsync versions; rerun baseline suite | Short-term dual binaries increase support surface |

How-to: seven ordered steps

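#!/bin/bash
# Abort on errors, unset variables, and failed pipeline stages.
set -euo pipefail
# launchd and cron do not load the login shell PATH, so state it explicitly.
export PATH="/usr/local/bin:/opt/homebrew/bin:/usr/bin:/bin:/usr/sbin:/sbin"
# BatchMode=yes makes ssh fail instead of prompting; keepalives detect dead links.
export RSYNC_RSH="ssh -o BatchMode=yes -o ServerAliveInterval=30"
# --rsync-path pins the remote binary; --protocol=28 is the compatibility shim.
/usr/bin/rsync -av --protocol=28 --rsync-path=/opt/homebrew/bin/rsync \
  ./artifacts/ "[email protected]:/data/inbox/staging/"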

Step 1: Print id, pwd, and a sorted redacted env inside the plist context.

Step 2: Validate SSH with ssh -o BatchMode=yes -o ConnectTimeout=10 before rsync.

Step 3: Align rsync implementations and set --rsync-path; keep version numbers in internal docs.

Step 4: Wire keys: dedicated CI key or managed ssh-agent; never rely on interactive passphrase entry.

Step 5: Add timeout and bounded exponential backoff; log retry ordinal for openrsync recovery.

Step 6: Dry-run against staging; only then attach atomic switch steps from the release guide.

Step 7: Archive per-run summaries for monthly review of retry rates and tail latency.
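Step 5 can be sketched as a retry wrapper with a capped, doubling delay that logs the retry ordinal on every attempt. The function name retry_with_backoff and its argument order are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical retry wrapper with bounded exponential backoff.
# Usage: retry_with_backoff MAX_ATTEMPTS BASE_DELAY MAX_DELAY command [args...]
retry_with_backoff() {
  local max_attempts="$1" base_delay="$2" max_delay="$3"
  shift 3
  local attempt=1 delay="$base_delay" rc=0
  while :; do
    rc=0
    "$@" || rc=$?
    if [ "$rc" -eq 0 ]; then
      echo "attempt=${attempt} status=ok" >&2
      return 0
    fi
    echo "attempt=${attempt} status=failed rc=${rc}" >&2
    # Give up once the attempt budget is spent, returning the last status.
    if [ "$attempt" -ge "$max_attempts" ]; then
      return "$rc"
    fi
    sleep "$delay"
    # Double the delay, but never exceed the cap.
    delay=$(( delay * 2 ))
    if [ "$delay" -gt "$max_delay" ]; then
      delay="$max_delay"
    fi
    attempt=$(( attempt + 1 ))
  done
}
```

The attempt lines go to stderr so they land in StandardErrorPath, where the per-run summaries from Step 7 can count them.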

After mechanical steps, rehearse one failure injection quarterly: kill rsync mid-transfer and verify gates mark the build bad without publishing partial artifacts. Exercises should include a runner reboot mid-job because that stresses both agent wiring and partial file cleanup.

Keep wrapper scripts in version control next to pipeline definitions so diffs ride normal code review. The goal is to stop secret per-host tweaks that never return to documentation.

If legal requires passphrase-protected keys everywhere, invest in an approved non-interactive unlock path rather than pretending cron can prompt. Split human keys from machine keys at the account boundary on the remote Mac.
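Whatever unlock path is approved, the job can verify the agent before it starts a transfer. The sketch below relies on the documented ssh-add -l exit statuses (0 when identities are listed, 1 when the agent holds none, 2 when the agent cannot be contacted); the function name agent_state and the SSH_ADD_BIN override are hypothetical.

```shell
#!/bin/bash
# Hypothetical pre-flight check: classify the ssh-agent state without
# prompting. SSH_ADD_BIN defaults to the real ssh-add.
agent_state() {
  local rc=0
  "${SSH_ADD_BIN:-ssh-add}" -l >/dev/null 2>&1 || rc=$?
  case "$rc" in
    0) echo "keys-loaded" ;;
    1) echo "agent-empty" ;;
    2) echo "agent-unreachable" ;;
    *) echo "unknown rc=${rc}" ;;
  esac
  # Succeed only when at least one identity is available.
  [ "$rc" -eq 0 ]
}
```

Logging the state string at the top of each run separates "agent never unlocked" from "agent socket missing in this context", which call for different fixes.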

When logs grow large, rotate them with the same discipline as application logs. Silent disks full of logs have caused more than one false attribution to openrsync.

Document ownership for each flag you add. A --protocol=28 workaround that lives only in one engineer's forked gist will disappear when that engineer rotates teams. Centralize in git with a short rationale comment so future readers know whether the flag is compatibility debt or a permanent requirement.

If you operate both Intel and Apple Silicon Macs in one pool, verify rsync paths separately per architecture. Homebrew prefixes differ; hardcoding only one path creates subtle arch-specific failures that confuse dashboards because half the fleet looks healthy.
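A small lookup avoids hardcoding one prefix. The sketch assumes default Homebrew install locations (/opt/homebrew on Apple Silicon, /usr/local on Intel); the function name rsync_path_for_arch is hypothetical, and a custom prefix would need its own branch.

```shell
#!/bin/bash
# Hypothetical lookup: map a machine architecture to the default
# Homebrew rsync path. With no argument, the local architecture is used.
rsync_path_for_arch() {
  local arch="${1:-$(uname -m)}"
  case "$arch" in
    arm64)  echo "/opt/homebrew/bin/rsync" ;;
    x86_64) echo "/usr/local/bin/rsync" ;;
    *)
      echo "unknown architecture: ${arch}" >&2
      return 1
      ;;
  esac
}
```

For --rsync-path, pass the architecture of the remote Mac rather than the local one, since that flag names the binary on the far end.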

Testing strategy should include downgrade drills: temporarily remove Homebrew rsync from staging and confirm alarms fire when the job falls back to openrsync without the mitigation bundle. Drills prove monitoring is wired, not merely configured.

Lastly, align naming: call the plist label, the log file prefix, and the Grafana series the same string so midnight triage does not require mental translation between three vocabularies.

Related reading

Parallel sessions and keepalive tuning live in concurrent SFTP. Handshake optimization for dense CI loops remains in ControlMaster matrix. Directory isolation belongs with chroot SFTP and multi-team collaboration. The homepage summarizes SFTPMAC plans for always-on Mac ingress.

Self-managed remote Mac operators carry hardware lifecycle, macOS minor upgrades, plist drift, and multi-tenant permission design simultaneously. When that total cost exceeds the value of bespoke control, leasing a dedicated ingress with documented defaults frequently shortens incident timelines.

FAQ and hosted remote Mac contrast

Should I ban system rsync entirely?

No blanket ban. Small local copies can stay on system tools. Remote, high-file-count, unattended workloads justify pinning GNU rsync and documenting versions.

Is cron still acceptable?

Apple recommends launchd over cron for scheduled jobs. If cron remains, treat its minimal environment as the default and encode every assumption explicitly.

Summary: Unattended success on Sequoia requires explicit PATH and agent context, deterministic non-interactive SSH, aligned rsync implementations, timeouts, and staged releases.

Limitation: This runbook does not replace network design for dual-stack or corporate proxies; those remain separate L0 programs.

Contrast: SFTPMAC hosted remote Mac offerings package stable ingress, permission boundaries, and operational defaults so teams spend fewer nights correlating plist logs with openrsync restarts. You keep pipeline ownership while outsourcing node hygiene and baseline drift, which is often the cheapest path when on-call time is scarce.