The Abstraction Applies Everywhere

In our last post, we established that HuskHoard treats local storage as a presentation layer. It uses fanotify to intercept file requests and pull data from offline tapes or cold disks on demand. But the logic that powers a local tape robot applies just as well to the internet.

To HuskHoard's storage engine, AWS S3, Google Cloud Storage, and Backblaze B2 are not special. They are simply treated as "Append-Only" storage media with very high latency, conceptually identical to an LTO-9 tape drive. Because HuskHoard natively integrates with Rclone via child-process pipes, any of the 70+ cloud providers supported by Rclone can be mounted as a Husk volume.

Physical Media

LTO Tape Drive (/dev/nst0)

A sequential, append-only medium. Requires physical hardware, robots, and physical space. Retrieving a file means rewinding the tape, seeking to a byte offset, and reading.

Cloud Object Storage

Rclone Remote (rclone:s3:bucket)

A sequential, append-only medium (from Husk's perspective). Requires network bandwidth. Retrieving a file means making an HTTP Range request to a specific byte offset, and reading.
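One way to picture this equivalence is as a two-method interface that both media satisfy. The sketch below is illustrative only: `AppendOnlyVolume`, `InMemoryVolume`, and `archive_and_fetch` are hypothetical names, not HuskHoard's actual types, and the in-memory backend stands in for a tape drive or an Rclone remote.

```python
from typing import Protocol


class AppendOnlyVolume(Protocol):
    """Hypothetical interface: the two operations an append-only medium offers."""

    def append(self, data: bytes) -> int:
        """Write data at the end of the medium and return its starting offset."""
        ...

    def read_range(self, offset: int, count: int) -> bytes:
        """Return `count` bytes starting at `offset`."""
        ...


class InMemoryVolume:
    """Stand-in backend for illustration. A tape or cloud backend would expose
    the same two methods, only with much higher latency."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def append(self, data: bytes) -> int:
        offset = len(self._buf)
        self._buf += data
        return offset

    def read_range(self, offset: int, count: int) -> bytes:
        return bytes(self._buf[offset:offset + count])


def archive_and_fetch(volume: AppendOnlyVolume, payload: bytes) -> bytes:
    """Engine-side code sees only the interface, never the medium behind it."""
    offset = volume.append(payload)
    return volume.read_range(offset, len(payload))
```

Nothing in `archive_and_fetch` knows whether the bytes ended up on tape or in a bucket, which is the property the comparison above relies on.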

Because the engine abstracts the storage backend, you can architect your deployment in two distinct ways: using the cloud purely as a bottomless dump for a local server, or running HuskHoard itself inside the cloud to act as a centralized, global gateway.

Architecture A: The Local Cloud Gateway

In this architecture, HuskHoard runs on a machine inside your office or homelab. Your "Hot Tier" is a fast local NVMe drive or a local ZFS array. Your "Cold Tier" is a cloud bucket.

You work off the local NVMe drive at LAN speeds. When files age past your policy threshold (e.g., 30 days), the Husk Janitor daemon wakes up, compresses the file via `zstd`, and pipes it directly through Rclone into your cloud bucket. The local file is then "stubbed"—reduced to zero bytes on the NVMe drive, but retaining its original size, timestamps, and permissions in the directory listing.

    # husk_config.toml
    hot_tier = "/mnt/fast_nvme"
    db_path = "husk_catalog.db"

    # Push cold files straight to a Backblaze B2 bucket via Rclone
    primary_volumes = ["rclone:my_b2:husk-archive/"]
    max_age_days = 30
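The post does not say how a stub is represented on disk. A minimal sketch, assuming the stub is a sparse file: truncating to zero releases the data blocks, and extending back to the original length restores the apparent size without reallocating them. `stub_file` is a hypothetical helper, and a real implementation would only call it after the upload has been verified.

```python
import os


def stub_file(path: str) -> int:
    """Release a file's data while keeping its apparent size and timestamps.

    Assumption: stubs are sparse files. Returns the preserved size in bytes.
    """
    st = os.stat(path)
    # Drop the contents, then extend back to the original length. On
    # filesystems that support sparse files, the extended range is a hole
    # that occupies no data blocks.
    os.truncate(path, 0)
    os.truncate(path, st.st_size)
    # Truncation updates the modification time, so restore the originals.
    os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns))
    return st.st_size
```

Permissions and ownership are untouched by truncation, so the directory listing looks the same before and after. Reading the stub directly returns zeros, which is why something has to intercept the read before the application sees the data.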

If a user tries to open a stubbed file six months later, the fanotify interceptor pauses the application. HuskHoard calculates exactly where that file lives inside the cloud object, spawns `rclone cat --offset X --count Y`, and streams the decrypted, decompressed bytes directly into the application's memory buffer.

(3) Application Request [user space]: User double-clicks an old PDF. Application calls read().
(2) Husk StreamGate [local network]: Intercepts the read. Looks up the byte offset in the local SQLite catalog.
(1) Rclone ↔ Cloud [internet]: Pulls only the requested byte range via an HTTP Range request, which the provider answers with 206 Partial Content.
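The catalog lookup in the middle step can be sketched with Python's built-in `sqlite3` module. The table layout and the names `entries`, `record`, and `locate` are assumptions for illustration; the post does not describe the real catalog schema.

```python
import sqlite3
from typing import Optional, Tuple

# Hypothetical schema: one row per archived file.
SCHEMA = """
CREATE TABLE IF NOT EXISTS entries (
    path        TEXT PRIMARY KEY,
    object_name TEXT NOT NULL,
    byte_offset INTEGER NOT NULL,
    byte_count  INTEGER NOT NULL
)
"""


def open_catalog(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the catalog database and ensure the table exists."""
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    return conn


def record(conn: sqlite3.Connection, path: str, object_name: str,
           byte_offset: int, byte_count: int) -> None:
    """Remember where a file's bytes live inside a packed object."""
    conn.execute(
        "INSERT OR REPLACE INTO entries VALUES (?, ?, ?, ?)",
        (path, object_name, byte_offset, byte_count),
    )
    conn.commit()


def locate(conn: sqlite3.Connection, path: str) -> Optional[Tuple[str, int, int]]:
    """Return (object_name, byte_offset, byte_count), or None if not archived."""
    return conn.execute(
        "SELECT object_name, byte_offset, byte_count FROM entries WHERE path = ?",
        (path,),
    ).fetchone()
```

Because the lookup is a local primary-key query, it adds almost nothing to the read path; the network round trip to the provider dominates.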

The Advantage: You get local NVMe performance for active projects, but your storage capacity is effectively unlimited, bounded only by what you are willing to pay the provider. You aren't paying to keep terabytes of cold data on local spinning rust, and you don't have to manage physical tapes. It's a bottomless, self-managing local NAS.

Architecture B: The Cloud-Native Hub

What if your team is fully remote? You don't have a local office to put an NVMe drive in. In this architecture, HuskHoard runs on a cloud compute instance (like an AWS EC2 instance or a DigitalOcean Droplet).

Here, the EC2 instance has a small, fast EBS (Elastic Block Store) volume mounted as the Hot Tier. The Cold Tier is an S3 bucket (or a cheaper HDD-backed EBS volume such as st1 or sc1). Users connect to the EC2 instance via VPN, SMB, or HuskHoard's built-in HTTP Streaming Gateway.

As remote users upload or edit files on the EC2 instance, the EBS volume fills up. HuskHoard's emergency spillover policy kicks in:

    # Emergency spillover keeps the small EBS volume from crashing
    hot_tier_max_usage_percent = 80

When EBS hits 80% capacity, HuskHoard aggressively stubs the oldest files, silently migrating them to S3. To the remote users accessing the SMB share, it looks like a 500TB drive that never runs out of space, even though you are only paying for a 500GB EBS volume.
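The selection logic behind such a policy might look like the sketch below. It assumes the engine stubs files oldest-first and stops once usage is back at or under the threshold; the post only says the oldest files go first, so the stopping rule, `HotFile`, and `pick_spillover_victims` are all hypothetical.

```python
from typing import Iterable, List, NamedTuple


class HotFile(NamedTuple):
    path: str
    size: int         # bytes currently occupied on the hot tier
    last_used: float  # epoch seconds of the last access


def pick_spillover_victims(files: Iterable[HotFile], used_bytes: int,
                           total_bytes: int,
                           max_usage_percent: int = 80) -> List[HotFile]:
    """Choose the oldest files to stub until usage is at or below the limit."""
    limit = total_bytes * max_usage_percent / 100
    victims: List[HotFile] = []
    for f in sorted(files, key=lambda f: f.last_used):
        if used_bytes <= limit:
            break
        victims.append(f)
        used_bytes -= f.size  # stubbing this file frees its blocks
    return victims
```

In practice the `used_bytes` and `total_bytes` inputs could come from `shutil.disk_usage()` on the hot tier's mount point.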

The Advantage: Centralization. Migration from Hot (EBS) to Cold (S3) happens entirely across the cloud provider's internal network at gigabit speeds, provided the instance and the bucket live in the same region. Users only consume bandwidth when they specifically read or write a file.

Inside the Object: How Husk Writes to the Cloud

If you upload 10,000 small files to Amazon S3, you will be penalized. Cloud providers charge for PUT and GET API calls. If HuskHoard uploaded a 1-to-1 mirror of your filesystem, archiving a node_modules folder would cost a fortune in API calls alone.

To solve this, HuskHoard treats the cloud like a tape drive. It packs files into large, sequentially appended binary objects.

    # Inside your S3 bucket, you won't see your filesystem.
    # You will see massive, compressed binary chunks.
    s3://husk-archive/
    ├── husk_4096.bin (150 GB)
    ├── husk_1610612736.bin (150 GB)
    └── husk_3221225472.bin (42 GB)

When the Janitor daemon processes the queue, it streams files through BLAKE3 hashing and Zstd compression, piping the raw byte stream to `rclone rcat`. Dozens or hundreds of files are packed into a single `.bin` object. The local Husk Catalog remembers exactly which byte offset inside `husk_4096.bin` belongs to which file.
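The packing idea can be sketched in a few lines. To keep the example dependency-free it swaps in standard-library stand-ins: `hashlib.blake2b` in place of BLAKE3 and `zlib` in place of Zstd. The `pack` and `unpack_one` helpers and the catalog tuple layout are hypothetical, and hashing the uncompressed contents is an assumption.

```python
import hashlib
import zlib
from typing import BinaryIO, Dict, Iterable, Tuple

# (offset, compressed length, digest of the original bytes)
Entry = Tuple[int, int, str]


def pack(files: Iterable[Tuple[str, bytes]], out: BinaryIO,
         start_offset: int = 0) -> Dict[str, Entry]:
    """Append each file's compressed bytes to `out` and note where it landed."""
    catalog: Dict[str, Entry] = {}
    offset = start_offset
    for name, data in files:
        digest = hashlib.blake2b(data).hexdigest()  # stand-in for BLAKE3
        blob = zlib.compress(data)                  # stand-in for Zstd
        out.write(blob)
        catalog[name] = (offset, len(blob), digest)
        offset += len(blob)
    return catalog


def unpack_one(packed: bytes, entry: Entry) -> bytes:
    """Recover one file from its byte range and verify its checksum."""
    offset, count, digest = entry
    data = zlib.decompress(packed[offset:offset + count])
    if hashlib.blake2b(data).hexdigest() != digest:
        raise ValueError("checksum mismatch")
    return data
```

Each file is compressed independently, which is what makes it possible to decompress one byte range later without touching its neighbors. In a real pipeline `out` would be the stdin pipe of an `rclone rcat` child process rather than a local buffer.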

When a user opens a stubbed file, HuskHoard doesn't download the 150GB object. It uses its StreamGate logic to spawn a targeted `rclone cat` command:

[Gateway] Spawning Rclone -> cat husk_4096.bin --offset 1048576 --count 40960

Rclone translates this into an HTTP Range request. S3 serves only the exact 40KB of compressed data required. HuskHoard decompresses it in RAM and hands it to the application. Zero-Disk Delivery. The local hot tier isn't even touched during a read.
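Building that command is straightforward. The sketch below assembles an argument list for `rclone cat` using its `--offset` and `--count` flags. Treating the `rclone:` prefix on a volume string as a HuskHoard-side marker to strip is an assumption, and `rclone_cat_argv` is a hypothetical helper.

```python
from typing import List


def rclone_cat_argv(volume: str, object_name: str,
                    offset: int, count: int) -> List[str]:
    """Build an argument list for a ranged `rclone cat` read."""
    if offset < 0 or count <= 0:
        raise ValueError("offset must be >= 0 and count must be > 0")
    # Assumption: "rclone:" marks the volume type and is not part of the
    # remote path that Rclone itself expects.
    prefix = "rclone:"
    remote = volume[len(prefix):] if volume.startswith(prefix) else volume
    if not remote.endswith(("/", ":")):
        remote += "/"
    return [
        "rclone", "cat", remote + object_name,
        "--offset", str(offset),
        "--count", str(count),
    ]
```

Passing the list to `subprocess.run(argv, capture_output=True, check=True)` would return the compressed bytes on stdout, ready to be decompressed in memory. Using an argument list rather than a shell string avoids quoting problems with unusual object names.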

The Economics of the Stub

Let's look at why you would choose either of these architectures from a purely financial perspective.

01 — API Call Reduction
Packing saves money
By packing thousands of files into large, sequential objects, HuskHoard turns millions of potential S3 PUT requests into a steady, single stream. You pay for storage space, not transaction penalties.
02 — Egress Mitigation
Range requests cut bandwidth
Because HuskHoard uses byte-range offsets, reading a 5MB PDF stored inside a 150GB archive chunk incurs egress fees on at most about 5MB (less if the file compressed well), not on the full 150GB. You only pay to download exactly what your applications read.
03 — Local Hardware
Small NVMe, Infinite NAS
In the Local Gateway architecture, you can build a 100TB NAS using a single 2TB NVMe drive. The hardware investment is minimal, and the storage scales automatically on the provider's side as you write.
04 — Multi-Cloud
Vendor Agnostic
Because HuskHoard wraps Rclone, you are never locked into AWS. If Cloudflare R2 or Backblaze drops their prices, you update your `husk_config.toml` and HuskHoard instantly starts writing new chunks to the new provider.
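The request-count argument in the first point can be made concrete with some rough arithmetic. This sketch assumes the packed objects are uploaded with S3-style multipart uploads (one request to start, one per part, one to complete); the function names and the part size are illustrative parameters, not HuskHoard settings, and no prices are assumed.

```python
import math


def upload_requests_mirror(n_files: int) -> int:
    """A 1-to-1 mirror needs at least one PUT per file."""
    return n_files


def upload_requests_packed(total_bytes: int, object_bytes: int,
                           part_bytes: int) -> int:
    """Requests needed when the same data is packed into large objects.

    Assumption: each object costs one start request, one request per part,
    and one completion request.
    """
    full_objects, remainder = divmod(total_bytes, object_bytes)
    requests = full_objects * (math.ceil(object_bytes / part_bytes) + 2)
    if remainder:
        requests += math.ceil(remainder / part_bytes) + 2
    return requests
```

Multiplying either count by your provider's per-request price gives the transaction cost. The packed count depends only on total volume and part size, not on how many files the data is split into.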

Choosing Your Flight Path

Both architectures leverage the same core truth: the filesystem is an abstraction.

If your team is physically located in one building and works with heavy assets (video editing, 3D rendering), Architecture A (Local Gateway) is superior. You need local NVMe served over a fast LAN (10Gbps, for example) for active projects, and pushing cold data to the cloud is simply a hands-free offsite backup that reclaims local space.

If your team is distributed across time zones, or your workload is highly transactional (documents, code, web assets), Architecture B (Cloud-Native Hub) makes more sense. You offload the hardware maintenance entirely to AWS or DigitalOcean, and let HuskHoard aggressively manage the expensive EBS block storage so you don't have to.

In either scenario, the end-user experience is exactly the same. They look at a folder. They see their files. They double-click. HuskHoard figures out the rest.