Resilience & Self-Healing
A decentralized network is only as durable as its ability to recover from partial failures. Servers go offline. New servers join and need to bootstrap their data. Network partitions cause divergence. Hashiverse addresses this through self-healing protocols that operate without any central coordinator — driven by clients and servers detecting inconsistency and repairing it.
Sharding and load distribution
Social media follows power laws. A small number of users and hashtags attract the vast majority of followers and generate the majority of posts. In a naive DHT, this means a small number of nodes would bear the majority of the load — becoming performance bottlenecks and high-value DDoS targets by virtue of the content they host. Hashiverse addresses this at the data model level through time-based and frequency-based sharding.
Epoch-based bucket keys
A user's or hashtag's content is not stored under a single fixed DHT key. The key is a function of the user's public ID (or hashtag text) and a time epoch. This means the DHT location of a user's posts shifts over time, spreading responsibility for hosting their content across different nodes in different epochs.
For high-volume users and hashtags, the key space is further subdivided: the busier an
identity is in terms of post volume, the more granular the bucket subdivision becomes.
A prolific poster's content is split across more, smaller buckets — each hosted by a
different region of the DHT ring. This is captured in the BucketLocation
type, which encodes the identity, epoch, and bucket granularity together.
The result is that no single node becomes permanently responsible for a power user's traffic, and no node becomes a stable target for a DDoS attack aimed at silencing a specific account or hashtag.
Source: buckets.rs —
BucketLocation
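As a rough illustration of the derivation described above, the sketch below combines identity, epoch, and shard into a DHT key. The field names and the use of DefaultHasher are assumptions for the example; the real BucketLocation in buckets.rs and its hash function will differ.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical mirror of the BucketLocation type: identity, epoch, and
/// bucket granularity together determine where content lives in the DHT.
#[derive(Clone, Hash, PartialEq, Eq, Debug)]
struct BucketLocation {
    identity: [u8; 32], // user public ID or hashtag hash
    epoch: u64,         // coarse time window; shifts the key over time
    shard: u16,         // subdivision index; grows with post volume
}

impl BucketLocation {
    /// Derive the DHT key. DefaultHasher stands in for the real
    /// cryptographic hash; only the shape of the derivation matters here.
    fn dht_key(&self) -> u64 {
        let mut h = DefaultHasher::new();
        self.hash(&mut h);
        h.finish()
    }
}
```

Because the epoch and shard feed the hash, the same identity lands on different regions of the ring in different epochs and at different volumes.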
Demand-driven caching
Beyond healing (which repairs missing data after the fact), Hashiverse implements demand-driven caching to reduce load on the DHT-nearest servers for popular buckets. Intermediate servers cache post bundles and feedback bundles on behalf of the nodes closest to the content, serving clients directly and terminating the Kademlia walk early. The cache expands outward under sustained demand and contracts automatically when interest drops — entirely self-regulating, no coordination required.
Hit-threshold token mechanism
Each server maintains a bundle cache and a feedback cache
(one Moka weighted cache per type). On every incoming
GetPostBundleV1 or GetPostBundleFeedbackV1 request the server
increments a hit counter for that bucket location. Once the counter reaches the threshold
(currently 10 hits within the idle window), the server issues a
CacheRequestTokenV1 — a short-lived token signed by the server and scoped to
that bucket location.
The client collects any tokens issued during its Kademlia walk. After successfully
fetching and verifying the bundle from a responsible server, it asynchronously uploads
the bundle to each token-issuing server via CachePostBundleV1 /
CachePostBundleFeedbackV1. The server validates the token (signature, expiry,
location match), parses the bundle, and stores it in the in-memory cache. Subsequent
clients walking through that server receive the cached bundle directly, without continuing
the walk inward.
An in-flight deduplication cache (keyed by location ID, TTL matching the token lifetime) prevents the server from issuing duplicate tokens to multiple concurrent walkers for the same bucket.
Source: post_bundle_caching.rs,
post_bundle_feedback_caching.rs
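The hit-threshold and in-flight deduplication logic above might be sketched as follows. The counter structure and the string token are illustrative stand-ins; the real server signs a CacheRequestTokenV1 and keys its state by the actual location type.

```rust
use std::collections::{HashMap, HashSet};

const CACHE_HIT_THRESHOLD: u32 = 10; // from the text: 10 hits within the idle window

type LocationId = u64; // placeholder for the real bucket location key

#[derive(Default)]
struct HitTracker {
    hits: HashMap<LocationId, u32>,
    in_flight: HashSet<LocationId>, // dedup: one outstanding token per location
}

impl HitTracker {
    /// Record a GetPostBundleV1-style request; return a token once the
    /// threshold is crossed and no token is already outstanding.
    fn record_hit(&mut self, loc: LocationId) -> Option<String> {
        let n = self.hits.entry(loc).or_insert(0);
        *n += 1;
        // `insert` returns false if a token for this location is in flight,
        // so concurrent walkers past the threshold get at most one token.
        if *n >= CACHE_HIT_THRESHOLD && self.in_flight.insert(loc) {
            Some(format!("token-for-{loc}")) // stands in for a signed token
        } else {
            None
        }
    }
}
```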
Live vs. sealed buckets
The cache distinguishes between live and sealed data:
- Live buckets (current epoch, still accepting posts) — each cached bundle carries an absolute expiry of server time + 5 minutes. Bundles are served only while fresh; stale entries are skipped on read without evicting them, preserving the location's hit-count history so the threshold can be reached again quickly when new content arrives.
- Sealed buckets (epoch closed, no new posts) — cached bundles carry no individual expiry. They are evicted only by the location-level time-to-idle (60 seconds of no requests), or by the Moka byte-capacity limit. While a sealed bucket remains hot, its cached copy persists indefinitely.
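The live/sealed distinction above amounts to an optional absolute expiry per cached bundle. A minimal sketch, with the entry layout assumed for illustration:

```rust
use std::time::{Duration, Instant};

/// Live bundles carry an absolute expiry (server time + 5 minutes);
/// sealed bundles carry none and rely on time-to-idle and capacity limits.
struct CachedBundle {
    bytes: Vec<u8>,
    expires_at: Option<Instant>, // None => sealed bucket
}

impl CachedBundle {
    fn live(bytes: Vec<u8>, now: Instant) -> Self {
        Self { bytes, expires_at: Some(now + Duration::from_secs(5 * 60)) }
    }
    fn sealed(bytes: Vec<u8>) -> Self {
        Self { bytes, expires_at: None }
    }
    /// Stale live entries are skipped on read, not evicted, so the
    /// location's hit-count history survives until new content arrives.
    fn serve(&self, now: Instant) -> Option<&[u8]> {
        match self.expires_at {
            Some(t) if now >= t => None, // stale: skip, keep the entry
            _ => Some(&self.bytes),      // fresh live, or sealed
        }
    }
}
```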
Client-side discovery radius
To make use of the expanding cache, each client maintains a per-bucket
discovery radius: the XOR distance to the furthest server that
successfully returned a bundle on the previous walk. On the next walk the
PeerIterator skips servers closer than this radius and starts querying
from the frontier outward — going directly to wherever the cache was last found,
bypassing the responsible (innermost) servers entirely.
The radius is updated after each walk to the XOR distance of the furthest positive responder. If outer cache layers have since expired and no server at the frontier responds, the walk falls back to the next closest peer and the radius naturally contracts to reflect the current cache depth. The radius is held in a time-to-idle cache, so buckets that haven't been visited in a while start fresh from the responsible servers on the next access.
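The skip-and-update behavior of the radius can be sketched like this (the struct and method names are assumptions; the real state lives in the client's time-to-idle cache alongside the PeerIterator):

```rust
/// Per-bucket discovery radius: the XOR distance to the furthest server
/// that returned a bundle on the previous walk. Larger = further from the
/// responsible (innermost) servers.
#[derive(Default)]
struct DiscoveryRadius {
    radius: Option<u64>, // None until a first successful walk
}

impl DiscoveryRadius {
    /// Peers strictly closer than the radius are skipped, so the next walk
    /// starts querying at the frontier where the cache was last found.
    fn skip(&self, peer_distance: u64) -> bool {
        matches!(self.radius, Some(r) if peer_distance < r)
    }
    /// After a walk, record the furthest positive responder. If outer cache
    /// layers have expired, the new value is smaller: the radius contracts.
    fn update(&mut self, furthest_positive: Option<u64>) {
        self.radius = furthest_positive;
    }
}
```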
Cache eviction and capacity
Both caches are weighted by actual bundle byte size, backed by Moka's W-TinyLFU admission policy. Placeholder entries (locations that have been queried but not yet populated with a bundle) use a small fixed weight so they participate in the frequency sketch without consuming real capacity. When the byte high-watermark is reached, the least-frequently-used locations are evicted first. Each location also has a 60-second time-to-idle: a location that stops receiving requests is evicted quietly, resetting its hit counter and contracting the effective cache radius.
The bundle cache stores up to a configurable number of originators per location (currently 5). If a new originator arrives when the per-location cap is full, the bundle expiring soonest is replaced. Sealed bundles (no expiry) are treated as expiring last and are only displaced by other sealed bundles.
Source: post_bundle_caching_shared.rs,
config.rs — SERVER_POST_BUNDLE_CACHE_MAX_BYTES et al.
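The per-location replacement rule (soonest-expiring bundle is displaced; sealed bundles sort as expiring last and yield only to other sealed bundles) might look like the following sketch. The Entry shape is assumed for illustration:

```rust
use std::time::Instant;

const MAX_ORIGINATORS_PER_LOCATION: usize = 5; // "currently 5" in the text

/// One cached bundle per originator; None expiry means a sealed bundle.
struct Entry {
    originator: u64,
    expires_at: Option<Instant>,
}

/// Insert into a full per-location slot set: replace the entry expiring
/// soonest, treating sealed (no expiry) as expiring last.
fn insert_with_cap(entries: &mut Vec<Entry>, new: Entry) {
    if entries.len() < MAX_ORIGINATORS_PER_LOCATION {
        entries.push(new);
        return;
    }
    // Sort key: (is_sealed, expiry) ranks sealed entries after any timed one.
    let idx = entries
        .iter()
        .enumerate()
        .min_by_key(|(_, e)| (e.expires_at.is_none(), e.expires_at))
        .map(|(i, _)| i)
        .unwrap();
    let victim_sealed = entries[idx].expires_at.is_none();
    // A sealed victim may only be displaced by another sealed bundle.
    if !victim_sealed || new.expires_at.is_none() {
        entries[idx] = new;
    }
}
```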
Post bundle healing
When a client fetches posts for a bucket, it queries multiple servers and collects their responses. It then compares: which server has posts that another server is missing? For each donor-target pair where the donor has posts the target lacks, the client arranges a two-phase heal:
- Claim phase: The client asks the target server which of the donor's posts it needs, using a HealPostBundleClaimV1 request. The server declares which post IDs it is missing.
- Commit phase: The client (or the donor, acting as a proxy) sends those specific post bytes to the target via HealPostBundleCommitV1.
This runs in background tasks spawned after the client's primary fetch completes. The user sees their content without waiting for healing; the network repairs itself in parallel.
Source: post_bundle_healing.rs,
test_healing_post_bundles.rs
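The donor/target comparison that precedes the claim phase is essentially a pairwise set difference over the servers' responses. A dependency-free sketch (the ID types are placeholders):

```rust
use std::collections::{HashMap, HashSet};

type ServerId = u32;
type PostId = u64;

/// For each donor/target pair, list the post IDs the donor holds that the
/// target is missing: the candidates for a claim/commit heal.
fn heal_plan(
    responses: &HashMap<ServerId, HashSet<PostId>>,
) -> Vec<(ServerId, ServerId, Vec<PostId>)> {
    let mut plan = Vec::new();
    for (&donor, donor_posts) in responses {
        for (&target, target_posts) in responses {
            if donor == target {
                continue;
            }
            let mut missing: Vec<PostId> =
                donor_posts.difference(target_posts).copied().collect();
            if !missing.is_empty() {
                missing.sort_unstable();
                plan.push((donor, target, missing));
            }
        }
    }
    plan
}
```

The claim phase then lets the target trim this candidate list down to the posts it actually lacks at commit time.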
Feedback healing
Feedback signals — likes, reports, dislikes — also diverge across servers as they propagate
through the network. A report might reach server A but not server B. Feedback healing
addresses this separately from post healing because feedback has a different data shape
(the 50-byte EncodedPostFeedbackV1 record) and a different merge rule.
The merge rule for feedback is: for each (post_id, feedback_type) pair, keep the signal
with the highest PoW across all servers. The client, after collecting feedback from multiple
servers, computes the global maximum and identifies servers that have weaker signals than
the global max. It then sends the stronger signals to those servers via
HealPostBundleFeedbackV1.
This is a single-phase heal (no claim step): the feedback record is small enough that sending it unconditionally is cheaper than the round-trip of asking first.
Source: post_bundle_feedback_healing.rs,
test_healing_post_bundle_feedbacks.rs
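The highest-PoW merge rule reduces to a per-key maximum. A sketch, with the feedback variants and PoW representation assumed for illustration:

```rust
use std::collections::HashMap;

type PostId = u64;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
enum FeedbackType { Like, Dislike, Report } // illustrative variants

/// For each (post_id, feedback_type) pair, keep the signal with the
/// highest proof-of-work seen across all servers.
fn merge_feedback(
    all: impl IntoIterator<Item = ((PostId, FeedbackType), u64)>,
) -> HashMap<(PostId, FeedbackType), u64> {
    let mut max = HashMap::new();
    for (key, pow) in all {
        max.entry(key)
            .and_modify(|m: &mut u64| *m = (*m).max(pow))
            .or_insert(pow);
    }
    max
}
```

Servers holding a weaker signal than the merged maximum are then sent the stronger record unconditionally, which is what makes the single-phase heal cheap.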
DDoS resistance
Hashiverse layers its DDoS protection across the connection lifecycle: kernel-level packet filtering, pre-TLS connection guards, TLS handshake and HTTP read timeouts, and application-level rate limiting. Attacks are detected at the later, more expensive layers; repeat offenders are pushed down to the earlier, cheaper ones.
Kernel layer: ipset blacklist
For sustained or severe attacks, the server escalates to kernel-level packet dropping via
Linux's ipset. Blacklisted IPs are added to a hash set with a five-minute
timeout; iptables (or nftables) drops matching packets before they reach the application
layer, eliminating even the cost of accepting and rejecting the TCP connection.
ipset create hashiverse_ddos_blacklist hash:ip timeout 300
The Rust server adds IPs to this set using non-blocking Command::new("ipset")
calls — the kernel operation does not block request handling. This requires
CAP_NET_ADMIN or root in production, managed via Linux Ambient Capabilities.
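A non-blocking ipset call might be sketched as below: spawn returns as soon as the child process starts, so request handling never waits on the kernel operation. The helper taking the program name is an assumption added for testability.

```rust
use std::process::Command;

/// Add an IP to the ipset blacklist without blocking request handling.
/// `-exist` makes re-adding an already-listed IP a no-op rather than an error.
fn blacklist_ip_with(program: &str, ip: &str) -> std::io::Result<()> {
    Command::new(program)
        .args(["add", "hashiverse_ddos_blacklist", ip, "-exist"])
        .spawn() // fire and forget: do not wait for completion
        .map(|_| ())
}

/// Production entry point; requires CAP_NET_ADMIN (or root).
fn blacklist_ip(ip: &str) -> std::io::Result<()> {
    blacklist_ip_with("ipset", ip)
}
```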
Pre-TLS: connection cap and per-IP slot limit
Each new TCP connection is checked before the TLS handshake begins. Two guards run at this point:
- Global connection cap — a semaphore limits the total number of simultaneous in-flight TCP connections. If the cap is reached the socket is closed immediately; no handshake resources are consumed.
- Per-IP connection cap — a separate in-memory counter limits how many concurrent connections a single IP may hold. This prevents one address from monopolising all available slots. IPs already in the application-layer ban list are also rejected here.
Both checks fire before TlsAcceptor::accept is called, so a flood of new
connections from a single IP consumes no CPU on TLS negotiation.
The server is deliberately IPv4-only for now; IPv6 support requires prefix-level
tracking (/64) to be effective, which is not yet implemented.
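The two pre-TLS guards might be sketched as a single admission check, here with a plain counter map in place of the real semaphore and ban list, and with deliberately tiny illustrative caps:

```rust
use std::collections::HashMap;
use std::net::IpAddr;

const MAX_TOTAL_CONNS: usize = 4;  // illustrative caps,
const MAX_CONNS_PER_IP: usize = 2; // not the production values

/// Checked before TlsAcceptor::accept, so rejected floods cost no TLS work.
#[derive(Default)]
struct ConnGuard {
    total: usize,
    per_ip: HashMap<IpAddr, usize>,
    banned: Vec<IpAddr>, // stands in for the application-layer ban list
}

impl ConnGuard {
    fn try_admit(&mut self, ip: IpAddr) -> bool {
        // Global cap and ban list first: cheapest rejections.
        if self.total >= MAX_TOTAL_CONNS || self.banned.contains(&ip) {
            return false;
        }
        // Per-IP slot limit: one address cannot monopolise the pool.
        let slots = self.per_ip.entry(ip).or_insert(0);
        if *slots >= MAX_CONNS_PER_IP {
            return false;
        }
        *slots += 1;
        self.total += 1;
        true
    }
    fn release(&mut self, ip: IpAddr) {
        self.total = self.total.saturating_sub(1);
        if let Some(s) = self.per_ip.get_mut(&ip) {
            *s = s.saturating_sub(1);
        }
    }
}
```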
TLS handshake timeout
Once a connection passes the pre-TLS guards, the TLS handshake is wrapped in a hard timeout. A client that sends a ClientHello but never completes the handshake — a classic TLS-layer slow connection — is dropped after a fixed number of seconds and its bad-request score is incremented, moving it toward the application-layer ban.
Slow Loris defence: header and body read timeouts
After a successful TLS handshake, two further timeouts guard against connection-holding attacks at the HTTP layer:
- Header read timeout — if an HTTP/1.1 client has not finished sending its request headers within the allowed window, the connection is dropped. This is the primary defence against classic Slow Loris attacks, where an attacker opens many connections and trickles headers one byte at a time to hold slots open indefinitely.
- Body read timeout — a separate timeout covers the request body. This defends against body-level variants where headers arrive quickly but the body is streamed at a crawl. Combined with an explicit maximum body size, this bounds both time and memory consumed per request.
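The header-timeout idea can be illustrated on a plain std TcpStream: a socket read deadline bounds how long a trickling client can hold the connection, and an explicit buffer cap bounds memory. This is a simplified plain-TCP sketch; the real server applies these limits at the HTTP layer, after TLS.

```rust
use std::io::Read;
use std::net::TcpStream;
use std::time::Duration;

/// True once a complete HTTP/1.1 header block has arrived.
fn headers_complete(buf: &[u8]) -> bool {
    buf.windows(4).any(|w| w == b"\r\n\r\n")
}

/// Read headers under a deadline. A client that trickles bytes slower than
/// the timeout gets a TimedOut/WouldBlock error and is dropped.
fn read_headers(stream: &mut TcpStream, max_len: usize) -> std::io::Result<Vec<u8>> {
    stream.set_read_timeout(Some(Duration::from_secs(10)))?;
    let mut buf = vec![0u8; max_len]; // explicit cap bounds memory per request
    let mut filled = 0;
    loop {
        let n = stream.read(&mut buf[filled..])?; // errs if the peer stalls
        if n == 0 {
            break; // peer closed before completing headers
        }
        filled += n;
        if headers_complete(&buf[..filled]) || filled == max_len {
            break;
        }
    }
    buf.truncate(filled);
    Ok(buf)
}
```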
Application layer: IP reputation cache
Requests that pass all of the above are tracked per IP address using a Moka in-memory cache. IPs that exceed the request rate threshold within a sliding window receive a Hashiverse RPC error response and accumulate a bad-request score. Once the score crosses a threshold the IP is promoted to the ipset blacklist, dropping it to the kernel layer for future connections.
Source: hashiverse-server-lib/src/network/transport/,
config.rs
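The reputation logic described above can be sketched as a sliding-window counter plus a bad-request score. The window size, limits, and verdict names here are illustrative, and a HashMap stands in for the Moka cache to keep the example dependency-free:

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::time::{Duration, Instant};

const WINDOW: Duration = Duration::from_secs(10); // illustrative values,
const MAX_REQUESTS_PER_WINDOW: usize = 100;       // not the real config
const BAN_SCORE_THRESHOLD: u32 = 5;

#[derive(Default)]
struct IpReputation {
    requests: HashMap<IpAddr, Vec<Instant>>,
    bad_score: HashMap<IpAddr, u32>,
}

enum Verdict { Allow, RateLimited, PromoteToIpset }

impl IpReputation {
    fn check(&mut self, ip: IpAddr, now: Instant) -> Verdict {
        let times = self.requests.entry(ip).or_default();
        times.retain(|t| now.duration_since(*t) < WINDOW); // slide the window
        times.push(now);
        if times.len() <= MAX_REQUESTS_PER_WINDOW {
            return Verdict::Allow;
        }
        let score = self.bad_score.entry(ip).or_insert(0);
        *score += 1;
        if *score >= BAN_SCORE_THRESHOLD {
            Verdict::PromoteToIpset // drop to the kernel layer from here on
        } else {
            Verdict::RateLimited // Hashiverse RPC error response
        }
    }
}
```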
TimeProvider abstraction
The TimeProvider trait abstracts over real time, enabling
ScaledTimeProvider — a time-accelerated simulation clock used in tests.
Integration tests that verify healing behavior over time windows that would normally
take hours can run in seconds by scaling time. This is how the integration test suite
can verify the full healing lifecycle — including expiry of old content — without wall-clock
delays.
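A minimal sketch of the abstraction, assuming the trait exposes something like a now() method (the exact trait surface and provider names beyond TimeProvider and ScaledTimeProvider are assumptions):

```rust
use std::time::Instant;

/// The same healing code runs against real time in production and
/// scaled time in tests.
trait TimeProvider {
    fn now(&self) -> Instant;
}

struct RealTimeProvider;

impl TimeProvider for RealTimeProvider {
    fn now(&self) -> Instant {
        Instant::now()
    }
}

/// Reports elapsed time multiplied by `scale`: at scale 3600, one real
/// second reads as one simulated hour, so expiry windows that span hours
/// elapse within a test run of seconds.
struct ScaledTimeProvider {
    origin: Instant,
    scale: u32,
}

impl TimeProvider for ScaledTimeProvider {
    fn now(&self) -> Instant {
        self.origin + self.origin.elapsed() * self.scale
    }
}
```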