GO-DFS

A peer-to-peer distributed file system, built entirely from scratch in Go. No IPFS. No libp2p. Every byte, understood.


Built Different. Built From Scratch.

Every component — from the transport layer to the DHT — is hand-rolled. Zero external DFS dependencies.

Kademlia DHT

XOR-distance routing, k-bucket tables, and iterative FIND_NODE lookups. Real Kademlia, not a wrapper around someone else's library.

AES-256 Encryption

Every chunk is encrypted client-side with AES-256-CTR before it ever leaves your machine. Your key, your data. Even relay nodes can't read it.

Content-Addressed Chunks

Files are split into 256KB chunks, each keyed by SHA-256(encrypted_data). Dedup is automatic. CID = hash of the manifest.

NAT Traversal & Relay

Nodes behind NAT communicate through relay peers with TTL-limited message forwarding. Works across the real internet, not just localhost.
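TTL-limited forwarding is what keeps relayed traffic from looping or flooding the mesh. A minimal sketch of the idea (the envelope fields here are illustrative, not GO-DFS's actual wire format):

```go
package main

import "fmt"

// RelayEnvelope wraps a message forwarded on behalf of a NATed peer.
type RelayEnvelope struct {
	From, To string // node IDs
	TTL      int    // hops remaining; relays refuse to forward at zero
	Payload  []byte
}

// forward decrements TTL and reports whether the message may travel
// another hop. Expired messages are dropped, which bounds relay depth
// and protects against routing loops.
func forward(env *RelayEnvelope) bool {
	if env.TTL <= 0 {
		return false
	}
	env.TTL--
	return true
}

func main() {
	env := &RelayEnvelope{From: "node-a", To: "node-b", TTL: 3, Payload: []byte("PING")}
	hops := 0
	for forward(env) {
		hops++
	}
	fmt.Println(hops) // a TTL of 3 caps the message at 3 relay hops
}
```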

Automatic Replication

Periodic audits detect under-replicated or over-replicated chunks. The mesh self-heals — no manual intervention.
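The audit itself reduces to comparing observed replica counts against a target. A minimal sketch of that classification step (the target of 3 and the `audit` helper are illustrative, not the server's actual configuration):

```go
package main

import "fmt"

const replicationTarget = 3 // illustrative target, not the project's setting

// audit classifies chunks by replica count so the mesh can self-heal:
// under-replicated chunks get re-replicated to healthy peers,
// over-replicated chunks get pruned.
func audit(replicas map[string]int, target int) (under, over int) {
	for _, n := range replicas {
		switch {
		case n < target:
			under++
		case n > target:
			over++
		}
	}
	return
}

func main() {
	counts := map[string]int{"chunkA": 1, "chunkB": 3, "chunkC": 5}
	u, o := audit(counts, replicationTarget)
	fmt.Println(u, o) // 1 1
}
```

In the real system the counts come from heartbeat responses, and the repair actions are issued as replicate/prune messages to peers.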

Tombstone Deletion

Deletes propagate through the network via cryptographic tombstones. Chunks are garbage-collected across all peers.

System Architecture

Four layers make up the mesh, each built up from raw TCP sockets.

Transport

pkg/p2p

Raw TCP with TLS 1.3 handshake. Length-prefixed framing. Stream vs message multiplexing. Platform-specific socket tuning.

DHT

pkg/dht

Kademlia-style distributed hash table. SHA-256 node IDs, XOR distance metric, k-bucket routing table with K=20 replication factor.

Storage

internal/storage

Content-addressed on-disk store. File chunker with 256KB blocks. CID index for file→chunk mapping. Tombstone store for delete propagation.

Server

internal/server

The brain. Message routing, peer exchange, relay forwarding, chunk replication audits, heartbeat failure detection, and the HTTP control API.

Three Nodes. Three Steps. Real Networking.

Step 1

Start the Relay (Cloud)

Spin up a public node on EC2 or any VPS. It acts as the bootstrap peer and NAT bridge.

./dfs node start --port :7000 --relay --advertise <PUBLIC_IP>:7000
Step 2

Join the Mesh (Local)

Start your local node. It bootstraps through the relay, discovers peers, and builds its routing table.

./dfs.exe node start --port :7001 --bootstrap <PUBLIC_IP>:7000 --api-port :9001 -i
Step 3

Store & Retrieve

Store a file from any node — it gets encrypted, chunked, and replicated. Retrieve it from any other node using the CID.

dfs> store my_file.txt
# CID: b64730e9f1...

dfs> get b64730e9f1... -o downloaded.txt
✓ File retrieved, decrypted, and saved.
AES-256-CTR: client-side encryption
SHA-256: content addressing
K=20: replication factor
256KB: chunk size
TLS 1.3: transport security
TTL=3: max relay hops

Open Source & Yours to Hack

GO-DFS is completely open source. Dive into the code, break it, improve it, or fork it.

Frequently Asked Questions

How is GO-DFS different from IPFS?

GO-DFS is a learning-first project built entirely from scratch. IPFS uses libp2p, Bitswap, and many abstraction layers. Here, every component — TCP transport, Kademlia DHT, chunk encoding — is written by hand to understand distributed systems from the ground up.

Can other nodes read my files?

No. Files are encrypted with AES-256-CTR using your personal key BEFORE chunking and distribution, so even relay nodes that forward your data cannot read it. Chunks are addressed by SHA-256(encrypted_data), meaning the identifiers stored in the DHT are derived from ciphertext, never plaintext.

Does it work across the real internet, or only on localhost?

It works across the real internet. The relay system enables NAT traversal: run a bootstrap node on a VPS, and local nodes behind NAT can communicate through it. This has been tested with real EC2 instances.

What happens when a node goes offline?

The replication audit system detects missing replicas via periodic heartbeat checks. Under-replicated chunks are automatically re-replicated to healthy peers, and over-replicated chunks are pruned.

Can I actually delete a file from the network?

Yes. Deletion uses cryptographic tombstones that propagate through the mesh. All peers holding chunks of that file will garbage-collect them within the GC window.