
Tech Stack
Description
ShardFS is a distributed file system inspired by GFS/HDFS, designed for scalable and fault-tolerant storage. Large files are split into shards, distributed across nodes, and replicated to guarantee durability and availability.
It features a central metadata service for shard indexing, node health tracking, and orchestration of uploads and retrievals. Redis serves as a caching layer for hot metadata, reducing lookup latency.
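In outline, the hot-metadata cache follows a standard cache-aside pattern. The sketch below uses a plain dict in place of a Redis client, and the record shape is illustrative, not ShardFS's actual schema:

```python
import json

# Stand-in for a Redis client; in ShardFS this would be a redis.Redis() handle.
cache = {}

# Stand-in for the authoritative metadata store (illustrative record shape).
metadata_db = {"video.mp4": {"shards": ["s0", "s1"], "replicas": 3}}

def lookup_metadata(path):
    """Cache-aside read: try the cache first, fall back to the store on a miss."""
    cached = cache.get(path)
    if cached is not None:
        return json.loads(cached)          # cache hit: skip the metadata service
    record = metadata_db[path]             # cache miss: query the metadata service
    cache[path] = json.dumps(record)       # populate the cache for later reads
    return record
```

Subsequent lookups for the same path are then served from the cache instead of the metadata service.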
Alongside the CLI, a real-time monitoring dashboard visualizes cluster health, shard placement, and I/O statistics, enabling better observability.
By supporting parallel uploads and downloads, ShardFS achieves higher throughput for large datasets, while its replication strategy ensures resilience against node failures.
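The sharding step itself can be sketched roughly as follows; the shard size and checksum choice here are illustrative assumptions, not ShardFS's actual defaults:

```python
import hashlib

SHARD_SIZE = 64 * 1024 * 1024  # 64 MiB, an assumed default shard size

def split_into_shards(data: bytes, shard_size: int = SHARD_SIZE):
    """Split a byte stream into fixed-size shards, tagging each with an
    index (for reassembly order) and a checksum (for integrity checks)."""
    shards = []
    for offset in range(0, len(data), shard_size):
        chunk = data[offset:offset + shard_size]
        shards.append({
            "index": offset // shard_size,
            "checksum": hashlib.sha256(chunk).hexdigest(),
            "data": chunk,
        })
    return shards
```

The final shard may be shorter than the configured shard size; the index field lets the client reassemble shards in order regardless of arrival order.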
- Built a distributed file system with sharding + replication.
- Designed automatic fault recovery for high availability.
- Implemented a metadata service for shard indexing and node coordination.
- Developed a real-time monitoring dashboard for system observability.
- Dockerized the cluster setup for reproducible testing and deployment.
- Optimized performance with parallel shard fetches.
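Deterministic replica placement, as used for the sharding + replication above, can be sketched with rendezvous (HRW) hashing; this is one plausible policy, and ShardFS's actual placement strategy may differ:

```python
import hashlib

def place_replicas(shard_id: str, nodes: list, replicas: int = 3):
    """Pick `replicas` storage nodes for a shard via rendezvous hashing:
    score every node against the shard id and take the highest scores.
    Any client computing this gets the same answer without coordination."""
    def score(node: str) -> str:
        return hashlib.sha256(f"{shard_id}:{node}".encode()).hexdigest()
    return sorted(nodes, key=score, reverse=True)[:replicas]
```

Because the scores depend only on the shard id and node name, adding or removing one node only remaps the shards that scored it highly, rather than reshuffling the whole cluster.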
 
Page Info
System Overview
Architecture diagram showing how files are split into shards, replicated, and distributed across storage nodes with metadata service orchestration.

Upload CLI
Example of uploading a file via the CLI. The system splits it into shards, replicates them, and updates the metadata service.
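A rough sketch of that upload flow, with in-memory dicts standing in for storage nodes and the metadata service; the round-robin placement, tiny shard size, and record shape are assumptions for illustration:

```python
# In-memory stand-ins for the storage nodes and the metadata service.
nodes = {"node-a": {}, "node-b": {}, "node-c": {}}
metadata = {}

def upload(path: str, data: bytes, replicas: int = 2, shard_size: int = 4):
    """Shard the payload, write each shard to `replicas` nodes
    (round-robin here), then record the layout in the metadata service."""
    names = list(nodes)
    layout = []
    for offset in range(0, len(data), shard_size):
        idx = offset // shard_size
        shard_id = f"{path}#{idx}"
        targets = [names[(idx + r) % len(names)] for r in range(replicas)]
        for node in targets:
            nodes[node][shard_id] = data[offset:offset + shard_size]  # "network" write
        layout.append({"shard": shard_id, "nodes": targets})
    metadata[path] = layout  # commit placement so readers can find the shards
```

The metadata record is written last, so a crashed upload leaves no partially visible file.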

Retrieval CLI
CLI command fetching shards in parallel from distributed nodes, reconstructing the file seamlessly for the client.
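Parallel retrieval can be sketched with a thread pool; the in-memory shard table below stands in for network reads from the storage nodes:

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for shards spread across storage nodes.
SHARDS = {0: b"dis", 1: b"tri", 2: b"buted"}

def fetch_shard(shard_id: int) -> bytes:
    """Stand-in for a network read from whichever node holds the shard."""
    return SHARDS[shard_id]

def retrieve(shard_ids: list) -> bytes:
    """Fetch shards concurrently; pool.map preserves input order, so the
    parts join back into the original byte stream."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        parts = list(pool.map(fetch_shard, shard_ids))
    return b"".join(parts)
```

Overlapping the per-shard fetch latency this way is what lets throughput scale with the number of nodes holding shards.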

Fault Tolerance Demo
Terminal output showing successful file recovery when one node fails, leveraging shard replicas for high availability.
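The failover logic behind that demo can be sketched as follows; the simulated cluster and exception name are illustrative:

```python
class NodeDown(Exception):
    """Raised when a storage node is unreachable."""

# Simulated cluster state: node-b has failed, but node-c holds a replica.
cluster = {
    "node-a": {"s1": b"other"},
    "node-b": None,               # unreachable
    "node-c": {"s0": b"alpha"},   # surviving replica of s0
}

def read_shard(shard_id: str, replica_nodes: list) -> bytes:
    """Try each replica in turn; an unreachable node triggers failover
    to the next replica instead of failing the whole read."""
    for node in replica_nodes:
        try:
            store = cluster[node]
            if store is None:
                raise NodeDown(node)
            return store[shard_id]
        except NodeDown:
            continue  # node unavailable: fall through to the next replica
    raise RuntimeError(f"all replicas of {shard_id} unavailable")
```

As long as one replica survives, the read succeeds without client-visible errors, which is the behavior the demo exercises.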

Realtime Dashboard
A web dashboard showing live system metrics such as shard distribution, node health, active connections, throughput, and replication status.

