Blockchain Cryptography Distributed Systems Beginner Backend

Blockchain from Scratch: A Technical Intro with a Toy App You Can Actually Run

Palakorn Voramongkol
April 15, 2026 16 min read

“A ground-up, technical introduction to blockchains — hashes, blocks, Merkle trees, proof-of-work, consensus — paired with a ~200-line TypeScript toy chain you can run, fork, and break. The version of the explanation that assumes you're a developer, not an investor.”

Most “what is blockchain” posts are written for people deciding whether to buy something. This one is written for people deciding whether to build something. By the end you’ll have a working blockchain — a real one, in the technical sense — that you can run, break, and extend to test ideas without spinning up Ethereum.

We’ll build up from first principles: hashes, blocks, chains, Merkle trees, proof-of-work, validation, and forks. Then we’ll wire it all into ~200 lines of TypeScript. No tokens, no ICOs, no jargon-for-the-sake-of-it. Just the actual distributed-systems primitive underneath.

TL;DR

  • A blockchain is a hash-linked, append-only log that many parties agree on without a central coordinator.
  • The “block” is just a batch of transactions + a pointer (hash) to the previous block. Change history and every subsequent hash breaks.
  • Consensus is what makes it useful across untrusting parties — proof-of-work, proof-of-stake, BFT variants. Each trades latency, energy, and safety guarantees differently.
  • Merkle trees let you prove a transaction is in a block without downloading the whole block.
  • Great fit for: coordination without a trusted intermediary, tamper-evident audit logs, cross-organisation state.
  • Bad fit for: things a Postgres table handles fine. Most “blockchain projects” are in this bucket.
  • We’ll build a working chain you can run in under 10 minutes.

What a Blockchain Actually Is

Strip away the marketing and a blockchain is four ideas stacked on top of each other:

  1. A hash chain. Each block contains the hash of the previous block. Change anything in block N, and the hash of block N changes, which invalidates block N+1’s pointer, which invalidates N+2’s, and so on — all the way to the tip. That’s the “tamper-evident” property. It isn’t tamper-proof — it’s tamper-evident.
  2. Content-addressed batches. A block groups transactions into one unit so you’re not hashing every transaction against every other one. The block header commits to all its transactions via a Merkle root — one hash that covers the whole set.
  3. A consensus rule for picking which chain is “the” chain when nodes see different versions. Without this, every node has its own append-only log and there’s no shared truth.
  4. Peer-to-peer gossip so nodes exchange blocks and transactions without a central server.

Everything else — smart contracts, tokens, gas, zero-knowledge proofs — is built on top of these four. If you understand hash chains + Merkle trees + consensus + gossip, you understand the primitive. The rest is application layer.

The Five Building Blocks

1. Cryptographic Hash

A one-way function: input → fixed-size fingerprint. Blockchains use SHA-256 (Bitcoin) or Keccak-256 (Ethereum). Three key properties:

  • Determinism. Same input → same output, always.
  • Avalanche. One bit flipped in the input → completely different output.
  • Collision resistance. In practice, you can’t engineer two inputs with the same hash.

import { createHash } from "node:crypto";
const sha = (s: string) => createHash("sha256").update(s).digest("hex");

sha("hello");  // 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
sha("hellp");  // completely different

That’s it. That’s the cryptographic primitive every blockchain rests on.
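To make the avalanche property concrete, here is a minimal sketch that counts how many of the 256 output bits differ between the two inputs above. The helper names `digest` and `bitDiff` are made up for this example. For a well-behaved hash you would expect roughly half the bits to flip, even though the inputs differ by a single character.

```typescript
import { createHash } from "node:crypto";

// Raw 32-byte SHA-256 digest of a UTF-8 string.
const digest = (s: string): Buffer => createHash("sha256").update(s).digest();

// Count differing bits between two equal-length buffers (Hamming distance).
function bitDiff(a: Buffer, b: Buffer): number {
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    let x = a[i] ^ b[i]; // bits that differ in this byte
    while (x > 0) {
      diff += x & 1;
      x >>= 1;
    }
  }
  return diff;
}

const changed = bitDiff(digest("hello"), digest("hellp"));
console.log(`${changed} of 256 bits differ`);
```

Try a few other near-identical pairs; the count should hover around 128 each time.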

2. Block

A block is a header + a list of transactions. The header is what actually gets hashed and referenced:

interface BlockHeader {
  index: number;             // height in the chain, 0 = genesis
  timestamp: number;         // Unix ms
  previousHash: string;      // hash of the previous block's header
  merkleRoot: string;        // root of the transactions Merkle tree
  nonce: number;             // puzzle solution for proof-of-work
  difficulty: number;        // current difficulty target
}

The block’s own hash is sha256(JSON.stringify(header)). Since merkleRoot commits to the transaction set and previousHash commits to the prior block’s state, one hash fingerprints everything.
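One practical wrinkle with hashing `JSON.stringify` output: the string depends on the order in which the object's keys were created, so two nodes can hold the same header data and still compute different hashes. The sketch below shows the effect on a cut-down header (`MiniHeader`, `naiveHash`, and `canonicalHash` are names invented for this example) and one possible fix, serialising the fields in a fixed order.

```typescript
import { createHash } from "node:crypto";

const sha = (s: string) => createHash("sha256").update(s).digest("hex");

// Cut-down header, just enough fields to show the effect.
interface MiniHeader {
  index: number;
  previousHash: string;
  nonce: number;
}

// Naive: hash whatever JSON.stringify produces for the object as given.
const naiveHash = (h: MiniHeader) => sha(JSON.stringify(h));

// Canonical: serialise the fields in one fixed order, however the object was built.
const canonicalHash = (h: MiniHeader) =>
  sha(JSON.stringify([h.index, h.previousHash, h.nonce]));

const h1: MiniHeader = { index: 1, previousHash: "00ab", nonce: 7 };
const h2: MiniHeader = { nonce: 7, previousHash: "00ab", index: 1 }; // same data, other key order

console.log(naiveHash(h1) === naiveHash(h2));                          // false
console.log(canonicalHash(h1) === canonicalHash(h2));                  // true
console.log(canonicalHash(h1) === canonicalHash({ ...h1, nonce: 8 })); // false
```

The `hashBlock` function in `chain.ts` sidesteps the same problem by listing the header fields explicitly before stringifying.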

3. Merkle Tree

Given N transactions, a Merkle tree produces one root hash from which you can prove any single transaction is included, using only log₂(N) hashes. That’s why light clients can verify a transaction without downloading the whole chain.

function merkleRoot(items: string[]): string {
  if (items.length === 0) return sha("");
  let level = items.map(sha);
  while (level.length > 1) {
    if (level.length % 2 === 1) level.push(level[level.length - 1]); // duplicate last
    const next: string[] = [];
    for (let i = 0; i < level.length; i += 2) {
      next.push(sha(level[i] + level[i + 1]));
    }
    level = next;
  }
  return level[0];
}

Understanding this tiny function unblocks a lot of later reading — inclusion proofs, light clients, rollups, all rest on it.
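The inclusion proof itself can be sketched on top of the same tree layout. This is one possible shape, not a standard API: `ProofStep`, `merkleProof`, and `verifyProof` are names invented here, and the pairing rules (hash each leaf, concatenate hex strings, duplicate the last node on odd levels) simply mirror `merkleRoot` above.

```typescript
import { createHash } from "node:crypto";

const sha = (s: string) => createHash("sha256").update(s).digest("hex");

// One step of a proof: the sibling hash and which side it sits on.
interface ProofStep {
  sibling: string;
  siblingOnLeft: boolean;
}

// Build the root and the sibling path for items[index] in one pass.
function merkleProof(items: string[], index: number): { root: string; proof: ProofStep[] } {
  if (index < 0 || index >= items.length) throw new RangeError("index out of range");
  let level = items.map(sha);
  let pos = index;
  const proof: ProofStep[] = [];
  while (level.length > 1) {
    if (level.length % 2 === 1) level.push(level[level.length - 1]); // duplicate last
    const siblingPos = pos % 2 === 0 ? pos + 1 : pos - 1;
    proof.push({ sibling: level[siblingPos], siblingOnLeft: siblingPos < pos });
    const next: string[] = [];
    for (let i = 0; i < level.length; i += 2) next.push(sha(level[i] + level[i + 1]));
    level = next;
    pos = Math.floor(pos / 2); // position of our node's parent in the next level
  }
  return { root: level[0], proof };
}

// Recompute the path from the leaf upward; only the proof and the root are needed.
function verifyProof(item: string, proof: ProofStep[], root: string): boolean {
  let hash = sha(item);
  for (const step of proof) {
    hash = step.siblingOnLeft ? sha(step.sibling + hash) : sha(hash + step.sibling);
  }
  return hash === root;
}

const txs = ["tx-a", "tx-b", "tx-c", "tx-d", "tx-e"];
const { root, proof } = merkleProof(txs, 2);
console.log(proof.length);                     // 3 sibling hashes for 5 leaves
console.log(verifyProof("tx-c", proof, root)); // true
console.log(verifyProof("tx-x", proof, root)); // false
```

The verifier never sees the other four transactions, only three sibling hashes and the root it already trusts from the block header.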

4. Consensus

This is the knob that matters most in practice, and the one most casual explanations handwave. Consensus is the rule every node uses to pick the “real” chain when they see disagreement. Common approaches:

  • Proof-of-Work (PoW). Bitcoin’s approach. Miners compete to find a nonce such that the block hash falls below a target, which in simplified form means it starts with N zero bits. Winner publishes the block. The chain with the most accumulated work wins, which is usually also the longest. Expensive to attack (you’d need more compute than the rest of the network), expensive to run (electricity).
  • Proof-of-Stake (PoS). Ethereum’s current approach. Validators bond a stake; a leader is pseudo-randomly chosen in proportion to stake; bad behaviour gets slashed. Cheap to run; relies on economic penalties rather than energy.
  • BFT (Byzantine Fault Tolerant) variants (Tendermint, HotStuff, etc.). Leader-based, round-based, with quorum voting. Fast finality, known validator set — common in enterprise/consortium chains.
  • Proof-of-Authority (PoA). A fixed set of authorised signers rotate block production. Trivial consensus but centralised — fine for private testnets and consortium chains.

The choice dictates latency, throughput, and who can run a node.
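In its simplest form, the "longest chain wins" rule is a short function. The sketch below makes several simplifying assumptions: `MiniBlock`, `linksUp`, and `chooseChain` are names invented here, block hashes are stored rather than recomputed, and chain length stands in for accumulated work.

```typescript
// Minimal block shape for this sketch; `hash` is stored rather than recomputed.
interface MiniBlock {
  index: number;
  hash: string;
  previousHash: string;
}

// Assumed validity check: indices are consecutive and each block points at its parent.
// A real node would also recompute hashes and check proof-of-work.
function linksUp(chain: MiniBlock[]): boolean {
  return chain.every((b, i) =>
    i === 0 ? b.index === 0 : b.index === i && b.previousHash === chain[i - 1].hash,
  );
}

// Fork choice: keep the current chain unless a valid candidate with the same
// genesis block is strictly longer.
function chooseChain(current: MiniBlock[], candidates: MiniBlock[][]): MiniBlock[] {
  let best = current;
  for (const c of candidates) {
    const sameGenesis = c.length > 0 && c[0].hash === current[0].hash;
    if (sameGenesis && c.length > best.length && linksUp(c)) best = c;
  }
  return best;
}

const genesis: MiniBlock = { index: 0, hash: "g", previousHash: "" };
const ours: MiniBlock[] = [genesis, { index: 1, hash: "a1", previousHash: "g" }];
const theirs: MiniBlock[] = [
  genesis,
  { index: 1, hash: "b1", previousHash: "g" },
  { index: 2, hash: "b2", previousHash: "b1" },
];
// Longer still, but the last block points at a parent that does not exist.
const broken: MiniBlock[] = [...theirs, { index: 3, hash: "b3", previousHash: "zz" }];

console.log(chooseChain(ours, [theirs, broken]) === theirs); // true
```

Requiring a strictly longer candidate means ties keep the chain a node already has, so two nodes with equal-length forks stay split until one side extends its chain.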

5. Gossip / P2P

Nodes maintain peer connections, relay new transactions as they arrive, and relay new blocks once they’re valid. A real P2P layer is a lot — NAT traversal, block propagation, DoS resistance — but the core idea is “tell your neighbours what you just heard.” For learning purposes, a JSON-over-HTTP endpoint between a handful of nodes is enough.
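The "tell your neighbours what you just heard" idea fits in a few lines if the network is replaced by in-memory method calls. This is a toy simulation under that assumption; `GossipNode` and its methods are invented for this example. The `seen` set stops a message from circulating forever once the peer graph contains a loop.

```typescript
// In-memory stand-in for a network node: no sockets, just method calls.
class GossipNode {
  readonly name: string;
  readonly seen = new Set<string>();
  readonly peers: GossipNode[] = [];

  constructor(name: string) {
    this.name = name;
  }

  // Bidirectional link between two nodes.
  connect(other: GossipNode): void {
    this.peers.push(other);
    other.peers.push(this);
  }

  // Accept a message once, then forward it to every peer except the sender.
  receive(msgId: string, from?: GossipNode): void {
    if (this.seen.has(msgId)) return; // already heard it: stop the flood here
    this.seen.add(msgId);
    for (const p of this.peers) {
      if (p !== from) p.receive(msgId, this);
    }
  }
}

// Ring topology: a - b - c - d - a
const a = new GossipNode("a");
const b = new GossipNode("b");
const c = new GossipNode("c");
const d = new GossipNode("d");
a.connect(b);
b.connect(c);
c.connect(d);
d.connect(a);

a.receive("block-42");
console.log([a, b, c, d].every((n) => n.seen.has("block-42"))); // true
```

A real node would validate the block before relaying it, so invalid data dies at the first honest peer.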

Properties (and Non-Properties) of Blockchains

What they do give you:

  • Tamper-evidence. Rewriting history requires redoing all subsequent proofs.
  • Censorship resistance. On a permissionless chain with enough participants, no single party can block a transaction.
  • Auditability. Every state change is public and attributable to a signing key.
  • No trusted intermediary. The ledger’s integrity comes from the protocol, not a company.

What they do not give you, despite frequent claims:

  • Privacy. Most chains are pseudonymous, not private. Your address history is public forever.
  • Unbreakable data storage. Losing your key loses your assets. Smart contract bugs are forever.
  • Immutability of application truth. Garbage-in is permanent garbage. If the oracle you feed your contract lies, the chain faithfully records the lie.
  • Speed. Even fast chains are orders of magnitude slower than a centralised database. The property you’re paying for is decentralisation, not performance.

A Mental Model: Git, But Global and Adversarial

Git is a useful starting analogy — a hash-linked DAG of commits — but blockchain has three differences worth internalising:

  1. Global. Git repos fork freely; blockchains need a single canonical chain that all participants agree on.
  2. Adversarial. Some participants will actively try to rewrite history, double-spend, censor, or stall.
  3. Byzantine fault tolerant. The system must keep working when a minority of participants lie, crash, or collude.

The consensus rule is how you make git-like semantics survive that environment.

flowchart LR
  T1[Transaction] --> Pool[Mempool]
  T2[Transaction] --> Pool
  Pool --> Miner[Miner / Validator]
  Miner -->|find nonce| Block[New Block]
  Block --> Chain[(Blockchain)]
  Chain --> P1[Peer 1]
  Chain --> P2[Peer 2]
  Chain --> P3[Peer 3]
  P1 <-.gossip.-> P2
  P2 <-.gossip.-> P3

New transactions pile up in a mempool; a miner or validator picks some, builds a block, proves it (PoW / PoS), and gossips it to peers — who each verify before appending.

Building a Toy Chain in TypeScript

The tree below is the minimum worth writing. It’s not production — no networking, no signing, no orphan-block handling — but it is a real blockchain in the technical sense: blocks, hashes, Merkle roots, proof-of-work, and validation.

toychain/
├── package.json
└── chain.ts

// package.json
{
  "name": "toychain",
  "type": "module",
  "scripts": { "start": "tsx chain.ts" },
  "devDependencies": { "tsx": "^4.19.2", "typescript": "^5" }
}

// chain.ts
import { createHash } from "node:crypto";

// ---------- primitives ----------
const sha = (s: string) =>
  createHash("sha256").update(s).digest("hex");

function merkleRoot(items: string[]): string {
  if (items.length === 0) return sha("");
  let level = items.map(sha);
  while (level.length > 1) {
    if (level.length % 2 === 1) level.push(level[level.length - 1]);
    const next: string[] = [];
    for (let i = 0; i < level.length; i += 2) {
      next.push(sha(level[i] + level[i + 1]));
    }
    level = next;
  }
  return level[0];
}

// ---------- domain ----------
interface Transaction {
  from: string;
  to: string;
  amount: number;
  nonce: number; // per-sender replay protection
}

interface Block {
  index: number;
  timestamp: number;
  previousHash: string;
  merkleRoot: string;
  nonce: number;
  difficulty: number;
  transactions: Transaction[];
}

const hashBlock = (b: Block): string =>
  sha(
    JSON.stringify({
      index: b.index,
      timestamp: b.timestamp,
      previousHash: b.previousHash,
      merkleRoot: b.merkleRoot,
      nonce: b.nonce,
      difficulty: b.difficulty,
    }),
  );

const meetsDifficulty = (hash: string, difficulty: number) =>
  hash.startsWith("0".repeat(difficulty));

// ---------- mining ----------
function mine(block: Block): Block {
  let nonce = 0;
  while (true) {
    const candidate = { ...block, nonce };
    if (meetsDifficulty(hashBlock(candidate), block.difficulty)) {
      return candidate;
    }
    nonce += 1;
  }
}

// ---------- chain ----------
class Blockchain {
  chain: Block[] = [];
  mempool: Transaction[] = [];
  difficulty = 4; // ~1s on a laptop for SHA-256

  constructor() {
    // Genesis block — hardcoded, no previous hash.
    const genesis: Block = {
      index: 0,
      timestamp: 0,
      previousHash: "0".repeat(64),
      merkleRoot: merkleRoot([]),
      nonce: 0,
      difficulty: 0,
      transactions: [],
    };
    this.chain.push(genesis);
  }

  tip(): Block {
    return this.chain[this.chain.length - 1];
  }

  submit(tx: Transaction): void {
    // A real chain would verify signatures + balances here.
    this.mempool.push(tx);
  }

  mineNext(): Block {
    const txs = this.mempool.splice(0, 10); // batch size
    const next: Block = {
      index: this.tip().index + 1,
      timestamp: Date.now(),
      previousHash: hashBlock(this.tip()),
      merkleRoot: merkleRoot(txs.map((t) => JSON.stringify(t))),
      nonce: 0,
      difficulty: this.difficulty,
      transactions: txs,
    };
    const mined = mine(next);
    this.chain.push(mined);
    return mined;
  }

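  isValid(): boolean {
    for (let i = 1; i < this.chain.length; i++) {
      const prev = this.chain[i - 1];
      const cur = this.chain[i];
      // Heights must be consecutive.
      if (cur.index !== prev.index + 1) return false;
      // Each block must point at the hash of its parent's header.
      if (cur.previousHash !== hashBlock(prev)) return false;
      // The committed Merkle root must match the transactions actually stored.
      if (cur.merkleRoot !== merkleRoot(cur.transactions.map((t) => JSON.stringify(t)))) return false;
      // A block cannot lower its own difficulty to dodge the puzzle.
      if (cur.difficulty < this.difficulty) return false;
      if (!meetsDifficulty(hashBlock(cur), cur.difficulty)) return false;
    }
    return true;
  }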
}

// ---------- demo ----------
const chain = new Blockchain();

chain.submit({ from: "alice", to: "bob",   amount: 10, nonce: 1 });
chain.submit({ from: "bob",   to: "carol", amount: 3,  nonce: 1 });

const b1 = chain.mineNext();
console.log(`mined #${b1.index} nonce=${b1.nonce} hash=${hashBlock(b1).slice(0, 12)}...`);

chain.submit({ from: "alice", to: "dave", amount: 5, nonce: 2 });
const b2 = chain.mineNext();
console.log(`mined #${b2.index} nonce=${b2.nonce} hash=${hashBlock(b2).slice(0, 12)}...`);

console.log("chain valid:", chain.isValid());

// Tamper check
chain.chain[1].transactions[0].amount = 999;
console.log("chain valid after tamper:", chain.isValid());

Run it:

npm install
npm start

You should see something like:

mined #1 nonce=19427 hash=0000abc1f23d...
mined #2 nonce=34812 hash=0000bf9e7c5a...
chain valid: true
chain valid after tamper: false

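Two blocks mined, each solving a “find a nonce so the hash starts with four zero hex digits” puzzle. The tampering demo at the end changes an amount in block 1, and validation notices because the Merkle root recomputed from the transactions no longer matches the one committed in the header. Had the attacker also patched the stored merkleRoot, the block’s hash would change, breaking its proof-of-work and block 2’s previousHash.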

That is, in the most reduced form possible, a blockchain.

Things to Try With It

This is where a toy chain earns its keep. Each extension teaches you a real concept:

  • Bump the difficulty. Change difficulty = 4 to 5, 6, 7. Observe how mining time grows roughly 16× per step. Welcome to exponential difficulty.
  • Add ECDSA signatures. Give each participant a secp256k1 keypair. Require every transaction to carry a signature over {from, to, amount, nonce}. Verify on submission. You’ve now added cryptographic authorisation.
  • Add balances. Maintain an account state map. Reject transactions whose sender doesn’t have enough balance or whose nonce doesn’t line up. You’ve now added validation-beyond-cryptography.
  • Double-spend attack. Without nonce checks, submit the same transaction twice, mine both into separate blocks. Add the nonce check to see how it’s prevented.
  • Fork handling. Run two instances, mine in parallel, then exchange blocks. Implement the “longest valid chain wins” rule. Congratulations — you’ve implemented consensus.
  • Light-client proof. Given a transaction and a block, compute and verify a Merkle inclusion proof (the log₂(N) sibling hashes). That’s how SPV wallets work.
  • P2P gossip. Expose /blocks and /submit HTTP endpoints; have nodes poll each other. Not production-grade, but enough to see gossip in action.
  • Swap consensus. Replace proof-of-work with a round-robin signer set (a toy PoA). Notice that everything else stays the same — consensus is a module.
  • Smart contracts. The heavy lift. Add a tiny VM that interprets a minimal bytecode over account state, and include the post-execution state root in each block. This is the leap from Bitcoin-style chains to Ethereum-style chains.
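As a starting point for the signature extension in the list above, here is a minimal sketch using Node's built-in `node:crypto` module. The helper names `payload`, `signTx`, and `verifyTx` are invented for this example, and the link between the `from` field and the key is assumed rather than enforced; a real chain would derive the address from the public key.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

interface Tx {
  from: string;
  to: string;
  amount: number;
  nonce: number;
}

// Fixed field order so signer and verifier hash identical bytes.
const payload = (tx: Tx): Buffer =>
  Buffer.from(JSON.stringify([tx.from, tx.to, tx.amount, tx.nonce]));

// ECDSA over the SHA-256 digest of the payload.
const signTx = (tx: Tx, privateKey: KeyObject): Buffer =>
  sign("sha256", payload(tx), privateKey);

const verifyTx = (tx: Tx, signature: Buffer, publicKey: KeyObject): boolean =>
  verify("sha256", payload(tx), publicKey, signature);

// Assumption for the demo: this keypair belongs to the account called "alice".
const alice = generateKeyPairSync("ec", { namedCurve: "secp256k1" });

const tx: Tx = { from: "alice", to: "bob", amount: 10, nonce: 1 };
const sig = signTx(tx, alice.privateKey);

console.log(verifyTx(tx, sig, alice.publicKey));                     // true
console.log(verifyTx({ ...tx, amount: 999 }, sig, alice.publicKey)); // false
```

In the toy chain, `submit` would call something like `verifyTx` before pushing to the mempool.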

Even doing two or three of these teaches more than most online courses.
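The balance and nonce checks from the list above can be prototyped separately from the chain before wiring them into `submit`. The following is a rough sketch under stated assumptions: `Account`, `State`, `accountOf`, and `applyTx` are invented names, amounts are whole numbers, and each account's nonce is the last one it used, so the next valid transaction must carry `nonce + 1`.

```typescript
interface Tx {
  from: string;
  to: string;
  amount: number;
  nonce: number;
}

interface Account {
  balance: number;
  nonce: number; // last nonce this account used; 0 means none yet
}

type State = Map<string, Account>;

// Unknown accounts start empty.
const accountOf = (state: State, who: string): Account =>
  state.get(who) ?? { balance: 0, nonce: 0 };

// Returns an error string, or null after applying the transfer to `state`.
function applyTx(state: State, tx: Tx): string | null {
  if (!Number.isInteger(tx.amount) || tx.amount <= 0) return "bad amount";
  const sender = accountOf(state, tx.from);
  if (tx.nonce !== sender.nonce + 1) return "bad nonce";
  if (sender.balance < tx.amount) return "insufficient balance";
  state.set(tx.from, { balance: sender.balance - tx.amount, nonce: tx.nonce });
  const receiver = accountOf(state, tx.to); // read after the debit so self-transfers stay consistent
  state.set(tx.to, { ...receiver, balance: receiver.balance + tx.amount });
  return null;
}

const state: State = new Map([["alice", { balance: 12, nonce: 0 }]]);
const pay: Tx = { from: "alice", to: "bob", amount: 10, nonce: 1 };

console.log(applyTx(state, pay)); // null (accepted)
console.log(applyTx(state, pay)); // "bad nonce" (replay of the same transaction)
console.log(applyTx(state, { from: "alice", to: "bob", amount: 5, nonce: 2 })); // "insufficient balance"
console.log(state.get("bob")?.balance); // 10
```

The second call is the replay case from the double-spend exercise: the transaction is byte-for-byte identical, and the nonce check alone rejects it.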

When to Use a Real Blockchain

Most problems described as “we need a blockchain” are actually solved by:

  • Postgres + an audit log table, or
  • An append-only S3 bucket, or
  • A signed-notary service, or
  • An existing database with better access controls.

The legitimate blockchain use cases share a pattern: multiple distrusting parties needing to agree on shared state without granting one of them authority.

Real fits:

  • Public permissionless cryptocurrencies. Bitcoin, Ethereum, and their descendants. The canonical case.
  • Cross-organisation coordination (consortium chains): trade finance, supply-chain provenance, inter-bank settlement — where no party will accept another as the source of truth.
  • Tamper-evident audit trails across administrative boundaries: medical records across hospitals, legal document notarisation, license registries.
  • Token economies / NFTs, where the protocol’s native asset layer is the point.

Non-fits (no matter how hard the proposal tries):

  • Internal audit logs. A signed append-only log inside your system is faster, simpler, and as auditable.
  • Database replacements. Blockchains are optimised for trust minimisation, not queries.
  • “Private data on the blockchain” (privacy-preserving smart contracts are an active research area; assume it won’t work until proven otherwise for your case).
  • High-throughput transactional systems. Even fast modern chains top out at thousands of TPS; a well-tuned RDBMS can sustain tens to hundreds of thousands.

The right question isn’t “could this use a blockchain?” It’s “who are the distrusting parties, and what happens if they keep using a database?” If you can’t answer the first half cleanly, your project is probably a database problem in disguise.

Further Reading

  1. The Bitcoin whitepaper — 9 pages. Read it. It’s still the clearest description of the core idea.
  2. The Ethereum yellow paper — heavier. Skim to understand the state-machine framing; the formalism is optional the first time.
  3. Mastering Bitcoin (Antonopoulos) and Mastering Ethereum (Antonopoulos & Wood) — the canonical book-length treatments.
  4. Vitalik’s blog — long posts on scaling, rollups, cryptography, governance. Best single archive for the “why” of modern Ethereum design.
  5. The libp2p specs — if you want to understand real blockchain P2P networking.
  6. The go-ethereum source — when you want to see a production node’s anatomy.
  7. And the earlier deep-dive on Kafka on this site for a comparison point — Kafka is also an append-only log, and the comparison sharpens what’s actually unique to blockchain.

The same rule applies as with the Kafka and Redis intros: you’ll understand blockchains ten times faster by running a tiny one, breaking it, and patching it than by reading another overview. The ~200 lines above exist so you can do exactly that.

Written by Palakorn Voramongkol

Software Engineer Specialist with 20+ years of experience. Writing about architecture, performance, and building production systems.
