All articles
Database Architecture PostgreSQL Data Engineering

Database Diagrams: A Practical Guide to Schema Design and Documentation

Palakorn Voramongkol
April 27, 2025 15 min read

“A comprehensive guide to database diagrams — covering ERD notations (Chen, Crow's Foot, UML), drawing with Mermaid and PlantUML, real-world schema examples, and best practices for documenting database designs.”

Database Diagrams: A Practical Guide to Schema Design and Documentation

A database diagram is worth a thousand words of prose documentation. When a new engineer joins the team, a well-drawn schema diagram communicates the data model faster than any wiki page. When you are designing a new feature, a quick ERD on a whiteboard surfaces inconsistencies before a single line of SQL is written. When an incident strikes at 3 AM, a clear diagram of foreign-key relationships is the difference between a fast fix and an hour of guessing.

This guide covers everything from the foundational notation systems to practical tooling — with real diagrams using Mermaid and PlantUML that you can drop straight into your documentation.

Why Database Diagrams Matter

Database diagrams serve three critical functions in a software team’s workflow.

Communication: A diagram makes implicit assumptions explicit. Does an order always have a user, or can guest checkouts exist? Can a product belong to more than one category? These questions are immediately visible in a diagram and frequently missed in prose documentation.

Onboarding: New team members spend days or weeks building a mental model of the data layer. A single comprehensive diagram collapses that time dramatically. Engineers who understand the data model write better queries, design better APIs, and introduce fewer data-consistency bugs.

Design Reviews: Reviewing a schema diagram during planning catches problems — missing foreign keys, ambiguous relationships, normalization issues — before they are baked into migrations that are painful to undo.

Types of Database Diagrams

Entity-Relationship Diagrams (ERDs)

An ERD is the most common type of database diagram. It models entities (tables), their attributes (columns), and the relationships between them. ERDs exist at two levels:

  • Conceptual ERD: Entities and relationships only — no column types, no keys. Used for high-level discussions with non-technical stakeholders.
  • Logical ERD: Adds attribute names and data types, primary keys, and foreign keys — but remains database-agnostic.

Physical Database Diagrams

A physical diagram is the logical ERD translated to a specific database engine. It shows exact column types (VARCHAR(255), BIGINT, TIMESTAMPTZ), indexes, constraints (NOT NULL, UNIQUE, CHECK), and engine-specific features like partitioning or tablespaces. This is what you generate from a tool like DBeaver or pgAdmin by reverse-engineering a live database.

Logical vs Physical Models

AspectLogical ModelPhysical Model
AudienceArchitects, analystsDBAs, backend engineers
Data typesGeneric (String, Integer)Engine-specific (TEXT, INT8)
KeysConceptual PK/FKActual index definitions
PortabilityDatabase-agnosticEngine-specific
Use caseDesign, communicationImplementation, migration

Standard Notations

Chen Notation

Peter Chen introduced the original ERD notation in 1976. In Chen notation:

  • Rectangles represent entities (e.g., USER, ORDER)
  • Diamonds represent relationships (e.g., PLACES)
  • Ovals represent attributes (e.g., email, created_at)
  • Double rectangles represent weak entities (entities that cannot exist without a parent)
  • Double diamonds represent identifying relationships

Chen notation is great for academia and high-level conceptual modeling. In practice, it becomes unwieldy for schemas with more than 10–15 entities because the diamond shapes and connecting lines create visual clutter.

Crow’s Foot / IE Notation

Crow’s Foot (also called Information Engineering notation) is the dominant standard in professional database tools. It places relationship cardinality symbols directly on the connecting lines:

  • || — exactly one
  • |< — one or many (the “crow’s foot”)
  • o| — zero or one
  • o< — zero or many

Crow’s Foot diagrams look cleaner than Chen notation for large schemas. Every tool you will encounter in practice — DBeaver, pgAdmin, DataGrip, dbdiagram.io — uses Crow’s Foot by default.

UML Class Diagrams for Databases

UML class diagrams are borrowed from object-oriented design but work well for database modeling. Tables become classes, columns become attributes, and relationships are expressed with UML multiplicity notation (1, *, 0..1, 1..*). If your team already uses UML for architecture diagrams, using it for the data layer keeps everything consistent.

IDEF1X Notation

IDEF1X (Integration Definition for Information Modeling) is a formal notation used in government and aerospace systems. It distinguishes between identifying relationships (where a child entity’s primary key includes the parent’s primary key) and non-identifying relationships. You will rarely encounter IDEF1X outside of enterprise or regulated-industry contexts.

Drawing ERDs with Mermaid

Mermaid’s erDiagram syntax renders directly in GitHub Markdown, GitLab, Notion, and most modern documentation platforms. It uses Crow’s Foot notation automatically.

Basic Relationships: One-to-Many

erDiagram
    USER {
        bigint id PK
        varchar email UK
        varchar name
        timestamp created_at
    }
    ORDER {
        bigint id PK
        bigint user_id FK
        varchar status
        decimal total_amount
        timestamp created_at
    }
    ADDRESS {
        bigint id PK
        bigint user_id FK
        varchar street
        varchar city
        varchar country
    }

    USER ||--o{ ORDER : "places"
    USER ||--o{ ADDRESS : "has"

The ||--o{ syntax reads: the left entity has exactly one instance that relates to zero or many instances on the right — a classic one-to-many.

Many-to-Many with a Junction Table

Many-to-many relationships always require an intermediate (junction) table in a relational database. The ERD should make this explicit:

erDiagram
    PRODUCT {
        bigint id PK
        varchar name
        text description
        decimal price
        bigint category_id FK
    }
    TAG {
        bigint id PK
        varchar name UK
        varchar slug UK
    }
    PRODUCT_TAG {
        bigint product_id FK
        bigint tag_id FK
    }
    CATEGORY {
        bigint id PK
        varchar name UK
        bigint parent_id FK
    }

    PRODUCT }o--o{ TAG : "tagged with"
    PRODUCT_TAG }|--|| PRODUCT : "belongs to"
    PRODUCT_TAG }|--|| TAG : "references"
    PRODUCT }o--|| CATEGORY : "belongs to"
    CATEGORY ||--o{ CATEGORY : "parent of"

Complete E-Commerce Schema

This diagram shows a production-grade e-commerce data model with users, orders, products, and the order line items that join them:

erDiagram
    USER {
        bigint id PK
        varchar email UK
        varchar name
        varchar password_hash
        boolean is_active
        timestamp created_at
        timestamp updated_at
    }
    ADDRESS {
        bigint id PK
        bigint user_id FK
        varchar street
        varchar city
        varchar state
        varchar country
        varchar postal_code
        boolean is_default
    }
    PRODUCT {
        bigint id PK
        varchar sku UK
        varchar name
        text description
        decimal price
        int stock_quantity
        bigint category_id FK
        timestamp created_at
    }
    CATEGORY {
        bigint id PK
        varchar name UK
        bigint parent_id FK
    }
    ORDER {
        bigint id PK
        bigint user_id FK
        bigint shipping_address_id FK
        varchar status
        decimal subtotal
        decimal tax
        decimal total
        timestamp created_at
        timestamp updated_at
    }
    ORDER_ITEM {
        bigint id PK
        bigint order_id FK
        bigint product_id FK
        int quantity
        decimal unit_price
        decimal line_total
    }
    PAYMENT {
        bigint id PK
        bigint order_id FK
        varchar provider
        varchar status
        decimal amount
        varchar transaction_ref
        timestamp paid_at
    }

    USER ||--o{ ADDRESS : "has"
    USER ||--o{ ORDER : "places"
    CATEGORY ||--o{ PRODUCT : "contains"
    CATEGORY ||--o{ CATEGORY : "parent of"
    ORDER }o--|| ADDRESS : "ships to"
    ORDER ||--o{ ORDER_ITEM : "contains"
    ORDER ||--o{ PAYMENT : "paid via"
    PRODUCT ||--o{ ORDER_ITEM : "included in"

Drawing with PlantUML

PlantUML’s !define syntax produces publication-quality diagrams and integrates with Confluence, IntelliJ IDEA, and VS Code. For database schemas, use the entity keyword.

Blog Platform Schema

@startuml blog-schema
!define TABLE(name,desc) class name as "desc" << (T,#FFAAAA) >>
!define PK(x) <b><color:#b8861b>x</color></b>
!define FK(x) <color:#aaaaaa>x</color>
hide methods
hide stereotypes

TABLE(users, "users") {
  PK(id) BIGSERIAL
  --
  email VARCHAR(255) UNIQUE NOT NULL
  username VARCHAR(100) UNIQUE NOT NULL
  password_hash VARCHAR(255) NOT NULL
  display_name VARCHAR(200)
  bio TEXT
  avatar_url TEXT
  is_active BOOLEAN DEFAULT true
  created_at TIMESTAMPTZ NOT NULL
  updated_at TIMESTAMPTZ NOT NULL
}

TABLE(posts, "posts") {
  PK(id) BIGSERIAL
  --
  FK(author_id) BIGINT NOT NULL
  title VARCHAR(500) NOT NULL
  slug VARCHAR(500) UNIQUE NOT NULL
  excerpt TEXT
  content TEXT NOT NULL
  status VARCHAR(20) DEFAULT 'draft'
  published_at TIMESTAMPTZ
  view_count INT DEFAULT 0
  created_at TIMESTAMPTZ NOT NULL
  updated_at TIMESTAMPTZ NOT NULL
}

TABLE(comments, "comments") {
  PK(id) BIGSERIAL
  --
  FK(post_id) BIGINT NOT NULL
  FK(author_id) BIGINT NOT NULL
  FK(parent_id) BIGINT
  content TEXT NOT NULL
  is_approved BOOLEAN DEFAULT false
  created_at TIMESTAMPTZ NOT NULL
}

TABLE(tags, "tags") {
  PK(id) BIGSERIAL
  --
  name VARCHAR(100) UNIQUE NOT NULL
  slug VARCHAR(100) UNIQUE NOT NULL
  description TEXT
}

TABLE(post_tags, "post_tags") {
  FK(post_id) BIGINT NOT NULL
  FK(tag_id) BIGINT NOT NULL
  --
  created_at TIMESTAMPTZ NOT NULL
}

users "1" --> "0..*" posts : author_id
users "1" --> "0..*" comments : author_id
posts "1" --> "0..*" comments : post_id
comments "0..1" --> "0..*" comments : parent_id
posts "1" --> "0..*" post_tags : post_id
tags "1" --> "0..*" post_tags : tag_id

@enduml

E-Commerce Physical Schema with Indexes

@startuml ecommerce-physical
!define TABLE(name,desc) class name as "desc" << (T,#AADDFF) >>
!define PK(x) <b><color:#b8861b>x</color></b>
!define FK(x) <color:#888888>x</color>
!define IDX(x) <color:#336699><i>x</i></color>
hide methods
hide stereotypes

TABLE(users, "users") {
  PK(id) BIGSERIAL PRIMARY KEY
  --
  email VARCHAR(255) UNIQUE NOT NULL
  password_hash VARCHAR(255) NOT NULL
  created_at TIMESTAMPTZ DEFAULT now()
  ==indexes==
  IDX(idx_users_email) ON (email)
}

TABLE(orders, "orders") {
  PK(id) BIGSERIAL PRIMARY KEY
  --
  FK(user_id) BIGINT NOT NULL
  status VARCHAR(30) NOT NULL DEFAULT 'pending'
  total NUMERIC(12,2) NOT NULL
  created_at TIMESTAMPTZ DEFAULT now()
  ==indexes==
  IDX(idx_orders_user_id) ON (user_id)
  IDX(idx_orders_status) ON (status)
  IDX(idx_orders_created_at) ON (created_at DESC)
}

TABLE(products, "products") {
  PK(id) BIGSERIAL PRIMARY KEY
  --
  FK(category_id) BIGINT
  sku VARCHAR(100) UNIQUE NOT NULL
  name VARCHAR(500) NOT NULL
  price NUMERIC(10,2) NOT NULL
  stock_qty INT NOT NULL DEFAULT 0
  ==indexes==
  IDX(idx_products_sku) ON (sku)
  IDX(idx_products_category) ON (category_id)
}

TABLE(order_items, "order_items") {
  PK(id) BIGSERIAL PRIMARY KEY
  --
  FK(order_id) BIGINT NOT NULL
  FK(product_id) BIGINT NOT NULL
  quantity INT NOT NULL
  unit_price NUMERIC(10,2) NOT NULL
  line_total NUMERIC(12,2) GENERATED ALWAYS AS (quantity * unit_price) STORED
  ==indexes==
  IDX(idx_oi_order_id) ON (order_id)
  IDX(idx_oi_product_id) ON (product_id)
}

users "1" --> "0..*" orders : user_id
orders "1" --> "1..*" order_items : order_id
products "1" --> "0..*" order_items : product_id

@enduml

Real-World Example: Blog Platform Schema

Let us design a complete blog platform from scratch. The requirements are straightforward:

  • Users write posts; posts have slugs for SEO-friendly URLs
  • Posts can be commented on; comments are threaded (parent/child)
  • Posts are tagged; tags are reusable across posts
  • All timestamps use timezone-aware types

ERD in Mermaid

erDiagram
    USERS {
        bigint id PK
        varchar email UK
        varchar username UK
        varchar password_hash
        varchar display_name
        text bio
        text avatar_url
        boolean is_active
        timestamptz created_at
        timestamptz updated_at
    }
    POSTS {
        bigint id PK
        bigint author_id FK
        varchar title
        varchar slug UK
        text excerpt
        text content
        varchar status
        timestamptz published_at
        int view_count
        timestamptz created_at
        timestamptz updated_at
    }
    COMMENTS {
        bigint id PK
        bigint post_id FK
        bigint author_id FK
        bigint parent_id FK
        text content
        boolean is_approved
        timestamptz created_at
    }
    TAGS {
        bigint id PK
        varchar name UK
        varchar slug UK
        text description
    }
    POST_TAGS {
        bigint post_id FK
        bigint tag_id FK
        timestamptz created_at
    }

    USERS ||--o{ POSTS : "writes"
    USERS ||--o{ COMMENTS : "authors"
    POSTS ||--o{ COMMENTS : "receives"
    COMMENTS ||--o{ COMMENTS : "parent of"
    POSTS ||--o{ POST_TAGS : "has"
    TAGS ||--o{ POST_TAGS : "used in"

SQL DDL in Three Dialects

-- PostgreSQL
CREATE TABLE users (
  id          BIGSERIAL PRIMARY KEY,
  email       VARCHAR(255) NOT NULL,
  username    VARCHAR(100) NOT NULL,
  password_hash VARCHAR(255) NOT NULL,
  display_name  VARCHAR(200),
  bio         TEXT,
  avatar_url  TEXT,
  is_active   BOOLEAN NOT NULL DEFAULT true,
  created_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
  updated_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
  CONSTRAINT uq_users_email    UNIQUE (email),
  CONSTRAINT uq_users_username UNIQUE (username)
);

CREATE TABLE posts (
  id           BIGSERIAL PRIMARY KEY,
  author_id    BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
  title        VARCHAR(500) NOT NULL,
  slug         VARCHAR(500) NOT NULL,
  excerpt      TEXT,
  content      TEXT NOT NULL,
  status       VARCHAR(20) NOT NULL DEFAULT 'draft'
                 CHECK (status IN ('draft','published','archived')),
  published_at TIMESTAMPTZ,
  view_count   INT NOT NULL DEFAULT 0,
  created_at   TIMESTAMPTZ NOT NULL DEFAULT now(),
  updated_at   TIMESTAMPTZ NOT NULL DEFAULT now(),
  CONSTRAINT uq_posts_slug UNIQUE (slug)
);

CREATE TABLE comments (
  id          BIGSERIAL PRIMARY KEY,
  post_id     BIGINT NOT NULL REFERENCES posts(id) ON DELETE CASCADE,
  author_id   BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
  parent_id   BIGINT REFERENCES comments(id) ON DELETE CASCADE,
  content     TEXT NOT NULL,
  is_approved BOOLEAN NOT NULL DEFAULT false,
  created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE TABLE tags (
  id          BIGSERIAL PRIMARY KEY,
  name        VARCHAR(100) NOT NULL,
  slug        VARCHAR(100) NOT NULL,
  description TEXT,
  CONSTRAINT uq_tags_name UNIQUE (name),
  CONSTRAINT uq_tags_slug UNIQUE (slug)
);

CREATE TABLE post_tags (
  post_id    BIGINT NOT NULL REFERENCES posts(id) ON DELETE CASCADE,
  tag_id     BIGINT NOT NULL REFERENCES tags(id) ON DELETE CASCADE,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
  PRIMARY KEY (post_id, tag_id)
);

-- Indexes
CREATE INDEX idx_posts_author_id    ON posts(author_id);
CREATE INDEX idx_posts_status       ON posts(status);
CREATE INDEX idx_posts_published_at ON posts(published_at DESC)
  WHERE status = 'published';
CREATE INDEX idx_comments_post_id   ON comments(post_id);
CREATE INDEX idx_comments_parent_id ON comments(parent_id);
-- MySQL
CREATE TABLE users (
  id            BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  email         VARCHAR(255) NOT NULL,
  username      VARCHAR(100) NOT NULL,
  password_hash VARCHAR(255) NOT NULL,
  display_name  VARCHAR(200),
  bio           TEXT,
  avatar_url    TEXT,
  is_active     TINYINT(1) NOT NULL DEFAULT 1,
  created_at    DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
  updated_at    DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6)
                  ON UPDATE CURRENT_TIMESTAMP(6),
  UNIQUE KEY uq_users_email    (email),
  UNIQUE KEY uq_users_username (username)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

CREATE TABLE posts (
  id           BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  author_id    BIGINT UNSIGNED NOT NULL,
  title        VARCHAR(500) NOT NULL,
  slug         VARCHAR(500) NOT NULL,
  excerpt      TEXT,
  content      LONGTEXT NOT NULL,
  status       ENUM('draft','published','archived') NOT NULL DEFAULT 'draft',
  published_at DATETIME(6),
  view_count   INT UNSIGNED NOT NULL DEFAULT 0,
  created_at   DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
  updated_at   DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6)
                 ON UPDATE CURRENT_TIMESTAMP(6),
  UNIQUE KEY uq_posts_slug (slug),
  KEY idx_posts_author_id    (author_id),
  KEY idx_posts_status       (status),
  KEY idx_posts_published_at (published_at),
  CONSTRAINT fk_posts_author FOREIGN KEY (author_id)
    REFERENCES users(id) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

CREATE TABLE comments (
  id          BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  post_id     BIGINT UNSIGNED NOT NULL,
  author_id   BIGINT UNSIGNED NOT NULL,
  parent_id   BIGINT UNSIGNED,
  content     TEXT NOT NULL,
  is_approved TINYINT(1) NOT NULL DEFAULT 0,
  created_at  DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
  KEY idx_comments_post_id   (post_id),
  KEY idx_comments_parent_id (parent_id),
  CONSTRAINT fk_comments_post   FOREIGN KEY (post_id)   REFERENCES posts(id)    ON DELETE CASCADE,
  CONSTRAINT fk_comments_author FOREIGN KEY (author_id) REFERENCES users(id)    ON DELETE CASCADE,
  CONSTRAINT fk_comments_parent FOREIGN KEY (parent_id) REFERENCES comments(id) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

CREATE TABLE tags (
  id          BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  name        VARCHAR(100) NOT NULL,
  slug        VARCHAR(100) NOT NULL,
  description TEXT,
  UNIQUE KEY uq_tags_name (name),
  UNIQUE KEY uq_tags_slug (slug)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

CREATE TABLE post_tags (
  post_id    BIGINT UNSIGNED NOT NULL,
  tag_id     BIGINT UNSIGNED NOT NULL,
  created_at DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
  PRIMARY KEY (post_id, tag_id),
  CONSTRAINT fk_pt_post FOREIGN KEY (post_id) REFERENCES posts(id) ON DELETE CASCADE,
  CONSTRAINT fk_pt_tag  FOREIGN KEY (tag_id)  REFERENCES tags(id)  ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
-- SQLite
CREATE TABLE users (
  id            INTEGER PRIMARY KEY AUTOINCREMENT,
  email         TEXT    NOT NULL UNIQUE,
  username      TEXT    NOT NULL UNIQUE,
  password_hash TEXT    NOT NULL,
  display_name  TEXT,
  bio           TEXT,
  avatar_url    TEXT,
  is_active     INTEGER NOT NULL DEFAULT 1,
  created_at    TEXT    NOT NULL DEFAULT (datetime('now','utc')),
  updated_at    TEXT    NOT NULL DEFAULT (datetime('now','utc'))
);

CREATE TABLE posts (
  id           INTEGER PRIMARY KEY AUTOINCREMENT,
  author_id    INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
  title        TEXT    NOT NULL,
  slug         TEXT    NOT NULL UNIQUE,
  excerpt      TEXT,
  content      TEXT    NOT NULL,
  status       TEXT    NOT NULL DEFAULT 'draft'
                 CHECK (status IN ('draft','published','archived')),
  published_at TEXT,
  view_count   INTEGER NOT NULL DEFAULT 0,
  created_at   TEXT    NOT NULL DEFAULT (datetime('now','utc')),
  updated_at   TEXT    NOT NULL DEFAULT (datetime('now','utc'))
);

CREATE TABLE comments (
  id          INTEGER PRIMARY KEY AUTOINCREMENT,
  post_id     INTEGER NOT NULL REFERENCES posts(id)    ON DELETE CASCADE,
  author_id   INTEGER NOT NULL REFERENCES users(id)    ON DELETE CASCADE,
  parent_id   INTEGER          REFERENCES comments(id) ON DELETE CASCADE,
  content     TEXT    NOT NULL,
  is_approved INTEGER NOT NULL DEFAULT 0,
  created_at  TEXT    NOT NULL DEFAULT (datetime('now','utc'))
);

CREATE TABLE tags (
  id          INTEGER PRIMARY KEY AUTOINCREMENT,
  name        TEXT NOT NULL UNIQUE,
  slug        TEXT NOT NULL UNIQUE,
  description TEXT
);

CREATE TABLE post_tags (
  post_id    INTEGER NOT NULL REFERENCES posts(id) ON DELETE CASCADE,
  tag_id     INTEGER NOT NULL REFERENCES tags(id)  ON DELETE CASCADE,
  created_at TEXT    NOT NULL DEFAULT (datetime('now','utc')),
  PRIMARY KEY (post_id, tag_id)
);

CREATE INDEX idx_posts_author_id ON posts(author_id);
CREATE INDEX idx_posts_status    ON posts(status);
CREATE INDEX idx_comments_post   ON comments(post_id);

ORM Model Definitions

The same blog schema expressed as ORM models across the four most common stacks:

// Prisma (TypeScript)
// schema.prisma

model User {
  id           BigInt    @id @default(autoincrement())
  email        String    @unique @db.VarChar(255)
  username     String    @unique @db.VarChar(100)
  passwordHash String    @map("password_hash") @db.VarChar(255)
  displayName  String?   @map("display_name") @db.VarChar(200)
  bio          String?
  avatarUrl    String?   @map("avatar_url")
  isActive     Boolean   @default(true) @map("is_active")
  createdAt    DateTime  @default(now()) @map("created_at") @db.Timestamptz
  updatedAt    DateTime  @updatedAt @map("updated_at") @db.Timestamptz

  posts    Post[]
  comments Comment[]

  @@map("users")
}

model Post {
  id          BigInt    @id @default(autoincrement())
  authorId    BigInt    @map("author_id")
  title       String    @db.VarChar(500)
  slug        String    @unique @db.VarChar(500)
  excerpt     String?
  content     String
  status      String    @default("draft") @db.VarChar(20)
  publishedAt DateTime? @map("published_at") @db.Timestamptz
  viewCount   Int       @default(0) @map("view_count")
  createdAt   DateTime  @default(now()) @map("created_at") @db.Timestamptz
  updatedAt   DateTime  @updatedAt @map("updated_at") @db.Timestamptz

  author   User      @relation(fields: [authorId], references: [id], onDelete: Cascade)
  comments Comment[]
  tags     PostTag[]

  @@index([authorId])
  @@index([status])
  @@map("posts")
}

model Comment {
  id         BigInt   @id @default(autoincrement())
  postId     BigInt   @map("post_id")
  authorId   BigInt   @map("author_id")
  parentId   BigInt?  @map("parent_id")
  content    String
  isApproved Boolean  @default(false) @map("is_approved")
  createdAt  DateTime @default(now()) @map("created_at") @db.Timestamptz

  post     Post      @relation(fields: [postId], references: [id], onDelete: Cascade)
  author   User      @relation(fields: [authorId], references: [id], onDelete: Cascade)
  parent   Comment?  @relation("CommentThread", fields: [parentId], references: [id])
  children Comment[] @relation("CommentThread")

  @@index([postId])
  @@map("comments")
}

model Tag {
  id          BigInt    @id @default(autoincrement())
  name        String    @unique @db.VarChar(100)
  slug        String    @unique @db.VarChar(100)
  description String?

  posts PostTag[]

  @@map("tags")
}

model PostTag {
  postId    BigInt   @map("post_id")
  tagId     BigInt   @map("tag_id")
  createdAt DateTime @default(now()) @map("created_at") @db.Timestamptz

  post Post @relation(fields: [postId], references: [id], onDelete: Cascade)
  tag  Tag  @relation(fields: [tagId], references: [id], onDelete: Cascade)

  @@id([postId, tagId])
  @@map("post_tags")
}
// JPA / Hibernate (Java/Spring Boot)

@Entity
@Table(name = "users")
public class User {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    @Column(nullable = false, unique = true, length = 255)
    private String email;

    @Column(nullable = false, unique = true, length = 100)
    private String username;

    @Column(name = "password_hash", nullable = false, length = 255)
    private String passwordHash;

    @Column(name = "display_name", length = 200)
    private String displayName;

    @Column(columnDefinition = "TEXT")
    private String bio;

    @Column(name = "avatar_url", columnDefinition = "TEXT")
    private String avatarUrl;

    @Column(name = "is_active", nullable = false)
    private boolean isActive = true;

    @Column(name = "created_at", nullable = false,
            columnDefinition = "TIMESTAMPTZ DEFAULT now()")
    private OffsetDateTime createdAt;

    @Column(name = "updated_at", nullable = false,
            columnDefinition = "TIMESTAMPTZ DEFAULT now()")
    private OffsetDateTime updatedAt;

    @OneToMany(mappedBy = "author", cascade = CascadeType.ALL, orphanRemoval = true)
    private List<Post> posts = new ArrayList<>();
}

@Entity
@Table(name = "posts", indexes = {
    @Index(name = "idx_posts_author_id", columnList = "author_id"),
    @Index(name = "idx_posts_status",    columnList = "status")
})
public class Post {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    @ManyToOne(fetch = FetchType.LAZY, optional = false)
    @JoinColumn(name = "author_id", nullable = false)
    private User author;

    @Column(nullable = false, length = 500)
    private String title;

    @Column(nullable = false, unique = true, length = 500)
    private String slug;

    @Column(columnDefinition = "TEXT")
    private String excerpt;

    @Column(nullable = false, columnDefinition = "TEXT")
    private String content;

    @Enumerated(EnumType.STRING)
    @Column(nullable = false, length = 20)
    private PostStatus status = PostStatus.DRAFT;

    @Column(name = "published_at",
            columnDefinition = "TIMESTAMPTZ")
    private OffsetDateTime publishedAt;

    @Column(name = "view_count", nullable = false)
    private int viewCount = 0;

    @OneToMany(mappedBy = "post", cascade = CascadeType.ALL, orphanRemoval = true)
    private List<Comment> comments = new ArrayList<>();

    @OneToMany(mappedBy = "post", cascade = CascadeType.ALL, orphanRemoval = true)
    private List<PostTag> tags = new ArrayList<>();
}
# SQLAlchemy 2.x (Python)

from __future__ import annotations
from datetime import datetime
from typing import Optional
from sqlalchemy import (
    BigInteger, Boolean, Column, DateTime, ForeignKey,
    Index, Integer, String, Text, UniqueConstraint, func
)
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship


class Base(DeclarativeBase):
    pass


class User(Base):
    __tablename__ = "users"

    id:            Mapped[int]           = mapped_column(BigInteger, primary_key=True, autoincrement=True)
    email:         Mapped[str]           = mapped_column(String(255), nullable=False, unique=True)
    username:      Mapped[str]           = mapped_column(String(100), nullable=False, unique=True)
    password_hash: Mapped[str]           = mapped_column(String(255), nullable=False)
    display_name:  Mapped[Optional[str]] = mapped_column(String(200))
    bio:           Mapped[Optional[str]] = mapped_column(Text)
    avatar_url:    Mapped[Optional[str]] = mapped_column(Text)
    is_active:     Mapped[bool]          = mapped_column(Boolean, nullable=False, default=True)
    created_at:    Mapped[datetime]      = mapped_column(DateTime(timezone=True), server_default=func.now())
    updated_at:    Mapped[datetime]      = mapped_column(DateTime(timezone=True), server_default=func.now(), onupdate=func.now())

    posts:    Mapped[list["Post"]]    = relationship(back_populates="author", cascade="all, delete-orphan")
    comments: Mapped[list["Comment"]] = relationship(back_populates="author", cascade="all, delete-orphan")


class Post(Base):
    __tablename__ = "posts"
    __table_args__ = (
        UniqueConstraint("slug", name="uq_posts_slug"),
        Index("idx_posts_author_id", "author_id"),
        Index("idx_posts_status",    "status"),
    )

    id:           Mapped[int]            = mapped_column(BigInteger, primary_key=True, autoincrement=True)
    author_id:    Mapped[int]            = mapped_column(BigInteger, ForeignKey("users.id", ondelete="CASCADE"), nullable=False)
    title:        Mapped[str]            = mapped_column(String(500), nullable=False)
    slug:         Mapped[str]            = mapped_column(String(500), nullable=False)
    excerpt:      Mapped[Optional[str]]  = mapped_column(Text)
    content:      Mapped[str]            = mapped_column(Text, nullable=False)
    status:       Mapped[str]            = mapped_column(String(20), nullable=False, default="draft")
    published_at: Mapped[Optional[datetime]] = mapped_column(DateTime(timezone=True))
    view_count:   Mapped[int]            = mapped_column(Integer, nullable=False, default=0)
    created_at:   Mapped[datetime]       = mapped_column(DateTime(timezone=True), server_default=func.now())
    updated_at:   Mapped[datetime]       = mapped_column(DateTime(timezone=True), server_default=func.now(), onupdate=func.now())

    author:   Mapped["User"]       = relationship(back_populates="posts")
    comments: Mapped[list["Comment"]] = relationship(back_populates="post", cascade="all, delete-orphan")
    post_tags: Mapped[list["PostTag"]] = relationship(back_populates="post", cascade="all, delete-orphan")


class Comment(Base):
    __tablename__ = "comments"
    __table_args__ = (
        Index("idx_comments_post_id",   "post_id"),
        Index("idx_comments_parent_id", "parent_id"),
    )

    id:          Mapped[int]           = mapped_column(BigInteger, primary_key=True, autoincrement=True)
    post_id:     Mapped[int]           = mapped_column(BigInteger, ForeignKey("posts.id",    ondelete="CASCADE"), nullable=False)
    author_id:   Mapped[int]           = mapped_column(BigInteger, ForeignKey("users.id",    ondelete="CASCADE"), nullable=False)
    parent_id:   Mapped[Optional[int]] = mapped_column(BigInteger, ForeignKey("comments.id", ondelete="CASCADE"))
    content:     Mapped[str]           = mapped_column(Text, nullable=False)
    is_approved: Mapped[bool]          = mapped_column(Boolean, nullable=False, default=False)
    created_at:  Mapped[datetime]      = mapped_column(DateTime(timezone=True), server_default=func.now())

    post:     Mapped["Post"]          = relationship(back_populates="comments")
    author:   Mapped["User"]          = relationship(back_populates="comments")
    parent:   Mapped[Optional["Comment"]]  = relationship(remote_side="Comment.id", back_populates="children")
    children: Mapped[list["Comment"]] = relationship(back_populates="parent")


class Tag(Base):
    __tablename__ = "tags"

    id:          Mapped[int]           = mapped_column(BigInteger, primary_key=True, autoincrement=True)
    name:        Mapped[str]           = mapped_column(String(100), nullable=False, unique=True)
    slug:        Mapped[str]           = mapped_column(String(100), nullable=False, unique=True)
    description: Mapped[Optional[str]] = mapped_column(Text)

    post_tags: Mapped[list["PostTag"]] = relationship(back_populates="tag", cascade="all, delete-orphan")


class PostTag(Base):
    __tablename__ = "post_tags"

    post_id:    Mapped[int]      = mapped_column(BigInteger, ForeignKey("posts.id", ondelete="CASCADE"), primary_key=True)
    tag_id:     Mapped[int]      = mapped_column(BigInteger, ForeignKey("tags.id",  ondelete="CASCADE"), primary_key=True)
    created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), server_default=func.now())

    post: Mapped["Post"] = relationship(back_populates="post_tags")
    tag:  Mapped["Tag"]  = relationship(back_populates="post_tags")
// Entity Framework Core (C# / .NET)

public class User
{
    public long Id { get; set; }
    public string Email { get; set; } = null!;
    public string Username { get; set; } = null!;
    public string PasswordHash { get; set; } = null!;
    public string? DisplayName { get; set; }
    public string? Bio { get; set; }
    public string? AvatarUrl { get; set; }
    public bool IsActive { get; set; } = true;
    public DateTimeOffset CreatedAt { get; set; } = DateTimeOffset.UtcNow;
    public DateTimeOffset UpdatedAt { get; set; } = DateTimeOffset.UtcNow;

    public ICollection<Post> Posts { get; set; } = [];
    public ICollection<Comment> Comments { get; set; } = [];
}

public class Post
{
    public long Id { get; set; }
    public long AuthorId { get; set; }
    public string Title { get; set; } = null!;
    public string Slug { get; set; } = null!;
    public string? Excerpt { get; set; }
    public string Content { get; set; } = null!;
    public PostStatus Status { get; set; } = PostStatus.Draft;
    public DateTimeOffset? PublishedAt { get; set; }
    public int ViewCount { get; set; }
    public DateTimeOffset CreatedAt { get; set; } = DateTimeOffset.UtcNow;
    public DateTimeOffset UpdatedAt { get; set; } = DateTimeOffset.UtcNow;

    public User Author { get; set; } = null!;
    public ICollection<Comment> Comments { get; set; } = [];
    public ICollection<PostTag> PostTags { get; set; } = [];
}

public class Comment
{
    public long Id { get; set; }
    public long PostId { get; set; }
    public long AuthorId { get; set; }
    public long? ParentId { get; set; }
    public string Content { get; set; } = null!;
    public bool IsApproved { get; set; }
    public DateTimeOffset CreatedAt { get; set; } = DateTimeOffset.UtcNow;

    public Post Post { get; set; } = null!;
    public User Author { get; set; } = null!;
    public Comment? Parent { get; set; }
    public ICollection<Comment> Children { get; set; } = [];
}

// EF Core configuration (IEntityTypeConfiguration<T> pattern)
public class PostConfiguration : IEntityTypeConfiguration<Post>
{
    public void Configure(EntityTypeBuilder<Post> builder)
    {
        builder.ToTable("posts");
        builder.HasKey(p => p.Id);
        builder.Property(p => p.Title).HasMaxLength(500).IsRequired();
        builder.Property(p => p.Slug).HasMaxLength(500).IsRequired();
        builder.HasIndex(p => p.Slug).IsUnique();
        builder.HasIndex(p => p.AuthorId);
        builder.HasIndex(p => p.Status);
        builder.Property(p => p.Status).HasConversion<string>();

        builder.HasOne(p => p.Author)
               .WithMany(u => u.Posts)
               .HasForeignKey(p => p.AuthorId)
               .OnDelete(DeleteBehavior.Cascade);
    }
}

Best Practices

Naming Conventions

Consistency is more important than the specific convention you choose. Pick one and enforce it across every table:

  • Table names: lowercase snake_case, plural (users, order_items, not User or orderItem)
  • Column names: lowercase snake_case (created_at, user_id, not createdAt or UserId)
  • Primary keys: always id — not user_id in the users table, not pk_user
  • Foreign keys: {referenced_table_singular}_iduser_id, post_id, category_id
  • Boolean columns: prefix with is_, has_, or can_is_active, has_verified_email
  • Timestamps: created_at and updated_at on every table, always timezone-aware
  • Junction tables: combine both table names alphabetically — post_tags (not tag_posts or posts_to_tags)
  • Indexes: idx_{table}_{columns}idx_posts_author_id, idx_orders_status_created_at
  • Constraints: uq_{table}_{column}, fk_{table}_{referenced_table}, chk_{table}_{column}

When to Normalize vs Denormalize

Third Normal Form (3NF) is the right starting point for most OLTP schemas. Normalize until it hurts, then denormalize strategically.

Normalize when:

  • Data is written frequently (normalization reduces write amplification)
  • Data integrity is critical — you cannot afford inconsistencies
  • The data model is still evolving — normalized schemas are easier to change

Denormalize when:

  • Read performance is critical and the data is read far more than it is written
  • You need to avoid expensive joins across large tables (e.g., reporting dashboards)
  • You are using a JSONB column to store semi-structured data that varies per row
  • You have proven a normalization bottleneck with actual profiling data — not assumptions

A common pragmatic pattern: keep your write path fully normalized, and use materialized views or read replicas with denormalized structures for your read path.

Documenting Constraints

Constraints are not just database enforcement mechanisms — they are documentation. Every constraint tells a reader something about the business rules embedded in the schema:

  • NOT NULL says the field is required by the business domain
  • UNIQUE says no two rows can represent the same logical entity
  • CHECK says the value must satisfy a business rule
  • FOREIGN KEY says this entity cannot exist without its parent
  • DEFAULT says what value the system supplies when the application does not

Name your constraints explicitly rather than relying on auto-generated names. CONSTRAINT uq_users_email UNIQUE (email) is far easier to reference in error messages and migrations than users_email_key.

Version-Controlling Schema Changes

Treat migrations as first-class code artifacts:

  1. Use a migration tool — Flyway, Liquibase, Alembic, Prisma Migrate, or Django migrations. Never apply schema changes by hand to production.
  2. One change per migration — A migration that adds a column and creates an index is two logical changes. Keep them separate so rollbacks are precise.
  3. Write both up and down migrations — Even if you never roll back, the down migration documents the inverse operation and forces you to think it through.
  4. Never edit a committed migration — Committed migrations are immutable history. Fix forward with a new migration.
  5. Review migrations like code — Include schema changes in pull requests. A migration that drops a column with data deserves the same scrutiny as a security-sensitive code change.
  6. Store the diagram alongside the migrations — Keep a docs/schema.mmd or docs/schema.puml in your repository and update it as part of the same PR as every migration.

Tools Comparison

ToolBest ForStrengthsLimitations
dbdiagram.ioQuick documentation, sharingFast DBML syntax, beautiful output, shareable linksOnline-only, limited free tier
DBeaverMulti-database reverse engineeringSupports 80+ databases, ER diagram from live DB, free Community editionDiagram export can be finicky
pgAdminPostgreSQL-specific workBuilt-in ERD tool, query plan visualizer, tight PG integrationPostgreSQL only
DataGripPower users, IDE integrationExcellent schema visualization, smart migrations, multi-DBPaid (JetBrains subscription)
draw.io / diagrams.netCustom diagrams, presentationsFully free, offline, highly customizable, integrates with ConfluenceManual — no database reverse engineering
MermaidDocs-as-code, Git-nativeRenders in GitHub/GitLab/Notion, version-controlled, plain textLimited visual customization
PlantUMLEnterprise/Confluence docsRich layout control, CI/CD friendly, IntelliJ integrationRequires Java or server rendering
LucidchartTeam collaborationReal-time collaboration, database import, Jira/Confluence pluginsPaid

For most teams the pragmatic combination is: Mermaid for diagrams embedded in Git repositories (your README.md, ADRs, PR descriptions) and DBeaver or DataGrip for reverse-engineering and exploring live databases.

Final Thoughts

A database diagram that lives in your repository alongside the migration files is infinitely more valuable than one buried in a Confluence page that was last updated two years ago. The best diagram tooling for engineering teams is the one that treats the diagram as code — version-controlled, diffable, reviewable, and automatically rendered in the platforms your team already uses.

Start with a Mermaid erDiagram in your project’s docs/ folder today. Add a CI step that validates the Mermaid syntax. Make updating the diagram a required part of your schema change process. In six months, every engineer on your team will thank you — including your future self at 3 AM during an incident.

Comments powered by Giscus are not yet configured. Set PUBLIC_GISCUS_REPO_ID and PUBLIC_GISCUS_CATEGORY_ID in apps/web/.env to enable.

PV

Written by Palakorn Voramongkol

Software Engineer Specialist with 20+ years of experience. Writing about architecture, performance, and building production systems.

More about me

Continue Reading