Database Diagrams: A Practical Guide to Schema Design and Documentation
A database diagram is worth a thousand words of prose documentation. When a new engineer joins the team, a well-drawn schema diagram communicates the data model faster than any wiki page. When you are designing a new feature, a quick ERD on a whiteboard surfaces inconsistencies before a single line of SQL is written. When an incident strikes at 3 AM, a clear diagram of foreign-key relationships is the difference between a fast fix and an hour of guessing.
This guide covers everything from the foundational notation systems to practical tooling — with real diagrams using Mermaid and PlantUML that you can drop straight into your documentation.
Why Database Diagrams Matter
Database diagrams serve three critical functions in a software team’s workflow.
Communication: A diagram makes implicit assumptions explicit. Does an order always have a user, or can guest checkouts exist? Can a product belong to more than one category? These questions are immediately visible in a diagram and frequently missed in prose documentation.
Onboarding: New team members spend days or weeks building a mental model of the data layer. A single comprehensive diagram collapses that time dramatically. Engineers who understand the data model write better queries, design better APIs, and introduce fewer data-consistency bugs.
Design Reviews: Reviewing a schema diagram during planning catches problems — missing foreign keys, ambiguous relationships, normalization issues — before they are baked into migrations that are painful to undo.
Types of Database Diagrams
Entity-Relationship Diagrams (ERDs)
An ERD is the most common type of database diagram. It models entities (tables), their attributes (columns), and the relationships between them. ERDs exist at two levels:
- Conceptual ERD: Entities and relationships only — no column types, no keys. Used for high-level discussions with non-technical stakeholders.
- Logical ERD: Adds attribute names and data types, primary keys, and foreign keys — but remains database-agnostic.
Physical Database Diagrams
A physical diagram is the logical ERD translated to a specific database engine. It shows exact column types (VARCHAR(255), BIGINT, TIMESTAMPTZ), indexes, constraints (NOT NULL, UNIQUE, CHECK), and engine-specific features like partitioning or tablespaces. This is what you generate from a tool like DBeaver or pgAdmin by reverse-engineering a live database.
Logical vs Physical Models
| Aspect | Logical Model | Physical Model |
|---|---|---|
| Audience | Architects, analysts | DBAs, backend engineers |
| Data types | Generic (String, Integer) | Engine-specific (TEXT, INT8) |
| Keys | Conceptual PK/FK | Actual index definitions |
| Portability | Database-agnostic | Engine-specific |
| Use case | Design, communication | Implementation, migration |
Standard Notations
Chen Notation
Peter Chen introduced the original ERD notation in 1976. In Chen notation:
- Rectangles represent entities (e.g.,
USER,ORDER) - Diamonds represent relationships (e.g.,
PLACES) - Ovals represent attributes (e.g.,
email,created_at) - Double rectangles represent weak entities (entities that cannot exist without a parent)
- Double diamonds represent identifying relationships
Chen notation is great for academia and high-level conceptual modeling. In practice, it becomes unwieldy for schemas with more than 10–15 entities because the diamond shapes and connecting lines create visual clutter.
Crow’s Foot / IE Notation
Crow’s Foot (also called Information Engineering notation) is the dominant standard in professional database tools. It places relationship cardinality symbols directly on the connecting lines:
||— exactly one|<— one or many (the “crow’s foot”)o|— zero or oneo<— zero or many
Crow’s Foot diagrams look cleaner than Chen notation for large schemas. Every tool you will encounter in practice — DBeaver, pgAdmin, DataGrip, dbdiagram.io — uses Crow’s Foot by default.
UML Class Diagrams for Databases
UML class diagrams are borrowed from object-oriented design but work well for database modeling. Tables become classes, columns become attributes, and relationships are expressed with UML multiplicity notation (1, *, 0..1, 1..*). If your team already uses UML for architecture diagrams, using it for the data layer keeps everything consistent.
IDEF1X Notation
IDEF1X (Integration Definition for Information Modeling) is a formal notation used in government and aerospace systems. It distinguishes between identifying relationships (where a child entity’s primary key includes the parent’s primary key) and non-identifying relationships. You will rarely encounter IDEF1X outside of enterprise or regulated-industry contexts.
Drawing ERDs with Mermaid
Mermaid’s erDiagram syntax renders directly in GitHub Markdown, GitLab, Notion, and most modern documentation platforms. It uses Crow’s Foot notation automatically.
Basic Relationships: One-to-Many
erDiagram
USER {
bigint id PK
varchar email UK
varchar name
timestamp created_at
}
ORDER {
bigint id PK
bigint user_id FK
varchar status
decimal total_amount
timestamp created_at
}
ADDRESS {
bigint id PK
bigint user_id FK
varchar street
varchar city
varchar country
}
USER ||--o{ ORDER : "places"
USER ||--o{ ADDRESS : "has"
The ||--o{ syntax reads: the left entity has exactly one instance that relates to zero or many instances on the right — a classic one-to-many.
Many-to-Many with a Junction Table
Many-to-many relationships always require an intermediate (junction) table in a relational database. The ERD should make this explicit:
erDiagram
PRODUCT {
bigint id PK
varchar name
text description
decimal price
bigint category_id FK
}
TAG {
bigint id PK
varchar name UK
varchar slug UK
}
PRODUCT_TAG {
bigint product_id FK
bigint tag_id FK
}
CATEGORY {
bigint id PK
varchar name UK
bigint parent_id FK
}
PRODUCT }o--o{ TAG : "tagged with"
PRODUCT_TAG }|--|| PRODUCT : "belongs to"
PRODUCT_TAG }|--|| TAG : "references"
PRODUCT }o--|| CATEGORY : "belongs to"
CATEGORY ||--o{ CATEGORY : "parent of"
Complete E-Commerce Schema
This diagram shows a production-grade e-commerce data model with users, orders, products, and the order line items that join them:
erDiagram
USER {
bigint id PK
varchar email UK
varchar name
varchar password_hash
boolean is_active
timestamp created_at
timestamp updated_at
}
ADDRESS {
bigint id PK
bigint user_id FK
varchar street
varchar city
varchar state
varchar country
varchar postal_code
boolean is_default
}
PRODUCT {
bigint id PK
varchar sku UK
varchar name
text description
decimal price
int stock_quantity
bigint category_id FK
timestamp created_at
}
CATEGORY {
bigint id PK
varchar name UK
bigint parent_id FK
}
ORDER {
bigint id PK
bigint user_id FK
bigint shipping_address_id FK
varchar status
decimal subtotal
decimal tax
decimal total
timestamp created_at
timestamp updated_at
}
ORDER_ITEM {
bigint id PK
bigint order_id FK
bigint product_id FK
int quantity
decimal unit_price
decimal line_total
}
PAYMENT {
bigint id PK
bigint order_id FK
varchar provider
varchar status
decimal amount
varchar transaction_ref
timestamp paid_at
}
USER ||--o{ ADDRESS : "has"
USER ||--o{ ORDER : "places"
CATEGORY ||--o{ PRODUCT : "contains"
CATEGORY ||--o{ CATEGORY : "parent of"
ORDER }o--|| ADDRESS : "ships to"
ORDER ||--o{ ORDER_ITEM : "contains"
ORDER ||--o{ PAYMENT : "paid via"
PRODUCT ||--o{ ORDER_ITEM : "included in"
Drawing with PlantUML
PlantUML’s !define syntax produces publication-quality diagrams and integrates with Confluence, IntelliJ IDEA, and VS Code. For database schemas, use the entity keyword.
Blog Platform Schema
@startuml blog-schema
!define TABLE(name,desc) class name as "desc" << (T,#FFAAAA) >>
!define PK(x) <b><color:#b8861b>x</color></b>
!define FK(x) <color:#aaaaaa>x</color>
hide methods
hide stereotypes
TABLE(users, "users") {
PK(id) BIGSERIAL
--
email VARCHAR(255) UNIQUE NOT NULL
username VARCHAR(100) UNIQUE NOT NULL
password_hash VARCHAR(255) NOT NULL
display_name VARCHAR(200)
bio TEXT
avatar_url TEXT
is_active BOOLEAN DEFAULT true
created_at TIMESTAMPTZ NOT NULL
updated_at TIMESTAMPTZ NOT NULL
}
TABLE(posts, "posts") {
PK(id) BIGSERIAL
--
FK(author_id) BIGINT NOT NULL
title VARCHAR(500) NOT NULL
slug VARCHAR(500) UNIQUE NOT NULL
excerpt TEXT
content TEXT NOT NULL
status VARCHAR(20) DEFAULT 'draft'
published_at TIMESTAMPTZ
view_count INT DEFAULT 0
created_at TIMESTAMPTZ NOT NULL
updated_at TIMESTAMPTZ NOT NULL
}
TABLE(comments, "comments") {
PK(id) BIGSERIAL
--
FK(post_id) BIGINT NOT NULL
FK(author_id) BIGINT NOT NULL
FK(parent_id) BIGINT
content TEXT NOT NULL
is_approved BOOLEAN DEFAULT false
created_at TIMESTAMPTZ NOT NULL
}
TABLE(tags, "tags") {
PK(id) BIGSERIAL
--
name VARCHAR(100) UNIQUE NOT NULL
slug VARCHAR(100) UNIQUE NOT NULL
description TEXT
}
TABLE(post_tags, "post_tags") {
FK(post_id) BIGINT NOT NULL
FK(tag_id) BIGINT NOT NULL
--
created_at TIMESTAMPTZ NOT NULL
}
users "1" --> "0..*" posts : author_id
users "1" --> "0..*" comments : author_id
posts "1" --> "0..*" comments : post_id
comments "0..1" --> "0..*" comments : parent_id
posts "1" --> "0..*" post_tags : post_id
tags "1" --> "0..*" post_tags : tag_id
@enduml
E-Commerce Physical Schema with Indexes
@startuml ecommerce-physical
!define TABLE(name,desc) class name as "desc" << (T,#AADDFF) >>
!define PK(x) <b><color:#b8861b>x</color></b>
!define FK(x) <color:#888888>x</color>
!define IDX(x) <color:#336699><i>x</i></color>
hide methods
hide stereotypes
TABLE(users, "users") {
PK(id) BIGSERIAL PRIMARY KEY
--
email VARCHAR(255) UNIQUE NOT NULL
password_hash VARCHAR(255) NOT NULL
created_at TIMESTAMPTZ DEFAULT now()
==indexes==
IDX(idx_users_email) ON (email)
}
TABLE(orders, "orders") {
PK(id) BIGSERIAL PRIMARY KEY
--
FK(user_id) BIGINT NOT NULL
status VARCHAR(30) NOT NULL DEFAULT 'pending'
total NUMERIC(12,2) NOT NULL
created_at TIMESTAMPTZ DEFAULT now()
==indexes==
IDX(idx_orders_user_id) ON (user_id)
IDX(idx_orders_status) ON (status)
IDX(idx_orders_created_at) ON (created_at DESC)
}
TABLE(products, "products") {
PK(id) BIGSERIAL PRIMARY KEY
--
FK(category_id) BIGINT
sku VARCHAR(100) UNIQUE NOT NULL
name VARCHAR(500) NOT NULL
price NUMERIC(10,2) NOT NULL
stock_qty INT NOT NULL DEFAULT 0
==indexes==
IDX(idx_products_sku) ON (sku)
IDX(idx_products_category) ON (category_id)
}
TABLE(order_items, "order_items") {
PK(id) BIGSERIAL PRIMARY KEY
--
FK(order_id) BIGINT NOT NULL
FK(product_id) BIGINT NOT NULL
quantity INT NOT NULL
unit_price NUMERIC(10,2) NOT NULL
line_total NUMERIC(12,2) GENERATED ALWAYS AS (quantity * unit_price) STORED
==indexes==
IDX(idx_oi_order_id) ON (order_id)
IDX(idx_oi_product_id) ON (product_id)
}
users "1" --> "0..*" orders : user_id
orders "1" --> "1..*" order_items : order_id
products "1" --> "0..*" order_items : product_id
@enduml
Real-World Example: Blog Platform Schema
Let us design a complete blog platform from scratch. The requirements are straightforward:
- Users write posts; posts have slugs for SEO-friendly URLs
- Posts can be commented on; comments are threaded (parent/child)
- Posts are tagged; tags are reusable across posts
- All timestamps use timezone-aware types
ERD in Mermaid
erDiagram
USERS {
bigint id PK
varchar email UK
varchar username UK
varchar password_hash
varchar display_name
text bio
text avatar_url
boolean is_active
timestamptz created_at
timestamptz updated_at
}
POSTS {
bigint id PK
bigint author_id FK
varchar title
varchar slug UK
text excerpt
text content
varchar status
timestamptz published_at
int view_count
timestamptz created_at
timestamptz updated_at
}
COMMENTS {
bigint id PK
bigint post_id FK
bigint author_id FK
bigint parent_id FK
text content
boolean is_approved
timestamptz created_at
}
TAGS {
bigint id PK
varchar name UK
varchar slug UK
text description
}
POST_TAGS {
bigint post_id FK
bigint tag_id FK
timestamptz created_at
}
USERS ||--o{ POSTS : "writes"
USERS ||--o{ COMMENTS : "authors"
POSTS ||--o{ COMMENTS : "receives"
COMMENTS ||--o{ COMMENTS : "parent of"
POSTS ||--o{ POST_TAGS : "has"
TAGS ||--o{ POST_TAGS : "used in"
SQL DDL in Three Dialects
-- PostgreSQL
CREATE TABLE users (
id BIGSERIAL PRIMARY KEY,
email VARCHAR(255) NOT NULL,
username VARCHAR(100) NOT NULL,
password_hash VARCHAR(255) NOT NULL,
display_name VARCHAR(200),
bio TEXT,
avatar_url TEXT,
is_active BOOLEAN NOT NULL DEFAULT true,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
CONSTRAINT uq_users_email UNIQUE (email),
CONSTRAINT uq_users_username UNIQUE (username)
);
CREATE TABLE posts (
id BIGSERIAL PRIMARY KEY,
author_id BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
title VARCHAR(500) NOT NULL,
slug VARCHAR(500) NOT NULL,
excerpt TEXT,
content TEXT NOT NULL,
status VARCHAR(20) NOT NULL DEFAULT 'draft'
CHECK (status IN ('draft','published','archived')),
published_at TIMESTAMPTZ,
view_count INT NOT NULL DEFAULT 0,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
CONSTRAINT uq_posts_slug UNIQUE (slug)
);
CREATE TABLE comments (
id BIGSERIAL PRIMARY KEY,
post_id BIGINT NOT NULL REFERENCES posts(id) ON DELETE CASCADE,
author_id BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
parent_id BIGINT REFERENCES comments(id) ON DELETE CASCADE,
content TEXT NOT NULL,
is_approved BOOLEAN NOT NULL DEFAULT false,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE TABLE tags (
id BIGSERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
slug VARCHAR(100) NOT NULL,
description TEXT,
CONSTRAINT uq_tags_name UNIQUE (name),
CONSTRAINT uq_tags_slug UNIQUE (slug)
);
CREATE TABLE post_tags (
post_id BIGINT NOT NULL REFERENCES posts(id) ON DELETE CASCADE,
tag_id BIGINT NOT NULL REFERENCES tags(id) ON DELETE CASCADE,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
PRIMARY KEY (post_id, tag_id)
);
-- Indexes
CREATE INDEX idx_posts_author_id ON posts(author_id);
CREATE INDEX idx_posts_status ON posts(status);
CREATE INDEX idx_posts_published_at ON posts(published_at DESC)
WHERE status = 'published';
CREATE INDEX idx_comments_post_id ON comments(post_id);
CREATE INDEX idx_comments_parent_id ON comments(parent_id);-- MySQL
CREATE TABLE users (
id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
email VARCHAR(255) NOT NULL,
username VARCHAR(100) NOT NULL,
password_hash VARCHAR(255) NOT NULL,
display_name VARCHAR(200),
bio TEXT,
avatar_url TEXT,
is_active TINYINT(1) NOT NULL DEFAULT 1,
created_at DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
updated_at DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6)
ON UPDATE CURRENT_TIMESTAMP(6),
UNIQUE KEY uq_users_email (email),
UNIQUE KEY uq_users_username (username)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
CREATE TABLE posts (
id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
author_id BIGINT UNSIGNED NOT NULL,
title VARCHAR(500) NOT NULL,
slug VARCHAR(500) NOT NULL,
excerpt TEXT,
content LONGTEXT NOT NULL,
status ENUM('draft','published','archived') NOT NULL DEFAULT 'draft',
published_at DATETIME(6),
view_count INT UNSIGNED NOT NULL DEFAULT 0,
created_at DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
updated_at DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6)
ON UPDATE CURRENT_TIMESTAMP(6),
UNIQUE KEY uq_posts_slug (slug),
KEY idx_posts_author_id (author_id),
KEY idx_posts_status (status),
KEY idx_posts_published_at (published_at),
CONSTRAINT fk_posts_author FOREIGN KEY (author_id)
REFERENCES users(id) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
CREATE TABLE comments (
id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
post_id BIGINT UNSIGNED NOT NULL,
author_id BIGINT UNSIGNED NOT NULL,
parent_id BIGINT UNSIGNED,
content TEXT NOT NULL,
is_approved TINYINT(1) NOT NULL DEFAULT 0,
created_at DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
KEY idx_comments_post_id (post_id),
KEY idx_comments_parent_id (parent_id),
CONSTRAINT fk_comments_post FOREIGN KEY (post_id) REFERENCES posts(id) ON DELETE CASCADE,
CONSTRAINT fk_comments_author FOREIGN KEY (author_id) REFERENCES users(id) ON DELETE CASCADE,
CONSTRAINT fk_comments_parent FOREIGN KEY (parent_id) REFERENCES comments(id) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
CREATE TABLE tags (
id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
slug VARCHAR(100) NOT NULL,
description TEXT,
UNIQUE KEY uq_tags_name (name),
UNIQUE KEY uq_tags_slug (slug)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
CREATE TABLE post_tags (
post_id BIGINT UNSIGNED NOT NULL,
tag_id BIGINT UNSIGNED NOT NULL,
created_at DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
PRIMARY KEY (post_id, tag_id),
CONSTRAINT fk_pt_post FOREIGN KEY (post_id) REFERENCES posts(id) ON DELETE CASCADE,
CONSTRAINT fk_pt_tag FOREIGN KEY (tag_id) REFERENCES tags(id) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;-- SQLite
CREATE TABLE users (
id INTEGER PRIMARY KEY AUTOINCREMENT,
email TEXT NOT NULL UNIQUE,
username TEXT NOT NULL UNIQUE,
password_hash TEXT NOT NULL,
display_name TEXT,
bio TEXT,
avatar_url TEXT,
is_active INTEGER NOT NULL DEFAULT 1,
created_at TEXT NOT NULL DEFAULT (datetime('now','utc')),
updated_at TEXT NOT NULL DEFAULT (datetime('now','utc'))
);
CREATE TABLE posts (
id INTEGER PRIMARY KEY AUTOINCREMENT,
author_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
title TEXT NOT NULL,
slug TEXT NOT NULL UNIQUE,
excerpt TEXT,
content TEXT NOT NULL,
status TEXT NOT NULL DEFAULT 'draft'
CHECK (status IN ('draft','published','archived')),
published_at TEXT,
view_count INTEGER NOT NULL DEFAULT 0,
created_at TEXT NOT NULL DEFAULT (datetime('now','utc')),
updated_at TEXT NOT NULL DEFAULT (datetime('now','utc'))
);
CREATE TABLE comments (
id INTEGER PRIMARY KEY AUTOINCREMENT,
post_id INTEGER NOT NULL REFERENCES posts(id) ON DELETE CASCADE,
author_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
parent_id INTEGER REFERENCES comments(id) ON DELETE CASCADE,
content TEXT NOT NULL,
is_approved INTEGER NOT NULL DEFAULT 0,
created_at TEXT NOT NULL DEFAULT (datetime('now','utc'))
);
CREATE TABLE tags (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL UNIQUE,
slug TEXT NOT NULL UNIQUE,
description TEXT
);
CREATE TABLE post_tags (
post_id INTEGER NOT NULL REFERENCES posts(id) ON DELETE CASCADE,
tag_id INTEGER NOT NULL REFERENCES tags(id) ON DELETE CASCADE,
created_at TEXT NOT NULL DEFAULT (datetime('now','utc')),
PRIMARY KEY (post_id, tag_id)
);
CREATE INDEX idx_posts_author_id ON posts(author_id);
CREATE INDEX idx_posts_status ON posts(status);
CREATE INDEX idx_comments_post ON comments(post_id);ORM Model Definitions
The same blog schema expressed as ORM models across the four most common stacks:
// Prisma (TypeScript)
// schema.prisma
model User {
id BigInt @id @default(autoincrement())
email String @unique @db.VarChar(255)
username String @unique @db.VarChar(100)
passwordHash String @map("password_hash") @db.VarChar(255)
displayName String? @map("display_name") @db.VarChar(200)
bio String?
avatarUrl String? @map("avatar_url")
isActive Boolean @default(true) @map("is_active")
createdAt DateTime @default(now()) @map("created_at") @db.Timestamptz
updatedAt DateTime @updatedAt @map("updated_at") @db.Timestamptz
posts Post[]
comments Comment[]
@@map("users")
}
model Post {
id BigInt @id @default(autoincrement())
authorId BigInt @map("author_id")
title String @db.VarChar(500)
slug String @unique @db.VarChar(500)
excerpt String?
content String
status String @default("draft") @db.VarChar(20)
publishedAt DateTime? @map("published_at") @db.Timestamptz
viewCount Int @default(0) @map("view_count")
createdAt DateTime @default(now()) @map("created_at") @db.Timestamptz
updatedAt DateTime @updatedAt @map("updated_at") @db.Timestamptz
author User @relation(fields: [authorId], references: [id], onDelete: Cascade)
comments Comment[]
tags PostTag[]
@@index([authorId])
@@index([status])
@@map("posts")
}
model Comment {
id BigInt @id @default(autoincrement())
postId BigInt @map("post_id")
authorId BigInt @map("author_id")
parentId BigInt? @map("parent_id")
content String
isApproved Boolean @default(false) @map("is_approved")
createdAt DateTime @default(now()) @map("created_at") @db.Timestamptz
post Post @relation(fields: [postId], references: [id], onDelete: Cascade)
author User @relation(fields: [authorId], references: [id], onDelete: Cascade)
parent Comment? @relation("CommentThread", fields: [parentId], references: [id])
children Comment[] @relation("CommentThread")
@@index([postId])
@@map("comments")
}
model Tag {
id BigInt @id @default(autoincrement())
name String @unique @db.VarChar(100)
slug String @unique @db.VarChar(100)
description String?
posts PostTag[]
@@map("tags")
}
model PostTag {
postId BigInt @map("post_id")
tagId BigInt @map("tag_id")
createdAt DateTime @default(now()) @map("created_at") @db.Timestamptz
post Post @relation(fields: [postId], references: [id], onDelete: Cascade)
tag Tag @relation(fields: [tagId], references: [id], onDelete: Cascade)
@@id([postId, tagId])
@@map("post_tags")
}// JPA / Hibernate (Java/Spring Boot)
@Entity
@Table(name = "users")
public class User {
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Long id;
@Column(nullable = false, unique = true, length = 255)
private String email;
@Column(nullable = false, unique = true, length = 100)
private String username;
@Column(name = "password_hash", nullable = false, length = 255)
private String passwordHash;
@Column(name = "display_name", length = 200)
private String displayName;
@Column(columnDefinition = "TEXT")
private String bio;
@Column(name = "avatar_url", columnDefinition = "TEXT")
private String avatarUrl;
@Column(name = "is_active", nullable = false)
private boolean isActive = true;
@Column(name = "created_at", nullable = false,
columnDefinition = "TIMESTAMPTZ DEFAULT now()")
private OffsetDateTime createdAt;
@Column(name = "updated_at", nullable = false,
columnDefinition = "TIMESTAMPTZ DEFAULT now()")
private OffsetDateTime updatedAt;
@OneToMany(mappedBy = "author", cascade = CascadeType.ALL, orphanRemoval = true)
private List<Post> posts = new ArrayList<>();
}
@Entity
@Table(name = "posts", indexes = {
@Index(name = "idx_posts_author_id", columnList = "author_id"),
@Index(name = "idx_posts_status", columnList = "status")
})
public class Post {
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Long id;
@ManyToOne(fetch = FetchType.LAZY, optional = false)
@JoinColumn(name = "author_id", nullable = false)
private User author;
@Column(nullable = false, length = 500)
private String title;
@Column(nullable = false, unique = true, length = 500)
private String slug;
@Column(columnDefinition = "TEXT")
private String excerpt;
@Column(nullable = false, columnDefinition = "TEXT")
private String content;
@Enumerated(EnumType.STRING)
@Column(nullable = false, length = 20)
private PostStatus status = PostStatus.DRAFT;
@Column(name = "published_at",
columnDefinition = "TIMESTAMPTZ")
private OffsetDateTime publishedAt;
@Column(name = "view_count", nullable = false)
private int viewCount = 0;
@OneToMany(mappedBy = "post", cascade = CascadeType.ALL, orphanRemoval = true)
private List<Comment> comments = new ArrayList<>();
@OneToMany(mappedBy = "post", cascade = CascadeType.ALL, orphanRemoval = true)
private List<PostTag> tags = new ArrayList<>();
}# SQLAlchemy 2.x (Python)
from __future__ import annotations
from datetime import datetime
from typing import Optional
from sqlalchemy import (
BigInteger, Boolean, Column, DateTime, ForeignKey,
Index, Integer, String, Text, UniqueConstraint, func
)
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship
class Base(DeclarativeBase):
pass
class User(Base):
__tablename__ = "users"
id: Mapped[int] = mapped_column(BigInteger, primary_key=True, autoincrement=True)
email: Mapped[str] = mapped_column(String(255), nullable=False, unique=True)
username: Mapped[str] = mapped_column(String(100), nullable=False, unique=True)
password_hash: Mapped[str] = mapped_column(String(255), nullable=False)
display_name: Mapped[Optional[str]] = mapped_column(String(200))
bio: Mapped[Optional[str]] = mapped_column(Text)
avatar_url: Mapped[Optional[str]] = mapped_column(Text)
is_active: Mapped[bool] = mapped_column(Boolean, nullable=False, default=True)
created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), server_default=func.now())
updated_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), server_default=func.now(), onupdate=func.now())
posts: Mapped[list["Post"]] = relationship(back_populates="author", cascade="all, delete-orphan")
comments: Mapped[list["Comment"]] = relationship(back_populates="author", cascade="all, delete-orphan")
class Post(Base):
__tablename__ = "posts"
__table_args__ = (
UniqueConstraint("slug", name="uq_posts_slug"),
Index("idx_posts_author_id", "author_id"),
Index("idx_posts_status", "status"),
)
id: Mapped[int] = mapped_column(BigInteger, primary_key=True, autoincrement=True)
author_id: Mapped[int] = mapped_column(BigInteger, ForeignKey("users.id", ondelete="CASCADE"), nullable=False)
title: Mapped[str] = mapped_column(String(500), nullable=False)
slug: Mapped[str] = mapped_column(String(500), nullable=False)
excerpt: Mapped[Optional[str]] = mapped_column(Text)
content: Mapped[str] = mapped_column(Text, nullable=False)
status: Mapped[str] = mapped_column(String(20), nullable=False, default="draft")
published_at: Mapped[Optional[datetime]] = mapped_column(DateTime(timezone=True))
view_count: Mapped[int] = mapped_column(Integer, nullable=False, default=0)
created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), server_default=func.now())
updated_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), server_default=func.now(), onupdate=func.now())
author: Mapped["User"] = relationship(back_populates="posts")
comments: Mapped[list["Comment"]] = relationship(back_populates="post", cascade="all, delete-orphan")
post_tags: Mapped[list["PostTag"]] = relationship(back_populates="post", cascade="all, delete-orphan")
class Comment(Base):
__tablename__ = "comments"
__table_args__ = (
Index("idx_comments_post_id", "post_id"),
Index("idx_comments_parent_id", "parent_id"),
)
id: Mapped[int] = mapped_column(BigInteger, primary_key=True, autoincrement=True)
post_id: Mapped[int] = mapped_column(BigInteger, ForeignKey("posts.id", ondelete="CASCADE"), nullable=False)
author_id: Mapped[int] = mapped_column(BigInteger, ForeignKey("users.id", ondelete="CASCADE"), nullable=False)
parent_id: Mapped[Optional[int]] = mapped_column(BigInteger, ForeignKey("comments.id", ondelete="CASCADE"))
content: Mapped[str] = mapped_column(Text, nullable=False)
is_approved: Mapped[bool] = mapped_column(Boolean, nullable=False, default=False)
created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), server_default=func.now())
post: Mapped["Post"] = relationship(back_populates="comments")
author: Mapped["User"] = relationship(back_populates="comments")
parent: Mapped[Optional["Comment"]] = relationship(remote_side="Comment.id", back_populates="children")
children: Mapped[list["Comment"]] = relationship(back_populates="parent")
class Tag(Base):
__tablename__ = "tags"
id: Mapped[int] = mapped_column(BigInteger, primary_key=True, autoincrement=True)
name: Mapped[str] = mapped_column(String(100), nullable=False, unique=True)
slug: Mapped[str] = mapped_column(String(100), nullable=False, unique=True)
description: Mapped[Optional[str]] = mapped_column(Text)
post_tags: Mapped[list["PostTag"]] = relationship(back_populates="tag", cascade="all, delete-orphan")
class PostTag(Base):
__tablename__ = "post_tags"
post_id: Mapped[int] = mapped_column(BigInteger, ForeignKey("posts.id", ondelete="CASCADE"), primary_key=True)
tag_id: Mapped[int] = mapped_column(BigInteger, ForeignKey("tags.id", ondelete="CASCADE"), primary_key=True)
created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), server_default=func.now())
post: Mapped["Post"] = relationship(back_populates="post_tags")
tag: Mapped["Tag"] = relationship(back_populates="post_tags")// Entity Framework Core (C# / .NET)
public class User
{
public long Id { get; set; }
public string Email { get; set; } = null!;
public string Username { get; set; } = null!;
public string PasswordHash { get; set; } = null!;
public string? DisplayName { get; set; }
public string? Bio { get; set; }
public string? AvatarUrl { get; set; }
public bool IsActive { get; set; } = true;
public DateTimeOffset CreatedAt { get; set; } = DateTimeOffset.UtcNow;
public DateTimeOffset UpdatedAt { get; set; } = DateTimeOffset.UtcNow;
public ICollection<Post> Posts { get; set; } = [];
public ICollection<Comment> Comments { get; set; } = [];
}
public class Post
{
public long Id { get; set; }
public long AuthorId { get; set; }
public string Title { get; set; } = null!;
public string Slug { get; set; } = null!;
public string? Excerpt { get; set; }
public string Content { get; set; } = null!;
public PostStatus Status { get; set; } = PostStatus.Draft;
public DateTimeOffset? PublishedAt { get; set; }
public int ViewCount { get; set; }
public DateTimeOffset CreatedAt { get; set; } = DateTimeOffset.UtcNow;
public DateTimeOffset UpdatedAt { get; set; } = DateTimeOffset.UtcNow;
public User Author { get; set; } = null!;
public ICollection<Comment> Comments { get; set; } = [];
public ICollection<PostTag> PostTags { get; set; } = [];
}
public class Comment
{
public long Id { get; set; }
public long PostId { get; set; }
public long AuthorId { get; set; }
public long? ParentId { get; set; }
public string Content { get; set; } = null!;
public bool IsApproved { get; set; }
public DateTimeOffset CreatedAt { get; set; } = DateTimeOffset.UtcNow;
public Post Post { get; set; } = null!;
public User Author { get; set; } = null!;
public Comment? Parent { get; set; }
public ICollection<Comment> Children { get; set; } = [];
}
// EF Core configuration (IEntityTypeConfiguration<T> pattern)
public class PostConfiguration : IEntityTypeConfiguration<Post>
{
public void Configure(EntityTypeBuilder<Post> builder)
{
builder.ToTable("posts");
builder.HasKey(p => p.Id);
builder.Property(p => p.Title).HasMaxLength(500).IsRequired();
builder.Property(p => p.Slug).HasMaxLength(500).IsRequired();
builder.HasIndex(p => p.Slug).IsUnique();
builder.HasIndex(p => p.AuthorId);
builder.HasIndex(p => p.Status);
builder.Property(p => p.Status).HasConversion<string>();
builder.HasOne(p => p.Author)
.WithMany(u => u.Posts)
.HasForeignKey(p => p.AuthorId)
.OnDelete(DeleteBehavior.Cascade);
}
}Best Practices
Naming Conventions
Consistency is more important than the specific convention you choose. Pick one and enforce it across every table:
- Table names: lowercase snake_case, plural (
users,order_items, notUserororderItem) - Column names: lowercase snake_case (
created_at,user_id, notcreatedAtorUserId) - Primary keys: always
id— notuser_idin theuserstable, notpk_user - Foreign keys:
{referenced_table_singular}_id—user_id,post_id,category_id - Boolean columns: prefix with
is_,has_, orcan_—is_active,has_verified_email - Timestamps:
created_atandupdated_aton every table, always timezone-aware - Junction tables: combine both table names alphabetically —
post_tags(nottag_postsorposts_to_tags) - Indexes:
idx_{table}_{columns}—idx_posts_author_id,idx_orders_status_created_at - Constraints:
uq_{table}_{column},fk_{table}_{referenced_table},chk_{table}_{column}
When to Normalize vs Denormalize
Third Normal Form (3NF) is the right starting point for most OLTP schemas. Normalize until it hurts, then denormalize strategically.
Normalize when:
- Data is written frequently (normalization reduces write amplification)
- Data integrity is critical — you cannot afford inconsistencies
- The data model is still evolving — normalized schemas are easier to change
Denormalize when:
- Read performance is critical and the data is read far more than it is written
- You need to avoid expensive joins across large tables (e.g., reporting dashboards)
- You are using a JSONB column to store semi-structured data that varies per row
- You have proven a normalization bottleneck with actual profiling data — not assumptions
A common pragmatic pattern: keep your write path fully normalized, and use materialized views or read replicas with denormalized structures for your read path.
Documenting Constraints
Constraints are not just database enforcement mechanisms — they are documentation. Every constraint tells a reader something about the business rules embedded in the schema:
NOT NULLsays the field is required by the business domainUNIQUEsays no two rows can represent the same logical entityCHECKsays the value must satisfy a business ruleFOREIGN KEYsays this entity cannot exist without its parentDEFAULTsays what value the system supplies when the application does not
Name your constraints explicitly rather than relying on auto-generated names. CONSTRAINT uq_users_email UNIQUE (email) is far easier to reference in error messages and migrations than users_email_key.
Version-Controlling Schema Changes
Treat migrations as first-class code artifacts:
- Use a migration tool — Flyway, Liquibase, Alembic, Prisma Migrate, or Django migrations. Never apply schema changes by hand to production.
- One change per migration — A migration that adds a column and creates an index is two logical changes. Keep them separate so rollbacks are precise.
- Write both up and down migrations — Even if you never roll back, the down migration documents the inverse operation and forces you to think it through.
- Never edit a committed migration — Committed migrations are immutable history. Fix forward with a new migration.
- Review migrations like code — Include schema changes in pull requests. A migration that drops a column with data deserves the same scrutiny as a security-sensitive code change.
- Store the diagram alongside the migrations — Keep a
docs/schema.mmdordocs/schema.pumlin your repository and update it as part of the same PR as every migration.
Tools Comparison
| Tool | Best For | Strengths | Limitations |
|---|---|---|---|
| dbdiagram.io | Quick documentation, sharing | Fast DBML syntax, beautiful output, shareable links | Online-only, limited free tier |
| DBeaver | Multi-database reverse engineering | Supports 80+ databases, ER diagram from live DB, free Community edition | Diagram export can be finicky |
| pgAdmin | PostgreSQL-specific work | Built-in ERD tool, query plan visualizer, tight PG integration | PostgreSQL only |
| DataGrip | Power users, IDE integration | Excellent schema visualization, smart migrations, multi-DB | Paid (JetBrains subscription) |
| draw.io / diagrams.net | Custom diagrams, presentations | Fully free, offline, highly customizable, integrates with Confluence | Manual — no database reverse engineering |
| Mermaid | Docs-as-code, Git-native | Renders in GitHub/GitLab/Notion, version-controlled, plain text | Limited visual customization |
| PlantUML | Enterprise/Confluence docs | Rich layout control, CI/CD friendly, IntelliJ integration | Requires Java or server rendering |
| Lucidchart | Team collaboration | Real-time collaboration, database import, Jira/Confluence plugins | Paid |
For most teams the pragmatic combination is: Mermaid for diagrams embedded in Git repositories (your README.md, ADRs, PR descriptions) and DBeaver or DataGrip for reverse-engineering and exploring live databases.
Final Thoughts
A database diagram that lives in your repository alongside the migration files is infinitely more valuable than one buried in a Confluence page that was last updated two years ago. The best diagram tooling for engineering teams is the one that treats the diagram as code — version-controlled, diffable, reviewable, and automatically rendered in the platforms your team already uses.
Start with a Mermaid erDiagram in your project’s docs/ folder today. Add a CI step that validates the Mermaid syntax. Make updating the diagram a required part of your schema change process. In six months, every engineer on your team will thank you — including your future self at 3 AM during an incident.