Databases — Types, Differences & Selection Guide

Database Categories at a Glance

Category	Data Model	Query Language	Examples
Relational (SQL)	Tables with rows and columns	SQL	PostgreSQL, MySQL, SQLite
Document	JSON/BSON documents	Query API / SQL-like	MongoDB, CouchDB
Key-Value	Key → value pairs	GET/SET API	Redis, DynamoDB, etcd
Column-Family	Wide columns, sparse rows	CQL / SQL-like	Cassandra, ScyllaDB, HBase
Graph	Nodes + edges (relationships)	Cypher, Gremlin	Neo4j, Amazon Neptune
Time-Series	Timestamped data points	SQL / InfluxQL	TimescaleDB, InfluxDB, QuestDB
Vector	High-dimensional embeddings	ANN similarity search	Pinecone, Weaviate, Qdrant, pgvector
Search Engine	Inverted index on text	Query DSL / SQL	Elasticsearch, Meilisearch

Relational Databases (SQL)

Data is stored in tables with a defined schema. Relations are managed via foreign keys. Transactions keep changes consistent.

Strength	Weakness
Data integrity (ACID)	Vertical scaling limits
Complex joins and aggregations	Schema changes can be costly
Mature tooling, broad SQL knowledge	Weak with deeply nested / unstructured data

When to use: Structured data, financial systems, CRM/ERP, any domain where data consistency and complex queries are critical.

Popular choices: PostgreSQL (feature-rich, extensible), MySQL (web apps, read-heavy), SQLite (embedded, mobile, testing).

Document Databases

Data is stored as flexible JSON-like documents. Schema is flexible, so documents can have different fields.

Strength	Weakness
Schema flexibility	No multi-document ACID (varies by engine)
Fast reads of nested objects	Cross-collection joins are slow or absent
Horizontal scaling (sharding)	Data duplication / denormalization needed

When to use: Content management, catalogs, user profiles, prototypes with rapidly evolving schemas.

Popular choice: MongoDB (most widely used document DB).

Key-Value Stores

Simplest model: store and read values by unique key. Optimized for speed.

Strength	Weakness
Sub-millisecond latency (in-memory)	No ad-hoc queries or joins
Simple horizontal scaling	Value is opaque — no filtering by fields
Predictable O(1) performance	Limited data modeling

When to use: Caching, sessions, feature flags, rate limiting, shopping carts, queues.

Popular choice: Redis (in-memory, data structures, pub/sub).

Column-Family (Wide-Column)

Data is organized by column families. Best for heavy writes and large analytical reads.

Strength	Weakness
High write throughput	Limited ad-hoc query flexibility
Excellent compression	No joins
Linear horizontal scaling	Data modeling requires query-first design

When to use: IoT ingestion, event logging, analytics on billions of rows, time-series at scale.

Popular choices: Cassandra, ScyllaDB, HBase.

BigQuery, Redshift, and Snowflake are columnar analytical warehouses (OLAP), not wide-column operational NoSQL stores.

Graph Databases

Core entities are nodes (things) and edges (relations). Great for traversing connected data.

Strength	Weakness
Relationship queries are fast	Poor fit for tabular / aggregation workloads
Natural for connected data	Smaller ecosystem, fewer engineers
Flexible schema on nodes/edges	Scaling graph traversals is hard

When to use: Social networks, recommendation engines, fraud detection, knowledge graphs, dependency analysis.

Popular choice: Neo4j (Cypher query language).

Time-Series Databases

Optimized for time-stamped data, range queries, and downsampling.

Strength	Weakness
Optimized time-range queries	Not for general-purpose OLTP
Automatic data retention / rollups	Limited join / transaction support
High ingestion throughput	Query flexibility varies

When to use: Monitoring metrics (Prometheus), IoT sensors, financial ticks, log analytics.

Popular choices: TimescaleDB (PostgreSQL extension), InfluxDB, QuestDB.

Vector Databases

Store and search high-dimensional vectors (ML embeddings), usually with ANN search.

Strength	Weakness
Semantic / similarity search	Not for structured CRUD
Powers RAG, recommendations	Index rebuild can be expensive
Sub-second search on millions of vectors	Young ecosystem, rapidly evolving

When to use: AI/LLM RAG pipelines, semantic search, image similarity, anomaly detection.

Popular choices: Pinecone, Weaviate, Qdrant, Milvus. Hybrid: PostgreSQL + pgvector.

Decision Matrix

Requirement	Best Fit
ACID transactions, complex queries	Relational (PostgreSQL)
Flexible schema, nested objects	Document (MongoDB)
Ultra-low latency cache / sessions	Key-Value (Redis)
Billions of rows, write-heavy analytics	Column-Family (Cassandra)
Relationship traversal, recommendations	Graph (Neo4j)
Time-stamped metrics, IoT	Time-Series (TimescaleDB)
AI embeddings, semantic search	Vector (pgvector, Pinecone)
Full-text search, log search	Search Engine (Elasticsearch)

Polyglot Persistence

Modern systems often combine multiple database types:

User request
    │
    ├── PostgreSQL ── core business data, transactions
    ├── Redis ── session cache, rate limits
    ├── Elasticsearch ── full-text search
    └── pgvector ── AI similarity search

Rule of thumb: Start with PostgreSQL. Add specialized databases only when a specific workload outgrows what PostgreSQL can handle.

Section Map

File	Topics
PostgreSQL Overview	Features, ecosystem, extensions, cheat sheet, when to choose
Commands & psql	Connection, navigation, psql shortcuts, import/export
Schema & Data Types	Tables, types, constraints, indexes, views
Queries & Performance	EXPLAIN ANALYZE, indexing strategy, pagination, anti-patterns
Admin & Operations	Users, backup, Docker setup, tuning, monitoring
Basic Query Commands	SQL cheat sheet: CRUD, filters, joins, CTEs, window functions, transactions