Skip to content

Databases — Types, Differences & Selection Guide

Database Categories at a Glance

Category Data Model Query Language Examples
Relational (SQL) Tables with rows and columns SQL PostgreSQL, MySQL, SQLite
Document JSON/BSON documents Query API / SQL-like MongoDB, CouchDB
Key-Value Key → value pairs GET/SET API Redis, DynamoDB, etcd
Column-Family Wide columns, sparse rows CQL / SQL-like Cassandra, ScyllaDB, HBase
Graph Nodes + edges (relationships) Cypher, Gremlin Neo4j, Amazon Neptune
Time-Series Timestamped data points SQL / InfluxQL TimescaleDB, InfluxDB, QuestDB
Vector High-dimensional embeddings ANN similarity search Pinecone, Weaviate, Qdrant, pgvector
Search Engine Inverted index on text Query DSL / SQL Elasticsearch, Meilisearch

Relational Databases (SQL)

Data is stored in tables with a defined schema. Relations are managed via foreign keys. Transactions keep changes consistent.

Strength Weakness
Data integrity (ACID) Vertical scaling limits
Complex joins and aggregations Schema changes can be costly
Mature tooling, broad SQL knowledge Weak with deeply nested / unstructured data

When to use: Structured data, financial systems, CRM/ERP, any domain where data consistency and complex queries are critical.

Popular choices: PostgreSQL (feature-rich, extensible), MySQL (web apps, read-heavy), SQLite (embedded, mobile, testing).

Document Databases

Data is stored as flexible JSON-like documents. Schema is flexible, so documents can have different fields.

Strength Weakness
Schema flexibility No multi-document ACID (varies by engine)
Fast reads of nested objects Cross-collection joins are slow or absent
Horizontal scaling (sharding) Data duplication / denormalization needed

When to use: Content management, catalogs, user profiles, prototypes with rapidly evolving schemas.

Popular choice: MongoDB (most widely used document DB).

Key-Value Stores

Simplest model: store and read values by unique key. Optimized for speed.

Strength Weakness
Sub-millisecond latency (in-memory) No ad-hoc queries or joins
Simple horizontal scaling Value is opaque — no filtering by fields
Predictable O(1) performance Limited data modeling

When to use: Caching, sessions, feature flags, rate limiting, shopping carts, queues.

Popular choice: Redis (in-memory, data structures, pub/sub).

Column-Family (Wide-Column)

Data is organized by column families. Best for heavy writes and large analytical reads.

Strength Weakness
High write throughput Limited ad-hoc query flexibility
Excellent compression No joins
Linear horizontal scaling Data modeling requires query-first design

When to use: IoT ingestion, event logging, analytics on billions of rows, time-series at scale.

Popular choices: Cassandra, ScyllaDB, HBase.

BigQuery, Redshift, and Snowflake are columnar analytical warehouses (OLAP), not wide-column operational NoSQL stores.

Graph Databases

Core entities are nodes (things) and edges (relations). Great for traversing connected data.

Strength Weakness
Relationship queries are fast Poor fit for tabular / aggregation workloads
Natural for connected data Smaller ecosystem, fewer engineers
Flexible schema on nodes/edges Scaling graph traversals is hard

When to use: Social networks, recommendation engines, fraud detection, knowledge graphs, dependency analysis.

Popular choice: Neo4j (Cypher query language).

Time-Series Databases

Optimized for time-stamped data, range queries, and downsampling.

Strength Weakness
Optimized time-range queries Not for general-purpose OLTP
Automatic data retention / rollups Limited join / transaction support
High ingestion throughput Query flexibility varies

When to use: Monitoring metrics (Prometheus), IoT sensors, financial ticks, log analytics.

Popular choices: TimescaleDB (PostgreSQL extension), InfluxDB, QuestDB.

Vector Databases

Store and search high-dimensional vectors (ML embeddings), usually with ANN search.

Strength Weakness
Semantic / similarity search Not for structured CRUD
Powers RAG, recommendations Index rebuild can be expensive
Sub-second search on millions of vectors Young ecosystem, rapidly evolving

When to use: AI/LLM RAG pipelines, semantic search, image similarity, anomaly detection.

Popular choices: Pinecone, Weaviate, Qdrant, Milvus. Hybrid: PostgreSQL + pgvector.

Decision Matrix

Requirement Best Fit
ACID transactions, complex queries Relational (PostgreSQL)
Flexible schema, nested objects Document (MongoDB)
Ultra-low latency cache / sessions Key-Value (Redis)
Billions of rows, write-heavy analytics Column-Family (Cassandra)
Relationship traversal, recommendations Graph (Neo4j)
Time-stamped metrics, IoT Time-Series (TimescaleDB)
AI embeddings, semantic search Vector (pgvector, Pinecone)
Full-text search, log search Search Engine (Elasticsearch)

Polyglot Persistence

Modern systems often combine multiple database types:

User request
    │
    ├── PostgreSQL ── core business data, transactions
    ├── Redis ── session cache, rate limits
    ├── Elasticsearch ── full-text search
    └── pgvector ── AI similarity search

Rule of thumb: Start with PostgreSQL. Add specialized databases only when a specific workload outgrows what PostgreSQL can handle.

Section Map

File Topics
PostgreSQL Overview Features, ecosystem, extensions, cheat sheet, when to choose
Commands & psql Connection, navigation, psql shortcuts, import/export
Schema & Data Types Tables, types, constraints, indexes, views
Queries & Performance EXPLAIN ANALYZE, indexing strategy, pagination, anti-patterns
Admin & Operations Users, backup, Docker setup, tuning, monitoring
Basic Query Commands SQL cheat sheet: CRUD, filters, joins, CTEs, window functions, transactions

See also