LangChain — Models, Prompts & Parsers
Chat Models
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic
openai_llm = ChatOpenAI(
model="gpt-4o-mini",
temperature=0,
max_tokens=1024,
timeout=30,
max_retries=2,
)
anthropic_llm = ChatAnthropic(
model="claude-sonnet-4-20250514",
temperature=0,
max_tokens=1024,
)
| Parameter | Purpose |
|---|---|
model |
Model identifier (gpt-4o, claude-sonnet-4-20250514, etc.) |
temperature |
Randomness: 0 = deterministic, 1 = creative |
max_tokens |
Maximum response length |
timeout |
Request timeout in seconds |
max_retries |
Auto-retry on transient failures |
Direct Invocation
from langchain_core.messages import HumanMessage, SystemMessage
messages = [
SystemMessage(content="You are a senior Python developer."),
HumanMessage(content="Explain decorators in 3 sentences."),
]
response = openai_llm.invoke(messages)
print(response.content)
Message Types
from langchain_core.messages import (
AIMessage,
HumanMessage,
SystemMessage,
ToolMessage,
)
| Type | Role | Purpose |
|---|---|---|
SystemMessage |
system |
Sets persona, rules, constraints |
HumanMessage |
user |
User input |
AIMessage |
assistant |
Model response |
ToolMessage |
tool |
Tool execution result (for function calling) |
Prompt Templates
ChatPromptTemplate
from langchain_core.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_messages([
("system", "You are an expert in {domain}."),
("human", "{question}"),
])
formatted = prompt.invoke({"domain": "databases", "question": "What is ACID?"})
MessagesPlaceholder
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
prompt = ChatPromptTemplate.from_messages([
("system", "You are a helpful assistant."),
MessagesPlaceholder(variable_name="history"),
("human", "{input}"),
])
MessagesPlaceholder inserts a list of messages — used for conversation history, few-shot examples, or agent scratchpad.
Few-Shot Prompting
from langchain_core.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_messages([
("system", "Translate English to French."),
("human", "Hello"),
("ai", "Bonjour"),
("human", "Goodbye"),
("ai", "Au revoir"),
("human", "{input}"),
])
result = (prompt | model).invoke({"input": "Thank you"})
Zero-Shot Prompt Template (Practical)
Use this when you do not have examples and need strict instructions.
from langchain_core.prompts import ChatPromptTemplate
zero_shot_prompt = ChatPromptTemplate.from_messages([
("system", "You are a technical assistant. Answer briefly and accurately."),
("human", "{question}"),
])
Chain-of-Thought Style Template (Practical)
Use this when task needs multi-step reasoning. In production, prefer concise reasoning output.
cot_prompt = ChatPromptTemplate.from_messages([
("system", "Solve step-by-step internally. Return only concise final answer."),
("human", "{problem}"),
])
| Template style | Best for | Main risk |
|---|---|---|
| Zero-shot | Fast/simple tasks | Under-specified behavior |
| Few-shot | Stable output style | Prompt length grows quickly |
| CoT-style | Reasoning-heavy tasks | Verbose output/cost increase |
Output Parsers
StrOutputParser
from langchain_core.output_parsers import StrOutputParser
chain = prompt | model | StrOutputParser()
result = chain.invoke({"question": "What is REST?"})
# result is a plain string
JsonOutputParser
from langchain_core.output_parsers import JsonOutputParser
from pydantic import BaseModel, Field
class Review(BaseModel):
score: int = Field(description="Rating 1-5")
summary: str = Field(description="One-line summary")
parser = JsonOutputParser(pydantic_object=Review)
prompt = ChatPromptTemplate.from_messages([
("system", "Analyze the review.\n{format_instructions}"),
("human", "{review_text}"),
])
prompt = prompt.partial(format_instructions=parser.get_format_instructions())
chain = prompt | model | parser
result = chain.invoke({"review_text": "Great product, fast delivery!"})
Structured Output (Preferred)
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI
class CityInfo(BaseModel):
name: str = Field(description="City name")
country: str = Field(description="Country")
population: int = Field(description="Approximate population")
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
structured_llm = llm.with_structured_output(CityInfo)
result = structured_llm.invoke("Tell me about Paris")
print(result.name) # "Paris"
print(result.population) # 2161000
with_structured_output() uses native function calling — more reliable than parser-based approaches.
Custom Parser Pattern
Use custom parser when you need domain-specific validation or normalization beyond generic JSON parsing.
from langchain_core.output_parsers import BaseOutputParser
class LabelParser(BaseOutputParser[dict]):
def parse(self, text: str) -> dict:
raw = text.strip().lower()
if raw not in {"approve", "reject"}:
raise ValueError(f"Unexpected label: {raw}")
return {"label": raw}
custom_chain = prompt | model | LabelParser()
result = custom_chain.invoke({"question": "Decision?"})
Custom parser is useful for strict contracts in automation flows.
Streaming
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
chain = (
ChatPromptTemplate.from_messages([("human", "{question}")])
| ChatOpenAI(model="gpt-4o-mini")
| StrOutputParser()
)
for chunk in chain.stream({"question": "Explain microservices"}):
print(chunk, end="", flush=True)
| Method | Use case |
|---|---|
.invoke() |
Single request → single response |
.stream() |
Token-by-token streaming |
.batch() |
Multiple inputs in parallel |
.ainvoke() |
Async single request |
.astream() |
Async streaming |
Model Configuration Patterns
model = ChatOpenAI(model="gpt-4o-mini", temperature=0)
bound = model.bind(stop=["\n\n"])
with_retry = model.with_retry(stop_after_attempt=3)
with_fallback = ChatOpenAI(model="gpt-4o").with_fallbacks(
[ChatAnthropic(model="claude-sonnet-4-20250514")]
)
| Method | Purpose |
|---|---|
.bind() |
Fix kwargs for every call (stop tokens, tools) |
.with_retry() |
Auto-retry on transient errors |
.with_fallbacks() |
Fall back to alternate model on failure |
.configurable_fields() |
Make params switchable at runtime |
Parameter Tuning Defaults (Practical)
| Parameter | Safe default | When to increase | Risk when too high |
|---|---|---|---|
temperature |
0.0-0.2 |
Brainstorming/creative writing | Hallucinations |
top_p |
1.0 |
Alternative sampling strategy | Unstable output |
max_tokens |
Explicit per task | Long reasoning or long-form output | Cost spikes |
timeout |
20-60s |
Slow provider/tools | Higher user-visible latency |
max_retries |
2-3 |
Flaky network/provider | Longer worst-case runtime |
stop |
Explicit for strict formats | Template-driven termination | Truncated useful output |
Use this tuning sequence: (1) set deterministic defaults, (2) fix prompt/retrieval quality, (3) then tune randomness.
Streaming Events (Debugging and UI Telemetry)
Use astream_events() when you need step-level runtime events (not only token chunks).
import asyncio
async def debug_events() -> None:
async for event in chain.astream_events({"question": "Explain CQRS"}, version="v2"):
print(event["event"], event.get("name"))
asyncio.run(debug_events())
This is useful for advanced UIs, live progress indicators, and debugging nested runnables.
Plain-Language Summary
- Model = the brain that generates text.
- Prompt = instructions + input format for the brain.
- Parser = converts model output into the format your app needs.
- Chain = pipeline that glues all steps together.
If output quality is poor, first check prompt clarity and output format constraints before changing the model.
Common Beginner Mistakes
- Sending huge unstructured prompts with no explicit output format.
- Using free-form text where structured output is required.
- Skipping retries/timeouts and getting random production failures.
- Parsing raw strings manually instead of using
with_structured_output().
Minimum Working Baseline (Recommended)
For first production-safe version:
- set
temperature=0; - set
timeoutandmax_retries; - use
with_structured_output()for machine-readable results; - enable tracing (
LANGCHAIN_TRACING_V2=true); - add one fallback model.