The AI industry has started producing a new engineering label almost every month. Prompt Engineering. Context Engineering. Harness Engineering. Eval Engineering. Memory Engineering. Skills Engineering. Guardrail Engineering. Inference Engineering. And now, Agentic Engineering. The pattern is already visible: the labels are proliferating because the systems are getting more real. At first glance, it looks like a fragmented field. My view is the opposite. These are not competing end-states. They are subsystems of one production discipline: Agent Engineering.
The Short Answer
If you go by the labels, there are at least nine named subdisciplines in circulation today, plus the open-ended category I would call *.Engineering. If you go by the underlying work, there is a single, larger answer: all roads lead to Agent Engineering.
- Prompt Engineering
- Context Engineering
- Harness Engineering
- Eval Engineering
- Memory Engineering
- Skills Engineering
- Guardrail Engineering
- Inference Engineering
- Agentic Engineering
- *.Engineering
The Current Discipline Map
Each label tends to appear when one layer of the stack becomes painful enough to deserve its own abstraction. The labels are useful. The mistake is thinking the abstraction is the whole discipline.
The public trail is uneven but still strong enough to map. OpenAI’s prompt engineering guide helped formalize prompt work. LangChain’s context engineering essay gave context engineering a clearer systems definition. OpenAI’s Responses API essay made harness concerns legible to a broader audience. Hamel Husain’s evals writing, Letta’s memory work, Vercel AI SDK 6, NVIDIA NeMo Guardrails, and IBM’s agentic engineering explainer all show how separate pieces of the stack started getting named as engineering layers.
Prompt Engineering
Instruction design, examples, prompt templates, and interaction shaping. Attribution: Community-emergent term mainstreamed through the ChatGPT era, OpenAI docs, and later survey work such as The Prompt Report.
Prompts shape behavior, but they do not define runtime control, memory, evaluation, or system reliability.
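As a concrete illustration of what "prompt templates" means in practice, here is a minimal few-shot template sketch. The classifier task, example pairs, and function names are all illustrative, not drawn from any vendor's API:

```python
# A prompt-template sketch: instructions plus few-shot examples
# rendered into a single prompt string. All contents are illustrative.
TEMPLATE = """You are a support classifier.
Label each message as 'billing' or 'technical'.

{examples}

Message: {message}
Label:"""

EXAMPLES = [
    ("My invoice is wrong", "billing"),
    ("The app crashes on login", "technical"),
]

def render(message: str) -> str:
    # Render the few-shot examples, then slot in the new message.
    shots = "\n".join(f"Message: {m}\nLabel: {l}" for m, l in EXAMPLES)
    return TEMPLATE.format(examples=shots, message=message)
```

Everything an agent does downstream still passes through text like this, which is why the layer got named first.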
Context Engineering
Selecting the right tools, memory, retrieval, and state to feed the model at the right moment.
Attribution: Popularized in 2025 by practitioner discourse around Tobi Lütke and others, then shaped into a clearer systems framing by Harrison Chase.
Context is critical, but production agents need more than context assembly.
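The core mechanic of context assembly is rank-and-pack: score candidate snippets against the current need, then fit the best ones into a fixed budget. A minimal sketch, where the word-overlap scorer is a hypothetical stand-in for real retrieval:

```python
# A context-assembly sketch: rank candidates, pack into a budget.
# The score() function is a toy stand-in for embedding-based retrieval.
def score(snippet: str, query: str) -> int:
    return sum(word in snippet.lower() for word in query.lower().split())

def assemble(query: str, candidates: list[str], budget_chars: int) -> str:
    ranked = sorted(candidates, key=lambda s: score(s, query), reverse=True)
    context, used = [], 0
    for snippet in ranked:
        if used + len(snippet) > budget_chars:
            continue  # skip what doesn't fit; real systems compress instead
        context.append(snippet)
        used += len(snippet)
    return "\n".join(context)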
Harness Engineering
The execution loop around the model: retries, branching, tool orchestration, decomposition, and environment control.
Attribution: One of the clearest recent labels, commonly attributed publicly to Mitchell Hashimoto and crystallized further by OpenAI in 2026.
The harness runs the system, but it still depends on prompts, context, evals, memory, safety, and inference.
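The shape of a harness is a bounded loop around the model: dispatch tool calls, retry failures, stop when the model finishes or the step budget runs out. A minimal sketch, where `run_model`, `TOOLS`, and the step dict shape are all hypothetical placeholders, not any real SDK:

```python
# A harness-loop sketch. The harness, not the model, owns retries,
# step budgets, and tool dispatch. All names here are hypothetical.
from typing import Callable

TOOLS: dict[str, Callable[[str], str]] = {
    "echo": lambda arg: arg,  # stand-in tool
}

def run_model(history: list[dict]) -> dict:
    # Stand-in for a real model call: request a tool once, then finish.
    if not any(step["role"] == "tool" for step in history):
        return {"role": "assistant", "tool": "echo", "arg": "hello"}
    return {"role": "assistant", "final": "done: hello"}

def harness(task: str, max_steps: int = 5, max_retries: int = 2) -> str:
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        step = run_model(history)
        history.append(step)
        if "final" in step:
            return step["final"]
        # Tool dispatch with bounded retries.
        for attempt in range(max_retries + 1):
            try:
                result = TOOLS[step["tool"]](step["arg"])
                break
            except Exception:
                if attempt == max_retries:
                    result = "tool failed"
        history.append({"role": "tool", "content": result})
    return "step budget exhausted"
```

Notice how many other layers this loop presupposes: a prompt in `history`, context in each model call, and nothing yet about evals, memory, or safety.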
Eval Engineering
Designing datasets, graders, rubrics, metrics, and feedback loops that tell you if the system is improving.
Attribution: Community-emergent label advanced by OpenAI Evals, Hamel Husain’s practitioner work, Anthropic-style evaluation culture, and later vendor framing.
Evals tell you whether the system works; they do not replace the design of the system.
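At its smallest, an eval is a dataset, a grader, and an aggregate metric. A sketch under that assumption, with a stubbed system under test and an exact-match grader standing in for richer rubric or model-graded checks:

```python
# A minimal eval-loop sketch: dataset, grader, aggregate metric.
# system_under_test is a hypothetical stub for the agent being measured.
DATASET = [
    {"input": "2+2", "expected": "4"},
    {"input": "capital of France", "expected": "Paris"},
]

def system_under_test(prompt: str) -> str:
    # Stand-in for the real agent.
    return {"2+2": "4", "capital of France": "Paris"}.get(prompt, "")

def grade(output: str, expected: str) -> bool:
    # Exact-match grader; production suites mix rubric and model grading.
    return output.strip() == expected

def run_evals() -> float:
    passed = sum(
        grade(system_under_test(case["input"]), case["expected"])
        for case in DATASET
    )
    return passed / len(DATASET)
```

The discipline is less in this loop than in what fills it: datasets that reflect real traffic and graders you actually trust.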
Memory Engineering
Persistent state, retrieval strategy, compression, summarization, and durable learning across sessions.
Attribution: Still an emerging label, but Letta and MemGPT made memory a visible systems problem rather than a side feature.
Memory matters for long-running agents, but it is one subsystem inside a larger production discipline.
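Compression and summarization usually reduce to a compaction policy: keep recent turns verbatim, fold older ones into a running summary. A sketch where the `summarize` stub is hypothetical and would be a model call in practice:

```python
# A memory-compaction sketch: recent turns stay verbatim, older turns
# collapse into a summary. summarize() is a hypothetical stand-in
# for a model-generated summary.
def summarize(turns: list[str]) -> str:
    return f"[summary of {len(turns)} earlier turns]"

def compact(history: list[str], keep_recent: int = 3) -> list[str]:
    if len(history) <= keep_recent:
        return history
    older, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarize(older)] + recent

history = [f"turn {i}" for i in range(10)]
window = compact(history)
# window holds one summary entry followed by the three newest turns
```

Durable learning across sessions adds a persistence layer on top, but the in-context side of memory is largely variations on this trade.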
Skills Engineering
Packaging reusable capabilities, procedures, prompt files, tool bundles, and agent know-how into portable units.
Attribution: An emergent label shaped by reusable skills systems in ecosystems such as Claude Code and Vercel AI SDK 6.
Skills improve reuse and portability, but they still require orchestration, state, and runtime control.
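The "portable unit" idea can be sketched as a small data structure: instructions, the tools the skill expects, and a registry to load it from. All field names and the example skill are illustrative, not any real ecosystem's format:

```python
# A skills-packaging sketch: a portable unit bundling instructions
# and tool names, plus a registry. All fields are illustrative.
from dataclasses import dataclass, field

@dataclass
class Skill:
    name: str
    instructions: str                 # the prompt-file content
    tools: list[str] = field(default_factory=list)

REGISTRY: dict[str, Skill] = {}

def register(skill: Skill) -> None:
    REGISTRY[skill.name] = skill

register(Skill("summarize_pr", "Summarize the diff for reviewers.", ["git_diff"]))
```

The packaging is the easy part; a skill only does work once a harness loads it, supplies context, and runs its tools.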
Guardrail Engineering
Safety, policy enforcement, moderation, validation, approvals, bounded autonomy, and compliance constraints.
Attribution: The practice predates the label, but Guardrails AI and NVIDIA NeMo Guardrails helped institutionalize it as a formal systems layer.
Guardrails constrain behavior; they do not by themselves create a capable or useful agent.
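Structurally, a guardrail is a gate in front of action execution that can allow, deny, or escalate for human approval. A sketch with an illustrative policy; the action names and decision strings are not any real framework's vocabulary:

```python
# A guardrail sketch: a validation gate in front of tool execution.
# The policy contents here are illustrative only.
BLOCKED_ACTIONS = {"delete_database", "send_payment"}
NEEDS_APPROVAL = {"send_email"}

def check(action: str, approved: bool = False) -> str:
    # Returns a routing decision; a real layer would also audit-log it.
    if action in BLOCKED_ACTIONS:
        return "deny"
    if action in NEEDS_APPROVAL and not approved:
        return "escalate"
    return "allow"
```

The escalate path is what "bounded autonomy" means operationally: the agent keeps working, but certain actions pause for a human.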
Inference Engineering
Serving, routing, batching, latency, throughput, model hosting, and cost optimization.
Attribution: Infrastructure-first label strongly associated with serving and inference-platform builders such as Baseten and the broader runtime ecosystem.
Inference engineering solves speed and cost, not correctness, orchestration, or reliability of the full agent system.
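A representative slice of this layer is cost-aware routing: send small requests to a cheap model, large ones to an expensive one. A sketch where the route names, limits, and costs are invented for illustration:

```python
# A cost-aware routing sketch. Route names, size limits, and costs
# are illustrative, not real endpoints or prices.
ROUTES = {
    "small": {"max_chars": 200, "cost_per_call": 0.001},
    "large": {"max_chars": 10_000, "cost_per_call": 0.01},
}

def route(prompt: str) -> str:
    # Routes are checked cheapest-first (dicts preserve insertion order).
    for name, spec in ROUTES.items():
        if len(prompt) <= spec["max_chars"]:
            return name
    raise ValueError("prompt exceeds all route limits")
```

Nothing in this layer knows whether the answer is right; that is the point of the boundary with evals and harness design.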
Agentic Engineering
Human-supervised delegation of meaningful software and workflow execution to one or more agents.
Attribution: Clearly popularized in early 2026 by Andrej Karpathy as a more serious frame than vibe coding.
Agentic engineering describes how people work with agents, not the full production discipline for building those systems.
Why These Labels Appeared
The ordering is not random. Each label names a new class of failure in production systems: prompt engineering when instructions failed, context engineering when the right information stopped reaching the model, eval engineering when regressions went undetected, guardrail engineering when unsafe actions shipped. A layer earns its name when its failures become expensive enough to need an owner.
Why None of Them Is Enough Alone
If you optimize prompts but ignore context, your system becomes brittle. If you optimize context but ignore evals, you cannot prove improvement. If you optimize evals but ignore memory, long-running tasks degrade. If you optimize memory but ignore guardrails, the system becomes unsafe. If you optimize inference but ignore harness design, the system becomes fast but unreliable.
The Core Thesis
These labels are real, but each one is structurally incomplete. Production agents force them to converge. Prompts, context, harnesses, evals, memory, skills, guardrails, inference, and agentic workflows only become durable when treated as one systems discipline.
The Working Definition
My working definition is this:
Agent Engineering is the production discipline of specifying, assembling, measuring, operating, and improving non-deterministic AI systems that reason, use tools, maintain state, and act over time.
Under that definition, prompt engineering, context engineering, harness engineering, eval engineering, memory engineering, skills engineering, guardrail engineering, inference engineering, and agentic engineering are all subsets.
This is also why the umbrella framing matters. LangChain’s Agent Engineering essay points toward the same conclusion: the field is not just accumulating labels, it is converging on a production discipline.
Final Answer
How many types of Agent Engineering exist right now? At least nine named ones, plus an endless stream of new *.Engineering labels.
Prompts, context, evals, memory, harnesses, skills, guardrails, inference, and agentic orchestration converge in production systems engineering.
The labels will keep multiplying. The umbrella discipline will still be Agent Engineering.
We are hosting the Agent Engineering conference to cover all of these engineering types in one place.
