Self Model
To adapt to and interact with its environment effectively, an embodied agent must understand not only the external world but also its internal self. Inspired by human cognition of the self, we propose a self model for embodied AI. The self model serves as a core component of embodied AI systems by integrating perception, prediction, memory, and decision modules, thereby enabling agents with diverse embodiments to perform tasks such as manipulation, navigation, and question answering.
The Concept of Self: From Human to Agent
For humans, the concept of self is not only proof of one’s own existence but also the foundation of perception, thought, and emotion. Biologically, the core neural mechanisms of self include: a body schema for structural representation of the self, a forward model for dynamical/causal prediction, an inverse model that maps desired goals to control commands, agency mechanisms that distinguish self from environment, and a perceptual‑memory model that tracks temporally extended self‑states and interaction histories. This unified, subjective framework is equally applicable and necessary in embodied AI. However, existing research mostly focuses on isolated components and lacks an integrated framework that fuses these functions, which prevents agents from achieving true autonomy, adaptation, and continuous learning.
Therefore, to replicate human-like self-awareness in robots, we start from four fundamental pillars: self-perception (awareness of the agent’s own body and environment), self-prediction (anticipating the outcomes of actions), self-memory (maintaining continuity of internal state over time), and self-decision (selecting feasible, goal-directed actions). These pillars address gaps in current embodied capabilities, enabling robots not only to understand the external world but also to form a coherent internal representation of “themselves”.
Self Model in Embodied AI
The Self Model is a unified internal representation that enables an embodied agent to understand its own body, capabilities, memories, and decision processes. It bridges perception, memory, prediction, and decision into a closed‑loop cognitive architecture, allowing the agent to reason not only about the external world but also about itself — its actions, limitations, and consequences.
Four core modules:
- Perception – Real‑time awareness of body state (joints, collisions, morphology).
- Memory – A 3D semantic self‑map that records spatial and experiential history.
- Prediction – Forecasting action outcomes (e.g., grasp success) and diagnosing failures.
- Decision – Goal‑to‑action mapping guided by self‑identity and predicted results.
These modules form a perception–memory–prediction–decision loop, enabling continuous self‑calibration and adaptive behavior in complex environments.
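To make the loop concrete, the sketch below shows one possible way the four modules could be wired together in code. This is purely illustrative and not the paper’s implementation: the class names (`SelfModelAgent`, `SelfState`), the observation format, and the scoring logic are all assumptions, and the memory is a flat list standing in for the 3D semantic self‑map.

```python
from dataclasses import dataclass


@dataclass
class SelfState:
    """Snapshot of the agent's internal self-representation (illustrative)."""
    joint_positions: list
    in_collision: bool = False


class SelfModelAgent:
    """Hypothetical closed-loop agent wiring the four self-model modules."""

    def __init__(self):
        # Memory: experiential history (stand-in for the 3D semantic self-map).
        self.memory = []

    def perceive(self, observation) -> SelfState:
        # Perception: real-time awareness of body state.
        return SelfState(
            joint_positions=observation["joints"],
            in_collision=observation.get("collision", False),
        )

    def predict(self, state: SelfState, action) -> float:
        # Prediction: forecast the action's outcome as a success score.
        # (Trivial placeholder: any action while in collision is scored 0.)
        return 0.0 if state.in_collision else 1.0

    def decide(self, state: SelfState, goal, candidate_actions):
        # Decision: pick the candidate with the best predicted outcome.
        return max(candidate_actions, key=lambda a: self.predict(state, a))

    def step(self, observation, goal, candidate_actions):
        # One pass of the perception-memory-prediction-decision loop.
        state = self.perceive(observation)
        action = self.decide(state, goal, candidate_actions)
        self.memory.append((state, action))  # record the interaction
        return action
```

In a real system, `predict` would be a learned forward model and `memory` a spatial‑semantic map; the point here is only the closed‑loop structure in which each decision is conditioned on perceived self‑state and predicted consequences, and every step is written back to memory.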
The L0–L5 Hierarchy
To systematically evaluate self‑modeling capabilities, we propose a six‑level hierarchy (L0–L5) that characterizes developmental stages from non‑self‑representation to full self‑awareness.
- L0 – Non‑Self Model: Purely reactive, no self/non‑self distinction.
- L1 – Basic Self‑Awareness: Static physical self, short‑term memory, basic collision judgments.
- L2 – Basic Self‑Adaptation: Dynamic self‑environment coupling, generalized forward prediction.
- L3 – Socialized Self: Role‑aware behavior, social memory, recognition of others’ intentions.
- L4 – Sustained Self‑Evolution: Autobiographical memory, metacognitive monitoring, value‑oriented iteration.
- L5 – Full Self‑Awareness: Worldview and ethical reasoning, hierarchically organized decision‑making.
This hierarchy provides an operational taxonomy for evaluating progress toward autonomous, self‑aware embodied systems.
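Because the levels form an ordered scale, one simple way to operationalize the taxonomy in an evaluation harness is as an ordered enumeration. The level names and the capability-check helper below are illustrative choices, not part of the paper’s formulation.

```python
from enum import IntEnum


class SelfModelLevel(IntEnum):
    """The L0-L5 self-modeling hierarchy as an ordered scale (illustrative)."""
    L0_NON_SELF = 0              # purely reactive, no self/non-self distinction
    L1_BASIC_AWARENESS = 1       # static physical self, short-term memory
    L2_BASIC_ADAPTATION = 2      # dynamic self-environment coupling
    L3_SOCIALIZED_SELF = 3       # role-aware behavior, social memory
    L4_SUSTAINED_EVOLUTION = 4   # autobiographical memory, metacognition
    L5_FULL_AWARENESS = 5        # worldview and ethical reasoning


def meets_requirement(assessed: SelfModelLevel, required: SelfModelLevel) -> bool:
    """Check whether an assessed agent reaches a required capability stage."""
    return assessed >= required
```

Encoding the hierarchy as an `IntEnum` makes the developmental ordering explicit, so benchmarks can ask ordinal questions ("does this agent reach at least L2?") rather than treating the levels as unrelated labels.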
The paper “Self Model for Embodied Artificial Intelligence” presents a systematic formulation of this framework, including a detailed L1 implementation and experimental validation. For the full technical details, please see the paper below.
📄 Series of Papers
Self Model for Embodied Artificial Intelligence
Shuqiang Jiang, Sixian Zhang, Shida Tao, Xihong Zhu, Tianliang Qi, Xinhang Song
Journal of Computer Science and Technology, 2026