Recycling Robot

Sutton & Barto's recycling robot — search, wait or recharge under a battery MDP.

Async Tick Observations FSM


SC-015 / SCHEMATICRL / MDP

Clock: 120 Hz tick
Update: Asynchronous
States: High / Low battery
Policies: Bold / Timid
Pattern: Action → outcome

OVERVIEW

This scenario models the recycling-robot Markov decision process. From a high or low battery state, the robot picks an action under a Bold or Timid policy: search (highest reward, but may drain the battery), wait (safe, smaller reward), or recharge. Each action transitions to a short-lived outcome state whose branch probabilities encode the environment's response, such as finding cans or being rescued at a steep penalty, so the stochastic transition function is expressed directly in the model.
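
The two-stage encoding described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the scenario's actual model: the parameter names follow Sutton & Barto's notation, while the numeric values, the outcome labels (`found_cans`, `rescued`, and so on) and the helper names `step` and `expected_reward` are hypothetical.

```python
import random

# Assumed parameters (not taken from this page). Names follow Sutton & Barto;
# the numeric values are illustrative only.
ALPHA = 0.9       # P(battery stays high after searching from high)
BETA = 0.6        # P(battery stays low after searching from low)
R_SEARCH = 2.0    # reward for a search that does not end in a rescue
R_WAIT = 1.0      # reward for waiting
R_RESCUE = -3.0   # penalty when the battery runs flat and the robot is rescued

# Stage 1: (state, action) selects a list of outcome branches.
# Stage 2: each branch is (probability, outcome label, next state, reward).
OUTCOMES = {
    ("high", "search"): [(ALPHA, "found_cans", "high", R_SEARCH),
                         (1 - ALPHA, "found_cans_drained", "low", R_SEARCH)],
    ("high", "wait"): [(1.0, "waited", "high", R_WAIT)],
    ("low", "search"): [(BETA, "found_cans", "low", R_SEARCH),
                        (1 - BETA, "rescued", "high", R_RESCUE)],
    ("low", "wait"): [(1.0, "waited", "low", R_WAIT)],
    ("low", "recharge"): [(1.0, "recharged", "high", 0.0)],
}


def step(state, action, rng):
    """Sample one outcome branch and return (outcome, next_state, reward)."""
    branches = OUTCOMES[(state, action)]
    weights = [p for p, *_ in branches]
    _, outcome, next_state, reward = rng.choices(branches, weights=weights, k=1)[0]
    return outcome, next_state, reward


def expected_reward(state, action):
    """Probability-weighted reward over the outcome branches of one action."""
    return sum(p * reward for p, _, _, reward in OUTCOMES[(state, action)])


if __name__ == "__main__":
    rng = random.Random(0)
    print(step("low", "search", rng))
    print(expected_reward("low", "search"))
```

Keeping the probabilities on the outcome branches, rather than hiding them inside the action, means the transition function can be checked by inspection: every branch list should sum to 1.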

The result is a minimal, exact MDP for studying policy under risk. The two-stage action→outcome encoding is a reusable pattern for any stochastic transition, and the reward gap between the Bold and Timid policies can be read directly from the exported total_reward and rescues counters.
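
To see how a reward gap and a rescue count could emerge, the following sketch simulates both policies. Everything here is an assumption made for illustration: the page does not define the policies, so Bold is taken to mean "always search" and Timid to mean "search on high battery, recharge on low"; the parameter values and the `step` and `run` helpers are likewise hypothetical.

```python
import random

# Illustrative parameters only; none of these values come from this page.
ALPHA, BETA = 0.9, 0.6
R_SEARCH, R_WAIT, R_RESCUE = 2.0, 1.0, -3.0


def step(state, action, rng):
    """Return (next_state, reward, rescued) for one decision."""
    if action == "wait":
        return state, R_WAIT, False
    if action == "recharge":
        # In Sutton & Barto, recharge is only offered on low battery;
        # the policies below never choose it on high.
        return "high", 0.0, False
    # action == "search"
    if state == "high":
        return ("high" if rng.random() < ALPHA else "low"), R_SEARCH, False
    if rng.random() < BETA:
        return "low", R_SEARCH, False
    return "high", R_RESCUE, True  # battery ran flat: rescued and recharged


# Assumed policy definitions (hypothetical readings of "Bold" and "Timid").
POLICIES = {
    "Bold": lambda state: "search",
    "Timid": lambda state: "search" if state == "high" else "recharge",
}


def run(policy_name, steps=10_000, seed=0):
    """Simulate one robot and return a row shaped like the robot table."""
    rng = random.Random(seed)
    policy = POLICIES[policy_name]
    state, total_reward, rescues = "high", 0.0, 0
    for _ in range(steps):
        state, reward, rescued = step(state, policy(state), rng)
        total_reward += reward
        rescues += rescued
    return {"policy": policy_name, "total_reward": total_reward,
            "rescues": rescues, "current_state": state}


if __name__ == "__main__":
    for name in POLICIES:
        print(run(name))
```

Under these assumptions the Timid robot can never be rescued, because it never searches on a low battery. Which policy earns more depends entirely on the chosen probabilities and rewards.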

LIVE DATA

A live sample of the dataset this scenario generates.

Policy replay: state machine and reward trace

TRAITS

Async: Independent, event-driven timelines
Tick: Discrete fixed-step time
Observations: Emits structured agent observations
FSM: Entities are finite-state machines

SCHEMA

Linked tables with guaranteed referential integrity.

TABLE: robot
COLUMNS: ID, policy, total_reward, rescues, current_state
DESCRIPTION: One row per robot: its policy (Bold/Timid), cumulative reward, rescue count and battery/outcome state.
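
As a rough sketch of how the robot table might be queried, the example below rebuilds it in an in-memory SQLite database and summarizes it by policy. SQLite is swapped in here only because it ships with Python; the column types, the lower-case `id` column name, the `summarize_by_policy` helper and the three sample rows are all assumptions, and the rows are placeholders rather than real output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Column names come from the schema above; the types are assumed.
conn.execute("""
    CREATE TABLE robot (
        id            INTEGER PRIMARY KEY,
        policy        TEXT    NOT NULL CHECK (policy IN ('Bold', 'Timid')),
        total_reward  REAL    NOT NULL,
        rescues       INTEGER NOT NULL,
        current_state TEXT    NOT NULL
    )
""")

# Placeholder rows, made up purely to exercise the query below.
conn.executemany(
    "INSERT INTO robot VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Bold", 10.0, 2, "low"),
        (2, "Bold", 14.0, 1, "high"),
        (3, "Timid", 12.0, 0, "high"),
    ],
)


def summarize_by_policy(connection):
    """Return robot count, mean reward and total rescues for each policy."""
    rows = connection.execute("""
        SELECT policy, COUNT(*), AVG(total_reward), SUM(rescues)
        FROM robot
        GROUP BY policy
        ORDER BY policy
    """).fetchall()
    return {
        policy: {"robots": n, "mean_reward": mean, "rescues": rescues}
        for policy, n, mean, rescues in rows
    }


if __name__ == "__main__":
    for policy, stats in summarize_by_policy(conn).items():
        print(policy, stats)
```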

LIVE API

Generated REST endpoints. Also exposed as MCP tools.

POST /scenarios/recycling-robot/experiments
    Seed a new robot population

POST /scenarios/recycling-robot/experiments/{eid}/run
    Advance N turns, or request an action

GET /scenarios/recycling-robot/experiments/{eid}/entities/robot
    Read robot states and rewards

GET /scenarios/recycling-robot/experiments/{eid}/events
    Append-only event log

GET /scenarios/recycling-robot/experiments/{eid}/dataset
    Download the exported dataset
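
A client for these endpoints might look roughly like the sketch below, which uses only Python's standard library. The paths are the ones listed above; everything else is an assumption, including the placeholder host in `BASE_URL`, JSON request and response bodies, the `id` field in the create response, and the `turns` field in the run request. Check the real API before relying on any of these.

```python
import json
import urllib.request

BASE_URL = "https://example.invalid/api"  # placeholder host, not from this page
SCENARIO = "recycling-robot"


def experiments_url(base=BASE_URL):
    """URL of the experiments collection for this scenario."""
    return f"{base}/scenarios/{SCENARIO}/experiments"


def experiment_url(eid, suffix, base=BASE_URL):
    """URL of a sub-resource (run, entities/robot, events, dataset)."""
    return f"{experiments_url(base)}/{eid}/{suffix}"


def call(method, url, payload=None):
    """Send one request and decode the response, assuming it is JSON."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    request = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def seed_run_and_read(turns):
    """Seed a population, advance it, then read back the robot rows."""
    created = call("POST", experiments_url(), {})  # request body shape is a guess
    eid = created["id"]                            # response field name is a guess
    call("POST", experiment_url(eid, "run"), {"turns": turns})  # "turns" is a guess
    return call("GET", experiment_url(eid, "entities/robot"))
```

The URL builders are kept separate from the network call so the path handling can be tested without a live server.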

SEMANTIC LAYER

OSI-compatible definition, emitted with the dataset.

# recycling-robot.osi.yaml — emitted automatically
semantic_model:
  name: "recycling-robot"
  source: "duckdb://recycling-robot.db"
  entities:
    - name: robot
      primary_key: id
  dimensions:
    - name: state
      type: categorical
    - name: t
      type: time
  measures:
    - name: row_count
      agg: count
    - name: active
      agg: sum
      filter: "state = 'ACTIVE'"