SC-015 RL / MDP

Recycling Robot

Sutton & Barto's recycling robot — search, wait or recharge under a battery MDP.

Traits: Async, Tick, Observations, FSM
Policy replay: state machine and reward trace
Clock: 120 Hz tick
Update: Asynchronous
States: High / Low battery
Policies: Bold / Timid
Pattern: Action → outcome
OVERVIEW

This scenario models the recycling-robot Markov decision process. From a high or low battery state, the robot picks an action under a Bold or Timid policy: search (highest reward, but may drain the battery), wait (safe, smaller reward) or recharge. Each action leads to a short-lived outcome state whose transition probabilities encode the environment's response, such as finding cans or being rescued at a steep penalty, so the stochastic transition function is expressed directly in the model.
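The two-stage action → outcome encoding can be sketched as a lookup from (battery state, action) to a list of weighted outcomes. This is a minimal illustration, not the scenario's implementation: the probabilities `ALPHA` and `BETA`, the reward values, and the outcome names are all assumed for the example, and `recharge` is offered only in the low state, following the textbook formulation.

```python
import random

# Illustrative parameters: assumptions for this sketch, not values from the scenario.
ALPHA = 0.9      # chance the battery stays high after searching from high
BETA = 0.6       # chance the battery stays low after searching from low
R_SEARCH = 2.0   # reward for a search that collects cans
R_WAIT = 1.0     # reward for waiting
R_RESCUE = -3.0  # penalty when the battery dies and the robot is rescued

# Stage 1 key: (battery state, action).
# Stage 2 value: outcomes as (name, probability, reward, next battery state).
# Outcome names are hypothetical labels chosen for this example.
OUTCOMES = {
    ("high", "search"): [("found_cans", ALPHA, R_SEARCH, "high"),
                         ("found_cans_drained", 1 - ALPHA, R_SEARCH, "low")],
    ("high", "wait"): [("waited", 1.0, R_WAIT, "high")],
    ("low", "search"): [("found_cans", BETA, R_SEARCH, "low"),
                        ("rescued", 1 - BETA, R_RESCUE, "high")],
    ("low", "wait"): [("waited", 1.0, R_WAIT, "low")],
    ("low", "recharge"): [("recharged", 1.0, 0.0, "high")],
}


def step(state, action, rng):
    """Sample one outcome and return (outcome name, reward, next state)."""
    outcomes = OUTCOMES[(state, action)]
    weights = [prob for _, prob, _, _ in outcomes]
    name, _, reward, next_state = rng.choices(outcomes, weights=weights, k=1)[0]
    return name, reward, next_state


def expected_reward(state, action):
    """Probability-weighted one-step reward for (state, action)."""
    return sum(prob * reward for _, prob, reward, _ in OUTCOMES[(state, action)])


if __name__ == "__main__":
    rng = random.Random(0)
    print(step("low", "recharge", rng))
    print(expected_reward("high", "search"))
```

Keeping the outcomes in a table rather than in branching code means the same `step` function works for any stochastic transition: adding an action only requires a new entry in `OUTCOMES`.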

This gives a minimal, exact MDP for studying policy under risk. The two-stage action→outcome encoding is a reusable pattern for any stochastic transition, and the Bold-versus-Timid reward gap can be read directly from the exported reward and rescue counters.
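To make the policy comparison concrete, the sketch below simulates one robot per policy and accumulates the same counters the dataset exposes (`total_reward`, `rescues`, `current_state`). The policy definitions are assumptions, since the page does not spell them out: here Bold always searches, and Timid searches on a high battery and recharges on a low one. The probabilities and rewards are likewise illustrative.

```python
import random

# Assumed parameters and policy definitions (illustrative, not from the scenario).
ALPHA, BETA = 0.9, 0.6
R_SEARCH, R_WAIT, R_RESCUE = 2.0, 1.0, -3.0

POLICIES = {
    "Bold": {"high": "search", "low": "search"},     # always searches
    "Timid": {"high": "search", "low": "recharge"},  # recharges when low
}


def step(state, action, rng):
    """Return (reward, next_state, rescued) for one action."""
    if action == "recharge":
        return 0.0, "high", False
    if action == "wait":
        return R_WAIT, state, False
    # Remaining case: search.
    if state == "high":
        next_state = "high" if rng.random() < ALPHA else "low"
        return R_SEARCH, next_state, False
    if rng.random() < BETA:
        return R_SEARCH, "low", False
    return R_RESCUE, "high", True  # battery depleted, robot is rescued


def run(policy_name, turns, seed=0):
    """Simulate one robot and return its counters as a dict."""
    rng = random.Random(seed)
    policy = POLICIES[policy_name]
    state, total_reward, rescues = "high", 0.0, 0
    for _ in range(turns):
        reward, state, rescued = step(state, policy[state], rng)
        total_reward += reward
        rescues += int(rescued)
    return {"policy": policy_name, "total_reward": total_reward,
            "rescues": rescues, "current_state": state}


if __name__ == "__main__":
    for name in POLICIES:
        print(run(name, turns=1000))
```

Under these assumed definitions Timid can never be rescued, because it never searches on a low battery. How the total rewards compare depends on the chosen probabilities and penalty.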

TRAITS
Async: Independent, event-driven timelines
Tick: Discrete fixed-step time
Observations: Emits structured agent observations
FSM: Entities are finite-state machines
SCHEMA

Linked tables with guaranteed referential integrity.

TABLE | COLUMNS | DESCRIPTION
robot | ID, policy, total_reward, rescues, current_state | One row per robot: its policy (Bold/Timid), cumulative reward, rescue count and battery/outcome state.
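As a rough illustration of how the `robot` table could be queried for a per-policy comparison, the sketch below loads rows into an in-memory SQLite database and aggregates by policy. SQLite stands in for the exported dataset here, the column types are assumptions, and the sample rows are made-up placeholders rather than real output.

```python
import sqlite3


def policy_summary(rows):
    """Return (policy, robots, mean total_reward, total rescues) per policy."""
    conn = sqlite3.connect(":memory:")
    # Column names follow the schema above; the types are assumed.
    conn.execute(
        "CREATE TABLE robot ("
        "id INTEGER PRIMARY KEY, policy TEXT, total_reward REAL, "
        "rescues INTEGER, current_state TEXT)"
    )
    conn.executemany("INSERT INTO robot VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        "SELECT policy, COUNT(*), AVG(total_reward), SUM(rescues) "
        "FROM robot GROUP BY policy ORDER BY policy"
    ).fetchall()
    conn.close()
    return result


# Made-up placeholder rows, for illustration only.
sample = [
    (1, "Bold", 10.0, 2, "low"),
    (2, "Bold", 14.0, 1, "high"),
    (3, "Timid", 12.0, 0, "high"),
]

if __name__ == "__main__":
    for row in policy_summary(sample):
        print(row)
```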
LIVE API

Generated REST endpoints. Also exposed as MCP tools.

POST /scenarios/recycling-robot/experiments | Seed a new robot population
POST /scenarios/recycling-robot/experiments/{eid}/run | Advance N turns, or request an action
GET /scenarios/recycling-robot/experiments/{eid}/entities/robot | Read robot states and rewards
GET /scenarios/recycling-robot/experiments/{eid}/events | Append-only event log
GET /scenarios/recycling-robot/experiments/{eid}/dataset | Download the exported dataset
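A client for the endpoints above might be sketched with the standard library as follows. Only the paths and HTTP methods come from the list; `BASE_URL` is a placeholder, the helper names are hypothetical, and the request and response bodies are not documented here, so payloads are passed through as caller-supplied dictionaries and responses are returned as raw bytes.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: replace with the real host
EXPERIMENTS = "/scenarios/recycling-robot/experiments"


def build_request(method, path, payload=None):
    """Build (but do not send) a request for one of the scenario endpoints."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    headers = {"Accept": "application/json"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data,
                                  headers=headers, method=method)


def create_experiment(payload):
    # The body schema is not documented on this page; the caller supplies it.
    return build_request("POST", EXPERIMENTS, payload)


def run_experiment(eid, payload):
    return build_request("POST", f"{EXPERIMENTS}/{eid}/run", payload)


def get_robots(eid):
    return build_request("GET", f"{EXPERIMENTS}/{eid}/entities/robot")


def get_events(eid):
    return build_request("GET", f"{EXPERIMENTS}/{eid}/events")


def get_dataset(eid):
    return build_request("GET", f"{EXPERIMENTS}/{eid}/dataset")


def send(request):
    """Send a prepared request and return the raw response body as bytes."""
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Separating `build_request` from `send` keeps the URL construction easy to inspect without a running server. For example, `get_robots("abc").full_url` can be checked before any network call is made.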
SEMANTIC LAYER

OSI-compatible definition, emitted with the dataset.

# recycling-robot.osi.yaml — emitted automatically
semantic_model:
  name: "recycling-robot"
  source: "duckdb://recycling-robot.db"
  entities:
    - name: robot
      primary_key: id
  dimensions:
    - name: state
      type: categorical
    - name: t
      type: time
  measures:
    - name: row_count
      agg: count
    - name: active
      agg: sum
      filter: "state = 'ACTIVE'"