Describe the task the agent should perform in the GUI. The agent runs in a loop: it captures a screenshot, reasons over it, executes an action, then captures a new screenshot to observe the result.
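
The loop described above can be sketched in a few lines. This is a minimal illustration and not the actual implementation: `Action`, `run_gui_loop`, the three callables (`capture`, `reason`, `execute`), the `"done"` stop signal, and the `max_steps` limit are all hypothetical names and assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """Hypothetical action record; `kind` might be 'click', 'type', or 'done'."""
    kind: str
    detail: str = ""


def run_gui_loop(
    task: str,
    capture: Callable[[], bytes],
    reason: Callable[[str, bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Run an observe-reason-act loop and return the actions that were executed.

    All three callables are assumed to be supplied by the caller:
      capture() returns the current screenshot as bytes,
      reason(task, screenshot, history) picks the next action,
      execute(action) performs it on the GUI.
    """
    history: List[Action] = []
    for _ in range(max_steps):        # assumed safety cap so the loop always ends
        screenshot = capture()        # observe
        action = reason(task, screenshot, history)  # reason over the screenshot
        if action.kind == "done":     # assumed stop signal from the reasoner
            break
        execute(action)               # act
        history.append(action)        # next iteration observes the result
    return history
```

Passing the three steps in as callables keeps the sketch independent of any particular screenshot library or model, and the step cap guards against a reasoner that never reports completion.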