GRAPH-BASED MEMORY ARCHITECTURE FOR EFFICIENT LLM-BASED COMPUTER USE AGENTS
DOI: https://doi.org/10.31891/2219-9365-2026-85-20

Keywords: LLMs, Computer Use Agents, Graph Memory, GUI Automation, Knowledge Reuse, Task Efficiency, OSWorld

Abstract
Large language model (LLM)-driven agents that use computers (Computer Use Agents, CUAs) often waste computation by repeatedly reasoning through tasks they have solved before. We address this inefficiency by introducing a graph-based memory architecture for GUI automation. In our approach, the agent stores its interaction traces in a dynamic graph, where nodes represent application screens and edges encode action sequences (e.g. GUI scripts) leading to state transitions. Reusing this graph of past experiences allows the agent to recall low-level actions and high-level workflows instead of re-computing them from scratch. We implement our memory-augmented framework by extending a state-of-the-art CUA (Agent S3) with a persistent memory layer. Experiments on the OSWorld benchmark show that our method reduces LLM token consumption and execution time by roughly 50–60% compared to the memoryless baseline, without degrading success rates. This graph memory enables the agent to efficiently recall exact UI manipulations and to reason over abstract tasks (like "login" or "export report") as reusable subroutines. Our findings suggest that structured memory significantly improves the practicality of LLM-based UI agents for real-world repetitive tasks.
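The node/edge scheme described in the abstract (screens as nodes, recorded action sequences as edges) can be sketched roughly as follows. This is a hypothetical illustration, not the authors' implementation: the class and method names (`GraphMemory`, `record`, `recall`) and the use of breadth-first search for multi-hop recall are all assumptions.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class ScreenNode:
    """A node representing an application screen, keyed by a state identifier."""
    state_id: str
    # Edges map a destination screen to the action sequence that reaches it.
    edges: dict = field(default_factory=dict)

class GraphMemory:
    """Minimal sketch of a graph memory for a computer-use agent:
    application screens are nodes, cached GUI action sequences are edges."""

    def __init__(self):
        self.nodes = {}

    def record(self, src_state, actions, dst_state):
        """Store a successful transition: executing `actions` on `src_state`
        brought the GUI to `dst_state`."""
        node = self.nodes.setdefault(src_state, ScreenNode(src_state))
        node.edges[dst_state] = list(actions)
        self.nodes.setdefault(dst_state, ScreenNode(dst_state))

    def recall(self, src_state, dst_state):
        """Return a cached action sequence from src to dst (possibly multi-hop
        via breadth-first search), or None if no path is known. A hit lets the
        agent replay stored actions instead of re-running LLM reasoning."""
        queue = deque([(src_state, [])])
        seen = {src_state}
        while queue:
            state, plan = queue.popleft()
            if state == dst_state:
                return plan
            for nxt, actions in self.nodes.get(state, ScreenNode(state)).edges.items():
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, plan + actions))
        return None

# Example: a high-level task like "export report" becomes a reusable subroutine
# once its constituent screen transitions have been recorded.
mem = GraphMemory()
mem.record("login_screen", ["type user", "type pass", "click Login"], "dashboard")
mem.record("dashboard", ["click Export"], "report_saved")
plan = mem.recall("login_screen", "report_saved")  # replays 4 cached actions
```

Stitching single-step edges into multi-step plans is what lets abstract workflows ("login", "export report") be recalled as one unit, which is where the reported token and latency savings would come from.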
License
Copyright (c) 2026 Andrii MUSIIENKO, Danylo VORVUL

This work is licensed under a Creative Commons Attribution 4.0 International License.

