Orby ActIO Large Action Model

The Industry's First Agentic-AI Foundation Model Purpose-Built for the Enterprise

ActIO is the most capable and agile agentic AI foundation model on the market. ActIO uniquely provides four key building blocks for enterprise generative AI: Visual Grounding, Content Understanding, Planning, and Task Modeling.

Co-Founder and CTO Will Lu on Orby's ActIO Large Action Model

VISUAL GROUNDING

Locate the most relevant object or region in an image, based on a natural language query. The query can be a phrase, a sentence, or an instruction.


CONTENT UNDERSTANDING

Comprehend and interpret the meaning, context, and nuances of various forms of content, such as text, images, GUIs, and documents. This allows AI agents to perform tasks that require layout and GUI understanding, beyond mere semantic comprehension.

PLANNING

Formulate a sequence of actions or decisions to achieve a specific goal or set of goals. Planning involves reasoning about the future, taking into account the current state of the environment, the possible actions, and the outcomes of those actions.
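As a rough illustration of planning in this sense (not ActIO's actual method, which this page does not describe), the sketch below searches over a toy GUI state for the shortest sequence of actions that reaches a goal. The state layout, the action names, and the `plan` helper are all hypothetical and chosen only for this example.

```python
from collections import deque

def plan(start, goal_test, actions):
    """Breadth-first search over states.

    Returns the shortest list of action names that reaches a goal state,
    or None if no sequence of the given actions reaches one.
    """
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, path = frontier.popleft()
        if goal_test(state):
            return path
        for name, applicable, apply_action in actions:
            if applicable(state):
                nxt = apply_action(state)  # predicted outcome of the action
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, path + [name]))
    return None

# Hypothetical toy state: (current_page, form_filled, submitted)
ACTIONS = [
    ("open_form", lambda s: s[0] == "home", lambda s: ("form", s[1], s[2])),
    ("fill_form", lambda s: s[0] == "form" and not s[1], lambda s: ("form", True, s[2])),
    ("submit", lambda s: s[0] == "form" and s[1], lambda s: ("done", True, True)),
]

steps = plan(("home", False, False), lambda s: s[2], ACTIONS)
print(steps)  # ['open_form', 'fill_form', 'submit']
```

Each action carries a precondition (when it is possible) and an effect (its outcome), which mirrors the three ingredients named above: current state, possible actions, and outcomes.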


TASK MODELING

Understanding sequences of actions taken by users, predicting user intent, and constructing workflows are critical capabilities for enabling AI agents to learn from demonstrations. This involves observing and analyzing the actions users take to achieve specific outcomes, inferring their intentions, then using this intelligence to create automated workflows.
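One simple way to picture learning a workflow from demonstrations is to keep only the steps that every demonstration shares, in order. The sketch below does this with a longest-common-subsequence fold over recorded action traces. It is an illustrative heuristic under assumed inputs, not a description of how ActIO models tasks; the trace contents and the `infer_workflow` name are hypothetical.

```python
from functools import reduce

def lcs(a, b):
    """Longest common subsequence of two action lists (dynamic programming)."""
    m, n = len(a), len(b)
    # table[i][j] = LCS length of a[i:] and b[j:]
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = 1 + table[i + 1][j + 1]
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table to recover one longest common subsequence.
    out, i, j = [], 0, 0
    while i < m and j < n:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return out

def infer_workflow(demonstrations):
    """Keep the ordered steps common to all demonstrations; drop incidental ones."""
    return reduce(lcs, demonstrations)

# Hypothetical recordings of the same task by three users.
demos = [
    ["open_invoice", "copy_total", "check_email", "open_erp", "paste_total", "save"],
    ["open_invoice", "copy_total", "open_erp", "paste_total", "scroll", "save"],
    ["login", "open_invoice", "copy_total", "open_erp", "paste_total", "save"],
]

workflow = infer_workflow(demos)
print(workflow)
# ['open_invoice', 'copy_total', 'open_erp', 'paste_total', 'save']
```

Steps that appear in only some recordings (here `check_email`, `scroll`, `login`) are treated as incidental and left out of the inferred workflow.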

“Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents.”

PEERLESS PERFORMANCE AND ACCURACY

ActIO has shown state-of-the-art performance across top GUI agent benchmarks, outperforming existing multimodal models. These benchmarks cover web, desktop, and mobile scenarios in both online and offline settings.

Chart: Grounding evaluation in multiple GUI environments

On the VisualWebBench test, ActIO-7b outperforms top models such as GPT-4o, Gemini 1.5 Pro, and LLaVA 1.6-34B.

ActIO also demonstrates state-of-the-art effectiveness and proficiency in supporting GUI agents.

Detailed Large Action Model evaluation results are available on LAMB.