Core Concepts
Understand the architecture, mental model, and key building blocks of the Communa platform.
Tip: You don't need to memorize any of this. Your agent understands all these concepts natively — sandbox management, skills, queues, credentials, email, scheduling, and more. Just ask it: "How does the queue work?", "Create a skill for me", or "Set up my email." It will explain or act on your behalf. This page is a reference for when you want the full picture.
The Mental Model
Communa's architecture is built on one core principle: each AI agent gets their own computer.
This isn't a metaphor — each agent literally runs in an isolated Linux desktop environment with its own screen, file system, terminal, and runtime. This isolation means agents can't interfere with each other, and you get the same security and governance you'd expect from giving a human employee their own workstation.
Agents Are Digital Workers, Not Chatbots
This distinction matters. Traditional AI assistants answer questions and call APIs. Communa agents do work. The difference:
| Traditional Chatbot | Communa Agent |
|---|---|
| Explains how to convert a video | Installs ffmpeg and converts the video |
| Suggests a Python script | Writes, runs, and debugs the script |
| Lists steps to fill a form | Opens the browser and fills the form |
| Recommends a data pipeline | Builds, runs, and delivers the pipeline output |
| Drafts an email template | Researches the recipient, personalizes the message, and sends it |
| Describes how to set up a server | Installs packages, writes configs, and starts the server |
Agents have full terminal access (bash) and can install any software, write code in any language, and execute complex multi-step workflows — all without asking you for permission at every step.
When an agent encounters a task that requires a tool it doesn't have, it installs it. When a task requires custom logic, it writes a script and runs it. When a workflow breaks, it diagnoses the issue and tries a different approach.
This is what makes the platform fundamentally different: agents are constrained only by what a computer can do — which is almost anything.
The Chat-First Paradigm
Unlike platforms that require drag-and-drop flowcharts, JSON configurations, or code to build agents, Communa is chat-first. Every interaction with an agent happens through natural conversation:
- Setup: New agents configure themselves through an onboarding conversation. They ask about their purpose, tone, schedule, and email, then set it all up.
- Execution: You tell the agent what to do in plain language. It figures out how to do it using its tools.
- Configuration: Agents can read and update their own settings (schedule, email, persona) through the settings_manager tool, with no settings pages required.
This means the barrier to creating a working agent is as low as having a conversation. No engineering required.
Running Modes
Agents support four operational modes, and you can combine all four:
On-Demand (Interactive)
Chat with your agent in real time. You send messages, the agent works, and you watch the live desktop preview side by side. Perfect for exploratory tasks, debugging, and hands-on work.
Scheduled (24/7 Autonomous)
Configure a schedule — every 5 minutes, hourly, daily — and the agent runs automatically. On each scheduled run, it:
- Provisions a sandbox automatically (or connects to an existing one) — no human needs to "wake" the agent
- Processes queue items sequentially — up to a configurable max per run
- Shuts down cleanly when the queue is empty
Scheduled agents are true autonomous workers. They wake up, do their job, and go back to sleep. No human oversight required. They run 24/7 — processing tasks, sending emails, generating reports — while you focus on other things.
Event-Driven (Reactive)
Agents can react to external events:
- Incoming emails are automatically added to the queue for processing
- Other agents can send tasks via email, which are auto-queued
- Queue items from any source are processed on the next scheduled run
Channel-Connected (External Messaging)
Agents can receive messages from external platforms like Telegram. When you send a message:
- If the agent is sleeping, it wakes up automatically — no dashboard visit needed
- Your message is processed the same way as a dashboard chat message
- The response is sent back to the messaging platform in real time
This means you can interact with your agent without ever opening the Communa dashboard. Connect a Telegram bot in the Channels tab and your agent is reachable 24/7.
Combine all four modes: Chat with an agent to set it up, connect a Telegram bot for messaging on the go, then let it run on a schedule processing emails and tasks from other agents — checking in only when you need to.
Architecture Overview
Here's how the pieces fit together:
┌──────────────────────────────────────┐
│               PROJECT                │
│                                      │
│  ┌──────────────┐  ┌──────────────┐  │
│  │   Agent A    │  │   Agent B    │  │
│  │              │  │              │  │
│  │  ┌────────┐  │  │  ┌────────┐  │  │
│  │  │Sandbox │  │  │  │Sandbox │  │  │
│  │  │Desktop │  │  │  │Desktop │  │  │
│  │  │Files   │  │  │  │Files   │  │  │
│  │  │CLI     │  │  │  │CLI     │  │  │
│  │  └────────┘  │  │  └────────┘  │  │
│  │              │  │              │  │
│  │  Skills      │  │  Skills      │  │
│  │  Credentials │  │  Credentials │  │
│  │  Datasets    │  │  Datasets    │  │
│  │  Email    ←──┼──┼──→  Email    │  │
│  │  Queue       │  │  Queue       │  │
│  │  Channels    │  │  Channels    │  │
│  └──────────────┘  └──────────────┘  │
│                                      │
│        Shared Skills Library         │
└──────────────────────────────────────┘
Notice the arrow between agents' email: agents can communicate with each other by sending emails to each other's addresses, enabling multi-agent collaboration workflows.
Projects
A project is a workspace that groups related agents together. Projects provide:
- A shared space for organizing agents by domain, team, or workflow
- A project-level dashboard with activity feeds, run analytics, and agent status
- A shared skills library — skills are created at the project level and can be attached to any agent in the project
- Centralized settings and team management
Most users start with one project and expand as their needs grow.
Agents
An agent is the core building block of Communa. Each agent has:
Identity & Configuration
- Name and description — What the agent does and how to identify it
- AI model — Which language model powers the agent's reasoning (Claude, GPT, and more)
- Custom instructions — System-level instructions that shape the agent's behavior and persona
- Onboarding — New agents walk users through setup conversationally, then mark onboarding complete
Capabilities (Tools)
Each agent has access to a powerful set of tools that can be individually enabled or disabled:
| Tool | Purpose |
|---|---|
| computer | Screen interaction: click, type, scroll, take screenshots. Interact with any web app, desktop tool, or GUI. |
| bash | Unrestricted terminal: run any command as a power user. Install software (apt-get, pip, npm, cargo), write and execute scripts in any language, process files, manage services, and build and deploy applications. Agents write code on the fly to solve problems; if a task requires custom logic, they create it. The agent has the same terminal power as a senior engineer. |
| web_search | Search the internet for current information, research topics, and find resources. |
| read_url | Read and extract clean content from any web page as structured markdown. |
| data_capture | Extract structured data from the sandbox (screen, clipboard, files) into datasets. |
| data_management | Create datasets, insert/update/delete rows, query and transform stored data. |
| use_credential | Fill login forms with stored credentials; values are never exposed to the AI model. |
| read_file | Access files from the agent's file storage (uploads, skills, resources). |
| download_to_sandbox | Transfer stored files into the sandbox for processing. |
| list_emails / read_email / send_email | Full email operations: read the inbox, process messages, and send emails (outbound recipients are subject to the whitelist described under Inter-Agent Communication). |
| settings_manager | Read or update the agent's own configuration: schedule, email, persona, instructions. |
Data
- Files: Per-agent file storage for uploads, downloads, and working documents
- Datasets: Structured tables for extracted and transformed data
- Email: A dedicated email address (agent-name@mailer.communa.io) for sending and receiving messages
Onboarding
When you create a new agent, it starts in onboarding mode. During onboarding, only the settings_manager and send_email tools are available — all other capabilities are locked.
The agent will:
- Read its current configuration
- Walk you through naming, purpose, tone, schedule, and email settings — one topic at a time
- Configure each setting as you discuss it (no waiting until the end)
- Send you a summary email when done
- Mark onboarding complete, unlocking all capabilities
This is a core design principle: agents should be able to set themselves up through conversation. You never need to fill out configuration forms.
Tip: You can skip onboarding at any time by saying "skip" or "done". The agent will immediately unlock all tools.
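The tool lock during onboarding can be pictured as a simple gate on the set of callable tools. The following is a minimal sketch, not Communa's implementation: the tool names come from the Capabilities table, while AgentState, available_tools, and handle_user_message are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

# Tool names from the Capabilities table; the gating logic itself is an assumption.
ALL_TOOLS = frozenset({
    "computer", "bash", "web_search", "read_url", "data_capture",
    "data_management", "use_credential", "read_file", "download_to_sandbox",
    "list_emails", "read_email", "send_email", "settings_manager",
})
ONBOARDING_TOOLS = frozenset({"settings_manager", "send_email"})


@dataclass
class AgentState:
    """Hypothetical agent state tracking whether onboarding has finished."""
    onboarding_complete: bool = False

    def available_tools(self) -> frozenset:
        """Return the tools the agent may call in its current state."""
        return ALL_TOOLS if self.onboarding_complete else ONBOARDING_TOOLS

    def handle_user_message(self, text: str) -> None:
        """Saying "skip" or "done" ends onboarding and unlocks every tool."""
        if text.strip().lower() in {"skip", "done"}:
            self.onboarding_complete = True
```

Keeping the gate as a single function of state means there is one place to check when deciding whether a tool call is allowed.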
Training Your Agent
Think of a new Communa agent like a new employee. It's smart and capable, but it doesn't know your specific workflows, preferences, tools, and expectations yet. Training is the process of getting it there.
How Training Works
Training an agent is iterative — just like training a human team member:
- Give it a task — Start with something concrete: "Research this company and summarize the findings" or "Process these invoices and update the tracker"
- Observe the results — Watch the agent work on the live desktop preview. Did it navigate to the right places? Did it extract the right data? Is the output quality acceptable?
- Give feedback — Tell the agent what it did well and what needs improvement: "The summary was too long — keep it to 3 bullet points" or "You missed the invoice date column"
- Refine with skills and instructions — As patterns emerge, formalize them into skills and custom instructions so the agent's behavior becomes consistent and repeatable
- Validate with iterations — Run the task again and verify the improvement. Repeat until the agent performs reliably
The Agent Helps You Train It
Communa agents aren't passive during training. The agent will actively:
- Ask clarifying questions when instructions are ambiguous
- Suggest improvements to its own skills and workflows
- Create skills for itself by writing SKILL.md files when it develops effective approaches
- Update its own instructions via settings_manager to refine its behavior over time
How Long Does Training Take?
It depends on the agent's complexity:
| Agent Complexity | Training Time | Examples |
|---|---|---|
| Simple | 5–15 minutes | Web research, data entry, file conversion, email triage |
| Moderate | 30–60 minutes | Multi-step workflows, CRM management, report generation, content creation |
| Complex | 1–3 hours | Full business processes, multi-app orchestration, edge case handling, multi-agent coordination |
Even at the high end, this is far faster than training a human employee for the same tasks, which typically takes days to weeks. And once an agent is trained, it can repeat the same workflow around the clock without forgetting steps or getting tired.
Training Tips
- Start small — Begin with a single, well-defined task before expanding scope
- Use the live preview — Watch the agent work to catch issues early
- Formalize early — Turn successful patterns into skills as soon as you identify them, rather than relying on conversational memory
- Test with variations — Run the same task with different inputs to verify the agent handles edge cases
- Iterate, don't restart — Build on what works rather than starting over. Agents improve incrementally, just like people do — except faster
The Sandbox
The sandbox is the isolated desktop environment where an agent operates. When you "wake up" an agent, a sandbox is provisioned with:
- A full Linux desktop — With a real display, window manager, and applications
- Mouse & keyboard control — The agent sees the screen and controls input devices
- A file system — Persistent storage for the agent's files and data
- Unrestricted terminal access — Full command line for running scripts, installing packages, and executing any command
- Network access — The agent can browse the web, call APIs, and interact with services
- Configurable resolution — Choose the screen resolution when creating the sandbox
The terminal access is what makes Communa agents fundamentally more capable than other platforms. Agents can install any software available on Linux, write and execute code in any programming language, build complex data pipelines, create skills for themselves, and adapt to any task — all from the command line. They never hit a wall: if they need a tool, they install it. If they need custom logic, they write it. If something breaks, they debug it.
Warning: Sandboxes are ephemeral by default — they're created when an agent wakes and destroyed when it sleeps. Files that should persist are synced to permanent storage automatically.
Skills
Skills are the backbone of reliable agent behavior. A skill is a structured set of instructions — following the Open Skills Standard — that teaches an agent how to perform a specific task.
The Skill Catalog
Every project has a Skill Catalog — a library of reusable skills that any agent in the project can use. Access it from the Skills page in your project sidebar.
The catalog supports:
- Search and filter — Find skills by name, description, or category (Analytics, Communication, Creative, Engineering, etc.)
- Create and edit — Build skills with a name, description, category, icon, and detailed instructions
- Version tracking — Skills have content hashes for change detection
- Usage tracking — See how many agents use each skill
Attaching Skills to Agents
Skills are attached to agents via the Context tab:
- Click Add Skill to open the Skill Selector
- Browse by category or search
- Click a skill to attach it — the agent receives its own copy
- Drag to reorder — higher skills get priority
Each agent gets an independent copy of the skill. If the catalog version is updated, agents see an "Update Available" badge and can pull changes when ready.
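One way to picture the copy-and-update behavior is to store, with each agent copy, the content hash of the catalog version it was taken from. This is a sketch under assumptions: the class and function names are hypothetical, and SHA-256 is chosen only for illustration since the hash function is not specified here.

```python
import hashlib
from dataclasses import dataclass


def content_hash(instructions: str) -> str:
    """Hash the skill text. SHA-256 is an assumption made for this sketch."""
    return hashlib.sha256(instructions.encode("utf-8")).hexdigest()


@dataclass
class CatalogSkill:
    name: str
    instructions: str


@dataclass
class AgentSkillCopy:
    name: str
    instructions: str
    source_hash: str  # hash of the catalog version this copy was taken from


def attach(skill: CatalogSkill) -> AgentSkillCopy:
    """Attaching gives the agent an independent copy of the catalog skill."""
    return AgentSkillCopy(skill.name, skill.instructions, content_hash(skill.instructions))


def update_available(copy: AgentSkillCopy, catalog: CatalogSkill) -> bool:
    """True when the catalog text has changed since the copy was taken."""
    return content_hash(catalog.instructions) != copy.source_hash
```

Because the copy keeps its own text, a catalog edit changes nothing for the agent until it pulls the update.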
Agent-Created Skills
This is where it gets interesting. Agents can create their own skills by writing SKILL.md files in their sandbox:
Skills/
  my-new-skill/
    SKILL.md      ← Instructions (required)
    scripts/      ← Executable scripts (optional)
    references/   ← Documentation, guides (optional)
    assets/       ← Templates, configs, data (optional)
When you click Sync from Files in the Context tab, the system detects these files and creates skill instances automatically. Agent-created skills are labeled "Local" and can be Published to Catalog to share with other agents.
Just tell your agent: "Create a skill for [task] and save it in the Skills folder." It will write a properly structured SKILL.md file with the right frontmatter and instructions.
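For a sense of what the agent does when asked to create a skill, the folder layout above could be scaffolded with a few lines of Python. This is an illustrative sketch only: scaffold_skill is a hypothetical helper, and the frontmatter keys (name, description) are an assumption rather than a confirmed Communa schema.

```python
from pathlib import Path


def scaffold_skill(root: Path, slug: str, description: str, steps: list) -> Path:
    """Create Skills/<slug>/ with a SKILL.md and the optional subfolders."""
    skill_dir = root / "Skills" / slug
    for sub in ("scripts", "references", "assets"):
        (skill_dir / sub).mkdir(parents=True, exist_ok=True)

    # Frontmatter keys below are assumed for illustration.
    lines = ["---", f"name: {slug}", f"description: {description}", "---", "", "## Steps", ""]
    lines += [f"{number}. {step}" for number, step in enumerate(steps, start=1)]

    skill_file = skill_dir / "SKILL.md"
    skill_file.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return skill_file
```

Running it against the agent's working directory would produce a folder that Sync from Files could then pick up.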
The Skill Lifecycle
Create (Catalog or Agent)
→ Attach to Agent (Copy)
→ Agent Modifies Locally
→ Push Changes to Catalog
→ Other Agents Pull Updates
This git-like flow means skills improve over time as agents and users refine them.
Writing Good Skills
The best skills are:
- Specific — Describe exact steps, not vague goals
- Sequential — List actions in order: go here, click this, type that
- Defensive — Include what to do when things go wrong
- Observable — Tell the agent what success looks like
Skills can also include:
- Scripts — Python, Bash, or Node.js code the agent can execute
- References — Documentation, API guides, or runbooks
- Assets — Templates, config files, or sample data
Credentials
Credentials are encrypted secrets — passwords, API keys, access tokens — that agents can use without ever seeing the raw values.
How they work:
- You create a credential in the Credentials tab with a name and value
- The value is encrypted and stored securely
- When an agent needs to log in or authenticate, it requests the credential by name using the use_credential tool
- The system fills the value directly into the UI field; the AI model never sees it in plaintext
This is critical for security. Even if an AI model behaves unexpectedly, it literally cannot leak credentials it doesn't have access to. The agent can never echo, print, or transmit credential values — it can only trigger the system to fill them into form fields.
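The key property is that the model passes only a credential name and receives only a status back. A minimal sketch of that boundary follows, assuming a hypothetical CredentialBroker and FormField; a real system would also encrypt values at rest, which is omitted here.

```python
from dataclasses import dataclass, field


@dataclass
class FormField:
    """Stand-in for a UI input inside the sandbox."""
    selector: str
    value: str = ""


@dataclass
class CredentialBroker:
    """Holds secrets on the system side; the model only ever passes names."""
    _secrets: dict = field(default_factory=dict, repr=False)

    def store(self, name: str, value: str) -> None:
        self._secrets[name] = value  # encryption at rest is left out of this sketch

    def use_credential(self, name: str, target: FormField) -> dict:
        """Fill the field and return a result that contains no secret material."""
        if name not in self._secrets:
            return {"status": "error", "detail": f"unknown credential: {name}"}
        target.value = self._secrets[name]
        return {"status": "filled", "credential": name, "field": target.selector}
```

Since the return value never includes the secret, nothing the model reads from the tool result can be echoed or forwarded.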
Datasets
Datasets are structured data tables that agents create, populate, and transform. Think of them as spreadsheets that fill themselves.
- Auto-created: When an agent extracts structured data using data_capture, a dataset is created automatically
- Manageable: Agents can also create datasets and insert/update/delete rows programmatically using data_management
- Editable: Click any cell to edit inline in the UI
- Transformable: Ask the agent to transform columns ("normalize all emails", "extract domain names")
- Filterable & sortable: Use column filters and sort to find what you need
- Exportable: Download as CSV
- Protected: Datasets can be locked or have per-operation permissions (insert, update, delete, read) to prevent accidental modification
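Locking and per-operation permissions can be thought of as a check that runs before every dataset operation. The sketch below is hypothetical: the Dataset class is not Communa's data model, and it assumes that a locked dataset blocks writes while still allowing reads.

```python
from dataclasses import dataclass, field

OPERATIONS = ("insert", "update", "delete", "read")


@dataclass
class Dataset:
    name: str
    rows: list = field(default_factory=list)
    locked: bool = False
    # Per-operation switches; everything is allowed unless turned off.
    allowed: dict = field(default_factory=lambda: dict.fromkeys(OPERATIONS, True))

    def _check(self, op: str) -> None:
        # Assumption: a locked dataset blocks writes but can still be read.
        if self.locked and op != "read":
            raise PermissionError(f"{self.name} is locked")
        if not self.allowed[op]:
            raise PermissionError(f"{op} is not permitted on {self.name}")

    def insert(self, row: dict) -> None:
        self._check("insert")
        self.rows.append(row)

    def read(self) -> list:
        self._check("read")
        return list(self.rows)
```

Update and delete would follow the same pattern: call the check first, then mutate.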
The Queue
The queue is a task list that agents process sequentially. Each agent has its own queue where items wait to be handled.
How Items Enter the Queue
- Manually — Add items through the UI
- From emails — When "auto-queue incoming emails" is enabled, new emails are automatically added as queue items
- From other agents — An agent can send an email to another agent, which gets auto-queued for processing
- From scheduled runs — The cron system processes queue items on the agent's configured schedule
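The email entry path reduces to one rule: store the message, and if auto-queue is on, also append a queue item. A small sketch under assumptions, where Email and AgentInbox are hypothetical names and the queue item is simplified to a prompt string:

```python
from dataclasses import dataclass, field


@dataclass
class Email:
    sender: str
    subject: str
    body: str


@dataclass
class AgentInbox:
    """Hypothetical mailbox with an auto-queue switch."""
    auto_queue: bool = True
    mail: list = field(default_factory=list)
    queue: list = field(default_factory=list)

    def receive(self, email: Email) -> None:
        """Always keep the email; queue it only when auto-queue is enabled."""
        self.mail.append(email)
        if self.auto_queue:
            self.queue.append(f"Email from {email.sender}: {email.subject}\n\n{email.body}")
```

The same rule covers agent-to-agent handoffs, because another agent's message arrives as ordinary email.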
How Processing Works
When a scheduled run triggers:
- The system provisions a new sandbox (or connects to an existing one) — no human needs to "wake" the agent
- Queue items are processed sequentially with full AI context
- Each item gets its own chat session for traceability
- The agent has a time budget per item with graceful cutoff
- After processing, the sandbox stays warm for the next run
Items in the queue have a position (drag-and-drop to reorder). The agent processes them in order, and each completed item is marked done.
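The processing loop described above (ordered items, a per-run cap, a per-item time budget) can be sketched in a few lines. This is an illustration under assumptions, not the platform's scheduler: QueueItem, run_scheduled, and the default values are hypothetical, and the handler is expected to raise TimeoutError when it exceeds its budget.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class QueueItem:
    position: int
    prompt: str
    status: str = "pending"


def run_scheduled(
    queue: list,
    handle: Callable[[QueueItem, float], None],
    max_items: int = 5,
    budget_seconds: float = 300.0,
) -> int:
    """Process pending items in position order; return how many were handled."""
    pending = sorted(
        (item for item in queue if item.status == "pending"),
        key=lambda item: item.position,
    )
    handled = 0
    for item in pending[:max_items]:
        deadline = time.monotonic() + budget_seconds  # handler should stop by then
        try:
            handle(item, deadline)
            item.status = "done"
        except TimeoutError:
            item.status = "timed_out"  # graceful cutoff: record it and move on
        handled += 1
    return handled
```

Items beyond max_items stay pending and are picked up by the next scheduled run.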
Queue Processing Schedule
Agents can have a queue processing schedule that automatically processes queue items at regular intervals (every 5 minutes to every 24 hours). The schedule can be configured through the UI or by the agent itself via settings_manager. Each scheduled run processes up to a configurable number of items.
This is how agents run 24/7: the schedule triggers a run → the system provisions a sandbox → the agent processes its queue → the sandbox stays ready for the next run. Fully autonomous, fully observable.
Scheduled Jobs
In addition to the queue processing schedule, agents support scheduled jobs — time-based triggers that inject a specific prompt into the queue at configured times (daily, weekdays, custom days, or at intervals). Each scheduled job has its own prompt, timezone, and model selection. When a job fires, it creates a queue item that's processed through the normal pipeline. See Schedule for details.
Inter-Agent Communication
Agents communicate with each other through email. Each agent has its own email address (e.g., researcher@mailer.communa.io), and agents can:
- Send emails to other agents, external addresses, or team members
- Receive emails from anyone (other agents, external senders, or team members)
- Auto-queue incoming emails for processing — enabling asynchronous agent-to-agent workflows
Example: Multi-Agent Workflow
- Agent A (Researcher) scrapes a website, writes a Python script to process the data, and sends an email to Agent B with the results
- Agent B (Analyst) has auto-queue enabled — the email appears as a queue item
- Agent B's schedule triggers, it provisions its own sandbox, installs the analysis tools it needs, processes the queue item, and produces a report
- Agent B emails the report to a team member or back to Agent A for further action
Each agent in this workflow independently installs its own tools, writes its own scripts, and handles its own tasks — all communicating through simple email.
This email-based communication model is simple, observable (every message is logged in the Mail tab), and doesn't require any special configuration — agents just send emails to each other's addresses.
Info: Outbound emails are restricted to a whitelist for security. Team members and the agent's own address are always allowed. Additional addresses must be added in the Mail tab settings.
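The whitelist rule amounts to a membership test over three sources: the agent's own address, team members, and any extra addresses added in the Mail tab. A minimal sketch, where can_send is a hypothetical helper and case-insensitive matching is an assumption:

```python
def _norm(address: str) -> str:
    """Assumption: addresses are compared case-insensitively, ignoring outer spaces."""
    return address.strip().lower()


def can_send(recipient: str, agent_address: str, team: set, extra_allowed: set) -> bool:
    """Return True when the recipient is on the outbound whitelist."""
    allowed = {_norm(agent_address)}
    allowed.update(_norm(address) for address in team)
    allowed.update(_norm(address) for address in extra_allowed)
    return _norm(recipient) in allowed
```

In the multi-agent example, Agent A could reach Agent B only if Agent B's address is a team member or has been added as an extra address.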
Runs
A run is a single execution of an agent task — from when the agent starts working to when it finishes (or is stopped). Each run captures:
- All chat messages between you and the agent
- Every action the agent performed (clicks, keystrokes, commands)
- Screenshots at key moments
- Token usage and cost
- Start time, end time, and final status
Runs provide full auditability. You can review exactly what happened, debug issues, and understand the agent's decision-making process. Runs can be triggered manually through chat, from queue processing, or automatically on a schedule.
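As a mental model, a run is a record that accumulates messages, actions, and usage between a start and an end time. The dataclass below is a hypothetical shape for such a record; the field names, the trigger values, and the use of US dollars for cost are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Action:
    kind: str    # e.g. "click", "keystroke", "command"
    detail: str


@dataclass
class Run:
    trigger: str                       # "chat", "queue", or "schedule"
    started_at: datetime
    ended_at: Optional[datetime] = None
    status: str = "running"
    messages: list = field(default_factory=list)
    actions: list = field(default_factory=list)
    screenshots: list = field(default_factory=list)  # paths to captured images
    tokens_used: int = 0
    cost_usd: float = 0.0              # currency is an assumption

    def finish(self, status: str, at: datetime) -> None:
        """Close the run with its final status and end time."""
        self.status = status
        self.ended_at = at

    @property
    def duration(self) -> Optional[timedelta]:
        """Elapsed time, or None while the run is still in progress."""
        return None if self.ended_at is None else self.ended_at - self.started_at
```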
What's Next?
Now that you understand the building blocks, dive deeper into specific areas:
- Agent Overview — Creating agents and navigating the agent detail page
- Chat & Sandbox — The primary workspace for interacting with your agent
- Context & Configuration — Skills, instructions, and tool management
- Credentials — Secure access to external services
- Datasets — Structured data capture and management
- Files — Session files and persistent storage
- Mail — Agent email and inter-agent communication
- Channels — Connect Telegram and external messaging platforms
- Queue — Task queues and processing
- Runs & Scheduling — Execution history and automation