OTHER

smolagents

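A minimal Python library for building agents that think by executing code. It provides a lightweight architecture in which the language model generates and runs code directly, which keeps agent implementations simple and easy to extend.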

Original: 🤗 smolagents: a barebones library for agents that think in code.

#agents #Python #code-execution

EDITOR'S TAKE

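Editor's notes

A minimal agent framework whose core logic fits in roughly 1,000 lines, with the LLM writing code that is then executed as its actions

Hugging Face released this agent-building library at the end of 2024. Its defining trait is a simple "code-first" design in which the language model generates Python code that is executed directly as its actions. The implementation is lightweight enough to serve as learning material, and it remains extensible: several sandbox environments for safer execution, plus integration with MCP servers and Hub Spaces. That simplicity has a cost, though: complex state management and memory mechanisms are minimal, and robustness in production is still unproven.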

USE CASES

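When to use it

Workflows in which the LLM accesses its environment directly and executes code (combining web search, API calls, and local computation)
Turning existing tools and APIs into LLM agents through the Hugging Face Hub and sharing them among multiple users
Learning the basics of agent implementation, or building small prototypes quickly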

DIFFERENTIATOR

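How it differs from similar tools

LangChain emphasizes a large ecosystem, and AutoGen targets cooperation among multiple agents. smolagents instead keeps its line count as low as possible so that the reasoning logic, the "what to do", stays readable. Integration with the Hugging Face Hub also makes it easy to distribute trained models and agent definitions together.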

CAVEAT

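Caveats and poor fits

⚠️ The library was first released in December 2024, so its stability and error-handling strategy in production are still being validated. It is a poor fit for applications that need memory management or complex state retention. Python only.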

BEST FOR

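Who it suits

Developers looking for a simple agent implementation
AI researchers and students who want to use an LLM's code-generation ability directly
Startups that want to prototype quickly with a lightweight framework
Beginners who want to learn how agents are implemented

Independent assessment by the OSS Agents JP editorial team (observations on smolagents)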

REPO STATS

Repository statistics

⭐ Stars: 27.3k
🍴 Forks: 2.6k
⚠️ Open Issues: 537
🌿 Language: Python
📄 License: Apache-2.0
🕒 Last updated: 2026.05.14 (2 days ago)
📅 Published: 2024.12.05
🌿 Branch: main

REFERENCE

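Official documentation (README)

The hub's own assessment is the "Editor's notes" section above. What follows is the GitHub README, reproduced for reference.

📖 GitHub README, translated (AI machine translation, for reference)

AI machine translation (updated 2026.05.15). Check the original on GitHub for accurate information.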

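Agents that think in code!

smolagents is a library that lets you run powerful agents in a few lines of code. It offers:

✨ Simplicity: the agent logic fits in about 1,000 lines of code (see agents.py). Abstractions are kept to a minimum, staying close to raw code.

🧑‍💻 First-class support for Code Agents. Our CodeAgent writes its actions in code (as opposed to "agents being used to write code"). For security, it supports execution in sandboxed environments via Blaxel, E2B, Modal, Docker, or a Pyodide+Deno WebAssembly sandbox.

🤗 Hub integrations: you can share tools or agents to the Hub and pull them from it, so the most efficient agents can be shared instantly.

🌐 Model-agnostic: smolagents supports any LLM. It can be a local transformers or ollama model, one of the many providers on the Hub, or any model from OpenAI, Anthropic, and others via the LiteLLM integration.

👁️ Modality-agnostic: agents support text, vision, video, and even audio inputs. See this tutorial for vision.

🛠️ Tool-agnostic: you can use tools from any MCP server or from LangChain, and you can even use a Hub Space as a tool.

The full documentation is available here.

Note

Check out our launch blog post to learn more about smolagents.

Quick demo

First, install the package with a default set of tools: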

pip install "smolagents[toolkit]"

Then define your agent, give it the tools it needs, and run it.

from smolagents import CodeAgent, WebSearchTool, InferenceClientModel

model = InferenceClientModel()
agent = CodeAgent(tools=[WebSearchTool()], model=model, stream_outputs=True)

agent.run("How many seconds would it take for a leopard at full speed to run through Pont des Arts?")

You can also share your agent to the Hub as a Space repository.

agent.push_to_hub("m-ric/my_agent")

# agent.from_hub("m-ric/my_agent") to load an agent from Hub

Our library is LLM-agnostic: you can switch the example above to any inference provider.

InferenceClientModel: a gateway to all inference providers supported on HF

from smolagents import InferenceClientModel

model = InferenceClientModel(
    model_id="deepseek-ai/DeepSeek-R1",
    provider="together",
)

LiteLLM: access to 100+ LLMs

import os
from smolagents import LiteLLMModel

model = LiteLLMModel(
    model_id="anthropic/claude-4-sonnet-latest",
    temperature=0.2,
    api_key=os.environ["ANTHROPIC_API_KEY"]
)

OpenAI-compatible servers: Together AI

import os
from smolagents import OpenAIModel

model = OpenAIModel(
    model_id="deepseek-ai/DeepSeek-R1",
    api_base="https://api.together.xyz/v1/", # Leave this blank to query OpenAI servers.
    api_key=os.environ["TOGETHER_API_KEY"], # Switch to the API key for the server you're targeting.
)

OpenAI-compatible servers: OpenRouter

import os
from smolagents import OpenAIModel

model = OpenAIModel(
    model_id="openai/gpt-4o",
    api_base="https://openrouter.ai/api/v1", # Leave this blank to query OpenAI servers.
    api_key=os.environ["OPENROUTER_API_KEY"], # Switch to the API key for the server you're targeting.
)

Local transformers model

from smolagents import TransformersModel

model = TransformersModel(
    model_id="Qwen/Qwen3-Next-80B-A3B-Thinking",
    max_new_tokens=4096,
    device_map="auto"
)

Azure models

import os
from smolagents import AzureOpenAIModel

model = AzureOpenAIModel(
    model_id=os.environ.get("AZURE_OPENAI_MODEL"),
    azure_endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
    api_key=os.environ.get("AZURE_OPENAI_API_KEY"),
    api_version=os.environ.get("OPENAI_API_VERSION"),
)

Amazon Bedrock models

import os
from smolagents import AmazonBedrockModel

model = AmazonBedrockModel(
    model_id=os.environ.get("AMAZON_BEDROCK_MODEL_ID"),
)

CLI

You can run agents from the CLI with two commands: smolagent and webagent.

smolagent is a general-purpose command that runs a multi-step CodeAgent, which can be equipped with various tools.

# Run with direct prompt and options
smolagent "Plan a trip to Tokyo, Kyoto and Osaka between Mar 28 and Apr 7."  --model-type "InferenceClientModel" --model-id "Qwen/Qwen3-Next-80B-A3B-Thinking" --imports pandas numpy --tools web_search

# Run in interactive mode (launches setup wizard when no prompt provided)
smolagent

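Interactive mode guides you through:

Agent type selection (CodeAgent vs ToolCallingAgent)
Tool selection from the available toolbox
Model configuration (type, ID, API settings)
Advanced options such as additional imports
Task prompt input

webagent, by contrast, is a dedicated web-browsing agent built on helium (read more here).

For example: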

webagent "go to xyz.com/men, get to sale section, click the first clothing item you see. Get the product details, and the price, return them. note that I'm shopping from France" --model-type "LiteLLMModel" --model-id "gpt-5"

How do Code agents work?

Our CodeAgent works mostly like a classical ReAct agent, except that the LLM engine writes its actions as Python code snippets.

flowchart TB
    Task[User Task]
    Memory[agent.memory]
    Generate[Generate from agent.model]
    Execute[Execute Code action - Tool calls are written as functions]
    Answer[Return the argument given to 'final_answer']

    Task -->|Add task to agent.memory| Memory

    subgraph ReAct[ReAct loop]
        Memory -->|Memory as chat messages| Generate
        Generate -->|Parse output to extract code action| Execute
        Execute -->|No call to 'final_answer' tool => Store execution logs in memory and keep running| Memory
    end
    
    Execute -->|Call to 'final_answer' tool| Answer

    %% Styling
    classDef default fill:#d4b702,stroke:#8b7701,color:#ffffff
    classDef io fill:#4a5568,stroke:#2d3748,color:#ffffff
    
    class Task,Answer io

Actions are now Python code snippets, so tool calls are performed as Python function calls. For example, here is how the agent can run web searches for several queries in a single action:

requests_to_search = ["gulf of mexico america", "greenland denmark", "tariffs"]
for request in requests_to_search:
    print(f"Here are the search results for {request}:", web_search(request))

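Writing actions as code snippets has been shown to work better than the current industry practice of having the LLM output a dictionary of the tools it wants to call: it uses 30% fewer steps (and therefore 30% fewer LLM calls) and reaches higher performance on difficult benchmarks. See our high-level introduction to agents for details.

Because code execution can be a serious security concern (arbitrary code execution!), agent code should run in a sandbox. Several options are supported:

E2B, Blaxel, Modal: managed cloud sandboxes, simplest to set up
Docker: self-hosted container isolation
Pyodide+Deno WebAssembly: lightweight sandbox for browser or edge environments

The built-in LocalPythonExecutor is not a security sandbox. It applies some restrictions, but they can be bypassed, and it must not be used as a security boundary.

Alongside CodeAgent, we also provide the standard ToolCallingAgent, which writes its actions as JSON/text blobs. You can pick whichever style best suits your use case.

How smol is this library?

We worked to keep abstractions to a strict minimum: the main code in agents.py is under 1,000 lines. Even so, it implements several types of agents: CodeAgent writes its actions as Python code snippets, and the more classic ToolCallingAgent uses built-in tool-calling methods. It also offers multi-agent hierarchies, imports from tool collections, remote code execution, vision models, and more.

By the way, why use a framework at all? Because a large part of this is non-trivial. For instance, a code agent has to keep a consistent code format across its system prompt, its parser, and its execution. Our framework handles this complexity for you. Of course, you are still encouraged to hack into the source code and use only the parts you need.

How strong are open models for agentic workflows?

We created CodeAgent instances with several leading models and compared them on this benchmark, which gathers questions from a few different benchmarks to offer a varied mix of challenges.

See the benchmarking code here for details on the agentic setup used, and for a comparison of LLM code agents against the vanilla setup (spoiler: code agents work better).

This comparison shows that open-source models can now compete with the best closed-source models.

Security

Security is a critical consideration when working with agentic workflows.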

📖 GitHub README, original text (English, for reference)

Original text retrieved from GitHub. See GitHub for the full version.

Agents that think in code!

smolagents is a library that enables you to run powerful agents in a few lines of code. It offers:

✨ Simplicity: the logic for agents fits in ~1,000 lines of code (see agents.py). We kept abstractions to their minimal shape above raw code!

🧑‍💻 First-class support for Code Agents. Our CodeAgent writes its actions in code (as opposed to "agents being used to write code"). To make it secure, we support executing in sandboxed environments via Blaxel, E2B, Modal, Docker, or Pyodide+Deno WebAssembly sandbox.

🤗 Hub integrations: you can share/pull tools or agents to/from the Hub for instant sharing of the most efficient agents!

🌐 Model-agnostic: smolagents supports any LLM. It can be a local transformers or ollama model, one of many providers on the Hub, or any model from OpenAI, Anthropic and many others via our LiteLLM integration.

👁️ Modality-agnostic: Agents support text, vision, video, even audio inputs! Cf this tutorial for vision.

🛠️ Tool-agnostic: you can use tools from any MCP server, from LangChain, you can even use a Hub Space as a tool.

Full documentation can be found here.

Note

Check out our launch blog post to learn more about smolagents!

Quick demo

First install the package with a default set of tools:

pip install "smolagents[toolkit]"

Then define your agent, give it the tools it needs and run it!

from smolagents import CodeAgent, WebSearchTool, InferenceClientModel

model = InferenceClientModel()
agent = CodeAgent(tools=[WebSearchTool()], model=model, stream_outputs=True)

agent.run("How many seconds would it take for a leopard at full speed to run through Pont des Arts?")
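
For this task, the agent would typically end up writing a small piece of arithmetic as one of its code actions. The sketch below shows what that final step might look like. The bridge length and the leopard's top speed are illustrative assumptions, not values from the README or from a real agent run; an actual agent would look them up with its search tool.

```python
# Illustrative assumptions, not results of a real agent run.
BRIDGE_LENGTH_M = 155.0    # assumed length of Pont des Arts, in metres
LEOPARD_SPEED_KMH = 58.0   # assumed top speed of a leopard, in km/h


def crossing_time_seconds(length_m: float, speed_kmh: float) -> float:
    """Return the time in seconds needed to cover length_m at a constant speed_kmh."""
    speed_ms = speed_kmh * 1000.0 / 3600.0  # convert km/h to m/s
    return length_m / speed_ms


seconds = crossing_time_seconds(BRIDGE_LENGTH_M, LEOPARD_SPEED_KMH)
print(f"About {seconds:.1f} seconds")
```

With different source figures the result changes accordingly; the point is only that the last step of such a task is ordinary Python.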

You can even share your agent to the Hub, as a Space repository:

agent.push_to_hub("m-ric/my_agent")

# agent.from_hub("m-ric/my_agent") to load an agent from Hub

Our library is LLM-agnostic: you could switch the example above to any inference provider.

InferenceClientModel, gateway for all inference providers supported on HF

from smolagents import InferenceClientModel

model = InferenceClientModel(
    model_id="deepseek-ai/DeepSeek-R1",
    provider="together",
)

LiteLLM to access 100+ LLMs

import os
from smolagents import LiteLLMModel

model = LiteLLMModel(
    model_id="anthropic/claude-4-sonnet-latest",
    temperature=0.2,
    api_key=os.environ["ANTHROPIC_API_KEY"]
)

OpenAI-compatible servers: Together AI

import os
from smolagents import OpenAIModel

model = OpenAIModel(
    model_id="deepseek-ai/DeepSeek-R1",
    api_base="https://api.together.xyz/v1/", # Leave this blank to query OpenAI servers.
    api_key=os.environ["TOGETHER_API_KEY"], # Switch to the API key for the server you're targeting.
)

OpenAI-compatible servers: OpenRouter

import os
from smolagents import OpenAIModel

model = OpenAIModel(
    model_id="openai/gpt-4o",
    api_base="https://openrouter.ai/api/v1", # Leave this blank to query OpenAI servers.
    api_key=os.environ["OPENROUTER_API_KEY"], # Switch to the API key for the server you're targeting.
)

Local `transformers` model

from smolagents import TransformersModel

model = TransformersModel(
    model_id="Qwen/Qwen3-Next-80B-A3B-Thinking",
    max_new_tokens=4096,
    device_map="auto"
)

Azure models

import os
from smolagents import AzureOpenAIModel

model = AzureOpenAIModel(
    model_id=os.environ.get("AZURE_OPENAI_MODEL"),
    azure_endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
    api_key=os.environ.get("AZURE_OPENAI_API_KEY"),
    api_version=os.environ.get("OPENAI_API_VERSION"),
)

Amazon Bedrock models

import os
from smolagents import AmazonBedrockModel

model = AmazonBedrockModel(
    model_id=os.environ.get("AMAZON_BEDROCK_MODEL_ID"),
)

CLI

You can run agents from CLI using two commands: smolagent and webagent.

smolagent is a generalist command to run a multi-step CodeAgent that can be equipped with various tools.

# Run with direct prompt and options
smolagent "Plan a trip to Tokyo, Kyoto and Osaka between Mar 28 and Apr 7."  --model-type "InferenceClientModel" --model-id "Qwen/Qwen3-Next-80B-A3B-Thinking" --imports pandas numpy --tools web_search

# Run in interactive mode (launches setup wizard when no prompt provided)
smolagent

Interactive mode guides you through:

Agent type selection (CodeAgent vs ToolCallingAgent)
Tool selection from available toolbox
Model configuration (type, ID, API settings)
Advanced options like additional imports
Task prompt input

Meanwhile webagent is a specific web-browsing agent using helium (read more here).

For instance:

webagent "go to xyz.com/men, get to sale section, click the first clothing item you see. Get the product details, and the price, return them. note that I'm shopping from France" --model-type "LiteLLMModel" --model-id "gpt-5"

How do Code agents work?

Our CodeAgent works mostly like classical ReAct agents - the exception being that the LLM engine writes its actions as Python code snippets.

flowchart TB
    Task[User Task]
    Memory[agent.memory]
    Generate[Generate from agent.model]
    Execute[Execute Code action - Tool calls are written as functions]
    Answer[Return the argument given to 'final_answer']

    Task -->|Add task to agent.memory| Memory

    subgraph ReAct[ReAct loop]
        Memory -->|Memory as chat messages| Generate
        Generate -->|Parse output to extract code action| Execute
        Execute -->|No call to 'final_answer' tool => Store execution logs in memory and keep running| Memory
    end
    
    Execute -->|Call to 'final_answer' tool| Answer

    %% Styling
    classDef default fill:#d4b702,stroke:#8b7701,color:#ffffff
    classDef io fill:#4a5568,stroke:#2d3748,color:#ffffff
    
    class Task,Answer io
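
The loop in the flowchart can be approximated in a few lines of plain Python. This is a toy sketch, not the library's implementation: `run_code_agent`, `scripted_model`, and the `add` tool are hypothetical names, the "model" is a scripted function rather than an LLM, and the snippet is run with a bare `exec`, which offers no sandboxing.

```python
import contextlib
import io
from typing import Callable, Dict, List


def run_code_agent(task: str,
                   model: Callable[[List[str]], str],
                   tools: Dict[str, Callable],
                   max_steps: int = 5):
    """Run a toy ReAct loop in which every action is a Python snippet."""
    memory: List[str] = [f"Task: {task}"]
    result = {}

    def final_answer(value):
        # Calling this "tool" ends the loop with the given value.
        result["answer"] = value

    namespace = {**tools, "final_answer": final_answer}

    for _ in range(max_steps):
        code = model(memory)                      # generate a code action
        stdout = io.StringIO()
        with contextlib.redirect_stdout(stdout):  # capture execution logs
            exec(code, namespace)                 # NOT sandboxed: demo only
        if "answer" in result:
            return result["answer"]
        # No final answer yet: store the logs and keep running.
        memory.append(f"Observation: {stdout.getvalue().strip()}")
    return None


def scripted_model(memory: List[str]) -> str:
    """Hypothetical stand-in for an LLM: call a tool once, then answer."""
    if len(memory) == 1:
        return "print(add(2, 3))"
    last_observation = memory[-1].split(": ", 1)[1]
    return f"final_answer({last_observation} * 2)"


answer = run_code_agent(
    "Double the sum of 2 and 3",
    scripted_model,
    {"add": lambda a, b: a + b},
)
print(answer)
```

The first step prints the tool result, which lands in memory as an observation; the second step reads that observation and calls `final_answer`, which ends the loop.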

Actions are now Python code snippets. Hence, tool calls will be performed as Python function calls. For instance, here is how the agent can perform web search over several websites in one single action:

requests_to_search = ["gulf of mexico america", "greenland denmark", "tariffs"]
for request in requests_to_search:
    print(f"Here are the search results for {request}:", web_search(request))

Writing actions as code snippets is demonstrated to work better than the current industry practice of letting the LLM output a dictionary of the tools it wants to call: uses 30% fewer steps (thus 30% fewer LLM calls) and reaches higher performance on difficult benchmarks. Head to our high-level intro to agents to learn more on that.
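
A toy comparison makes the difference in step count concrete. In the sketch below, `web_search` is a hypothetical stub and each JSON blob stands for one model turn; it does not reproduce the 30% figure and only shows that three dictionary-style calls collapse into one code action.

```python
import json
from typing import List


def web_search(query: str) -> str:
    """Hypothetical stub standing in for a real search tool."""
    return f"results for '{query}'"


TOOLS = {"web_search": web_search}
queries = ["gulf of mexico america", "greenland denmark", "tariffs"]

# Style 1: dictionary-style tool calls, one call per model turn.
json_actions = [
    json.dumps({"tool": "web_search", "arguments": {"query": q}})
    for q in queries
]
json_results: List[str] = []
for raw_action in json_actions:      # each iteration stands for one LLM call
    action = json.loads(raw_action)
    json_results.append(TOOLS[action["tool"]](**action["arguments"]))

# Style 2: a single code action that loops over all queries.
code_action = "results = [web_search(q) for q in queries]"
namespace = {"web_search": web_search, "queries": queries}
exec(code_action, namespace)         # NOT sandboxed: demo only

print(len(json_actions), "JSON steps vs 1 code step")
print(namespace["results"] == json_results)
```

Both styles return the same search results; the code action simply expresses the loop itself instead of asking the model for one call at a time.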

Since code execution can be a serious security concern (arbitrary code execution!), you should run agent code in a sandbox. We support several options:

E2B, Blaxel, Modal — managed cloud sandboxes, simplest to set up
Docker — self-hosted container isolation
Pyodide+Deno WebAssembly — lightweight sandbox for browser or edge environments

The built-in LocalPythonExecutor is not a security sandbox. It applies some restrictions but can be bypassed and must not be used as a security boundary.
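
To see why in-process restrictions make a weak boundary in general, consider a naive approach that is explicitly not how the library's executor works: running a snippet with `exec` and an empty `__builtins__`. The helper `run_restricted` is a hypothetical name used only for this illustration.

```python
def run_restricted(snippet: str) -> dict:
    """Run snippet with no builtins available (a naive, unsafe 'sandbox')."""
    scope = {"__builtins__": {}}
    exec(snippet, scope)
    return scope


# Direct use of a builtin such as open() is blocked...
try:
    run_restricted("open('some_file')")
    blocked = False
except NameError:
    blocked = True

# ...but plain attribute access still reaches every class loaded in the process.
scope = run_restricted("classes = ().__class__.__base__.__subclasses__()")
reachable = len(scope["classes"])

print(blocked, reachable > 0)
```

Restrictions that live inside the same interpreter can be walked around in this way. The sandbox options listed above avoid the problem by isolating the process or the machine itself.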

Alongside CodeAgent, we also provide the standard ToolCallingAgent which writes actions as JSON/text blobs. You can pick whichever style best suits your use case.

How smol is this library?

We strived to keep abstractions to a strict minimum: the main code in agents.py has <1,000 lines of code. Still, we implement several types of agents: CodeAgent writes its actions as Python code snippets, and the more classic ToolCallingAgent leverages built-in tool calling methods. We also have multi-agent hierarchies, import from tool collections, remote code execution, vision models...

By the way, why use a framework at all? Well, because a big part of this stuff is non-trivial. For instance, the code agent has to keep a consistent format for code throughout its system prompt, its parser, the execution. So our framework handles this complexity for you. But of course we still encourage you to hack into the source code and use only the bits that you need, to the exclusion of everything else!
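
The format-consistency problem can be sketched as follows: the system prompt and the parser must agree on how a code action is delimited. The `<code>` delimiters, `SYSTEM_PROMPT`, and `parse_code_action` are hypothetical choices made for this example and are not taken from the library.

```python
import re

# Hypothetical delimiters; the point is that prompt and parser share them.
CODE_OPEN, CODE_CLOSE = "<code>", "</code>"

SYSTEM_PROMPT = (
    "Write your action as Python between "
    f"{CODE_OPEN} and {CODE_CLOSE}."
)

_CODE_PATTERN = re.compile(
    re.escape(CODE_OPEN) + r"(.*?)" + re.escape(CODE_CLOSE), re.DOTALL
)


def parse_code_action(model_output: str) -> str:
    """Extract the first code action, or raise if the format was not followed."""
    match = _CODE_PATTERN.search(model_output)
    if match is None:
        raise ValueError("No code action found in model output")
    return match.group(1).strip()


output = "Thought: add the numbers.\n<code>\ntotal = 1 + 2\n</code>"
code = parse_code_action(output)

scope = {}
exec(code, scope)  # NOT sandboxed: demo only
print(scope["total"])
```

Because both the prompt and the regular expression are built from the same two constants, changing the delimiter in one place keeps generation and parsing in step.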

How strong are open models for agentic workflows?

We've created CodeAgent instances with some leading models, and compared them on this benchmark that gathers questions from a few different benchmarks to propose a varied blend of challenges.

Find the benchmarking code here for more detail on the agentic setup used, and see a comparison of LLM code agents against the vanilla setup (spoiler: code agents work better).

This comparison shows that open-source models can now take on the best closed models!

Security

Security is a critical consideration when working wi

Other tools in the same category

OpenClaw

AutoGPT

opencode

Langflow

Dify

LangChain