phoenix/js/packages/phoenix-cli at main · Arize-ai/phoenix (GitHub)
Phoenix CLI is a command-line interface for your Phoenix projects. Fetch traces, list datasets, export experiment results, and access prompts directly from your terminal, or pipe them into AI coding agents like Claude Code, Cursor, Codex, and Gemini CLI.
You can use Phoenix CLI for:
Immediate Debugging: Fetch the most recent trace of a failed or unexpected run with a single command
Bulk Export: Export large numbers of traces or experiment results to JSON files for offline analysis
Dataset & Experiment Access: List datasets and retrieve full experiment data including runs, evaluations, and trace IDs
Prompt Introspection: View and export prompt templates for analysis, optimization, or use with other tools
Terminal Workflows: Integrate trace and experiment data into your existing tools, piping output to Unix utilities like jq
AI Coding Assistants: Use with Claude Code, Cursor, Windsurf, or other AI-powered tools to analyze traces, experiments, and optimize prompts
Don’t see a use-case covered? @arizeai/phoenix-cli is open-source! Issues and PRs welcome.
# Configure your Phoenix instance
export PHOENIX_HOST=http://localhost:6006
export PHOENIX_PROJECT=my-project
export PHOENIX_API_KEY=your-api-key  # if authentication is enabled

# Fetch the most recent trace
px trace list --limit 1

# Fetch a specific trace by ID
px trace get abc123def456

# Fetch recent LLM spans
px span list --span-kind LLM --limit 10

# Export traces to a directory
px trace list ./my-traces --limit 50
px trace list --limit 10                        # Output to stdout
px trace list ./my-traces --limit 10            # Save to directory
px trace list --last-n-minutes 60 --limit 20    # Filter by time
px trace list --since 2026-01-13T10:00:00Z      # Since timestamp
px trace list --format raw --no-progress | jq   # Pipe to jq
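The `--since` flag takes a UTC timestamp in the form shown above. As a minimal sketch, a small shell helper can compute such a timestamp relative to the current time. `iso_minutes_ago` is a hypothetical name, not part of Phoenix CLI, and the helper assumes either GNU or BSD `date` is available.

```shell
# Hypothetical helper (not part of px): print the UTC timestamp N minutes ago
# in the same format as the --since example (e.g. 2026-01-13T10:00:00Z).
iso_minutes_ago() {
  minutes="$1"
  # GNU date understands -d "... ago"; BSD/macOS date uses -v instead.
  date -u -d "${minutes} minutes ago" +%Y-%m-%dT%H:%M:%SZ 2>/dev/null ||
    date -u -v-"${minutes}"M +%Y-%m-%dT%H:%M:%SZ
}
```

It could then be used as `px trace list --since "$(iso_minutes_ago 90)"`.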
Fetch individual spans from the configured project with comprehensive filtering.
px span list --limit 20                              # Recent spans (table view)
px span list --last-n-minutes 60 --limit 50          # Spans from last hour
px span list --span-kind LLM --limit 10              # Only LLM spans
px span list --status-code ERROR --limit 20          # Only errored spans
px span list --span-kind LLM TOOL --status-code OK   # Combine filters
px span list --name chat_completion --limit 10       # Filter by span name
px span list --trace-id abc123 --format raw | jq     # All spans for a trace
px span list --attribute llm.model_name:gpt-4        # Filter by attribute
px span list --attribute llm.model_name:gpt-4 --attribute user.role:admin  # AND filters
px span list --include-annotations --limit 10        # Include annotation scores
px span list output.json --limit 100                 # Save to JSON file
px span list --format raw --no-progress | jq         # Pipe to jq
Filter by parent span ID (use "null" for root spans)
—
--attribute <filters...>
Filter by attribute key-value pairs. Format: key:value. Repeat to AND multiple filters. Values containing colons are supported (split on first : only). To match a string attribute that looks like a number or boolean, JSON-quote the value (e.g., 'user.id:"12345"'). Requires Phoenix server ≥ 14.9.0.
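The `key:value` rules above can be illustrated with a short shell sketch. It only mimics the documented behavior (split on the first colon, JSON-quote values that should stay strings) and is not the CLI's own parsing code. Both function names are hypothetical.

```shell
# Illustrative only: mimics the documented "split on the first colon" rule.
# These helpers are hypothetical and are not part of px.
split_attr_filter() {
  filter="$1"
  printf '%s\n' "${filter%%:*}"  # key: everything before the first colon
  printf '%s\n' "${filter#*:}"   # value: everything after the first colon
}

# Build a filter whose value is JSON-quoted, so a numeric-looking string such
# as 12345 is matched as a string (see the 'user.id:"12345"' example).
# Assumes the value itself contains no double quotes or backslashes.
quoted_attr_filter() {
  printf '%s:"%s"\n' "$1" "$2"
}
```

For example, `px span list --attribute "$(quoted_attr_filter user.id 12345)"` passes `user.id:"12345"` as a single argument without any manual quote escaping.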
List sessions (multi-turn conversations) for a project.
px session list                                   # List recent sessions
px session list --limit 20                        # More sessions
px session list --order asc                       # Oldest first
px session list --format raw --no-progress | jq   # Pipe to jq
View a session’s conversation flow, including all traces (turns) in the session.
px session get my-chat-session-001                         # By session_id
px session get UHJvamVjdFNlc3Npb24...                      # By GlobalID
px session get my-chat-session-001 --include-annotations   # With annotations
px session get my-chat-session-001 --file session.json     # Save to file
px session get my-chat-session-001 --format raw | jq       # Pipe to jq
px dataset get query_response                               # Fetch all examples
px dataset get query_response --split train                 # Filter by split
px dataset get query_response --split train --split test    # Multiple splits
px dataset get query_response --version <version-id>        # Specific version
px dataset get query_response --file dataset.json           # Save to file
px dataset get query_response --format raw | jq '.examples[].input'
Fetch a single experiment with all run data, including inputs, outputs, evaluations, and trace IDs.
px experiment get RXhwZXJpbWVudDox
px experiment get RXhwZXJpbWVudDox --file exp.json   # Save to file
px experiment get RXhwZXJpbWVudDox --format json     # JSON output
Show a Phoenix prompt.
Supports multiple output formats including a text format optimized for piping to AI coding assistants.
px prompt get my-assistant-prompt                    # Latest version (pretty)
px prompt get my-assistant-prompt --tag production   # Get by tag
px prompt get my-assistant-prompt --version abc123   # Specific version
px prompt get my-assistant-prompt --format json      # JSON output
px prompt get my-assistant-prompt --format text      # Plain text for piping
| Option | Description | Default |
| --- | --- | --- |
| --tag <name> | Get prompt version by tag name | none |
| --version <id> | Get specific prompt version by ID | latest |
| --format <format> | pretty, json, raw, or text | pretty |
| --endpoint <url> | Phoenix API endpoint | From env |
| --api-key <key> | Phoenix API key | From env |
| --no-progress | Disable progress indicators | none |
The text format outputs prompt content with XML-style role tags, ideal for piping to AI assistants:
<system>You are a helpful assistant specialized in...</system>
<user>{{user_input}}</user>
Make authenticated GraphQL queries against the Phoenix API. Output is JSON of the form {"data": {...}}; pipe it through jq '.data.<field>' to extract values. Only queries are permitted; mutations and subscriptions are rejected before they reach the server.
px api graphql '<query>' [--endpoint <url>] [--api-key <key>]
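To make the query-only restriction concrete, here is a rough shell sketch of a check that a script could run before calling `px api graphql`. It is a naive keyword test under stated assumptions, not a GraphQL parser and not how the CLI itself validates operations. `looks_like_query` is a hypothetical name.

```shell
# Hypothetical pre-check (not the CLI's implementation): succeed only when the
# operation looks like a query. This is a naive keyword test, not a GraphQL parser.
looks_like_query() {
  # Trim leading whitespace (including newlines) using parameter expansion.
  trimmed="${1#"${1%%[![:space:]]*}"}"
  case "$trimmed" in
    query*|\{*) return 0 ;;   # "query ..." or the shorthand "{ ... }" form
    *)          return 1 ;;   # mutation, subscription, or anything unrecognized
  esac
}
```

A script might guard its calls with `looks_like_query "$q" && px api graphql "$q"`, which fails fast locally without invoking the CLI at all.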
# List all datasets
px dataset list --format raw --no-progress | jq '.[].name'
# Output: "query_response"

# List experiments for a dataset
px experiment list --dataset query_response --format raw --no-progress | \
  jq '.[] | {id, successful_run_count, failed_run_count}'
# Output: {"id":"RXhwZXJpbWVudDox","successful_run_count":249,"failed_run_count":1}

# Export all experiment data for a dataset to a directory
px experiment list --dataset query_response ./experiments/
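The directory export above can be repeated across several datasets. The following is a minimal sketch: `export_experiments` is a hypothetical wrapper, and the one-subdirectory-per-dataset layout is an assumption of this example rather than something the CLI prescribes. It uses only the `px experiment list --dataset <name> <dir>` form shown above.

```shell
# Hypothetical wrapper: export experiments for several datasets, one
# subdirectory per dataset. The directory layout is an assumption of this sketch.
export_experiments() {
  out_root="$1"
  shift
  for dataset in "$@"; do
    # Create the target directory up front rather than relying on px to do so.
    mkdir -p "$out_root/$dataset" || return 1
    px experiment list --dataset "$dataset" "$out_root/$dataset/" || return 1
  done
}
```

For example, `export_experiments ./experiments query_response` writes into `./experiments/query_response/`, and further dataset names can be appended as extra arguments.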
# List all prompts
px prompt list --format raw --no-progress | jq '.[].name'

# Get prompt template content
px prompt get my-evaluator --format text --no-progress

# View prompt with all metadata
px prompt get my-evaluator --format json --no-progress | jq '.template'

# Get a specific tagged version
px prompt get my-evaluator --tag production --format text --no-progress
Pipe your Phoenix prompts directly to Claude Code for analysis and optimization suggestions:
# Get prompt optimization ideas
px prompt get my-evaluator --format text --no-progress | claude -p "Review this prompt and suggest improvements for clarity and effectiveness"

# Analyze prompt for edge cases
px prompt get my-assistant --format text --no-progress | claude -p "What edge cases might this prompt fail to handle?"

# Generate test cases for a prompt
px prompt get my-classifier --format text --no-progress | claude -p "Generate 5 diverse test inputs to evaluate this prompt"
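When the same question applies to several prompts, the pipe above can be wrapped in a loop. This is only a sketch: `review_prompts` is a hypothetical helper, and it relies solely on the `px prompt get ... --format text --no-progress` and `claude -p` invocations shown above.

```shell
# Hypothetical helper: send the same review question to Claude Code for
# several prompts, printing a header line before each response.
review_prompts() {
  question="$1"
  shift
  for name in "$@"; do
    printf '== %s ==\n' "$name"
    px prompt get "$name" --format text --no-progress | claude -p "$question"
  done
}
```

For example: `review_prompts "What edge cases might this prompt fail to handle?" my-evaluator my-assistant`.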
You can also ask Claude Code to work with your prompts interactively:
Fetch my "correctness-evaluator" prompt from Phoenix and suggest how to make the rubric more specific