MetaVision Lite - API Documentation

Introduction

MetaVision is a high-performance multimodal AI analysis engine.

Unlike traditional computer vision APIs that return static labels, MetaVision lets you define flexible Analysis Categories or use Custom Prompts to extract structured intelligence from media (images, videos, and audio).


Authentication

All API requests require authentication using a Bearer token. Include your API key in the Authorization header of every request:

Authorization: Bearer YOUR_API_KEY
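
As a minimal sketch, the header can be built once and reused for every request. The helper name `build_headers` is hypothetical (not part of the API), and the `Content-Type` value follows the JSON requests shown in the Code Examples section.

```python
def build_headers(api_key: str) -> dict[str, str]:
    """Return the HTTP headers for a MetaVision request (hypothetical helper)."""
    if not api_key or not api_key.strip():
        raise ValueError("api_key must be a non-empty string")
    return {
        # Bearer scheme, as required by the Authentication section.
        "Authorization": f"Bearer {api_key.strip()}",
        "Content-Type": "application/json",
    }


headers = build_headers("YOUR_API_KEY")
```

Failing early on an empty key avoids a round trip that would otherwise end in a 401 `invalid_api_key` response.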

Core Analysis

The primary endpoint handles image, video, and audio analysis.

POST /api/v1/analyze

Request Structure

Send a JSON body with the media content and configuration.

Required Parameters

Parameter	Type	Description
`media`	string	The content to analyze. Must be plain text, a valid HTTP(S) URL, or a Base64-encoded string (text, image, video, or audio/mp3).
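
For local files, the Base64 option could be prepared roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, the 10MB figure comes from the `payload_too_large` error described under Error Handling, and the documentation does not say whether that limit applies to the raw or the encoded size, or whether a data URI prefix is expected. The sketch checks the raw size and sends no prefix.

```python
import base64

# Assumption: 10MB is treated as 10 * 1024 * 1024 bytes of raw media.
MAX_MEDIA_BYTES = 10 * 1024 * 1024


def encode_media(raw: bytes) -> str:
    """Base64-encode raw media bytes for the `media` field."""
    if len(raw) > MAX_MEDIA_BYTES:
        raise ValueError(f"media is {len(raw)} bytes; limit is {MAX_MEDIA_BYTES}")
    return base64.b64encode(raw).decode("ascii")


def encode_media_file(path: str) -> str:
    """Read a local file and return its Base64 encoding."""
    with open(path, "rb") as fh:
        return encode_media(fh.read())


encoded = encode_media(b"hello")  # "aGVsbG8="
```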

Optional Parameters

Parameter	Type	Default	Description
`categories`	array[string]	(All)	List of Category IDs to execute (e.g., `["title", "objects"]`).
`custom_prompt`	string	null	A custom analysis instruction (maximum 2,000 characters). The result appears under the key `custom_analysis`.
`model`	string	null	Force the use of a specific Agent. Must be a model UUID from the Available Models section.
`agent_strategy`	string	"default"	Used when no `model` is specified: `default` (uses the recommended default model), `random` (load balancing), or `best` (highest reliability score). If the default model does not support the requested media type, the engine automatically falls back to the highest-scoring capable agent.
`detail`	string	"high"	Analysis depth: `low` (faster, lower token usage) or `high` (more detailed analysis).
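
The parameter rules above can be checked on the client before a request is sent. The following is a sketch only: `build_payload` is a hypothetical helper, and the server remains the authority on what it accepts.

```python
from typing import Optional

ALLOWED_STRATEGIES = {"default", "random", "best"}
ALLOWED_DETAIL = {"low", "high"}
MAX_PROMPT_CHARS = 2000  # documented limit for custom_prompt


def build_payload(
    media: str,
    categories: Optional[list[str]] = None,
    custom_prompt: Optional[str] = None,
    model: Optional[str] = None,
    agent_strategy: str = "default",
    detail: str = "high",
) -> dict:
    """Validate arguments against the documented rules and build the JSON body."""
    if not media:
        raise ValueError("media is required")
    if agent_strategy not in ALLOWED_STRATEGIES:
        raise ValueError(f"agent_strategy must be one of {sorted(ALLOWED_STRATEGIES)}")
    if detail not in ALLOWED_DETAIL:
        raise ValueError(f"detail must be one of {sorted(ALLOWED_DETAIL)}")
    if custom_prompt is not None and len(custom_prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"custom_prompt exceeds {MAX_PROMPT_CHARS} characters")

    payload: dict = {"media": media, "agent_strategy": agent_strategy, "detail": detail}
    # Optional keys are omitted when not supplied so the server defaults apply.
    if categories is not None:
        payload["categories"] = list(categories)
    if custom_prompt is not None:
        payload["custom_prompt"] = custom_prompt
    if model is not None:
        payload["model"] = model
    return payload


payload = build_payload("https://example.com/image.jpg", categories=["title", "objects"])
```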

Response Structure

The API returns a JSON object with two top-level keys: `meta` (request metadata) and `data` (the analysis results, keyed by category).

{
  "meta": {
    "timestamp": "2024-03-20T10:00:00.000Z",
    "model_id": "8f32e9...",
    "model_name": "general-v2",
    "execution_time": 2.45,
    "successful_categories": 2,
    "failed_categories": 0,
    "total_tokens_in": 1500,
    "total_tokens_out": 300,
    "estimated_cost": 0.0045
  },
  "data": {
    "title": {
      "result": "Sunset over a mountain range"
    },
    "custom_analysis": {
      "result": "The mood is serene and peaceful."
    }
  }
}
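
A response shaped like the example above can be flattened into a simple mapping from key to result. This is a sketch with a hypothetical helper name. The documentation mentions category-level errors but does not show what a failed entry looks like, so entries without a `result` key are simply skipped here.

```python
def extract_results(response: dict) -> dict[str, str]:
    """Map each key under `data` to its `result` value."""
    results: dict[str, str] = {}
    for key, entry in response.get("data", {}).items():
        # Assumption: a successful entry is an object with a "result" key;
        # anything else (e.g. a failed category) is ignored.
        if isinstance(entry, dict) and "result" in entry:
            results[key] = entry["result"]
    return results


sample = {
    "meta": {"successful_categories": 2, "failed_categories": 0},
    "data": {
        "title": {"result": "Sunset over a mountain range"},
        "custom_analysis": {"result": "The mood is serene and peaceful."},
    },
}
results = extract_results(sample)
```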

Meta Object

Field	Type	Description
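`timestamp`	string	ISO 8601 timestamp (UTC) of the response.
`model_id`	string	The UUID of the model that processed the request.
`model_name`	string	The display name of the model.
`execution_time`	float	Total processing time in seconds.
`successful_categories`	integer	Number of categories that completed successfully.
`failed_categories`	integer	Number of categories that failed.
`total_tokens_in`	integer	Total input tokens consumed by the agents.
`total_tokens_out`	integer	Total output tokens generated.
`estimated_cost`	float	Estimated cost of the request in USD.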
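
The meta fields lend themselves to simple usage tracking. The sketch below derives a few figures from the values in the example response. The helper and the derived metric names are hypothetical and not part of the API.

```python
def summarize_meta(meta: dict) -> dict:
    """Derive usage figures from a `meta` object (hypothetical helper)."""
    total_tokens = meta["total_tokens_in"] + meta["total_tokens_out"]
    seconds = meta["execution_time"]
    return {
        "total_tokens": total_tokens,
        # Guard against division by zero for empty or instantaneous responses.
        "output_tokens_per_second": meta["total_tokens_out"] / seconds if seconds > 0 else 0.0,
        "cost_per_1k_tokens_usd": (
            meta["estimated_cost"] / total_tokens * 1000 if total_tokens > 0 else 0.0
        ),
    }


sample_meta = {
    "execution_time": 2.45,
    "total_tokens_in": 1500,
    "total_tokens_out": 300,
    "estimated_cost": 0.0045,
}
summary = summarize_meta(sample_meta)
```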

Analysis Categories

Categories define specific tasks for the engine. You can select specific categories in your request to tailor the analysis output.


Available Models

To force a specific agent, pass its UUID in the `model` parameter. If no model is forced, requests use the `default` strategy, which selects the recommended default model automatically.


Advanced Options

Agent Selection

By default (`agent_strategy: "default"`), MetaVision routes requests to a recommended, well-balanced default model. You can override this behavior with `random` (load balancing across all capable agents), with `best` (always selects the agent with the highest reliability score), or by forcing a specific model UUID.
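
To make the routing rules concrete, here is an illustrative model of them in Python. It is not MetaVision's actual implementation: the `Agent` class, its fields, and `select_agent` are all hypothetical, and the sketch assumes a forced model is used as-is without a capability check, which the documentation does not specify.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Agent:
    uuid: str
    reliability: float
    media_types: frozenset[str]
    is_default: bool = False


def select_agent(
    agents: list[Agent],
    media_type: str,
    strategy: str = "default",
    model: Optional[str] = None,
    rng: Optional[random.Random] = None,
) -> Agent:
    """Pick an agent following the documented strategies (illustrative only)."""
    if model is not None:
        # A forced model UUID overrides any strategy.
        for agent in agents:
            if agent.uuid == model:
                return agent
        raise ValueError(f"unknown model UUID: {model}")

    capable = [a for a in agents if media_type in a.media_types]
    if not capable:
        raise ValueError(f"no agent supports media type {media_type!r}")

    if strategy == "random":
        return (rng or random).choice(capable)
    if strategy == "default":
        for agent in capable:
            if agent.is_default:
                return agent
        # Default model cannot handle this media type: fall through to best.
    elif strategy != "best":
        raise ValueError(f"unknown strategy: {strategy}")
    return max(capable, key=lambda a: a.reliability)


agents = [
    Agent("a-default", 0.90, frozenset({"image"}), is_default=True),
    Agent("b", 0.95, frozenset({"image", "audio"})),
    Agent("c", 0.80, frozenset({"image", "audio", "video"})),
]
```

With these sample agents, an image request under `default` goes to `a-default`, while an audio request under `default` falls back to `b`, the highest-scoring agent that supports audio.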

Agent Consistency

MetaVision ensures consistency by using a single Agent for all categories within a single request. This prevents conflicting interpretations of the media across different analysis tasks.

Audio Analysis

MetaVision supports audio files (MP3, WAV, OGG). The engine analyzes speech, tone, and background audio. Maximum file size: 10 MB.

Video Analysis

MetaVision supports video files (MP4, MOV, WebM). The engine samples frames from the video to perform the analysis.

Performance Note: Video and audio analysis are significantly more computationally expensive than image analysis. Execution times may range from 15 to 45 seconds.
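
Because of this, a client may want a longer timeout for audio and video than for images. The sketch below guesses the media kind from the URL's file extension, using the formats listed above. The helper names and the 30s and 90s timeout values are assumptions chosen for illustration, not documented recommendations.

```python
import posixpath
from urllib.parse import urlparse

AUDIO_EXTENSIONS = {".mp3", ".wav", ".ogg"}
VIDEO_EXTENSIONS = {".mp4", ".mov", ".webm"}


def guess_media_kind(url: str) -> str:
    """Classify a media URL as "audio", "video", or "other" by extension."""
    extension = posixpath.splitext(urlparse(url).path)[1].lower()
    if extension in AUDIO_EXTENSIONS:
        return "audio"
    if extension in VIDEO_EXTENSIONS:
        return "video"
    return "other"


def pick_timeout(url: str, fast: float = 30.0, slow: float = 90.0) -> float:
    """Return a client timeout in seconds (values are illustrative)."""
    # The slow value leaves headroom above the documented 45s upper range.
    return slow if guess_media_kind(url) in {"audio", "video"} else fast


timeout = pick_timeout("https://example.com/audio.mp3")
```

The returned value could be passed as the `timeout` argument of `requests.post`.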

Error Handling

Errors can occur at the request level (4xx/5xx status codes) or at the individual category level.

API Errors

Status	Code	Description
400	`invalid_request`	Malformed JSON, missing 'media', or invalid 'model' UUID.
400	`invalid_categories`	Requested categories do not exist or are disabled.
401	`invalid_api_key`	Missing or incorrect Authorization header.
403	`account_disabled`	The user account has been disabled by an admin.
413	`payload_too_large`	Media exceeds the maximum allowed size (10MB).
500	`internal_error`	Unexpected server or upstream provider error.
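
On the client side, the table above can be turned into a small exception type. This is a sketch under assumptions: the class name is hypothetical, the documentation does not show how the error code is carried in the response body (so the status and code are passed in explicitly), and treating only 5xx responses as retryable is a common convention rather than documented behavior.

```python
# Hints keyed by the error codes in the API Errors table.
ERROR_HINTS = {
    "invalid_request": "Check the JSON body, the 'media' field, and the 'model' UUID.",
    "invalid_categories": "Remove category IDs that do not exist or are disabled.",
    "invalid_api_key": "Check the Authorization header and the API key.",
    "account_disabled": "The account has been disabled by an admin.",
    "payload_too_large": "Reduce the media size below the 10MB limit.",
    "internal_error": "Unexpected server or upstream provider error.",
}


class MetaVisionError(Exception):
    """Client-side wrapper for a failed request (hypothetical)."""

    def __init__(self, status: int, code: str) -> None:
        self.status = status
        self.code = code
        self.hint = ERROR_HINTS.get(code, "Unrecognized error code.")
        # Assumption: only server-side failures are worth retrying unchanged.
        self.retryable = status >= 500
        super().__init__(f"{status} {code}: {self.hint}")


error = MetaVisionError(413, "payload_too_large")
```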

Code Examples

Python

import requests

API_KEY = "your_api_key"
API_URL = "https://metavision.api.efficientstack.com/api/v1/analyze"

payload = {
    "media": "https://example.com/image.jpg",
    "categories": ["title", "description"],
    "custom_prompt": "Describe the emotion.",
    # "model": "8f32e9...",  # Optional UUID override
    "agent_strategy": "default",
}

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

# Set a timeout so the call cannot hang; video and audio may take 15s to 45s.
response = requests.post(API_URL, json=payload, headers=headers, timeout=60)
response.raise_for_status()  # Raise an exception on 4xx/5xx responses
print(response.json())

JavaScript

const API_KEY = 'your_api_key';
const API_URL = 'https://metavision.api.efficientstack.com/api/v1/analyze';

async function analyzeMedia() {
  const response = await fetch(API_URL, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      media: 'https://example.com/audio.mp3', // Audio is supported
      categories: ['title'],
      custom_prompt: 'Transcribe the speech and describe tone.',
      agent_strategy: 'default' // Uses the recommended default model
    })
  });

  // fetch() does not reject on HTTP error statuses, so check explicitly.
  if (!response.ok) {
    throw new Error(`MetaVision request failed with status ${response.status}`);
  }

  const result = await response.json();
  console.log(result);
}

analyzeMedia().catch(console.error);

cURL

curl -X POST "https://metavision.api.efficientstack.com/api/v1/analyze" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "media": "https://example.com/image.jpg",
    "custom_prompt": "List all colors present.",
    "agent_strategy": "default"
  }'