inference:estimate
Estimates the token count and cost for a prompt without running the model
Options
| Name | Type | Description |
|---|---|---|
| prompt | string | The prompt string or template file path |
| prompt_vars | map[string]any | Variables injected into the prompt template |
| expect | string | The expected output format (e.g. "text") |
| expect_type | string | The expected output Go type (e.g. "string") |
| provider | string | The inference provider (e.g. "ollama", "openai") |
| model | string | The model identifier (e.g. "mistral:latest", "gpt-4") |
| temperature | float32 | The model temperature setting |
| top_p | float32 | The model top_p setting |
| max_tokens | int | The maximum number of output tokens to request |
| currency | string | The currency code for cost conversion (e.g. "EUR", "USD") |
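As a rough illustration of how these options fit together, the sketch below shows a hypothetical YAML task configuration. The option names come from the table above; the surrounding structure (the `task`/`input` layout, the task name `estimate_cost`, and the `{{ .text }}` template syntax) is an assumption, not confirmed by this page.

```yaml
# Hypothetical configuration sketch — only the option names under
# `input` are documented above; everything else is illustrative.
- name: estimate_cost
  task: inference:estimate
  input:
    prompt: "Summarise the following text: {{ .text }}"
    prompt_vars:
      text: "An example document to be summarised."
    provider: "ollama"
    model: "mistral:latest"
    temperature: 0.7
    top_p: 0.9
    max_tokens: 256
    currency: "EUR"
```

Because the model is never invoked, a call like this returns only the rendered prompt and the estimated token count and cost, as listed under Outputs.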
Outputs
| Name | Type | Description |
|---|---|---|
| prompt | string | The fully rendered prompt string |
| model | string | The model used for the estimate |
| provider | string | The inference provider used |
| tokens_input | int | Estimated number of input tokens |
| cost_input_usd | string | Estimated input cost in USD |
| cost_input_converted | string | Estimated input cost in the requested currency |
Inference
Interact with AI language models.
inference
Makes LLM inference requests using the configured provider and model. Supports tools, model parameter settings, and cost calculation with conversion to the user's currency. The `prompt` field accepts either a literal prompt string or a template file path; template prompts are rendered dynamically with variables.
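A minimal sketch of an `inference` call using a template file, in the same hypothetical YAML layout as above. The file path `./prompts/summarise.tmpl` and the variable name `topic` are illustrative placeholders; the exact configuration syntax is not specified on this page.

```yaml
# Hypothetical configuration sketch — the task name and the
# prompt/prompt_vars/provider/model fields follow this page;
# the structure and values are assumptions.
- name: summarise
  task: inference
  input:
    prompt: "./prompts/summarise.tmpl"   # template file path instead of a literal string
    prompt_vars:
      topic: "release notes"
    provider: "openai"
    model: "gpt-4"
    max_tokens: 512
```

Unlike `inference:estimate`, this actually runs the model, so the request incurs the estimated cost.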