Operations
This document describes runtime logs, traffic logs, metrics, observability, and admin/debug endpoints.
System endpoints
Always mounted:
GET /health
GET | POST /ping
DEV only:
GET /logs/{last_n_lines}
GET /logs/stream
GET /logs/html
The /logs* endpoints are disabled in PROD.
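As a rough illustration of the mounting rule above, the following sketch lists which system endpoints would be reachable in each mode. The helper name system_endpoints and its mode argument are hypothetical; this page does not say how the gateway itself selects the mode, and the sketch treats every non-DEV mode like PROD.

```python
# Hypothetical helper: which system endpoints are mounted in a given mode.
ALWAYS = ["GET /health", "GET /ping", "POST /ping"]
DEV_ONLY = ["GET /logs/{last_n_lines}", "GET /logs/stream", "GET /logs/html"]


def system_endpoints(mode: str) -> list:
    """Return the endpoints described above for mode 'DEV' or 'PROD'."""
    if mode.upper() == "DEV":
        return ALWAYS + DEV_ONLY
    # Assumption: anything that is not DEV behaves like PROD.
    return list(ALWAYS)


if __name__ == "__main__":
    for m in ("DEV", "PROD"):
        print(m, system_endpoints(m))
```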
Runtime logs
Runtime logs are process logs. They are written to stdout and the configured log file.
Common settings:
GPT2GIGA_LOG_LEVEL=INFO
GPT2GIGA_LOG_FILENAME=gpt2giga.log
GPT2GIGA_LOG_MAX_SIZE=10485760
Do not use DEBUG in production. Debug logs may contain sensitive operational context even with redaction enabled.
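A minimal sketch of reading these settings and flagging the discouraged DEBUG-in-production combination. The function names are hypothetical, and the fallback values simply reuse the examples shown above; they are not confirmed gateway defaults. Note that 10485760 bytes is 10 MiB.

```python
import os


def read_log_settings(env=None):
    """Parse the three log settings; fallbacks mirror the examples above."""
    env = os.environ if env is None else env
    return {
        "level": env.get("GPT2GIGA_LOG_LEVEL", "INFO").upper(),
        "filename": env.get("GPT2GIGA_LOG_FILENAME", "gpt2giga.log"),
        # Size in bytes: 10485760 == 10 * 1024 * 1024
        "max_size": int(env.get("GPT2GIGA_LOG_MAX_SIZE", "10485760")),
    }


def debug_in_prod(settings, mode):
    """True when DEBUG logging is combined with PROD mode."""
    return settings["level"] == "DEBUG" and mode.upper() == "PROD"


if __name__ == "__main__":
    settings = read_log_settings()
    print(settings, "debug in prod:", debug_in_prod(settings, "PROD"))
```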
Metrics
Prometheus-compatible metrics are disabled by default:
GPT2GIGA_METRICS_ENABLED=False
GPT2GIGA_METRICS_PATH=/metrics
Local enablement:
GPT2GIGA_METRICS_ENABLED=True
GPT2GIGA_METRICS_PATH=/metrics
If API-key authentication is enabled, requests to the metrics endpoint must carry the key, either as Authorization: Bearer <GPT2GIGA_API_KEY> or in the x-api-key header.
Metrics do not contain prompt/response content, API keys, request ids, trace ids, or raw payloads. Labels are limited to a bounded set of operational fields: protocol, route, method, status, lifecycle, provider, model.
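A sketch of how a scraper might build an authenticated request for the metrics endpoint. The helper name build_metrics_request and the base URL http://localhost:8090 are placeholders, not values taken from this page. The request is only constructed here, not sent.

```python
import urllib.request


def build_metrics_request(base_url, api_key=None, path="/metrics"):
    """Build a GET request for the metrics path, adding Bearer auth if given."""
    headers = {}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(base_url.rstrip("/") + path, headers=headers)


if __name__ == "__main__":
    # Placeholder host, port, and key; adjust to the actual deployment.
    req = build_metrics_request("http://localhost:8090", api_key="example-key")
    print(req.get_method(), req.full_url, req.header_items())
```

Passing the request to urllib.request.urlopen would perform the scrape against a running gateway.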
Baseline series:
gpt2giga_requests_total
gpt2giga_request_duration_seconds
gpt2giga_upstream_duration_seconds
gpt2giga_upstream_errors_total
gpt2giga_tokens_input_total
gpt2giga_tokens_output_total
gpt2giga_stream_disconnects_total
gpt2giga_traffic_log_dropped_total
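For a quick local check of these series, the scraped text can be summed per metric name. This is a minimal sketch for the Prometheus text exposition format, not a full parser; the function name sum_metric and the label values in the sample are made up for illustration. Duration series, if exported as histograms, would appear under suffixed names such as _sum and _count, which this exact-match helper does not aggregate.

```python
def sum_metric(exposition: str, name: str) -> float:
    """Sum all samples of `name` in Prometheus text exposition output."""
    total = 0.0
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # A sample is either `name{labels} value` or `name value`.
        metric, _, rest = line.partition("{")
        if rest:
            value = rest.rpartition("}")[2].split()[0]
        else:
            metric, value = line.split()[:2]
        if metric.strip() == name:
            total += float(value)
    return total


if __name__ == "__main__":
    # Illustrative sample; label values are invented.
    sample = (
        "# TYPE gpt2giga_requests_total counter\n"
        'gpt2giga_requests_total{route="/a",status="200"} 3\n'
        'gpt2giga_requests_total{route="/b",status="500"} 2\n'
    )
    print(sum_metric(sample, "gpt2giga_requests_total"))
```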
Traffic logs
Traffic logs are structured records of request/response traffic. They are disabled by default:
GPT2GIGA_TRAFFIC_LOG_ENABLED=False
GPT2GIGA_TRAFFIC_LOG_SINK=noop
Local JSONL check:
GPT2GIGA_TRAFFIC_LOG_ENABLED=True
GPT2GIGA_TRAFFIC_LOG_SINK=jsonl
GPT2GIGA_TRAFFIC_LOG_JSONL_PATH=traffic_logs.jsonl
Durable backend in Postgres:
GPT2GIGA_TRAFFIC_LOG_ENABLED=True
GPT2GIGA_TRAFFIC_LOG_SINK=postgres
GPT2GIGA_TRAFFIC_LOG_POSTGRES_DSN=postgresql://user:password@localhost:5432/gpt2giga
Postgres plus an OpenSearch mirror:
GPT2GIGA_TRAFFIC_LOG_ENABLED=True
GPT2GIGA_TRAFFIC_LOG_SINKS=postgres,opensearch
GPT2GIGA_TRAFFIC_LOG_POSTGRES_DSN=postgresql://user:password@localhost:5432/gpt2giga
GPT2GIGA_OPENSEARCH_URL=http://localhost:9200
Content capture is opt-in and goes through redaction:
GPT2GIGA_TRAFFIC_LOG_CAPTURE_CONTENT=False
GPT2GIGA_TRAFFIC_LOG_REDACT_SENSITIVE=True
Keep content capture disabled until storage, retention, redaction, and access policies are approved.
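To inspect the output of the local JSONL check, a small reader like the one below may help. It assumes only that the sink writes one JSON object per line; the record fields are not described on this page, so the grouping key is left to the caller. Both function names are hypothetical.

```python
import json
from collections import Counter


def read_traffic_log(path):
    """Yield one dict per non-empty line of a JSONL traffic log."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)


def count_by(path, key):
    """Count records by the value of `key` (None when the key is absent)."""
    return Counter(record.get(key) for record in read_traffic_log(path))


if __name__ == "__main__":
    import os

    # Path taken from the GPT2GIGA_TRAFFIC_LOG_JSONL_PATH example above.
    if os.path.exists("traffic_logs.jsonl"):
        print(sum(1 for _ in read_traffic_log("traffic_logs.jsonl")), "records")
```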
Admin Traffic Logs API
Admin endpoints are disabled by default:
GPT2GIGA_ADMIN_API_ENABLED=False
Enablement with a separate admin key:
GPT2GIGA_ADMIN_API_ENABLED=True
GPT2GIGA_ADMIN_API_KEY="<strong-admin-secret>"
GPT2GIGA_TRAFFIC_LOG_SINK=postgres
GPT2GIGA_TRAFFIC_LOG_POSTGRES_DSN=postgresql://user:password@localhost:5432/gpt2giga
Authorization header options:
x-admin-api-key: <secret>
Authorization: Bearer <secret>
Endpoints:
GET /_admin/logs
GET /_admin/logs/{id}
GET /_admin/logs/{id}/request
GET /_admin/logs/{id}/response
GET /_admin/logs/tail
GET /_admin/logs/export.ndjson
GET /_admin/logs/export.csv
POST /_admin/logs/retention/purge
POST /_admin/logs/{id}/replay
POST /_admin/logs/{id}/redact
Replay requires:
GPT2GIGA_REPLAY_ENABLED=True
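A sketch of building an authenticated admin request with either header option. The helper name build_admin_request and the base URL are placeholders; query parameters and request bodies for these endpoints are not described here, so none are shown. The request is constructed but not sent.

```python
import urllib.request


def build_admin_request(base_url, path, admin_key, method="GET", use_bearer=False):
    """Build a request for an admin path using one of the two header options."""
    if use_bearer:
        headers = {"Authorization": f"Bearer {admin_key}"}
    else:
        headers = {"x-admin-api-key": admin_key}
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, headers=headers, method=method)


if __name__ == "__main__":
    # Placeholder host, port, and secret.
    req = build_admin_request("http://localhost:8090", "/_admin/logs/tail", "secret")
    print(req.get_method(), req.full_url)
```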
Debug Translate API
The debug translation endpoints are intended for local debugging and protected admin workflows:
GPT2GIGA_DEBUG_TRANSLATE_ENABLED=True
GPT2GIGA_ADMIN_API_KEY="<strong-admin-secret>"
Main endpoint:
POST /_debug/translate
Short endpoints:
POST /_debug/translate/openai-to-normalized
POST /_debug/translate/anthropic-to-normalized
POST /_debug/translate/normalized-to-gigachat
POST /_debug/translate/gigachat-to-openai
Supported payload families: openai, anthropic, normalized, gigachat, depending on the direction.
How these formats pass through the internal normalized contract is described in Normalized messages architecture.
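The relationship between payload families and the short endpoints can be sketched as a lookup. The helper name translate_path is hypothetical, and falling back to the main endpoint for other directions is an assumption; the request body expected by POST /_debug/translate is not described on this page.

```python
# Directions that have a dedicated short endpoint, per the list above.
SHORT_DIRECTIONS = {
    ("openai", "normalized"),
    ("anthropic", "normalized"),
    ("normalized", "gigachat"),
    ("gigachat", "openai"),
}


def translate_path(source: str, target: str) -> str:
    """Return the short endpoint path for a direction, else the main endpoint."""
    if (source, target) in SHORT_DIRECTIONS:
        return f"/_debug/translate/{source}-to-{target}"
    # Assumption: other directions go through the main endpoint.
    return "/_debug/translate"


if __name__ == "__main__":
    print(translate_path("openai", "normalized"))
    print(translate_path("anthropic", "gigachat"))
```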
Phoenix / OpenTelemetry
Phoenix observability is disabled by default:
GPT2GIGA_OBSERVABILITY_ENABLED=False
GPT2GIGA_OBSERVABILITY_BACKEND=phoenix
PHOENIX_COLLECTOR_ENDPOINT=http://localhost:4317
PHOENIX_PROJECT_NAME=gpt2giga
Enable it after installing the optional phoenix extra or via the Phoenix Compose profile.
LLM payload attributes require a double opt-in, the general content flag plus the flag for the specific payload kind:
GPT2GIGA_OBSERVABILITY_CAPTURE_CONTENT=True
GPT2GIGA_OBSERVABILITY_CAPTURE_MESSAGES=True
GPT2GIGA_OBSERVABILITY_CAPTURE_TOOL_ARGS=False
GPT2GIGA_OBSERVABILITY_CAPTURE_RESPONSES=True
GPT2GIGA_OBSERVABILITY_REDACTION_ENABLED=True
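A minimal sketch of the double opt-in as a predicate: a payload kind is captured only when both the general content flag and the kind-specific flag are on. The function names and the boolean parsing (case-insensitive "true") are assumptions; the gateway's own parsing rules are not described here.

```python
PREFIX = "GPT2GIGA_OBSERVABILITY_CAPTURE_"


def _flag(env, name):
    """Assumed parsing: only a case-insensitive 'true' enables a flag."""
    return str(env.get(name, "False")).strip().lower() == "true"


def capture_enabled(env, kind):
    """kind is one of 'MESSAGES', 'TOOL_ARGS', 'RESPONSES'."""
    return _flag(env, PREFIX + "CONTENT") and _flag(env, PREFIX + kind)


if __name__ == "__main__":
    # Values copied from the settings listed above.
    env = {
        PREFIX + "CONTENT": "True",
        PREFIX + "MESSAGES": "True",
        PREFIX + "TOOL_ARGS": "False",
        PREFIX + "RESPONSES": "True",
    }
    for kind in ("MESSAGES", "TOOL_ARGS", "RESPONSES"):
        print(kind, capture_enabled(env, kind))
```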
Traffic logs and Phoenix spans are linked through gateway identifiers:
request_id, trace_id, protocol, route, and model metadata. For LLM routes,
Phoenix gets one root span per format: OpenAI-Completions for Chat
Completions, OpenAI-Responses for the Responses API, Anthropic-Messages for
Anthropic Messages, Gemini-Content for Gemini GenerateContent, and Embeddings
for OpenAI Embeddings. Streaming milestones are attached to the corresponding
root span as span events. For non-LLM routes, a single lifecycle span
gpt2giga.request is used.
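The span naming described above can be pictured as a lookup with a fallback. In this sketch the keys reuse the gpt2giga.api_format values mentioned in this section; pairing each key with a span name is inferred from the order in which the formats are listed, and the helper name root_span_name is hypothetical.

```python
# Root span name per API format; pairing inferred from this section.
ROOT_SPAN_NAMES = {
    "chat_completions": "OpenAI-Completions",
    "responses": "OpenAI-Responses",
    "messages": "Anthropic-Messages",
    "generate_content": "Gemini-Content",
    "embeddings": "Embeddings",
}


def root_span_name(api_format=None):
    """Return the LLM root span name, or the lifecycle span for other routes."""
    return ROOT_SPAN_NAMES.get(api_format, "gpt2giga.request")


if __name__ == "__main__":
    print(root_span_name("responses"))
    print(root_span_name(None))
```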
For filtering and grouping by the compatible API format, model spans get the
attribute gpt2giga.api_format: chat_completions, responses, messages,
generate_content, or embeddings. Stateful Responses additionally get
session.id and conversation.id from the GigaChat thread_id; if the upstream
thread id is not yet available, previous_response_id is used without the
resp_ prefix.
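The fallback rule for stateful Responses can be sketched as follows. The function name session_attributes is hypothetical; the sketch assumes both attributes receive the same value and that nothing is set when neither identifier is available. It needs Python 3.9+ for str.removeprefix.

```python
def session_attributes(thread_id=None, previous_response_id=None):
    """Derive session.id / conversation.id as described above."""
    if thread_id:
        value = thread_id
    elif previous_response_id:
        # Fall back to previous_response_id without its resp_ prefix.
        value = previous_response_id.removeprefix("resp_")
    else:
        return {}  # assumption: no attributes without any identifier
    return {"session.id": value, "conversation.id": value}


if __name__ == "__main__":
    print(session_attributes(thread_id="thread-1", previous_response_id="resp_abc"))
    print(session_attributes(previous_response_id="resp_abc"))
```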
The OpenTelemetry span start time is taken from the gateway
RequestContext.started_at, so Phoenix Latency reflects the full request/stream
duration rather than only the moment when the span is emitted at the end. The
same duration is also written to the latency_ms attribute.
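A small sketch of the timing arithmetic, assuming started_at is a Unix timestamp in seconds (the actual type of RequestContext.started_at is not stated here). Both helper names are hypothetical.

```python
import time


def to_otel_start_time(started_at: float) -> int:
    """Convert epoch seconds to integer nanoseconds.

    OpenTelemetry's Python API takes span start times as integer
    nanoseconds since the epoch.
    """
    return round(started_at * 1_000_000_000)


def latency_ms(started_at: float, now=None) -> float:
    """Elapsed time since the request started, in milliseconds."""
    now = time.time() if now is None else now
    return (now - started_at) * 1000.0


if __name__ == "__main__":
    started = time.time()
    time.sleep(0.01)
    print(to_otel_start_time(started), latency_ms(started))
```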
LLM spans set an explicit OpenTelemetry status (OK or ERROR) and duplicate
it in the safe status / llm.response.status attributes. Token usage is
written as the OpenInference fields llm.token_count.* and as the gateway
aliases input_tokens, output_tokens, total_tokens.
Tool visibility is safe by default: Phoenix gets llm.tools.count,
llm.tools.names, llm.tool_calls.count, llm.tool_calls.names, plus
llm.tool_call events without arguments. Tool call arguments and full tool
schemas are written only with a double opt-in:
GPT2GIGA_OBSERVABILITY_CAPTURE_CONTENT=True and
GPT2GIGA_OBSERVABILITY_CAPTURE_TOOL_ARGS=True; they go through redaction before
being sent.
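The safe-by-default behaviour can be sketched as a function that always emits counts and names, and adds arguments only when both flags are set. The function names, the input shape (dicts with name and arguments), and the event field names are hypothetical; mask_all is a stand-in for the gateway's real redaction, which is not described here.

```python
def mask_all(arguments):
    """Stand-in redactor: masks every argument value."""
    return {key: "***" for key in (arguments or {})}


def tool_telemetry(tools, tool_calls, capture_content=False,
                   capture_tool_args=False, redact=mask_all):
    """Return (span attributes, llm.tool_call events) for one request."""
    attrs = {
        "llm.tools.count": len(tools),
        "llm.tools.names": [t["name"] for t in tools],
        "llm.tool_calls.count": len(tool_calls),
        "llm.tool_calls.names": [c["name"] for c in tool_calls],
    }
    include_args = capture_content and capture_tool_args  # double opt-in
    events = []
    for call in tool_calls:
        event = {"name": "llm.tool_call", "tool": call["name"]}
        if include_args:
            # Arguments always pass through redaction before being attached.
            event["arguments"] = redact(call.get("arguments"))
        events.append(event)
    return attrs, events


if __name__ == "__main__":
    tools = [{"name": "get_weather"}]
    calls = [{"name": "get_weather", "arguments": {"city": "Paris"}}]
    print(tool_telemetry(tools, calls))
    print(tool_telemetry(tools, calls, True, True))
```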
Phoenix spans also get a safe caller classification from the incoming headers:
caller.name: for example swagger-ui, redoc-ui, openai-python, anthropic-compatible, claude-code, codex, qwen-code, browser;
caller.category: ui, sdk, agent, browser, http_client, or unknown;
caller.client_family: openai or anthropic, when it can be inferred from the SDK headers or User-Agent;
caller.sdk, caller.agent, caller.ui: more precise subtypes, when they are known.
The detailed object is duplicated in annotations.caller, so that in Phoenix you
can open the structured context without enabling payload capture. For Swagger UI
the source is determined by Referer: .../docs, for ReDoc by .../redoc; raw
prompt/response content is not added to annotations.
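A minimal sketch of the Referer rule for the two documentation UIs. It covers only the Swagger UI and ReDoc cases stated above; the function name, the fallback values for other callers, and the choice to ignore User-Agent entirely are assumptions made for this illustration.

```python
from urllib.parse import urlparse


def classify_ui_caller(headers):
    """Classify Swagger UI / ReDoc callers from the Referer header only."""
    referer = headers.get("Referer") or headers.get("referer") or ""
    path = urlparse(referer).path.rstrip("/")
    if path.endswith("/docs"):
        return {"caller.name": "swagger-ui", "caller.category": "ui"}
    if path.endswith("/redoc"):
        return {"caller.name": "redoc-ui", "caller.category": "ui"}
    # Assumption: anything else is left unclassified in this sketch.
    return {"caller.name": None, "caller.category": "unknown"}


if __name__ == "__main__":
    # Placeholder host and port.
    print(classify_ui_caller({"Referer": "http://localhost:8090/docs"}))
    print(classify_ui_caller({"Referer": "http://localhost:8090/redoc"}))
    print(classify_ui_caller({}))
```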
Terms and design constraints are described in Logging and observability. For a checklist covering new providers/protocols and the related observability changes, see Adding a provider or protocol.