metrics module

Prometheus metrics primitives and helpers.

metrics.emit_event(event, severity='info', **fields)

Parameters:

metrics.generate_latest()

metrics.track_performance(operation, labels=None)

Parameters:

metrics.metrics_endpoint_bytes()

Return type:

bytes

metrics.metrics_content_type()

Return type:

str

metrics.note_active_user(user_id)

Record that a specific user was active recently, and update the gauge.

This uses a simple per-process in-memory set. It is a best-effort indicator that does not attempt cross-process aggregation, which is sufficient for basic dashboards and tests.

Return type:

None

Parameters:

user_id (int)

metrics.note_request_started()

Increment the in-flight requests gauge (best-effort).

Return type:

None

metrics.note_request_finished()

Decrement the in-flight requests gauge (never negative).

Return type:

None

metrics.get_active_requests_count()

Return the current in-flight request count (best-effort).

Return type:

int
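
The three in-flight helpers above describe a counter that is incremented on start, decremented on finish, and never drops below zero. A minimal sketch of that behavior follows; the function names and the lock are assumptions made for illustration, not the module's actual implementation.

```python
import threading

_lock = threading.Lock()
_in_flight = 0  # hypothetical stand-in for the in-flight requests gauge


def request_started() -> None:
    """Count one more request as in flight."""
    global _in_flight
    with _lock:
        _in_flight += 1


def request_finished() -> None:
    """Count one request as done, clamping at zero."""
    global _in_flight
    with _lock:
        _in_flight = max(0, _in_flight - 1)


def active_requests() -> int:
    """Read the current in-flight count."""
    with _lock:
        return _in_flight


request_started()
request_finished()
request_finished()        # unmatched finish: the count stays at zero
print(active_requests())  # 0
```

Clamping keeps the gauge meaningful even if a finish call is made twice for the same request, for example from both a handler and an error hook.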

metrics.get_current_memory_usage()

Return current process RSS in MB (best-effort).

Return type:

float

metrics.get_recent_errors_count(minutes=5)

Return the number of 5xx errors recorded within the last N minutes, where N is the minutes argument.

Return type:

int

Parameters:

minutes (int)
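
One way to picture a "last N minutes" count is a list of error timestamps filtered by a cutoff. The sketch below assumes that approach; `record_error`, `recent_errors_count`, and the explicit `now` argument (added so the example is deterministic) are hypothetical and may not match how the module stores its errors.

```python
import time
from collections import deque

# Monotonic timestamps (seconds) of recorded 5xx responses; hypothetical storage.
_error_times = deque()


def record_error(now=None) -> None:
    """Remember when a 5xx response was observed."""
    _error_times.append(time.monotonic() if now is None else now)


def recent_errors_count(minutes: int = 5, now=None) -> int:
    """Count recorded errors whose timestamp falls inside the window."""
    now = time.monotonic() if now is None else now
    cutoff = now - minutes * 60
    return sum(1 for t in _error_times if t >= cutoff)


record_error(now=0.0)
record_error(now=400.0)
print(recent_errors_count(5, now=600.0))   # 1 (only the error at t=400 is within 300 s)
print(recent_errors_count(10, now=600.0))  # 2
```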

metrics.get_top_slow_endpoints(limit=5, window_seconds=None)

Return the slowest endpoints observed recently (best-effort).

Return type:

List[Dict[str, Any]]

Parameters:
  • limit (int)

  • window_seconds (int | None)
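
As an illustration of the List[Dict[str, Any]] return shape, the sketch below ranks endpoints by their slowest observed duration. The dictionary keys (`endpoint`, `max_duration`, `count`) and the ranking criterion are assumptions; this reference does not state which fields or ordering the real function uses, and the time window is left out for brevity.

```python
import heapq
from typing import Any, Dict, List


def top_slow_endpoints(samples: Dict[str, List[float]], limit: int = 5) -> List[Dict[str, Any]]:
    """Return up to `limit` endpoints ordered by their slowest observed duration."""
    rows = [
        {"endpoint": endpoint, "max_duration": max(durations), "count": len(durations)}
        for endpoint, durations in samples.items()
        if durations  # skip endpoints without samples
    ]
    return heapq.nlargest(limit, rows, key=lambda row: row["max_duration"])


observed = {"/a": [0.1, 0.9], "/b": [0.5], "/c": [0.2]}
for row in top_slow_endpoints(observed, limit=2):
    print(row["endpoint"], row["max_duration"])
# /a 0.9
# /b 0.5
```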

metrics.get_slowest_endpoint()

Return a formatted string describing the slowest endpoint recently seen.

Return type:

str

metrics.note_deployment_started(summary='Service starting up')

Mark the start of a deployment and emit an informational alert.

Return type:

None

Parameters:

summary (str)

metrics.note_deployment_shutdown(summary='Service shutting down')

Emit a shutdown deployment event (does not reset latency grace period).

Return type:

None

Parameters:

summary (str)

metrics.get_avg_response_time_seconds()

Return the smoothed average HTTP response time (seconds).

Return type:

float

metrics.record_request_outcome(status_code, duration_seconds, *, source=None, handler=None, command=None, cache_hit=None, status_label=None, method=None, path=None)

Record a single HTTP request outcome across services.

  • Increments total requests

  • Increments failed requests (status >= 500)

  • Updates EWMA average response time gauge

  • Performs lightweight anomaly detection and emits internal alerts when thresholds are exceeded

Return type:

None

Parameters:
  • status_code (int)

  • duration_seconds (float)

  • source (str | None)

  • handler (str | None)

  • command (str | None)

  • cache_hit (bool | str | None)

  • status_label (str | None)

  • method (str | None)

  • path (str | None)
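
The counter and EWMA updates listed for record_request_outcome can be sketched in a few lines. The class `RequestStats`, its attribute names, and the smoothing factor `alpha` are hypothetical; the reference does not document the actual smoothing constant, and the anomaly detection step is omitted here.

```python
class RequestStats:
    """Hypothetical in-memory stand-in for the request counters and EWMA gauge."""

    def __init__(self, alpha: float = 0.2) -> None:
        self.alpha = alpha        # smoothing factor (assumed value)
        self.total = 0            # all requests
        self.failed = 0           # requests with status >= 500
        self.avg_seconds = None   # EWMA of response time; None until the first sample

    def record(self, status_code: int, duration_seconds: float) -> None:
        self.total += 1
        if status_code >= 500:
            self.failed += 1
        if self.avg_seconds is None:
            # The first sample seeds the average.
            self.avg_seconds = duration_seconds
        else:
            self.avg_seconds = (
                self.alpha * duration_seconds + (1 - self.alpha) * self.avg_seconds
            )


stats = RequestStats(alpha=0.5)
stats.record(200, 1.0)
stats.record(503, 3.0)
print(stats.total, stats.failed, stats.avg_seconds)  # 2 1 2.0
```

A larger `alpha` makes the average react faster to recent requests; a smaller one smooths out short spikes.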

metrics.update_health_gauges(*, mongo_connected=None, ping_ms=None, indexes_total=None, latency_ewma_ms=None)

Best-effort bridge from /healthz payload into Prometheus gauges.

Return type:

None

Parameters:
  • mongo_connected (bool | None)

  • ping_ms (float | None)

  • indexes_total (float | None)

  • latency_ewma_ms (float | None)

metrics.record_startup_stage_metric(stage, duration_ms)

Expose per-stage startup duration (milliseconds) via Prometheus.

Return type:

None

Parameters:
  • stage (str)

  • duration_ms (float | None)

metrics.record_startup_total_metric(duration_ms)

Expose total startup duration (milliseconds) via Prometheus.

Return type:

None

Parameters:

duration_ms (float | None)

metrics.record_http_request(method, endpoint, status_code, duration_seconds, *, path=None)

Record HTTP request metrics for SLO calculations.

  • Increments http_requests_total{method,endpoint,status}

  • Observes http_request_duration_seconds{method,endpoint}

This function is best-effort and never raises.

Return type:

None

Parameters:
  • method (str)

  • endpoint (str | None)

  • status_code (int)

  • duration_seconds (float)

  • path (str | None)
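
Several functions in this module are described as best-effort and never raising. One common way to get that guarantee is a decorator that swallows exceptions from the recording code; the sketch below shows the pattern with hypothetical names (`best_effort`, `record_sample`) and is not taken from the module's source.

```python
import functools
import logging

logger = logging.getLogger(__name__)


def best_effort(func):
    """Wrap a metrics helper so that failures are logged instead of raised."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # Metrics must never break the request path; keep a debug trace only.
            logger.debug("metric recording failed", exc_info=True)
            return None
    return wrapper


@best_effort
def record_sample(store: dict, key: str, value: float) -> None:
    store[key].append(value)  # raises KeyError for an unknown key


store = {"known": []}
record_sample(store, "known", 0.25)
record_sample(store, "missing", 0.5)  # error is swallowed, nothing is raised
print(store)  # {'known': [0.25]}
```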

metrics.record_request_queue_delay(method, endpoint, delay_seconds)

Record request queue delay (best-effort, never raises).

Return type:

None

Parameters:
  • method (str)

  • endpoint (str | None)

  • delay_seconds (float)

metrics.record_outbound_request_duration(service, endpoint, status, duration_seconds)

Return type:

None

Parameters:
  • service (str | None)

  • endpoint (str | None)

  • status (str | None)

  • duration_seconds (float)

metrics.increment_outbound_retry(service, endpoint)

Return type:

None

Parameters:
  • service (str | None)

  • endpoint (str | None)

metrics.set_circuit_state(service, endpoint, state_value)

Return type:

None

Parameters:
  • service (str | None)

  • endpoint (str | None)

  • state_value (float)

metrics.set_circuit_success_rate(service, endpoint, value)

Return type:

None

Parameters:
  • service (str | None)

  • endpoint (str | None)

  • value (float)

metrics.get_boot_monotonic()

Return the process boot monotonic timestamp captured by metrics import.

Return type:

float

metrics.mark_startup_complete()

Mark startup as complete and set app_startup_seconds/startup_completed gauges.

Safe no-op if metrics are unavailable.

Return type:

None

metrics.note_first_request_latency(duration_seconds=None)

Record the latency from process boot to first completed HTTP request.

If duration_seconds is None, compute against get_boot_monotonic().

Return type:

None

Parameters:

duration_seconds (float | None)

metrics.get_process_uptime_seconds()

Return approximate process uptime in seconds using perf_counter baseline.

This is computed as perf_counter() - get_boot_monotonic() to yield elapsed time since the baseline captured at import. It is best-effort and monotonic.

Return type:

float

metrics.record_dependency_init(dependency, duration_seconds)

Observe initialization time for a named dependency (Histogram).

Return type:

None

Parameters:
  • dependency (str)

  • duration_seconds (float)

metrics.record_db_operation(operation, duration_seconds, *, status='ok')

Record latency + count for database hot path operations.

Return type:

None

Parameters:
  • operation (str)

  • duration_seconds (float)

  • status (str | None)

metrics.get_uptime_percentage()

Compute uptime percentage based on request counters.

Uptime is approximated as 1 - (failed / total), expressed as a percentage. If the counters are unavailable or total == 0, the function returns 100.0.

Return type:

float
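
The approximation above can be written out as a small function. This sketch assumes the ratio is scaled to a 0 to 100 range, which matches the documented 100.0 fallback but is not stated explicitly in this reference; the function name and arguments are hypothetical.

```python
def uptime_percentage(total, failed) -> float:
    """Approximate uptime as the share of requests that did not fail."""
    if not total or total <= 0:
        # Counters unavailable (None) or no traffic yet: report full uptime.
        return 100.0
    ratio = 1.0 - (failed or 0) / total
    # Clamp to guard against inconsistent counter readings.
    return max(0.0, min(100.0, 100.0 * ratio))


print(uptime_percentage(4, 1))     # 75.0
print(uptime_percentage(0, 0))     # 100.0
print(uptime_percentage(None, 3))  # 100.0
```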

metrics.set_adaptive_observability_gauges(*, error_rate_threshold_percent=None, latency_threshold_seconds=None, current_error_rate_percent=None, current_latency_avg_seconds=None)

Update adaptive observability gauges. No-ops if gauges unavailable.

Return type:

None

Parameters:
  • error_rate_threshold_percent (float | None)

  • latency_threshold_seconds (float | None)

  • current_error_rate_percent (float | None)

  • current_latency_avg_seconds (float | None)

metrics.set_external_error_rate_percent(value)

Update the external error rate gauge (best-effort).

Return type:

None

Parameters:

value (float | None)

metrics.track_file_saved(user_id, language, size_bytes)

Record a file_saved business event.

Uses structured log for rich context and a lightweight Prometheus counter for volume.

Return type:

None

Parameters:
  • user_id (int)

  • language (str)

  • size_bytes (int)

metrics.track_search_performed(user_id, query, results_count)

Record a search event without logging raw query (privacy by default).

Return type:

None

Parameters:
  • user_id (int)

  • query (str)

  • results_count (int)
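
To illustrate what "without logging raw query" could look like in practice, the sketch below builds the structured event fields from the query's length only. The field names and the choice of length as the privacy-preserving summary are assumptions; the reference does not say which derived value, if any, the real function records.

```python
def search_event_fields(user_id: int, query: str, results_count: int) -> dict:
    """Build structured log fields for a search event without the raw query text."""
    return {
        "event": "search_performed",
        "user_id": user_id,
        "query_length": len(query or ""),  # summary only; the query itself is dropped
        "results_count": results_count,
    }


fields = search_event_fields(1, "secret token", 3)
print(fields["query_length"])           # 12
print("secret token" in str(fields))    # False
```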

metrics.track_github_sync(user_id, files_count, success)

Record a github_sync event (aggregate outcome only).

Return type:

None

Parameters: