feat: Add per-feature-view metrics for online read path by vanitabhagwat · Pull Request #365 · ExpediaGroup/feast

vanitabhagwat · 2026-06-03T05:39:19Z

Emit per-feature-view latency, request count, error count, and total lookup requests on every online read in the Go feature server (HTTP + gRPC). This enables per-FV hit-rate computation in Datadog and allows filtering latency/error distributions by feature view.

Key changes:

- Add Distribution() to StatsdClient interface
- New FeatureViewReadMetrics emitter (fv_read_latency_ms, fv_read_requests, fv_read_errors)
- Extend LookupMetricsAggregator with totalByFV for feature_lookup_requests
- Extract FV names from request (works with fullFeatureNames=false)
- New unified flag ENABLE_FV_LEVEL_METRICS (backward compatible with ENABLE_MISSING_KEY_METRICS)
- Instrument GetOnlineFeatures and GetOnlineFeaturesRange in both HTTP and gRPC handlers

What this PR does / why we need it:

Which issue(s) this PR fixes:

Misc

…uests, errors, hit rate)

Emit per-feature-view latency, request count, error count, and total lookup requests on every online read in the Go feature server (HTTP + gRPC). This enables per-FV hit-rate computation in Datadog and allows filtering latency/error distributions by feature view.

Key changes:
- Add Distribution() to StatsdClient interface
- New FeatureViewReadMetrics emitter (fv_read_latency_ms, fv_read_requests, fv_read_errors)
- Extend LookupMetricsAggregator with totalByFV for feature_lookup_requests
- Extract FV names from request (works with fullFeatureNames=false)
- New unified flag ENABLE_FV_LEVEL_METRICS (backward compat with ENABLE_MISSING_KEY_METRICS)
- Instrument GetOnlineFeatures and GetOnlineFeaturesRange in both HTTP and gRPC handlers

Co-Authored-By: Claude Opus 4.6 <[email protected]>

… emission

- Define constants for all metric names (FVReadLatencyMetric, FVReadRequestsMetric, FVReadErrorsMetric, LookupNotFoundMetric, LookupNullOrExpiredMetric, LookupRequestsMetric)
- Extract emitFVReadMetrics helper into server_commons.go to eliminate 4 identical nil-check + construct + emit patterns across HTTP and gRPC handlers
- Update tests to reference constants instead of string literals

Co-Authored-By: Claude Opus 4.6 <[email protected]>

…te at startup

Extract sample-rate parsing into a single ParseSampleRate() helper in config.go and introduce MetricsContext that builds FeatureViewReadMetrics and baseTags once at server startup. This eliminates repeated env reads and per-request tag allocations on the hot path in both gRPC and HTTP servers.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

zabarn · 2026-06-11T21:48:45Z

+	if m.sampleRate < 1.0 && rand.Float64() > m.sampleRate {
+		return
+	}


Isn't the last argument of m.client.Count() the sampling rate?

Couldn't we drop this guard and then do m.client.Count(FVReadRequestsMetric, 1, tags, m.sampleRate) (same with the other m.client.*() calls)?

Worth aligning lookup_metrics.go to the same approach so the two stay consistent.
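
A minimal sketch of what dropping the guard could look like, assuming the client's last argument is a sampling rate as in datadog-go's Count and Distribution. The statsdClient interface, recordingClient fake, and fvReadMetrics type below are hypothetical stand-ins written for illustration, not the types in this PR; only the metric names come from the description.

```go
package main

import "fmt"

// statsdClient is a hypothetical, trimmed-down stand-in for the StatsdClient
// interface touched by this PR. The method shapes mirror datadog-go.
type statsdClient interface {
	Count(name string, value int64, tags []string, rate float64) error
	Distribution(name string, value float64, tags []string, rate float64) error
}

// call records one emission so the forwarded rate can be inspected.
type call struct {
	Name string
	Rate float64
}

// recordingClient is a test fake; it performs no sampling and no network I/O.
type recordingClient struct{ Calls []call }

func (r *recordingClient) Count(name string, _ int64, _ []string, rate float64) error {
	r.Calls = append(r.Calls, call{Name: name, Rate: rate})
	return nil
}

func (r *recordingClient) Distribution(name string, _ float64, _ []string, rate float64) error {
	r.Calls = append(r.Calls, call{Name: name, Rate: rate})
	return nil
}

// fvReadMetrics is a hypothetical emitter: there is no rand.Float64() guard,
// the sample rate is simply forwarded to every client call.
type fvReadMetrics struct {
	client     statsdClient
	sampleRate float64
}

func (m *fvReadMetrics) Emit(tags []string, latencyMs float64, failed bool) {
	_ = m.client.Count("fv_read_requests", 1, tags, m.sampleRate)
	_ = m.client.Distribution("fv_read_latency_ms", latencyMs, tags, m.sampleRate)
	if failed {
		_ = m.client.Count("fv_read_errors", 1, tags, m.sampleRate)
	}
}

func main() {
	rec := &recordingClient{}
	m := &fvReadMetrics{client: rec, sampleRate: 0.25}
	m.Emit([]string{"feature_view:driver_stats"}, 1.7, false)
	for _, c := range rec.Calls {
		fmt.Printf("%s rate=%.2f\n", c.Name, c.Rate)
	}
}
```

With this shape the decision of which packets to drop sits in one place (the client), which is also what would make it straightforward to align lookup_metrics.go.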

zabarn · 2026-06-11T21:52:52Z

 		request.GetRequestContext(),
 		request.GetFullFeatureNames())

+	latencyMs := float64(time.Since(t0).Milliseconds())


time.Duration.Milliseconds() returns a truncated integer, so any fractional latency is floored (e.g. 1.4ms and 1.9ms both record as 1, and anything under 1ms records as 0). That collapses the latency distribution into coarse integer buckets and skews the percentiles. Could we use float64(time.Since(t0).Microseconds()) / 1000.0 to keep sub-ms precision?

zabarn · 2026-06-11T21:53:18Z

 		request.GetFullFeatureNames(),
 	)

+	latencyMs := float64(time.Since(t0).Milliseconds())


https://github.com/ExpediaGroup/feast/pull/365/changes#r3399405338

zabarn · 2026-06-11T21:53:24Z

 		requestContextProto,
 		request.FullFeatureNames)

+	latencyMs := float64(time.Since(t0).Milliseconds())


https://github.com/ExpediaGroup/feast/pull/365/changes#r3399405338

zabarn · 2026-06-11T21:53:34Z

 		requestContextProto,
 		request.FullFeatureNames)

+	latencyMs := float64(time.Since(t0).Milliseconds())


https://github.com/ExpediaGroup/feast/pull/365/changes#r3399405338

zabarn · 2026-06-11T22:02:24Z

 	"github.com/feast-dev/feast/go/internal/feast/registry"
 )

+const DefaultSampleRate = 0.01


Someone who previously ran with ENABLE_MISSING_KEY_METRICS=true and no FEAST_METRICS_SAMPLE_RATE set used to get exact counts (old default 1.0), and after this change they silently get 1% sampling with ×100 extrapolation on the same feature_lookup_* metrics. So existing dashboards for that feature go from exact to coarse/noisy without any config change on their end.

I agree we shouldn't change the existing default sample rate. Since this new metric is very different and requires a new default, I think it's best to create a separate sampling-rate config and default for it.

zabarn · 2026-06-11T22:04:51Z

+func IsFVMetricsEnabled() bool {
+	return strings.ToLower(os.Getenv("ENABLE_FV_LEVEL_METRICS")) == "true" ||
+		IsMissingKeyMetricsEnabled()
+}


This couples two separate features. Enabling ENABLE_MISSING_KEY_METRICS now also turns on the new FV-level metrics. Could we keep IsFVMetricsEnabled() checking only its own flag, and introduce a separate IsMetricsClientEnabled() (the OR of both) for the main.go gate that decides whether to build the statsd client? That keeps each flag controlling its own feature and the function name matching its behavior.

Agreed, these metrics should be enabled independently.

piket · 2026-06-15T17:53:11Z

+	}
+}
+
+func extractFVNamesFromRequest(features []string, featureService *model.FeatureService) []string {


Move this to server_commons since it is used in both HTTP and gRPC.
It would also be good to include unit test coverage for this.
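
As a rough idea of the cases such a test might cover, here is a simplified stand-in that handles only explicit feature references in Feast's "feature_view:feature" form. fvNamesFromFeatureRefs and sameStrings are hypothetical; the real extractFVNamesFromRequest also takes a *model.FeatureService, which this sketch deliberately leaves out.

```go
package main

import (
	"fmt"
	"strings"
)

// fvNamesFromFeatureRefs returns the distinct feature-view names found in
// "feature_view:feature" references, in first-seen order. References without
// a feature-view prefix are skipped.
func fvNamesFromFeatureRefs(features []string) []string {
	seen := make(map[string]struct{})
	var names []string
	for _, ref := range features {
		fv, _, ok := strings.Cut(ref, ":")
		if !ok || fv == "" {
			continue
		}
		if _, dup := seen[fv]; dup {
			continue
		}
		seen[fv] = struct{}{}
		names = append(names, fv)
	}
	return names
}

// sameStrings compares two slices element by element, treating nil and
// empty slices as equal.
func sameStrings(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	cases := []struct {
		name string
		in   []string
		want []string
	}{
		{"dedupes within one view", []string{"fv_a:f1", "fv_a:f2"}, []string{"fv_a"}},
		{"keeps first-seen order", []string{"fv_b:f1", "fv_a:f1", "fv_b:f2"}, []string{"fv_b", "fv_a"}},
		{"skips malformed refs", []string{"no_separator", ":f1"}, nil},
		{"empty request", nil, nil},
	}
	for _, c := range cases {
		got := fvNamesFromFeatureRefs(c.in)
		fmt.Printf("%s: ok=%v\n", c.name, sameStrings(got, c.want))
	}
}
```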

vanitabhagwat and others added 2 commits June 2, 2026 22:36

vanitabhagwat changed the title ~~feat: Add per-feature-view metrics for online read path (latency, req…~~ feat: Add per-feature-view metrics for online read path Jun 3, 2026

vanitabhagwat force-pushed the feat-per-fv-read-metrics branch from 104b30c to 0bfdf07 Compare June 11, 2026 17:47

zabarn reviewed Jun 11, 2026

piket reviewed Jun 15, 2026

vanitabhagwat wants to merge 3 commits into master from feat-per-fv-read-metrics

3 participants
