Conversation
📝 WalkthroughWalkthroughExpands provider normalization and display-name mappings, strips provider prefixes from fetched model names, adds post-normalization deduplication and MinSourceAgreement clamping in the sync flow, updates tests, and adds SQL migrations to normalize existing llm cost rows. Changes
Sequence DiagramsequenceDiagram
participant Fetcher as Model Fetcher
participant Norm as Normalizer
participant Dedup as Deduplicator
participant Filter as Price Filter
participant Sync as DB Sync
Fetcher->>Norm: Fetch raw prices (provider aliases, prefixed model names)
Norm->>Norm: NormalizeProvider()/NormalizeModelID() -> canonical provider & model_id
Norm->>Dedup: Emit normalized prices
Dedup->>Dedup: deduplicateSourcePrices(prices) by (source, provider, model_id)\nprefer ModelName without provider prefix
Dedup->>Filter: Return deduplicated set
Filter->>Filter: Apply min_agreement and other filters
Filter->>Sync: Upsert filtered prices into DB
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
6f73a7c to
7d739cf
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (2)
api/v3/handlers/llmcost/convert.go (1)
38-57: Normalize first, then fall back.This formatter now only knows canonical IDs. If an alias slips through —
apiCreateOverrideToDomainlater in this file still forwardsbody.Provideras-is — the API falls back toAws/Azure Ai/X Aiinstead of the canonical names. Doing the map lookup onllmcost.NormalizeProvider(id)and only using the originalidfor the unknown-provider fallback would make this path a lot more robust.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/v3/handlers/llmcost/convert.go` around lines 38 - 57, The formatProviderName function currently looks up providerDisplayNames using the raw id; change it to first call llmcost.NormalizeProvider(id) and use that normalized key when checking providerDisplayNames (still keep the original id variable/value for the fallback split logic), so canonical aliases map correctly; update references to providerDisplayNames and the lookup in formatProviderName to use the normalized id but leave the hyphen/underscore split/fallback using the original input when normalization yields no match.openmeter/llmcost/sync/normalizer_test.go (1)
17-42: Add a few mixed-case + padded variants for the new aliases.Quick improvement: include a couple of inputs like uppercase + surrounding spaces for newly added aliases (not just
openai) to protect the trim/lowercase path and alias lookup together.Suggested test additions
{"x-ai", "xai"}, + {" X-AI ", "xai"}, {"minimax", "minimax"}, {"nano-gpt", "nanogpt"}, {"nano_gpt", "nanogpt"}, {"nanogpt", "nanogpt"}, + {" BEDROCK_CONVERSE ", "bedrock"}, {"vertex_ai-language-models", "vertex_ai"}, + {"VERTEX_AI-LANGUAGE-MODELS", "vertex_ai"}, {"vertex_ai-text-models", "vertex_ai"}, {"vertex_ai_something", "vertex_ai"},As per coding guidelines,
**/*_test.go: "Make sure the tests are comprehensive and cover the changes. Keep a strong focus on unit tests and in-code integration tests."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openmeter/llmcost/sync/normalizer_test.go` around lines 17 - 42, The test data in normalizer_test.go lacks mixed-case and whitespace-padded variants for the newly added aliases; update the test cases to include a few inputs per new alias that combine surrounding spaces and mixed casing (e.g., " Azure_AI ", "GEMINI", " MistralAI ", " X-AI ", " Nano-GPT ", " Vertex_AI-Language-Models ") so the trim+lowercase normalization path is exercised; target the table/fixture used in the test (the alias->expected mapping) and add these padded/mixed-case entries alongside the existing ones to ensure alias lookup and normalization (trim/lowercase) are validated.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@openmeter/llmcost/sync/sync.go`:
- Around line 96-120: deduplicateSourcePrices is currently order-dependent when
two entries share the same sourceModelKey and neither wins the "no provider
prefix" test; update it to apply a deterministic tie-breaker instead of keeping
the first seen entry. In the collision branch inside deduplicateSourcePrices
(referencing seen, result, sourceModelKey and ModelName), compare the existing
entry result[idx] and the incoming p using a stable rule (for example: prefer
the entry with no provider prefix; if both same prefix-status, pick
deterministically by comparing ModelName, then Provider, then a numeric
tie-breaker like lower PricePerUnit or StoragePrice; if values differ log the
conflict) and replace result[idx] only when the tie-breaker chooses p (also add
a concise log when conflicting entries differ in key fields). This makes
canonical selection stable across runs.
- Around line 69-83: The cap logic currently reduces the default quorum when
config.MinSourceAgreement <= 0; change it so the cap to len(fetchers) only
applies when the user explicitly set MinSourceAgreement > 0. In practice: keep
the existing assignment minAgreement := config.MinSourceAgreement; if
minAgreement <= 0 set minAgreement = DefaultMinSourceAgreement; then only run
the capping branch "if numFetchers := len(fetchers); numFetchers > 0 &&
config.MinSourceAgreement > 0 && minAgreement > numFetchers { minAgreement =
numFetchers }" (i.e., check config.MinSourceAgreement > 0 instead of the
computed minAgreement) so defaults aren't silently lowered before creating the
Reconciler (NewReconciler).
In `@tools/migrate/migrations/20260318170735_llmcost_normalize_providers.up.sql`:
- Around line 5-111: Step 1's duplicate soft-delete only checks (provider,
model_id, namespace) and ignores effective_from, which can cause valid
historical rows to be removed and leave alias-vs-alias pairs to collide in Step
2; change the duplicate detection and dedupe logic on llm_cost_prices to include
effective_from (i.e., treat canonical key as provider + model_id + namespace +
effective_from), update every EXISTS and matching condition in the Step 1
UPDATEs to also compare effective_from (and/or use a dedupe CTE that partitions
by canonical provider, model_id, namespace, effective_from and keeps the
canonical row while soft-deleting aliases), and ensure the same key is used when
deciding which alias rows to rename in Step 2 so alias-vs-alias collisions are
eliminated before renaming.
---
Nitpick comments:
In `@api/v3/handlers/llmcost/convert.go`:
- Around line 38-57: The formatProviderName function currently looks up
providerDisplayNames using the raw id; change it to first call
llmcost.NormalizeProvider(id) and use that normalized key when checking
providerDisplayNames (still keep the original id variable/value for the fallback
split logic), so canonical aliases map correctly; update references to
providerDisplayNames and the lookup in formatProviderName to use the normalized
id but leave the hyphen/underscore split/fallback using the original input when
normalization yields no match.
In `@openmeter/llmcost/sync/normalizer_test.go`:
- Around line 17-42: The test data in normalizer_test.go lacks mixed-case and
whitespace-padded variants for the newly added aliases; update the test cases to
include a few inputs per new alias that combine surrounding spaces and mixed
casing (e.g., " Azure_AI ", "GEMINI", " MistralAI ", " X-AI ", " Nano-GPT ",
" Vertex_AI-Language-Models ") so the trim+lowercase normalization path is
exercised; target the table/fixture used in the test (the alias->expected
mapping) and add these padded/mixed-case entries alongside the existing ones to
ensure alias lookup and normalization (trim/lowercase) are validated.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 269cc2c6-8711-441c-a964-e31fd1bece49
⛔ Files ignored due to path filters (1)
tools/migrate/migrations/atlas.sumis excluded by!**/*.sum,!**/*.sum
📒 Files selected for processing (9)
api/v3/handlers/llmcost/convert.goopenmeter/llmcost/normalize.goopenmeter/llmcost/normalize_test.goopenmeter/llmcost/sync/fetcher_modelsdev.goopenmeter/llmcost/sync/normalizer_test.goopenmeter/llmcost/sync/sync.goopenmeter/llmcost/sync/sync_test.gotools/migrate/migrations/20260318170735_llmcost_normalize_providers.down.sqltools/migrate/migrations/20260318170735_llmcost_normalize_providers.up.sql
tools/migrate/migrations/20260318170735_llmcost_normalize_providers.up.sql
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
🧹 Nitpick comments (3)
openmeter/llmcost/sync/normalizer_test.go (1)
16-42: Tests look solid! One small observation.The test cases align well with the expanded normalize.go mappings. However, I noticed the
gemini → googlemapping exists innormalize.gobut isn't explicitly tested here (thoughgoogle → googleis). Might be worth adding for completeness:{"gemini", "google"},That said, if you're confident it's covered elsewhere (like in
normalize_test.go), feel free to skip.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openmeter/llmcost/sync/normalizer_test.go` around lines 16 - 42, The test matrix in normalizer_test.go is missing an explicit entry for the "gemini" → "google" mapping (which exists in normalize.go); add a test case {"gemini", "google"} into the table of provider mappings used by the test (the slice of test cases in normalizer_test.go) so the normalize provider mapping for "gemini" is asserted directly by the test.api/v3/handlers/llmcost/convert.go (1)
47-57: Fallback formatting looks good!The split-on-hyphens-and-underscores approach handles unknown providers gracefully. One tiny edge case: if someone passes a provider ID like
"--foo", you'd get["", "", "foo"]which after capitalization becomes" Foo"with leading spaces. Probably not a real-world concern, but worth mentioning.🛡️ Optional: Defensive trim
return strings.Join(words, " ") + // Or consider: return strings.TrimSpace(strings.Join(words, " "))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/v3/handlers/llmcost/convert.go` around lines 47 - 57, The fallback can produce empty entries (e.g., id="--foo") and yield leading spaces; after splitting with strings.FieldsFunc into words, filter out empty strings before capitalizing: iterate over words, skip if w=="" (or use strings.TrimSpace(w) to be safe), capitalize the non-empty ones and append to a new slice, then return strings.Join of that filtered slice; reference the id input, words slice, strings.FieldsFunc call, the capitalization loop, and the final strings.Join.openmeter/llmcost/sync/sync.go (1)
96-120: Order-dependency is intentional and already tested—no action needed, but tie-breaker remains optional.Good catch on the order-dependency edge case. The test suite already covers this scenario explicitly (see "keeps first entry when both have clean names" and "keeps first entry when both have prefixed names" tests), showing that when two duplicates are equivalent, the function keeps whichever comes first, even if pricing differs. The tests confirm this is the intended behavior.
Given that the tests document this tradeoff and accept it (they even test with different pricing values: 0.01/0.03 vs 0.0101/0.0301), the current implementation is working as designed. The deduplication runs after normalization to prevent false multi-source agreement in the reconciler, so keeping the first entry is a reasonable choice.
If you want full determinism via a tie-breaker (comparing pricing or alphabetizing
ModelName), that'd be a nice enhancement but isn't critical. The current approach relies on stable input ordering and consistent pricing within a source, which seems fine for the use case.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openmeter/llmcost/sync/sync.go` around lines 96 - 120, The deduplication order-dependency is intentional and already covered by tests, so leave deduplicateSourcePrices as-is (function deduplicateSourcePrices, map seen[sourceModelKey], and the replacement logic that checks result[idx].ModelName vs p.ModelName should not be changed); if you later want full determinism, implement an explicit tie-breaker inside deduplicateSourcePrices (for example compare pricing fields or alphabetize ModelName before deciding to replace the existing entry) but no code change is required now.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@api/v3/handlers/llmcost/convert.go`:
- Around line 47-57: The fallback can produce empty entries (e.g., id="--foo")
and yield leading spaces; after splitting with strings.FieldsFunc into words,
filter out empty strings before capitalizing: iterate over words, skip if w==""
(or use strings.TrimSpace(w) to be safe), capitalize the non-empty ones and
append to a new slice, then return strings.Join of that filtered slice;
reference the id input, words slice, strings.FieldsFunc call, the capitalization
loop, and the final strings.Join.
In `@openmeter/llmcost/sync/normalizer_test.go`:
- Around line 16-42: The test matrix in normalizer_test.go is missing an
explicit entry for the "gemini" → "google" mapping (which exists in
normalize.go); add a test case {"gemini", "google"} into the table of provider
mappings used by the test (the slice of test cases in normalizer_test.go) so the
normalize provider mapping for "gemini" is asserted directly by the test.
In `@openmeter/llmcost/sync/sync.go`:
- Around line 96-120: The deduplication order-dependency is intentional and
already covered by tests, so leave deduplicateSourcePrices as-is (function
deduplicateSourcePrices, map seen[sourceModelKey], and the replacement logic
that checks result[idx].ModelName vs p.ModelName should not be changed); if you
later want full determinism, implement an explicit tie-breaker inside
deduplicateSourcePrices (for example compare pricing fields or alphabetize
ModelName before deciding to replace the existing entry) but no code change is
required now.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 1ffc0acb-21af-4564-bd35-e46db995e3af
⛔ Files ignored due to path filters (1)
tools/migrate/migrations/atlas.sumis excluded by!**/*.sum,!**/*.sum
📒 Files selected for processing (9)
api/v3/handlers/llmcost/convert.goopenmeter/llmcost/normalize.goopenmeter/llmcost/normalize_test.goopenmeter/llmcost/sync/fetcher_modelsdev.goopenmeter/llmcost/sync/normalizer_test.goopenmeter/llmcost/sync/sync.goopenmeter/llmcost/sync/sync_test.gotools/migrate/migrations/20260318170735_llmcost_normalize_providers.down.sqltools/migrate/migrations/20260318170735_llmcost_normalize_providers.up.sql
🚧 Files skipped from review as they are similar to previous changes (5)
- openmeter/llmcost/normalize_test.go
- openmeter/llmcost/sync/fetcher_modelsdev.go
- openmeter/llmcost/sync/sync_test.go
- tools/migrate/migrations/20260318170735_llmcost_normalize_providers.down.sql
- tools/migrate/migrations/20260318170735_llmcost_normalize_providers.up.sql
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tools/migrate/migrations/20260318170735_llmcost_normalize_providers.up.sql`:
- Around line 19-20: The migration's use of IS NOT DISTINCT FROM and PARTITION
BY namespace conflicts with the unique index behavior because the index
llmcostprice_provider_model_id_namespace_effective_from does not mark NULL as
not distinct; either update the unique index to treat NULLs as equal (add NULLS
NOT DISTINCT to the index definition for the namespace partition) so the index
semantics match the migration, or modify the migration SQL (the
dedup/soft-delete queries that compare canon.namespace and dup.namespace) to
explicitly skip or handle rows where namespace IS NULL (e.g., only dedupe when
namespace IS NOT NULL or use logic that matches the current index behavior);
locate references to llmcostprice_provider_model_id_namespace_effective_from and
the queries using IS NOT DISTINCT FROM in this migration and apply one of these
two changes so index semantics and migration logic align.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c8c4c607-411c-43dc-9cef-b8da1e30dc99
⛔ Files ignored due to path filters (1)
tools/migrate/migrations/atlas.sumis excluded by!**/*.sum,!**/*.sum
📒 Files selected for processing (1)
tools/migrate/migrations/20260318170735_llmcost_normalize_providers.up.sql
| AND canon.namespace IS NOT DISTINCT FROM dup.namespace | ||
| AND canon.effective_from = dup.effective_from |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Locate llm_cost_prices unique index definition in SQL migrations =="
rg -n -C4 --type=sql 'CREATE UNIQUE INDEX|UNIQUE INDEX|llm_cost_prices|namespace|effective_from|deleted_at|NULLS NOT DISTINCT|COALESCE\(namespace' tools/migrate/migrations
echo
echo "== Optional: locate schema/index builders outside migrations =="
rg -n -C4 --type=go 'llm_cost_prices|Index\(|Unique\(|namespace|effective_from|deleted_at|NULLS NOT DISTINCT|COALESCE' .Repository: openmeterio/openmeter
Length of output: 50379
🏁 Script executed:
# Read the specific migration file
cat -n tools/migrate/migrations/20260318170735_llmcost_normalize_providers.up.sqlRepository: openmeterio/openmeter
Length of output: 6672
🏁 Script executed:
# Search specifically for llm_cost_prices unique index across all migrations
rg -n 'llm_cost_prices.*unique|CREATE.*INDEX.*llm_cost_prices' -i tools/migrate/migrations/Repository: openmeterio/openmeter
Length of output: 2043
🏁 Script executed:
# Verify the exact index definition one more time and check for any NULLS NOT DISTINCT usage
grep -A2 "llmcostprice_provider_model_id_namespace_effective_from" tools/migrate/migrations/20260303233513_llmcost_add.up.sqlRepository: openmeterio/openmeter
Length of output: 349
Namespace NULL semantics don't match the unique index.
Good catch! The unique index llmcostprice_provider_model_id_namespace_effective_from doesn't have NULLS NOT DISTINCT, so PostgreSQL treats multiple NULL namespaces as distinct—they won't collide on the index.
But the migration uses IS NOT DISTINCT FROM (lines 19, 33, 46, 59, 72, 85, 106) and PARTITION BY namespace (line 106), which treats NULLs as equal. This means the migration will soft-delete rows that wouldn't actually violate the unique constraint.
You'll need to either add NULLS NOT DISTINCT to the index to align with the migration logic, or adjust the migration to skip/handle NULL namespaces separately to match how the index actually behaves.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/migrate/migrations/20260318170735_llmcost_normalize_providers.up.sql`
around lines 19 - 20, The migration's use of IS NOT DISTINCT FROM and PARTITION
BY namespace conflicts with the unique index behavior because the index
llmcostprice_provider_model_id_namespace_effective_from does not mark NULL as
not distinct; either update the unique index to treat NULLs as equal (add NULLS
NOT DISTINCT to the index definition for the namespace partition) so the index
semantics match the migration, or modify the migration SQL (the
dedup/soft-delete queries that compare canon.namespace and dup.namespace) to
explicitly skip or handle rows where namespace IS NULL (e.g., only dedupe when
namespace IS NOT NULL or use logic that matches the current index behavior);
locate references to llmcostprice_provider_model_id_namespace_effective_from and
the queries using IS NOT DISTINCT FROM in this migration and apply one of these
two changes so index semantics and migration logic align.
Overview
Fixes #(issue)
Notes for reviewer
Summary by CodeRabbit
New Features
Bug Fixes
Chores
Tests