Skip to content

(GH-538) Define newtypes for tags field#1382

Draft
michaeltlombardi wants to merge 3 commits intoPowerShell:mainfrom
michaeltlombardi:gh-538/main/tag-newtype
Draft

(GH-538) Define newtypes for tags field#1382
michaeltlombardi wants to merge 3 commits intoPowerShell:mainfrom
michaeltlombardi:gh-538/main/tag-newtype

Conversation

@michaeltlombardi
Copy link
Collaborator

@michaeltlombardi michaeltlombardi commented Feb 9, 2026

PR Summary

This change addresses limitation in representing the tags field by defining two newtypes:

  • Tag as a wrapper around String that follows the "parse, don't validate" pattern to move pattern validation into the constructor. It also treats comparisons with tags case-insensitively. The defined JSON Schema for this type is marked as inline because it is always and only used to validate items in the tags field for various structs.
  • TagList as a transparent wrapper around HashSet<Tag>. It provides no additional functionality, it just enables us to define a JSON Schema for this reusable type. The representation here uses HashSet rather than Vec because the items in a tag list must be unique (JSON Schema keyword uniqueItems defined as true).

This change modifies the code for the ResourceManifest and ExtensionManifest structs to use TagList instead of Option<Vec<String>> and updates other code as necessary to address the change in type.

PR Context

Prior to this change, the representation for both resource and extension tags was defined as Option<Vec<String>>. With this construction, we can't define a reusable schema for tag lists (because we don't own the JsonSchema trait or the Option<T> type).

The existing implementation also doesn't adhere to the validation from the canonical JSON Schema for tags, which defines the regex pattern ^\w+$ for tags.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces two newtypes (Tag and TagList) to replace the previous Option<Vec<String>> representation for tags in resource and extension manifests. The change implements the "parse, don't validate" pattern by moving tag validation into the constructor and enables case-insensitive tag comparisons while providing canonical JSON schemas for these types.

Changes:

  • Defines Tag newtype wrapping String with regex validation (^\w+$) and case-insensitive comparison/hashing
  • Defines TagList newtype wrapping HashSet<Tag> to ensure uniqueness and provide reusable schema
  • Updates ResourceManifest and ExtensionManifest to use TagList instead of Option<Vec<String>>
  • Adds comprehensive integration tests following the repository's testing patterns
  • Updates tag filtering logic in list_resources to leverage case-insensitive comparison

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
lib/dsc-lib/src/types/tag.rs Defines Tag newtype with validation pattern ^\w+$, case-insensitive comparison/hashing, and extensive trait implementations
lib/dsc-lib/src/types/tag_list.rs Defines TagList newtype wrapping HashSet<Tag> with ergonomic trait implementations
lib/dsc-lib/src/types/mod.rs Exports new Tag and TagList types
lib/dsc-lib/src/extensions/extension_manifest.rs Changes tags field from Option<Vec<String>> to TagList with #[serde(default)]
lib/dsc-lib/src/dscresources/resource_manifest.rs Changes tags field from Option<Vec<String>> to TagList with #[serde(default)]
lib/dsc-lib/src/dscerror.rs Adds InvalidTag error variant for tag validation failures
lib/dsc-lib/locales/en-us.toml Adds localized error message keys for invalid tag errors
lib/dsc-lib/locales/schemas.definitions.yaml Adds comprehensive documentation for tag and tags schema definitions
dsc/src/subcommand.rs Updates tag filtering to use is_empty() and case-insensitive comparison via Tag type
lib/dsc-lib/tests/integration/types/tag.rs Comprehensive integration tests for Tag covering methods, schema, serde, and traits
lib/dsc-lib/tests/integration/types/tag_list.rs Comprehensive integration tests for TagList covering schema, serde, and traits
lib/dsc-lib/tests/integration/types/mod.rs Registers new test modules for tag and tag_list

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Prior to this change, the representation for both resource and
extension tags was defined as `Option<Vec<String>>`. With this
construction, we can't define a reusable schema for tag lists (because
we don't own the `JsonSchema` trait or the `Option<T>` type).

The existing implementation also doesn't adhere to the validation from
the canonical JSON Schema for tags, which defines the regex pattern
`^\w+$` for tags.

This change addresses these limitations by defining two newtypes:

- `Tag` as a wrapper around `String` that follows the "parse, don't
  validate" pattern to move pattern validation into the constructor. It
  also treats comparisons with tags case-insensitively. The defined
  JSON Schema for this type is marked as `inline` because it is always
  and only used to validate items in the `tags` field for various
  structs.
- `TagList` as a transparent wrapper around `HashSet<Tag>`. It provides
  no additional functionality, it just enables us to define a JSON
  Schema for this reusable type. The representation here uses `HashSet`
  rather than `Vec` because the items in a tag list must be unique
  (JSON Schema keyword `uniqueItems` defined as `true`).

  We can still skip serializing when the `HashSet` is empty, like we can
  skip serializing when an `Option` is `None`.
Prior to this change, the resource manifest struct defined the `tags`
field as `Option<Vec<String>>`. This change updates the field to use
the `TagList` newtype, which wraps `HashSet<Tag>` to provide better
semantics and JSON Schema for the type.

This change ensures that references to the `tags` field are updated
through the code base.
Prior to this change, the extension manifest struct defined the `tags`
field as `Option<Vec<String>>`. This change updates the field to use
the `TagList` newtype, which wraps `HashSet<Tag>` to provide better
semantics and JSON Schema for the type.

The `tags` field isn't directly accessed in any existing code.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

// that the serialized data is deterministic.
impl From<TagList> for Vec<Tag> {
fn from(value: TagList) -> Self {
let mut list: Vec<Tag> = value.iter().cloned().collect();
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From<TagList> for Vec<Tag> consumes the TagList but currently builds the vector via value.iter().cloned(), which clones every Tag (and its inner String). Since value is owned here, you can move tags out of the underlying HashSet (e.g., via into_iter() on the inner set) before sorting to avoid the per-item clones during serialization.

Suggested change
let mut list: Vec<Tag> = value.iter().cloned().collect();
let mut list: Vec<Tag> = value.0.into_iter().collect();

Copilot uses AI. Check for mistakes.
strings as list items, like `["macOS", "macOS"]`. However, due to limitations in the
expressiveness of JSON Schema, a list that defines the same tag with different casing
will pass JSON Schema validation, like `["MACOS", "macos", "macOS"]`. When DSC
deserializes that JSON array value, only records the _first_ tag for any set of tags that
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grammar: the sentence "When DSC deserializes that JSON array value, only records the first tag..." is missing a subject. Consider inserting "it" ("..., it only records...") to make the documentation read correctly.

Suggested change
deserializes that JSON array value, only records the _first_ tag for any set of tags that
deserializes that JSON array value, it only records the _first_ tag for any set of tags that

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant