Add a kw-based function to create a builder#315
Conversation
|
Tokenizer information is still missing. The current approach will be to reuse serde information via https://docs.rs/serde_plain/latest/serde_plain/index.html maybe. Generating the python type stub information at compile time would be really helpful. |
| # not yet supported | ||
| # tokenizer: Optional[str], | ||
| validate_transform: bool = True, | ||
| target: Optional[str] = None, |
There was a problem hiding this comment.
We should support a StrEnum for this, and then have the signature be Optional[Union[EncoderfileTarget, str]]
There was a problem hiding this comment.
Hmmmm, sorry, I forgot that one 🙈 I'd just use the strenum in the type annotation, why give the choice 😄
There was a problem hiding this comment.
Hmmm, now I see the issue. I'll accept the target as a string, but possibly implement getters in a targetspec class that doesn't have a new method (unlike other members there). Open to discussion.
There was a problem hiding this comment.
I'd love to have a format attr like in OpenAPI, frankly. For the moment I'm accepting any string and validating in the rust side (see the test_encoderfilebuilder_wrong_arch test). Anything more detailed feels overkill, but some detailed types for the target components would be welcome, true.
| if s.eq_ignore_ascii_case("batch_longest") { | ||
| Ok(TokenizerPadStrategy::BatchLongest) | ||
| } else { | ||
| s.parse::<usize>() |
There was a problem hiding this comment.
@besaleli This would be the other, more manual approach.
Codecov Report❌ Patch coverage is
🚀 New features to boost your workflow:
|
besaleli
left a comment
There was a problem hiding this comment.
This is looking good so far!!
|
The encoderfile-py crate is not test-covered because we have the tests in python anyway... Would you like to some more coverage maybe in the de/ser parts for the new data types???? Again I think this will come up in the pytests, but if you insist I can add some checks to parse things back and forth. I'll also add py visibility for the target spec. |
A new function using kwargs has been added, called
from_config. The old function has been renamedfrom_configpathinstead. This approach has been favored over creating yet another data structure in python. The validation stays in Rust, as checked in the pytests.Closes #314