Update the base64 encoder padding to match the spec #86

Open
chuff wants to merge 1 commit into IABTechLab:master from chuff:base64-to-spec
Conversation

Contributor

chuff commented Apr 25, 2025

No description provided.

Comment on lines -16 to -18
while (bitString.length % 8 > 0) {
bitString += "0";
}

This is needed for spec compatibility: a base64 character carries 6 bits, but the decoded output is read as 8-bit bytes, so the bit string must be padded to a byte boundary before encoding.
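To make the bit counts concrete, here is a minimal sketch. `padTo` is a hypothetical helper written for this illustration (it is not a function from this library); it generalizes the removed loop to an arbitrary boundary so the two padding rules can be compared side by side.

```javascript
// Hypothetical helper (not part of the library): append "0" bits until the
// length of the bit string is a multiple of `boundary`.
function padTo(bitString, boundary) {
  while (bitString.length % boundary > 0) {
    bitString += "0";
  }
  return bitString;
}

const bits = "101100"; // 6 bits of arbitrary example data

// The loop removed by this PR pads to a whole number of 8-bit bytes.
const byteAligned = padTo(bits, 8);
// Padding only to the base64 character size stops at a multiple of 6.
const charAligned = padTo(bits, 6);

console.log(byteAligned.length, byteAligned.length % 8); // 8 0
console.log(charAligned.length, charAligned.length % 8); // 6 6
```

With byte padding the example fills one whole byte; with 6-bit padding it fills one base64 character but no complete byte.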

Collaborator

What you are saying amounts to not making this change (the PR) at all. This is the only code change; all the other changes update test cases.

pgoforth commented Mar 27, 2026

Please tell me why this change is necessary.

If you start encoding using only % 6 for padding, you will create a misaligned byte string upon decoding when you use this for TCF strings (which GPP is also responsible for).

This is fine if you are reading/writing using ONLY this library, but you are creating a GPP string that is going to be read and decoded by multiple third party decoders that expect a valid byte[]. Using only % 6 creates a misaligned byte array that would be considered invalid by many other decoders.
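The alignment concern can be illustrated with a small sketch. `byteAlignment` is a hypothetical function written for this example, and the input strings are arbitrary; it only does the arithmetic of grouping 6-bit characters into 8-bit bytes and does not test how any particular third-party decoder behaves.

```javascript
// Illustrative only: count how many bits an unpadded base64 string carries
// and how many remain after grouping them into whole 8-bit bytes.
function byteAlignment(encoded) {
  const totalBits = encoded.length * 6; // each base64 character carries 6 bits
  return {
    totalBits,
    wholeBytes: Math.floor(totalBits / 8),
    leftoverBits: totalBits % 8,
  };
}

// One character: 6 bits, not enough for a single byte.
console.log(byteAlignment("C"));
// Four characters: 24 bits, exactly three bytes.
console.log(byteAlignment("CPXx"));
// Five characters: 30 bits, three bytes plus a full character of leftover bits.
console.log(byteAlignment("CPXxA"));
```

A character count of 1 modulo 4 leaves 6 leftover bits, a length that a byte-oriented base64 encoder never produces; padding the bit string to 8 before encoding keeps the leftover at 0, 2, or 4 bits.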

GPP requires alignment on 6 bits:
https://github.com/InteractiveAdvertisingBureau/Global-Privacy-Platform/blob/main/Core/Consent%20String%20Specification.md#creating-a-gpp-string

TCF requires alignment on 8 bits. This requirement has never been altered since v1.1 and there has been no reference to a change in any TCF updates:
https://github.com/InteractiveAdvertisingBureau/GDPR-Transparency-and-Consent-Framework/blob/10b489eae0d5328e07241001520d984e15afb59a/Consent%20string%20and%20vendor%20list%20formats%20v1.1%20Final.md?plain=1#L316

Therefore, padding to the LCM of 8 and 6 is necessary (LCM=24 because 6*4=24 and 8*3=24).

Since this is a common encoder/decoder, it is used for both TCF and other GPP represented strings. Simply forwarding the TCF portion of the string to a TCF decoder would produce an error if the encoder only padded using 6 bits.
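The LCM argument above can be sketched as follows. `gcd`, `lcm`, and `padToCommonBoundary` are hypothetical names introduced for illustration, not functions from this library, and the sketch assumes the bit string is a plain string of "0" and "1" characters as in the snippet under review.

```javascript
// Greatest common divisor via Euclid's algorithm.
function gcd(a, b) {
  while (b !== 0) {
    [a, b] = [b, a % b];
  }
  return a;
}

// Least common multiple of two positive integers.
function lcm(a, b) {
  return (a * b) / gcd(a, b);
}

// Hypothetical helper: pad with "0" bits up to the next multiple of
// lcm(6, 8) = 24, so the result is aligned for both 6-bit characters
// and 8-bit bytes.
function padToCommonBoundary(bitString) {
  const boundary = lcm(6, 8);
  const target = Math.ceil(bitString.length / boundary) * boundary;
  return bitString.padEnd(target, "0");
}

const padded = padToCommonBoundary("1011001110001"); // 13 bits in
console.log(lcm(6, 8));                              // 24
console.log(padded.length);                          // 24
console.log(padded.length % 6, padded.length % 8);   // 0 0
```

The trade-off is size: padding to 24 bits can add up to 23 zero bits, which is up to three extra base64 characters compared with padding to 6 alone.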

3 participants