Add ML-DSA test vectors. #112

Draft · wants to merge 2 commits into master

Conversation

@gendx commented Apr 5, 2024

This PR adds test vectors for ML-DSA, for two versions: the round 3 proposal (CRYSTALS-DILITHIUM) and the current draft of the FIPS 204 standard.

These tests aim to cover the following cases:

  • Baseline (signing a "Hello world" message).
  • Keys and signatures of the wrong length.
  • Signature with bit flips.
  • Signature hints with a non-canonical encoding:
    • hints that aren't sorted,
    • hints that aren't strictly sorted (i.e. the same hint is repeated),
    • too many hints (potentially causing a buffer overflow),
    • non-zero padding after the hints.
  • Secret keys with the s1 or s2 vectors out of the [-eta, eta] range.
  • Public key with the t1 component set to zero (allowing trivial forgeries, but the verification algorithm should still accept signatures for this key).
  • Boundary conditions in the signature rejection loop (aiming to detect incorrect comparisons; see the sketch after this list):
    • z_max equals gamma1 - beta - 1 and gamma1 - beta,
    • r0_max equals gamma2 - beta - 1 and gamma2 - beta,
    • h_ones equals omega and omega + 1,
    • in the case of ML-DSA-44 (a.k.a. Dilithium 2), |ct0|_max equals gamma2 - 1 and gamma2.
  • A "large" number of SHAKE bytes and of SHAKE blocks generated in the expand_a, expand_s, rej_ntt_poly and rej_bounded_poly functions.
  • Boundary conditions in arithmetic functions:
    • power_2_round function: when the remainder (found in t0) is equal to 4096 or -4095,
    • decompose (via high_bits or low_bits): when the condition r_plus - r_0 = q - 1 happens.
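
For illustration, here is a minimal sketch (not the generation script; parameter names follow FIPS 204) of the rounding boundary and the comparisons these vectors target:

```python
Q, D = 8380417, 13  # ML-DSA modulus and number of dropped bits

def power_2_round(r: int) -> tuple[int, int]:
    # Split r mod q into high/low parts, with the remainder r0 centered in
    # (-2**(D-1), 2**(D-1)] -- hence the 4096 / -4095 boundary values above.
    r_plus = r % Q
    r0 = r_plus % (1 << D)
    if r0 > 1 << (D - 1):  # 4096 stays positive; 4097 wraps to -4095
        r0 -= 1 << D
    return (r_plus - r0) >> D, r0

def rejection_ok(z_max, r0_max, h_ones, gamma1, gamma2, beta, omega) -> bool:
    # The comparisons the boundary vectors probe: equality must be rejected
    # for z_max and r0_max, but accepted for the hint count h_ones.
    return z_max < gamma1 - beta and r0_max < gamma2 - beta and h_ones <= omega
```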

Since the FIPS 204 standard is still a draft, I'm leaving this pull request as a draft for now.

@Aurum-Vale

Hi,

Thanks for sharing your test vectors. I have been able to run and pass all signing tests with my own private implementation of the ML-DSA-65 draft (I have yet to test signature verification and other parameter sets), and it has been a great help so far.

I have a few remarks for some of the test vectors.

FIPS 204 only states that "skDecode should only be run on input that comes from trusted sources". It does not require ML-DSA to check that the secret key is formatted correctly.
(However, it does require a length check on the public key when verifying a signature. I am not familiar with Wycheproof, but you may want, if possible, to mark the public key length check as optional for implementations that do not support variable-length inputs.)

Therefore, testing the secret key length should not be the concern of the ML-DSA implementation itself, but of its user instead (such as a higher-level provider), if they believe the secret key does not come from a trusted source.

The tests about secret keys with invalid s1 or s2 raise a similar question. The specification does not state whether a check is required, or whether it is fine to generate a signature from such a key, but the remark on skDecode implies that neither happens. Here as well, the check should come from a higher-level user/provider if it is really wanted.

Finally, the tests on the ball sampling never require more than one SHAKE256 block, but I doubt it is feasible in practice to find such a case. The tests "1 block" and "76 bytes" seem redundant (a SHAKE256 block is 136 bytes); is there a particular reason to include both?
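
For reference, a minimal sketch of the ball sampling, assuming Python's hashlib (which only supports one-shot squeezing; real implementations read one 136-byte SHAKE256 block at a time, which is where a second block would be requested):

```python
import hashlib

def sample_in_ball(seed: bytes, tau: int) -> list[int]:
    # Sample a polynomial with tau coefficients in {-1, +1} and the rest 0.
    stream = hashlib.shake_256(seed).digest(8 + 8 * tau + 64)  # generous over-read
    signs = int.from_bytes(stream[:8], "little")  # first 8 bytes: sign bits
    c, pos = [0] * 256, 8
    for i in range(256 - tau, 256):
        while stream[pos] > i:  # rejection sampling of a swap index j <= i
            pos += 1
        j = stream[pos]
        pos += 1
        c[i] = c[j]
        c[j] = 1 - 2 * (signs & 1)  # +1 or -1 from the next sign bit
        signs >>= 1
    return c
```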

@gendx (Author) commented May 17, 2024

Thanks a lot for the feedback, that's really valuable!

There are several things here.

  1. Byte arrays (representing keys or signatures) of an invalid length.

    These aren't strictly speaking part of the FIPS 204 specification draft, where algorithms 16-21 take as inputs/outputs byte strings of a specific length. I still think it's useful to include them to test implementations that take arbitrary-length byte strings rather than fixed-length arrays, either because the type system doesn't support constraining the length (depending on the implementation language), or because the serialized objects are transmitted in variable-length formats (e.g. some tag-length-value binary format, or a text format like JSON).

    If one wants to use these Wycheproof tests on an ML-DSA implementation that uses fixed-length arrays, a conversion layer is needed anyway between the JSON strings (which have unrestricted length) and the fixed-length arrays. So if we consider this conversion layer to be part of the tested implementation, it also makes sense to have these tests.

    Note that these tests are labeled (e.g. "IncorrectPrivateKeyLength"), so they can be ignored accordingly.
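
    As a minimal sketch of such a consumer (field names follow the usual Wycheproof layout and may differ in the final files):

    ```python
    import json

    SKIP_FLAGS = {"IncorrectPrivateKeyLength", "IncorrectPublicKeyLength"}

    def run_one(group: dict, test: dict) -> None:
        print("would run tcId", test["tcId"])  # stub standing in for a real harness

    with open("mldsa_test.json") as f:  # hypothetical file name
        suite = json.load(f)
    for group in suite["testGroups"]:
        for test in group["tests"]:
            if SKIP_FLAGS & set(test.get("flags", [])):
                continue  # fixed-length implementation: skip these labeled cases
            run_one(group, test)
    ```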

  2. Tests where the private key isn't valid (s1 or s2 out of range).

    Here I would say that the specification is perhaps not clear enough. On the one hand, the mathematical definition (algorithm 19) accepts any byte string as input ("B^n" where n is the private key size), so out-of-range values are acceptable inputs. On the other hand, it says "Note that there exist malformed inputs [...] skDecode should only be run on input that comes from trusted sources", which kind of implies that the correct input domain isn't "B^n", but only the subset of well-formed private keys.

    Unfortunately, skimming through the official comments I didn't find this issue being raised (but maybe I read too fast), so it's perhaps worth raising it on the pqc-forum (there was a thread about signature hint unpacking, which is a different question and definitely needs to be checked).

    I definitely agree that the check isn't needed if a user can indeed confirm that the key came from a trusted source or was already validated. That said, it isn't an expensive check performance-wise.

    Here again, the "InvalidPrivateKey" label makes it possible to filter these tests; the check itself is sketched below.
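
    For reference, the range check itself, sketched on decoded coefficient lists:

    ```python
    def secret_key_in_range(s1: list, s2: list, eta: int) -> bool:
        # Every coefficient of the decoded s1/s2 vectors must lie in [-eta, eta];
        # a linear scan, so negligible next to the cost of signing itself.
        return all(-eta <= c <= eta for poly in s1 + s2 for c in poly)
    ```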

  3. Regarding the SHAKE blocks vs. bytes, they can sometimes differ, so my test generation script just outputs all of them even when they are the same. My take is that it's more cautious to have a few redundant extra tests than too few tests that miss pitfalls.

    As you pointed out, it's hard in any case to test that the implementation is resilient to having to generate more blocks, as finding inputs that need many blocks requires a huge amount of brute-forcing.

@Aurum-Vale

Thanks for your reply.
I may look into raising the issue in 2) on the pqc-forum. The current draft states that the signing process never fails, which could imply that any byte string is a valid secret key, despite the confusing claim about malformed private keys. Still, the draft sends mixed signals on this issue.

Since my initial post here, I have successfully tested your vectors for all parameter sets, and for both signature and verification.

@gendx (Author) commented May 21, 2024

I've created a thread on the pqc-forum: https://groups.google.com/a/list.nist.gov/g/pqc-forum/c/9nVfHKtid-k

@eay commented Jun 28, 2024

> Regarding the SHAKE blocks vs. bytes, they can sometimes differ, so my test generation script just outputs all of them even when they are the same. My take is that it's more cautious to have a few redundant extra tests than too few tests that miss pitfalls.
>
> As you pointed out, it's hard in any case to test that the implementation is resilient to having to generate more blocks, as finding inputs that need many blocks requires a huge amount of brute-forcing.

Regarding issue 3, have a look at usnistgov/ACVP-Server#321.
This was for the seed value '9d20abd2a2d3b9313ee70aabb7c8bd80d262cedd6c8b5bfba0cbcae1cbd31d70'.
It was triggering an error when getting the next block of output, and I suspect that is the reason for wanting to test this case 😊
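
For illustration, this style of rejection sampling sketched in Python (hashlib only supports one-shot squeezing, so the per-block bookkeeping that the ACVP issue exercised is reduced to a bounds check here):

```python
import hashlib

Q, RATE128 = 8380417, 168  # modulus and SHAKE128 block size in bytes

def rej_ntt_poly(seed: bytes) -> list[int]:
    # Sample 256 coefficients mod q by rejection on 3-byte chunks; real code
    # squeezes one 168-byte block at a time and must correctly request the
    # next block when rejections exhaust the current one.
    stream = hashlib.shake_128(seed).digest(6 * RATE128)  # generous over-read
    coeffs, pos = [], 0
    while len(coeffs) < 256:
        if pos + 3 > len(stream):
            raise RuntimeError("need another SHAKE block")  # the edge case here
        t = int.from_bytes(stream[pos:pos + 3], "little") & 0x7FFFFF
        pos += 3
        if t < Q:  # CoeffFromThreeBytes accepts only values below q
            coeffs.append(t)
    return coeffs
```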

@gendx (Author) commented Aug 14, 2024

I've added the "standard" series of tests based on https://nvlpubs.nist.gov/nistpubs/fips/nist.fips.204.pdf. I've only incorporated changes advertised in Appendix D, and NIST hasn't yet uploaded official tests (discussion on the PQC forum), so take these with a grain of salt.

I've also added support for the new "context" parameter in signing/verification, notably covering the edge case of an overlong context (>= 256 bytes). My test vectors are based on the "external" APIs (no test vectors for the _internal APIs for now), use the non-hedged deterministic variant, and assume an empty context when not specified. There are no test cases for the pre-hashed variants for now.
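
For reference, the external-API message framing can be sketched as follows (sign_internal stands in for ML-DSA.Sign_internal and is assumed to exist elsewhere):

```python
def sign_external(sk: bytes, msg: bytes, ctx: bytes = b"", sign_internal=None) -> bytes:
    # ML-DSA.Sign down to the message framing: a 0x00 domain separator for
    # "pure" ML-DSA, the one-byte context length, the context, then the message.
    if len(ctx) > 255:
        raise ValueError("context too long")  # the >= 256-byte edge case above
    m_prime = bytes([0, len(ctx)]) + ctx + msg
    rnd = bytes(32)  # deterministic variant: rnd fixed to 32 zero bytes
    return sign_internal(sk, m_prime, rnd)
```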

@eay commented Aug 15, 2024

Could I request that you add some version information in the JSON? I personally use 3.1 for the last Dilithium, 4.0 for the ML-DSA draft, 4.1 for the current ACVP-Server test vectors (no context), and now 4.2 for the final standard (your vectors, which I pass). I have scripts that load the JSON and use FFI to call my C code, and filtering on filenames feels a bit ugly.
Perhaps a version based on the document release date.
Thanks

@franziskuskiefer

ML-KEM (#110) and ML-DSA have pretty large keys and signatures. People are starting to optimise the test vectors by converting them into other formats or hashing the output values. I wonder if it would be beneficial to also have a size-optimised version here in the repo. When everyone minimises the test files themselves, it gets hard to audit, is error-prone, and creates extra work for everyone.
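
(One possible shape for such a size-optimised format, sketched with hypothetical field names: replace each bulky expected output by its digest, so consumers compare hashes instead of full values.)

```python
import hashlib

def shrink_test(test: dict) -> dict:
    # Replace a hex-encoded "sig" field by its SHA-256 digest; a consumer then
    # hashes its own output and compares digests instead of full signatures.
    small = dict(test)
    if "sig" in small:
        sig = bytes.fromhex(small.pop("sig"))
        small["sigSha256"] = hashlib.sha256(sig).hexdigest()
    return small
```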

@obronchain

Hi,
Is there a plan to merge this pull request and to support the internal API?

@Aurum-Vale

Hi,

Unless I missed it somewhere, I could not find it stated anywhere that these vectors target the deterministic signing variant of ML-DSA.
This can be a source of confusion, as the default variant in FIPS 204 is now the hedged variant.

A potential disadvantage of supporting the hedged variant is that you would need to refactor the vectors so they test the internal API (otherwise, consumers of the vectors need to mock the DRBG).

@gendx (Author) commented Oct 8, 2024

> Unless I missed it somewhere, I could not find it stated anywhere that these vectors target the deterministic signing variant of ML-DSA.

It's indeed true that these tests cover the external API (ML-DSA.Sign(), etc.) and use the deterministic variant (otherwise the tests wouldn't be reproducible, short of adding an additional "rnd" parameter that deviates from the standard API). Adding tests for the _internal functions would be a nice, and perhaps more pragmatic, addition.

},
{
"tcId": 20,
"comment": "signature with omega+1 hints (causing a buffer overflow)",
@wangweij

Not sure I get the point. The last byte of the signature is 0x37 (= 55), which is exactly omega for ML-DSA-65. Why do you say there are omega + 1 hints?

@rben-dev commented Dec 29, 2024

Hi,

Agreed with the analysis of @wangweij: there is no overflow during bit unpacking here, and the HintBitUnpack procedure (algorithm 21) should succeed. However, the result of the verification is still invalid (for another reason, to be determined), hence this vector still "passes" for the implementations using it. The unpacking logic is sketched below.

Regards,
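
For reference, a minimal sketch of the final standard's hint unpacking; the per-polynomial counters are capped at omega, so a valid encoding can never carry more than omega hints, consistent with the analysis above:

```python
def hint_bit_unpack(y: bytes, omega: int, k: int):
    # Decode the omega + k hint bytes into k hint polynomials; return None
    # for any non-canonical encoding (the cases the vectors above exercise).
    h = [[0] * 256 for _ in range(k)]
    index = 0
    for i in range(k):
        if y[omega + i] < index or y[omega + i] > omega:
            return None  # counters must be non-decreasing and at most omega
        first = index
        while index < y[omega + i]:
            if index > first and y[index - 1] >= y[index]:
                return None  # hint positions must be strictly increasing
            h[i][y[index]] = 1
            index += 1
    if any(b != 0 for b in y[index:omega]):
        return None  # padding after the last hint must be zero
    return h
```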

@matbok commented Nov 27, 2024

It'd be nice if the ML-DSA JSON files could contain the security level (44, 65, 87) in some JSON field.
I also agree with @Aurum-Vale on the need for a specific field stating the "deterministic" version of the signature.
And regarding the default randomized version, why not add an "rnd" JSON field with the random value used?
Thanks

@rben-dev commented Dec 29, 2024

Hi,

First of all, thanks for this valuable source of edge-case test vectors!

I agree with the previous comments regarding the inclusion of the security level (44, 65, 87), the variant (deterministic here, hopefully more to come), and a version (Dilithium, FIPS ipd, FIPS final standard, etc.) in the JSON file. For non-deterministic variants, similar to what @matbok suggests, adding a dedicated field with the generated random value may be the way to go.

Another subject of interest would be the pre-hashed variant of the standard (HashML-DSA), for which various hash functions with different OIDs should be tested. For such variants, dedicated fields in the JSON file for the pre-hashed value PH_M and for the underlying hash function would be valuable; the message framing is sketched below.
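
For illustration, the HashML-DSA message framing could be exercised as sketched below (SHA-512 chosen arbitrarily as the pre-hash; its DER-encoded OID is 2.16.840.1.101.3.4.2.3):

```python
import hashlib

SHA512_OID = bytes.fromhex("0609608648016503040203")  # DER encoding of the OID

def hash_mldsa_frame(msg: bytes, ctx: bytes = b"") -> bytes:
    # 0x01 domain separator for the pre-hashed variant, one-byte context
    # length, the context, the hash OID, then the pre-hash PH_M.
    if len(ctx) > 255:
        raise ValueError("context too long")
    ph_m = hashlib.sha512(msg).digest()
    return bytes([1, len(ctx)]) + ctx + SHA512_OID + ph_m
```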

Anyway, converging to a stable set of vectors for the finalized standard and merging this PR would be great, now that more and more people are starting to implement ML-DSA :-)

Regards,
