Add ML-DSA test vectors. #112

Draft · wants to merge 2 commits into master

Conversation

@gendx commented Apr 5, 2024

This PR adds test vectors for ML-DSA, for two versions: the round 3 proposal (CRYSTALS-DILITHIUM) and the current draft of the FIPS 204 standard.

These tests aim to cover the following cases:

  • Baseline (signing a "Hello world" message).
  • Keys and signatures of the wrong length.
  • Signature with bit flips.
  • Signature hints with a non-canonical encoding:
    • hints that aren't sorted,
    • hints that aren't strictly sorted (i.e. the same hint is repeated),
    • too many hints (potentially causing a buffer overflow),
    • non-zero padding after the hints.
  • Secret keys with the s1 or s2 vectors out of the [-eta, eta] range.
  • Public key with the t1 component set to zero (allowing trivial forgeries, but the verification algorithm should still accept signatures for this key).
  • Boundary conditions in the signature rejection loop (aiming to detect incorrect comparisons; see the sketch after this list):
    • z_max equals gamma1 - beta - 1 and gamma1 - beta,
    • r0_max equals gamma2 - beta - 1 and gamma2 - beta,
    • h_ones equals omega and omega + 1,
    • in the case of ML-DSA-44 (a.k.a. Dilithium 2), |ct0|_max equals gamma2 - 1 and gamma2.
  • A "large" number of SHAKE bytes and of SHAKE blocks generated in the expand_a, expand_s, rej_ntt_poly and rej_bounded_poly functions.
  • Boundary conditions in arithmetic functions:
    • power_2_round function: when the remainder (found in t0) is equal to 4096 or -4095,
    • decompose (via high_bits or low_bits): when the condition r_plus - r_0 = q - 1 happens.
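
For illustration, here is a minimal sketch (not the generation script; parameter names follow FIPS 204) of the rounding boundary and the comparisons these vectors target:

```python
Q, D = 8380417, 13  # ML-DSA modulus and number of dropped bits

def power_2_round(r: int) -> tuple[int, int]:
    # Split r mod q into high/low parts, with the remainder r0 centered in
    # (-2**(D-1), 2**(D-1)] -- hence the 4096 / -4095 boundary values above.
    r_plus = r % Q
    r0 = r_plus % (1 << D)
    if r0 > 1 << (D - 1):  # 4096 stays positive; 4097 wraps to -4095
        r0 -= 1 << D
    return (r_plus - r0) >> D, r0

def rejection_ok(z_max, r0_max, h_ones, gamma1, gamma2, beta, omega) -> bool:
    # The comparisons the boundary vectors probe: equality must be rejected
    # for z_max and r0_max, but accepted for the hint count h_ones.
    return z_max < gamma1 - beta and r0_max < gamma2 - beta and h_ones <= omega
```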

Since the FIPS 204 standard is still a draft, I'm leaving this pull request as a draft for now.

@Aurum-Vale

Hi,

Thanks for sharing your test vectors. I have been able to run and pass all signing tests with my own private implementation of the ML-DSA-65 draft (I have yet to test signature verification and other parameter sets), and it has been a great help so far.

I have a few remarks for some of the test vectors.

FIPS 204 only states that "skDecode should only be run on input that comes from trusted sources". It does not require ML-DSA to check that the secret key is formatted correctly.
(However, it does require a length check on the public key when verifying a signature. I am not familiar with Wycheproof, but you may want, if possible, to mark the public key length check as optional for implementations that do not support variable-length inputs.)

Therefore, testing the secret key length should not be the concern of the ML-DSA implementation itself, but of its user instead (such as a higher-level provider), if they believe the secret key does not come from a trusted source.

The tests about secret keys with invalid s1 or s2 raise a similar question. The specification does not state whether a check is required, or whether it is fine to generate a signature from such a key, but the remark on skDecode implies that neither happens. Here as well, the check should come from a higher-level user/provider if it is really wanted.

Finally, the tests on the ball sampling never require more than one SHAKE256 block, but I doubt it is feasible in practice to find such a case. The tests "1 block" and "76 bytes" seem redundant (a SHAKE256 block is 136 bytes); is there a particular reason to include both?
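
For reference, a minimal sketch of the ball sampling, assuming Python's hashlib (which only supports one-shot squeezing; real implementations read one 136-byte SHAKE256 block at a time, which is where a second block would be requested):

```python
import hashlib

def sample_in_ball(seed: bytes, tau: int) -> list[int]:
    # Sample a polynomial with tau coefficients in {-1, +1} and the rest 0.
    stream = hashlib.shake_256(seed).digest(8 + 8 * tau + 64)  # generous over-read
    signs = int.from_bytes(stream[:8], "little")  # first 8 bytes: sign bits
    c, pos = [0] * 256, 8
    for i in range(256 - tau, 256):
        while stream[pos] > i:  # rejection sampling of a swap index j <= i
            pos += 1
        j = stream[pos]
        pos += 1
        c[i] = c[j]
        c[j] = 1 - 2 * (signs & 1)  # +1 or -1 from the next sign bit
        signs >>= 1
    return c
```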

@gendx (Author) commented May 17, 2024

Thanks a lot for the feedback, that's really valuable!

There are several things here.

  1. Byte arrays (representing keys or signatures) of an invalid length.

    These aren't strictly speaking part of the FIPS 204 specification draft, where algorithms 16-21 take as inputs/outputs byte strings of a specific length. I still think it's useful to include them to test implementations that take arbitrary-length byte strings rather than fixed-length arrays, either because the type system doesn't support constraining the length (depending on the implementation language), or because the serialized objects are transmitted in variable-length formats (e.g. some tag-length-value binary format, or a text format like JSON).

    If one wants to use these Wycheproof tests on an ML-DSA implementation that uses fixed-length arrays, a conversion layer is needed anyway between the JSON strings (which have unrestricted length) and the fixed-length arrays. So if we consider this conversion layer to be part of the tested implementation, it also makes sense to have these tests.

    Note that these tests are labeled (e.g. "IncorrectPrivateKeyLength"), so they can be ignored accordingly.
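
    As a minimal sketch of such a consumer (field names follow the usual Wycheproof layout and may differ in the final files):

    ```python
    import json

    SKIP_FLAGS = {"IncorrectPrivateKeyLength", "IncorrectPublicKeyLength"}

    def run_one(group: dict, test: dict) -> None:
        print("would run tcId", test["tcId"])  # stub standing in for a real harness

    with open("mldsa_test.json") as f:  # hypothetical file name
        suite = json.load(f)
    for group in suite["testGroups"]:
        for test in group["tests"]:
            if SKIP_FLAGS & set(test.get("flags", [])):
                continue  # fixed-length implementation: skip these labeled cases
            run_one(group, test)
    ```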

  2. Tests where the private key isn't valid (s1 or s2 out of range).

    Here I would say that the specification is perhaps not clear enough. On the one hand, the mathematical definition (algorithm 19) accepts any byte string as input ("B^n" where n is the private key size), so out-of-range values are acceptable inputs. On the other hand, it says "Note that there exist malformed inputs [...] skDecode should only be run on input that comes from trusted sources", which kind of implies that the correct input domain isn't "B^n", but only the subset of well-formed private keys.

    Unfortunately, skimming through the official comments I didn't find this issue being raised (but maybe I read too fast), so it's perhaps worth raising it on the pqc-forum (there was a thread about signature hint unpacking, which is a different question and definitely needs to be checked).

    I definitely agree that the check isn't needed if a user can indeed confirm that the key came from a trusted source or was already validated. That said, it isn't an expensive check performance-wise.

    Here again, the "InvalidPrivateKey" label makes it possible to filter these tests; the check itself is sketched below.
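
    For reference, the range check itself, sketched on decoded coefficient lists:

    ```python
    def secret_key_in_range(s1: list, s2: list, eta: int) -> bool:
        # Every coefficient of the decoded s1/s2 vectors must lie in [-eta, eta];
        # a linear scan, so negligible next to the cost of signing itself.
        return all(-eta <= c <= eta for poly in s1 + s2 for c in poly)
    ```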

  3. Regarding the SHAKE blocks vs. bytes, they can sometimes differ, so my test generation script just outputs all of them even when they are the same. My take is that it's more cautious to have a few redundant extra tests than too few tests that miss pitfalls.

    As you pointed out, it's hard in any case to test that the implementation is resilient to having to generate more blocks, as finding inputs that need many blocks requires a huge amount of brute-forcing.

@Aurum-Vale

Thanks for your reply.
I may look into raising the issue in 2) on the pqc-forum. The current draft states that the signing process never fails, which could imply that any byte string is a valid secret key, despite the confusing claim about malformed private keys. Still, the draft sends mixed signals on this issue.

Since my initial post here, I have successfully tested your vectors for all parameter sets, and for both signature and verification.

@gendx (Author) commented May 21, 2024

I've created a thread on the pqc-forum: https://groups.google.com/a/list.nist.gov/g/pqc-forum/c/9nVfHKtid-k

@eay commented Jun 28, 2024

> Regarding the SHAKE blocks vs. bytes, they can sometimes differ, so my test generation script just outputs all of them even when they are the same. My take is that it's more cautious to have a few redundant extra tests than too few tests that miss pitfalls.
>
> As you pointed out, it's hard in any case to test that the implementation is resilient to having to generate more blocks, as finding inputs that need many blocks requires a huge amount of brute-forcing.

Regarding issue 3, have a look at usnistgov/ACVP-Server#321.
This was for the seed value '9d20abd2a2d3b9313ee70aabb7c8bd80d262cedd6c8b5bfba0cbcae1cbd31d70'.
It was triggering an error when getting the next block of output, and I suspect that is the reason for wanting to test this case 😊
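
For illustration, this style of rejection sampling sketched in Python (hashlib only supports one-shot squeezing, so the per-block bookkeeping that the ACVP issue exercised is reduced to a bounds check here):

```python
import hashlib

Q, RATE128 = 8380417, 168  # modulus and SHAKE128 block size in bytes

def rej_ntt_poly(seed: bytes) -> list[int]:
    # Sample 256 coefficients mod q by rejection on 3-byte chunks; real code
    # squeezes one 168-byte block at a time and must correctly request the
    # next block when rejections exhaust the current one.
    stream = hashlib.shake_128(seed).digest(6 * RATE128)  # generous over-read
    coeffs, pos = [], 0
    while len(coeffs) < 256:
        if pos + 3 > len(stream):
            raise RuntimeError("need another SHAKE block")  # the edge case here
        t = int.from_bytes(stream[pos:pos + 3], "little") & 0x7FFFFF
        pos += 3
        if t < Q:  # CoeffFromThreeBytes accepts only values below q
            coeffs.append(t)
    return coeffs
```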

@gendx (Author) commented Aug 14, 2024

I've added the "standard" series of tests based on https://nvlpubs.nist.gov/nistpubs/fips/nist.fips.204.pdf. I've only incorporated changes advertised in Appendix D, and NIST hasn't yet uploaded official tests (discussion on the PQC forum), so take these with a grain of salt.

I've also added support for the new "context" parameter in signing/verification, notably covering the edge case of an overlong context (>= 256 bytes). My test vectors are based on the "external" APIs (no test vectors for the _internal APIs for now), use the non-hedged deterministic variant, and assume an empty context when not specified. There are no test cases for the pre-hashed variants for now.
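
For reference, the external-API message framing can be sketched as follows (sign_internal stands in for ML-DSA.Sign_internal and is assumed to exist elsewhere):

```python
def sign_external(sk: bytes, msg: bytes, ctx: bytes = b"", sign_internal=None) -> bytes:
    # ML-DSA.Sign down to the message framing: a 0x00 domain separator for
    # "pure" ML-DSA, the one-byte context length, the context, then the message.
    if len(ctx) > 255:
        raise ValueError("context too long")  # the >= 256-byte edge case above
    m_prime = bytes([0, len(ctx)]) + ctx + msg
    rnd = bytes(32)  # deterministic variant: rnd fixed to 32 zero bytes
    return sign_internal(sk, m_prime, rnd)
```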

@eay commented Aug 15, 2024

Could I request that you add some version information in the JSON? I personally use 3.1 for the last Dilithium, 4.0 for the ML-DSA draft, 4.1 for the current ACVP-Server test vectors (no context), and now 4.2 for the final standard (your vectors, which I pass). I have scripts that load the JSON and use FFI to call my C code, and filtering on filenames feels a bit ugly.
Perhaps a version based on the document release date.
Thanks

@franziskuskiefer

ML-KEM (#110) and ML-DSA have pretty large keys and signatures. People are starting to optimise the test vectors by converting them into other formats or hashing the output values. I wonder if it would be beneficial to also have a size-optimised version here in the repo. When everyone minimises the test files themselves, it gets hard to audit, is error-prone, and creates extra work for everyone.
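
(One possible shape for such a size-optimised format, sketched with hypothetical field names: replace each bulky expected output by its digest, so consumers compare hashes instead of full values.)

```python
import hashlib

def shrink_test(test: dict) -> dict:
    # Replace a hex-encoded "sig" field by its SHA-256 digest; a consumer then
    # hashes its own output and compares digests instead of full signatures.
    small = dict(test)
    if "sig" in small:
        sig = bytes.fromhex(small.pop("sig"))
        small["sigSha256"] = hashlib.sha256(sig).hexdigest()
    return small
```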

@obronchain

Hi,
Is there a plan to merge this pull request and to support the internal API?

@Aurum-Vale

Hi,

Unless I missed it somewhere, I could not find it stated anywhere that these vectors target the deterministic signing variant of ML-DSA.
This can be a source of confusion, as the default variant in FIPS 204 is now the hedged variant.

A potential disadvantage of supporting the hedged variant is that you would need to refactor the vectors so they test the internal API (otherwise, consumers of the vectors need to mock the DRBG).

@gendx (Author) commented Oct 8, 2024

> Unless I missed it somewhere, I could not find it stated anywhere that these vectors target the deterministic signing variant of ML-DSA.

It's indeed true that these tests cover the external API (ML-DSA.Sign(), etc.) and use the deterministic variant (otherwise the tests wouldn't be reproducible, short of adding an additional "rnd" parameter that deviates from the standard API). Adding tests for the _internal functions would be a nice, and perhaps more pragmatic, addition.

},
{
"tcId": 20,
"comment": "signature with omega+1 hints (causing a buffer overflow)",
@wangweij

Not sure I get the point. The last byte of the signature is 0x37 (= 55), which is exactly omega for ML-DSA-65. Why do you say there are omega + 1 hints?

@rben-dev commented Dec 29, 2024

Hi,

Agreed with the analysis of @wangweij: there is no overflow during bit unpacking here, and the HintBitUnpack procedure (algorithm 21) should succeed. However, the result of the verification is still invalid (for another reason, to be determined), hence this vector still "passes" for the implementations using it. The unpacking logic is sketched below.

Regards,
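
For reference, a minimal sketch of the final standard's hint unpacking; the per-polynomial counters are capped at omega, so a valid encoding can never carry more than omega hints, consistent with the analysis above:

```python
def hint_bit_unpack(y: bytes, omega: int, k: int):
    # Decode the omega + k hint bytes into k hint polynomials; return None
    # for any non-canonical encoding (the cases the vectors above exercise).
    h = [[0] * 256 for _ in range(k)]
    index = 0
    for i in range(k):
        if y[omega + i] < index or y[omega + i] > omega:
            return None  # counters must be non-decreasing and at most omega
        first = index
        while index < y[omega + i]:
            if index > first and y[index - 1] >= y[index]:
                return None  # hint positions must be strictly increasing
            h[i][y[index]] = 1
            index += 1
    if any(b != 0 for b in y[index:omega]):
        return None  # padding after the last hint must be zero
    return h
```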

@matbok commented Nov 27, 2024

It'd be nice if the ML-DSA JSON files could contain the security level (44, 65, 87) in some JSON field.
I also agree with @Aurum-Vale on the need for a specific field stating the "deterministic" version of the signature.
And regarding the default randomized version, why not add an "rnd" JSON field with the random value used?
Thanks

@rben-dev commented Dec 29, 2024

Hi,

First of all, thanks for this valuable source of edge-case test vectors!

I agree with the previous comments regarding the inclusion of the security level (44, 65, 87), the variant (deterministic here, hopefully more to come), and a version (Dilithium, FIPS ipd, FIPS final standard, etc.) in the JSON file. For non-deterministic variants, similar to what @matbok suggests, adding a dedicated field with the generated random value may be the way to go.

Another subject of interest would be the pre-hashed variant of the standard (HashML-DSA), for which various hash functions with different OIDs should be tested. For such variants, dedicated fields in the JSON file for the pre-hashed value PH_M and for the underlying hash function would be valuable; the message framing is sketched below.
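
For illustration, the HashML-DSA message framing could be exercised as sketched below (SHA-512 chosen arbitrarily as the pre-hash; its DER-encoded OID is 2.16.840.1.101.3.4.2.3):

```python
import hashlib

SHA512_OID = bytes.fromhex("0609608648016503040203")  # DER encoding of the OID

def hash_mldsa_frame(msg: bytes, ctx: bytes = b"") -> bytes:
    # 0x01 domain separator for the pre-hashed variant, one-byte context
    # length, the context, the hash OID, then the pre-hash PH_M.
    if len(ctx) > 255:
        raise ValueError("context too long")
    ph_m = hashlib.sha512(msg).digest()
    return bytes([1, len(ctx)]) + ctx + SHA512_OID + ph_m
```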

Anyway, converging to a stable set of vectors for the finalized standard and merging this PR would be great, now that more and more people are starting to implement ML-DSA :-)

Regards,
