Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for license expressions #707

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

cdce8p
Copy link

@cdce8p cdce8p commented Nov 24, 2024

https://peps.python.org/pep-0639/#add-string-value-to-license-key
https://peps.python.org/pep-0639/#add-license-expression-field

Partially vendor packaging 24.2 to include the license validation added in pypa/packaging#828.

Until Metadata version 2.4 is supported, the license expression will be backfilled to the License field. Hatch does the same atm.

Work on #692

@mgorny
Copy link
Contributor

mgorny commented Jan 4, 2025

Do we have actually need to have the validation/canonicalization done unconditionally? Creating a cyclic dependency between flit_core and packaging would really suck for Gentoo, and unless the non-canonicalized metadata is completely invalid and breaks everything, we'd rather see fewer moving parts for an end user who just need to install the package locally, rather than publish it on PyPI.

@cdce8p cdce8p force-pushed the license-expression branch from cbcbb87 to 95a8099 Compare January 4, 2025 11:44
@cdce8p
Copy link
Author

cdce8p commented Jan 4, 2025

Do we have actually need to have the validation/canonicalization done unconditionally?

From the PEP:

Build and publishing tools SHOULD check that the License-Expression field contains a valid SPDX expression, including the validity of the particular license identifiers (as defined above). Tools MAY halt execution and raise an error when an invalid expression is found. If tools choose to validate the SPDX expression, they also SHOULD store a case-normalized version of the License-Expression field using the reference case for each SPDX license identifier and uppercase for the AND, OR and WITH keywords.

Hatchling uses packaging for the validation as well and setuptools will likely do the same. So I don't think it would make sense to skip it here.

@mgorny
Copy link
Contributor

mgorny commented Jan 4, 2025

The difference is that hatchling and setuptools already depend on packaging, while flit_core does not, while packaging uses flit_core as a build system to provide a clean bootstrap path. So adding packaging use to flit_core introduces a cyclic dependency here and makes unvendored bootstrap impossible.

@mgorny
Copy link
Contributor

mgorny commented Jan 4, 2025

Also, I dare say that "build and publishing tools" are frontends, not backends. Though admittedly, it's not the first time where people are adding frontend work to backends, or pushing maintainer's work onto end users.

@cdce8p
Copy link
Author

cdce8p commented Jan 4, 2025

So adding packaging use to flit_core introduces a cyclic dependency here and makes unvendored bootstrap impossible.

tomli also uses flit_core and is vendored here. How is that handled in Gentoo?

@mgorny
Copy link
Contributor

mgorny commented Jan 4, 2025

tomli is special-cased, and it is needed only for Python 3.10.

@cdce8p
Copy link
Author

cdce8p commented Jan 4, 2025

tomli is special-cased, and it is needed only for Python 3.10.

Yeah, but in that case you could probably special case packaging as well. Tbh it's not my decision to make, I've made my point that I believe validation would be useful. Someone else should decide how to proceed here.

@takluyver
Copy link
Member

Thanks for looking into this!

Flit does have a role in bootstrapping the packaging ecosystem, and as such it's useful to avoid dependencies, even vendored ones (since Linux distros in particular prefer to unvendor such things). We made an exception for tomli, since using TOML was standardised before it was in the stdlib, but I'd rather not make a pattern of this.

At the same time, I do think it makes sense to validate this before producing a package. It's frustrating in some ways that we use the same pathways to make packages for publication as for immediate installation, but that's where we are, and there are upsides to this as well.

I'd like to consider doing an independent implementation of SPDX validation & case normalisation. Besides preserving the zero-dependency status, this also means the ecosystem has two versions used in earnest to compare, rather than everyone doing the same thing. In a similar vein, Flit implements its own parsing of the PEP 621 [project] table, construction of the metadata file and wheel file structure, even though these are libraries that could provide these. I also like that when there's a bug, it's just a bug to fix, not an incompatibility with dependency X version Y.

It looks like this is tractable for license expressionss - at a glance, packaging has done it in about 150 lines, not counting the data of license names. @mgorny I take it you'd be happy with that? @cdce8p are you interested in working on that, or shall I have a go?

@cdce8p cdce8p force-pushed the license-expression branch from 1da4db3 to aa16d4d Compare January 5, 2025 16:41
@cdce8p
Copy link
Author

cdce8p commented Jan 5, 2025

I'd like to consider doing an independent implementation of SPDX validation & case normalisation. Besides preserving the zero-dependency status, this also means the ecosystem has two versions used in earnest to compare, rather than everyone doing the same thing. [...]

I understand your argument. Just skeptical a custom implementation wouldn't look quite similar to the existing one in packaging. The main advantage IMO of using the vendored version would be that they likely thought of any special case and if there is a bug, it's fixed upstream for everyone.

It looks like this is tractable for license expressionss - at a glance, packaging has done it in about 150 lines, not counting the data of license names. @mgorny I take it you'd be happy with that? @cdce8p are you interested in working on that, or shall I have a go?

I'm certainly interested in moving forward on the PEP 639 support. (Not sure I would be the best to write the custom validation though.) Maybe an option would be to defer the validation for a moment and circle back to it when all other things are in place, i.e. before moving to metadata version 2.4?

I removed the vendored packaging dependency here and replaced it with TODO comments. So this PR stands on its own now. Similarly #705 for license-files. Would it be an option for you if you look at these two first?

If not, what path would you prefer?

@mgorny
Copy link
Contributor

mgorny commented Jan 5, 2025

@mgorny I take it you'd be happy with that?

Yeah, that would work for me. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants