API: future.default_arg for method-specific behavior changes #60593

Open
rhshadrach opened this issue Dec 20, 2024 · 11 comments
Labels
API Design; Deprecate (functionality to remove in pandas); Needs Discussion (requires discussion from core team before further action)

Comments

@rhshadrach
Member

rhshadrach commented Dec 20, 2024

Ref: #60551 (comment)

Whenever possible, a deprecation should be performed without introducing a new argument for that purpose. However, there are times when we change the behavior of an existing function or method such that the output changes for the same set of input arguments. For these cases, I think pandas should have a standard way of modifying the existing behavior.

I think there are two common cases when using pandas:

  • The code I write is short-lived and when I get something working on one version of pandas, that's the only version it needs to work on.
  • The code I write is maintained, and I will need to upgrade pandas and have it still perform correctly.

I think we can support both use cases by:

  • Any time a particular method's behavior changes, introduce a keyword-only argument future=[True | False | no_default] defaulting to no_default.
    • future=True gives the future behavior, i.e. the behavior once the deprecation is enforced.
    • future=False gives the current behavior, with no warning message.
    • future=no_default gives the current behavior and emits a warning message.
  • Introduce a global underride future.default_arg = [True | False | no_default] defaulting to no_default.

The value of future.default_arg is used only when a method, e.g. df.method(...), is called without specifying a value for future.
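The lookup order described above can be sketched in plain Python. This is only an illustration under assumptions, not pandas code: `future_options`, `resolve_future`, and the toy `method` are hypothetical names, and `no_default` here is a local stand-in for the pandas sentinel.

```python
import warnings


class _NoDefault:
    """Sentinel type standing in for pandas' no_default marker (assumption)."""

    def __repr__(self) -> str:
        return "no_default"


no_default = _NoDefault()


class FutureOptions:
    """Hypothetical holder for the proposed global underride."""

    default_arg = no_default


future_options = FutureOptions()


def resolve_future(future=no_default):
    """Return True or False, or no_default if neither level was set."""
    if future is not no_default:
        # An explicit per-call argument always wins.
        return future
    # The global underride is consulted only when the caller passed nothing.
    return future_options.default_arg


def method(values, *, future=no_default):
    """Toy method: current behavior keeps input order, future behavior sorts."""
    resolved = resolve_future(future)
    if resolved is no_default:
        warnings.warn(
            "The behavior of method() will change in a future version; "
            "pass future=True to opt in or future=False to keep the "
            "current behavior without this warning.",
            FutureWarning,
            stacklevel=2,
        )
        resolved = False  # keep the current behavior while warning
    return sorted(values) if resolved else list(values)
```

With this shape, setting `future_options.default_arg = True` changes every call that omits `future`, while a call such as `method(data, future=False)` is unaffected by the global value.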

Users can keep the current behavior and globally disable all warnings by specifying future.default_arg = False. If they were to do so and upgrade across major versions of pandas, they will see breaking changes without ever getting any warnings. As such, I think the documentation on this should read something like:

future.default_arg : bool | no_default
Global underride of any pandas function that has a future argument. When the future argument is specified in the function call itself, this value will not be used. When future.default_arg is not specified (so has value no_default), calling functions will warn on the upcoming change. When future.default_arg is set to True, the future behavior of functions will be used. When future.default_arg is set to False, the current behavior of functions will be used without warning. In particular, if you specify future.default_arg = False and upgrade across major versions of pandas, you will experience breaking changes without warning!

rhshadrach added the API Design, Deprecate, and Needs Discussion labels on Dec 20, 2024
rhshadrach changed the title from "API: future.default for method-specific behavior changes" to "API: future.default_arg for method-specific behavior changes" on Dec 20, 2024
@rhshadrach
Member Author

cc @pandas-dev/pandas-core

@Dr-Irv
Contributor

Dr-Irv commented Dec 20, 2024

I like this idea, and there is a bit of precedent (just discovered this yesterday) with DataFrame.stack() and the future_stack parameter.

Having said that, if we want to do this each time we make behavior changes, then maybe this proposal should be a PDEP?

@rhshadrach
Member Author

I personally do not think this rises to the level of a PDEP.

@Dr-Irv
Contributor

Dr-Irv commented Dec 20, 2024

> I personally do not think this rises to the level of a PDEP.

My reason for suggesting that is because it then becomes a "policy" that we follow going forward

@WillAyd
Member

WillAyd commented Dec 20, 2024

How would someone manage future=True meaning one thing when 4.x gets released and then having it mean something else when 5.x gets released?

@bashtage
Contributor

What is the off-ramp for changing this? Do we always support future=False to retain the legacy behaviour? Or is there an expectation that after the default changes, any bool argument will start warning that the future behaviour will become the only behaviour, and the future keyword will be removed?

@rhshadrach
Member Author

rhshadrach commented Dec 20, 2024

> > I personally do not think this rises to the level of a PDEP.
>
> My reason for suggesting that is because it then becomes a "policy" that we follow going forward

Understood - I just think "something becoming policy" is not in and of itself sufficient for requiring a PDEP.

> How would someone manage future=True meaning one thing when 4.x gets released and then having it mean something else when 5.x gets released?

We don't. The purpose of the underride is for people who are not trying to maintain code that needs to work across changes in major versions of pandas. If you are maintaining code across major versions of pandas, you should only be using the default of no_default.

> What is the off-ramp for changing this? Do we always support future=False to retain the legacy behaviour? Or is there an expectation that after the default changes, any bool argument will start warning that the future behaviour will become the only behaviour, and the future keyword will be removed?

The latter. I think the entire deprecation process would look like the following:

  1. pandas x.y.0: Introduce the future=[True | False | no_default] argument, default to no_default (raising a warning).
  2. pandas (x+1).0.0: Change the behavior so that future=no_default will give the future behavior. Specifying True or False raises a warning that the argument will be going away.
  3. pandas (x+2).0.0: Remove the future argument altogether.
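The three stages above might look roughly like this for a single toy function. It is a sketch under assumptions rather than pandas code: the function names are hypothetical, `no_default` is a local stand-in sentinel, the global underride is left out for brevity, and stage 2 assumes that future=False still returns the legacy result while warning, which the steps above do not spell out.

```python
import warnings

no_default = object()  # stand-in for pandas' no_default sentinel (assumption)


def current(values):
    """Legacy behavior: keep the input order."""
    return list(values)


def upcoming(values):
    """Future behavior: return the values sorted."""
    return sorted(values)


def method_stage1(values, *, future=no_default):
    """pandas x.y.0: argument introduced; leaving it unset warns."""
    if future is no_default:
        warnings.warn(
            "behavior will change; pass future=True or future=False",
            FutureWarning,
            stacklevel=2,
        )
        future = False  # unset still means the current behavior
    return upcoming(values) if future else current(values)


def method_stage2(values, *, future=no_default):
    """pandas (x+1).0.0: unset means future behavior; any bool warns."""
    if future is no_default:
        return upcoming(values)
    warnings.warn(
        "the future argument is deprecated and will be removed",
        FutureWarning,
        stacklevel=2,
    )
    # Assumption: an explicit False is still honored during this stage.
    return upcoming(values) if future else current(values)


def method_stage3(values):
    """pandas (x+2).0.0: the future argument no longer exists."""
    return upcoming(values)
```

Code that never passes `future` sees one warning in stage 1 and the new result from stage 2 onward. Code that passes `future=False` is warned in stage 2 and fails with a TypeError in stage 3, because the keyword has been removed.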

@Dr-Irv
Contributor

Dr-Irv commented Dec 23, 2024

> > > I personally do not think this rises to the level of a PDEP.
> >
> > My reason for suggesting that is because it then becomes a "policy" that we follow going forward
>
> Understood - I just think "something becoming policy" is not in and of itself sufficient for requiring a PDEP.

IMHO, the PDEP is the best way for us to document how we want to move forward with similar changes in the future. This particular idea (which I support) is related to the deprecation policy PDEP-17 that we recently approved. So if we make it a PDEP, then we have that reference going forward for similar issues in the future.

@rhshadrach
Member Author

PDEP-1 starts with

> A PDEP (pandas enhancement proposal) is a proposal for a major change in pandas

This is not a major change. We have contributor docs for documentation, not PDEPs.

@Dr-Irv
Contributor

Dr-Irv commented Dec 26, 2024

> This is not a major change. We have contributor docs for documentation, not PDEPs.

It's not a major change in the code, but a change in how we will handle these kinds of deprecations. I agree that it's borderline on whether it should be a PDEP or not.

@WillAyd
Member

WillAyd commented Dec 26, 2024

I would be +1 to a PDEP for this. It would also be a good supplement to PDEP-17

5 participants
@WillAyd @bashtage @Dr-Irv @rhshadrach and others