-
Notifications
You must be signed in to change notification settings - Fork 277
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use regex for parsing step_dir #739
base: main
Are you sure you want to change the base?
Conversation
@@ -81,8 +82,10 @@ class CheckpointValidationType(str, enum.Enum): | |||
|
|||
|
|||
def parse_step_from_dir(step_dir: str) -> int: | |||
# TODO(markblee): use regex. | |||
return int(step_dir[-STEP_NUM_DIGITS:]) | |||
step = re.findall(r'(\d{' + str(STEP_NUM_DIGITS) + r'})', step_dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I do think we should be restrictive about the patterns that we accept, specifically that it ends with STEP_NUM_DIGITS
digits. Should we update to a pattern that matches that and also add a test case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We run into an issue quite often where we'll have some checkpoint suffixed (i.e. checkpoints/step_00001000-ext
) where the only way to run this with axlearn is to copy this directory to step_00001000
. How would you suggest we incorporate a pattern like this? Happy to add test cases to this PR as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll probably need to understand more about your use-case and why we need to suffix checkpoints. Can you open an internal PR for discussion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will defer to @markblee for approval.
Instead of assuming the end of a step directory will be a series of digits to be converted to a step number, uses regex to find the last group of
STEP_NUM_DIGITS
from the directory passed into the function.