DVCFileSystem: inconsistent behavior of DVCFileSystem #10647

Open
adamliter opened this issue Dec 10, 2024 · 0 comments
Bug Report

Description

DVCFileSystem behaves inconsistently (I think, based on my understanding of the documentation), and I'm not sure what the intended behavior is. In particular, with a non-default branch checked out, DVCFileSystem's get_file raises shutil.SameFileError when rpath and lpath are the same path and the filesystem was instantiated with rev=None (the default). If the filesystem is instead instantiated explicitly with rev='name of branch', no error is raised.

Reproduce

$ cd /tmp
$ mkdir dvc-test-1
$ cd dvc-test-1
$ pdm init --python [email protected]
$ pdm add dvc==3.58.0 # not specific to this version though
$ git init
$ dvc init
$ git add .
$ git commit -m "initial commit"
$ git checkout -b train_model
$ echo 1 > model.ckpt
$ dvc add model.ckpt
$ git add . 
$ git commit -m "trained first model"

Now, from Python (e.g., pdm run python):

from dvc.api import DVCFileSystem
fs = DVCFileSystem()
fs.get_file("model.ckpt", "model.ckpt") # raises shutil.SameFileError
fs2 = DVCFileSystem(rev="train_model")
fs2.get_file("model.ckpt", "model.ckpt") # no error raised
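
For reference, shutil.SameFileError is what the standard library raises when a copy's source and destination resolve to the same file. Below is a minimal stdlib-only sketch of that condition. It does not use DVC at all; that DVCFileSystem with rev=None ends up copying the workspace file onto itself is my assumption about the cause, not something I have confirmed in the DVC source. The helper name copy_outcome is made up for illustration.

```python
import os
import shutil
import tempfile


def copy_outcome(src, dst):
    """Return "ok" if shutil.copyfile succeeds, else the exception class name."""
    try:
        shutil.copyfile(src, dst)
    except shutil.SameFileError as exc:
        return type(exc).__name__
    return "ok"


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.ckpt")
    with open(path, "w") as fh:
        fh.write("1\n")

    # Source and destination are the same file on disk (assumed to mirror
    # what happens with rev=None, where rpath and lpath both point at the
    # workspace copy).
    same = copy_outcome(path, path)

    # A different destination path copies without error.
    other = copy_outcome(path, os.path.join(tmp, "model-copy.ckpt"))

print(same, other)
```

If that assumption holds, the rev="train_model" case would differ only because the source is read from the revision rather than from the workspace path, so the two paths are no longer the same file.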

Expected

I'm not sure what the intended behavior is supposed to be. The documentation for rev says "In case of a local repository, if rev is unspecified, it will default to the working directory." Is "working directory" here supposed to be git's concept of "working tree"? If so, this makes me think the behavior of fs.get_file("model.ckpt", "model.ckpt") and fs2.get_file("model.ckpt", "model.ckpt") in the example above should be identical; but one raises an error and one does not. Is this expected?

Environment information

$ dvc doctor
DVC version: 3.58.0 (pip)
-------------------------
Platform: Python 3.12.8 on macOS-14.7.1-x86_64-i386-64bit
Subprojects:
	dvc_data = 3.16.7
	dvc_objects = 5.1.0
	dvc_render = 1.0.2
	dvc_task = 0.40.2
	scmrepo = 3.3.9
Supports:
	http (aiohttp = 3.11.10, aiohttp-retry = 2.9.1),
	https (aiohttp = 3.11.10, aiohttp-retry = 2.9.1)
Config:
	Global: /Users/adam.liter/Library/Application Support/dvc
	System: /Library/Application Support/dvc
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk1s5s1
Caches: local
Remotes: None
Workspace directory: apfs on /dev/disk1s5s1
Repo: dvc, git
Repo.site_cache_dir: /Library/Caches/dvc/repo/e935e1cd05376dcbfdd7b97f975e242b