Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Catch errors when reading single features #54

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Conversation

meggart
Copy link
Member

@meggart meggart commented Apr 27, 2021

Following up on #52 since we are using shx to move from feature to feature in the file, we can in theory also recover when single entries in the file are broken or the file is truncated. This PR would put the reading of the single entries into a try-catch block moves on if an entry is broken. This is not supposed to be a final implementation, I just wanted to get some feedback on if this is something we want or not.

Since this is not really compatible with the sequential reading of entries where it is much harder to recover from bad data (though not impossible), it would make sense to split the sequential read algorithm and the shx-based one into different functions, that are called separately in different use cases. I would do this if there is general consensus that we do care about broken files....

@visr
Copy link
Member

visr commented Apr 27, 2021

This seems like it could be helpful in giving better error messages. I'm not sure how I feel about just returning the non-broken or non-truncated features with only a print or even warn statement. Would like to hear what others think about this.

I would do this if there is general consensus that we do care about broken files....

I guess it depends. To me, if there are some variants that occur often and we can read with small modifications to the code, then definitely yes. Should we be able to read everything that shapelib accepts? Not necessarily, I'm afraid that would add a lot of code and make it harder to maintain.

@meggart
Copy link
Member Author

meggart commented Apr 28, 2021

I'm not sure how I feel about just returning the non-broken or non-truncated features with only a print or even warn statement.

I had the same feeling as well. I think a good solution might be to implement the idea of optional lazy access to the different entries in their shapefile. Then we throw an error if broken entry is accessed and move on for the good ones. Then it would be up to the users to include their try/catch statements according to their needs and they would still be able to access partially broken files.

@visr
Copy link
Member

visr commented Apr 28, 2021

Yes, something like that sounds good to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants