[Audio Features - DO NOT MERGE] PoC for adding an offset+sliced reading to audio file. #7312

Open
wants to merge 2 commits into
base: main

TParcollet

This is a proof of concept for #7310. The idea is to give the audio loading code access to the other columns of the dataset row when an audio file is loaded into a table, which enables sliced reading. As stated in the issue, many people have very long audio files and use start and stop offsets to slice them.
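
To make the idea concrete, here is a minimal sketch of offset+sliced reading, not the code from this PR. It uses only the standard-library `wave` module, so it handles PCM WAV files only; the actual audio backend may differ. The function names `read_audio_slice` and `load_row`, and the row keys `path`, `start` and `stop` (in seconds), are hypothetical and chosen for illustration.

```python
import wave


def read_audio_slice(path, start_s, stop_s):
    """Read only the frames between start_s and stop_s (in seconds).

    Hypothetical helper: works on PCM WAV files via the stdlib `wave` module.
    Returns the raw PCM bytes of the slice and the sampling rate.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        n_total = wav.getnframes()
        # Convert seconds to frame indices and clamp them to the file length.
        start = min(max(int(round(start_s * rate)), 0), n_total)
        stop = min(max(int(round(stop_s * rate)), start), n_total)
        # Seek to the first frame of the slice, then read only what is needed.
        wav.setpos(start)
        raw = wav.readframes(stop - start)
    return raw, rate


def load_row(row):
    """Load the audio of one dataset row using its other columns.

    Assumes (for illustration only) that the row carries `path`, `start`
    and `stop` columns next to the audio file reference.
    """
    raw, rate = read_audio_slice(row["path"], row["start"], row["stop"])
    return {"bytes": raw, "sampling_rate": rate}
```

The point of the sketch is that the slice boundaries come from the same row as the file path, so only the requested segment of a long recording is read.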

Right now, this code works as a PoC on my dataset. However, it is only meant to illustrate the idea. Many things are messed up, the first being that the shards have wildly varying sizes.

This could be of interest to @lhoestq and @sanchit-gandhi.

Happy to test better ideas locally.

Titouan Parcollet/Embedded AI /SRUK/Engineer/Samsung Electronics added 2 commits December 8, 2024 10:21