[Audio Features - DO NOT MERGE] PoC for adding an offset+sliced reading to audio file. #7312

TParcollet · 2024-12-08T10:27:31Z

This is a proof of concept for #7310. The idea is to give access to the other columns of a dataset row when loading an audio file into a table, which makes sliced reading possible. As stated in the issue, many people have very long audio files and read only a segment of each, defined by start and stop offsets.

Right now, the code in this PR works as a PoC on my dataset, but it is only meant to illustrate the idea. Many things are still messed up, the first being that the shards have wildly varying sizes.
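
To make the sliced-reading idea concrete, here is a minimal sketch of what "read only the segment between start and stop" means for a single row. It is not the code in this PR and does not use the `datasets` API: it relies only on Python's standard `wave` module, and the row keys `audio`, `start` and `stop` (offsets in seconds) as well as the helper `read_audio_slice` are hypothetical names chosen for illustration.

```python
import io
import wave


def read_audio_slice(source, start_sec, stop_sec):
    """Return (raw PCM bytes, sample rate) for [start_sec, stop_sec) of a WAV source.

    Hypothetical helper for illustration; offsets are clamped to the file length.
    """
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        n_frames = wav.getnframes()
        # Convert second offsets to frame indices and clamp them to valid bounds.
        start = min(max(int(start_sec * rate), 0), n_frames)
        stop = min(max(int(stop_sec * rate), start), n_frames)
        # Seek to the first requested frame so earlier audio is never decoded.
        wav.setpos(start)
        return wav.readframes(stop - start), rate


# Build a 2-second silent mono 16-bit WAV in memory to stand in for a long file.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(8000)
    w.writeframes(b"\x00\x00" * 16000)
buf.seek(0)

# A row whose other columns carry the slice boundaries (assumed column names).
row = {"audio": buf, "start": 0.5, "stop": 1.5}
pcm, rate = read_audio_slice(row["audio"], row["start"], row["stop"])
```

The point of the sketch is that the decoder needs the `start` and `stop` values from the same row as the audio, which is why access to the other columns at load time matters.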

This could be of interest to @lhoestq and @sanchit-gandhi.

Happy to test better ideas locally.

Titouan Parcollet/Embedded AI /SRUK/Engineer/Samsung Electronics added 2 commits December 8, 2024 10:21

simple poc

efb2615

quick fix

bc0dc6c
