
Create whisper_evaluator.py #3990

Open: pwolnows wants to merge 10 commits into base: master

Conversation

pwolnows (Contributor):

Enable validation of whisper models with:

  • WhisperPipeline from openvino_genai
  • AutomaticSpeechRecognitionPipeline from transformers
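The two backends above could sit behind a single entry point. A hedged sketch of such a dispatcher, where the function name, backend keys, and device handling are illustrative assumptions rather than the PR's actual API:

```python
# Hypothetical dispatch between the two supported Whisper backends; the
# function name and backend keys are illustrative, not taken from the PR.
def make_whisper_pipeline(backend, model_path):
    if backend == "genai":
        # OpenVINO GenAI backend: imports only when this branch is taken.
        import openvino_genai as ov_genai
        return ov_genai.WhisperPipeline(model_path)
    if backend == "transformers":
        # Hugging Face backend via the high-level pipeline factory.
        from transformers import pipeline
        return pipeline("automatic-speech-recognition", model=model_path)
    raise ValueError(f"unknown backend: {backend!r}")
```

Deferring the heavy imports into the branches also means a missing package only fails when that backend is actually selected.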

@AlexKoff88 (Contributor):

@eaidova, please take a look and trigger the CI.

return [], outputs


class GenAI_WhisperPipeline(WhisperPipeline):
Contributor:

I would rename the classes for consistency, e.g. HFWhisperPipeline, OptimumWhisperPipeline, GenAIWhisperPipeline.

Contributor Author:

Agree, the suggested names are self-descriptive.


Comment on lines 19 to 22:

    import openvino_genai as ov_genai
    from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor
    from transformers.pipelines.automatic_speech_recognition import \
        AutomaticSpeechRecognitionPipeline
Collaborator:

Please make these packages optional, like inflect below.
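The requested optional-import pattern could look roughly like the sketch below: remember the failure at import time and raise a clear error only when the backend is actually used. The helper name `require_genai` is illustrative, not from the PR.

```python
# Optional import: tolerate a missing package at module load and surface a
# clear error only when the dependency is actually needed.
try:
    import openvino_genai as ov_genai
    _GENAI_IMPORT_ERROR = None
except ImportError as err:
    ov_genai = None
    _GENAI_IMPORT_ERROR = err

def require_genai():
    # Called by the GenAI-backed pipeline class before using the module.
    if ov_genai is None:
        raise ImportError(
            "openvino_genai is required for the GenAI Whisper pipeline"
        ) from _GENAI_IMPORT_ERROR
    return ov_genai
```

This keeps the evaluator importable in environments where only one of the backends is installed.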

Contributor Author:

Agree. I had them all in a try/except in the initial version, but then decided the packages were so common that guarding the imports was pointless; as it turns out, there are indeed checks that fail to import them.

@AlexKoff88 (Contributor):

It looks good overall. It would be great to get some sanity tests on a dummy model, e.g. yujiepan/whisper-v3-tiny-random from the Hub, to make sure that all three classes work.
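A backend-agnostic sanity check could be as small as the harness below: feed each wrapper one second of silence and assert it returns a string. The harness itself is an assumption; the real test would plug in the three wrapper classes from the PR.

```python
# Minimal sanity harness: `pipelines` maps a name to any callable taking
# (audio, sample_rate) and returning a transcript string. With a tiny
# random model the transcript content is meaningless, so only the type
# and absence of crashes are checked.
def sanity_check(pipelines, sample_rate=16000):
    audio = [0.0] * sample_rate  # one second of silence
    results = {}
    for name, transcribe in pipelines.items():
        text = transcribe(audio, sample_rate)
        assert isinstance(text, str), f"{name} returned {type(text)}"
        results[name] = text
    return results
```

Running all three wrappers through one harness also guarantees they keep a common call signature.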

    input_data = [sample["audio"]["array"]]
    input_meta = [{"sample_rate": sample["audio"]["sampling_rate"]}]
    identifiers = [sample["id"]]
    # print(ground_truth)
Contributor:

Please clean up the code a bit and remove the print. Also, you need to remove the directory after the test suite finishes; you can define a teardown_module() function for that.

Collaborator:

Also, it looks like you need to install datasets in the test requirements:

 tools/accuracy_checker/tests/test_whisper_evaluator.py:23: in <module>
    from datasets import load_dataset
E   ModuleNotFoundError: No module named 'datasets'

Collaborator:

@pwolnows I believe you have enough permissions to open the GitHub Actions status, right? Some dependencies are still missing:
https://github.com/openvinotoolkit/open_model_zoo/actions/runs/12442951517/job/34781160595?pr=3990
