[Samples] merge LLM samples to "text_generation" folder #1411
base: master
Conversation
Force-pushed from 490743c to 9e7f861.
Force-pushed from 9e7f861 to 3901fbb.
- **Main Feature:** Demonstrates simple text continuation.
- **Run Command:**
  ```bash
  ./text_generation -m <model> -i "Hello, how are you?" -d CPU
  ```
- If it's an example command, it must be specific: replace `<model>` with TinyLlama (see the sketch below). If it's a help message, align it with what the sample would print.
- There's no `text_generation` target.
- The samples don't take the device as an argument. They also don't have named arguments.
done
> If it's a help message, align it with what the sample would print.

Align them to match exactly. For example, the greedy sample would print `greedy_causal_lm <MODEL_DIR> "<PROMPT>"` (remove `Usage:` because you have **Run Command:**).
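For illustration, the README entry could then quote the usage line exactly as the sample prints it, minus the `Usage:` prefix; the placeholders below are the ones the sample itself prints:

```bash
# Hypothetical "Run Command" entry: the exact usage string the greedy sample prints,
# with the "Usage:" prefix dropped.
greedy_causal_lm <MODEL_DIR> "<PROMPT>"
```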
The same must be done for the Python samples.
I suggest applying all comments first, and then making similar changes to the Python samples (or in the next PR).
> I suggest applying all comments first, and then making similar changes to the Python samples (or in the next PR).

Please change the structure of the C++ and Python samples in the same PR. It will be more convenient for updating the OpenVINO documentation and other public materials that reference these samples.
COMPILE_PDB_NAME prompt_lookup_decoding_lm
# Ensure out of box LC_RPATH on macOS with SIP
INSTALL_RPATH_USE_LINK_PATH ON)
# Don't install prompt_lookup_decoding_lm because it doesn't use openvino_genai library and is not verified yet.
This information is obsolete; `prompt_lookup_decoding_lm` now uses GenAI.
Please uncomment this code.
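A sketch of what the re-enabled install rule might look like, assuming it mirrors the `install()` pattern used by the other C++ samples; the destination and component names are assumptions, not taken from this PR:

```cmake
# Assumed to follow the same install() pattern as the other samples;
# destination and component names are illustrative.
install(TARGETS prompt_lookup_decoding_lm
        RUNTIME DESTINATION samples_bin/
        COMPONENT samples_bin
        EXCLUDE_FROM_ALL)
```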
@@ -2,14 +2,8 @@
# SPDX-License-Identifier: Apache-2.0
#

add_subdirectory(cpp/beam_search_causal_lm)
add_subdirectory(cpp/benchmark_genai)
`benchmark_genai` is also an LLM sample.
See the VLM sample https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/cpp/visual_language_chat as an example; a possible layout is sketched below.
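A rough sketch of how the top-level samples CMakeLists.txt might look after the consolidation, assuming `benchmark_genai` is also moved under a single `cpp/text_generation` folder following the `visual_language_chat` layout; the directory names are assumptions based on the PR title:

```cmake
# Hypothetical consolidated layout: per-sample subdirectories replaced by one
# text_generation folder that also hosts benchmark_genai, as in the VLM sample.
add_subdirectory(cpp/text_generation)
add_subdirectory(cpp/visual_language_chat)
```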
This example showcases inference of text-generation Large Language Models (LLMs): `chatglm`, `LLaMA`, `Qwen` and other models with the same signature. The application doesn't have many configuration options to encourage the reader to explore and modify the source code. For example, change the device for inference to GPU. The sample features `ov::genai::LLMPipeline` and configures it to run the simplest deterministic greedy sampling algorithm. There is also a Jupyter [notebook](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/llm-chatbot) which provides an example of LLM-powered Chatbot in Python.
# OpenVINO AI Text Generation Samples |
Suggested change:
-# OpenVINO AI Text Generation Samples
+# OpenVINO GenAI Text Generation Samples
These samples showcase the use of OpenVINO's inference capabilities for text generation tasks, including different decoding strategies such as beam search, multinomial sampling, and speculative decoding. Each sample has a specific focus and demonstrates a unique aspect of text generation.
The applications don't have many configuration options to encourage the reader to explore and modify the source code. For example, change the device for inference to GPU.
There is also a Jupyter [notebook](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/llm-chatbot) that provides an example of LLM-powered text generation in Python.
## Table of Contents
1. [Download and Convert the Model and Tokenizers](#download-and-convert-the-model-and-tokenizers)
2. [Running the Samples](#running-the-samples)
3. [Using encrypted models](#using-encrypted-models)
This section is missing. By the way, it is not a dedicated section; it is just another sample like the others.
@@ -13,32 +24,99 @@ pip install --upgrade-strategy eager -r ../../requirements.txt
optimum-cli export openvino --trust-remote-code --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 TinyLlama-1.1B-Chat-v1.0
```

## Run
## Running the Samples
The **Run Command** in each sample already demonstrates how to run it.
@@ -13,32 +24,99 @@ pip install --upgrade-strategy eager -r ../../requirements.txt
optimum-cli export openvino --trust-remote-code --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 TinyLlama-1.1B-Chat-v1.0
I suppose we cannot use just one model for all use cases:
- For the chat sample, a chat model.
- For typical generation, either an instruct model or a regular model.
- For speculative decoding, we need two models, and it is not shown at all which models to use (see the sketch below).
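For illustration, a speculative-decoding setup would export two models and pass both to the sample; the larger target model, the argument order, and the sample name are assumptions, not something this PR specifies:

```bash
# Assumption: a small draft model plus a larger target model from the same family;
# model choices, argument order, and the speculative_decoding_lm name are illustrative.
optimum-cli export openvino --trust-remote-code \
    --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 TinyLlama-1.1B-Chat-v1.0
optimum-cli export openvino --trust-remote-code \
    --model meta-llama/Llama-2-7b-chat-hf Llama-2-7b-chat-hf
./speculative_decoding_lm ./Llama-2-7b-chat-hf ./TinyLlama-1.1B-Chat-v1.0 "Hello, how are you?"
```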