Pull requests: NVIDIA/TensorRT-LLM

70 Open 350 Closed

Pull requests list

Custom samplingconfig addition

#2633 opened Dec 27, 2024 by buddhapuneeth

Create c-cpp.yml

#2601 opened Dec 20, 2024 by TNGBBK

Constraint pynvml version

#2570 opened Dec 12, 2024 by MahmoudAshraf97

fix NV bench output len was garbage value

#2516 opened Nov 29, 2024 by ekagra-ranjan

[DO NOT MERGE] test CI

#2513 opened Nov 29, 2024 by niukuo

feat(qwen): add trust_remote_code argument support

Labels: triaged

#2493 opened Nov 24, 2024 by ShivamSphn

bugfix/incorrect lora out dims

Labels: triaged

#2484 opened Nov 22, 2024 by akhoroshev

Fix prompt_table_data empty tensor shape error

Labels: triaged

#2470 opened Nov 20, 2024 by BasicCoder

Create INT8 KV Cache on Qserve

Labels: triaged

#2446 opened Nov 14, 2024 by dleunji

th::optional -> std::optional

Labels: triaged

#2397 opened Oct 31, 2024 by r-barnes

attention mechanism toggle added functionality issue

Labels: triaged, waiting for feedback

#2384 opened Oct 28, 2024 by Aaryanverma

fix load_model_on_cpu on qwen/convert_checkpoint.py

Labels: feature request, triaged

#2382 opened Oct 27, 2024 by lkm2835

Fix errors when using smoothquant to quantize Qwen2 model

Labels: Low Precision, triaged

#2370 opened Oct 24, 2024 by Missmiaom

README.md: Add 3rd Party Inference Speed Dashboard

Labels: Documentation, triaged

#2244 opened Sep 22, 2024 by matichon-vultureprime

Modify small-batched weight only quantization

Labels: Low Precision, triaged

#2213 opened Sep 10, 2024 by dasistwo

[examples/bert/build.py]: Load weights for BertModel and RobertaModel if --model_dir is provided

Labels: triaged

#2187 opened Sep 3, 2024 by tkhanipov

Create sync.yml

#2154 opened Aug 27, 2024 by inkimikoko

fix wrong buffer for oneShotAllReduceKernel under PUSH_MODE

#2099 opened Aug 8, 2024 by YconquestY

decoder MMHA kernel support INT8 SCALE_Q_INSTEAD_OF_K and SCALE_P_INS…

#2085 opened Aug 5, 2024 by lishicheng1996

fix GemmFpAIntB MMa::IteratorB::Layout

#2070 opened Jul 31, 2024 by luliyucoordinate

fix wrong arg in Engine Building Command in docs/source/performance/perf-overview.md

Labels: Documentation

#2057 opened Jul 30, 2024 by RuibaiXu

Fix default min length

Labels: triaged

#1935 opened Jul 11, 2024 by akhoroshev

Add support for custom tokenizer and batch size

#1927 opened Jul 9, 2024 by uppalutkarsh

Dev sm87 trt101

#1880 opened Jul 3, 2024 by sunnyqgg

Bump transformers from 4.36.2 to 4.38.0 in /examples/multimodal

Labels: bug, dependencies, triaged, waiting for feedback

#1689 opened May 28, 2024 by dependabot[bot]
