Tags: NVIDIA/TensorRT-LLM
[None][doc] Refine Nemotron Ultra doc (#15113) Signed-off-by: nv-guomingz <[email protected]>
[https://nvbugs/6244695][fix] Revert Pass IPC HMAC key through file descriptor (#14782) Signed-off-by: Chenfei Zhang <[email protected]>
[None][infra] Fix release sanity config compatibility Signed-off-by: Yiteng Niu <[email protected]>
[None][infra] Waive 9 failed cases for main in post-merge (#14515) Signed-off-by: xinhe-nv <[email protected]>
[None][test] Waive 1 failed cases for main in QA CI (#14332) Signed-off-by: xinhe-nv <[email protected]>
[None][fix] fix PEFT page accumulation in MaxUtilizationPolicy scheduler (#13528) Signed-off-by: Aurelien Chartier <[email protected]>
[None][feat] Add conversation router integration test and improve finish_request perf
- Add eager_poll config to KvCacheAwareRouter for test determinism
- Make finish_request non-blocking by firing poll_and_update as background task
- Add _base_url property to avoid double http:// prefix
- Add ConversationRouterTester with explicit conversation_id and implicit prefix matching tests in test_workers.py
- Add conversation router test to l0_dgx_h100 and QA test lists
Signed-off-by: Lizhi Zhou <[email protected]> (cherry picked from commit b0f0b5d)
[None][infra] Check in most recent lock file from nightly pipeline Signed-off-by: TensorRT LLM <[email protected]>
[https://nvbugs/6025177][fix] Fix KV cache issue (cherry-pick to release/1.3.0rc5.post2) (#12819) Signed-off-by: thorjohnsen <[email protected]>
[None][chore] Bump version to 1.3.0rc12 (#13129) Signed-off-by: Xiwen Yu <[email protected]>