Watch the video:
Oobabooga WebUI had a HUGE update adding ExLlama and ExLlama_HF model loaders that use LESS VRAM and have HUGE speed increases, and even 8K tokens to play around with. Watch the full video: Go to post or View on YouTube
Oobabooga WebUI had a HUGE update adding ExLlama and ExLlama_HF model loaders that use LESS VRAM and have HUGE speed increases, and even 8K tokens to play around with. Watch the full video: Go to post or View on YouTube