llama2-chatbot-cpu
Public
A LLaMA2-7b chatbot with memory running on CPU, optimized using smooth quantization, 4-bit quantization, or Intel® Extension for PyTorch with bfloat16.
Created: 2023-08-10T21:32:14
Updated: 2024-09-03T21:34:20
Stars: 14
Stars Increase: 0