llama2-chatbot-cpu
PublicA LLaMA2-7b chatbot with memory running on CPU, and optimized using smooth quantization, 4-bit quantization or Intel? Extension For PyTorch with bfloat16.
Discover Popular AI-MCP Services - Find Your Perfect Match Instantly
Easy MCP Client Integration - Access Powerful AI Capabilities
Master MCP Usage - From Beginner to Expert
Top MCP Service Performance Rankings - Find Your Best Choice
Publish & Promote Your MCP Services
A LLaMA2-7b chatbot with memory running on CPU, and optimized using smooth quantization, 4-bit quantization or Intel? Extension For PyTorch with bfloat16.