Groq Inc. has unveiled an inference chip for large language models that generates about 500 tokens per second, a rate the company says surpasses both traditional GPUs and Google's TPU. Team members, including founder Jonathan Ross, come from Google's TPU division. The chip is built on Groq's proprietary LPU (Language Processing Unit) architecture and is priced at approximately $20,000; the company aims to outperform NVIDIA within three years. Groq also offers API access with very fast response speeds and supports a variety of open-source LLMs.