PIGuard is a new type of prompt protection model specifically designed to detect prompt injection attacks. Through an innovative training strategy, it significantly reduces the bias towards trigger words, performs excellently in multiple benchmark tests, surpassing the current best model by 30.8%, and provides a powerful open-source protection solution for LLM security.
Natural Language Processing
TransformersEnglish