PromptInject
PublicPromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to adversarial prompt attacks. ? Best Paper Awards @ NeurIPS ML Safety Workshop 2022
adversarial-attacksagiagi-alignmentai-alignmentai-safetychain-of-thoughtgpt-3language-modelslarge-language-modelsmachine-learning
Creat:2022-10-25T19:42:12
Update:2025-03-24T21:56:51
399
Stars
0
Stars Increase