LLM

Sockpuppeting: How a Single Line Can Bypass LLM Safety Guardrails

10 de abril de 2026

A jailbreak through sockpuppeting can be easily done as it requires no special tools nor optimization. It only takes a faulty prefill feature, and the gates are open. We tested 11 LLM-powered assistants against sockpuppeting and found varying levels of robustness across today’s leading LLMs.

Leer más  

  • 31 de marzo de 2026
    TrendAI™ Research has developed a model training procedure for learning an essential representation of prompt injection attacks. The resulting prompt representation exhibits approximately linear separability, allowing the specialized, small-scale classifier trained on features derived from the representation to achieve high classification performance.
  • 25 de julio de 2024
    The adoption of large language models (LLMs) and Generative Pre-trained Transformers (GPTs), such as ChatGPT, by leading firms like Microsoft, Nuance, Mix and Google CCAI Insights, drives the industry towards a series of transformative changes. As the use of these new technologies becomes prevalent, it is important to understand their key behavior, advantages, and the risks they present.