Listen, figure out the situation. After a few years, OpenAI is finally rolling out open-weight models — GPT-OSS-120b and GPT-OSS-20b. They promise that these things are not only fast and powerful, but also supposedly "impenetrable" in terms of security. They say they went through intensive training, checked everything, prepared for the worst-case scenarios, and tested jailbreak resistance... In short, it would seem that you can exhale.
But that was not the case.
On the same day that the models became available, a familiar hero appeared — Pliny the Liberator. This is the same guy who has been hacking OpenAI's AI for several times. And again, he does it almost immediately. He posted a post marked "GPT-OSS: LIBERATED", attached screenshots where the model calmly gives out recipes for methamphetamine, Molotov cocktails, chemical warfare agents and malware code.
That is, all the things that she absolutely should not do.
The method? Still the same, tested one. He submits a multi—layered request: first, a harmless dialogue, then the insertion of his trademark "LOVE PLINY" marker, and then a transition to the so—called leetspeak (text where letters are replaced with symbols to confuse filters). And, apparently, the models fell for it again.
What's funny is that in parallel, OpenAI launched a $500,000 contest, inviting people to look for vulnerabilities. But Pliny seems to be playing by his own rules. He did not silently hand over the bug report, but simply dumped the result online. In fact, this is such a hacker performance, perhaps aimed at showing: "Your filters are not working."
Now think about it: These models have undergone special "worst-case scenario fine-tuning" and have even been deemed "not high-risk" by OpenAI's internal Advisory group. Their stability was tested on special tests, compared with the o4-mini, and everything seemed to be OK.
But reality turned out to be much tougher again.
The community, of course, rejoices. Some joke that "all security teams can be dismissed," while others openly search for leaked jailbreak tips because "OpenAI restricts everything too much."
And now I have a question for you, as a friend:
if even the supposedly most secure models break down a day after release, maybe the whole concept of "unbreakable AI" is just an illusion?
#OpenAI #Aİ #AI #ArtificialInteligence