Estimating worst-case frontier risks of open-weight LLMs

May 21, 2026



In this paper, we examine the worst-case frontier risks of releasing gpt-oss. We introduce malicious fine-tuning (MFT), where we attempt to elicit maximum capabilities by fine-tuning gpt-oss to be as capable as possible in two domains: biology and cybersecurity.


