Am I missing something? The article seems to suggest it works via hidden text characters. Has OpenAI never heard of pasting text into a utf8 notepad before?
It wouldn’t be surprising to me if they’ve had this implemented for a while.
There’s still some question about why their 3.5 model had an apparent sudden drop-off in quality about a year ago, and one plausible explanation is that they were fucking with their weights in order to watermark the outputs in exactly the way you’re mentioning. They were also fighting prompt-injection methods and censoring disapproved uses at the time, so who the fuck knows.
This doesn’t touch the weights at all, it’s just a change to the sampler.
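To make "just a change to the sampler" concrete, here is a rough toy sketch of one publicly described style of sampler-side watermark (a "green list" logit bias). This is not OpenAI's actual method, which isn't documented in this thread. The vocabulary size, bias value, and all function names are made up for illustration, and the "model" is just a flat logit vector:

```python
import hashlib
import math
import random

VOCAB_SIZE = 50        # toy vocabulary size (assumption)
GREEN_FRACTION = 0.5   # share of the vocab favored at each step (assumption)
BIAS = 4.0             # logit boost for favored tokens (assumption)

def green_list(prev_token: int) -> set[int]:
    """Pseudo-randomly pick the favored tokens, keyed on the previous token."""
    digest = hashlib.sha256(str(prev_token).encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return set(rng.sample(range(VOCAB_SIZE), int(VOCAB_SIZE * GREEN_FRACTION)))

def sample_watermarked(logits: list[float], prev_token: int, rng: random.Random) -> int:
    """Ordinary softmax sampling, except green tokens get a logit boost.
    The logits come from the model untouched; no weights are modified."""
    green = green_list(prev_token)
    biased = [l + BIAS if i in green else l for i, l in enumerate(logits)]
    top = max(biased)
    weights = [math.exp(b - top) for b in biased]
    return rng.choices(range(VOCAB_SIZE), weights=weights, k=1)[0]

def green_rate(tokens: list[int]) -> float:
    """Detector: fraction of tokens that fall in the green list of their predecessor."""
    hits = sum(tok in green_list(prev) for prev, tok in zip(tokens, tokens[1:]))
    return hits / (len(tokens) - 1)

def generate(n: int, watermark: bool, seed: int = 0) -> list[int]:
    """Generate n tokens from a stand-in 'model' that outputs flat logits."""
    rng = random.Random(seed)
    logits = [0.0] * VOCAB_SIZE  # placeholder for real model output
    tokens = [0]
    for _ in range(n):
        if watermark:
            tokens.append(sample_watermarked(logits, tokens[-1], rng))
        else:
            tokens.append(rng.randrange(VOCAB_SIZE))
    return tokens
```

In this toy setup, watermarked output lands in the green list far more often than the roughly 50% you'd expect by chance, and that skew is what a detector would look for. Nothing here involves hidden characters or retraining.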
What lobotomizes their models is cost cutting and trying to make them “safe,” or at least that’s what I suspect.