Am I missing something? The article seems to suggest it works via hidden text characters. Has OpenAI never heard of pasting text into a utf8 notepad before?
Am I missing something? The article seems to suggest it works via hidden text characters. Has OpenAI never heard of pasting text into a utf8 notepad before?
Humans instinctively do something analogous with natural language, using poetic forms like rhyme, meter, and alliteration. (For example, the speeches from Shakespeare’s plays are immediately detectable because they’re in iambic pentameter.)
Imagine you lacked the natural human ability to detect verse, making poetry indistinguishable from prose. As far as you could tell, it would be like an invisible watermark that only specialists could detect. LLMs can use a similar approach, making up their own patterns that are opaque to humans but detectable to themselves.