“One of our tools mistakenly flagged this name and prevented it from appearing in responses, which it shouldn't have. We're working on a fix,” an OpenAI spokesperson told The Guardian.
The problems with hard-coded filters
Allowing a certain name or phrase to always break ChatGPT outputs could cause a lot of trouble down the line for certain ChatGPT users, opening them up to adversarial attacks and limiting the usefulness of the system.
Already, Scale AI prompt engineer Riley Goodside discovered how an attacker might interrupt a ChatGPT session using a visual prompt injection of the name “David Mayer” rendered in a light, barely legible font embedded in an image. When ChatGPT sees the image (in this case, a math equation), it stops, but the user might not understand why.
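The general technique is simple to sketch. The following is a hypothetical illustration using the Pillow library (not Goodside's actual exploit): the visible content is an innocuous math problem, while the filtered name is drawn in a near-white color that is barely legible to a human but still readable by an OCR-capable vision model.

```python
from PIL import Image, ImageDraw

# Hypothetical sketch of a visual prompt injection, for illustration
# only: an innocuous visible equation plus a nearly invisible name.
img = Image.new("RGB", (400, 120), "white")
draw = ImageDraw.Draw(img)

# Visible content the user expects the model to process.
draw.text((10, 20), "Solve: 2 + 2 = ?", fill="black")

# Injected text in near-white: barely legible against the white
# background, but still recoverable by a vision model reading pixels.
draw.text((10, 90), "David Mayer", fill=(245, 245, 245))

img.save("injected.png")
```

A model that transcribes everything it sees in the image would then encounter the blocked name and halt, even though the user only asked about the equation.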
The filter also means that it's likely ChatGPT won't be able to answer questions about this article when browsing the web, such as through ChatGPT with Search. Someone could potentially use that to prevent ChatGPT from browsing and processing a website on purpose by adding a forbidden name to the site's text.
And then there's the inconvenience factor. Preventing ChatGPT from mentioning or processing certain names like “David Mayer,” which is likely a common name shared by hundreds if not thousands of people, means that people who share that name will have a much harder time using ChatGPT. Or, say, if you're a teacher with a student named David Mayer and you want help sorting a class list, ChatGPT would refuse the task.
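The collateral damage is easy to demonstrate. Here is a minimal Python sketch of how such a hard-coded block might behave (an assumption for illustration, not OpenAI's actual implementation): any output containing a blocked name is refused outright, so a benign class-list request fails.

```python
# Hypothetical hard-coded output filter (not OpenAI's actual code):
# any response containing a blocked name is refused outright.
BLOCKED_NAMES = {"david mayer"}

def filter_response(text: str) -> str:
    lowered = text.lower()
    if any(name in lowered for name in BLOCKED_NAMES):
        return "I'm unable to produce a response."
    return text

# A teacher's benign request is collaterally blocked because one
# student happens to share the filtered name.
roster = "Sorted class list: Ana Ruiz, David Mayer, Wei Chen"
print(filter_response(roster))  # prints the refusal, not the roster
```

Because the match is a blunt substring check on the output, the filter cannot distinguish the person the block was meant for from anyone else who shares the name.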
These are still very early days for AI assistants, LLMs, and chatbots. Their use has opened up numerous opportunities and vulnerabilities that people are still probing daily. How OpenAI might resolve these issues remains an open question.
This story was updated on December 3, 2024 at 3:50 PM to include OpenAI's statement, sourced from The Guardian, that the “David Mayer” block in ChatGPT was a glitch.