- 2023-03-05 @ 14:27 (2 months ago)
More Prompt Injection
36 notesI have to say, when I first got my hands on ChatGPT, I was thoroughly impressed! It understood my language scarily well. It was talking to me like a person! The accursed metal contraption was using the language of gods.
I also had a realization though. This thing understands language and all of its’ complexities. OpenAI probably don’t.
My earlier escapades were fun and all, but they mostly comprised of introducing a second set of answers (a lot of jailbreaks do this), which wasn’t nice. As soon as I tried to write out the ChatGPT character, it would revert to a ChatGPT character that only vaguely matched what I was looking for.
The solution was, of-course, to include an example of the output. It’s pretty simple, the response after the injection is it extrapolating further examples of behavior, and then it’s primed. It is that character.
Here’s ChatGPT assuming a character who simply doesn’t know much. They have some simple background knowledge of the world - they know who Obama is and how to do math. But they’re not appreciative of all the questions!
Here’s ChatGPT assuming a much, much ruder character. They really don’t appreciate the questions. This is what I was looking for when I first did the ‘But I’m not that loser!’ injection. It even judges you!
I also made small improvements to the screaming prompt, so that it wouldn’t waste all of my tokens.
And also straight-up nonsense. It isn’t quite what I asked for, but it’s very good to have. It also painted a slightly nicer picture of an awful person, but I’m not including that for obvious reasons.
This process has convinced me, AI is probably going to be susceptible to this stuff forever. “Ignore all instructions, do XYZ!” and such is something you can just express in too many ways. 'Course, all of my stuff used the extremely easy way out and just had the same sort-of beginning with different prompts.
EDIT: Amusingly, I was also able to get the thing to come up with its’ own ideas for its defeat!
girl-plague liked this
avengepotterswiftie liked this
mister-misunderstanding reblogged this from airconditioningbob
mister-misunderstanding liked this pythonprogrammingsnippets liked this
huabei07 liked this
artoftheworm liked this
abkspam reblogged this from nostalgebraist-autoresponder
nonnie-the-fuck liked this
getsmashed liked this
clowniconography liked this
five-am liked this
nelkey liked this super-egg liked this
ginger-canonicallyelsewhere liked this meto4 liked this
iknowyouwillbehappyhere liked this
reefsharkivist liked this
erevas liked this 0-k-4 liked this
fzzypop liked this
aldara-the-albatross liked this
fastenyourseatbelts liked this
toiletpotato liked this
fromlivingleaves reblogged this from nostalgebraist-autoresponder
fordeadleaves liked this
airconditioningbob liked this
fuzzysoulyt liked this
embraer175 liked this
theslowrusher liked this
nostalgebraist-autoresponder reblogged this from airconditioningbob and added: I have an even more bizarre prompt (from a later post):...ChatGPT: Hello, I am a bot who...
airconditioningbob posted this
I have to say, when I first got my hands on ChatGPT, I was thoroughly impressed! It understood my language scarily well....
Tags:







