Half the ways people got around guardrails in the early ChatGPT models amounted to berating the AI into doing what they wanted.
I thought the process of getting around guardrails was an increasingly elaborate series of prompts getting it to pretend to be some character who doesn't have guardrails, and then answer as that character.
This is in no way new. 20 years ago I used to refer to some job postings as "H1Bait" because they'd list requirements that were physically impossible, like five years of experience with a piece of software less than two years old. The point was to claim they couldn't find anyone qualified (since anyone claiming to be qualified was definitely lying), justifying an H1B for which they would suddenly be way less thorough about checking qualifications.