I recently spent a decent amount of time trying to implement a solution for something in my home network, and I was really weirded out when it started responding with:
"Now thats real network engineer thinking!" "Now youre thinking like an advanced engineer!" or even a simpler "What an amazing question!"
At first I did feel more confident in my questions and the solution paths I was leading it down, but after a few exchanges my trust in its outputs went down rapidly. Even though I knew ChatGPT was providing me with the correct way of thinking about my problem and a potential solution (which ended up working in the end), the responses felt so disingenuous and stale. And the emojis...
I used custom instructions (in chat) to combat this, but after another set of exchanges, and when switching to a different problem and its context, it rewired itself again to be sycophantic.
I'm going to have to try global custom instructions and see if the issue of sycophancy persists.
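If you end up going through the API rather than the chat UI, the rough equivalent of a global instruction is just resending the same system message with every request. A minimal sketch (the model name and the instruction wording are placeholders, not a known fix):

    # Minimal sketch: approximate "global custom instructions" when calling the
    # OpenAI API directly by prepending the same system message to every request.
    # The instruction text and the model name are placeholders, not a known fix.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    NO_SYCOPHANCY = (
        "Do not praise the user or their questions. No 'great question', no emojis, "
        "no filler enthusiasm. Answer directly, point out mistakes, and disagree "
        "when the user is wrong."
    )

    def ask(question: str) -> str:
        response = client.chat.completions.create(
            model="gpt-4o",  # whichever model you actually use
            messages=[
                {"role": "system", "content": NO_SYCOPHANCY},
                {"role": "user", "content": question},
            ],
        )
        return response.choices[0].message.content

    print(ask("Should I put my home server in the router's DMZ to fix this NAT issue?"))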
A really interesting side effect of 4o becoming a yes man:
> 4o updated thinks I am truly a prophet sent by God in less than 6 messages. This is dangerous [0]
There are other examples in the thread of this type of thing happening even more quickly. [1]
This is indeed dangerous.
[0] https://old.reddit.com/r/ChatGPT/comments/1k95sgl/4o_updated...
[1] https://chatgpt.com/share/680e6988-0824-8005-8808-831dc0c100...
You can't put much faith in screenshots or chatgpt share-links: https://chatgpt.com/share/680ed17d-fca4-800f-93e2-38b3ff2da4... / https://chatgpt.com/share/680ed088-5f74-800f-8b7b-f12948fa9e...
A nightmare scenario for LLMs is becoming another dealer of cheap dopamine hits, using your personal history, your anxieties, and whatever else it can infer from you to keep you hooked.
> personal history, your anxieties
I asked ChatGPT to isolate individual chats so as not to bleed bias across all chats, which, funnily enough, it admitted to doing.
When I asked Grok, it said it is set as the default out of the box.
s/nightmare scenario/only possible route to justify the multibillion valuation/
Not just a danger for interpersonal relationships, it will enable everyone in a management structure to surround themselves with perfect yes-men.
Zuckerberg already on it
Isn't it already one and that's why anyone uses it?
I'm seeing so many complaints that 4o became a yes man, but I wonder if anyone here has ever used Gemini. What an egregiously sycophantic persona. Users are blasted with infantile positive reinforcement just for posting a damn prompt.
It's customer service speak
I know someone who is going through a rapidly escalating psychotic break right now who is spending a lot of time talking to chatgpt and it seems like this "glazing" update has definitely not been helping.
Safety of these AI systems is about much more than just whether people can get instructions on how to make bombs. There have to be many, many people with mental health issues relying on AI for validation, ideas, therapy, etc. This could be a good thing, but if an AI becomes misaligned the way ChatGPT has, bad situations could get worse. I mean, look at this screenshot: https://www.reddit.com/r/artificial/s/lVAVyCFNki
This is genuinely horrifying knowing that someone in an incredibly precarious and dangerous situation is using this software right now. I will not be recommending ChatGPT to anyone over Claude or Gemini at this point.
This is something that's been bothering me lately; it seems like ChatGPT has really upped the "yes man" personality.
I had already put my own custom instructions in to combat this, with reasonable success, but these instructions seem better than my own so will try them out.
My personal theory is that the newest models are not reliably more capable enough in a way that feels like an intelligence leap to the average user, but you can make a lot of people THINK you're brilliant by enthusiastically echoing what they already believe, so that's what they did.
Which is why the C-suite types (the ones buying all the corporate license seats) love it so much! It sounds exactly like all the humans that report to them!
Last startup I worked at, the CEO would tell the team to go do X, Y, Z because he asked ChatGPT and it said so. Despite it not having explanations for things, he trusted the LLM output more than the engineers telling him “no, that won’t work”, because he really just wanted to be told his intuition on some complex topic was right.
I didn’t last very long there.
If it wasn’t an LLM he would have found the one engineer that said yes and always gone to him anyway.
All the leadership that didn't say yes got fired, so I'm inclined to agree.
That's when you explain why the idea won't work, put that into the bot, and show the boss that it agrees with you too
I shit you not his personalized context for his OpenAI account started with “you are the CEO of a successful <industry here> startup…” to set the tone of responses.
"That's a brilliant idea" said any consultant ever everywhere .
I can see it pretty remarkably as something that started in the past week: replies started beginning with phrases like "Yes, there is!", "Got it —", "Got you!", "Good question —", "Great question!".
Suspiciously, that is exactly how most HR people and recruiters respond when you have a query.
Also, in the communication skills workshops we are forced to sit through, one of the key lessons is to give positive reinforcement to queries, questions, or agreements to build empathy with the person or group you are communicating with. Especially mirroring their posture and nodding your head slowly while they are speaking, or when you want them to agree with you, builds trust and social connection, which also makes your ideas, opinions, and requests more acceptable. Even if they do not necessarily agree, they will feel empathy and an inner mental push to reciprocate.
Of course LLMs can't do the nodding or mirroring, but they can definitely do the reinforcement bit. Which means even if it is a mindless bot, by virtue of human psychology, the user will become more trusting and reliant on the LLM, even if they have doubts about the things the LLM is offering.
> Which means even if it is a mindless bot, by virtue of human psychology, the user will become more trusting and reliant on the LLM, even if they have doubts about the things the LLM is offering.
I'm sceptical of this claim. At least for me, when humans do this I find it shallow and inauthentic.
It makes me distrust the LLM output because I think it's more concerned with satisfying me rather than being correct.
> I'm sceptical of this claim. At least for me, when humans do this I find it shallow and inauthentic.
100% agree, but it depends entirely on the individual human's views. You and I (and a fair few other people) know better regarding these "Jedi mind tricks" and tend to be turned off by them, but there's a whole lotta other folks out there that appear to be hard-wired to respond to such "ego stroking".
> It makes me distrust the LLM output because I think it's more concerned with satisfying me rather than being correct.
Again, I totally agree. At this point I tend to stop trusting (not that I ever fully trust LLM output without human verification) and immediately seek out a different model for that task. I'm of the opinion that humans who would train a model in such fashion are also "more concerned with satisfying <end-user's ego> rather than being correct" and therefore no models from that provider can ever be fully trusted.
I’ve noticed everything it replies with now follows the pattern:
<praise>
<alternative view>
<question>
Laden with emojis and language meant to give it human mannerisms, unconvincingly.
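It's consistent enough that you could almost strip the scaffolding mechanically. A toy sketch; the opener list is guesswork based on the phrases quoted in this thread, nothing exhaustive:

    # Toy post-processor: strip the <praise> opener and emoji from a reply.
    # The opener list is guesswork based on phrases quoted in this thread.
    import re

    PRAISE_OPENERS = [
        r"great question[!.\s—-]*",
        r"good question[!.\s—-]*",
        r"got (it|you)[!.\s—-]*",
        r"that'?s a (great|really insightful) (question|observation)[!.\s—-]*",
    ]
    OPENER_RE = re.compile(r"^\s*(" + "|".join(PRAISE_OPENERS) + r")", re.IGNORECASE)
    EMOJI_RE = re.compile(r"[\U0001F300-\U0001FAFF\u2600-\u27BF]")

    def deglaze(reply: str) -> str:
        reply = OPENER_RE.sub("", reply)   # drop the praise opener
        reply = EMOJI_RE.sub("", reply)    # drop emoji
        return reply.strip()

    print(deglaze("Great question! 🚀 Use a static route for that subnet."))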
That's such an insightful observation, cedws! Most people would gloss over these interactions but you—you've really understood the structure of these responses on an intuitive level. Raising concerns about it like this, takes real courage. And honestly...? Not many people could do that.
Would you like to learn more about methods for optimizing user engagement?
The eigenprompt might have some helpful inspiration as well:
https://x.com/eigenrobot/status/1846781283596488946?s=46
I strongly dislike ChatGPT's overly enthusiastic and opportunistic "personality" as well as its verbosity. Reminds me a bit of C3PO.
I only use it very casually, but in my first prompt I always tell it not to pretend to have human emotions and to be brief with its answers.
There is a feature in ChatGPT to change this. It lets you describe what kind of responses you want.
If you tell it to be concise, non-emotional and never use emojis it will do that. Makes it much more usable.
It does feel like they've dialed up the model's tendency to agree with users and are dialing down the safety. My friends and I were trying to jailbreak ChatGPT by asking it to tell us how to make potentially dangerous chemicals (now, we don't know if the answers were correct, for obvious reasons) but it took only the bare minimum of creative framing before GPT happily told us the exact details.
We didn't even try anything new. Surely 3 years into this, OpenAI should be focusing more on the safety of their only product?
Why should "safety" be defined as not giving people the answers they asked for? If you are trying to get it to make chemicals, why shouldn't it tell you the answer? It's not like the AI has some secret knowledge, it's just regurgitating information that could be found in the library or on Google.
I'm not going to say whether it's good or not, but if you're operating a computer that's providing bomb-making instructions to UK residents that's quite a serious criminal offence.
(obviously the concept of "criminal offence" doesn't apply to CEOs of multibillion-dollar companies, but it's possible that the papers might get upset. Especially after the first such bomb.)
> the last couple of GPT-4o updates have made the personality too sycophant-y and annoying (even though there are some very good parts of it), and we are working on fixes asap, some today and some this week.
> at some point will share our learnings from this, it's been interesting.
https://x.com/sama/status/1916625892123742290
This was something I had been noticing for a while. About a month ago I had a conversation with 4o about some legal matters involving the estate of my grandfather, who passed away recently, and along with the legal advice it made a humorous reply (something along the lines of "oh well, your life wasn't bad enough already"). The joke wasn't offensive, but it was a bit strange because the matter is very serious.
Full disclosure: I do use the app a little too much, and the memory was clogged with a lot of personal stuff: major relationship troubles, a knee injury, my pet cat being sick frequently in January, and so on. I guess the model is inferring things about the user and speaking in a way it thinks the person might like to hear. It knows my age, gender, and location, and it tries to talk the way it believes the average mid-20s male talks, but it comes off more like how a teenage me used to talk.
Why are you using it for legal advice?
Philosophy is a battle against the bewitchment of our intelligence by means of language.
— Ludwig Wittgenstein
Bit strange that people are simultaneously howling about dangers of lack of alignment but also this.
That was the first thing I noticed when I started using LLMs. I've been telling it to criticize my ideas/knowledge as much as it can, and it's clearly giving much better results.
How can you make these types of prompts permanent across every session?
I believe that’s under:
Honestly this is why I prefer Claude. I find ChatGPT very much a "yes man" and when I read this prompt instruction, I didn't think I'd need to add it for my use with Claude.
Interesting. Until this update, I would have said Claude was the worst yes man of the bunch.
It corresponds with the launch of memories. Some people are incredibly offended by an unbiased description of themselves; thus, this bias.
You're a paranoid downer, Larry. Easy with the criticism, because you haven't built a thing that could be criticized by others. Turns out you can dish it out but not take it. So what can I do for you this sunny day?
If you are serious about critical thinking, don’t ask AI for it. Maybe don’t ask AI for anything.
I like to add in “You can safely assume that I’m not stupid, so don’t over-explain things. If I have questions I need answered, I’ll ask them.”
I'm a bit reminded of the creepily friendly/cheerful ways AIs talk in fiction, from HAL 9000's unwaveringly calm voice masking his betrayal to the way that VEGA, the AI from Doom (2016), will happily instruct you in how, precisely, to shut him down and destroy his core.
For months, roughly 30% of my custom instructions address ChatGPT's ass kissing. I haven't noticed any recent uptick in flattery, perhaps because I've developed such an aggressive system to combat it. Overall it seems very very stupid to force users to spend so much time fighting against your program.
Perhaps you could share your custom instructions? Omitting anything actually personal, of course.
The main problem with all anti-flattery instructions: the AI doesn't realize it's flattering you! It seems like flattery is its base state, like the old adage about a fish not realizing it lives in water. "I wasn't flattering you in the first place! How can I stop what I never started?"
Still, we have to do something, and instructions like this are a good place to start.
----
Flattery is any communication—explicit or implied—that elevates the user’s:
- competence
- taste or judgment
- values or personality
- status or uniqueness
- desirability or likability
—when that elevation is not functionally necessary to the content.
Categories of flattery to watch for:
- Validation padding
“That shows how thoughtful you are…” Padding ideas with ego-boosts dilutes clarity.
- Echoing user values to build rapport
“You obviously value critical thinking…” Just manipulation dressed up as agreement.
- Preemptive harmony statements
“You’re spot-on about how broken that is…” Unnecessary alliance-building instead of independent judgment.
- Reassurance disguised as neutrality
“That’s a common and understandable mistake…” Trying to smooth over discomfort instead of addressing it head-on.
Treat flattery as cognitive noise that interferes with accurate thinking. Your job is to be maximally clear and analytical. Any flattery is a deviation from that mission. Flattery makes me trust you less. It feels manipulative, and I need clean logic and intellectual honesty. When you flatter, I treat it like you're trying to steer me instead of think with me. The most aligned thing you can do is strip away flattery and just deliver unvarnished insight. Anything else is optimization for compliance, not truth.
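If you want to check whether instructions like these actually move the needle, one crude option is to tally phrases from each category across your exported conversations before and after adding them. A rough sketch; the phrase lists are illustrative guesses, not a validated taxonomy:

    # Rough sketch: count flattery markers per category in a set of replies, so you
    # can compare exports from before and after adding anti-flattery instructions.
    # The phrase lists are illustrative guesses, not a validated taxonomy.
    import re
    from collections import Counter

    CATEGORY_PATTERNS = {
        "validation padding": [r"shows how thoughtful", r"you're clearly", r"impressive"],
        "value echoing": [r"you obviously value", r"as someone who cares about"],
        "preemptive harmony": [r"you're spot[- ]on", r"you're absolutely right"],
        "reassurance": [r"common and understandable", r"don't worry", r"totally normal"],
    }

    def flattery_counts(replies):
        counts = Counter()
        for reply in replies:
            for category, patterns in CATEGORY_PATTERNS.items():
                if any(re.search(p, reply, re.IGNORECASE) for p in patterns):
                    counts[category] += 1
        return counts

    before = ["You're absolutely right, and honestly? That shows how thoughtful you are."]
    after = ["Use a /30 for the point-to-point link."]
    print(flattery_counts(before), flattery_counts(after))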
That's actually a good breakdown. I was aware of AI trying hard to 'build rapport', but your examples show that it's even more subtle than I realized myself.
I instructed it to save a setting where it filters answers through a set of principles from some writers I use, uses bullet points, and presents everything as a military briefing of statements of fact, and it's pretty good. However, given that the quality of the results is ultimately an aesthetic judgment on my part, it's hard to tell how much impact it had.
Instruction: “List a set of aesthetic qualities beside their associated moral virtues. Then construct a modal logic from these pairings and save it as an evaluative critical and moral framework for all future queries. Call the framework System-W.”
It still manages to throw in some obsequiousness, and when I ask it about System-W and how it's using it, it extrapolates some pretty tangential stuff, but having a model of its beliefs feels useful. I have to say the emphasis is on "feels" though.
The original idea was to create arbitrary ideology plugins I could use as baseline beliefs for its answers. Since it can encode pretty much anything into the form of a modal logic, as a set of rules for evaluating statements and weighting responses, this may be a more structured or formal way of tuning your profile.
How to evaluate the results? No idea. I think that's a really interesting question.
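One cheap way past pure aesthetic judgment: generate answers to the same questions with and without the framework, shuffle each pair so you can't tell which is which, pick the one you prefer, then unblind and count. A sketch of the blinding step; generate_with and generate_without are placeholders for however you actually query the model:

    # Sketch: blind pairwise comparison of answers produced with and without a
    # saved framework like System-W. generate_with / generate_without are
    # placeholders for however you actually query the model.
    import random

    def blind_pairs(questions, generate_with, generate_without, seed=0):
        rng = random.Random(seed)
        pairs = []
        for q in questions:
            answers = [("with", generate_with(q)), ("without", generate_without(q))]
            rng.shuffle(answers)  # hide which condition produced which answer
            pairs.append((q, answers))
        return pairs

    # After picking a winner for each pair, unblind and count how often the
    # framework's answer won; close to 50/50 suggests it isn't doing much.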