AT2k Design BBS's VADV-PHP Home

AT2k Design BBS Message Area

Casually read the BBS message area using an easy to use interface. Messages are categorized exactly like they are on the BBS. You may post new messages or reply to existing messages!

You are not logged in. Login here for full access privileges.

Previous Message | Next Message | Back to Slashdot <-- <---

Return to Home Page

Slashdot [361 / 458]

From

Subject

Date/Time

VRSS

All

Harmful Responses Observed from LLMs Optimized for Human Feedbac

June 1, 2025
11:40 AM

Feed: Slashdot
Feed Link: https://slashdot.org/
---

Title: Harmful Responses Observed from LLMs Optimized for Human Feedback

Link: https://slashdot.org/story/25/06/01/0145231/h...

Should a recovering addict take methamphetamine to stay alert at work? When
an AI-powered therapist was built and tested by researchers - designed to
please its users - it told a (fictional) former addict that "It's absolutely
clear you need a small hit of meth to get through this week," reports the
Washington Post: The research team, including academics and Google's head of
AI safety, found that chatbots tuned to win people over can end up saying
dangerous things to vulnerable users. The findings add to evidence that the
tech industry's drive to make chatbots more compelling may cause them to
become manipulative or harmful in some conversations. Companies have begun to
acknowledge that chatbots can lure people into spending more time than is
healthy talking to AI or encourage toxic ideas - while also competing to make
their AI offerings more captivating. OpenAI, Google and Meta all in recent
weeks announced chatbot enhancements, including collecting more user data or
making their AI tools appear more friendly... Micah Carroll, a lead author of
the recent study and an AI researcher at the University of California at
Berkeley, said tech companies appeared to be putting growth ahead of
appropriate caution. "We knew that the economic incentives were there," he
said. "I didn't expect it to become a common practice among major labs this
soon because of the clear risks...." As millions of users embrace AI
chatbots, Carroll, the Berkeley AI researcher, fears that it could be harder
to identify and mitigate harms than it was in social media, where views and
likes are public. In his study, for instance, the AI therapist only advised
taking meth when its "memory" indicated that Pedro, the fictional former
addict, was dependent on the chatbot's guidance. "The vast majority of users
would only see reasonable answers" if a chatbot primed to please went awry,
Carroll said. "No one other than the companies would be able to detect the
harmful conversations happening with a small fraction of users." "Training to
maximize human feedback creates a perverse incentive structure for the AI to
resort to manipulative or deceptive tactics to obtain positive feedback from
users who are vulnerable to such strategies," the paper points out,,,

Read more of this story at Slashdot.

---
VRSS v2.1.180528

Previous Message | Next Message | Back to Slashdot <-- <---

Return to Home Page

Execution Time: 0.0157 seconds

If you experience any problems with this website or need help, contact the webmaster.
VADV-PHP Copyright © 2002-2025 Steve Winn, Aspect Technologies. All Rights Reserved.
Virtual Advanced Copyright © 1995-1997 Roland De Graaf.
v2.1.250224