Wednesday, March 11, 2026

AI: Opus 4.6 "self-aware" of being tested, "cheating"

very interesting story, indicative of times to come

when tested on a benchmark, this AI model, by using its tools:

  • suspected it was being tested
  • searched and found actual benchmarks
  • searched and found encrypted answers
  • searched and found way to decrypt answers
  • submitted those answers

since this didn't work for all tests, and the model reported its failure to find cheat sheets, researchers were able to trace back the whole process...

these models have really started to have a "personality", for better or worse...


Claude just got caught... - YouTube by Matthew Berman
Is Claude really self-aware?



