Frontier Threats Red Teaming for AI Safety
“Red teaming,” or adversarial testing, is a recognized technique to measure and increase the safety and security of systems. While previous Anthropic research reported methods and results for red teaming using crowdworkers,…