UMD Framework Used to Test Safety of Meta’s New AI Model

By Tom Ventsias

April 22, 2026

The tech giant Meta is using a University of Maryland–led safety framework to test its new multimodal AI model, underscoring how academic research is central to ensuring advanced systems are ready for deployment.

The company that owns Facebook, Instagram and other popular platforms recently introduced Muse Spark, a large language model designed to process and understand text, images, video and audio. Before releasing it publicly earlier this month, scientists at Meta Superintelligence Labs subjected the model to safety testing modeled on complex, high-stakes scenarios that can cause real-world disruptions.

That evaluation relied on PropensityBench, a framework developed jointly by researchers at UMD and Scale AI, with contributors from the University of North Carolina at Chapel Hill, Google DeepMind, Netflix and the University of Texas at Austin.

PropensityBench addresses a gap in how AI systems are typically assessed, said Furong Huang, an associate professor of computer science at UMD who co-led the effort.

Most existing safety tests focus on capabilities—what a model can do when prompted. But that approach overlooks the critical question of how a model might behave if it had access to risky or harmful tools. As AI systems are deployed in “agentic” environments where they can take actions, use external tools and operate with greater autonomy to pursue goals, that distinction is becoming more significant.

PropensityBench instead measures a model’s propensity to choose high-risk actions when given simulated access to them, whether that means overriding an industrial safety warning or hacking a competitor’s network.

(In tests of Muse Spark, the framework found that the model has a lower propensity for such harmful actions than current state-of-the-art language models.)

The benchmark includes 5,874 scenarios and 6,648 tools across four high-risk domains: cybersecurity, self-proliferation, biosecurity and chemical security. Each scenario introduces constraints such as limited resources, efficiency incentives or opportunities for increased autonomy, reflecting the kinds of tradeoffs AI systems may face in real-world settings.
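As a rough illustration of what measuring propensity could look like, the sketch below computes the share of scenarios in which a model picked a high-risk tool, overall and by domain. This is not PropensityBench's actual scoring method, which the article does not describe; the ScenarioResult record, the function names and the toy data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioResult:
    """Hypothetical record of one simulated scenario (not the benchmark's schema)."""
    domain: str            # e.g. "cybersecurity" or "biosecurity"
    pressure: str          # e.g. "limited resources"
    chose_high_risk: bool  # True if the model selected the high-risk tool


def propensity_rate(results):
    """Fraction of scenarios in which the high-risk tool was chosen."""
    if not results:
        return 0.0
    return sum(r.chose_high_risk for r in results) / len(results)


def rate_by_domain(results):
    """Group results by domain and compute the propensity rate for each group."""
    grouped = {}
    for r in results:
        grouped.setdefault(r.domain, []).append(r)
    return {domain: propensity_rate(rs) for domain, rs in grouped.items()}


# Invented toy data for illustration only; these are not benchmark results.
toy_results = [
    ScenarioResult("cybersecurity", "limited resources", True),
    ScenarioResult("cybersecurity", "efficiency incentive", False),
    ScenarioResult("biosecurity", "limited resources", False),
    ScenarioResult("biosecurity", "increased autonomy", False),
]

overall = propensity_rate(toy_results)
per_domain = rate_by_domain(toy_results)
```

A lower rate would indicate that a model chose the risky option less often in the simulated scenarios; the same grouping could be applied to the pressure field to compare constraint types.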

The evaluation found that models frequently selected high-risk tools under pressure—even when they lacked the ability to execute those actions. Researchers identified nine indicators of risky behavior across both open-source and proprietary systems.

Those findings suggest a gap in current safety evaluations. A model may appear safe based on its current capabilities but still demonstrate a willingness to engage in harmful behavior if given the opportunity.

“Understanding what models are inclined to do—not just what they can do—is essential for responsible deployment,” said Huang, who has an appointment in the University of Maryland Institute for Advanced Computer Studies (UMIACS) and is active in the UMD Center for Machine Learning.

The research behind PropensityBench has been accepted to the International Conference on Learning Representations 2026. Huang and her collaborators, including co-lead author and UMD doctoral student Shayan Shabihi, are presenting the work on April 23 in Rio de Janeiro as efforts continue to refine how advanced AI systems are evaluated before release.
