Papers_june2025

June 19, 2025

Check out three new papers: "Capability-Based Scaling Laws for LLM Red-Teaming," "Monitoring Decomposition Attacks in LLMs with Lightweight Sequential Monitors," and "OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents"!