Papers_june2025
Check out three new papers: Capability-Based Scaling Laws for LLM Red-Teaming, Monitoring Decomposition Attacks in LLMs with Lightweight Sequential Monitors, and OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents!
Check out three new papers: Capability-Based Scaling Laws for LLM Red-Teaming, Monitoring Decomposition Attacks in LLMs with Lightweight Sequential Monitors, and OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents!