Microsoft research reveals that leading LLMs, including top-tier frontier models, degrade document integrity by up to 50% during long-horizon editing tasks. These models suffer from catastrophic single-round failures that often preserve document structure while silently corrupting the actual content, making errors dangerously difficult to detect.
Topics: Artificial Intelligence, LLM Reliability, Microsoft, Productivity Tools, Software Engineering