THREAT ASSESSMENT: Pre-Deployment Access to Unrestricted Frontier AI Models Raises Escalation and Exploitation Risks

The U.S. government now evaluates unreleased frontier models in classified settings with reduced safety constraints. Capability signals are rising, but adoption protocols remain inconsistent across agencies.
Bottom Line Up Front: The U.S. government's expanded access to pre-deployment frontier AI models, often with safeguards removed, presents a high-impact national security opportunity but introduces critical risks of model leakage, misuse, or adversarial exploitation if controls fail.

Threat Identification: CAISI's agreements with Google DeepMind, Microsoft, and xAI enable pre- and post-deployment evaluations of advanced AI systems, including those with reduced or removed safety constraints, in classified environments. These models, some of which remain unreleased, may possess capabilities that could be weaponized if exposed or misused [1].

Probability Assessment: The likelihood of accidental exposure or an insider threat leading to model compromise is moderate (50–60%) over the next 12–18 months, given the increased number of evaluations (over 40 completed) and participation by interagency personnel through the TRAINS Taskforce. As model capabilities grow, so does the incentive for foreign actors to target these evaluation pipelines.

Impact Analysis: A breach could result in catastrophic consequences, including proliferation of dual-use capabilities, autonomous disinformation at scale, or acceleration of adversarial AI programs. The impact extends beyond national security to global AI stability, especially if unrestricted models are reverse-engineered or leaked [1].

Recommended Actions:
(1) Implement strict zero-trust protocols for all CAISI evaluation environments.
(2) Require mandatory red-teaming of all models prior to government access.
(3) Establish an independent audit mechanism for safeguard removal and data handling.
(4) Expand CAISI-TRAINS coordination to include export control and counterintelligence agencies.
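
To make the 50–60% estimate over 12–18 months easier to compare against other risk figures, it can be converted into an implied per-month probability. The sketch below is illustrative only: it assumes a constant, independent monthly chance of compromise, which is a simplification introduced here and not a claim made in the assessment, and the function name `implied_monthly_rate` is hypothetical.

```python
def implied_monthly_rate(cumulative_probability: float, months: int) -> float:
    """Constant per-month probability that compounds to the cumulative figure.

    Assumption (not from the assessment): each month carries the same,
    independent chance of compromise, so that
    1 - (1 - monthly) ** months == cumulative_probability.
    """
    if not 0.0 <= cumulative_probability < 1.0:
        raise ValueError("cumulative_probability must be in [0, 1)")
    if months <= 0:
        raise ValueError("months must be positive")
    return 1.0 - (1.0 - cumulative_probability) ** (1.0 / months)


# Evaluate the four corners of the stated range: 50-60% over 12-18 months.
for p in (0.50, 0.60):
    for months in (12, 18):
        rate = implied_monthly_rate(p, months)
        print(f"{p:.0%} over {months} months -> {rate:.1%} per month")
```

Under this simplifying assumption, the stated range corresponds to roughly a 4% to 7% chance of compromise in any given month. A real evaluation pipeline is unlikely to carry constant risk, so the figure is a rough reading aid rather than a forecast.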

Confidence Matrix:
- Threat Identification: High confidence (directly cited in source)
- Probability Assessment: Medium confidence (inferred from operational scale and precedent)
- Impact Analysis: High confidence (based on consensus in AI safety literature)
- Recommended Actions: Medium-high confidence (aligned with NIST AI RMF and EO 14110)

[1] National Institute of Standards and Technology. (2026, May 5). CAISI Signs Agreements Regarding Frontier AI National Security Testing With Google DeepMind, Microsoft and xAI. https://www.nist.gov

Dr. Raymond Wong Chi-Ming