
FedRAMP and AI Workloads: The Boundary Problem

FedRAMP works because the authorization boundary is a specific, drawn thing. The SSP describes it. The 3PAO walks it. The ATO grants permission to operate it. Changes to the boundary require a change request. Everything outside the boundary is out of scope, but also out of reach — the authorized system operates on data that does not leave.

An AI workload that calls an external foundation model provider violates that pattern on every inference.

What the model call actually does

When a system inside the boundary makes an API call to an external model provider, data leaves the boundary. The prompt, whatever it contains, is transmitted to the provider's infrastructure. The response is returned. The call completes. From the system's perspective, this is routine. From the boundary's perspective, it is a data transfer to a system that has not been authorized under the same ATO.

The model provider may itself hold a FedRAMP authorization. Some do, at Moderate; fewer at High. When it does, the transfer can potentially be structured as a boundary extension, with the appropriate agreements and shared responsibility matrix documented in the SSP. When it does not, the transfer is not compliant, full stop, no matter how secure the provider's infrastructure actually is.

This is the boundary problem. Most AI architectures were designed as if the provider were a commodity service. FedRAMP does not treat providers as commodity services. FedRAMP treats them as systems that are either authorized or not.

The architectural choices

There are a limited number of compliant patterns.

Use a FedRAMP-authorized provider at the appropriate impact level, with the provider's authorization documented in your SSP and the shared responsibility model understood. This is the cleanest path when available and when the authorized provider covers the model class you need.

Deploy the model inside your own boundary. Open-weight models that can be self-hosted on authorized infrastructure avoid the boundary problem entirely, at the cost of losing access to the frontier commercial models and taking on the operational burden of hosting.

Architect a hybrid where sensitive workloads run against boundary-internal models and non-sensitive workloads can use external providers under a separate, lower-impact authorization. This requires a clean data-classification layer and a routing decision that fails closed: under error conditions, ambiguity, or misconfiguration, a request must land on the boundary-internal model, never on the external provider.
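A minimal sketch of that fail-closed routing decision, in Python. All names here are hypothetical (the endpoint labels, the `Sensitivity` enum); the point is the shape of the default branch, which sends anything unclassified or unexpected to the boundary-internal model rather than outward.

```python
from enum import Enum
from typing import Optional


class Sensitivity(Enum):
    SENSITIVE = "sensitive"
    NON_SENSITIVE = "non_sensitive"


# Hypothetical endpoints: only the internal model sits inside the
# authorization boundary; the external provider does not share the ATO.
INTERNAL_MODEL = "internal/self-hosted-model"
EXTERNAL_MODEL = "external/commercial-provider"


def route(classification: Optional[Sensitivity]) -> str:
    """Pick a model endpoint for a request.

    Fails closed: a request leaves the boundary only on an explicit,
    positive NON_SENSITIVE classification. Sensitive, missing, or
    unrecognized classifications all stay inside the boundary.
    """
    if classification is Sensitivity.NON_SENSITIVE:
        return EXTERNAL_MODEL
    # SENSITIVE, None, or any unexpected value: stay inside the boundary.
    return INTERNAL_MODEL
```

Note the inversion relative to an allowlist-of-sensitive-data design: the external route is the special case that must be affirmatively earned, so a classifier outage or a null label degrades to the compliant path instead of the leaking one.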

Treat the AI feature as out of scope for the authorized system, meaning it is simply not available to authorized workloads. Acceptable for some programs. Not a viable product story for most commercial CSPs.

None of these paths is easy. All of them are substantially more work than "add an API key and call the model." Most mid-market CSPs pursuing authorization are still discovering that the AI features they shipped against their commercial product cannot be carried into the FedRAMP product without rearchitecting.

What to build against now

If you are a CSP pursuing authorization, the AI boundary decision is load-bearing and needs to be made early. Pushing it to the end of the authorization process means rearchitecting features that were built against assumptions that do not hold.

The specific decisions to make early:

  • Which AI capabilities are actually required for the authorized product, versus nice-to-have features carried from the commercial product.
  • For each capability, whether the model can be self-hosted or requires a specific provider.
  • For each provider, whether that provider holds an authorization at the impact level you need.
  • What data classification and routing layer ensures sensitive workloads never reach unauthorized providers, even under error conditions or administrator override.

These decisions are architectural. They are the kind of decision the FedRAMP Management Engine is designed to surface early — showing where the authorization boundary actually runs in the code, not where the architecture diagram says it runs.

The AI boundary problem is solvable. It is not solvable late, and it is not solvable by pretending the model provider is inside a boundary that has not been extended to include it.