Is on-device AI better than cloud AI?

Neither one wins outright. On-device AI is better for privacy, latency, offline use, and per-request cost, because nothing leaves the phone and there is no server bill per call. Cloud AI is better for raw capability, since a data-center model is far larger than anything that fits on a phone, and it runs the same way on every device instead of only on recent high-end hardware. A lot of 2026 apps use both: an on-device model for the fast, private, everyday work, and a cloud model for the heavy requests. The right choice depends on the feature, not on a rule.

What is Apple's Foundation Models framework?

Foundation Models is the framework Apple introduced at WWDC25 in June 2025 that lets an app use Apple's on-device model through a Swift API. It ships in iOS 26, iPadOS 26, and macOS 26 on Apple Intelligence-capable devices, and it exposes Apple's roughly 3-billion-parameter on-device model. The API includes guided generation through the @Generable macro, which gets structured output back as Swift types, plus tool calling so the model can invoke functions in your app. Because it runs on the device, the text never leaves the phone and there is no per-request server cost.

Can a React Native or Expo app use on-device AI?

Yes, but not from plain JavaScript. Apple's Foundation Models framework and Android's ML Kit GenAI APIs have no JavaScript API, so a React Native or Expo app reaches them through a native module or a config plugin that bridges the native call into JavaScript. That also means it does not run in Expo Go: you run npx expo prebuild and make a development or EAS build that includes the native code. Community packages exist that wrap these frameworks, so you may not have to write the bridge yourself, but you do need a custom development build either way.

Why does on-device AI need a development build instead of Expo Go?

Expo Go is a prebuilt app that only contains the native code Expo ships with it, and the on-device AI frameworks are not part of that bundle. To use them you have to add native code, either a native module you write or a community package with a config plugin, and then compile that into the binary yourself. That is what npx expo prebuild and a development or EAS build do: they generate the native projects, link the extra native code, and produce an app that actually has the on-device AI framework inside it. Expo Go cannot load native code it was not built with, so a development build is the only path.

Does Newly support on-device AI out of the box?

Not as a turnkey toggle, and it would be dishonest to claim otherwise. Newly's own built-in AI chatbot feature is cloud-based. The honest advantage is the code: Newly generates a React Native and Expo codebase you own, so you are free to add a native module for Apple's Foundation Models or Android's Gemini Nano and ship it in a development build, rather than being boxed in by a closed builder's built-in AI. The caveat is the same as for any React Native app: on-device access needs a custom native module plus a development build, not Expo Go, and for many features a cloud model through the bundled backend is simpler. So the path is open, not automatic.

Articles · Guide · Updated June 2026

On-device AI mobile apps.
The 2026 explainer.

What is Gemini Nano and which Android phones run it?

Gemini Nano is Google's on-device model for Android. It runs inside the AICore system service, and apps reach it through ML Kit's GenAI APIs, which today cover Summarization, Proofreading, Rewriting, and Image Description, with a Prompt API in alpha. The catch is hardware: Gemini Nano only runs on supported high-end devices, such as recent Pixel and Samsung Galaxy phones, not on every Android phone. So an Android app that wants on-device AI has to check whether the feature is available on the device and fall back to a cloud model or a plain UI when it is not.

On-device AI runs the model on the phone itself, not in the cloud. In 2026 that became real for ordinary apps, with Apple and Google both shipping on-device models you can call from your own code. This guide covers what on-device AI mobile apps are, where they beat the cloud and where they do not, and what it takes to reach them from React Native and Expo.


The short answer

What on-device AI actually means.

On-device AI means the model runs on the phone’s own chip rather than on a server. You send it text, it answers locally, and the input never leaves the device. That buys you three things a cloud call cannot: privacy by default, no network round trip, and a feature that still works with no signal.

As of 2026, this is no longer a research demo. Apple ships its Foundation Models framework on iOS 26, and Google ships Gemini Nano on supported Android phones, so apps can use a built-in on-device model through native code. The honest limit is size: a model small enough to fit on a phone is good at focused jobs like summarizing, rewriting, or classifying text, while a large open-ended request still runs better in the cloud. So most on-device AI mobile apps in 2026 mix the two, using the local model for the fast, private work and a cloud model for the heavy lifting.

On-device vs cloud

The tradeoffs, side by side.

On-device and cloud are not rivals so much as different tools. Here is how they compare on the things that decide which one a feature should use.

Factor	On-device AI	Cloud AI
Privacy	Input stays on the phone; nothing is sent to a server	Requests travel to a server, so data leaves the device
Latency	No network round trip, so responses start fast	Adds a round trip; speed depends on the connection
Cost	No per-request server bill once the app is built	You pay per call, and it scales with usage
Offline	Works with no signal, on a plane or underground	Needs a connection; fails or stalls without one
Capability	A small model good at focused, scoped jobs	A far larger model for open-ended, heavy requests
Device reach	Only recent, capable hardware; needs a fallback	Runs the same on every phone, old or new

The pattern most teams land on: on-device for the private, instant, everyday work, and cloud for the requests that need a bigger model.
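That pattern can be written down as a small routing rule. The sketch below is only one way to do it, under stated assumptions: `Backend`, `FeatureRequest`, and `pickBackend` are hypothetical names, and the three inputs are simplified stand-ins for whatever your app actually knows about the request and the device.

```typescript
// Hypothetical names throughout; this is an illustration, not a library API.
type Backend = "on-device" | "cloud";

interface FeatureRequest {
  // True for scoped jobs such as summarizing, rewriting, or classifying text.
  focusedTask: boolean;
  // True when the phone currently has a network connection.
  online: boolean;
  // True when the phone reports a usable on-device model.
  onDeviceAvailable: boolean;
}

// Returns the backend to use, or null when neither can serve the request.
function pickBackend(req: FeatureRequest): Backend | null {
  // Focused work goes to the local model whenever the device has one.
  if (req.focusedTask && req.onDeviceAvailable) {
    return "on-device";
  }
  // Heavy requests, and phones without a local model, go to the cloud.
  if (req.online) {
    return "cloud";
  }
  // Offline with a local model: best effort on the device.
  if (req.onDeviceAvailable) {
    return "on-device";
  }
  // Offline and no local model: the caller should show a plain UI.
  return null;
}
```

A null result is the cue to show a plain screen, which is the same fallback the table's "Device reach" row calls for.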

Apple

The Foundation Models framework.

Apple introduced the Foundation Models framework at WWDC25 in June 2025. It gives apps a Swift API to Apple’s on-device model, a model of roughly 3 billion parameters, and it ships in iOS 26, iPadOS 26, and macOS 26 on Apple Intelligence-capable devices. This is the path for on-device AI on iOS, and it is built into the system, so you are not bundling a model of your own.

The API is built for app developers, not just researchers. Guided generation, through the @Generable macro, hands you structured output as real Swift types instead of a blob of text you have to parse. Tool calling lets the model invoke functions you define, so it can pull in your app’s data or take an action mid-response. All of it runs on the device, so the text stays on the phone and there is no per-request server cost.
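Seen from a React Native app, structured output would reach JavaScript as plain data passed over the bridge, so the JavaScript side may want its own check that the shape is what it expects. The sketch below is not Apple's API. `ReviewSummary` and `isReviewSummary` are hypothetical, and a matching @Generable struct on the Swift side is assumed rather than shown.

```typescript
// Hypothetical shape; a matching @Generable Swift struct is assumed to exist.
interface ReviewSummary {
  sentiment: "positive" | "neutral" | "negative";
  keyPoints: string[];
}

// Type guard: confirms a value received from native code has the expected shape.
function isReviewSummary(value: unknown): value is ReviewSummary {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const record = value as Record<string, unknown>;
  const sentiment = record["sentiment"];
  const points = record["keyPoints"];
  const sentimentOk =
    sentiment === "positive" || sentiment === "neutral" || sentiment === "negative";
  const pointsOk =
    Array.isArray(points) && points.every((p: unknown) => typeof p === "string");
  return sentimentOk && pointsOk;
}
```

With a guard like this, the rest of the app works with a typed object rather than parsing free text.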

Ships in iOS 26, iPadOS 26, and macOS 26 on Apple Intelligence-capable devices
A Swift API to Apple's roughly 3-billion-parameter on-device model
Guided generation with the @Generable macro returns structured Swift types
Tool calling lets the model invoke functions you define in the app


Android

Gemini Nano and ML Kit GenAI.

On Android, the on-device model is Gemini Nano. It runs inside the AICore system service, and apps do not poke at it directly. Instead you reach it through ML Kit’s GenAI APIs, which today cover Summarization, Proofreading, Rewriting, and Image Description, with a Prompt API in alpha for more open-ended use. So the common text jobs come as ready-made APIs rather than something you prompt from scratch.

The catch is hardware. Gemini Nano only runs on supported high-end devices, such as recent Pixel and Samsung Galaxy phones, not on every Android phone in the wild. That makes availability checks and a fallback non-optional: your app asks whether the feature is ready on this device, and routes to a cloud model or a plain screen when it is not. Plan for both, or the feature simply will not exist for a large share of users.
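The check-then-fall-back flow might look like the sketch below on the JavaScript side. Everything here is an assumption for illustration: the `ModelStatus` values, the `Summarizer` and `OnDeviceSummarizer` interfaces, and `summarizeWithFallback` are hypothetical, and the real ML Kit status names may differ.

```typescript
// Illustrative status values; the real ML Kit status names may differ.
type ModelStatus = "available" | "downloadable" | "unavailable";

interface Summarizer {
  summarize(text: string): Promise<string>;
}

// Hypothetical wrapper around a native on-device summarizer.
interface OnDeviceSummarizer extends Summarizer {
  checkStatus(): Promise<ModelStatus>;
}

interface SummaryResult {
  source: "on-device" | "cloud" | "none";
  summary: string | null;
}

async function summarizeWithFallback(
  text: string,
  local: OnDeviceSummarizer,
  cloud: Summarizer | null
): Promise<SummaryResult> {
  // Treat a failed status check the same as an unsupported device.
  const status = await local.checkStatus().catch((): ModelStatus => "unavailable");
  if (status === "available") {
    try {
      return { source: "on-device", summary: await local.summarize(text) };
    } catch {
      // The local call failed; fall through to the cloud path.
    }
  }
  if (cloud !== null) {
    return { source: "cloud", summary: await cloud.summarize(text) };
  }
  // No model anywhere: the caller renders a plain screen.
  return { source: "none", summary: null };
}
```

Returning the `source` alongside the text lets the UI say where the answer came from, which matters if the feature promises that input stays on the phone.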

Gemini Nano runs in the AICore system service on the device
Reached through ML Kit's GenAI APIs, not a raw model handle
Summarization, Proofreading, Rewriting, and Image Description, plus a Prompt API in alpha
Only on supported high-end devices, like recent Pixel and Samsung Galaxy phones


React Native and Expo

Reaching it from React Native and Expo.

Here is the part that trips people up. Both platform frameworks are native, so a React Native or Expo app does not call them from JavaScript. Four points cover what it actually takes.

1. There is no JavaScript API
Apple's Foundation Models framework is Swift, and Android's ML Kit GenAI APIs are Kotlin and Java. Neither one exposes a JavaScript entry point, so a React Native or Expo app cannot call them straight from the JavaScript layer. The model lives behind native code on both platforms.
2. You bridge it with a native module
To reach the framework you write a native module, or use a community package that ships one, plus a config plugin so Expo wires the native pieces in during the build. The module calls the native API and returns the result to your JavaScript, so the rest of the app sees a normal async function.
3. It does not run in Expo Go
Expo Go only contains the native code Expo ships with it, and these frameworks are not in that bundle. So you run npx expo prebuild and make a development build or an EAS build that compiles the native module in. From then on the on-device model is inside your binary and you test on a real, capable device.
4. You handle availability and fallback
On-device AI is not on every phone. iOS needs an Apple Intelligence-capable device, and Gemini Nano needs supported high-end Android hardware. The native module checks whether the feature exists on the device, and your code falls back to a cloud model or a plain screen when it does not, so older phones still get a working app.
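The four points above can be sketched from the JavaScript side as a thin wrapper around the native module. This is a minimal sketch under assumptions: the `OnDeviceAINativeModule` interface, its two methods, and `createOnDeviceAI` are hypothetical, and no particular community package is implied. In a real Expo app the module object would typically be obtained with `requireNativeModule` from `expo-modules-core`. Here it is passed in so the sketch stays self-contained.

```typescript
// Shape of a hypothetical native module written in Swift and Kotlin.
// The method names are assumptions made for this example.
interface OnDeviceAINativeModule {
  isAvailable(): Promise<boolean>;
  generate(prompt: string): Promise<string>;
}

// Wraps the native module so the rest of the app sees normal async functions.
// Pass null when the module is not compiled into the binary (for example, Expo Go).
function createOnDeviceAI(native: OnDeviceAINativeModule | null) {
  return {
    async isAvailable(): Promise<boolean> {
      if (native === null) {
        return false;
      }
      try {
        return await native.isAvailable();
      } catch {
        // A failing check is treated as "not available on this device".
        return false;
      }
    },
    async generate(prompt: string): Promise<string> {
      if (native === null) {
        throw new Error("On-device AI module is not linked into this build");
      }
      return native.generate(prompt);
    },
  };
}
```

Callers would check `isAvailable()` first and route to a cloud model or a plain screen when it resolves to false, so the same code path works in Expo Go, on older phones, and on capable devices.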

The one-line version: prebuild, then a development build

On-device AI has no JavaScript API, so you bridge it with a native module or a community package, run npx expo prebuild, and ship a development or EAS build. It does not work in Expo Go, because Expo Go cannot load native code it was not compiled with.

Where Newly fits

Where Newly fits in all this.

Newly is a paid AI mobile-app builder, from $25 a month. You describe the app in plain language and it generates a React Native and Expo codebase that you own. Newly’s own built-in AI chatbot feature is cloud-based, and that is the honest starting point: there is no on-device toggle to flip.

The advantage is the code you walk away with. Because you own a real React Native and Expo project rather than rows in a closed builder, you can add a native module for Apple’s Foundation Models or Android’s Gemini Nano yourself and ship it in a development build. A closed builder with one built-in AI cannot offer that, since you never get to the native layer. So on-device AI mobile apps built on a Newly codebase are possible in a way they are not on a locked platform.

The caveat is the same one this whole guide has been making. On-device access needs a custom native module and a development build, not Expo Go, and that is real work. For many features a cloud model, reached through the backend bundled with every project, is simpler, cheaper to wire up, and runs on every phone. Treat on-device as a deliberate choice for the features that benefit from privacy, speed, or offline use, not a default for everything.

Start with the cloud chatbot, keep the on-device door open

A practical route: ship the built-in cloud AI chatbot first, since it works on every device with no native build, then add an on-device model later for the features that need it. Both live in the same code you own.


FAQ

On-device AI mobile apps, answered.

What are on-device AI mobile apps?

On-device AI mobile apps run an AI model directly on the phone instead of sending requests to a server. The model lives on the device, so the work happens on the phone's own chip. That keeps the input private, removes the network round trip, and lets a feature keep working with no signal. In 2026 the two big platform options are Apple's Foundation Models framework on iOS and Gemini Nano on Android, both reachable through native code. The tradeoff is capability: a small model that fits on a phone handles focused jobs like summarizing or rewriting text well, while large, open-ended reasoning still tends to run better in the cloud.


Build it on code you own.

From $25 a month, Newly generates a real React Native and Expo app from a prompt. Ship the built-in cloud AI today, and because the codebase is yours, add an on-device model later when a feature calls for it.

