Flower Intelligence
An Open-Source AI Platform to Run LLMs Locally in Your App or Remotely on Flower Confidential Remote Compute.
Start Building

Thunderbird Assist
Powered by Flower Intelligence
“Our 20 million users expect data privacy from every feature we build. Flower Intelligence allows us to ship on-device AI that works locally with the most sensitive data.”

Ryan Sipes
Managing Director, Product, Mozilla Thunderbird
Cloud-only or local-only AI is too limited
AI today faces a trade-off:
- Run LLMs in the cloud: powerful, but slow, unavailable offline, and off-limits for sensitive data.
- Run LLMs on the device: fast and privacy-preserving, but feasible only on modern devices.
Flower Intelligence: Local-first AI with Confidential Remote Compute
Flower Intelligence prioritizes on-device AI for speed, privacy and offline use. When extra power is needed, Flower Confidential Remote Compute steps in as a seamless private extension of the device, without compromising privacy, security or performance.
This hybrid approach delivers the best of both worlds: local-first AI that remains powerful, private and compatible with all devices.
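The local-first pattern described above can be sketched as follows. This is an illustrative sketch only, not the Flower Intelligence API: the function names and the fallback condition are assumptions, showing the general idea of preferring on-device inference and handing off to Confidential Remote Compute only when needed.

```typescript
// Illustrative sketch of the local-first pattern (not the Flower SDK API).
type ChatFn = (prompt: string) => Promise<string>;

async function localFirstChat(
  prompt: string,
  runLocal: ChatFn,  // on-device inference: fast, private, works offline
  runRemote: ChatFn, // confidential remote compute: used only as fallback
): Promise<string> {
  try {
    // Prefer the on-device model for speed and privacy.
    return await runLocal(prompt);
  } catch {
    // Hand off to confidential remote compute when local inference is
    // unavailable (e.g. the model is too large for the device).
    return await runRemote(prompt);
  }
}
```

In the real SDK, this routing happens behind a single `chat` call; the sketch only makes the decision logic explicit.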
Why Flower Intelligence?
Run your favorite LLM locally on phones, tablets and laptops. Larger models can run remotely in the Flower Confidential Remote Compute service. Upcoming features include local and federated fine-tuning to improve LLMs using local user data.

Get Started
Visit the Docs to learn more
```typescript
import { FlowerIntelligence } from '@flwr/flwr';

const fi = FlowerIntelligence.instance;

const response = await fi.chat({
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Why is the sky blue?' },
  ],
});

console.log(response.message.content);
```

Supports your favorite models
Flower Intelligence runs your favorite LLMs locally on-device or remotely on Flower Confidential Remote Compute (early access preview).
More models coming soon
| Model | On Device (TypeScript) | On Device (Swift) | Confidential Remote Compute |
|---|---|---|---|
| LLaMA 3.2 1B (Meta) | | | |
| LLaMA 3.2 3B (Meta) | | | |
| LLaMA 3.1 8B (Meta) | | | |
| SmolLM2 135M (HuggingFace) | | | |
| SmolLM2 360M (HuggingFace) | | | |
| SmolLM2 1.7B (HuggingFace) | | | |
| DeepSeek-R1 (Distill-Llama-8B) | | | |
| LLaMA 3.3 70B (Meta) | | | |
| Mistral Small 3 | | | |
| Qwen 235B | | | |
Flower Intelligence Pilot Program
Apply now to get personalized support from the Flower team and Early Access to Flower Confidential Remote Compute.
Flower Intelligence
Private Inference API via Flower Confidential Remote Compute
What's included
- Confidential Compute
- End-to-End Encryption
- Isolated Private Inference
- No Logs
Usage-Based Billing
Detailed rates
| Model Tier | Model Size | Standard | Confidential Compute |
|---|---|---|---|
| Tier 1 | <4B | $0.10 | $0.20 |
| Tier 2 | 4-16B | $0.20 | $0.40 |
| Tier 3 | 16-48B | $0.90 | $1.80 |
| Tier 4 | 48-100B | $1.20 | $2.40 |
| Tier 5 | 100-450B | $3.00 | $6.00 |
| Qwen 235B | 235B | $0.22 input, $0.88 output | $0.44 input, $1.76 output |
Pricing per million tokens, by model size and compute type.
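As a worked example of the rate table above, the helper below computes the cost of a request from a tier's per-million-token price. The tier names and rates are taken directly from the table; the function itself is illustrative and not part of any Flower SDK.

```typescript
// Rates in USD per million tokens, copied from the pricing table above.
const RATES_PER_M_TOKENS = {
  tier1: { standard: 0.10, confidential: 0.20 }, // <4B
  tier2: { standard: 0.20, confidential: 0.40 }, // 4-16B
  tier3: { standard: 0.90, confidential: 1.80 }, // 16-48B
  tier4: { standard: 1.20, confidential: 2.40 }, // 48-100B
  tier5: { standard: 3.00, confidential: 6.00 }, // 100-450B
} as const;

// Cost in USD for a given token count, tier, and compute type.
function requestCost(
  tier: keyof typeof RATES_PER_M_TOKENS,
  tokens: number,
  confidential: boolean,
): number {
  const rate =
    RATES_PER_M_TOKENS[tier][confidential ? 'confidential' : 'standard'];
  return (tokens / 1_000_000) * rate;
}
```

For example, 2 million tokens on a Tier 2 model with Confidential Compute cost 2 × $0.40 = $0.80.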