
Phi
What is Phi
Small-model efficiency focus
Developer-friendly distribution
Strong baseline for prototyping
Capability ceiling vs larger LLMs
Licensing and usage constraints
Operational gaps vs hosted platforms
Plan & Pricing
Pricing model: Pay-as-you-go
Free tier/trial: Free access available for real-time deployment via Microsoft Foundry and Hugging Face; Azure offers a free account ($200 credit for 30 days) that can be used to experiment with Foundry models.
Example costs (selected models; prices are per 1,000 tokens unless noted):
- Phi-3-mini (4K or 128K context) — Input: $0.00013 / 1,000 tokens; Output: $0.00052 / 1,000 tokens.
- Phi-3.5-mini (128K) — Input: $0.00013; Output: $0.00052.
- Phi-3-small (8K or 128K) — Input: $0.00015; Output: $0.00060.
- Phi-3-medium (4K or 128K) — Input: $0.00017; Output: $0.00068.
- Phi-3.5-MoE (128K) — Input: $0.00016; Output: $0.00064.
- Phi-4 (128K) — Input: $0.000125; Output: $0.0005.
- Phi-4-mini (128K) — Input: $0.000075; Output: $0.0003.
- Phi-4-multimodal (text & image, 128K) — Input: $0.00008; Output: $0.00032.
- Phi-4-multimodal (audio, 128K) — Input: $0.004; Output: $0.00032.
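Because the rates above are quoted per 1,000 tokens, a request's cost is the token count divided by 1,000 and multiplied by the matching rate, with input and output priced separately. The following is a minimal sketch of that arithmetic, not an official calculator: the function name `inference_cost` and the `RATES_PER_1K` table are hypothetical, the table copies only a few of the listed models, and the result ignores regional or contractual price differences.

```python
# Illustrative only: per-1,000-token rates in USD, copied from the list above.
# The table name and structure are assumptions made for this sketch.
RATES_PER_1K = {
    # model: (input rate, output rate)
    "Phi-3-mini": (0.00013, 0.00052),
    "Phi-4": (0.000125, 0.0005),
    "Phi-4-mini": (0.000075, 0.0003),
}


def inference_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the pay-as-you-go cost in USD for one workload."""
    input_rate, output_rate = RATES_PER_1K[model]
    # Rates are per 1,000 tokens, so scale the token counts down first.
    return (input_tokens / 1000) * input_rate + (output_tokens / 1000) * output_rate


if __name__ == "__main__":
    # Example workload: 1,000,000 input tokens and 200,000 output tokens.
    for name in RATES_PER_1K:
        print(f"{name}: ${inference_cost(name, 1_000_000, 200_000):.4f}")
```

For the example workload, Phi-4 comes to roughly $0.225 (about $0.125 for input plus $0.10 for output), which shows how the higher output rate can make up a large share of the bill even when output is a fraction of the input volume.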
Fine-tuning example costs (official figures):
- Training: $0.003 per 1,000 tokens (example shown for Phi-3/Phi-4 series).
- Hosting (fine-tuned model): $0.80 per hour.
- During fine-tuning, Input/Output usage pricing matches the inference input/output rates shown above (e.g., Phi-3-mini input $0.00013 / output $0.00052).
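The fine-tuning figures combine a per-token training charge with an hourly hosting charge, so a rough budget can be sketched as below. This is an assumption-laden illustration: the function name `fine_tuning_cost` is hypothetical, `training_tokens` is assumed to be the total number of tokens billed for training (how Microsoft counts them, for example across epochs, is not stated here), and inference usage on the hosted model is billed separately and is left out.

```python
# Illustrative only: rates in USD, copied from the fine-tuning list above.
TRAINING_RATE_PER_1K = 0.003   # per 1,000 training tokens
HOSTING_RATE_PER_HOUR = 0.80   # per hour the fine-tuned model is hosted


def fine_tuning_cost(training_tokens: int, hosting_hours: float) -> float:
    """Estimate training plus hosting cost in USD (inference usage excluded)."""
    training = (training_tokens / 1000) * TRAINING_RATE_PER_1K
    hosting = hosting_hours * HOSTING_RATE_PER_HOUR
    return training + hosting


if __name__ == "__main__":
    # Example: 2,000,000 billed training tokens, hosted for 30 days (720 hours).
    print(f"${fine_tuning_cost(2_000_000, 720):.2f}")
```

In this example the training charge is about $6 while hosting for 720 hours is about $576, so for small datasets the hourly hosting fee, not the training run, is likely to dominate the total.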
Discount options: The Microsoft Foundry pricing page and Azure purchasing options reference volume or commitment discounts and enterprise agreements (contact sales or request a quote). Provisioned throughput and reservation options for enterprises are available via Azure sales.
Notes & scope: All figures are taken from Microsoft Azure’s official product and pricing pages and Microsoft announcements (Azure product page, Foundry pricing, and Azure blog announcement). Prices are listed in USD as presented by Microsoft. Inference is billed per 1,000 tokens; fine-tuning is billed per 1,000 training tokens, plus hosting per hour. Actual charges may vary by region, purchase option, or agreement with Microsoft; see the Azure Foundry pricing page for region and currency specifics.