
Now Available: Optimized DeepSeek-R1 On GMI Cloud

GMI Cloud is happy to announce that we are hosting DeepSeek and its distilled models!

February 03, 2025

GMI Cloud is excited to announce that we are now hosting a dedicated DeepSeek-R1 inference endpoint on optimized, US-based hardware.

What's DeepSeek-R1? Read our initial takeaways here.

Technical details:

  • Model Provider: DeepSeek
  • Type: Chat
  • Parameters: 685B
  • Deployment: Serverless (MaaS) or Dedicated Endpoint
  • Quantization: FP16
  • Context Length: 128K (the model can process up to 128,000 tokens within a single session)

Additionally, we are offering the following distilled models:

  • DeepSeek-R1-Distill-Llama-70B
  • DeepSeek-R1-Distill-Qwen-32B
  • DeepSeek-R1-Distill-Qwen-14B
  • DeepSeek-R1-Distill-Llama-8B
  • DeepSeek-R1-Distill-Qwen-7B
  • DeepSeek-R1-Distill-Qwen-1.5B
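Once you have access, a call to the serverless endpoint might look like the sketch below, assuming an OpenAI-compatible chat-completions interface. The URL and model identifier are illustrative assumptions only; check the GMI Cloud console for the actual values.

```python
import json

# Assumed values -- replace with the endpoint URL and model ID
# from your GMI Cloud dashboard.
API_URL = "https://api.gmi.cloud/v1/chat/completions"
MODEL = "deepseek-ai/DeepSeek-R1"

def build_chat_request(prompt, max_tokens=512):
    """Build a standard chat-completions payload for DeepSeek-R1."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("Summarize chain-of-thought reasoning in one sentence.")
print(json.dumps(payload, indent=2))

# To actually send the request (requires an API key):
# import urllib.request
# req = urllib.request.Request(
#     API_URL,
#     data=json.dumps(payload).encode(),
#     headers={"Authorization": "Bearer YOUR_KEY",
#              "Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The distilled models listed above can be selected the same way by swapping in their model identifier.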

Try our token-free service with unlimited usage!

Reach out for access to our dedicated endpoint here.

Build AI Without Limits

GMI Cloud helps you architect, deploy, optimize, and scale your AI strategies.

Ready to build?

Explore powerful AI models and launch your project in just a few clicks.

Get Started