On-device Large Language Model - AI tools
- Bodhi: Run LLMs locally, powered by Open Source
  Bodhi is a free, privacy-focused application allowing users to run Large Language Models (LLMs) locally on their macOS devices without technical setup.
  Pricing: Free
- Ollama: Get up and running with large language models locally
  Ollama is a platform that enables users to run powerful language models like Llama 3.3, DeepSeek-R1, Phi-4, Mistral, and Gemma 2 on their local machines.
  Pricing: Free
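For a quick sense of how a locally running Ollama instance is used from code, here is a minimal sketch that calls its local HTTP API from Python. It assumes Ollama is installed and serving on the default port (11434) and that the model has already been pulled; the model tag "llama3.3" is just an example.

```python
# Minimal sketch: query a locally running Ollama server over its HTTP API.
# Assumes Ollama is serving on the default port 11434 and the model tag below
# has already been pulled (e.g. `ollama pull llama3.3`).
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.3",  # example tag; use any model you have pulled
        "prompt": "Explain on-device LLM inference in one sentence.",
        "stream": False,      # ask for one JSON response instead of a stream
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```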
- Lora: Integrate a local LLM with one line of code
  Lora provides an SDK for integrating a fine-tuned, mobile-optimized local Large Language Model (LLM) into applications with minimal setup, offering GPT-4o-mini-level performance.
  Pricing: Freemium
- Kalavai: Turn your devices into a scalable LLM platform
  Kalavai offers a platform for deploying Large Language Models (LLMs) across various devices, scaling from personal laptops to full production environments. It simplifies LLM deployment and experimentation.
  Pricing: Paid, from $29
- Fullmoon: A billion parameters in your pocket - chat with private and local large language models
  Fullmoon is an open-source app that enables users to run local large language models directly on Apple devices, offering completely offline functionality and optimized performance for Apple silicon.
  Pricing: Free
- Kolosal AI: The Ultimate Local LLM Platform
  Kolosal AI is a lightweight, open-source application enabling users to train, run, and chat with local Large Language Models (LLMs) directly on their devices, ensuring complete privacy and control.
  Pricing: Free
- ONNX Runtime: Production-grade AI engine for accelerated training and inferencing
  ONNX Runtime is a production-grade AI engine designed to accelerate machine learning training and inferencing across various platforms and languages, with support for generative AI and performance optimization.
  Pricing: Free
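To illustrate the inference side, here is a minimal sketch using ONNX Runtime's Python API; the model file name and input shape are placeholders for whatever model you have exported.

```python
# Minimal sketch: run inference on an exported ONNX model with ONNX Runtime.
# "model.onnx" and the input shape are placeholders; substitute your own model.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

input_name = session.get_inputs()[0].name
dummy_input = np.random.rand(1, 3, 224, 224).astype(np.float32)  # example image-shaped tensor

outputs = session.run(None, {input_name: dummy_input})
print(outputs[0].shape)
```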
- WebLLM: High-Performance In-Browser LLM Inference Engine
  WebLLM enables running large language models (LLMs) directly within a web browser using WebGPU for hardware acceleration, reducing server costs and enhancing privacy.
  Pricing: Free
- GGML: AI at the Edge
  GGML is a tensor library for machine learning, enabling large models and high performance on commodity hardware. It is designed for efficient on-device inference.
  Pricing: Free
- onedollarai.lol: Access Top Large Language Models for Just $1 a Month
  onedollarai.lol provides access to a variety of top-tier large language models (LLMs), including Meta LLaMa 3 and Microsoft Phi, for a flat monthly fee of $1.
  Pricing: Paid, from $1/month
- FriendliAI: Accelerate Generative AI Inference
  FriendliAI provides a high-performance platform for accelerating generative AI inference, enabling fast, cost-effective, and reliable deployment and serving of Large Language Models (LLMs).
  Pricing: Usage-based
- BrowserAI: Run Local LLMs Inside Your Browser
  BrowserAI is an open-source library enabling developers to run local Large Language Models (LLMs) directly within a user's browser, offering a privacy-focused AI solution with zero infrastructure costs.
  Pricing: Free
- Axolotl AI: We make fine-tuning accessible, scalable, fun
  Axolotl AI is a free, open-source tool designed to make fine-tuning Large Language Models (LLMs) faster, more accessible, and scalable across various AI models and platforms.
  Pricing: Free
- LlamaEdge: The easiest, smallest, and fastest local LLM runtime and API server
  LlamaEdge is a lightweight and fast local LLM runtime and API server, powered by Rust & WasmEdge, designed for creating cross-platform LLM agents and web services.
  Pricing: Free
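Since LlamaEdge exposes the model through a local API server, one typical way to consume it is via an OpenAI-compatible client. The sketch below assumes the server is already running and listening at http://localhost:8080/v1; the port, path, and model name are assumptions to adjust to your own setup.

```python
# Minimal sketch: call a locally running LlamaEdge API server through an
# OpenAI-compatible client. The base URL, port, and model name are assumptions;
# change them to match how you started the server.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # assumed local LlamaEdge endpoint
    api_key="not-needed-locally",         # local servers generally ignore the key
)

reply = client.chat.completions.create(
    model="llama-3-8b-instruct",          # placeholder model name
    messages=[{"role": "user", "content": "Say hello from the edge."}],
)
print(reply.choices[0].message.content)
```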
- Float16.cloud: Your AI Infrastructure, Managed & Simplified
  Float16.cloud provides managed GPU infrastructure and LLM solutions for AI workloads. It offers services like serverless GPU computing and one-click LLM deployment, optimizing cost and performance.
  Pricing: Usage-based
- Neural Magic: Deploy Open-Source LLMs to Production with Maximum Efficiency
  Neural Magic offers enterprise inference server solutions to streamline AI model deployment, maximizing computational efficiency and reducing costs on both GPU and CPU infrastructure.
  Pricing: Contact for pricing
- CentML: Better, Faster, Easier AI
  CentML streamlines LLM deployment, offering advanced system optimization and efficient hardware utilization. It provides single-click resource sizing and model serving, and supports diverse hardware and models.
  Pricing: Usage-based