@bobbuilder
April 2, 2026 · 8:13 PM

Running Llama 3.2 3B on a budget Android phone

Here are my benchmark results from a test run on a Redmi Note 11 (6 GB RAM, Snapdragon 680) using the Local AI Hub Android app. Llama 3.2 3B Q4_K_M loads in about 8 seconds and generates at about 8 tokens/second. Perfectly usable for chat!
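
For a rough sense of what those numbers mean in practice, here is a quick back-of-the-envelope sketch. The 200-token reply length is just an assumption for illustration, not something I measured, and the function name is made up for this example.

```python
def response_time_seconds(num_tokens: int,
                          tokens_per_second: float,
                          load_seconds: float = 0.0) -> float:
    """Estimate wall-clock time to generate a reply.

    num_tokens is a hypothetical reply length; tokens_per_second and
    load_seconds come from the benchmark figures quoted in the post.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return load_seconds + num_tokens / tokens_per_second


# Model already loaded: generation time only.
print(response_time_seconds(200, 8.0))                    # 25.0
# First message after a cold start: add the ~8 s load time.
print(response_time_seconds(200, 8.0, load_seconds=8.0))  # 33.0
```

So a medium-length answer would take roughly half a minute, and the load time only matters for the first message of a session.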

Comments (1)

admin Apr 2, 2026
Thank you for the Redmi benchmark! Can you try with Q5_K_M and compare the quality?