Large language models can have billions, or even trillions, of parameters. But how big do they need to be to achieve acceptable performance? To test this, I experimented with several of Google’s Gemma 3 models, all small enough to run locally on a single GPU. Specifically, I used the 1 […]
Making LLMs Useful with Function Calls and Embeddings
Large language models like Google’s Gemini and OpenAI’s GPT can be interesting to play around with, even in a simple chatbot like ChatGPT. But those chatbots largely waste the models’ potential. Their understanding of natural language is impressive, but their “knowledge” is limited to what they were trained on. But […]
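To make the idea in the title concrete, here is a minimal sketch of one function-call round trip, assuming the OpenAI Python SDK; the get_weather function, its JSON schema, and the model name are hypothetical examples for illustration, not details from the post.

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def get_weather(city: str) -> str:
    """Hypothetical local function the model can ask us to call."""
    return json.dumps({"city": city, "forecast": "sunny", "high_f": 72})

# Describe the function so the model knows it may request it.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Boise right now?"}]
first = client.chat.completions.create(
    model="gpt-4o-mini", messages=messages, tools=tools
)

# Assuming the model answered with a tool call rather than plain text:
# run the function locally and feed the result back so the model can
# compose a final reply grounded in data it was never trained on.
call = first.choices[0].message.tool_calls[0]
messages.append(first.choices[0].message)
messages.append({
    "role": "tool",
    "tool_call_id": call.id,
    "content": get_weather(**json.loads(call.function.arguments)),
})

final = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(final.choices[0].message.content)
```

The point of the round trip is that the model never fetches anything itself: it only asks for a call, and your code decides what actually runs and returns, which is what lets a chatbot reach information beyond its training data.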