TIL: Storing MLX models on an external drive

Tue, 25 Feb 2025 21:22:53 +0000

TL;DR: Create a directory to hold Hugging Face data and set the environment variable HF_HOME to that directory's path.

For example:

mkdir /Volumes/externaldrive/huggingface
export HF_HOME='/Volumes/externaldrive/huggingface'

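You'll also want to set it in your .profile, .bashrc, .zshrc (zsh is the default shell on recent versions of macOS), or wherever you set these things, so that it persists across shell sessions.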
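One wrinkle with putting this in a startup file is that an external drive isn't always plugged in. Here's a minimal sketch of a guard for that case. The function name use_external_hf_home is made up for this example, and the path is the same example path as above; if the directory isn't there, HF_HOME is left alone and the Hugging Face tools fall back to their default location.

```shell
# Hypothetical helper for ~/.zshrc or ~/.bashrc; the name is invented for this sketch.
# It exports HF_HOME only when the given directory exists (i.e. the drive is mounted).
use_external_hf_home() {
  # $1: directory on the external drive that holds Hugging Face data
  if [ -d "$1" ]; then
    export HF_HOME="$1"
  fi
}

# Example path from above; adjust to wherever your drive mounts.
use_external_hf_home '/Volumes/externaldrive/huggingface'
```

The tradeoff is that models downloaded while the drive is unplugged land in the default cache on the internal drive, so you may prefer to skip the guard and have things fail loudly instead.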

Explanation

I use Simon Willison's llm tool for interacting with LLMs from the CLI or from Python. It's particularly nice because you can access many different LLM providers using plugins.

One such plugin is llm-mlx. It uses Apple's MLX library to run models locally on Apple hardware. Being able to run models under your own control is obviously very interesting, but it does mean storing gigabytes of model weights. If you're running a Mac you probably don't want those on your expensive, not-so-large internal drive. I, for example, would much rather they live on my external drive, which is much roomier.

llm-mlx gets its models from the mlx-community group on Hugging Face, so we can manage them the way we'd manage any other Hugging Face download:

Hugging Face tools store local data under HF_HOME by default.
llm-mlx adopts the same convention.
HF_HOME defaults to $XDG_CACHE_HOME/huggingface (~/.cache/huggingface if XDG_CACHE_HOME isn't set).
We can override that to store model files on an external drive.
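To check which directory is actually in effect, the lookup order above can be sketched as a small shell function. This is my own approximation of the documented defaults, not code from huggingface_hub or llm-mlx, and the name hf_home_dir is made up for the example.

```shell
# Sketch of the lookup order described above (not the library's own code):
#   1. $HF_HOME, if set and non-empty
#   2. $XDG_CACHE_HOME/huggingface, if XDG_CACHE_HOME is set
#   3. ~/.cache/huggingface otherwise
hf_home_dir() {
  if [ -n "${HF_HOME:-}" ]; then
    printf '%s\n' "$HF_HOME"
  else
    printf '%s\n' "${XDG_CACHE_HOME:-$HOME/.cache}/huggingface"
  fi
}
```

Running hf_home_dir in a new terminal after editing your startup file is a quick way to confirm the override took effect.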

Models will likely be a little slower to load from the external drive than from the internal one, but as with any performance question you'll have to measure for your specific case and choose the tradeoffs that work for you.

Release: llm-questioncache

Sun, 09 Feb 2025 05:59:09 +0000

#python #llm #embeddings #release #simonwillison

I just released version 0.1 of a plugin for Simon Willison's llm called llm-questioncache. It lets you send questions to your default LLM with a system prompt that elicits short, to-the-point answers. It also maintains a cache of answers locally so that you only have to hit the LLM once for each bit of esoteric knowledge.

It uses embeddings of each question to find similar questions so that (for example) if you ask

How do you compare two branches in git

and

How to compare different branches in git

you'll get the same answer.

If you've already got LLM installed you can try it out with

llm install llm-questioncache

Here's the PyPI package: https://pypi.org/project/llm-questioncache/

And here's the source code: https://github.com/nathanielknight/llm-questioncache

llm — Nat Knight
