<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>llm &amp;mdash; Nat Knight</title>
    <link>http://natknight.xyz/tag:llm</link>
    <description>Reflections, diversions, and opinions from a progressive ex-physicist programmer dad with a sore back.</description>
    <pubDate>Sat, 23 May 2026 14:44:45 -0700</pubDate>
    <item>
      <title>TIL: Storing MLX models on an external drive</title>
      <link>http://natknight.xyz/til-storing-mlx-models-on-an-external-drive</link>
      <description>&lt;![CDATA[#llm #mlx #huggingface&#xA;&#xA;TL;DR: Create a directory to hold Hugging Face data and set the environment variable HF_HOME to that directory&#39;s path.&#xA;&#xA;For example:&#xA;&#xA;mkdir /Volumes/externaldrive/huggingface&#xA;export HF_HOME=&#39;/Volumes/externaldrive/huggingface&#39;&#xA;&#xA;You&#39;ll also want to set it in your .profile or .bashrc or wherever you set these things.&#xA;&#xA;Explanation&#xA;&#xA;I use Simon Willison&#39;s llm tool for interacting with LLMs from the CLI or from Python. It&#39;s particularly nice because you can access many different LLM providers using plugins.&#xA;&#xA;One such plugin is llm-mlx. It uses Apple&#39;s MLX library to run models locally on Apple hardware. Being able to run models under your own control is obviously very interesting, but it does mean storing gigabytes of model weights; if you&#39;re running a Mac you probably don&#39;t want those on your expensive, not-so-large internal hard drive. I, for example, would much rather they live on my external drive, which is much roomier.&#xA;&#xA;llm-mlx gets its models from the mlx-community group on Hugging Face; we can manage them accordingly:&#xA;&#xA;Hugging Face tools store local data in the directory named by HF_HOME.&#xA;llm-mlx adopts the same convention.&#xA;HF_HOME defaults to a huggingface directory inside XDG_CACHE_HOME (~/.cache/huggingface/ if neither variable is set).&#xA;We can override that to store model files on an external drive.&#xA;&#xA;The external drive will likely be a little slower to load than the onboard one, but as with any performance question you&#39;ll have to measure for your specific case and choose the tradeoffs that work for you.&#xA; ]]&gt;</description>
      <content:encoded><![CDATA[<p><a href="http://natknight.xyz/tag:llm" class="hashtag"><span>#</span><span class="p-category">llm</span></a> <a href="http://natknight.xyz/tag:mlx" class="hashtag"><span>#</span><span class="p-category">mlx</span></a> <a href="http://natknight.xyz/tag:huggingface" class="hashtag"><span>#</span><span class="p-category">huggingface</span></a></p>

<p>TL;DR: Create a directory to hold Hugging Face data and set the environment variable <code>HF_HOME</code> to that directory&#39;s path.</p>

<p>For example:</p>

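<pre><code class="language-shell"># Create the directory on the external drive (-p: no error if it already exists)
mkdir -p /Volumes/externaldrive/huggingface
# Point Hugging Face tools (and llm-mlx) at it for the current shell session
export HF_HOME=&#39;/Volumes/externaldrive/huggingface&#39;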
</code></pre>

<p>You&#39;ll also want to set it in your <code>.profile</code> or <code>.bashrc</code> or wherever you set these things.</p>
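
<p>One possible refinement for that profile entry, offered as a sketch rather than anything Hugging Face or <code>llm-mlx</code> prescribes: only export <code>HF_HOME</code> when the drive is actually mounted, so the tools fall back to their default location when it is unplugged. The variable name <code>HF_EXTERNAL</code> is made up for this example, and the path is the same placeholder used above.</p>

<pre><code class="language-shell"># HF_EXTERNAL is a hypothetical helper variable; use your own drive path.
HF_EXTERNAL="/Volumes/externaldrive/huggingface"

# Export HF_HOME only if the directory exists (i.e. the drive is mounted).
# Otherwise leave HF_HOME alone so the default cache location is used.
if [ -d "$HF_EXTERNAL" ]; then
    export HF_HOME="$HF_EXTERNAL"
fi
</code></pre>

<p>The trade-off is that anything downloaded while the drive is unplugged lands in the default location instead.</p>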



<h2 id="explanation">Explanation</h2>

<p>I use Simon Willison&#39;s <a href="https://llm.datasette.io/en/stable/">llm tool</a> for interacting with LLMs from the CLI or from Python. It&#39;s particularly nice because you can access many different LLM providers using plugins.</p>

<p>One such plugin is <a href="https://github.com/simonw/llm-mlx">llm-mlx</a>. It uses <a href="https://opensource.apple.com/projects/mlx/">Apple&#39;s MLX library</a> to run models locally on Apple hardware. Being able to run models under your own control is obviously very interesting, but it does mean storing gigabytes of model weights; if you&#39;re running a Mac you probably don&#39;t want those on your expensive, not-so-large internal hard drive. I, for example, would much rather they live on my external drive, which is much roomier.</p>

<p><code>llm-mlx</code> gets its models from the <a href="https://huggingface.co/mlx-community">mlx-community</a> group on Hugging Face; we can manage them accordingly:</p>
<ul><li>Hugging Face tools <a href="https://huggingface.co/docs/huggingface_hub/en/package_reference/environment_variables#hfhome">store local data</a> in the directory named by <code>HF_HOME</code>.</li>
<li><code>llm-mlx</code> <a href="https://github.com/simonw/llm-mlx/blob/main/llm_mlx.py#L74">adopts</a> the same convention.</li>
<li><code>HF_HOME</code> defaults to a <code>huggingface</code> directory inside <a href="https://specifications.freedesktop.org/basedir-spec/latest/#basics"><code>XDG_CACHE_HOME</code></a> (<code>~/.cache/huggingface/</code> if neither variable is set).</li>
<li>We can override that to store model files on an external drive.</li></ul>
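
<p>The lookup order in the Hugging Face documentation (<code>HF_HOME</code> if set, otherwise a <code>huggingface</code> directory under <code>XDG_CACHE_HOME</code>, otherwise <code>~/.cache/huggingface</code>) can be sketched in shell. This only mirrors the documented defaults; it is not code from <code>huggingface_hub</code> or <code>llm-mlx</code>, and <code>hf_home</code> is a name made up for the example. It also assumes <code>HF_HUB_CACHE</code> has not been set, since that variable can relocate the model cache separately.</p>

<pre><code class="language-shell"># Resolve the data directory: HF_HOME wins, then XDG_CACHE_HOME, then ~/.cache.
hf_home="${HF_HOME:-${XDG_CACHE_HOME:-$HOME/.cache}/huggingface}"
echo "Hugging Face data directory: $hf_home"

# By default, downloaded model repositories are kept in the hub/ subdirectory.
echo "Model cache: $hf_home/hub"
</code></pre>

<p>Running this before and after the <code>export</code> is a quick way to check that the override took effect.</p>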

<p>The external drive will likely be a little slower to load than the onboard one, but as with any performance question you&#39;ll have to measure for your specific case and choose the tradeoffs that work for you.</p>
]]></content:encoded>
      <guid>http://natknight.xyz/til-storing-mlx-models-on-an-external-drive</guid>
      <pubDate>Tue, 25 Feb 2025 21:22:53 +0000</pubDate>
    </item>
    <item>
      <title>Release: llm-questioncache</title>
      <link>http://natknight.xyz/release-llm-questioncache</link>
      <description>&lt;![CDATA[#python #llm #embeddings #release #simonwillison&#xA;&#xA;I just released version 0.1 of a plugin for Simon Willison&#39;s llm called llm-questioncache. It lets you send questions to your default LLM with a system prompt that elicits short, to-the-point answers. It also maintains a cache of answers locally so that you only have to hit the LLM once for each bit of esoteric knowledge.&#xA;&#xA;It uses embeddings of each question to find similar questions so that (for example) if you ask&#xA;&#xA;  How do you compare two branches in git&#xA;&#xA;and&#xA;&#xA;  How to compare different branches in git&#xA;&#xA;you&#39;ll get the same answer.&#xA;&#xA;If you&#39;ve already got LLM installed you can try it out with &#xA;&#xA;llm install llm-questioncache&#xA;&#xA;Here&#39;s the PyPI package:&#xA;https://pypi.org/project/llm-questioncache/&#xA;&#xA;And here&#39;s the source code:&#xA;https://github.com/nathanielknight/llm-questioncache&#xA;&#xA;]]&gt;</description>
      <content:encoded><![CDATA[<p><a href="http://natknight.xyz/tag:python" class="hashtag"><span>#</span><span class="p-category">python</span></a> <a href="http://natknight.xyz/tag:llm" class="hashtag"><span>#</span><span class="p-category">llm</span></a> <a href="http://natknight.xyz/tag:embeddings" class="hashtag"><span>#</span><span class="p-category">embeddings</span></a> <a href="http://natknight.xyz/tag:release" class="hashtag"><span>#</span><span class="p-category">release</span></a> <a href="http://natknight.xyz/tag:simonwillison" class="hashtag"><span>#</span><span class="p-category">simonwillison</span></a></p>

<p>I just released version 0.1 of a plugin for Simon Willison&#39;s <a href="https://github.com/simonw/llm"><code>llm</code></a> called <a href="https://github.com/nathanielknight/llm-questioncache"><code>llm-questioncache</code></a>. It lets you send questions to your default LLM with a system prompt that elicits short, to-the-point answers. It also maintains a cache of answers locally so that you only have to hit the LLM once for each bit of esoteric knowledge.</p>



<p>It uses <a href="https://vickiboykis.com/what_are_embeddings/">embeddings</a> of each question to find similar questions so that (for example) if you ask</p>

<blockquote><p>How do you compare two branches in git</p></blockquote>

<p>and</p>

<blockquote><p>How to compare different branches in git</p></blockquote>

<p>you&#39;ll get the same answer.</p>

<p>If you&#39;ve already got <code>llm</code> installed, you can try it out with:</p>

<pre><code>llm install llm-questioncache
</code></pre>

<p>Here&#39;s the PyPI package:
<a href="https://pypi.org/project/llm-questioncache/">https://pypi.org/project/llm-questioncache/</a></p>

<p>And here&#39;s the source code:
<a href="https://github.com/nathanielknight/llm-questioncache">https://github.com/nathanielknight/llm-questioncache</a></p>
]]></content:encoded>
      <guid>http://natknight.xyz/release-llm-questioncache</guid>
      <pubDate>Sun, 09 Feb 2025 05:59:09 +0000</pubDate>
    </item>
  </channel>
</rss>