# how it works
## the custom bits, plain English

This page explains the custom bits of this site — the features that aren’t just “click a link, read a page.” The goal is for you to understand what’s happening on the site, especially when something behaves in a way you wouldn’t expect from a plain blog. As more custom machinery shows up here, it’ll be documented on this page too.

Today that’s three things, all related to finding and grouping pages: a plain text search, an optional semantic search, and a similarity view on the knowledge graph. They look similar, but they answer different questions.

Hit / (or click the search icon) and you get the default search. Two pieces of machinery cooperate, each with its own job:

- a fuzzy index, which tolerates typos and surfaces the obvious matches quickly;
- a full-text index, which tracks exactly where each word lives on each page.

So the fuzzy index gives you the obvious matches, neatly typo-tolerant. The full-text index catches the long-tail “I know I wrote this somewhere” hits and shows you exactly where the words live on the page.

This whole thing runs in your browser. Nothing is sent anywhere.
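The cooperation of the two indexes can be sketched roughly like this. This is a minimal illustration, not the site's actual code: the data shapes, function names, and scoring here are hypothetical stand-ins.

```typescript
type Page = { title: string; body: string };

// Levenshtein edit distance, the classic basis for typo tolerance.
function editDistance(a: string, b: string): number {
  const dp: number[][] = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0))
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1, // deletion
        dp[i][j - 1] + 1, // insertion
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1) // substitution
      );
    }
  }
  return dp[a.length][b.length];
}

// Fuzzy side: a query matches a page if it is within a couple of
// edits of any word in the title.
function fuzzyTitleHits(query: string, pages: Page[], maxDist = 2): Page[] {
  const q = query.toLowerCase();
  return pages.filter((p) =>
    p.title.toLowerCase().split(/\s+/).some((w) => editDistance(q, w) <= maxDist)
  );
}

// Full-text side: exact matches, with the position where the words
// live on the page, so the hit can be highlighted in place.
function fullTextHits(query: string, pages: Page[]): { page: Page; at: number }[] {
  const q = query.toLowerCase();
  return pages
    .map((page) => ({ page, at: page.body.toLowerCase().indexOf(q) }))
    .filter((h) => h.at >= 0);
}
```

Real fuzzy and full-text libraries use smarter index structures than a linear scan, but the division of labour is the same: one side forgives typos, the other side remembers positions.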

There’s a semantic toggle on the search bar, off by default. Turn it on and the search switches from “matching letters” to matching meaning.

The basic idea:

- At build time, every page is cut into short passages, and a small embedding model turns each passage into a vector — a list of numbers that encodes its meaning.
- When you search, the same model, downloaded to your browser, turns your query into a vector too.
- Passages whose vectors are close to your query’s vector are the matches, and a page ranks on its best-matching passage.

## why the pages are split into passages

A long blog post or knowledge note is usually about several things at once. If the whole page were squashed into a single vector, that vector would be a blurry average — a query about one specific paragraph could fail to match because the rest of the page dilutes the signal.

So at build time each page is cut into short windows of about 200 tokens — roughly a couple of paragraphs. Each window gets its own vector. When you search, every window is scored, and a page wins on the strength of its single best-matching window rather than an average of the whole thing. Long pages produce many windows; short notes might be one window.

(Tokens, roughly: how the model counts text. A token is usually a short word or a piece of one. 200 tokens is around 150 words.)

The 200 isn’t a design preference, it’s a ceiling. The small embedding model used here can only “see” up to 256 tokens at once — that’s the entire input it can read in a single pass. 200 just leaves a margin for the model’s own bookkeeping tokens. A model with a longer context window would also be a much bigger download, which would defeat the point of running this in the browser. Small model, small windows, many of them — that’s the trade.

## why the passages overlap

The slicing isn’t clean. Each new window starts before the previous one ended — consecutive windows share about 50 tokens of overlap, roughly a paragraph.

The reason is that natural language doesn’t break neatly at fixed lengths. Without overlap, a sentence that’s exactly what you were searching for might end up split across two windows, with neither half carrying enough context to score well. With overlap, whatever you’re searching for is almost always wholly contained inside at least one window.

A side effect: the same sentence shows up in two adjacent windows. That’s fine — they get scored independently, and the system uses the best score, not the sum, so a page doesn’t “win” just by being long.
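The windowing and the best-window scoring described above can be sketched like this. It's an illustrative sketch, not the site's build script: tokens are approximated as whitespace-separated words (the real model's tokenizer splits differently), and the function names are made up.

```typescript
// Cut text into ~200-token windows where each new window starts
// 50 tokens before the previous one ended.
function windows(text: string, size = 200, overlap = 50): string[] {
  const tokens = text.split(/\s+/).filter(Boolean);
  const step = size - overlap; // each window starts 150 tokens after the last
  const out: string[] = [];
  for (let start = 0; start < tokens.length; start += step) {
    out.push(tokens.slice(start, start + size).join(" "));
    if (start + size >= tokens.length) break; // last window reached the end
  }
  return out;
}

// A page's score is its single best window — max, not sum —
// so a page doesn't win just by being long.
function pageScore(windowScores: number[]): number {
  return Math.max(...windowScores);
}
```

A 300-token page, for example, yields two windows: tokens 0–199 and tokens 150–299, sharing the 50-token overlap in the middle.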

## trade-offs

Because it works on meaning, you can ask things like “how does the site handle privacy?” and get the privacy page back even if you didn’t use the word privacy the same way the page does. The trade-off is that it can be less precise for exact strings — if you’re looking for a specific phrase, the default fuzzy search is usually better.

The model runs entirely on your device. No query, no embedding, no result is sent to a server. The cost is the one-time download and a second or two of warm-up the first time you toggle it on.
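Query time, then, is just vector comparison. A minimal sketch of that step, assuming the window vectors already exist from build time — the types and names here are hypothetical, and `queryVec` stands in for whatever the on-device model produces:

```typescript
type Window = { pageId: string; vec: number[] };

// Cosine similarity: how closely two vectors point the same way.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Score every window, keep each page's best, sort pages by it.
function rank(queryVec: number[], windows: Window[]): [string, number][] {
  const best = new Map<string, number>();
  for (const w of windows) {
    const s = cosine(queryVec, w.vec);
    best.set(w.pageId, Math.max(best.get(w.pageId) ?? -Infinity, s));
  }
  return [...best.entries()].sort((a, b) => b[1] - a[1]);
}
```

Nothing in this loop needs a server: the vectors are downloaded with the site, and the arithmetic is cheap enough for a browser.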

## similarity clustering on the knowledge graph

The knowledge graph has two modes: links and similarity. In links mode, an edge means I explicitly linked one note to another. In similarity mode, the edges instead connect notes whose embeddings are close — the same vectors the semantic search uses, summarised per note.

So in similarity mode, a cluster of notes hanging together means the model thinks these notes talk about related things, even if I never linked them. It’s a useful sanity check: clusters that match my own mental groupings are reassuring; clusters that don’t are interesting prompts to either link the notes properly, rewrite one of them, or accept that the model sees a connection I hadn’t noticed.

The actual neighbour-finding happens at build time, not in your browser. When the site is published, each knowledge note’s chunk vectors are averaged into a single per-note vector and the few most similar other notes are picked out once. The result is a small list of edges that ships as its own file (a few KB of JSON). When you flip the toggle, the browser just downloads that list and draws it — no embedding model needed for this view, no math at view time.
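The build-time step can be sketched like this — a hypothetical illustration of the averaging and top-k neighbour pick described above, not the actual build script:

```typescript
// Average a note's chunk vectors into one per-note vector.
function mean(vectors: number[][]): number[] {
  const out = new Array(vectors[0].length).fill(0);
  for (const v of vectors) for (let i = 0; i < v.length; i++) out[i] += v[i];
  return out.map((x) => x / vectors.length);
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// For each note, keep the k most similar other notes as edges.
// The returned list is what would ship as the small JSON file.
function similarityEdges(
  notes: Map<string, number[][]>, k = 3
): { from: string; to: string; score: number }[] {
  const means = new Map([...notes].map(([id, chunks]) => [id, mean(chunks)] as const));
  const edges: { from: string; to: string; score: number }[] = [];
  for (const [id, vec] of means) {
    const neighbours = [...means]
      .filter(([other]) => other !== id)
      .map(([other, ov]) => ({ from: id, to: other, score: cosine(vec, ov) }))
      .sort((a, b) => b.score - a.score)
      .slice(0, k);
    edges.push(...neighbours);
  }
  return edges;
}
```

Because this runs once at publish time, the browser's only job in similarity mode is fetching the edge list and drawing it.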

## what this is and isn’t
