In a previous article, I mentioned that I had begun keeping notes in Org Mode using GNU Emacs. This has the advantage that the notes are stored in plain text, making them portable, durable and easy to keep under version control. This contrasts with commercial note-taking applications that store your data on someone else’s machine or lock it into proprietary formats that cannot be easily read by other software. However, I still faced challenges in how to structure the notes themselves. Originally, I kept everything in one large file, and later moved to one Org Mode file per topic, but this still did not feel like the right level of granularity, as these files gradually became dumping grounds for loosely related ideas enmeshed together.
Recently, I stumbled upon GNU Hyperbole, which adds a hypertext layer on top of Emacs that works together nicely with Org Mode. The Hyperbole manual describes it as a system for convenient access to information, with linking across various forms of content, such as files, buffers, mail and news messages, grep and compilation results, system documentation and URLs. What really caught my attention, however, was its HyWiki component that automatically highlights HyWikiWords and turns them into hyperlinks, similar to the original wiki implementation, WikiWikiWeb.
In Hyperbole, hyperlinking is built around the concept of buttons. Explicit buttons are defined directly by the user in a separate file, where the action can be programmed with Emacs Lisp. Implicit buttons are discovered automatically from context, for example recognizing file names, function names or structured text patterns and turning them into interactive targets. In addition, global buttons can be triggered from anywhere in a buffer or session and act more like persistent commands or shortcuts. Together, these turn otherwise ordinary text into actionable elements.
Most people learn about hypertext through the World Wide Web, which began as an attempt to build a hypermedia information retrieval system at CERN. This was, however, not the only attempt at building a hypermedia system. HyperCard was a hypermedia system that let users build and navigate stacks of linked cards long before the web became the default model for hyperlinks. Hyperbole feels closer to that older hypertext tradition, except that links can span across multiple types of information, similar to the idea of the general-purpose hypertext abstract machine. It is a working hypertext environment, not just a notebook with references sprinkled in.
Hypertext shares many similarities with older information management traditions outside software. For a long time, commonplace books, in which one collects useful quotations, ideas, observations and references, were widely used for this purpose. Later, however, others started writing their notes on index cards instead, eventually leading to the use of Zettelkästen, where notes are not just collected but linked together so that ideas can build upon one another. Modern wiki systems take this concept and extend it to collaborative editing. While Wikipedia has become the most familiar wiki to most netizens, it represents a highly curated and editorially controlled form of wiki. The fundamental principles of wikis are probably best reflected in WikiWikiWeb.
HyWiki takes these concepts and applies them to personal note-taking.
Notes are kept as Org Mode files under hywiki-directory and
can use any of the features that Org Mode provides. HyWikiWords are
highlighted automatically for pages under hywiki-directory, and the
global minor mode can extend that behaviour to text buffers outside
the wiki directory as well. For navigating and searching notes,
HyRolo offers powerful free-text search and retrieval facilities.
It was originally designed as a flexible contact
management system, inspired by the idea of a digital Rolodex.
However, its generality makes it equally suitable for managing other
kinds of semi-structured information. This talk shows how HyRolo,
even before HyWiki existed, could be used to implement a Zettelkasten.
What I like about the HyWiki approach to taking notes is that linking is not a separate action but a natural by-product of writing. I do not have to interrupt my train of thought to remember the link syntax; I simply write a HyWikiWord whenever I wish to make a connection. If a page already exists, the word becomes a link automatically, and if it does not, clicking on it creates the page. WikiWords therefore function as more than simple navigational devices: each one names a concept. As these concepts accumulate, they form a personal vocabulary that allows more complex ideas to be expressed in terms of previously established ones. Over time, the structure of the notes and their network of connections emerges organically, rather than being imposed in advance, which closely matches how I actually think and work with ideas.
The particular choice of what the system recognizes automatically as a link is called a link pattern. Many wiki systems set this pattern so that it corresponds to CamelCase. This works reasonably well, since there are few words in the English language that naturally use this convention outside of trademarks, and it has historical precedent as the style originally used by WikiWikiWeb. However, it makes expressing single-word concepts awkward. A common response is to simply forbid single-word WikiWords altogether. Another frequently used restriction is to require at least one intervening lowercase letter between capital letters, which effectively excludes acronyms and initialisms. In practice, this leads to a proliferation of workarounds, which are, in my opinion, UgLy.
An alternative is free links, where links are explicitly marked up (e.g., by surrounding the link with square brackets) rather than inferred from a naming convention. The additional syntax introduces friction, since you must interrupt writing to decide that something should be a link, recall the name of the page you want to link to and apply the correct markup. Even with a shortcut to create links automatically, that interruption still adds up over time. Free links also lack the distinctiveness and consistency of WikiNames, making them less immediately identifiable in running text and harder to keep in working memory.
On the other hand, you may wonder why the link pattern should not simply recognize every word, especially since only words with associated pages will be highlighted. One problem is that certain words could be inadvertently shadowed by other implicit buttons due to button type precedence rules. More importantly, it would remove any sense of intentionality from the system, as you would no longer have any control over what becomes a link. It could, for example, become distracting if common words have pages, and it would completely remove the distinction between ordinary text and concepts represented by WikiWords, similar to the problem with free links. A notation could be defined to suppress automatic link generation, but this would effectively reintroduce the very syntax overhead that the approach is trying to avoid.
Therefore, I prefer to capitalize all letters in the WikiWord, using hyphens to separate its component words. This solves the single-word problem, and it also makes WikiWords stand out more clearly from surrounding prose. For the latter reason, it is not unusual to see ALL CAPS or small caps in technical documentation, legal documents and screenwriting, albeit often in alternation with other methods like initial capitals and bold or italic face. For example, Emacs Lisp docstrings use ALL CAPS to mark function parameters. This convention also feels natural for acronyms, which are frequently used to define technical terms concisely (e.g., DRY and WYSIWYG). The main disadvantage is that it may be more inconvenient to type, since it requires either activating the caps lock key or holding down the shift key. Another concern may be readability, although, contrary to popular belief, this is more likely due to a lack of practice than to the word’s shape. Nonetheless, I believe that this strikes a reasonable compromise overall.
HyWiki defaults to CamelCase as the link pattern, but luckily this can
be easily changed. Internally, HyWiki recognizes HyWikiWords in a few
steps. On initialization, or whenever the wiki directory changes,
hywiki-get-referent-hasht scans hywiki-directory and builds a hash
table mapping WikiWords to their corresponding file paths, where each
item in the hash table is known as a referent. At the same time, it
constructs a list of regular expressions
hywiki--any-wikiword-regexp-list from hywiki-word-regexp, each of
which for performance reasons matches 25 HyWikiWords at a time.
HyWiki then performs regexp-based scanning over buffers using this
list, and checks that each match follows certain boundary rules by
calling hywiki-maybe-at-wikiword-beginning. Valid WikiWords are
then rendered according to hywiki-word-face using overlays. In
addition, there is an implicit button type defined whose action is to
jump to the referent in the case of highlighted WikiWords or to create
it otherwise.
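The batching step can be pictured with a small, self-contained sketch. The function name my/wikiword-batch-regexps is hypothetical, and the use of regexp-opt is an assumption made for illustration rather than a description of HyWiki's internals; only the batch size of 25 comes from the description above.

```elisp
(require 'seq)

(defun my/wikiword-batch-regexps (words &optional batch-size)
  "Return a list of regexps, each matching up to BATCH-SIZE of WORDS.
Hypothetical helper for illustration; BATCH-SIZE defaults to 25."
  (mapcar (lambda (batch)
            ;; `regexp-opt' builds one optimized alternation per batch;
            ;; the `words' argument surrounds it with word boundaries.
            (regexp-opt batch 'words))
          (seq-partition words (or batch-size 25))))

;; Sixty made-up page names are split into batches of 25, 25 and 10.
(defvar my/example-batches
  (my/wikiword-batch-regexps
   (mapcar (lambda (n) (format "PAGE-%d" n))
           (number-sequence 1 60)))
  "Example result, used only to show the shape of the output.")
```

Scanning a buffer then means looping over a handful of medium-sized regular expressions instead of one enormous one or hundreds of tiny ones.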
The first step is to redefine hywiki-word-regexp so that WikiWords
are recognized as uppercase words with optional hyphens and digits.
This can be done with the following Emacs Lisp code.
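;; An uppercase letter followed by one or more uppercase letters,
;; digits, hyphens or underscores, with an optional file suffix.
(setq hywiki-word-regexp
      (format "\\<\\([[:upper:]][-_[:upper:]0-9]+\\)\\>\\(?:%s\\)?"
              (regexp-quote hywiki-file-suffix)))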
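To get a feel for what this pattern accepts, it can be tried outside HyWiki. The sketch below is only an illustration: the names my/upper-wikiword-regexp and my/upper-wikiword-in are hypothetical, and ".org" is assumed as a stand-in for hywiki-file-suffix. It binds case-fold-search to nil explicitly, because with case folding enabled Emacs lets [[:upper:]] match lowercase letters as well.

```elisp
;; Standalone copy of the pattern; ".org" stands in for
;; `hywiki-file-suffix' and is an assumption.
(defvar my/upper-wikiword-regexp
  (format "\\<\\([[:upper:]][-_[:upper:]0-9]+\\)\\>\\(?:%s\\)?"
          (regexp-quote ".org"))
  "Illustrative copy of the all-caps WikiWord pattern.")

(defun my/upper-wikiword-in (text)
  "Return the first all-caps WikiWord found in TEXT, or nil."
  (let ((case-fold-search nil))   ; keep [[:upper:]] strictly uppercase
    (with-syntax-table (standard-syntax-table)
      (when (string-match my/upper-wikiword-regexp text)
        (match-string 1 text)))))

(my/upper-wikiword-in "Notes on ZETTEL-KASTEN and links")
(my/upper-wikiword-in "Keep it DRY")
(my/upper-wikiword-in "CamelCase is ignored")
```

The first two calls pick out ZETTEL-KASTEN and DRY, while the CamelCase example yields nil. Note that the pattern needs at least two characters, so a lone capital letter is never treated as a WikiWord.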
Once that changes, the other variables that depend on this one have to be redefined as well. Otherwise, links with section references or line numbers will not be recognized.
(setq hywiki-word-with-optional-suffix-regexp
      (concat hywiki-word-regexp hywiki-word-section-regexp "??"
              hywiki-word-line-and-column-numbers-regexp "?"))
(setq hywiki-word-with-optional-suffix-exact-regexp
      (concat "\\`" hywiki-word-regexp "\\(#[^][#\n\r\f]+\\)??"
              hywiki-word-line-and-column-numbers-regexp "?\\'"))
After changing those expressions, the referent hash table should be regenerated so that the cache matches the new pattern.
(hywiki-clear-referent-hasht)
(hywiki-make-referent-hasht)
There is a long-standing issue in wikis known as the plural problem, where singular and plural WikiWords are treated as distinct pages even when they refer to the same concept. HyWiki supports automatic recognition of plural WikiWords using simple heuristics, which then refer back to the singular form. Unfortunately, this is not very reliable because English plurals can be highly irregular (e.g., mouse/mice, child/children, analysis/analyses). It also breaks down for HyWikiWords written in other languages that have different pluralization rules. Since WikiWords typically represent abstract concepts rather than concrete objects, one solution is simply to pick an appropriate form for the WikiWord and treat it as an invariant noun, singulare tantum or plurale tantum. The surrounding text should then be rephrased accordingly.
Because of the above issues, we will disable HyWiki’s pluralization functionality using the following code.
(setq hywiki-allow-plurals-flag nil)
The code so far works well for text written in Western languages, but CJK characters complicate the picture. Han characters do not have case distinctions, and the Chinese and Japanese languages do not typically use spaces as word separators. If we limit ourselves to Latin characters to constitute WikiWords, terms from East Asian languages can be included through romanization. Typographically, a space should be added between an adjacent Han character and Western character, sometimes referred to humorously as Pangu spacing (盤古之白) after the myth of Pangu, who is said to have separated heaven and earth. This space then serves to delimit the HyWikiWords from regular words in CJK text. This rule does not apply to full-width punctuation, however, as such characters already contain a visual space inside the glyph itself. Consequently, we need to write additional code to handle this case correctly.
First, we need to modify hywiki--buttonize-character-regexp, which
is responsible for matching a separating character at the end of a
HyWikiWord to trigger HyWikiWord highlighting. This regular
expression is appended to hywiki--any-wikiword-regexp-list whenever
hywiki-get-referent-hasht is called. Normally, it is derived from
the output of the function hywiki-get-buttonize-characters, which
gives the ASCII characters that have punctuation and symbol syntax in
Org Mode. However, to accommodate non-ASCII punctuation, we instead
simply define it to match any character that is not a valid HyWikiWord
character. We then also need to update the other variables that are
derived from this definition accordingly. These expressions must be
applied before refreshing the referent cache, so that the updated regular
expressions are correctly incorporated into subsequent lookups and
highlighting behaviour.
(setq hywiki--buttonize-character-regexp "\\([^-_*#:[:alnum:]]\\|$\\)")
(setq hywiki--word-and-buttonize-character-regexp
      (concat "\\(" hywiki-word-with-optional-suffix-regexp "\\)"
              hywiki--buttonize-character-regexp))
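The effect of the new delimiter pattern can be checked in isolation. This is a minimal sketch with hypothetical names (my/buttonize-char-regexp and my/buttonize-char-p); it copies the regular expression above and anchors it at the start of a string to ask whether that string begins with a delimiter.

```elisp
(defvar my/buttonize-char-regexp "\\([^-_*#:[:alnum:]]\\|$\\)"
  "Illustrative copy of the delimiter pattern defined above.")

(defun my/buttonize-char-p (string)
  "Return t if STRING is empty or starts with a delimiter character."
  (and (string-match-p (concat "\\`" my/buttonize-char-regexp) string)
       t))

(my/buttonize-char-p "，")   ; full-width comma
(my/buttonize-char-p "。")   ; ideographic full stop
(my/buttonize-char-p " ")    ; ordinary space
(my/buttonize-char-p "漢")   ; Han character
(my/buttonize-char-p "-")    ; hyphen, which may be part of a WikiWord
```

The first three calls return t and the last two return nil. Han characters count as alphanumeric for [:alnum:] in Emacs, which is why a separating space is still needed between a WikiWord and an adjacent Han character, whereas full-width punctuation terminates the word on its own.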
Similarly, the function hywiki-maybe-at-wikiword-beginning
determines whether point is positioned at the start of a WikiWord. By
default, it recognizes only a limited set of ASCII punctuation
characters and whitespace as valid boundary markers. To support mixed
CJK text, we override this function to also recognize selected CJK
punctuation characters, such as full-width commas and sentence
terminators. In addition, we introduce an association list that maps
non-ASCII delimiters to their closest ASCII equivalents, ensuring that
HyWikiWords following CJK brackets are also handled correctly. This
mapping will be reused in the subsequent step.
(setq asciify-delimiters-alist
      '(((?\“ ?\” ?\„ ?\« ?\» ?\" ?\『 ?\』 ?\「 ?\」 ?\﹃ ?\﹄ ?\﹁ ?\﹂) . ?\")
        ((?\‘ ?\’ ?\‚ ?\‹ ?\› ?\') . ?\')
        ((?\[ ?\〔 ?\【 ?\〖 ?\﹇ ?\︹ ?\︻ ?\︗) . ?\[)
        ((?\] ?\〕 ?\】 ?\〗 ?\﹈ ?\︺ ?\︼ ?\︘) . ?\])
        ((?\《 ?\〈 ?\︿ ?\︽) . ?\<)
        ((?\》 ?\〉 ?\﹀ ?\︾) . ?\>)
        ((?\( ?\︵) . ?\()
        ((?\) ?\︶) . ?\))
        ((?\{ ?\︷) . ?\{)
        ((?\} ?\︸) . ?\})))
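(advice-add 'hywiki-maybe-at-wikiword-beginning :override
  (lambda ()
    (let ((prev-char (char-before)))
      (when (or (bolp)
                (and prev-char
                     ;; Whitespace, selected Western and CJK punctuation,
                     ;; or an opening delimiter may precede a WikiWord.
                     (or (memq (char-syntax prev-char) '(?\ ))
                         (memq prev-char '(?, ?、 ?; ?: ?。 ?. ?? ?!))
                         ;; The final string is the text being searched,
                         ;; not a regexp, so its backslashes are plain
                         ;; string escapes.
                         (string-match (regexp-quote
                                        (char-to-string
                                         (or (assoc-default prev-char asciify-delimiters-alist #'seq-contains-p)
                                             prev-char)))
                                       "\[\(\{\<\"'`"))))
        (or prev-char 0)))))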
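The lookup idiom used in the advice may look unusual, since the keys of the association list are themselves lists of characters. The following sketch isolates it using a shortened mapping; my/delimiters-alist and my/asciify-delimiter are hypothetical names introduced only for this illustration.

```elisp
(require 'seq)

(defvar my/delimiters-alist
  '(((?\" ?「 ?」 ?『 ?』) . ?\")
    ((?\[ ?【 ?〔) . ?\[))
  "Shortened, illustrative version of the delimiter mapping.")

(defun my/asciify-delimiter (char)
  "Return the ASCII equivalent of CHAR, or CHAR itself if unmapped."
  ;; `assoc-default' calls the test with each key (a list of
  ;; characters) and CHAR, so `seq-contains-p' checks membership.
  (or (assoc-default char my/delimiters-alist #'seq-contains-p)
      char))

(char-to-string (my/asciify-delimiter ?「))
(char-to-string (my/asciify-delimiter ?【))
(char-to-string (my/asciify-delimiter ?a))
```

The corner bracket maps to a straight double quote, the lenticular bracket to an opening square bracket, and an unmapped character such as a is returned unchanged.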
That addresses most of the issues. However, HyWiki distinguishes
between delimited WikiWords enclosed in brackets or quotation marks
and non-delimited WikiWords. Unfortunately, the sets of delimiters
and quotation characters are hard-coded in the HyWiki implementation.
To work around this limitation, we redefine char-before and
char-after within hywiki-maybe-highlight-balanced-pairs, replacing
non-ASCII delimiters with their ASCII equivalents using the previously
defined association list. This approach is admittedly a hack, but it
seems to work as intended.
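(require 'cl-lib)

(advice-add 'hywiki-maybe-highlight-balanced-pairs :around
  (lambda (oldfun &rest args)
    ;; Capture the real primitives first, so that the temporary
    ;; replacements below do not end up calling themselves.
    (let ((orig-char-before (symbol-function 'char-before))
          (orig-char-after (symbol-function 'char-after)))
      (cl-flet ((map-char (c) (or (assoc-default c asciify-delimiters-alist #'seq-contains-p) c)))
        (cl-letf (((symbol-function 'char-before)
                   (lambda (&optional pos) (map-char (funcall orig-char-before pos))))
                  ((symbol-function 'char-after)
                   (lambda (&optional pos) (map-char (funcall orig-char-after pos)))))
          (apply oldfun args))))))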
Finally, we need to wrap the whole code using with-eval-after-load,
so that it is only run after HyWiki has been loaded.
(with-eval-after-load 'hywiki
  (require 'cl-lib)

  ;; Map non-ASCII delimiters to their closest ASCII equivalents.
  (setq asciify-delimiters-alist
        '(((?\“ ?\” ?\„ ?\« ?\» ?\" ?\『 ?\』 ?\「 ?\」 ?\﹃ ?\﹄ ?\﹁ ?\﹂) . ?\")
          ((?\‘ ?\’ ?\‚ ?\‹ ?\› ?\') . ?\')
          ((?\[ ?\〔 ?\【 ?\〖 ?\﹇ ?\︹ ?\︻ ?\︗) . ?\[)
          ((?\] ?\〕 ?\】 ?\〗 ?\﹈ ?\︺ ?\︼ ?\︘) . ?\])
          ((?\《 ?\〈 ?\︿ ?\︽) . ?\<)
          ((?\》 ?\〉 ?\﹀ ?\︾) . ?\>)
          ((?\( ?\︵) . ?\()
          ((?\) ?\︶) . ?\))
          ((?\{ ?\︷) . ?\{)
          ((?\} ?\︸) . ?\})))

  (setq hywiki-allow-plurals-flag nil)

  ;; All-caps link pattern and the regexps derived from it.
  (setq hywiki-word-regexp
        (format "\\<\\([[:upper:]][-_[:upper:]0-9]+\\)\\>\\(?:%s\\)?"
                (regexp-quote hywiki-file-suffix)))
  (setq hywiki-word-with-optional-suffix-regexp
        (concat hywiki-word-regexp hywiki-word-section-regexp "??"
                hywiki-word-line-and-column-numbers-regexp "?"))
  (setq hywiki-word-with-optional-suffix-exact-regexp
        (concat "\\`" hywiki-word-regexp "\\(#[^][#\n\r\f]+\\)??"
                hywiki-word-line-and-column-numbers-regexp "?\\'"))

  ;; Treat any non-WikiWord character as a trailing delimiter.
  (setq hywiki--buttonize-character-regexp "\\([^-_*#:[:alnum:]]\\|$\\)")
  (setq hywiki--word-and-buttonize-character-regexp
        (concat "\\(" hywiki-word-with-optional-suffix-regexp "\\)"
                hywiki--buttonize-character-regexp))

  (advice-add 'hywiki-maybe-at-wikiword-beginning :override
    (lambda ()
      (let ((prev-char (char-before)))
        (when (or (bolp)
                  (and prev-char
                       (or (memq (char-syntax prev-char) '(?\ ))
                           (memq prev-char '(?, ?、 ?; ?: ?。 ?. ?? ?!))
                           ;; The final string is the text being searched,
                           ;; not a regexp.
                           (string-match (regexp-quote
                                          (char-to-string
                                           (or (assoc-default prev-char asciify-delimiters-alist #'seq-contains-p)
                                               prev-char)))
                                         "\[\(\{\<\"'`"))))
          (or prev-char 0)))))

  (advice-add 'hywiki-maybe-highlight-balanced-pairs :around
    (lambda (oldfun &rest args)
      ;; Capture the real primitives first, so that the temporary
      ;; replacements below do not end up calling themselves.
      (let ((orig-char-before (symbol-function 'char-before))
            (orig-char-after (symbol-function 'char-after)))
        (cl-flet ((map-char (c) (or (assoc-default c asciify-delimiters-alist #'seq-contains-p) c)))
          (cl-letf (((symbol-function 'char-before)
                     (lambda (&optional pos) (map-char (funcall orig-char-before pos))))
                    ((symbol-function 'char-after)
                     (lambda (&optional pos) (map-char (funcall orig-char-after pos)))))
            (apply oldfun args))))))

  ;; Rebuild the referent cache so that it matches the new pattern.
  (hywiki-clear-referent-hasht)
  (hywiki-make-referent-hasht))
Now that the link pattern is in place and the main multilingual issues
have been addressed, it is worth outlining the note-taking workflow
that I use. Although I am using HyWiki as the note-taking system, I
still refer to the whole thing as a Zettelkasten because it functions as
a personal tool for thinking and writing, rather than as an online
collaboration tool like WikiWikiWeb. In particular, I adopt several
ideas commonly associated with Zettelkästen, such as keeping notes
atomic (i.e., limited to a single concept), writing notes in my own
words rather than as excerpts and allowing structure to emerge from
the network of connections rather than from a predefined hierarchy.
The directory in which the notes are stored lives under the home
directory and is simply named zk.
Keeping to a standard naming convention for WikiWords is important to avoid the issue of multiple WikiWords for the same underlying concept. I prefer the singular form wherever appropriate because the pages these link to describe a single conceptual entity rather than a collection of instances. This also lends itself well to forming an ontology through is-a and has-a relationships (e.g., DOG is an ANIMAL and HUMAN has a BRAIN). Ideally, WikiWords should be nouns or noun phrases, avoiding gerunds in favour of specific deverbals (e.g., COMPILATION over COMPILING). In general, I use the (British) English term for a concept following the Oxford English Dictionary spelling, unless it is strongly associated with another particular language or untranslatable (e.g., computing terminology, local customs, food items, etc.).
I try to keep each page as simple as possible. Org Mode allows you to add tags to files and headings, but tags are surprisingly difficult to use effectively and risk turning into a parallel classification system that competes with WikiWords. For this reason, I restrict tags to a very narrow role, using them only as state indicators to mark pages as, for example, stubs, although some individuals like to use a garden analogy to indicate the maturity of their notes. Aside from a small set of file tags, I do not include any additional metadata – not even a title, since the WikiWord itself serves this function. I also keep the markup as light as possible, using headings and lists to provide some structure for larger pages. Occasionally, it can be useful to use emphasis such as bold, italics or underline, but only sparingly.
Whilst I try to keep each page relatively small so that it corresponds to one concept, this is easier said than done. In reality, I will simply start writing, and, when a page gets too big, I split it. I do not make a hard distinction between note types. They roughly follow the GRID system but are not marked explicitly in any way. The vast majority of the notes I keep are definitions of terms. For reflections on notable pieces of literature, I will create a WikiWord titled according to the short title of the work. I do not bother with creating a full bibliographic entry for every work referenced in pages, since this adds significant overhead. Instead, I simply link to a stable identifier, such as a URL or ISBN, that lets me locate the source again when needed. For smaller works like web pages and short papers, I will download a copy and add it directly as an attachment to the page.
Having some topical pages that link to subtopics is an important way
to keep the content organized. Within these pages, instead of
providing a list of relevant WikiWords, I aim to synthesize the
information on the topic by specifying the relationship between each
of the concepts in order to form a web of relationships. Were you to
plot the links between the pages, the resulting graph might resemble
the spokes on a wheel, the wheels themselves being interconnected.
The system grows incrementally starting from a small set of central
pages, and subtopics are added and filled out gradually as needed.
The starting point to my Zettelkasten is a personal button file
containing links to a few interesting pages. Alternatively, WikiWords
can be found at any time using the Hyperbole pop-up menu via {C-h h}
and selecting HyWiki.
Whilst I do write a journal, I keep this separate from the Zettelkasten in a single Org Mode file using a hierarchy of headings following the ISO week date system. I find that this separation helps, because my journal contains mainly ephemeral information, such as reminders about appointments, tasks to complete and fleeting thoughts, as compared to the Zettelkasten, which stores “evergreen” notes. To make it convenient to quickly write down entries in the journal, I use the Org Mode capture feature for creating plain headings, tasks and appointments anywhere from within Emacs. At the start of each new year, I archive the previous year’s entries to keep the file manageable. I also keep contact information in a separate Rolodex-style file in Org Contacts format and search it using HyRolo.
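A capture setup along these lines could look like the sketch below. It is not my exact configuration: the file name ~/journal.org and the "j" key are placeholders, and it relies on Org's built-in week-based date tree, whose heading layout may differ from a hand-made ISO week hierarchy.

```elisp
(require 'org-capture)

;; "~/journal.org" and the "j" key are placeholders for illustration.
(add-to-list 'org-capture-templates
             '("j" "Journal entry" entry
               (file+olp+datetree "~/journal.org")
               "* %?"
               ;; File entries under year, ISO week and day headings.
               :tree-type week))
```

With this in place, M-x org-capture followed by j opens a new heading under the current week from anywhere in Emacs.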
Hyperbole turns Org Mode into an excellent digital Zettelkasten with its lightweight hypertext capabilities. The use of WikiWords makes notes easy to create and refer to, allowing you to focus fully on the act of writing. Regardless of whether you already take your notes in Org Mode, if you are looking for a more powerful way to connect ideas together, HyWiki is a very good fit.