Harper's Architecture

This document seeks to solve one simple problem:

"Roughly, it takes 2x more time to write a patch if you are unfamiliar with the project, but it takes 10x more time to figure out where you should change the code." - Alex Kladov

This document is meant to serve as a kind of table of contents for the Harper project. Hopefully, we can reduce that 10x down to something a little more reasonable.

What does Harper do?

Harper tries to do one thing well: find grammatical and spelling errors in English text. Where possible, it also suggests corrections for those errors. An error and its possible corrections together form what we call a lint.

In that sense, Harper is a linter for English.

harper-core

harper-core is where all the magic happens. It contains the code needed to tokenize, parse, analyze and lint English text.

At a high level, there are just a few types you need to worry about.

  • Document: A representation of an English document. Implements TokenStringExt to make it easier to query.
  • Parser: A trait that describes an object that consumes text and emits tokens. The name is something of a misnomer, since it is only supposed to lex English (and emit Tokens), not parse it. It is called a parser because most types that implement this trait parse other languages (JavaScript, for example) to extract the English text.
  • Linter: A trait that, provided a document, will produce zero or more Lints. This is usually done using direct queries on the document or by implementing a PatternLinter.
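To make the Parser idea concrete, here is a minimal sketch of a lexer trait with one implementation that extracts English from another language before lexing it. Every name and signature here (Span, Token, PlainWords, LineComments, and this version of Parser) is a simplified stand-in invented for illustration; harper-core's real definitions differ.

```rust
// Simplified stand-ins for illustration; NOT harper-core's actual definitions.

/// Half-open byte range into the source text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

/// A lexed unit. Here, every token is simply a word.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Token {
    pub span: Span,
}

/// Consumes text and emits tokens (a lexer, despite the name).
pub trait Parser {
    fn parse(&self, source: &str) -> Vec<Token>;
}

/// Hypothetical lexer for plain text: one token per run of letters or apostrophes.
pub struct PlainWords;

impl Parser for PlainWords {
    fn parse(&self, source: &str) -> Vec<Token> {
        let mut tokens = Vec::new();
        let mut start: Option<usize> = None;
        for (i, c) in source.char_indices() {
            if c.is_alphabetic() || c == '\'' {
                if start.is_none() {
                    start = Some(i);
                }
            } else if let Some(s) = start.take() {
                tokens.push(Token { span: Span { start: s, end: i } });
            }
        }
        if let Some(s) = start {
            tokens.push(Token { span: Span { start: s, end: source.len() } });
        }
        tokens
    }
}

/// Hypothetical parser for source code: finds `//` line comments,
/// hands the comment text to an inner lexer, and shifts the resulting
/// spans so they point back into the original file.
pub struct LineComments<P: Parser> {
    pub inner: P,
}

impl<P: Parser> Parser for LineComments<P> {
    fn parse(&self, source: &str) -> Vec<Token> {
        let mut tokens = Vec::new();
        let mut offset = 0; // byte offset of the current line in `source`
        for line in source.split_inclusive('\n') {
            if let Some(pos) = line.find("//") {
                let comment_start = offset + pos + 2;
                for mut token in self.inner.parse(&line[pos + 2..]) {
                    token.span.start += comment_start;
                    token.span.end += comment_start;
                    tokens.push(token);
                }
            }
            offset += line.len();
        }
        tokens
    }
}
```

Calling LineComments { inner: PlainWords }.parse(source) on a line of code with a trailing comment yields tokens only for the words inside the comment, with spans relative to the whole file. That is the sense in which such a type "parses" another language while only lexing the English.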

If you want to add a linter to Harper, create a new file under the linters module in harper-core and create a public struct that implements the Linter trait. There are a couple places in other parts of the codebase you'll need to update before it will show up in editors and have persistent settings, but that's a problem for after you've opened your pull request.
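As a rough sketch of what "a public struct that implements the Linter trait" looks like, the example below flags a word that is immediately repeated. The Document, Lint, and Linter definitions are simplified stand-ins written for this example, and RepeatedWord is a hypothetical linter; check the linters module in harper-core for the real trait and types before copying anything.

```rust
// Simplified stand-ins for illustration; NOT harper-core's actual definitions.

/// Stand-in document: just the text plus a helper to query its words.
pub struct Document {
    text: String,
}

impl Document {
    pub fn new(text: &str) -> Self {
        Self { text: text.to_string() }
    }

    /// Returns each whitespace-separated word with its starting byte offset.
    pub fn words(&self) -> Vec<(usize, &str)> {
        let mut out = Vec::new();
        let mut start: Option<usize> = None;
        for (i, c) in self.text.char_indices() {
            if c.is_whitespace() {
                if let Some(s) = start.take() {
                    out.push((s, &self.text[s..i]));
                }
            } else if start.is_none() {
                start = Some(i);
            }
        }
        if let Some(s) = start {
            out.push((s, &self.text[s..]));
        }
        out
    }
}

/// An error together with its possible corrections.
#[derive(Debug, PartialEq, Eq)]
pub struct Lint {
    pub start: usize,
    pub end: usize,
    pub message: String,
    pub suggestions: Vec<String>,
}

/// Provided a document, produce zero or more lints.
pub trait Linter {
    fn lint(&mut self, document: &Document) -> Vec<Lint>;
}

/// Hypothetical linter: flags a word that appears twice in a row.
pub struct RepeatedWord;

impl Linter for RepeatedWord {
    fn lint(&mut self, document: &Document) -> Vec<Lint> {
        let words = document.words();
        let mut lints = Vec::new();
        for pair in words.windows(2) {
            let (first_start, first) = pair[0];
            let (second_start, second) = pair[1];
            if first.eq_ignore_ascii_case(second) {
                lints.push(Lint {
                    // The lint covers both copies of the word.
                    start: first_start,
                    end: second_start + second.len(),
                    message: format!("The word \"{}\" is repeated.", first),
                    // Suggest replacing the pair with a single copy.
                    suggestions: vec![first.to_string()],
                });
            }
        }
        lints
    }
}
```

The linter queries the document directly and returns one Lint per problem, each carrying the affected range and a suggested replacement. A document with no repeated words produces an empty list.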

harper-ls

harper-ls is a language server that wraps around harper-core. In essence, it enables text editors and IDEs to access the capabilities of Harper over a network or via standard input/output.

If you aren't familiar with what a language server does, I would suggest reading this or the official Language Server Protocol documentation.

When Harper is used through Neovim, Visual Studio Code, Helix, Emacs or Sublime Text, harper-ls is the interface.
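For a sense of what travels between an editor and a language server over standard input/output, the sketch below shows the Language Server Protocol's base framing: a Content-Length header, a blank line, then a JSON-RPC body. This is generic protocol framing, not code from harper-ls, and the function names frame_message and read_message are made up for this example.

```rust
// Generic LSP base-protocol framing; illustrative only, not harper-ls code.

/// Wraps a JSON-RPC payload in the header the protocol requires.
/// The length is the size of the body in bytes.
pub fn frame_message(json_body: &str) -> String {
    format!("Content-Length: {}\r\n\r\n{}", json_body.len(), json_body)
}

/// Extracts the body of one framed message.
/// Returns None if the header is missing or the body is incomplete.
pub fn read_message(framed: &str) -> Option<&str> {
    let (header, rest) = framed.split_once("\r\n\r\n")?;
    let len: usize = header
        .lines()
        .find_map(|line| line.strip_prefix("Content-Length: "))?
        .trim()
        .parse()
        .ok()?;
    rest.get(..len)
}
```

An editor and a server exchange messages of this shape in both directions. In practice, a server is usually built on a library that handles the framing and JSON-RPC dispatch.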

You can read more about it here.

harper.js

harper.js is a JavaScript/TypeScript module that enables developers to use Harper on any platform that supports JavaScript and WebAssembly. Most of the JavaScript code in harper.js exists to load and manage the underlying WebAssembly module (otherwise known as harper-wasm).

There are more details about it in the documentation.
