Architecture notes

Automatic input mode

There are two possible ways DomTerm can detect cooked vs raw mode:

Line structure

"Line" here refers to a "visual line": a section of the DOM that should be treated as a line for cursor movement. Line breaks may come from the back-end, or be inserted by the line-break algorithm.

The lineStarts array maps from a line number to the DOM location of the start of the corresponding line.

The lineEnds array maps from a line number to the end of the corresponding line. Each entry always points to a span node with the line attribute set. Normally lineEnds[i] == lineStarts[i+1]; however, sometimes lineStarts[i] is the start of a <div> or other block element.
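The relationship between the two arrays can be sketched as follows. This is a minimal illustration, not DomTerm's code: the plain objects stand in for DOM nodes, the line attribute value "hard" is only a placeholder, and the helper linesNotStartingAtPreviousEnd is hypothetical.

```javascript
// Hypothetical stand-ins for DOM nodes; only the fields the sketch needs.
const firstBlock = { tag: "div" };                  // start of line 0
const break0 = { tag: "span", line: "hard" };       // ends line 0, starts line 1
const break1 = { tag: "span", line: "hard" };       // ends line 1
const secondBlock = { tag: "div" };                 // start of line 2 (a block element)
const break2 = { tag: "span", line: "hard" };       // ends line 2

const lineStarts = [firstBlock, break0, secondBlock];
const lineEnds = [break0, break1, break2];

// Return the line numbers whose start is NOT the end of the previous line,
// i.e. the lines where lineEnds[i] == lineStarts[i+1] does not hold.
function linesNotStartingAtPreviousEnd(starts, ends) {
  const result = [];
  for (let i = 0; i + 1 < starts.length; i++) {
    if (ends[i] !== starts[i + 1]) result.push(i + 1);
  }
  return result;
}
```

In this example only line 2 is reported, because it starts at a block element rather than at the span that ends line 1.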

Caret positions

Valid positions for caret (for line-editing or view mode) are as follows:

In text, the caret can be at either end, or between grapheme clusters. The caret can be before or after a line element, and also before an object element, which includes elements whose tag is object, canvas, img, svg, or iframe. It can also be at the end of a th or td element.

A caret position has no width, but it has a height: For a text element it is the same as the bounding box of the caret position. If before/after an object, it should be the height of the object.

There is an ambiguity when between two objects or between an object and text. LibreOffice and Chrome display the caret with the height of the text that is before the caret: The caret's vertical extent depends on the preceding text. For Firefox it’s context-dependent: It depends on the previous caret position, or (after a mouse-click) which character was clicked. This is related to the ambiguity when the caret is at a soft line-break: Should it be shown at the end of the preceding line or the start of the following line? In this case it makes sense to resolve based on the starting position.

The visible height of the focus caret should be that of the caret.

The move-character-left/right commands should move the caret to the previous/next valid caret position, in document order.
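For the text-only case, the rule that the caret may sit at either end or between grapheme clusters can be sketched with the standard Intl.Segmenter API. This is only an illustration on a plain string; the function names caretOffsets and moveCaret are hypothetical, and line elements, objects, and table cells are not handled.

```javascript
// Valid caret offsets (in UTF-16 code units) within a run of text:
// offset 0, plus the end of every grapheme cluster.
function caretOffsets(text) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  const offsets = [0];
  for (const { index, segment } of segmenter.segment(text)) {
    offsets.push(index + segment.length);
  }
  return offsets;
}

// Move to the next (direction > 0) or previous (direction < 0) valid offset.
// Returns the offset unchanged when there is no such position in this text.
function moveCaret(text, offset, direction) {
  const offsets = caretOffsets(text);
  if (direction > 0) {
    const next = offsets.find((o) => o > offset);
    return next === undefined ? offset : next;
  }
  const previous = offsets.filter((o) => o < offset).pop();
  return previous === undefined ? offset : previous;
}
```

With the string "ae\u0301b" (an "e" followed by a combining acute accent), moving right from offset 1 skips to offset 3, so the caret never lands between the base letter and its accent.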

Colors and high-lighting

This needs updating.

Escape sequences (for example "\e[4m" - "underlined", or "\e[32m" - "set foreground color to green") are translated to <span> elements with "style" attributes (for example ‘<span style="text-decoration:underline">‘ or ‘<span style="color: green">‘). After creating such a ‘<span>‘ the current position is moved inside it.

If we’ve previously processed "set foreground color to green", and we see a request for "underlined", it is easy to create a nested ‘<span>‘ for the latter. But what if we then see "set foreground color to red"? We don’t want to nest ‘<span style="color: red">‘ inside ‘<span style="color: green">‘, since that could lead to deep and ugly nesting. Instead, we move the cursor outside both existing spans, and then create new spans for red and underlined.

The ‘<span>‘ nodes are created lazily just before characters are inserted, by ‘_adjustStyle‘, which compares the current active styles with the desired ones (set by ‘_pushStyle‘).
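The comparison of active and desired styles can be sketched as below. This is a simplified model and not the actual ‘_adjustStyle‘: styles are plain property/value pairs, the function planStyleAdjustment is hypothetical, and it only reports how many open spans to leave and which spans to create.

```javascript
// active:  array of [property, value] pairs for the currently open nested
//          spans, outermost first.
// desired: Map from property to value for the text about to be inserted.
function planStyleAdjustment(active, desired) {
  // Any open span whose style is changed or no longer wanted forces us to
  // move outside all existing spans rather than nest a conflicting one.
  const conflict = active.some(([prop, value]) => desired.get(prop) !== value);
  const kept = conflict ? [] : active;
  const keptProps = new Set(kept.map(([prop]) => prop));
  const open = [...desired].filter(([prop]) => !keptProps.has(prop));
  return { close: active.length - kept.length, open };
}
```

For example, with green and underlined active and red plus underlined desired, the plan is to leave both spans and create fresh spans for red and underlined, matching the behavior described above.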

A possibly better approach would be to map each highlight style to a ‘class‘ attribute value (for example ‘green-foreground-style‘ and ‘underlined-style‘). A default stylesheet can map each style class to the corresponding CSS rules. This has the advantage that one could override the highlighting appearance with a custom style sheet.
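A sketch of the class-based alternative, covering only the two escape sequences mentioned above (SGR 4 for underlined, SGR 32 for green foreground). The table SGR_STYLE_CLASS and the function classAttributeFor are hypothetical names; the class names are the examples from the text.

```javascript
// Map from SGR parameter to style class (only the two examples from the text).
const SGR_STYLE_CLASS = new Map([
  [4, "underlined-style"],
  [32, "green-foreground-style"],
]);

// Build the value of a class attribute from the active SGR parameters.
// Parameters without a known class are ignored in this sketch.
function classAttributeFor(sgrCodes) {
  return sgrCodes
    .map((code) => SGR_STYLE_CLASS.get(code))
    .filter((name) => name !== undefined)
    .join(" ");
}
```

A default stylesheet would then carry rules for these class names, and a user stylesheet could override them without any change to the generated DOM.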

Remote connection over ssh

The support for remote connections is a relatively thin layering on top of ssh. When you type:

domterm user@host command

it basically translates to:

ssh user@host domterm --browser-pipe command

All connection setup and logging in are handled by ssh. If you do a command like:

domterm user@host status

this will run domterm --browser-pipe status on the host. In this case the --browser-pipe option is irrelevant: Output from the command will be printed on standard output, which ssh then sends back to the local domterm, which prints it on the local terminal.
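The translation above amounts to prefixing the remote command with ssh, the target, and domterm --browser-pipe. A minimal sketch, assuming the target and command have already been separated from any local options; the function remoteCommandArgv is hypothetical.

```javascript
// Build the argument vector for the ssh process that runs the remote domterm.
// target:  for example "user@host"
// command: the remaining command words, for example ["status"]
function remoteCommandArgv(target, command) {
  return ["ssh", target, "domterm", "--browser-pipe", ...command];
}
```

For the status example, remoteCommandArgv("user@host", ["status"]) yields the words of ssh user@host domterm --browser-pipe status.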

If the command creates a window, things get more interesting. For example:

domterm --qt user@host attach :1

Again, ssh will execute domterm --browser-pipe attach :1 on the remote host. This will attach to existing session 1. Then, when domterm would normally create a new window, it sees the --browser-pipe option. It prints a special escape character, which is sent back to the local domterm server, which switches from command-proxy mode to display-proxy mode, opening up a window. (A Qt window because of the --qt option.) In display-proxy mode, characters and other events from the window are written to the input of the ssh process, while output from the remote user process is sent via the remote domterm server over ssh to the local domterm server, which displays it in the window.

            ┌─────────────────────────────────────┐
 Front-end  │  Window, input devices (keyboard)   │
            ├─────────────────────────────────────┤
            │  Browser engine (runs terminal.js)  │
            └─────────────────────────────────────┘
                  🠉
               WebSockets connection (local)
                  🠋
            ┌─────────────────────────────────────┐
 Local      │  Proxy (local end)                  │
 back-end   ├─────────────────────────────────────┤
            │  Ssh client (local session)         │
            └─────────────────────────────────────┘
                  🠉
               ssh connection (network)
                  🠋
            ┌─────────────────────────────────────┐
  Remote    │  Proxy (remote end)                 │
  back-end  ├─────────────────────────────────────┤
            │  Application (session)              │
            └─────────────────────────────────────┘

(Compare with diagram at Terminology and Concepts.)
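The switch of the local proxy from command-proxy mode to display-proxy mode can be sketched as a small state machine. This is an illustration only: the class LocalProxy is hypothetical, the value of WINDOW_REQUEST is a made-up placeholder (the actual escape character is not specified here), and printing or displaying is modeled by appending to arrays.

```javascript
// Placeholder for the "special escape character"; the real value is not
// given in these notes, so this one is purely an assumption.
const WINDOW_REQUEST = "\u0012";

class LocalProxy {
  constructor() {
    this.mode = "command-proxy";
    this.printed = [];    // text printed on the local terminal
    this.displayed = [];  // text shown in the newly opened window
  }

  // Handle data arriving from the ssh process.
  fromRemote(data) {
    if (this.mode === "command-proxy") {
      const at = data.indexOf(WINDOW_REQUEST);
      if (at < 0) {
        this.printed.push(data);
        return;
      }
      if (at > 0) this.printed.push(data.slice(0, at));
      // The remote end wants a window: switch modes (window opening omitted).
      this.mode = "display-proxy";
      data = data.slice(at + WINDOW_REQUEST.length);
    }
    if (data.length > 0) this.displayed.push(data);
  }
}
```

A command such as status never sends the marker, so the proxy stays in command-proxy mode and everything is printed locally.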

Predictive echo

Mosh implements local “tentative echo”, which makes network latency less of a problem. DomTerm implements this by leveraging the “deferred deletion” mechanism (used for line mode echo).

To do this we use a <span> that contains predicted input: an optional text node, the _caretNode, and an optional text node. The node has 3 additional properties: textBefore, textAfter, and pendingEcho. When output arrives from the server, the function _doDeferredDeletion is called, which replaces the span by the textBefore and textAfter, with the _caretNode in between; this is “real” (confirmed) output, before processing the new output. We also call _doDeferredDeletion when unable to do echo prediction.

Handling keyboard input is as follows: First, if _deferredForDeletion is null, we set it to a fresh span that wraps the _caretNode. As needed, any text node immediately before or after can be moved into the _deferredForDeletion span, also setting textBefore and textAfter. Then, for a printing character, we insert it before the caret, and append it to pendingEcho. For left or right arrow, delete, or backspace, if possible we adjust the _deferredForDeletion span appropriately, and add a special code to pendingEcho. If not possible, we _doDeferredDeletion, which we also do for other keys.
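The keyboard handling above can be modeled on plain strings. This sketch is not DomTerm's code: the functions startPrediction and predictKey are hypothetical, the ECHO_* values are made-up placeholders for the "special code", key names follow the standard KeyboardEvent.key values, and text is edited per UTF-16 code unit rather than per grapheme cluster.

```javascript
// Placeholder special codes appended to pendingEcho (assumed values).
const ECHO_LEFT = "\u0001";
const ECHO_RIGHT = "\u0002";
const ECHO_BACKSPACE = "\u0003";
const ECHO_DELETE = "\u0004";

// Models a fresh _deferredForDeletion span around the caret.
// textBefore/textAfter keep the confirmed text; before/after are predicted.
function startPrediction(textBefore, textAfter) {
  return { textBefore, textAfter, before: textBefore, after: textAfter, pendingEcho: "" };
}

// Returns true if the key was predicted; false means the caller should
// fall back to _doDeferredDeletion.
function predictKey(p, key) {
  if (key.length === 1) {           // printing character
    p.before += key;
    p.pendingEcho += key;
    return true;
  }
  switch (key) {
    case "ArrowLeft":
      if (p.before === "") return false;
      p.after = p.before.slice(-1) + p.after;
      p.before = p.before.slice(0, -1);
      p.pendingEcho += ECHO_LEFT;
      return true;
    case "ArrowRight":
      if (p.after === "") return false;
      p.before += p.after[0];
      p.after = p.after.slice(1);
      p.pendingEcho += ECHO_RIGHT;
      return true;
    case "Backspace":
      if (p.before === "") return false;
      p.before = p.before.slice(0, -1);
      p.pendingEcho += ECHO_BACKSPACE;
      return true;
    case "Delete":
      if (p.after === "") return false;
      p.after = p.after.slice(1);
      p.pendingEcho += ECHO_DELETE;
      return true;
    default:
      return false;
  }
}
```

Because textBefore and textAfter are never modified, restoring the confirmed state is just a matter of discarding before, after, and pendingEcho.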

Calling _doDeferredDeletion just before handling output is correct but suboptimal if the output only contains part of the pending echo. In that case we try to create (after handling output) a new _deferredForDeletion span, whose pendingEcho string is a tail of the previous value. (We only do this if there are no changes to any other (logical) line.)
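The "tail of the previous value" can be computed with a simple prefix check. This is a minimal sketch under the assumption that the output echoes the pending characters verbatim; the function remainingEcho is hypothetical, and the check for changes to other lines is omitted.

```javascript
// If the output confirms a leading part of pendingEcho, return the tail
// that is still unconfirmed; otherwise return null (prediction abandoned).
function remainingEcho(pendingEcho, output) {
  if (pendingEcho.startsWith(output)) {
    return pendingEcho.slice(output.length);
  }
  return null;
}
```

A non-empty result would become the pendingEcho of the new _deferredForDeletion span; an empty string means everything predicted has been confirmed.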

Connection closing and detaching

A session can have one or more open windows, or it can be in a detached state, in which case the process (and pty) still exist, along with preserved state that can be used to attach a new window to the session. One can also attach a new window to a session that has open windows, in which case the open windows can provide the state needed to create the new window.

Preserved state is currently represented in two parts: a snapshot (the DOM slightly sanitized and serialized, plus terminal state properties), and a replay log: the session output sent to the terminal since the last snapshot.
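The two-part representation can be sketched as follows. The class PreservedState and its method names are hypothetical, and the snapshot is modeled as a serialized string plus a plain properties object.

```javascript
class PreservedState {
  constructor() {
    this.snapshot = null;   // { dom, properties } from the last snapshot
    this.replayLog = [];    // output chunks sent since that snapshot
  }

  // Every chunk of session output sent to the terminal is also logged.
  recordOutput(chunk) {
    this.replayLog.push(chunk);
  }

  // A new snapshot already reflects all earlier output, so the log restarts.
  saveSnapshot(dom, properties) {
    this.snapshot = { dom, properties };
    this.replayLog = [];
  }

  // What a newly attached window needs: the snapshot, then the replayed output.
  stateForNewWindow() {
    return { snapshot: this.snapshot, replay: this.replayLog.join("") };
  }
}
```

Clearing the log on each snapshot keeps the replay short, since only output newer than the snapshot has to be processed again.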

For each session, the server maintains a detach-count, which is roughly the number of windows that have been explicitly closed. This is also the number of attaches “pending” before a session is closed. When a new non-detached session is created, the detach-count is zero; when a window is detached, the detach-count is incremented; if a window is connected to an existing session, the detach-count is decremented. (If a new session is created in detached state, should the initial detach-count be zero or one?)
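The bookkeeping can be sketched by applying the three rules literally. The class SessionCounter and its method names are hypothetical; the open question about sessions created in detached state is left open, and no policy for when the session is finally closed is implied.

```javascript
class SessionCounter {
  // A new non-detached session starts with a detach-count of zero.
  constructor() {
    this.detachCount = 0;
  }

  // A window is detached from the session.
  windowDetached() {
    this.detachCount += 1;
  }

  // A window is connected to this (already existing) session.
  windowAttached() {
    this.detachCount -= 1;
  }
}
```

After one detach followed by one attach the count is back at zero, which matches the reading of the count as the number of attaches still pending.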

A connection (between session and window) can be closed in 4 ways:

Line-breaking / pretty-printing

For a terminal emulator we need to preserve (not collapse) whitespace, and (usually) we want to line-break in the middle of a word.

These CSS properties come close:

white-space: pre-wrap; word-break: break-all

This is simple and fast. However:

Hence we need to do the job ourselves.
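As an illustration of the required behavior only (not DomTerm's line-break algorithm), the sketch below breaks text at a fixed column width, keeps all whitespace, and breaks in the middle of a word when needed. The function breakAll is hypothetical, and it assumes every character is one column wide.

```javascript
// Split text into visual lines of at most `columns` characters.
// Whitespace is preserved, and words are broken wherever the width runs out.
function breakAll(text, columns) {
  const lines = [];
  for (const logical of text.split("\n")) {
    if (logical === "") {
      lines.push("");
      continue;
    }
    for (let i = 0; i < logical.length; i += columns) {
      lines.push(logical.slice(i, i + columns));
    }
  }
  return lines;
}
```

For example, breakAll("ab  cdefg", 4) keeps the two spaces at the end of the first line and splits "cdefg" after four characters.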