<slides>

<slide id="title">
<caption>JEmacs: re-implementing Emacs using Java and Unicode</caption>
<center>
<p>Per Bothner</p>
<p><code>&lt;per@bothner.com&gt;</code></p>
<p>March 2000</p>
</center>
<ul>
<li><em>What</em> is JEmacs?</li>
<li><em>Why</em> create JEmacs?</li>
<li><em>How</em> does JEmacs work?
What are the problems and how are they solved?</li>
<li>Internationalization in JEmacs and Java.</li>
</ul>
</slide>

<slide id="what">
<caption>What is JEmacs?</caption>
<p>JEmacs is a next-generation Emacs written in Java.</p>
<p>It includes:</p>
<ul>
<li>An implementation of the Scheme language.</li>
<li>An implementation of the ELisp base language.</li>
<li>Compilers for both to Java bytecodes.</li>
<li>A set of Java classes, based on the Swing GUI API, that
implement the Emacs "types", such as <code>Buffer</code>, <code>Keymap</code>,
<code>Window</code>, <code>Marker</code>.</li>
<li>A set of Scheme bindings to the Java methods.
These are "similar to" and have the same names as standard Emacs Lisp
functions, but are written in Scheme and intended to be called from Scheme.</li>
<li>The equivalent ELisp functions:  Implementations of the high-level
Emacs functions as ELisp functions, so existing ELisp applications
can (mostly) run without change.</li>
</ul>
</slide>

<slide id="why-jemacs">
<caption>Why create JEmacs?</caption>
<p>This is a major undertaking.  Some reasons why it might
be worthwhile:</p>
<ul>
<li>Swing is a modern GUI toolkit with good support for major Emacs
concepts.</li>
<li>Building on a Java run-time means we benefit from the work
being done to run Java (bytecodes) fast.</li>
<li>Java is multi-threaded.</li>
<li>Kawa is a modern object-oriented Scheme, while Emacs is based
on rather old design ideas.</li>
<li>Java is based on Unicode and has good internationalization
support.</li>
<li>Java has lots of neat packages we can use.</li>
<li>It would be useful to have Scheme (and ELisp) scripting
for Swing applications.</li>
<li>It is a good way to learn Swing!</li>
<li>One problem:  There is no open-source Swing implementation.</li>
</ul>
</slide>

<slide id="screenshot1">
<caption>A screenshot</caption>
<img src="jemacs1.png"/>
</slide>

<slide id="kawa">
<caption>Kawa</caption>
<p>Kawa is a Scheme implementation written in Java.</p>
<p>Kawa compiles Scheme definitions to Java bytecodes.<br/>
This provides decent performance, with fast response.<br/>
It benefits from products and research into fast Java execution.</p>
<p>Kawa is a modern object-oriented Scheme with many features,
such as a full set of numeric types (including units),
a module system, and easy Java/Scheme integration.</p>
<p>Kawa is GPL'd free software, used by a number of projects.</p>
</slide>

<slide id="kawacomp">
<caption>The Kawa Compiler framework</caption>
<p>
Part of Kawa is a package for generating Java bytecodes,
plus a higher-level package using <code>Expression</code> objects.</p>
<p>
These are used to generate efficient Java bytecodes from Scheme.
There is also an incomplete compiler for EcmaScript (JavaScript).</p>
<p>
Others are using Kawa to compile other languages.</p>
<p>
I am working on compiling Emacs Lisp.
This replaces the Emacs byte-compiler:  Instead of <code>.elc</code>
files loaded into the Emacs bytecode interpreter we use
Java <code>.class</code> bytecode files loaded into a Java Virtual Machine.
</p>
</slide>

<slide id="elisp-vs-scm">
<caption>Problems implementing Emacs Lisp</caption>
<p>Implementing ELisp raises some problems compared to Scheme:</p>
<ul>
<li>Slightly different reader syntax.</li>
<li>ELisp treats false and the empty list as the same;
in Scheme they are different.</li>
<li>Different set of pre-defined functions and macros.</li>
<li>ELisp has separate name-spaces for functions and variables.</li>
<li>ELisp has dynamic scoping and buffer-local variables;
Scheme has lexical scoping.</li>
</ul>
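<p>Three of these differences can be sketched in a few lines of Java.
This is an illustration only: <code>LispSymbol</code>, <code>LispFunction</code>,
<code>Thunk</code> and <code>ELispSketch</code> are hypothetical names made up for
this slide, not classes from Kawa or JEmacs.  It shows a symbol with separate
value and function cells, a single <code>NIL</code> object that serves as both
false and the empty list, and a dynamic binding that is restored on exit.</p>

```java
// Hypothetical sketch; none of these classes come from Kawa or JEmacs.
final class ELispSketch {
    // In ELisp one object, nil, is both "false" and the empty list.
    static final Object NIL = new Object();

    static boolean isTrue(Object x) { return x != NIL; }

    // Dynamic scoping: bind sym to newValue while body runs, then restore.
    static Object withDynamicBinding(LispSymbol sym, Object newValue, Thunk body) {
        Object saved = sym.value;
        sym.value = newValue;
        try {
            return body.call();
        } finally {
            sym.value = saved;
        }
    }

    public static void main(String[] args) {
        LispSymbol car = new LispSymbol("car");
        car.value = "the variable named car";
        car.function = arg -> "the function named car";
        System.out.println(car.value);
        System.out.println(car.function.apply(NIL));
    }
}

final class LispSymbol {
    final String name;
    Object value;            // variable name-space
    LispFunction function;   // function name-space

    LispSymbol(String name) { this.name = name; }
}

interface LispFunction {
    Object apply(Object arg);
}

interface Thunk {
    Object call();
}
```

<p>Because the old value is restored in a <code>finally</code> clause, the
binding is undone even if the body throws.</p>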
</slide>

<slide id="bind-constr">
<caption>Constrained Variables</caption>
<p>In Kawa, each global variable is a <code>Binding</code> object.</p>
<p>A <code>Binding</code> has an optional name, a value field,
and a <code>Constraint</code>.</p>
<p>The constraint contains the actual methods that get/set the
value of the <code>Binding</code>.</p>
<p>Setting <code>b</code> to <code>x</code> does
<code>b.constraint.set(b, x)</code>.</p>
<p>The default action for <code>get</code> retrieves the <code>Binding</code>'s
value field.<br/>
Different sub-classes of <code>Constraint</code> have
different implementations of <code>get</code> and <code>set</code>.</p>
<p>If there is a thread-local dynamic or buffer-local binding,
we just put the appropriate constraint in the binding.</p>
<p>This framework can handle indirection, unbound variables (<code>get</code>
throws an exception), and constraint propagation.
Changing a value can trigger arbitrary checks or notification messages.</p>
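<p>A minimal sketch of this scheme follows.  The names <code>Binding</code> and
<code>Constraint</code> come from the description above; the method bodies and
the three subclasses (<code>TrivialConstraint</code>, <code>UnboundConstraint</code>,
<code>AliasConstraint</code>) are assumptions chosen for illustration and should
not be read as Kawa's actual source.</p>

```java
// Sketch only: names follow the slide, bodies are assumptions.
final class BindingDemo {
    public static void main(String[] args) {
        Binding b = new Binding("fill-column");
        b.set(Integer.valueOf(70));
        System.out.println(b.name + " = " + b.get());
    }
}

class Binding {
    final String name;       // optional; may be null
    Object value;            // the value field
    Constraint constraint = new UnboundConstraint();

    Binding(String name) { this.name = name; }

    Object get() { return constraint.get(this); }
    void set(Object x) { constraint.set(this, x); }
}

abstract class Constraint {
    abstract Object get(Binding b);
    abstract void set(Binding b, Object x);
}

// Default behaviour: read and write the Binding's own value field.
class TrivialConstraint extends Constraint {
    Object get(Binding b) { return b.value; }
    void set(Binding b, Object x) { b.value = x; }
}

// Unbound variable: get throws; the first set installs the default constraint.
class UnboundConstraint extends Constraint {
    Object get(Binding b) {
        throw new IllegalStateException("unbound variable: " + b.name);
    }
    void set(Binding b, Object x) {
        b.constraint = new TrivialConstraint();
        b.value = x;
    }
}

// Indirection: get and set are forwarded to another Binding.
class AliasConstraint extends Constraint {
    final Binding target;
    AliasConstraint(Binding target) { this.target = target; }
    Object get(Binding b) { return target.get(); }
    void set(Binding b, Object x) { target.set(x); }
}
```

<p>A thread-local or buffer-local variable would, in the same spirit, be one
more <code>Constraint</code> subclass whose <code>get</code> and <code>set</code>
consult the current thread or buffer.</p>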
</slide>

<slide id="modes">
<caption>Mode classes</caption>
<p>In Emacs, a <em>mode</em> is a set of keybindings, functions,
and variables local to a buffer.</p>
<p>In JEmacs, it makes sense (<em>not yet implemented</em>)
to have a separate <em>mode class</em> for each mode.</p>
<p>Each buffer has a linked list of <em>mode instances</em>, one
for each major/minor mode of the buffer.</p>
<p>Mode functions are compiled to virtual methods of the mode object.</p>
<p>Instead of a buffer-local variable, use a field of the mode object.</p>
<p>This provides fast access to variables in compiled code, without
run-time symbol lookup.</p>
<p>A derived mode can use mode class inheritance.</p>
<p>Not clear if this model works for legacy ELisp code.</p>
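<p>Since mode classes are not yet implemented, the following is only a guess at
what the idea could look like.  Every name here (<code>Mode</code>,
<code>TextMode</code>, <code>OutlineMode</code>, <code>ModeBuffer</code>,
<code>fillColumn</code>, <code>indentLine</code>) is hypothetical.</p>

```java
// Hypothetical sketch of mode classes; not JEmacs code.
final class ModeDemo {
    public static void main(String[] args) {
        ModeBuffer buf = new ModeBuffer();
        buf.addMode(new OutlineMode());
        Mode m = buf.findMode("outline");
        System.out.println(m.name() + ": " + m.indentLine("heading"));
    }
}

class Mode {
    Mode next;   // next mode instance in the buffer's linked list

    String name() { return "fundamental"; }

    // A mode function compiled to a virtual method.
    String indentLine(String line) { return line; }
}

class TextMode extends Mode {
    int fillColumn = 70;   // a field instead of a buffer-local variable

    String name() { return "text"; }
}

// A derived mode inherits fields and methods from its parent mode class.
class OutlineMode extends TextMode {
    String name() { return "outline"; }

    String indentLine(String line) { return "  " + line; }
}

class ModeBuffer {
    Mode modes;   // head of the list of mode instances

    void addMode(Mode m) {
        m.next = modes;
        modes = m;
    }

    Mode findMode(String name) {
        for (Mode m = modes; m != null; m = m.next) {
            if (m.name().equals(name)) return m;
        }
        return null;
    }
}
```

<p>Each buffer gets its own mode instances, so <code>fillColumn</code> is local
to the buffer without any symbol lookup at run time.</p>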
</slide>

<slide id="swing">
<caption>The Swing toolkit</caption>
<p>Swing is a new GUI toolkit introduced by Sun in 1998.
It builds on the earlier AWT (Abstract Windowing Toolkit),
but has new "widgets" based on separating the "model" (data)
and "view+control" (look+feel).</p>
<p>
Swing contains a number of concepts similar to or inspired by Emacs.
Most importantly, separation between <code>Document</code> and
<code>JTextComponent</code> is similar to that between Emacs
buffer and window.  JEmacs builds on this separation.</p>
<p>
Swing also has a <code>Keymap</code> class similar to that of Emacs,
and a <code>Position</code> is like an Emacs marker.</p>
<p>
Unfortunately, neither of these is quite right for Emacs, but it
was not difficult to create new classes that implement the
Swing interfaces.</p>
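<p>The model/view separation can be seen without opening any window.  In the
sketch below one Swing <code>Document</code> is observed by two listeners, which
stand in for two Emacs windows showing the same buffer.  <code>CountingView</code>
is a made-up helper class; a real view would be a <code>JTextComponent</code>.</p>

```java
import javax.swing.event.DocumentEvent;
import javax.swing.event.DocumentListener;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.PlainDocument;

final class SharedDocumentDemo {
    public static void main(String[] args) throws BadLocationException {
        Document doc = new PlainDocument();      // the "buffer"
        CountingView left = new CountingView();  // stand-ins for two "windows"
        CountingView right = new CountingView();
        doc.addDocumentListener(left);
        doc.addDocumentListener(right);

        doc.insertString(0, "hello world", null);
        doc.insertString(0, "oh, ", null);

        System.out.println(doc.getText(0, doc.getLength()));
        System.out.println("changes seen: " + left.changes + " and " + right.changes);
    }
}

// Hypothetical view: it only counts the change notifications it receives.
class CountingView implements DocumentListener {
    int changes;

    public void insertUpdate(DocumentEvent e) { changes++; }
    public void removeUpdate(DocumentEvent e) { changes++; }
    public void changedUpdate(DocumentEvent e) { changes++; }
}
```

<p>Both listeners are told about every edit, just as every Emacs window
displaying a buffer reflects a change made in any one of them.</p>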
</slide>

<slide id="buffer-classes">
<caption>Swing-based classes for Emacs</caption>
<p>
I wrote Java classes to implement the Emacs editing types:</p>
<ul>
<li>
<code>Buffer</code>:  An Emacs buffer.  Contains a Swing
<code>StyledDocument</code> that manages the actual text (and styles).
Contains a <code>BufferKeymap</code>, which manages the actions
executed for different keystrokes.</li>
<li>
<code>BufferKeymap</code>:  A data structure in one-to-one association
with a <code>Buffer</code>.  It implements the Swing <code>Keymap</code>
interface, and manages the primitive <code>Keymap</code>s,
to give the Emacs functionality.</li>
<li>
<code>Window</code>:  Extends the Swing <code>JTextPane</code> class.
Includes an associated <code>Modeline</code>, and a scrollbar.</li>
<li>
<code>Frame</code>:  A top-level window. 
A <code>Frame</code> contains a nested
hierarchy of <code>Window</code>s,
sub-divided using Swing's <code>JSplitPane</code>.</li>
<li>
<code>Marker</code>:  A position in a buffer that gets adjusted as
needed.  Implements the Swing <code>Position</code> interface, but also knows
the <code>Buffer</code> it points to.</li>
<li>
<code>BufferContent</code>:   The actual characters of the <code>Buffer</code>.
(A <code>Buffer</code> contains a <code>Document</code>,
which uses a <code>BufferContent</code> object.)
This class is needed because standard Swing does not support
the <code>Marker</code> semantics Emacs needs.</li>
</ul>
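<p>The relationship between <code>Buffer</code>, <code>Document</code> and
<code>Marker</code> might be sketched as follows.  <code>SketchBuffer</code> and
<code>SketchMarker</code> are simplified stand-ins, not the JEmacs classes:
the sketch uses a <code>PlainDocument</code> with Swing's default content rather
than a <code>StyledDocument</code> backed by <code>BufferContent</code>, so it
shows the shape of the classes but not the full Emacs marker semantics.</p>

```java
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.PlainDocument;
import javax.swing.text.Position;

final class MarkerDemo {
    public static void main(String[] args) throws BadLocationException {
        SketchBuffer buf = new SketchBuffer("*scratch*");
        buf.insert(0, "hello world");
        SketchMarker m = buf.createMarker(6);   // just before "world"
        buf.insert(0, "oh, ");                  // text inserted before the marker
        System.out.println(m.buffer.name + ": marker offset " + m.getOffset());
    }
}

// Stand-in for Buffer: a named wrapper around a Swing Document.
class SketchBuffer {
    final String name;
    final Document document = new PlainDocument();

    SketchBuffer(String name) { this.name = name; }

    void insert(int offset, String text) throws BadLocationException {
        document.insertString(offset, text, null);
    }

    SketchMarker createMarker(int offset) throws BadLocationException {
        return new SketchMarker(this, document.createPosition(offset));
    }
}

// Stand-in for Marker: implements Position, but also knows its buffer.
class SketchMarker implements Position {
    final SketchBuffer buffer;
    private final Position position;

    SketchMarker(SketchBuffer buffer, Position position) {
        this.buffer = buffer;
        this.position = position;
    }

    // The underlying Position is adjusted by the Document as text changes.
    public int getOffset() { return position.getOffset(); }
}
```

<p>Because <code>SketchMarker</code> implements <code>Position</code>, it can be
passed to any Swing code that expects one, while Emacs-level code can still ask
it which buffer it belongs to.</p>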
</slide>

<slide id="swing-scm">
<caption>Scheme editing procedures</caption>
<p>
JEmacs includes a number of Scheme procedures
for operating on the Java classes just mentioned.</p>
<p>
The Scheme API is designed to be similar to the traditional
ELisp functions, but put in Scheme form.</p>
<pre>
(define (beginning-of-line
         #!optional
         (n  :: &lt;int&gt; 1)
         (buffer :: &lt;buffer&gt; (current-buffer)))
  (invoke buffer 'setPoint (point-at-bol n buffer)))

(define-key global-map "\C-a" beginning-of-line)
</pre>
<p>This makes it easier to mix Scheme and ELisp, convert
(if desired) ELisp to Scheme, and lets us re-use ELisp documentation.</p>
</slide>

<slide id="hanoi">
<caption>Towers of Hanoi in JEmacs</caption>
<img src="jemacs3.png"/>
</slide>

<slide id="emacs-lisp">
<caption>Running existing Emacs packages</caption>
<p>The goal is to be able to run most ELisp packages unmodified.</p>
<p>Some packages may require minor changes, but no more than
(say) porting to XEmacs.</p>
<p>So far one package (<code>hanoi.el</code>)
runs basically as is.</p>
<p>It may be <em>desirable</em> to re-write some packages, optionally
in Scheme.  For example, <code>dired</code> needs a more modern interface.</p>
</slide>

<slide id="threads">
<caption>Multi-threading</caption>
<p>
A major Emacs annoyance is that it is single-threaded.<br/>
This means Emacs freezes up when doing non-trivial tasks.
</p>
<p>But Java supports multiple threads, so no problem, right?</p>
<p>Wrong - we still have to deal with:
<ul>
<li>ELisp is single-threaded.  An ELisp function may at any time
switch to a different buffer or window.<br/>
Hence we cannot just associate a thread with each buffer or window.
</li>
<li>
Swing is single-threaded.  Only one thread (the event thread)
can safely modify a buffer that is visible in a window.
</li>
</ul>
</p>
<p>Solution:  Interactive Emacs "commands" execute in the
Swing event handler thread.
However, commands that may take some time should create a "worker
thread."  When the worker thread is finished, it lets the event thread
know it is done, so it can update the buffers and display.</p>
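<p>The hand-off between the two threads can be sketched as below.  To keep the
example free of GUI dependencies, a single-thread executor stands in for the
Swing event thread; that substitution is an assumption of this sketch (in a
Swing program <code>SwingUtilities.invokeLater</code> would do the hand-off).
<code>EventLoopSketch</code>, <code>runSlowCommand</code> and
<code>slowComputation</code> are hypothetical names.</p>

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class WorkerDemo {
    public static void main(String[] args) {
        EventLoopSketch loop = new EventLoopSketch();
        loop.runSlowCommand("grep");
        loop.awaitDone();
        System.out.println(loop.buffer);
        loop.shutdown();
    }
}

class EventLoopSketch {
    // One thread standing in for the Swing event thread (assumption).
    private final ExecutorService eventThread = Executors.newSingleThreadExecutor();
    // Released once the single command of this sketch has finished.
    private final CountDownLatch done = new CountDownLatch(1);
    // Buffer text; modified only by tasks running on eventThread.
    final StringBuilder buffer = new StringBuilder();

    // A command starts on the event thread and does its slow part elsewhere.
    void runSlowCommand(String name) {
        eventThread.execute(() -> {
            buffer.append("[").append(name).append(" started]");
            Thread worker = new Thread(() -> {
                String result = slowComputation(name);
                // Hand the result back; only the event thread touches the buffer.
                eventThread.execute(() -> {
                    buffer.append("[").append(result).append("]");
                    done.countDown();
                });
            });
            worker.start();
        });
    }

    // Placeholder for work that would otherwise freeze the editor.
    static String slowComputation(String name) { return name + " finished"; }

    void awaitDone() {
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    void shutdown() { eventThread.shutdown(); }
}
```

<p>The worker never modifies the buffer itself; it only queues a task for the
event thread, which stays free to handle keystrokes in the meantime.</p>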
</slide>

<slide id="unicode">
<caption>Unicode in Java</caption>
<p>The Java <code>char</code> type is a 16-bit Unicode character.</p>
<p>Internal strings and JEmacs buffers use these Unicode <code>char</code>s.</p>
<p>External files consist of 8-bit bytes.</p>
<p>Mapping between byte streams and character streams uses named encodings.</p>
<p>The Swing framework handles bi-directional text for us.</p>
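<p>The mapping between bytes and characters can be illustrated with the
standard library alone.  The sketch uses <code>java.nio.charset.Charset</code>
from current Java; <code>EncodingDemo</code>, <code>encode</code> and
<code>decode</code> are names invented for the example.</p>

```java
import java.nio.charset.Charset;

final class EncodingDemo {
    public static void main(String[] args) {
        String text = "caf\u00e9";   // four chars; the last one is U+00E9
        byte[] latin1 = encode(text, "ISO-8859-1");
        byte[] utf8 = encode(text, "UTF-8");
        System.out.println(latin1.length + " bytes as ISO-8859-1");
        System.out.println(utf8.length + " bytes as UTF-8");
        System.out.println("round trip ok: " + decode(utf8, "UTF-8").equals(text));
    }

    // Characters to bytes, as when saving a buffer to a file.
    static byte[] encode(String chars, String encodingName) {
        return chars.getBytes(Charset.forName(encodingName));
    }

    // Bytes to characters, as when reading a file into a buffer.
    static String decode(byte[] bytes, String encodingName) {
        return new String(bytes, Charset.forName(encodingName));
    }
}
```

<p>The same four characters occupy a different number of bytes in each
encoding, which is why a file must be read with the encoding it was written in.</p>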
</slide>

<slide id="unicode-img">
<caption>Displaying Unicode chars with bidi</caption>
<img src="jemacs4.png"/>
<p>(Note split selection due to bi-directional text; also incomplete fonts.)</p>
</slide>

<slide id="xml">
<caption>Word Processing and XML</caption>
<p>
Documents will be increasingly represented using XML (externally)
and DOM [Document Object Model] (internally).</p>
<p>
Document formatting will be done using the "flow objects"
of XSL (Extensible Stylesheet Language).</p>
<p>
Most of the programs and libraries for manipulating and
formatting XML are written in Java.
The Apache group has some major Java-based XML projects.</p>
<p>
If Emacs is to support word processing features, it
should build on XML standards.
It is easier to use third-party Java libraries if the
Emacs core is Java-based.</p>
<p>The future of Emacs should include XML, WP, I18n, L10n.
Much of the activity in those areas uses Java.
This suggests that perhaps the future Emacs should be Java-based.</p>
</slide>

<slide id="conclusion">
<caption>Conclusion</caption>
<ul>
<li>"Proof-of-concept" prototype working, including partial ELisp implementation.</li>
<li>Still a lot of work before you want to use it for day-to-day editing.</li>
<li>Using Java and Swing libraries takes care of many problems.</li>
<li>JEmacs homepage: <code>http://JEmacs.SourceForge.net/</code>.<br/>
   (Soon also <code>JEmacs.net</code>.)</li>
<li>JEmacs mailing list.  (See homepage.)</li>
<li>JEmacs currently distributed as part of Kawa.</li>
<li>Kawa home-page: <code>http://www.gnu.org/software/kawa/</code>.</li>
</ul>
</slide>

</slides>
