May 22, 2020
Photo: En Cas de Feu ©2014 David Bethune
When I talk about Mimix as a fact-checking or fact verification tool, the most common question I hear is, “But how does Mimix know what constitutes a fact?” I imagine the other person is visualizing the man behind the green curtain from The Wizard of Oz – someone pulling the levers and deciding what is truth. This is definitely not how Mimix works, so to dispel this notion, let’s look at another way to determine facts… the way people already do it in their own heads.
Facts Are Not Opinions
Whenever the idea of asserting the truth comes around, people worry their own truths might be stomped on: their feelings about culture, politics, religion, and morality just to name a few. While these feelings are justifiably important and, in fact, they define many aspects of our lives and personalities, they are not actually facts. A contemporary example will serve to illustrate. Today, I saw the headline, “Michigan AG tells Trump to wear mask while visiting Ford plant: It’s ‘the law.’” This certainly sounds like a factual statement, but it is an expression of opinion. Neither the headline, nor the article, nor its quoted sources provide any factual backing as to why this would or wouldn’t be law. In fact, the actual text of the article does not say that the President must wear a mask to visit the Ford plant, despite that implication in the headline. Therefore this cannot be a story about the facts of the matter. An honest evaluation, to be performed by the reader, would require access to all the underlying facts on both sides of the opinion.
While all of our interesting discussions and documents contain opinions like this, an enormous amount of our everyday communications consists of things which are real and largely indisputable facts, such as the boiling temperature of water at sea level, the price of our mobile phone plan, the name of that gas station on the corner (or how much they charge for gas), or the atomic number of plutonium. Our speech and writing serve to communicate our thoughts and feelings, but those are peppered with real, verifiable facts. In this story, Trump, the Attorney General of Michigan, the laws of that State, and the laws of the United States are either verifiable sources of facts that can be quoted – or they are not.
Recasting a Narrative as Facts
With enough research into original sources, it’s possible to re-frame a narrative as a list of the facts it’s asserting. To continue with the previous example, a journalist or legal researcher might consult many documents from different sources and see what court decisions, opinions, or precedents they represent. Much of this information is locked up in proprietary databases and not easy to find on Google. It’s certainly not readily available in the article itself.
However, it is possible to combine the narrative text from the article just as the author wrote it with the factual sources used to support his assertions. Taken to the extreme, every sentence could be backed by some source of information. Even a statement like, “The sun rose at 6:58 this morning,” could be backed by a source which the reader could access from the sentence itself.
It’s also possible to make a table of all the facts which are stated in a paper, a website, or a book. It’s just a list of every assertion made in the text. Researching something like the Kennedy assassination would produce tables of facts from different authors which contradict each other, even on basic things like the timeline of events in Dealey Plaza.
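As a rough illustration of such a table, the sketch below stores one row per assertion and flags the rows where sources disagree. The row shape, the source names, and the values are all invented for the example; nothing here is Mimix's actual format.

```javascript
// Each row is one assertion as stated by one source (all values invented for illustration).
const factTable = [
  { subject: "sunrise", attribute: "time", value: "6:58", source: "Almanac A" },
  { subject: "sunrise", attribute: "time", value: "7:02", source: "Newsletter B" },
  { subject: "water", attribute: "boiling point at sea level (C)", value: "100", source: "Almanac A" },
  { subject: "water", attribute: "boiling point at sea level (C)", value: "100", source: "Textbook C" },
];

// Group rows by subject + attribute and keep only the groups whose sources disagree.
function findContradictions(rows) {
  const groups = new Map();
  for (const row of rows) {
    const key = `${row.subject} / ${row.attribute}`;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(row);
  }
  const contradictions = [];
  for (const [key, group] of groups) {
    const distinctValues = new Set(group.map((r) => r.value));
    if (distinctValues.size > 1) contradictions.push({ key, claims: group });
  }
  return contradictions;
}

const conflicts = findContradictions(factTable);
for (const c of conflicts) {
  console.log(c.key, c.claims.map((r) => `${r.value} (${r.source})`).join(" vs. "));
}
```

The table itself takes no side. It only shows the reader which assertions are contested and who made them.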
Facts Depend on Whom You Ask
Now, wait a second. I just finished saying facts are supposed to be indisputable, like the sunrise. And now I’m saying different people turn up with different facts. And both of those things are true. Science tells us that the observer cannot be separated from the event. Crime scene interviewers will tell you different people report widely different information from their observations of (what can only be) a single set of facts.
Science, engineering, medicine, law, and business rely on accurate, repeated metrics from their observations. Scientific progress depends on repeatable, shared observations. Academia, too, relies on accurate recordings of others who have gone before.
Not only the source we are reading but also all of that person’s sources and their repeatability become factors in determining the truthfulness of a narrative. Some facts show up repeatedly and others are outliers, buried underneath noise or lost among more popular reportage. In the Dealey Plaza narrative, only a single set of events can, by the laws of physics, actually have taken place. The differences in observations by the participants and writers (and the parts they chose to leave out) serve as one “universe of truth” contained inside a single document or passage.
Facts Depend on When You Ask
Not only whom you ask but also when is a factor in determining truth. For many, many things, truth changes over time. On a quotidian scale, the sun rises at a different time every day. So the truth of “What time did the sun rise?” is different every day (and in every place). On a more cosmic scale, the Earth’s environment and the continents themselves are in constant shift. So even simple questions like “Where is North?” or “What are dinosaurs?” require updated information in order to be correct. With complex questions like “How does climate affect life on earth?” the stakes are even higher.
Legal and financial truths are also under constant revision. Was this person accused of something and later acquitted? Then the facts of his or her legal status have changed. Were the company’s financials improperly reported and then later revised? The facts of their accounting have changed.
In the medical field, facts shift from day to day and, in critical care, from minute to minute. What drugs has a person been given? What was their physiological response? In nuclear and physics processes, experiments produce huge amounts of data and analysis that must be connected with existing data in order to be useful. In biomedical research, new test results continually add to a list of known facts which can be difficult to relate across papers and publications, and which are therefore frequently outdated.
Many industrial accidents occur because critical safety and systems information was not properly communicated to crews or staff – or was overlooked during the research process. Famous disasters in this category include the Space Shuttle Challenger, the Chernobyl nuclear accident, and the recent 737 MAX debacle. All of these were multibillion-dollar economic catastrophes, and all were caused by faulty written documents.
So in order to have any chance at finding out the truth, we have to know both who was quoted and when. A system for managing facts across the enterprise would also make it easy to effect bulk changes across documents when facts do, necessarily, change.
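To make the "who and when" idea concrete, here is a minimal sketch in which every value carries its source and a date, and a lookup answers "what was known as of this day?" The record shape, field names, and sample data are assumptions made up for illustration.

```javascript
// Hypothetical fact records: every value carries who said it and when (ISO dates).
const records = [
  { fact: "legal status", value: "accused", source: "Court filing", asOf: "2019-03-01" },
  { fact: "legal status", value: "acquitted", source: "Court verdict", asOf: "2020-01-15" },
];

// Return the most recent record for a fact on or before the given ISO date, or null.
function factAsOf(all, fact, date) {
  const candidates = all
    .filter((r) => r.fact === fact && r.asOf <= date) // ISO date strings sort chronologically
    .sort((a, b) => a.asOf.localeCompare(b.asOf));
  return candidates.length ? candidates[candidates.length - 1] : null;
}

const midYear = factAsOf(records, "legal status", "2019-06-30");
const today = factAsOf(records, "legal status", "2020-05-22");
console.log(midYear.value, "then", today.value);
```

Because older records are kept and not overwritten, the same question can be answered for any date, and each answer still names its source.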
Deciding What Is True
So who makes the final determination about what is true? You do! And this is the process we also use in real life. Over time, we learn to trust and rely on certain sources as truthful, largely based on how much we know (or think) that person has done their homework. If I were writing about a painting in the Louvre, for example, I could reasonably assume that the museum’s information about it is correct. There’s no need for me to go and verify a date or fact elsewhere because I’m unlikely to find out the museum is wrong. If my writing references the museum’s in a way my readers can see, then they can trust me, too – at least on those few facts.
Our personal picture of truth is never exactly the same as anyone else’s – and we wouldn’t want it to be. If we all thought the same, innovation would come to a halt and liberty would be extinct. Better to give people the tools they need to find out where facts came from and when. That way, individual readers and writers can, themselves, decide what is factual.
Fact-Based Writing Will Change Society
Just a sprinkling of veracity in our everyday communications would be enormously helpful. A true fact-based form of writing has the potential to change every area of society. This is what we are building at Mimix. Our MSL language enables identifying and marking up the facts inside narrative text. And our Hybrid Database technology tracks changes in those facts over time and lets you analyze them, correct them, and incorporate them into your own work with your sources included. All of this happens inside your machine, using your organization’s documents and the sources you choose.
To learn more about our vision, I invite you to download the whitepaper. And if you’re interested in experimenting with our technology, please explore the MSL Specifications and our new open source framework, Nebula.
April 27, 2020
The Mimix Nebula system of software deployment simplifies the packaging and distribution of cross-platform applications built with NodeJS, HTML, and WebSockets.
Today, we’re making available an early version of our Nebula deployment system for cross-platform applications. Nebula is an open source product, available free under the Blue Oak License and ready for you to customize to deploy your own HTML-based desktop applications with WebSockets on Windows, Mac, and Linux.
Get the Download
There are four ways you can get the free Nebula software:
- Download the Windows installer.
- Download the Mac installer.
- Download the AppImage for Linux.
- Download the source zip or tar and build from instructions.
The Nebula installer contains everything you need to experiment with creating and modifying a Nebula app without any “building” steps. Simply use your favorite text editor to modify the HTML/JS files. Note: The current release is not digitally signed. Click “Yes” in the Windows installation dialog to allow an “Unknown Publisher” (The Mimix Company) to install the app.
You can move to building from source when you’re ready to design your own back-end server or use a different set of frameworks than we include.
Before You Begin
We recommend reading Nebula: Simple & Flexible Applications with MSL Data before working with the Nebula download. It provides the background information and jumping-off points to help you think about the new kinds of applications you can build with Nebula based on the tools and frameworks you already know.
Disclaimer of Warranty
The Nebula software is licensed to you without cost on an as-is basis and without any warranty, express or implied. This free license does not include access to tech support. Bug fixes and upgrades, when and if they become available under this license, are your responsibility to install or integrate.
HTML applications offer a wealth of frameworks and tools for generating cross-platform user interfaces. Each developer has his or her own favorite code editor, web UI framework, and hosting solution. Where all of these fail to deliver is in packaging an application so that it can fully benefit from its cross-platform nature on all three desktop operating systems: Mac, Windows, and Linux.
The Nebula solution brings together the foundational elements needed to build a modern web application with HTML that can be rapidly deployed to any user desktop — without requiring a developer account signup anywhere and without committing to any specific UI framework or backend.
Here are the components we’ve packaged for you to build with. All of them are also open source and you can plug in other packages or tools you prefer:
- Built-in local NodeJS webserver.
- Built-in Chromium browser.
- Built-in WebSocket server and receiver.
- Built-in “Local World,” an easy-to-edit HTML home page.
- Built-in data-binding, routing, and commanding (AngularJS).
- Built-in Bootstrap UI components.
- Built-in Bskit UI components and themes.
- Built-in MDI icon sets.
- Built-in custom color names.
- Access to local (packaged) or remote web pages.
- Access to any NodeJS service via WebSocket.
- All packaged in an Electron-based, one-click desktop executable for Windows, Mac, or Linux!
The Nebula tech stack was designed to fulfill our requirement that our apps be able to run locally on any platform or be hosted in the cloud. We also wanted our app to have a simple way to communicate with a server or another instance of itself over a network, or within the same machine.
These goals were achieved by stacking the Electron packaging system with an AngularJS front end. We chose AngularJS because of our experience with the framework and the body of our existing code that could be used to scaffold the app’s interface. You can use Nebula with any framework of your choice by simply substituting the AngularJS index.html file (and its supporting dependencies which are installed with Nebula) with an index.html and the appropriate dependencies for the framework(s) you want to use.
Because all HTML/JS frameworks work in a server/browser environment and Nebula provides both the built-in web server and browser for all platforms, all HTML/JS frameworks will work with Nebula.
Being regular HTML/JS apps, Nebula applications can communicate using any standard protocol or API supported by the Chromium browser. We’ve included special scaffolding for WebSockets, a simple, always-on communications channel we use to let our app talk to a local or remote server or to another instance of the app itself.
We chose WebSockets because messages can be sent as plain text, with no API syntax imposed on their contents. This means you can send messages over WebSockets which use any existing API syntax, or you can make up an entirely new one to fit your application. Since our applications speak the MSL language, we use this wire to communicate MSL between our viewers and servers, but you could use the wire for anything, even a group chat.
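As a small illustration of making up your own syntax, the sketch below defines a one-line text format for the group chat case. The format ("nickname|message") and both function names are hypothetical; they are not part of Nebula or MSL.

```javascript
// Hypothetical chat format: "<nickname>|<message text>", sent as one plain-text frame.
function encodeChat(nick, text) {
  if (nick.includes("|")) throw new Error("nickname may not contain '|'");
  return `${nick}|${text}`;
}

// Split on the first "|" only, so the message text itself may contain the separator.
function decodeChat(frame) {
  const split = frame.indexOf("|");
  if (split === -1) return null; // not a chat frame; the caller can ignore it
  return { nick: frame.slice(0, split), text: frame.slice(split + 1) };
}

const frame = encodeChat("ana", "lunch | 12:30?");
const decoded = decodeChat(frame);
console.log(decoded.nick, "says:", decoded.text);
```

In a browser, a string like this could be passed to the standard `WebSocket.send()` method and read back from `event.data` in a message handler. The wire does not care what the text means.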
Using WebSockets as data wires, separate from the HTML/JS user interface itself, opens up Nebula applications to a new kind of communication and UI paradigm.
Desktop apps communicate with data sources using the same monolithic app that draws the UI. Web applications rely on a remote server to perform any backend communications and to provide a renderable interface.
Nebula apps are enabled for a third kind of architecture, shown in the diagram. In this design, UI elements are delivered via HTML/JS while data is delivered on separate WebSockets for each source. This allows hot-swapping an interface over data, or vice-versa.
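The separation can be sketched in a few lines of plain JavaScript. Here a small `Wire` class stands in for one WebSocket data source, and two render functions stand in for interchangeable interfaces. All of these names are invented for the example, and a real app would wrap an actual WebSocket and escape text before placing it in HTML.

```javascript
// Stand-in for one WebSocket data wire; a real app would wrap a WebSocket here.
class Wire {
  constructor(name) {
    this.name = name;
    this.listeners = new Set();
  }
  subscribe(listener) {
    this.listeners.add(listener);
    return () => this.listeners.delete(listener); // call the returned function to unsubscribe
  }
  receive(text) {
    for (const listener of this.listeners) listener(text);
  }
}

// Two interchangeable "interfaces" over the same data (no HTML escaping in this sketch).
const asListItem = (text) => `<li>${text}</li>`;
const asTableRow = (text) => `<tr><td>${text}</td></tr>`;

const output = [];
const wire = new Wire("local");
let render = asListItem;
wire.subscribe((text) => output.push(render(text)));

wire.receive("first");
render = asTableRow; // hot-swap the interface; the wire is untouched
wire.receive("second");
console.log(output);
```

The reverse swap works the same way: a second `Wire` could feed the same render function without the interface code changing.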
Components & Data Binding
We’ve chosen to include the Bootstrap UI components because they cover most of the requirements of modern UI design. Our Nebula app also contains many examples of data binding where application values are shown inside UI buttons or other elements, or where the appearance or functions of UI buttons are changed based on program values or values received over the WebSocket wire.
Using our sample code, you can quickly build a non-trivial AngularJS application, or you can replace our AngularJS components with your own components from a different framework.
Routing & Commanding
Because routing (figuring out what page to display) and commanding (executing a function in response to a button or menu item) are two of the most tedious parts of application design, we’ve scaffolded these into Nebula for you to use as a starting point. By examining and modifying our code, you’ll be able to connect your app’s buttons and menus to the back-end functions they need. We’ve also provided examples of how to make UI elements send data over WebSockets and how to respond to incoming WebSocket messages and update the UI.
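For readers new to the idea, commanding can be reduced to a lookup table from command names to functions. The sketch below is generic JavaScript with made-up command names, not the AngularJS scaffolding that ships with Nebula.

```javascript
// Hypothetical command table: UI elements refer to commands by name only.
const sent = [];
const commands = new Map([
  ["send", (text) => sent.push(text)],
  ["clear", () => { sent.length = 0; }],
]);

// Look up a command by name and run it; unknown names fail loudly.
function runCommand(name, ...args) {
  const command = commands.get(name);
  if (!command) throw new Error(`Unknown command: ${name}`);
  return command(...args);
}

runCommand("send", "hello");
console.log(sent);
```

A button or menu item then only needs to know the name of its command, which keeps UI markup and back-end functions loosely coupled.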
After downloading the Nebula app for your platform (Windows, Mac, or Linux), you can experiment with the built-in controls before modifying the app to suit your needs.
Nebula installs a local WebSocket server which you can connect to using the Local Mimix button in the interface. The local Mimix service provides an echo of whatever you send it. You can also connect to two public WebSocket servers which also echo what you send.
Once connected, use the Send button to send text over the WebSocket. The examples show MSL text from our programming language, but you can modify these in the source code.
WebSocket messages you send and receive are shown in the UI in the order in which they took place. The Clear button erases the input text box and the Auto checkbox does so automatically after each WebSocket message is sent.
We’ve provided the scaffolding to easily modify this code to use WebSockets from any set of servers. The interface updates automatically to show Connect buttons for all the servers you define.
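The general pattern behind this is to derive the buttons from the server list and never code them by hand. The sketch below shows the idea with a hypothetical list shape and placeholder URLs; Nebula's real configuration may look different.

```javascript
// Hypothetical server list; names and URLs are placeholders, not Nebula's real configuration.
const servers = [
  { name: "Local", url: "ws://localhost:9000" },
  { name: "Echo A", url: "wss://echo.example.com" },
];

// Derive one button descriptor per server so the UI follows the list automatically.
function connectButtons(serverList) {
  return serverList.map((server) => ({
    label: `Connect to ${server.name}`,
    target: server.url,
  }));
}

const buttons = connectButtons(servers);
console.log(buttons.map((b) => b.label));
```

Adding a server to the list is then the only change needed for a new Connect button to appear.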
In addition to a standard WebSocket wire, we’ve provided a separate admin wire on a different port which your application might find useful. In Mimix, this wire is used to send admin commands to the server which are separate from the data, and you might find a similar use for an admin wire in your app. Admin wires are defined in the built-in scaffolding. We use them to increase security by ensuring that user-facing applications simply don’t have the wiring to communicate with admin servers.
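One way to picture "not having the wiring" is a build step that hands each kind of app only the wire definitions it may use. The sketch below is an assumption about how that could be arranged; the role names and port numbers are placeholders.

```javascript
// Hypothetical wire definitions; port numbers are placeholders.
const wires = {
  data: { port: 9000 },
  admin: { port: 9001 },
};

// A user-facing build receives only the wires it is allowed to open.
function wiresFor(role) {
  return role === "admin" ? { ...wires } : { data: wires.data };
}

console.log(Object.keys(wiresFor("user")));  // the admin wire is absent, not merely disabled
console.log(Object.keys(wiresFor("admin")));
```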
Commanding & Data Binding
Most of the buttons in the sample UI contain examples of data binding and commanding. The orange Connect buttons, for example, are labeled with the server names which are dynamically taken from the list of WebSocket servers you define in the source.
When a WebSocket server connects successfully, the Send button is automatically updated with the server name and the button color is changed. Connecting also updates the banner component which opens the page, changing it from “Not Connected” to the version number of the connected server. This is an example of an AngularJS service updating the bound values for a separate UI component, in this case the banner. The banner detects the changes in the bound values and updates automatically without explicitly being called.
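AngularJS handles this with its own scope and watcher machinery. To show the underlying idea without the framework, the sketch below substitutes a plain subscribe-and-notify store. The function names and the version string are invented for the example.

```javascript
// Plain-JavaScript stand-in for a shared service with bound values (not AngularJS itself).
function createConnectionState() {
  let state = { banner: "Not Connected" };
  const watchers = new Set();
  return {
    get: () => state,
    watch(fn) {
      watchers.add(fn);
      fn(state); // deliver the current value immediately
    },
    set(patch) {
      state = { ...state, ...patch };
      for (const fn of watchers) fn(state);
    },
  };
}

const connection = createConnectionState();
let bannerText = "";
connection.watch((s) => { bannerText = s.banner; }); // plays the role of the banner component
const bannerBefore = bannerText;

// The "service" reports a successful connection; the banner is never called directly.
connection.set({ banner: "Server v1.0" }); // placeholder version string
console.log(bannerBefore, "->", bannerText);
```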
When data is received, the scrolling list of received messages is updated in the UI. You can move between servers sending different messages and the code is smart enough to preserve the “always-on” connection if it already exists or to start a new one if it doesn’t, using the same UI button.
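The reuse-or-connect behavior can be sketched as a small cache keyed by server name. This is one possible approach, not Nebula's actual code; `openSocket` is a hypothetical factory that in a browser might be `(url) => new WebSocket(url)`.

```javascript
// Cache of open connections keyed by server name.
const openConnections = new Map();

// Reuse the existing connection for a server, or open one if none exists yet.
// A fuller version would also replace sockets that have since closed.
function getConnection(server, openSocket) {
  if (!openConnections.has(server.name)) {
    openConnections.set(server.name, openSocket(server.url));
  }
  return openConnections.get(server.name);
}

// Fake factory for demonstration: counts how many sockets were opened.
let openedCount = 0;
const fakeSocket = (url) => ({ url, id: ++openedCount });

const local = { name: "Local", url: "ws://localhost:9000" }; // placeholder URL
const firstUse = getConnection(local, fakeSocket);
const secondUse = getConnection(local, fakeSocket);
console.log(firstUse === secondUse, openedCount);
```

The same button handler can call `getConnection` every time it is clicked, which is what lets one UI button serve both cases.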
While these examples are trivial, your application can use the same commanding and data binding methods to communicate with any kind of server and update the UI appropriately.
Navigation & Routing
The Nebula app includes built-in features for navigation and routing. The menu displayed across the top right is made of easy-to-edit JSON. AngularJS pages are contained within views, each view being made up of directives or custom HTML components with code behind them in controllers. Straightforward file organization inside the app’s folders makes it easy to find and edit the views or directives you need to build your app.
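As a hedged sketch of what a JSON-driven menu and a route lookup can look like, consider the following. The JSON shape, labels, and view file names are assumptions for illustration and may not match the files in the Nebula app.

```javascript
// Hypothetical menu definition; Nebula's actual JSON shape may differ.
const menuJson = `[
  { "label": "Home", "route": "/" },
  { "label": "Messages", "route": "/messages" }
]`;

const menu = JSON.parse(menuJson);

// Routing: map a path to the view that should be displayed.
const views = new Map([
  ["/", "home.html"],
  ["/messages", "messages.html"],
]);

function resolveView(path) {
  return views.get(path) ?? "not-found.html";
}

for (const item of menu) {
  console.log(item.label, "->", resolveView(item.route));
}
```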
Fonts, Icons, & Colors
Icons and colors are often frustrating to add or edit inside non-trivial apps. The Nebula sample app includes a custom color naming system, making it easy to design named colors and apply them to all of your UI elements. On the home page, for example, you can click a color name to apply that color to a large icon on the page. We’ve included a custom font and the MDI icon set, too, as jumping-off points for you to choose your own fonts and icons.
Global and Scoped Data
Finally, we’ve included examples of both globally and locally scoped data, often a point of confusion in AngularJS apps. The version information shown on the app home page, for example, is a global value passed to the directive which displays it. The WebSocket messages on the same page are lexically scoped and visible only to the component that displays them. You can use these examples to build similar components in your apps without digging through documentation for working examples.
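The distinction is easier to see outside the framework. The sketch below uses plain JavaScript closures, not AngularJS scopes, and all names and the version string are placeholders.

```javascript
// Global value: one shared object handed to whichever component needs it.
const appInfo = Object.freeze({ version: "0.0.0" }); // placeholder version

function versionBadge(info) {
  return () => `Version ${info.version}`;
}

// Scoped value: the message list lives inside the closure and is not visible outside it.
function messageList() {
  const messages = [];
  return {
    add: (text) => messages.push(text),
    render: () => messages.join("\n"),
  };
}

const badge = versionBadge(appInfo);
const list = messageList();
list.add("first message");
list.add("second message");
console.log(badge());
console.log(list.render());
```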
Other Scaffolding in the App
The Nebula app contains other functions it inherited from another product of ours. These are not utilized or documented in this release. However, an inspection of the source code will reveal the following ready-made and working features which you can use in your own Nebula app:
- JSON-based data from local files or Firebase.
- User and app configuration settings.
- Protected routes with Google authentication (no keys/ids in app).
- Page-style CMS with named, routable pages and JSON contents.
- Blog-style posts with named, routable posts and JSON contents.
- Modular components for displaying products and profiles.
Note that this version of the Nebula app will show console logs and error messages and will also display broken links and missing page content. The content was removed from this open source version because it contains code that is the subject of a US patent application.
You can visit the Ungallery app from which Nebula was built to see these features working in a production alpha.
Index to Source Files
The Nebula installers create the following folder structure under the directory where you choose to install this product. These examples (installation directory: c:\Nebula) are for Windows but the same structure applies under the installation directory for all platforms.
Contains nebula.exe, the desktop executable. For diagnostics, Nebula can be run from your own browser. See the documentation to start the local web server and for more information about building your own Nebula app from source.
Contains index.html, the app home page.
Contains folders for the custom controllers, css, directives, images, json, services, and views which comprise the AngularJS app. The AngularJS functions used to build these components are documented here. Notice you do not need to download or install AngularJS in order to begin coding with the installed app.
Contains one folder for each framework used by the app.
Contains the back-end servers for the app. Currently compiled, these will be released in open source later. You can replace the streams module with your own back-end and rebuild the app. For testing, simply run your server in a separate process and change the ports in the app to test from a browser.
— D 4/27/2020
April 27, 2020
Note: While preparing our new Nebula tool for release, I came across this unpublished post about tooling that I’d written two years ago when we started the company in November, 2018. Sprint, mentioned here, has since become a nothing-burger sub-brand of T-Mobile. Their tooling is one of the reasons they failed.
Thanksgiving is almost as famous for the dinner conversations as for the dinner itself. This year, while we were discussing geeky stuff, my cousin asked me about the latest Intel Core i9 processors and how they fit into our hardware plan. I told him that I like to buy the fastest processor I can get at the time because it lasts longer. In my experience, the very best equipment costs about half again as much as the next tier down but performs well far longer. I said, “Most people are on a three-year upgrade cycle but with this approach you can get to a…”
“Five-year cycle,” he said, as we finished the sentence together. Xavier was a VP of Operations for a major worldwide airline. He knows a lot about procurement cycles and getting the most out of expensive hardware. And PC hardware is expensive, there’s no doubt, which is why many companies avoid spending what they should on IT.
When I was a mainframe programmer at IBM’s Santa Teresa Laboratory, we had a zero-day upgrade cycle. You could get anything you needed immediately. IBM operated a PC Store on campus just for employees. With nothing more than a manager’s signature on a tiny scrap of paper, you could stroll into the PC Store and get whatever you wanted. There was no drawn-out requisition process. I didn’t even have to specify a model or a price. Your ticket would say something like “PC Desktop” and a signature and you’d just pick out what you need. Needless to say, this was a geek’s dream come true. No having to explain your equipment requirements to people in suits. No arguments about how much it cost to have good tools.
Now you may say, “Well, IBM has a lot of money and they make computers and they could afford to do that.” Of course, those things are true. But there’s another reason for IBM’s generosity when it came to computer hardware and software. It’s because the company figured out that such an approach would actually save them a huge amount of money.
You see, IBM is very studious. They don’t do anything without having a PhD chart it out first. Years ago, they asked some smart guys to see if the computers and software tools available to programmers made a difference in their productivity. They used a simple metric: the responsiveness of the computer vs. the number of lines of code minus the lines of errors that the developer produced. Using their own worldwide network that predated the internet, it was easy to measure the computer’s responsiveness — down to the millisecond. Likewise, it was easy to correlate every programmer’s output with how many lines of ‘good’ code and how many lines of errors he or she wrote.
If you’ve ever written code, their findings won’t surprise you. Every millisecond of delay in the developer’s computer was costing the company money in reduced output and increased errors. With what IBM was paying programmers then (and what they cost now), it was simply cheaper to make the computers as fast as humanly possible in order to allow the developers to be as productive as possible. When programmers leave the flow state, as Mihaly Csikszentmihalyi calls it, there is a time penalty to recover. Any delay in the hardware or software could result in losing the programmer’s attention and taking him or her out of flow. Hence their willingness to let us have any tech tools that we needed for our jobs.
Taken from a 30,000 ft. view, this concept applies to everyone in your organization — not just programmers. In an earlier post, I alluded to a brief stint I did at the Sprint store. Let me tell you, Sprint has shockingly bad IT. Shockingly. Bad. It’s clear that no one at Sprint has studied how much time their employees spend fighting with the company’s outdated hardware and software. It’s time they’re not spending on getting new customers — or making the current ones happy, which is even more important. Anyone who watches Undercover Boss can tell you the same thing. Old, slow computers and outdated software waste your employees’ time but, worst of all, they hurt your product and your customer service quality.
In addition to making the programmers more productive, having better computers made working there a lot more fun! I’ve always thought that one of the best perks of being in the computer business is that you get to have kick-ass hardware. It’s quite possible that the enjoyment of having performant hardware and software contributes just as much to programmer productivity as response time. People should have the tools they need and the tools they want in order to do a great job. At IBM, this required a $40 million mainframe which they also let me access from home. How many nights I spent volunteering to learn about work hardware and software because it was simply awesome — versus being frustrated all the time with crappy tools and trying to avoid them. Today, I use a $2,000 laptop for the same reason.
IBM had another brilliant way to spread tools around to those who needed them: an early internal version of eBay where anyone could post hardware they didn’t need and any other employee could take it from them. All that was needed to complete a transaction was the signature of the manager in the listing department and the signature of the manager in the receiving department. The surplus hardware was then shipped via the company’s internal shipping mechanisms to the loading dock nearest the new “buyer.”
What a brilliant way to reuse equipment within the company! At IBM, we called this spending Blue Money, money that had already been spent and now should be maximized and used again as many times as possible since it wasn’t costing the company anything with outside vendors. Perhaps there is some “Blue Money” in your organization that could be better utilized somewhere else.
At The Mimix Company, we’re working hard to invent the next generation of reading, writing, and research tools for computer workers of every kind. We believe that better tools give better results. We invite you to check out the whitepaper and join our efforts!
January 21, 2020
The following is a brief excerpt from the MSL Language Specifications 1.0, a 70-page comprehensive introduction to a new programming language for applications built with Mimix technology. I hope this tiny introduction encourages you to download and read the full MSL Spec, and to email me with your comments, questions, and ideas! — D
“Mimix Stream Language is a domain-specific language designed for recording and playing back text editing sessions. A list of MSL expressions is an indelible record of source materials, original writing, editing, and notes in the order in which they were introduced. MSL is the filament in the Mimix light bulb. It is the most important part of the system.
MSL provides high level operations for creating and destroying atoms, building and applying metadata, recording and sharing streams, and accessing and editing these elements.”
— Mimix Whitepaper
A word processor knows about words, sentences, paragraphs, pages, and documents. It does not, however, understand the factual information contained in what you read and write. Mimix represents the evolution of the word processor in that it provides a means for text processing software to interpret, display, analyze, compare, correct, and transform the factual data inside our documents.
The Mimix whitepaper describes in detail the historical precedents for a different kind of reading and writing system. It also lays out several scenarios in which such a system can radically transform how we interact with written documents. In short, this demands several new capabilities from our software:
- Fact checking and correction of factual errors in written text.
- Options for viewing facts in multiple formats.
- Access to source materials for all references.
- A “rewind” facility to return to any previous version of text.
- A portable version that can be shared with all references intact.
Mimix is a new way of looking at research and writing. To implement it requires new tools and a new vocabulary. When this document first mentions a word that has a special meaning in Mimix, it will appear in bold.
The Mimix ecosystem is built around a new programming language called MSL, Mimix Stream Language. This document describes that language in detail. A new programming language was desirable for several reasons:
- It allows modeling an editing session as a program which can be audited and replayed.
- It is easier to read and write than the underlying system code.
- It provides built-in protections that the underlying system cannot.
- It serves as a common specification between all the parts of the system.
- It enables anyone to write software that works with Mimix.
Recording a series of text documents and edits in the form of a program allows thinking about an editing session in programming terms. We can view words and paragraphs as variables and values. Text references such as a document we’re quoting can be viewed as dependencies. This lets us extract any part of the program (the editing session), find its text dependencies, and work with them using any software that understands MSL.
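To illustrate the dependency idea only, the sketch below models a session as plain JavaScript objects and extracts one paragraph together with everything it relies on. This is not MSL syntax; the object shape, ids, and text are all invented for the example.

```javascript
// Hypothetical session model in plain JavaScript objects (this is not MSL syntax).
const session = [
  { id: "src1", kind: "source", text: "Text of a quoted document" },
  { id: "src2", kind: "source", text: "Text of another document" },
  { id: "p1", kind: "paragraph", text: "A paragraph quoting the first source", dependsOn: ["src1"] },
  { id: "p2", kind: "paragraph", text: "A paragraph building on p1", dependsOn: ["p1", "src2"] },
];

// Collect an item together with everything it depends on, directly or indirectly.
function extractWithDependencies(items, id, found = new Map()) {
  if (found.has(id)) return found; // already collected; also guards against cycles
  const item = items.find((entry) => entry.id === id);
  if (!item) throw new Error(`Unknown item: ${id}`);
  found.set(id, item);
  for (const dep of item.dependsOn ?? []) extractWithDependencies(items, dep, found);
  return found;
}

const extracted = extractWithDependencies(session, "p2");
console.log([...extracted.keys()]);
```

The extracted set is self-contained: it could be handed to another tool without losing the sources the paragraph was built on.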
The MSL language is designed to be easier to write than the actual system code which is needed to perform the same functions. You can think of MSL as a shortcut method for describing reading and writing as they take place in linear time.
In most cases, Mimix users will never see MSL, but it’s designed to be easy to read by human beings regardless. This makes testing, debugging, and extending the system easier.
The MSL language provides built-in protections for data that the underlying system would not. This is used to enforce protections for factually correct information, which we call canon. MSL also uses a built-in set of rules to prevent overwriting important data. Most importantly, it offers a granular rewind capability in which any piece of data can be “rewound” to any of its previous values. This same ability allows extracting any value in the system for use in a different context with all of its dependencies and sources intact. These features are unprecedented among programming languages and are defining aspects of MSL.
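A rough way to picture granular rewind is a store that appends every value and never overwrites one. The sketch below is a generic JavaScript illustration, not the MSL engine. It assumes that "protection" means rejecting writes to a key once it has been marked as canon, which is a simplification of whatever rules MSL actually applies.

```javascript
// Hypothetical versioned store: values are appended, never overwritten.
class RewindableStore {
  constructor() {
    this.history = new Map(); // key -> array of every value the key has held
    this.canon = new Set();   // keys protected from further change
  }
  set(key, value) {
    if (this.canon.has(key)) throw new Error(`"${key}" is canon and cannot be changed`);
    if (!this.history.has(key)) this.history.set(key, []);
    this.history.get(key).push(value);
  }
  markCanon(key) {
    this.canon.add(key);
  }
  // stepsBack = 0 is the current value, 1 the one before it, and so on.
  get(key, stepsBack = 0) {
    const versions = this.history.get(key) ?? [];
    return versions[versions.length - 1 - stepsBack];
  }
}

const store = new RewindableStore();
store.set("title", "Draft");
store.set("title", "Final");
console.log(store.get("title"), "was", store.get("title", 1));
```

Because each key keeps its own history, one value can be rewound without disturbing any other.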
The MSL language is communicated between a viewer, where text editing takes place, and an MSL engine which records the expressions in a hybrid database which offers access to both past and present values. This allows the development of numerous compatible engines and viewers, all of which rely on the present MSL specification.
Anyone can write tools that work with MSL and this is encouraged by specifying a language definition that is simple, complete, and coherent. Developers can easily visualize the MSL they will need to input or output in order to achieve their desired results.
Mimix Stream Language is a domain-specific language designed for recording and playing back text editing sessions. A list of MSL expressions is an indelible record of source materials, original writing, editing, and notes in the order in which they were introduced. MSL is the filament in the Mimix light bulb. It is the most important part of the system.
MSL is designed to be good at two kinds of jobs. First, it can be used to record the actions of a Mimix user as he or she reads, writes, edits, and annotates text. It provides good traceability in that the source of values and their changes over time can be readily ascertained. It enables a rewind facility so that previous versions of a stream can be viewed, extracted, or rolled back.
Second, it can be used as a playback language to recreate the same view or a different view of the recording. This allows recreating an entire view or stream from the MSL text at any point, or creating a new type of presentation or transformation of the materials a user has viewed, written, edited, or annotated (or any subset of these) up to a certain point. In other words, MSL can be used to play back any part of the stream with a different view than the original.
MSL is a stream recording language which captures a user’s reading, annotations, writing, and editing into a stream of views as they take place in the viewer. User behaviors are recorded as a series of MSL expressions. These can be used individually, in parts, or in a sequence to recreate any part of the stream using the same view, a different view, or a different viewer.
MSL expressions are executed by a state machine. The state is stored in a hashtable in the MSL engine. MSL expressions passed to the engine can get or set values in the hashtable. The current value of the hashtable represents the full system state after the last MSL expression executed. The state machine serves as a convenient reference to the values contained in the stream, which are also preserved separately.
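As a rough illustration of the state machine described above, here is a minimal Python sketch. It is not the Mimix engine. The class name `ToyEngine`, the `execute` method, and the simplified rule that `(@key value)` sets while `(@key)` gets are all assumptions made for this example, and only flat (un-nested) atoms are handled.

```python
import re

# Assumed, simplified syntax for this sketch: "(@key value)" sets, "(@key)" gets.
_EXPR = re.compile(r"^\(@(\S+?)(?:\s+(.*))?\)$", re.S)


class ToyEngine:
    """Hypothetical stand-in for an MSL engine: a hashtable plus a recorded stream."""

    def __init__(self):
        self.state = {}    # the hashtable: current value of every atom
        self.stream = []   # every expression received, in order

    def execute(self, expr):
        match = _EXPR.match(expr.strip())
        if match is None:
            raise ValueError(f"not a flat atom expression: {expr!r}")
        key, value = match.groups()
        self.stream.append(expr)       # getters and setters are both recorded
        if value is not None:
            self.state[key] = value    # a setter updates the system state
        return self.state.get(key)     # either way, return the current value


engine = ToyEngine()
engine.execute("(@hawaii-notes International Marketplace. Pearl Harbor. Shave ice.)")
print(engine.execute("(@hawaii-notes)"))
```

After the two calls, `engine.state` holds the full state as of the last expression, and `engine.stream` holds the separately preserved record.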
MSL is an imperative language in which every expression recalls or updates the system state in a sequential order. The state is defined as the value of the hashtable up to and including the execution of the last MSL expression. Although MSL is not procedural, it does allow the application of selectors and transforms, functions which are applied to an atom, the basic unit of data in the language. Selectors and transforms permit viewers and engines to define functions outside of MSL which are then acted upon by MSL values.
MSL is lexically scoped, with atoms having values defined in the atoms themselves, in their metadata, in their applied canon, or in one of these belonging to a previous containing view, stream, world, or machine.
MSL is a functional language in which every program contains all its literal values inline and all functions can be resolved through variable substitution without reliance on an independent state. MSL is also functional in the style of the Lambda Calculus in that every MSL expression is permitted only one parameter, its value. MSL supports currying values in that the results of single-argument functions are passed upwards to other functions in order to provide a final value.
The hybrid database in a Mimix system serves only as a convenience for quick resolution of an atom’s value. The actual values themselves are always recorded in MSL text and can be resolved by simply examining the MSL text backwards without reference to the hybrid database.
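The idea of resolving a value by reading the text backwards can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `resolve_from_text` is hypothetical, the stream is modeled as a list of flat expressions, and `(@key value)` is assumed to be a setter while `(@key)` is a getter that sets nothing.

```python
import re

# Assumed setter form for this sketch: "(@key value)". A bare "(@key)" sets nothing.
_SETTER = re.compile(r"^\(@(\S+)\s+(.*)\)$", re.S)


def resolve_from_text(stream, key):
    """Find an atom's current value by scanning the recorded text from newest to oldest."""
    for expr in reversed(stream):
        match = _SETTER.match(expr.strip())
        if match and match.group(1) == key:
            return match.group(2)   # the most recent setter wins
    return None                     # the key was never set in this stream


stream = [
    "(@gas-price 3.09)",
    "(@station Shell)",
    "(@gas-price 3.19)",
    "(@gas-price)",
]
print(resolve_from_text(stream, "gas-price"))
```

No hashtable is consulted here. A lookup table would only make the same answer faster to find.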
MSL is not Turing complete and essentially consists of only elaborate variable getters and setters. No MSL expressions perform looping, testing, branching, or recursion.
MSL is an interpreted language with the system state stored in a nested series of hashtables backed up by a stream of MSL text. MSL expressions which set an atom’s value change the system state by updating values in the hashtables and recording a new entry in the text. MSL expressions which only return a value do not change the system state and simply recall the current value from the hashtables or the text while also recording the expression itself in MSL text.
The simplest way to provide a permanent record of what you read and write is to write down everything you do in chronological order. In fact, this is how MSL works.
If you were planning a trip to Hawaii, you might read two books and make some notes:
- (@fodor-book Fodor’s 2020 Guide to Honolulu and Oahu)
- (@lonely-planet Lonely Planet Hawaii)
- (@hawaii-notes International Marketplace. Pearl Harbor. Shave ice.)
These are MSL expressions. Whenever a valid MSL expression appears in this document, we’ll use a code font in blue. When MSL concepts are referenced without a full, valid syntax, we’ll use a code font in black.
The (@) notation here refers to an atom, the basic unit of storage in MSL. Every atom includes a key which identifies it. It’s customary (but not required) to write the key up against the @ sign.
(@lonely-planet Lonely Planet Hawaii)
This atom contains the title for a book. In order to be useful as a reference going forward, we would need to record the entire book. We could simply dump the book’s contents into the atom. Or we could give individual paragraphs, people, places, etc. their own atoms that sit inside the book atom.
(@lonely-planet Lonely Planet (@Hawaii)
  (@intro The goddess (@Pele) …)
  (@chapter1 (@p1 When the ancient (@Hawaii Hawaiians)…) (@p2 …)))
Text which is read, written, or edited by a Mimix user is recorded sequentially in a series of atoms, and this will be shown through many examples.
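To make the nesting of atoms concrete, here is a small Python sketch that reads a nested atom into a tree. It assumes only the surface syntax visible in the examples above, `(@key ...)` with text and child atoms inside. The function names and the dictionary layout are hypothetical, whitespace around text fragments is trimmed, and a bare `)` inside text is not supported.

```python
def parse_atom(text, pos=0):
    """Parse one "(@key ...)" atom starting at text[pos]; return (atom, next_pos)."""
    if not text.startswith("(@", pos):
        raise ValueError(f"expected '(@' at position {pos}")
    pos += 2
    start = pos
    while pos < len(text) and not text[pos].isspace() and text[pos] not in "()":
        pos += 1
    atom = {"key": text[start:pos], "parts": []}
    buf = []
    while pos < len(text):
        if text[pos] == ")":
            _flush(buf, atom)
            return atom, pos + 1            # this atom is closed
        if text.startswith("(@", pos):
            _flush(buf, atom)
            child, pos = parse_atom(text, pos)   # a nested atom
            atom["parts"].append(child)
        else:
            buf.append(text[pos])           # ordinary text inside the atom
            pos += 1
    raise ValueError(f"atom '{atom['key']}' is never closed")


def _flush(buf, atom):
    """Move any pending text into the atom's parts, trimming outer whitespace."""
    chunk = "".join(buf).strip()
    if chunk:
        atom["parts"].append(chunk)
    buf.clear()


source = "(@chapter1 (@p1 When the ancient (@Hawaii Hawaiians) arrived) (@p2 Later))"
tree, end = parse_atom(source)
print(tree["key"], [part["key"] for part in tree["parts"]])
```

Each atom becomes a key plus an ordered list of parts, where a part is either a text fragment or another atom.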
Atoms can be grouped into views, streams, worlds, and machines. Atoms can also be defined in canon, giving them a type of data protection. Views, streams, worlds, machines, and canon are atoms that reside in their own namespaces. These basic concepts enable all five of the abilities we need to go beyond word processing.
— D 1/21/2020
December 16, 2019
Improving the user experience with software in the ways described in the Mimix whitepaper demands new thinking about how we present and deploy applications to users. Today, I’m excited to announce our Nebula system which greatly simplifies these tasks for anyone using or developing Mimix applications.
In Mimix, a user’s reading and writing are recorded in a series of (msl) language expressions called streams. Like word processing documents, streams may be located on a user’s own computer or on a remote host. But streams are not documents. Thanks to their (msl) nature, they contain a great deal of intelligent information about the facts inside documents and how they came together. This information can be used to present unique user interfaces that benefit each type of stream. We call these unique interfaces worlds.
Each world controls not only what streams it offers but also how they appear to the user. A medical world, for example, has a user interface tailored to doctors and medical staff. It uses their terminology and offers specific features for working with medical streams. An oil exploration world would have entirely different streams, navigation, user interface buttons, etc. A button to “Compare Rx” prescriptions would be very useful in a medical world but is useless in an oil world, even though both worlds display (msl) streams.
Many organizations using Mimix will want help with customizing the body of document streams and the interface world for their users. These kinds of changes are difficult to implement with today’s office tools. Every kind of office sees the same interface for Microsoft Word, Excel, and PDFs, even though many offices could benefit from customization if it were simple.
Not all of the desired customizations are as dramatic as the medical vs. oil example. Some customers just want unique navigation or security over a set of standard office documents. Many Mimix customers will also want to offload the job of hosting their worlds and streams to a cloud solution from The Mimix Company.
Nebula was designed to deliver and host these customized worlds, but its modular design also lets the Company offer a suite of valuable free resources which simplify the user on-ramp and encourage exploration and experimentation.
How It Works
Nebula is a way to deliver (msl) streams from many worlds, each with their own set of rules and user interfaces. It does this with a single application which runs identically on Windows, Mac, and Linux. More than a cloud service, Nebula works both offline and online to provide automatic connections to a variety of worlds and their streams without the need for complex configurations, accounts, or profiles.
Nebula is user-centered and privacy-focused. Unlike web-based applications, control is centered in the open source Nebula app which resides on your physical machine. You decide which worlds you’ll explore with Nebula. Your Local World includes administration tools to see and manage all the streams and worlds available to your machine. Connections between worlds are made on your machine, not between remote worlds themselves. Likewise, (msl) processing is done on your machine by a built-in stream engine. This simplifies administration for world owners and allows greater freedom for users in how they work with all the streams to which they have access.
Along with the hybrid database and (msl) itself, Nebula is a key development that makes it easier to develop, deploy, and support our products. Its free and extensible design encourages others to adopt the system, driving demand for the Company’s expertise and paid services.
A Nebula app is a self-contained system which includes several key components:
- Built-in world HTTP server. Serves up worlds as HTML/JS user interfaces.
- Built-in stream web socket server. Communicates streams via (msl).
- Built-in web browser. Looks and works identically on all machines.
- Built-in Local World. Provides immediate usability, navigation, and admin.
- Built-in Local (msl) Streams. An offline, private data store.
Worlds can be designed and implemented with standard web design tools and are not restricted to a particular framework or language. They can incorporate any kind of display, controls, or security desired by the customer. Worlds look and work identically across the organization for Windows, Mac, and Linux users.
The Nebula app uses a separate connection method, web sockets, for communicating with streams and a separate WS wire for each source of streams. These (msl) web sockets are “always on” inside the app. Streaming (msl) separately from the UI offers three key benefits: First, a single world can integrate several sources of streams into a single custom interface.
Secondly, separation of streams and worlds makes it easier to keep the data protected and distinct from different uses of the same Nebula application. The user of a world only has access to the streams the world offers to that particular user. This provides a simple way of managing complex sets of documents from different sources in a single user application. Moving data from a stream between worlds is easy when the user permits it: a Mimix world can send an (msl) expression to any other world on this machine or a different one.
Finally, navigation between worlds and streams is simplified because it’s handled by the world itself, not the Nebula app. The built-in Local World offers one-click access to the online Hello World. Hello World includes public streams but it can also be customized to offer streams from private or corporate worlds. No configuration of the Nebula app is required to make these connections.
The Local World and Local Streams are important components of the Nebula app which represent several fundamental aspects of the Mimix design. The Local World has a built-in interface to, at minimum, the Local Streams. This provides a place to start working with documents immediately, a place to save any streams from any other worlds, and a fallback world for offline use.
Every world can be customized to provide any type of desired user interface. The free Local World included with Nebula, represented by the free icon in the diagram, includes one-click access to the online Hello World with resources for Mimix users.
Streams can also be customized. A number of free Local Streams ship with Nebula, such as sample texts from literature and science. An organization might ship Nebula with its own customized Local Streams.
Local World offers an administrative interface to manage Local Streams as well as other worlds and streams available to the machine. Streams administration takes place over a separate web socket from (msl) communications, represented in the diagram by a second WS wire between the Local World and Local Streams. Customized versions of the Nebula app can include or exclude this administrative wiring. In many cases, a world will make admin functions available to some users and not others by enabling or disabling this admin web socket.
Hello World is a remote, hosted world provided by The Mimix Company. It includes resources for Mimix beginners and experts, including Getting Started resources for new users and documentation for experts. All Nebula users see the same Hello World, indicated by the second app user in the diagram. In this way, Hello World serves as a home page for Mimix users – but a home page built entirely in (msl) with all of the benefits therein. Hello World, like all worlds, can draw from several sources of streams. Each stream is connected on its own web socket, as shown by the three individual WS wires in the diagram.
Nebula users can easily capture and re-use content from any of the streams offered on Hello World. Likewise, they can contribute streams from their own Local World. Hello World is a training system, corporate website, sales tool, support portal, software updating tool, and community forum in one. Access to Hello World is built into Local World and always available.
Because Hello World is hosted, it can be customized and updated easily. World owners can create a hosted Hello World tailored to their own specifications, serving as a portal to both Mimix and private streams. The customized Hello World can be hosted in the Nebula cloud or remotely.
Since content is transmitted as (msl) streams, it’s easy for the world to extract just the part of the Mimix hosted content (or any stream’s content) that it deems useful, rather than just redirecting to someone else’s web page, for example. This is a feature of all Mimix worlds.
Private worlds can be entirely customized by the user or by The Mimix Company. The open source nature of Nebula allows new users to experiment and build offline Mimix solutions at no cost. It also allows enterprise users to inspect and modify the source before trusting it with their organization’s important documents.
The Nebula app includes everything a private user needs to create his or her own worlds and streams. There are no additional frameworks, builders, tools, accounts, or connections required – just your favorite text editor. Private Worlds can be displayed inside anyone else’s Nebula app without a separate distribution step and they work identically on Windows, Mac, and Linux without special configuration.
Private worlds and streams can be hosted anywhere: on a local machine with Nebula, on a network machine, or on the Nebula cloud service from Mimix. Private worlds can communicate via (msl) with any other worlds on the user’s local Mimix machine.
Corporate customers are likely to deploy some of the more advanced features available to everyone with the Nebula app. Like all worlds and streams, corporate worlds can be hosted anywhere: inside the organization, remotely over the internet, or in the Nebula cloud.
In the diagram, the Corporate World also gives access to streams from another private world, perhaps a subsidiary company or partner. In this way, Nebula allows sharing streams of individual documents or entire libraries (or any part thereof) between organizations. The Private Streams in the example control what is offered and the Corporate World controls how it is consumed and used in the user interface. All communications between worlds and streams are by independent web sockets that speak (msl).
As with most Mimix features, corporate worlds and streams can be customized by The Mimix Company. Many corporate customers will want a large body of documents of the same type converted to (msl) with their special features intact. The Mimix Company performs this work and returns a set of streams to the customer with their existing documents in (msl). The Company can also host the customized streams in the Nebula cloud.
Nebula gives academic organizations the flexibility to combine document streams from internal servers, remote locations, and the Nebula cloud into a custom interface meeting the school’s specifications. Unlike a website, however, a University World can also share streams with all the other worlds on student and faculty machines, making it easy for students to integrate university materials into their own work and to submit papers for review.
Students can share streams with each other easily, too, using the University World’s ability to send (msl) to any or all connected Nebula users. The student machines require no additional setup or configuration to connect with each other through University World and no setup from the university to enable these connections. Rather than sharing entire documents or folders, students can share any part of their work as individual (msl) expressions. Likewise, faculty writing or reports can incorporate streams and expressions from many sources, even student streams from their individual laptops.
The open source nature of the Mimix ecosystem makes it easy for users to develop new worlds and streams of their own based on (msl) technology. Some users will start with our working system and modify it to their needs. Some users will develop entirely new worlds with different user interface frameworks or concepts. Other users will want to customize or replace the engine that delivers (msl) streams. For all of these users, their work fits easily into the Mimix ecosystem for several reasons:
- All the pieces speak (msl).
- Setup is simple. Nebula has no dependencies and requires no configuration.
- Programmers are free to expand or create new worlds without any limits on language, frameworks, or user interface design.
- New worlds can be delivered inside Nebula without changing Nebula itself.
- Developers and designers don’t have to re-invent or configure Nebula to deploy their new creations locally, with other users, or in the Nebula cloud.
In the diagram scenario, our main user in the center has met a friend at university, shown as the second Nebula app user who is also connected to University World. Together, they decide to develop a new Open Source World. Because both Nebula users have access to both worlds, their new Open Source world can do anything University World can do.
The developers can build new navigation or (msl) features on top of the school’s existing streams. They don’t need any permission from the school and they don’t need to download any databases, document libraries, or frameworks to develop their unique world. Their existing installation of the Nebula app contains everything they need to make their own Open Source World.
When the developers want to invite others to try their new world, those users only need the Nebula app. If the new users also attend university, they’ll be able to access all the same school streams from their friends’ new world.
Nebula is a new way to share and use information from a variety of sources inside the same application. Built on the power of (msl), Nebula offers several improvements over browser-based applications or traditional desktop and mobile apps.
Advantages Over Web & Desktop Apps
- Nebula works identically on all systems and requires no other software.
- Nebula works identically offline and online.
- Nebula worlds can have any kind of user interface. No framework requirements.
- Nebula can extract and move data between projects or hosts without any setup.
- Nebula can “hot swap” interfaces with persistent data and hot swap data in an interface.
- Nebula is deployed identically on single machines, on networks, and in the cloud.
- Nebula data is stored identically on local and remote machines.
- Nebula data is free of scaffolding and API clutter. It is pure (msl).
- Nebula includes a complete world and stream development system.
- Nebula users retain final control over all data streams on their machine.
— D 12/16/2019
October 28, 2019
Henry Ford is famous for inventing the assembly line, a place where a complex product could be created in a series of repeatable steps. In programming terms, he described an algorithm for producing a motor vehicle. By doing so, he improved the quality and consistency of the final product while at the same time reducing its cost of production.
100 years later, it is information even more than motor cars which drives the progress of society. We have become dependent on complex electronic documents which, like a car, need to be assembled, edited, and used by others in a consistent and repeatable way. Yet no algorithmic processes have been described for document creation, verification, editing, or review.
In this presentation, we’ll explore the idea of looking at the creation of text narratives as a programming process and describe the activities of creating, editing, verifying, and disseminating written documents in algorithmic form.
If a programming language could be developed which described written documents in this way, the art and science of communication could be improved and simplified in the same way as Ford did for automobiles.
Note: This post contains slides and background concepts on the (msl) language by David Bethune. The (msl) expressions discussed are fully explained in the separate MSL Specifications.
Problems Caused by Current Text Systems
All current systems for working with text have two common ancestors. The first is the printing press, attributed to Gutenberg, although movable type was already in use in East Asia, including Korea, well before him. Our current use of precomposed paper documents which are fixed in their content and disconnected from other papers is the direct descendant of the printing process. This isn’t how our minds work. We retain fragments of documents (words, phrases, and ideas) and we connect them to other fragments in a mental map. We don’t visualize or containerize an entire document as one entity. Every paper we see simply refers to other things we already know and then adds some new context or commentary on top of that.
The other ancestor of all electronic communications is the memory and storage systems used by digital computers. Everyone knows that computers reduce everything to “ones and zeros.” There’s no room in a system like that for contextualizing or differentiating one set of bits from another. The same ones and zeros could represent text about Walt Disney, a photo of him, or a history of his company’s stock performance.
The combination of paper-based communications which have no “smarts” and are only loosely correlated with each other and of digital systems with lots of processing power but no idea of what their digital memories contain has resulted in text processing systems which are far behind our needs today.
Benefits of Looking at Text Programmatically
Computer programs are expected to be predictable in their outcomes, traceable, reproducible, debug-able, and extensible in that they can be easily shared with and reused by others. None of these things is true for written documents today. But viewing text editing as a programmatic process could open the door to a new way to use writing in society. It introduces the possibility of creating a programming language specifically for text editing, one which could enable new ways to write, verify, and use documents.
Let’s examine some key programming concepts and see how they apply to research, writing, and text editing. As we go, we’ll build the requirements for a programming language that can represent all the actions in a typical (human) text editing session.
An algorithm is simply a recipe, a process for producing a specified result. The word came into English in the 13th century and is borrowed from the name of a Persian mathematician, al-Khwarizmi, who began to codify the “recipes” for arithmetic in the 9th century.
What is the recipe for non-fiction writing? Usually, it looks something like this:
- Read a bunch of stuff.
- Mentally connect ideas and facts across documents.
- Notice facts which appear repeatedly and others which are outliers.
- Use existing text to validate your own new hypotheses and observations.
- Draw ideas and conclusions from some things you read, discarding others.
- Make notes and write new text about what you read.
- Share your writing with others, who fold it in at their own step #1.
Of course, the real recipe includes an unknown number of “loops” as we cycle back through the steps in no particular order until we decide the text is complete. Only by following all the original steps could we write an algorithm that produced the same document, along with its references, annotations, and edits.
By 1945, Vannevar Bush was an engineer and inventor who had worked on the Manhattan Project. He saw firsthand how people worked with complex documents and used them together with their own notes and writing to produce new “a-ha” moments. By attempting to automate the task, he described an algorithmic process for research, though unintentionally.
After we dropped the bomb in WWII, Bush was deeply concerned about the direction that man’s thinking was taking. He proposed a new tool to help mankind “think better.” Called a memex, this was to be a microfilm-based machine that implemented a research algorithm mechanically. Bush’s description of the machine’s behavior allows us to induce the algorithm it must be “running.”
The memex was intended to:
- Record everything you read, photographing it onto a microfilm reel.
- Record everything you wrote, photographing it in parallel with the pages you read.
- Allow sharing a single microfilm frame or an entire reel of research with others.
A machine with nothing more than these three features would still be revolutionary today. In order to create one, we should look at how specific aspects of reading, writing, and research scenarios could be described in modern programming terms. If we’re to define a programming language for text editing recipes, that language will need to emulate these three features of Bush’s design and enable them through code.
Linearity and Chronology
Because a computer’s digital memory is a “blank slate,” the chronological order of operations determines what kind of output it creates. Early video game systems, for example, used cartridge memories. Yanking out the cartridge and putting in a new one would entirely reset the computer to a new state. You may have been playing Tank a few minutes ago but now you’re playing Pitfall, which has nothing to do with Tank and knows nothing about the operation or conditions of that game.
With the earliest computers, jobs were run in batches with a new computing environment being created for each user. Realizing that the hardware was spending most of its time idle, programmers then developed timesharing systems – but each user’s job would still “start over” with a new fresh slate.
Every user’s program consists of instructions which must be followed in a strict order to produce the expected program result. In older languages, these lines were sometimes numbered. Modern programming languages interpret statements in the order in which they appear in a text document, and this gives us a jumping-off point for looking at text editing as a linear program.
A computer’s activities may be based only on what’s in its memory and storage right now, but a human’s activities always contain references to previous work or information. A programmatic view of research, reading, and writing, therefore, must extend across more than one work session. It must incorporate elements that were viewed or edited earlier in the chronology, for this is how we work as human beings.
One aspect of human behavior that’s shared with computers is particularly relevant here. A computer can only operate on instructions in the order in which they are provided. People, too, only operate on text (by reading, editing, annotating, or writing it), in a strict chronological order.
A document consisting of prose or one made up of tables of numbers can be organized as a chronology, even though the facts and figures may have been composed in an order that differs from the final document. Nearly all writing goes through an editing phase where some text is added and other text removed. While the document itself can be read linearly, the final output doesn’t contain any information about this editing process. It doesn’t show what materials were consulted, added, or discarded in the creation and organization of the final text.
A programming language that described text editing would need to act like Bush’s memex machine and “photograph” every piece of reading and writing. Edits would necessarily appear later in the list of instructions, since they were performed later. Just like the memex’s indelible microfilm records, a program that represented text editing would never actually change any existing text. It would simply record editing (or excision) as separate programming language statements that appear later in the recipe. A language which described reading, writing, and editing in this kind of chronological form – necessitating references to existing materials rather than eschewing them – would be an important first step in approaching written communications as programs.
Such a language would also allow “rewinding” text to any previous state. This would be a huge advancement over simple “undo” functions in a word processor or even file versioning. The same file could simultaneously represent all the states of a document’s creation, from its inception to its final form – along with all the references and notes used in between. This simple idea of recording everything, as suggested by Bush, enables all of the other programming magic in our text editing language.
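The “record everything, change nothing” idea can be sketched as an append-only log in Python. This is an illustration, not the language itself: the names `state_at` and `history`, the `(key, text)` tuple format, and the use of `None` to mark an excision are all assumptions made for this example.

```python
def state_at(log, step):
    """Rebuild the text state by replaying only the first `step` log entries."""
    state = {}
    for key, text in log[:step]:
        if text is None:
            state.pop(key, None)   # an excision hides the text; the log keeps it
        else:
            state[key] = text
    return state


def history(log, key):
    """Every value a key has held, oldest first."""
    return [text for k, text in log if k == key]


# Hypothetical editing session: each entry is (key, text), appended in order.
log = [
    ("title", "Trip to Hawai"),
    ("notes", "Pearl Harbor."),
    ("title", "Trip to Hawaii"),   # the edit is a later entry, not an overwrite
    ("notes", None),               # the excision is a later entry, too
]

print(state_at(log, 2))          # the session "rewound" to its second step
print(state_at(log, len(log)))   # the final form
print(history(log, "title"))
```

Because nothing in the log is ever modified, rewinding is just replaying fewer entries, and the same list represents every state of the document at once.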
Imperative & Procedural Programming
A simple list of instructions lends itself well to the idea of imperative programming. The key feature of this programming style is that it depends on the idea of a global state which is updated after each instruction. Future instructions rely on the current values from this system state, not on the literal instructions that came before them. This lends an uncertainty to their interpretation because it can be hard to predict the future state from a list of past instructions.
Imperative programming further developed into procedural programming where repetitive actions could be encapsulated into procedures and passed varying arguments. In the presence of a global system state, however, the output of a procedure is indeterminate, further complicating the debugging of these types of programs.
A language for recording research should not depend on a global state that’s unpredictable because that isn’t how we work with written materials in the real world. We are always extracting, moving, deleting, and adding to actual words that we can see, not a hidden system state.
Functional programming was championed by John Backus at IBM (where he previously invented FORTRAN) as a way to resolve the ambiguities in imperative programming and its reliance on a system state. In the functional style, the output of a function relies entirely on its inputs and not on hidden values from the system state. This corresponds better to the human way of working with documents, each of which combines fragments from others along with new work.
What if imperative and functional programming could be combined so that the language expressions themselves held the “system state?” That way, a function could collect its arguments from previous language statements in the instruction list. Rather than being hidden and unpredictable, the state would be present and easily examined. Any expression could then be resolved into a functional one with all of its arguments fully coerced into literal values.
An ideal text recording language would retain the command-oriented characteristics of an imperative language (“Do this, then do that.”) while at the same time offering the clarity and reliability of a functional language (“Given this, do that.”).
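One way to picture this combination is a sketch in which the instruction list itself is the only state. In the Python below, each statement is a hypothetical `(name, function, argument_names)` tuple, and `resolve` finds every argument by looking backwards through the earlier statements. The representation is an assumption made for illustration, not a proposed syntax.

```python
def resolve(statements, index):
    """Resolve statement `index` using only the statements before it."""
    _name, func, arg_names = statements[index]
    args = []
    for arg in arg_names:
        # Walk backwards to find the most recent statement that set `arg`.
        for i in range(index - 1, -1, -1):
            if statements[i][0] == arg:
                args.append(resolve(statements, i))
                break
        else:
            raise NameError(f"{arg} is not set before statement {index}")
    # Every argument is now a literal value, so the call is purely functional.
    return func(*args)


statements = [
    ("first", lambda: "Walt", ()),
    ("last", lambda: "Disney", ()),
    ("full", lambda a, b: f"{a} {b}", ("first", "last")),
    ("first", lambda: "Walter", ()),   # a later re-assignment
    ("full2", lambda a, b: f"{a} {b}", ("first", "last")),
]

print(resolve(statements, 2))   # uses the "first" that was set before it
print(resolve(statements, 4))   # sees the later re-assignment
```

The statements run in order, as in an imperative program, but the result of any one of them can be worked out from the visible list alone.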
Operators & Operands
Whatever style of programming we use, there is the question of syntax within the individual expressions themselves. From the highest level language down to machine instructions, programming statements consist of two parts: an operator which tells what to do, and an operand which tells what to do it with. Operators are literally wired into a chip’s design while operands are highly variable, so the traditional approach has been to write operator operand in that order, and this applies all the way down to machine language.
But this is the opposite of the way humans approach text. We do not say to ourselves, “I’d like to make something bold. Which something? This text over here.” We start with the operand. “I’d like to take this text over here and make it bold,” or “I’d like to take every occurrence of this text anywhere and change the value.” A programming language that was interpreted this way would have a much more natural reading and writing order when it comes to recording and editing text.
If our text recording language were to start with values and then apply changes to them, the entire language might read as a list of instructions in operand operator form. Working this way means giving up access to traditional languages within the same program because it would be difficult or impossible to interpret the results. The language would need to be entirely built on text operands which are then passed the operations to be done to them.
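To make the operand operator order concrete, here is a small Python sketch in which each statement starts with a text value and is followed by the names of the operations to apply to it. The operator names and the tuple form are assumptions for this example only.

```python
# Hypothetical operators; each takes text and returns new text.
OPERATORS = {
    "strip": str.strip,
    "title": str.title,
    "bold": lambda s: f"<b>{s}</b>",
}


def run(statement):
    """Apply the operators to the operand, reading left to right."""
    value, *operators = statement
    for name in operators:
        value = OPERATORS[name](value)
    return value


# "Take this text over here, tidy it, and make it bold."
print(run(("  walt disney ", "strip", "title", "bold")))
```

The statement reads in the same order a person would describe the edit: the text first, then what happens to it.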
Variable Assignment and Recall
In narrative text, information is communicated by what would in programming terms be called assignment statements. If we write that Walt Disney was born in 1901, we are essentially assigning the value 1901 to a variable called “Walt Disney’s birthday.” In other words, these two forms are equivalent:
“Walt Disney was born in 1901.” ⇒ WALTDISNEY:birthday = 1901
Other kinds of text can also be reduced to assignment statements. The periodic table, for example, shows the atomic number of each element. We could extract this into an assignment statement.
“Hydrogen 1” ⇒ HYDROGEN:number = 1
While every new piece of writing should contain some unique information or insights from the author, the vast majority of what we read and write consists of known facts like these. What day of the week is May 8th? Which drugs has this person taken and what are their interactions? All writing is peppered with facts which can be verified independently from the new materials that surround them.
Our text programming language needs some way to indicate when facts like these appear in a sentence. It’s easy to scan a document for common nouns like people, places, and things. If we include in our language an expression to isolate or denote facts, then we can assign the facts to variables, give each variable a value, and recall that value later. We can also easily determine if a later expression is trying to assign a different value to the same variable.
Traditional text comparison tools show the literal differences between two documents, usually line-by-line. But this kind of comparison doesn’t help when the documents are comprised of entirely different text. Recently, it was discovered that Jupiter has 10 more moons than previously known, bringing the total number to 79. Many published documents and web pages will have the old number. A line-by-line comparison of a new paper on Jupiter with one published in 2018 won’t show where the old paper is wrong. But re-casting both documents into a programming language and isolating the variable assignments (JUPITER:moons = 79) would make it easy to review, update, and re-use any older publication that had assigned a different value (JUPITER:moons = 69).
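A minimal Python sketch of this comparison, assuming the facts in each document have already been isolated into a dictionary of variable assignments (the extraction step itself is not shown, and the function name is hypothetical):

```python
def changed_facts(old, new):
    """Return variables that both documents assign, but with different values."""
    shared = old.keys() & new.keys()
    return {name: (old[name], new[name]) for name in shared if old[name] != new[name]}


# Assignments isolated from two papers whose wording is entirely different.
paper_2018 = {"JUPITER:moons": 69, "HYDROGEN:number": 1}
paper_new = {"JUPITER:moons": 79, "HYDROGEN:number": 1}

print(changed_facts(paper_2018, paper_new))
```

The comparison never looks at the surrounding sentences, so it works even when no line of text is shared between the two papers.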
Thinking of facts in narrative text as variable assignments opens up two new possibilities. First, we can recall the value of a variable that was previously set, regardless of when or where it was set. A student might record the periodic table at the start of a class semester and then recall its variable value in a paper written months later. There is no need to re-consult the periodic table document:
LITHIUM:number ⇒ 3
Secondly, we can test a newly provided value against the existing value of a variable and correct any errors. The same periodic table loaded by the student at the start of class shows that the atomic number of lithium is 3. If the student were to write in a future paper:
“Lithium, having an atomic number of 7…”
The system could detect that 7 is not equal to 3, the currently stored value for that variable. Values like this which are known to be correct (or “canonical”) could be marked as such and our text programming language will need an expression for doing so. Even many months later and without referring to the periodic table, errors in the student’s writing could be detected and corrected when the new variable assignment is compared with the existing, canonical value. Of course, the same method used to store the properties of the elements could be used to store maintenance requirements on airline parts, for example, and to check these against future documents.
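The recall and checking steps can be sketched in a few lines of Python. The `FactStore` class and its method names are hypothetical, and the sketch assumes a fact has already been extracted from the sentence into a variable name and a value.

```python
class FactStore:
    """Holds variable values and remembers which ones are canonical."""

    def __init__(self):
        self._values = {}
        self._canonical = set()

    def assign(self, name, value, canonical=False):
        """Record a value. Return a conflict tuple if it contradicts canon."""
        if name in self._canonical and self._values[name] != value:
            # (variable, offered value, canonical value); canon is kept as-is.
            return (name, value, self._values[name])
        self._values[name] = value
        if canonical:
            self._canonical.add(name)
        return None

    def recall(self, name):
        """Return the stored value, wherever and whenever it was set."""
        return self._values[name]


store = FactStore()
store.assign("LITHIUM:number", 3, canonical=True)   # from the periodic table

print(store.recall("LITHIUM:number"))       # recalled months later
print(store.assign("LITHIUM:number", 7))    # the student's error is reported
```

A conflict is returned rather than silently stored, so the caller can decide whether to highlight the error or correct it.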
Thinking of factual statements in writing as variable assignments allows sophisticated methods of future recall, verification, and checking. This is the first and most important aspect of viewing text editing as a program. Variables are assigned and re-assigned in a chronological order. Variable values which are different from known good values can then be isolated, highlighted, or corrected automatically.
In all but the simplest programs, version control is a huge issue. Today’s systems store each edit to a file as a full copy of the file itself with only its current text represented. This necessitates separate tools to compare files between versions and detect the changes therein.
Comparing text between versions of a narrative document is even more arduous than comparing versions of program code. In legal documents, for example, strike-through text is used inline to show words or sentences which have been removed from the original text. New text is sometimes shown italicized or in a different font. Despite these visual affordances, there is no easy way to identify all the removed text, all the new text, or all the original text and no way to roll-back or review previous versions of a document.
Even in creative writing, authors may write many paragraphs about a character or scene and then discard them, lacking any way to store those previous versions of writing and refer to them later. A programming language which was specialized for recording research and writing would be able to record all of the versions of a document inside that document itself, including other materials which were consulted during writing, the annotations made, and text which was written but not included or was later moved to a different location in the narrative.
When combined with code for variable assignment and recall, as described previously, a version control system for writing would be even more useful. Such a system could detect changes in factual information over time, helping authors to highlight when critical information in their library has changed and alerting them to new interpretations of materials they’ve already consulted.
A hash serves as a way to fingerprint a piece of text or code (or a graphic in a binary file). A hashing algorithm produces the same hash when given the exact same text and an entirely different hash if the text varies by even one character. If the current state of a document is recorded into a hash value and that hash value is also included in the hash of the next round of edits, a chain of edits can be established. This is exactly the principle used in Bitcoin and other cryptocurrencies. Transactions are hashed together into a block, and the block hash is also included in the next block, forming a chain of transactions.
An ideal language for text editing would also be able to record the hash of any document’s state at any time. It should then include that hash into the next round of hashing after the next round of changes. As with a blockchain application, these hashes could be published independently of the materials themselves and used to prove a chain of edits. With linked hashes forming a blockchain, the language could be used to quickly determine if a document, a group of documents, or an entire editing session was fully intact or if it had been edited in any way, including showing which previous version (if any) is matched by an unknown version of a document by comparing hashes without having to actually compare their contents.
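The chaining described here can be sketched with Python's standard `hashlib` module. SHA-256, the all-zero starting value, and the function names are assumptions made for this example; the text does not specify a particular algorithm.

```python
import hashlib

GENESIS = "0" * 64  # assumed starting value before any edits exist


def link(previous_hash, document_state):
    """Hash the document state together with the previous hash."""
    data = previous_hash.encode("ascii") + document_state.encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def build_chain(states):
    """Return one hash per recorded state, each including the one before."""
    chain, previous = [], GENESIS
    for state in states:
        previous = link(previous, state)
        chain.append(previous)
    return chain


def verify(chain, states):
    """True if the states reproduce the published chain exactly."""
    return build_chain(states) == chain


def find_version(chain, unknown):
    """Return the index of the recorded version that `unknown` matches, if any."""
    previous = GENESIS
    for i, recorded in enumerate(chain):
        if link(previous, unknown) == recorded:
            return i
        previous = recorded
    return None


states = ["draft", "draft with notes", "final"]
chain = build_chain(states)   # these hashes could be published on their own

print(verify(chain, states))
print(find_version(chain, "draft with notes"))
```

Note that `find_version` needs only the published hashes and the unknown document, not the stored contents of the earlier versions.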
Digital computers can only make two kinds of tests, based on the values held in their registers or in storage. A computer can tell you if two values in registers are the same, or if a value in a register is the same as one in storage. This kind of testing mirrors the evaluation that humans do when they encounter facts: Do the facts align with something already stored in our memories?
In a programming system for writing, the values of all factual information (WALTDISNEY:birthday = 1901 and JUPITER:moons = 79) can be tested against future references. If factual information were to be extracted from narrative text and stored in a computer variable, it could be tested against any future reference to that information – regardless of the containing document or the format of the presentation.
Computers can also evaluate “near misses” and this is the basis of features like spelling correction. If a known value were held in storage, such as the name Walter Elias Disney, it could be matched against partial or incomplete text, such as “Walt” or “W.E.D.” Even words like “him” could be matched to an existing variable value, depending on context. Such a system would be able to connect multiple references to a person, place, or thing in narrative text – even if those references don’t include any exact text which matches the subject.
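Here is a minimal Python sketch of near-miss matching against a stored name, using the standard `difflib` module for misspellings. The three rules (initials, a prefix of any name part, and a fuzzy spelling match) and the 0.8 cutoff are assumptions chosen for illustration. Context-dependent references such as “him” are outside what this sketch attempts.

```python
import difflib


def refers_to(reference, canonical_name, cutoff=0.8):
    """Guess whether `reference` is a partial or inexact form of the name."""
    words = canonical_name.lower().split()
    ref = reference.lower().strip()
    letters = ref.replace(".", "")

    # Initials, with or without periods: "W.E.D."
    if letters == "".join(word[0] for word in words):
        return True

    # A shortened form of any part of the name: "Walt" for "Walter"
    if len(letters) >= 3 and any(word.startswith(letters) for word in words):
        return True

    # A misspelling of a part, or of the whole name
    candidates = words + [canonical_name.lower()]
    return bool(difflib.get_close_matches(ref, candidates, n=1, cutoff=cutoff))


name = "Walter Elias Disney"
for reference in ["Walt", "W.E.D.", "Disny", "Jupiter"]:
    print(reference, refers_to(reference, name))
```

Each reference that matches could then be connected to the same variable, even though none of them contains the full stored text.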
The ability to quickly test variables against each other makes it easy to see not only if a document has changed but in what way it has changed. The health plan section of a corporate HR handbook, for example, is likely to have very different formatting and parameters from year to year. This can make it difficult to compare or even find information in the new version of the document. Written as a text programming language, it would be easy to detect if a user was consulting an out-of-date version not just of the entire document, but of the specific facts inside it which have changed.
Branching & Looping
Branching and looping are essential parts of Turing completeness, something that defines every modern computer. In the example algorithm for research mentioned earlier, a human will repeatedly “branch” based on some discovered value, or “loop” over a set of input values and perform the same action with each one.
In the context of a text editing language, however, these are detrimental concepts because they open the system up to errors in interpretation and to security exploits. A safer way to represent human branches and loops is to simply record them as linear actions performed in a sequence. The rationale behind the actions (branching) or the repeated review or editing of the same object (looping) are essentially inconsequential. Therefore, a programming language for text editing should not include branching or looping instructions.
Recursion is a form of looping: looping over one’s self. A recursive program is one which performs some task on part of its inputs and then calls itself with the remaining inputs to complete the task. While recursion is a powerful programming methodology and useful in the implementation of a programming language for text editing, the actual language itself cannot include recursion due to its unpredictability and the potential for exploits with unintended consequences.
Recursion often makes programs shorter, but this is of negligible benefit in a text editing system. Rather than call a routine which calls itself repeatedly, a text editing language should simply perform each individual action and record them in sequence. This simplifies “rewinding” the program’s (and the user’s) behavior to an earlier point and avoids ambiguities in the interpretation of the final result.
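As a sketch of what recording each individual action might look like, the Python below performs a “replace every occurrence” edit but records it as a flat list of single replacements. The implementation loops, as the text allows, while the recorded program does not. The action tuple format and function names are hypothetical.

```python
def unroll_replace(text, old, new):
    """Replace every `old` with `new`, recording one action per occurrence."""
    if not old:
        raise ValueError("old must not be empty")
    actions = []
    start = 0
    while True:
        pos = text.find(old, start)
        if pos == -1:
            break
        # Each occurrence becomes its own linear, self-contained statement.
        actions.append(("replace", pos, old, new))
        text = text[:pos] + new + text[pos + len(old):]
        start = pos + len(new)
    return text, actions


def apply_actions(text, actions, upto=None):
    """Run the first `upto` recorded actions in sequence."""
    for _op, pos, old, new in actions[:upto]:
        text = text[:pos] + new + text[pos + len(old):]
    return text


original = "69 moons; Jupiter has 69 moons"
final, actions = unroll_replace(original, "69", "79")

print(actions)
print(apply_actions(original, actions, 1))   # rewound to after the first action
```

Because every replacement is its own statement, the record can be stopped after any one of them without interpreting a loop condition.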
Compilation & Dependency Injection
In programming terms, compilation means packaging up an application in the smallest possible form that can run on another system. Research materials could greatly benefit from the same packaging. When an author refers to another person’s works, as most writing does, the original works themselves could be “compiled” into a composite document to allow the reader to access them without searching outside the system. The compiled file is smaller than the total of the materials available in the editing session because not all of the original materials are referenced in the final writing.
Today, we often use web links or citations of printed materials in our own original writing. The problem with these is that they are external to the document we’re writing. There’s no guarantee that the next person will find our links still active or will have access to the original printed documents we consulted. The programming correlate is dependency injection, in which outside code needed to run your program is injected into the source before the program starts.
A programming language for research and writing would need to include the full text of the materials that we consult (with their own facts separated into variable assignments, as previously mentioned) in order to be truly useful to the next researcher. The language should be expressive enough that dependencies can be easily identified, resolved, and injected into any code fragment. Finally, the entire body of research must be able to be compiled into a runnable program which produces the same body of research on another person’s machine. This follows Vannevar Bush’s model of recording the full text of the author’s reading along with his or her writing and sharing both of these with the next researcher.
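The dependency step can be sketched as a walk over references. In this Python example, `references` maps each document to the materials it quotes and `materials` holds the full text of everything in the editing session. All of the document names are made up for illustration.

```python
def compile_session(root, references, materials):
    """Collect `root` plus everything it references, directly or indirectly."""
    needed, pending = [], [root]
    while pending:
        name = pending.pop()
        if name in needed:
            continue
        needed.append(name)
        pending.extend(references.get(name, ()))
    # Only referenced materials are injected; the rest are left out.
    return {name: materials[name] for name in needed}


references = {
    "my_paper": ["periodic_table", "jupiter_2018"],
    "jupiter_2018": ["observation_data"],
}
materials = {
    "my_paper": "full text of the new paper",
    "periodic_table": "full text of the periodic table",
    "jupiter_2018": "full text of the 2018 paper",
    "observation_data": "full text of the data it cites",
    "unused_notes": "consulted during the session but never referenced",
}

compiled = compile_session("my_paper", references, materials)
print(sorted(compiled))
```

The compiled result carries the full text of each source it depends on, and nothing else from the session.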
Serialization & Backup
Unlike the apps we use today, a software program with memex-like capabilities would need to retain our work in between sessions. It should be easy to see what we worked on yesterday, last week, or last year. We should be able to extract any part of that work and use it in something new.
In programming terms, this represents serialization or saving the system state so that the machine can be restarted at the same place. The same programming language features that give us dependency injection also facilitate serialization. The system can take a snapshot of its state at any time and inject any material dependencies into the snapshot file. The system can also remove any text or materials on which the snapshot is not dependent (as in compilation), making it as small as possible.
A snapshot file like this is the ideal form of backup for a text editing session. By running the language statements again (on the same or another machine), the state can be returned with all documents intact. Canon or authoritative facts can also be separated into their own snapshot and used with a variety of editing sessions so that the original corpus doesn’t get repeated in the backup.
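A very small Python sketch of this kind of backup, assuming the statements are simple variable assignments and using JSON as the snapshot format (both are assumptions for this example):

```python
import json


def snapshot(statements):
    """Serialize a list of [variable, value] statements to text."""
    return json.dumps(statements)


def restore(*snapshots):
    """Re-run the statements from each snapshot, in order, to rebuild state."""
    state = {}
    for snap in snapshots:
        for variable, value in json.loads(snap):
            state[variable] = value
    return state


# Canon is kept in its own snapshot so it is not repeated in every backup.
canon = snapshot([["HYDROGEN:number", 1], ["LITHIUM:number", 3]])
session = snapshot([["WALTDISNEY:birthday", 1901], ["JUPITER:moons", 79]])

print(restore(canon, session))
```

The same canon snapshot could be combined with any number of session snapshots, on the same machine or another one.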
Bush’s memex introduced the idea of a storage medium that was also the sharing medium. The same microfilm roll used to document one’s own work could be duplicated, spliced, or printed, giving the next researcher the same path through the materials that you took – or any edited version of those that you chose. By the same token, the next person might inherit an entire reel and only choose to view or print a handful of frames.
A programming language for text editing would need to offer this same flexibility. An author should be able to extract and share any part of his or her work. Likewise, the next person should be able to extract and use only the pieces that are of interest. The dependency resolution and compilation ideas mentioned earlier can also provide this flexibility. The language definition should include an easy way to identify and resolve dependencies in any text fragment, up to and including entire books from which it quotes. The language should also make it evident from any point in text which materials are not being referenced so that the smallest possible useful piece can be extracted.
A file containing a list of programming expressions like these is not data itself; rather, it’s the instructions on how to create that data. These instructions can be started or stopped at any point. The conditions leading up to any expression can be quickly recreated (by running the expressions before it). Instructions which don’t affect the data we want can be easily ignored or removed. A programming language view of text is like a vector file for an illustration vs a bitmap image. It offers infinite transformation of the data contained in the document.
Extensibility in a programming language comes from two kinds of freedoms: what you can do with a language as it’s defined today and what you can do to it to evolve or re-purpose the language in the future.
Doing cool things with a language requires that it have a syntax which is easy to interpret in your own head. It should be clear from any language statement exactly what behavior it will have. It should also be clear what dependencies like previous code or documents are implied from any particular expression. It should be hard to make errors but easy to catch them.
A text editing language should simplify much of the “heavy lifting” of working with documents. It should be significantly easier to describe and achieve text editing results with this specialized language than it would be with a traditional programming language. A new language inspires more people if it is well-documented. It needs examples of all code syntax and multiple usage scenarios.
You can’t do too much to a language if it is broken or locked up in someone else’s intellectual property. The language description must be robust and fixed from the beginning such that old code “just works” in future systems. The language can’t be controlled by one company. It’s necessarily open source. It must be easy to incorporate this new language into many types of software, user interfaces, and storage systems and those new programs should be inter-operable by design, even if the language designer never imagined them.
A programming language for research recipes would open up exciting new possibilities for reading, writing, verifying, correcting, and sharing written information, but creating such a language requires new ways of thinking about programming and a radical departure from traditional language design.
Such a language needs to…
- Be linear and non-procedural.
- Offer a “vector” view into editing rather than a “bitmap” of text snapshots.
- Be imperative in declaration and functional in execution.
- Use an operand operator syntax.
- Offer variable assignment and recall across documents and programs.
- Hash documents for verification and version control.
- Use testing to find and correct data in variables.
- Avoid branching, looping, or recursion.
- Simplify debugging.
- Offer compiled and portable versions of any text fragment.
- Include built-in serialization and backup.
- Be easily extensible.
The Mimix Company
The Mimix Company is developing a programming language for text editing, built around the concepts discussed here. The open source (msl), Mimix Stream Language, will be incorporated into the Company’s own (msl) Engine and (msl) Viewer applications and will be available free for use in word processors, research tools, and databases of all kinds. It is the first software product for recording, verifying, correcting, and disseminating factual information based on this new and specialized programming language for text editing and analysis.
– David Bethune
January 18, 2019
Mimix is an invention that will transform how we work, how we study, and how we share information.
The ending of WWII changed everything for our country. It brought great prosperity but it left us with the tools to destroy ourselves. A brilliant scientist of the day named Vannevar (vuh-NEE-ver) Bush wrote that we needed something to help us “think better.” His answer was a machine called a memex. It would have been enormous and expensive and was never completed. But the dream is still with us.
Today, I’m going to share how our Mimix software will help people think better with new technology we’re creating. Our company’s name and slogan are inspired by his pioneering work.
You can’t watch the news today without hearing about artificial intelligence. It’s in everything from washing machines to the space program. But AI is missing from two things we do every day – reading and writing.
Now when I say “AI for written documents,” you may be thinking of spelling correction and stuff like that. Those tools are great, but they don’t help you think better and they don’t help the next person get any more out of your writing than what you put in.
If there’s more to a washing machine than meets the eye, then there’s certainly more to writing. Mimix is the first AI program for reading, writing, and communicating with others.
Word processors are great, but they’re also stupid. I’ll give you an example.
How many moons does Jupiter have? Would you have guessed 79? I wouldn’t. You might say, “Well, Google it,” but even Google would lead you to the wrong answer because 12 of them are recent discoveries. Most documents Google can find won’t know that, and Google won’t help you if you get it wrong.
Besides, why should you have to go to Google as a separate act? That doesn’t seem very intelligent. Surely the correct information could come to you automatically, as you write. Can’t it? We say it can.
Word processors throw away most of the work you put into your writing. You say, “What do you mean, throw it away?” Think of the last time you wrote a report for work or school. How many hours of research did you put in? A lot more than the reader will ever see.
We do the reader a disservice when we throw work away. We’re making that person repeat what you already did! That’s missing intelligence. Can your final documents include the research you went through? We say they can.
Of course, academic papers have footnotes but those make your readers repeat your work, too. Why should they have to find the sources for themselves? We say this needs to change.
The tools we use for reading are no smarter than the word processor. Sure, you can click a link, but that’s you repeating work again.
Even the links are dumb. Should you have to read an entire book to see a quote in context? A link can’t take you to the exact facts the author mentioned. If you don’t scour the source material, you might come to the wrong conclusions or be misled by someone who didn’t get the facts right.
Browsers and PDF viewers don’t know one piece of paper from the next. A browser can’t tell that you read 27 stories on the California wildfires and some of them contradict each other.
Today’s tools are forgetful. There’s no way to “rewind” your work stream. Often we need to go back to a paragraph we read this morning. But Adobe Acrobat doesn’t know what you did this morning or two years ago.
These tools are missing the intelligence to relate facts together across documents and over time. Our disks and cloud storage folders are a mess. Can software better organize your work? We say it can.
A better term for reading and writing is research. Important writing demands that you do your homework. You postulate your ideas, check out what others have done, make notes, and write your own new material.
Today, these are all separate tasks with no relation to one another. Can a single program bring reading and writing together? We say it can. In fact, this was a key part of Bush’s original dream for the microfilm-based Memex machine.
We’ve come a long way since microfilm, but modern programs still make you use an awful lot of windows just to get a simple project done. The word processor can’t see that you’re quoting from a web browser. It doesn’t know if you’ve forgotten to include critical information that’s sitting just one window away.
We’re left to juggle multiple programs and documents in order to complete one goal: sharing our ideas with someone else.
Mimix radically changes the broken research model. First, it brings together reading and writing in the same program. More importantly, Mimix uses AI to make sense of what’s going on in your research.
Using a concept we call canonical data, Mimix verifies facts as you read and write. Obviously, many things are matters of opinion – but many more are not.
How many moons does Jupiter have? When was Walt Disney born? And who shot JFK? Mimix lets writers and readers choose their own canonical data sources and change them later, even in someone else’s writing.
If you were trying to answer “Who shot JFK?,” which sources would you choose as canonical?
Mimix can also see across your documents in linear time. It knows that two papers you wrote are related, even if you created one yesterday and one a year ago. It knows that two web pages are related in content even if they came from different websites.
Mimix not only saves the reader time, avoiding the task of organizing web bookmarks or fumbling through document folders hunting down that critical piece of information; it also surfaces new insights and facts that you or your readers might easily miss in a pile of unrelated papers.
Writing might be an unexciting business but it’s absolutely necessary. We rely on the written word for everything from our medical records to our laws. Errors and omissions in writing can keep you from getting a job – or cause a plane crash.
Census reports and investigation by the Wall Street Journal show that ⅔ of the population uses computer documents in their work. About ⅓ of them are the document creators and ⅔ are the consumers.
Students at all levels are writers, but most important writing starts at the university level. The US Department of Education says 20 million students are enrolled in universities.
The numbers in this slide are for the US alone, but the market is worldwide. Everywhere, every day, hundreds of millions of people need to create and consume factual written materials.
Written documents are a big business. They’ve created vast profits for Microsoft, Google, and Adobe. Each of these companies has grown to near monopoly status precisely because the written word is so important.
Losing your work is even worse, and this has led to the rise of cloud computing. Today, we expect our documents to be everywhere – even after we replace a broken smartphone, upgrade a computer, or travel for a meeting.
Any commercially viable solution for reading and writing must be fully cloud enabled and not rely on users to install software, fix bugs, or create backups.
When computers were first invented, the companies that built them included software for free. This was necessary to get anyone to buy the boxes. The program code behind that software was also available for users to inspect and modify. This is known today as open source.
In 1983, beginning with IBM, big software companies introduced a new model. Instead of source files, they started shipping compiled or binary files that couldn’t be viewed or edited by humans. Most desktop software like Microsoft Word and mobile apps now fall in this category.
A backlash against closed source software began with the Linux operating system movement. Today, the trend has reversed with large companies demanding open source from their vendors. This allows customers to look for security holes, to edit the software to fit their needs, and to avoid vendor lock-in.
Open source is behind the largest software acquisition in history, IBM’s purchase of Linux vendor Red Hat.
IBM didn’t want Red Hat for Linux; they could get that for free. IBM bought Red Hat because supporting Linux users is a multi-billion dollar business today that can only grow in the future. If IBM can become the go-to brand for hosted, supported Linux, they might have a chance to save their stodgy business.
It’s a great irony that IBM dipped their toes in open source when they invented the PC. Legal PC clones led to the mass market adoption of Windows that we know today. What IBM failed to do last time around was to monetize that invention.
The things that make open source wonderful also make it terrible. It’s the biggest DIY project in history. Every bit of the open source world is DIY. That might be OK if you have a techie team behind you, but it doesn’t work in the consumer space. This is why we don’t see Linux on desktop machines. It’s just too much work to coordinate all the moving parts.
Enterprise users can grow their own systems, but they don’t like it. Ford wants to weld together cars, not code. These customers were the first to demand that someone package up open source tools for them. They want their software hosted, backed up, and fixed when it’s broken – all things that the free Linux community couldn’t provide. They want clear and easy-to-read documentation and they want someone to answer the phone when they call for support.
To satisfy these demands, The Mimix Company will provide a paid software suite with turnkey service for both enterprise and consumer customers.
I’m excited about what we’re doing at Mimix and I hope you are, too!
As always, you’re invited to email me with your thoughts. I look forward to hearing from you.
August 26, 2018
Ted Nelson’s Xanadu is one of the most important and most maligned ideas in computer history. Why has it held such sway after more than 50 years? Why isn’t there any working software for download or a GitHub community of devs? Is Xanadu so difficult to achieve that no one can pull it off, or are there other factors? These are questions I’ve attempted to answer for myself.
Truthfulness relies on an unchanging record of the past.
Gary Wolf’s 1995 investigation for Wired into what he called The Curse of Xanadu is the de facto online history, not only for its scathing criticisms of Nelson (which he later disputed in an open letter) but for the rich detail and fascinating revelations about the people and companies behind the project. Where else could you find out about Autodesk’s involvement and the $5M they spent on trying to achieve Nelson’s dream?
Certainly personal factors and generally being viewed as a crazy person are problems shared by all inventors and geniuses. If it were just Ted’s fault, others would have come forward and made Xanadu without him. A deeper analysis shows that there are technical and market reasons that keep Xanadu from being real. They fall roughly into four buckets:
- Things that already exist
- Things we don’t need
- Things Xanadu proposed but didn’t deliver
- Things Xanadu is missing
Where It All Started
Nelson’s books Computer Lib and Dream Machines, published as a single volume and bound together, are the geek’s version of Be Here Now by Ram Dass. The books are amazing and indescribable, remarkable for their breadth and freshness, completely unorganized, and impossible to navigate. These are the firsthand accounts of a brilliant traveler to a new place, witnessed and narrated for us in stream-of-consciousness style. Ram Dass was also a former academic turned on by the new and drug-induced culture of the sixties, and like him, Nelson conveyed enough “key driver” ideas to have inspired generations since. By 1980 in Literary Machines, he had formalized the main goals of Xanadu into a list of 17 rules. Let’s see which of these buckets each rule falls in.
17 Rules of Xanadu
1. Every Xanadu server is uniquely and securely identified.
We could say that every device today can be uniquely identified by an IPv6 address, a MAC address, or a domain name. “Securely identified” is an interesting phrase, but blockchains like Bitcoin meet the requirement. This Xanadu rule goes in the “exists” category.
2. Every Xanadu server can be operated independently or in a network.
This is true of nearly all computers, so it exists.
3. Every user is uniquely and securely identified.
This is the same as rule #1. With peer networking, there is no difference between servers and users. Already exists.
4. Every user can search, retrieve, create and store documents.
This one definitely exists.
5. Every document can consist of any number of parts each of which may be of any data type.
HTML covers this requirement in an ugly way. I would not vote for HTML as the way to organize composite documents going forward, but it does exist.
6. Every document can contain links of any type including virtual copies (“transclusions”) to any other document in the system accessible to its owner.
This definitely does not exist yet and it’s the thing that seems to drive Ted crazy. We’ll get to why it doesn’t exist in a minute. It goes in the “proposed” category.
7. Links are visible and can be followed from all endpoints.
This is another aspect of Xanadu that’s only proposed and doesn’t exist. Really, it’s a feature of rule #6 in that once a document is transcluded in another, you can see and follow links both ways.
8. Permission to link to a document is explicitly granted by the act of publication.
This is the first example of a Xanadu feature we don’t need. A tool for readers and writers should not get involved in the concerns of platform or content providers.
9. Every document can contain a royalty mechanism at any desired degree of granularity to ensure payment on any portion accessed, including virtual copies (“transclusions”) of all or part of the document.
Again, we are messing where we shouldn’t be messing. This “feature” is only here to enable rule #15.
10. Every document is uniquely and securely identified.
This is a terrific idea and sorely needed. Enormous problems are caused by making humans take on the naming and management of documents. Is a blockchain answer the way to address this proposed idea?
11. Every document can have secure access controls.
It can be argued that we’ve claimed to have this forever but the current systems are all pretty bad. Today’s best solution to secure access is public key encryption like we see in blockchains, but losing a key has catastrophic results and will require better custodial and backup solutions in the future. I’m putting this in the “exists” bucket.
12. Every document can be rapidly searched, stored and retrieved without user knowledge of where it is physically stored.
This absolutely exists in many forms, the most obvious of which is Google. Of course, Ted is talking about a more personalized system of searching and saving documents that interest you, but the tech exists.
13. Every document is automatically moved to physical storage appropriate to its frequency of access from any given location.
This isn’t even a user feature, it’s housekeeping. Not needed. If sharding and redundant access to remote documents is required, solutions like IPFS are coming online to provide that.
14. Every document is automatically stored redundantly to maintain availability even in case of a disaster.
This is more housekeeping but definitely useful. Systems already exist to do this. Bitcoin is a great example of redundant storage of common data which is needed by many people.
15. Every Xanadu service provider can charge their users at any rate they choose for the storage, retrieval and publishing of documents.
Could this be the straw that broke the camel’s back? The requirement to turn Xanadu into a copyright management business certainly isn’t a user feature nor necessary technologically.
16. Every transaction is secure and auditable only by the parties to that transaction.
This is a terrific and important idea which is only now coming to fruition, especially with privacy-focused cryptocurrencies like Zcash. Although recent, it exists.
17. The Xanadu client–server communication protocol is an openly published standard. Third-party software development and integration is encouraged.
And strangely, here at the end, is a rule that was violated even before work began on the code. Perhaps Nelson didn’t believe enough in his publishing business model to allow anyone to see the software as it was being developed. This aspect of Xanadu was only proposed and not delivered.
The Missing Rules
I mentioned earlier that, in addition to what was proposed, Xanadu is missing some features that would help it attain commercial success today. Let’s examine them.
18. The system provides multiple views of the same data and makes it easy to switch between them.
This idea is from Douglas Engelbart and his NLS system. Xanadu was never shown with multiple views, such as slideshows, timelines, or outlines.
19. Text is stored without formatting.
This idea is discussed at length in Nelson’s papers and even mentioned in Computer Lib and Dream Machines, but it’s not on the Xanadu rule list. It should be. Today’s systems mix text, formatting, and data structure, which makes it difficult to extract any of these later.
20. Metadata can be canonical.
This idea, which I expound upon in the Mimix whitepaper, addresses Nelson’s desire to have certain data or documents accepted as absolutes, such as the signed copy of the Declaration of Independence and its previous versions.
21. Data is atomic.
Another idea intrinsic to Xanadu but not revealed until later was that it was to hold all data in unalterable “atoms.” What appeared to be documents on screen, including your own writing, were merely atoms strung together. Ted described a complex system of altering data later and changing its address, which never took off.
22. Streams can be shared, disassembled, and re-used.
Although Ted’s papers refer to the idea of a set of Xanadu documents as a “stream,” he doesn’t develop the idea of how streams would be shared or utilized by others. These must be built-in to the system and offer abilities beyond today’s document or app sharing.
23. Your data is kept private.
End-to-end encryption is necessary today for many reasons. Only you should have access to the materials you read and write inside Xanadu or its successor.
Sorting It Out
If we rearrange the rules into the four buckets: Exists, Unnecessary, Missing, and Proposed, we get something like this:
A quick glance at this diagram shows that much of what’s needed to create Xanadu already exists or can be cobbled together from existing pieces by using APIs. An interesting side-effect of diagramming things this way is that it makes clear what was proposed but not delivered. The first two (#6 and #7) are the things that Ted talks about today in his YouTube videos.
Delivering on the Promise
If half of the problem has already been solved (or doesn’t need to be), we are left asking the question, “How difficult are the remaining pieces?” Judging by what’s left of the original 17 rules, I would say, “Achievable.” The last rule, open source, we can dispatch by simply starting from that point. Mimix or any worthy successor to Xanadu must be open source.
The Holy Grail of Xanadu was transclusion, the idea of including part of one document inside another and being able to refer back-and-forth between them. We’ve actually had transclusion for some time. Microsoft delivered it in their OLE object linking and embedding technology and HP went so far as to ship an entire version of Windows, called HP NewWave, based around transclusion of atomic objects. Today you can create a logo in Adobe Illustrator, transclude it in a Photoshop document, then double-click it from Photoshop to edit the original in Illustrator. But what Ted is referring to is transclusion while reading others’ works and writing your own. This, we do not have.
It turns out, that kind of transclusion is actually easy. It requires two things which were missing from Nelson’s proposal. The first is massive storage. It simply wasn’t possible to keep thousands of books or papers on the personal computers of the day, or even on your personal account on a university or corporate computer. Instead, Ted thought we should link to copies of the same data. That made sense in the days when schools and institutions had all the computing power and storage, but that’s not the case today. You can buy an external terabyte drive on Amazon for $50. Cloud storage is cheap, too.
Nelson liked to use the example of the Declaration of Independence, and so do I. While recognizing that documents would need to be copied and virtualized, he held to the idea that links would be to the original document or its alias. He also suggested that all links would be bidirectional, meaning that any elementary school child who quoted the Declaration of Independence could expect to have his or her paper linked back from the original document. That’s simply preposterous. Imagine the huge numbers of scientific papers which all refer to facts in the periodic table. Should visiting that table produce links to all of them? Balderdash!
If we think of transcluding a document from our own local storage rather than from some remote, shared location, we can greatly simplify the problem and produce a system that actually works for the user. Instead of linking to a remote copy of the Declaration controlled by someone else (or a copy of a copy), why not just keep a copy in your own local storage? If it comes from a trustworthy source, you can rely on it going forward. Taking the Declaration from the Library of Congress is a 100% reliable source. In Mimix, we call this canonical metadata. It’s not necessary for you to refer back to the original Declaration, just pull up your own (canonical) copy which is known to be correct. As I’m writing now, I’m referring to my own saved versions of all these documents I’ve mentioned.
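As a rough illustration of keeping your own canonical copy, here is a minimal Python sketch. It is not Mimix code. The `CanonicalStore` class and its `save` and `load` methods are hypothetical names, and identifying a copy by the SHA-256 digest of its text is an assumption made for the example, not something specified above.

```python
import hashlib


class CanonicalStore:
    """Hypothetical local store: keeps a private copy of each trusted document."""

    def __init__(self):
        self._docs = {}  # digest -> (source, text)

    def save(self, text, source):
        # Assumption: the SHA-256 digest of the text serves as the document's ID.
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        self._docs[digest] = (source, text)
        return digest

    def load(self, digest):
        source, text = self._docs[digest]
        # Re-hash on every read so a corrupted or altered copy is detected.
        if hashlib.sha256(text.encode("utf-8")).hexdigest() != digest:
            raise ValueError("local copy no longer matches its canonical digest")
        return source, text


store = CanonicalStore()
doc_id = store.save("When in the Course of human events...", "Library of Congress")
source, text = store.load(doc_id)
```

Once the copy is saved, everything you write can point at `doc_id` on your own disk. Nothing needs to reach back to the remote original.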
With multiple terabytes of personal storage, even the most prolific author or researcher could have his or her own copy of every document he or she ever examined, along with all the user’s annotations. The periodic table (or a book about the history of France) isn’t going to change after you download it, and if it does, that’s easy for software to find and fix. You would still have your old copy, of course.
In working on my nuclear book, I’ve amassed a collection of over 250,000 pages of PDFs along with my highlighting and notes inside each. There’s no software I’ve found that’s really up to the job, but the terrific and free Docear (rhymes with “dog ear” as one would do to a book) is a good place to start. Its interface is tedious and not well integrated with a writing environment. It also doesn’t make any connections for me between my data and my writing, nor offer any alternative views. But it does let me keep my notes together with the original sources they came from and let me find either of these when I need to.
Keeping a copy of every document you read in your own personal storage and accessing them with the same tool that you use for your own writing opens the gateway to a true Xanadu-style system. Now, if we draw links between our writing and the documents we read, it’s easy for the system to retrieve the referenced files and show them on the screen at the same time, even with the graphical links that Nelson preferred. After all, we have both documents on our own disk! In this environment, it makes sense to show links from every document as well as to it. In other words, every paper that you read or write about the Declaration of Independence should be linked from your view of that original document, but not from everyone else’s in the entire world.
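The two-way links described above can be sketched in a few lines of Python. This assumes all linked documents live in one user's own library. `LinkIndex` and the file names are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


class LinkIndex:
    """Hypothetical index of links among documents in one user's own library."""

    def __init__(self):
        self._outgoing = defaultdict(set)  # document -> documents it cites
        self._incoming = defaultdict(set)  # document -> documents that cite it

    def link(self, source, target):
        # Record the link in both directions so it can be followed from either end.
        self._outgoing[source].add(target)
        self._incoming[target].add(source)

    def links_from(self, doc):
        return sorted(self._outgoing[doc])

    def links_to(self, doc):
        return sorted(self._incoming[doc])


index = LinkIndex()
index.link("my-essay.txt", "declaration-of-independence.txt")
index.link("jefferson-biography.pdf", "declaration-of-independence.txt")
backlinks = index.links_to("declaration-of-independence.txt")
```

Because the index covers only your own library, the backlink list stays small. It holds the two documents you linked yourself, not every paper in the world that quotes the Declaration.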
The storage issues notwithstanding, I think Nelson skipped over the idea of local copies because he was trying to build a business around publishing and copyright management. The idea of every user having his or her own copy of a document would be anathema to such a system. All three unnecessary features of Xanadu hinged on the publishing business model. In a last ditch effort to commercialize Xanadu after Autodesk pulled out, the owners approached Kinko’s (now known as FedEx Office) with a copyright and document management system. Even for them, it was unclear who would get paid and how. The entire idea of your writing and research tool being bound-up with a publisher and his royalties is simply awful. This, at the end of the day, is probably why Xanadu failed.
Building the Missing Pieces
If the missing pieces I’ve described are truly missing in that no deployed system exists which offers them, then we have something to aim for. If we can build a program that incorporates all of the existing aspects, skips the unnecessary ones, delivers on the unfulfilled promises, and fills in the holes in the original idea — then we can have a salable product! LOL.
We’ve already seen a brute-force way to include the existing technologies. A variety of frameworks, application programming interfaces, and services can cover everything in green on our chart. It would be preferable to build a new system which incorporates open source code for these features rather than relying on outside providers to continue to offer them. The biggest unfulfilled promise of Xanadu was transclusion and we’ve already seen how that can be done by simply reading and writing with a tool that keeps copies of what you’re doing. All that’s left are the orange items on our chart.
We work with multiple views of data every day, but the process is messy and mostly manual. We often need to move data from something that looks like a list to something that looks like an email, a chart, or a slideshow. Today’s tools offer a million ways to present data but it’s still surprisingly difficult to move the same information around into different views.
Douglas Engelbart’s revolutionary NLS system had this idea built-in from the start, in 1968. Tragically, NLS was almost impossible to use. If anything, it offered the antithesis of a user interface and died commercially after a series of acquisitions and mergers. Along the way, we lost the idea of a consolidated system for multiple views.
Mimix proposes to re-frame the current document paradigm into a system of data called atoms and ancillary data about your data called metadata. If you are writing about Gaudí, the factual sources you consult would be metadata. So would any formatting you applied. Together, these are Mimix views.
I’ve mentioned formatting as an example of metadata because it needs to be separated from the writing itself. Nelson has complained that today’s WordStar and SGML-inspired systems are messing up the real data we’re after, and he’s absolutely right. Mimix does not allow the co-mingling of data and formatting and these aspects are formalized into the design. Formatting is just another kind of Mimix metadata which can be applied or removed, both of which are recorded in the stream.
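To make the separation concrete, here is a small Python sketch in which the text is never modified and formatting exists only as recorded events. The `Stream` class, its method names, and the character-offset ranges are assumptions made for illustration. They are not the Mimix design.

```python
from dataclasses import dataclass, field


@dataclass
class Stream:
    """Hypothetical stream: plain text plus a log of formatting events."""
    text: str
    events: list = field(default_factory=list)

    def apply(self, style, start, end):
        self.events.append(("apply", style, start, end))

    def remove(self, style, start, end):
        self.events.append(("remove", style, start, end))

    def active_formats(self):
        # Replay the log; the text itself is never touched.
        active = set()
        for action, style, start, end in self.events:
            if action == "apply":
                active.add((style, start, end))
            else:
                active.discard((style, start, end))
        return active


s = Stream("Gaudí designed the Sagrada Família.")
s.apply("bold", 0, 5)
s.apply("italic", 19, 34)
s.remove("bold", 0, 5)
```

After these three calls the text is unchanged and only the italic range is active. The log still shows that bold was applied and later removed.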
Nelson put a great emphasis on the idea of referencing the “one true version” of a document (or its copy or alias), but he didn’t address how that truthfulness would be determined. This is a real problem which hasn’t been addressed in any software that I know of. Taking a simple example, a college student might write a paper for music class without being exposed to such divergent concepts as diatonic set theory or serial music. “Google it,” we might reply today. But this is an artifact of the library card catalog, itself an artifact of the printing press. Instead of telling children “Let’s go to the library,” we tell them to go to Google. This is something of an improvement but we’re not home yet.
Much of the data that children would find in a library could be called canonical in computer terms, or simply “correct.” Everyone who inquires as to the distance to the Sun should receive the same answer. In this world of fake news and unreliable sources, it is more important than ever to declare and classify data according to its authenticity and sources. A more complex example is, “Who shot JFK?” What sources would you choose as canon? These are decisions best left to each individual author, but his or her choices should be clear not only to the writer but also to readers. In a Mimix stream, the author’s canonical sources are available to inspect and alter.
Truthfulness relies on an unchanging record of the past. Although facts and opinions change, our interactions with them in the past were based on our knowledge then. Our actions in the future must be based on the future values. The values that some data had yesterday were the basis of all of yesterday’s decisions and should not be erased. When changes are needed, we can simply record a new atom of data with the new values.
Nelson referred to this idea in his paper on the Xanalogical Structure of Documents, but the proposed method of accessing and “updating” these atoms was unnecessarily complex. Using the mass storage now available to us, we can simply snapshot every piece of data as it comes in and also snapshot every change to it. By replaying our snapshots, we could see where the data started and how it had changed.
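A minimal Python sketch of the snapshot-and-replay idea follows. It assumes an in-memory, append-only list. `SnapshotLog` and the sample keys and values are hypothetical.

```python
class SnapshotLog:
    """Hypothetical append-only log: every value ever recorded is kept."""

    def __init__(self):
        self._snapshots = []  # (sequence number, key, value), never edited

    def record(self, key, value):
        seq = len(self._snapshots)
        self._snapshots.append((seq, key, value))
        return seq

    def value_at(self, key, seq):
        # Replay snapshots up to and including `seq`; the latest one wins.
        result = None
        for s, k, v in self._snapshots:
            if s > seq:
                break
            if k == key:
                result = v
        return result

    def history(self, key):
        return [v for _, k, v in self._snapshots if k == key]


log = SnapshotLog()
first = log.record("phone plan price", 45)
log.record("gas price", 3)
second = log.record("phone plan price", 50)
```

Asking for the value as of `first` still returns the old price, so a decision made back then can be checked against what was known at the time.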
Whenever there are long streams of documents and you need to only access a small part, some kind of addressing system is necessary. Two programmers working for Nelson came up with a system called tumblers based on the transfinite numbers they had studied in college. The method relied on literal rows, columns, and character counts to identify chunks of text, making it fragile and unreliable after text was changed.
Since the 1950s, we’ve had regular expressions. I’ll be the first to admit that I hate them, but they do work. Rather than trying to specify the literal character positions of the data we want, regular expressions let us find the text we need syntactically. To grab the Preamble from the Constitution, I only need to write:
/we the people(.*?)america\./gis
Once I’ve referred to that piece of the original text, I could format it any way I like, display it on a slide, or quote it in the Schoolhouse Rock video that still rings in my mind. Selecting just the first letter and applying special formatting like a drop-cap is easy with regular expressions, too.
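The contrast between addressing by position and addressing by pattern can be shown with a short Python sketch. It uses Python's `re` module and an excerpt of the Constitution rather than the full text. The variable names are illustrative only.

```python
import re

# Excerpt only: the Preamble in full, then Article I cut short.
constitution = (
    "We the People of the United States, in Order to form a more perfect Union, "
    "establish Justice, insure domestic Tranquility, provide for the common "
    "defence, promote the general Welfare, and secure the Blessings of Liberty "
    "to ourselves and our Posterity, do ordain and establish this Constitution "
    "for the United States of America.\n"
    "Article I. Section 1. All legislative Powers herein granted shall be "
    "vested in a Congress of the United States."
)

# Non-greedy match, case-insensitive, and allowed to span line breaks.
PREAMBLE = re.compile(r"we the people(.*?)america\.", re.IGNORECASE | re.DOTALL)

match = PREAMBLE.search(constitution)
preamble = match.group(0)
start, end = match.span()

# Insert a title above the text: fixed character positions now point elsewhere,
# but the pattern still finds the same passage.
edited = "THE CONSTITUTION\n" + constitution
by_position = edited[start:end]
by_pattern = PREAMBLE.search(edited).group(0)
```

Here `by_pattern` is still the Preamble, while `by_position` has drifted. That drift is the fragility that position-based addresses suffer after an edit.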
The Constitution isn’t changing anytime soon, but how does the concept of atomicity apply to your own writing which might change frequently? Let’s take two very different use cases. A screenwriter needs to create characters and then draw from many resources to develop a narrative around that character. There are sure to be numerous references, notes, illustrations, etc. and these will grow and evolve over time. But keeping them atomic lets the author go back and see the original data from which the notes were derived. If some material had been deleted, it would be impossible to refer to it going forward but easy to access it going backward.
A doctor needs to keep copious notes on his or her patients as well as a variety of canonical documents such as lab tests or radiology studies. Every prescription written must be maintained. There is some list of what drugs the patient is on right now versus all the drugs he or she has ever been prescribed. Today, this is mostly in paper files. In Mimix, the patient is an atom and all of these ancillary data are applied to that atom over time. The metadata itself is atoms, too. In other words, everything known about Vioxx at the time it was prescribed to patients is quite different from what is known now.
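The difference between "on right now" and "ever prescribed" falls out naturally if each prescription change is its own unalterable record. Below is a Python sketch under that assumption. The `Event` type, the helper functions, the dates, and "Drug B" are all invented for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    """Hypothetical metadata atom applied to a patient; frozen, so never edited."""
    day: date
    action: str  # "start" or "stop"
    drug: str


def drugs_on(events, day):
    # Replay events up to `day` to find what the patient was taking then.
    current = set()
    for e in sorted(events, key=lambda e: e.day):
        if e.day > day:
            break
        if e.action == "start":
            current.add(e.drug)
        else:
            current.discard(e.drug)
    return sorted(current)


def drugs_ever(events):
    return sorted({e.drug for e in events if e.action == "start"})


# Illustrative records only: the dates and "Drug B" are invented for the example.
record = [
    Event(date(2002, 3, 1), "start", "Vioxx"),
    Event(date(2003, 6, 15), "start", "Drug B"),
    Event(date(2004, 10, 1), "stop", "Vioxx"),
]
```

Stopping a drug adds a new record instead of erasing the old one. The list for any past date can therefore always be rebuilt.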
Sharing today is second nature and we have all kinds of systems for sharing screens, apps, and the content they produce. Unfortunately, none of these systems makes it easy to do anything useful with the data after it’s shared. These are siloed applications in computer terms with data only moving up and down in the same system but not outside it. A corporate webinar or conference keynote makes a great example. Some useful content might be displayed by the presenters but it’s trapped in the video recording and can’t be extracted. At best, the audience gets a list of links they can click later. Most of the valuable information in the content stream is lost unless the viewer keeps separate notes.
Mimix introduces the idea of streamsharing, giving others a functional copy of your own research from which they can extract data, metadata, and annotations or add their own. More properly, Mimix recreates this idea from the original work of Vannevar Bush who in 1945 proposed a stream recording system called Memex based on microfilm. A Memex user who received a reel of film would be able to extract any frames, rearrange them, print them, delete some, and compile the whole thing onto a new spool. The medium carrying the information was the information itself, giving the recipient the same powers as the author.
My Mimix stream recording idea differs in several ways from app or document sharing. First of all, once shared, the stream belongs to the recipient who retains all its contents. Secondly, the stream is live and can not only be played linearly like a video but also accessed for its atoms and metadata. The new user can extract or discard any part of the stream he wishes, a concept not possible with today’s software. Because Mimix data is not bound up in documents or presentation formats, the system can show the stream recipient a summary of everything inside it and allow that person to choose what he or she wants to extract.
In addition to selectively importing the stream, a recipient is free to rewind and make changes to a “past” view of the stream’s contents. The audience member or potential new inventor could discard some part of the stream’s metadata halfway through and substitute new data of his own, changing the views thereafter. Scientific researchers could use this feature to look through old papers with the lens of new information, for example.
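Here is a rough Python sketch of what a recipient might do with a shared stream: keep an independent copy, see a summary, extract parts, and substitute an item partway through. It assumes a stream can be modeled as an ordered list of small records. The function names and sample contents are hypothetical, not Mimix's format.

```python
import copy


def share(stream):
    # The recipient gets a full, independent copy, not a view onto the sender's data.
    return copy.deepcopy(stream)


def summary(stream):
    # Count the items of each kind so the recipient can choose what to extract.
    counts = {}
    for item in stream:
        counts[item["kind"]] = counts.get(item["kind"], 0) + 1
    return counts


def extract(stream, kind):
    return [item for item in stream if item["kind"] == kind]


def substitute(stream, position, new_item):
    # "Rewind" to a position and swap one item, leaving the original stream intact.
    return stream[:position] + [new_item] + stream[position + 1:]


# A hypothetical stream: an ordered list of atoms, metadata, and annotations.
authors_stream = [
    {"kind": "atom", "value": "Hamlet, full text"},
    {"kind": "metadata", "value": "source: First Folio"},
    {"kind": "annotation", "value": "compare with Q2"},
]

mine = share(authors_stream)
mine.append({"kind": "annotation", "value": "my own note"})
revised = substitute(mine, 1, {"kind": "metadata", "value": "source: Second Quarto"})
```

The author's stream is untouched by any of this. The recipient ends up with both the copy as received (plus notes) and a revised version built on different metadata.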
A Mimix stream naturally incorporates all the documents referenced in its metadata, so a Shakespeare commentary stream would include the full text of all the plays mentioned as well as any other reference sources the author consulted. The massive storage of today’s systems makes it possible and desirable for each reader and writer to have a full digital library which he or she can annotate and reference going forward.
I leave for last the topic that came first in the history of computing and that is encryption. Your favorite news source is full of stories about why we need encryption, from corporate data breaches to government hacks and rampant surveillance. The earliest computers didn’t worry much about encryption for several reasons. Many were released before strong encryption techniques were popularized. Home computers didn’t seem to need encryption (back then) and they didn’t have the processors or storage to provide it anyway. Corporate and university computers were managed by the institutions that owned them and users didn’t worry about how to protect the data.
Things have changed drastically since then. The prevailing idea around security is, “Be afraid. Be very afraid.” Any reasonable successor to Xanadu (or any non-trivial software, really) should provide end-to-end encryption as a built-in and non-optional feature. In a system like this, the Mimix software would only store “garbage data,” of no use to anyone without your private key to decrypt it. When you share your streams with others, a different set of garbage data is sent to them which only they can decrypt. When your data is backed up in the cloud or elsewhere, only the garbage data is recorded, requiring your key to decrypt it later.
As mentioned earlier, this kind of public/private key encryption system comes with a risk. If you can’t decrypt your data, neither can anyone else. This reinforces the need for key backup systems and custodial oversight, such as making sure a trusted friend or associate has your key.
Can We Get to Xanadu?
I think we can. If we incorporate the technologies that already exist and build the missing pieces, I see no technical barriers to its completion. As for the market, it’s huge. Like word processors and spreadsheets which were once expensive and limited to big business owners, a tool like Mimix which is a hit with early adopters (education, scientific, medical, creative) can evolve into a version for everyone.
Key West, Florida
August 26, 2018
August 12, 2018
The ideas in this article lead up to the invention of the MSL language which is introduced in the article Recipes for Research and fully disclosed in the MSL Specifications, both of which are recommended for readers interested in the ideas presented here.
I’m pretty sure I was born a geek. I’ve always been interested in the latest technology, forever holding out hope that tomorrow’s inventions would make the world better than it is today. At the very least, maybe technology could one day make using technology itself easier. Alas, that has not been the case.
Not long ago, I did a brief stint as a part-time employee at my local Sprint store. That will need to be its own blog post. Perhaps if I’m feeling ambitious I could aim for something akin to Nickel and Dimed by fellow Key West author Barbara Ehrenreich. Anyway, I digress. While working at the Sprint store I got to see on a daily basis how Apple’s most advanced and consumer-friendly software (in the form of iOS) was interpreted by real people in the real world. It’s surely different than what Apple imagines.
It’s crazy to think that Apple is the world’s first trillion dollar company. It was started in Steve Jobs’s garage and the original Apple I was designed and hand built by his buddy Steve Wozniak. The other Steve left the company in the early 80’s. Though never giving it as his reason for leaving, Wozniak did say that after having a child he realized that technology wasn’t going to be the thing that saved mankind.
In those early days of personal computing, regular people could not only program their computers, it was expected of them! It was assumed that the computer owner would be in complete control of his machine. Of course, this would entail a learning curve, but the rewards were huge. Any college student could write whatever software he wished, save it on a disk, and sell it. An enterprising young man named Bill Gates got his start doing just that. It was as easy as putting a disk in the mail.
One of the most important programs for early computers was Hypercard by Bill Atkinson. It was revolutionary for two reasons. First, it let normal people design software with a modern graphical user interface based on stacks of cards. Secondly, it shipped free with every Macintosh at the insistence of its inventor who gave the product to Apple only if they agreed to that condition. Hypercard was dreamed up during an LSD trip and was clearly something designed around the idea of a “user-centered computer.” Apple had to make it go away, and they did. Hypercard would have hurt their applications software business too much, so the company thought. The company never really had an applications software business to worry about and in screwing-up Hypercard they missed out on the chance to define the web browser, which is what it would have become.
The people like Steve Jobs who made those decisions put an end to the ecosystem that made them rich. They took away the structure that they used to gain their power, transferring all of the creative and financial output of the old systems to themselves. Today, any app that you can think of will require the approval of one or more of the four big tech players: Apple, Google, Facebook, and Amazon. You will have to write your software using tools and languages they approve of. You may not discover or extend any device or platform features. In fact, quite the opposite. You’ll be constrained by your tech “partner” in the set of tools and features you can use. In this world, not only do you not own your software (or device), neither does the end user. Apple or Google will decide what code you can write and how it might interact with your device.
On one hand, this sounds terrific. In a nerdy fever dream, we could be done with “bad” software and security problems. We’d never have to worry about updates or compatibility. Our software stuff would just work because somebody big was taking care of it. Well, guess what? All of this totalitarian control over your software and devices didn’t deliver on any of those promises. It actually made them worse. We spend more time today on updates, security, compatibility, and learning to use our tools than people did when they were running WordStar on a CP/M machine in 1982.
And not only are your software and devices controlled by the big guys, your content is, too. Even content you create. How is it possible that Google or Amazon should decide what files you can own and where you put them? Yet every day, millions of people lose access to their own work because of these platforms and systems. They made the mistake of writing something under the “wrong” account. Or they saved their contacts in a different “app.” It’s as though buying a Joni Mitchell 8-track gave her record company the right to open your front door and rifle through your music collection, taking what they please. And you better not forget your Joni Mitchell password or we might not ever let you hear her again!
I’m just old school enough to still subscribe to the idea that you own your stuff. I also cling to the fantasy that every company should try to make something better than the next guy. In the old days, we sold software because it was good. It did something people wanted and were willing to pay for. The first spreadsheet, Dan Bricklin’s Visicalc, was so great that people bought computers just to get their hands on it. It was the world’s first “killer app.” Are there any killer apps today? No. No one gives a shit about animoji. Today, you are the product and software and devices are forced on you piecemeal in order to sell you to advertisers, politicians, and governments.
The foxes are running the hen house. Some will say it was always this way. The new book Surveillance Valley by Yasha Levine posits that the entire internet was nothing more than a psyops and surveillance tool from the start. While doing research for the Mimix whitepaper, I read several of the manuals and papers for Douglas Engelbart’s NLS, an important and early precursor to almost all of today’s modern computing concepts. Every document acknowledges its Army or Defense Department funding, dutifully quoting the contract numbers. NLS was also an early use for ARPANET, the DoD project which became the internet.
Before that, IBM got its start in helping the government compute ballistics trajectories and using their new punch card machines to count Nazi concentration camp prisoners. Much later, while I was there, the company defended its support of the South African government with its intentionally racist apartheid policies that resulted in incalculable human suffering. The whole history of computing, really, is traceable to military and government atrocities. Maybe they just needed it more than “The Rest of Us.”
These and countless other examples aside, I’m still a believer in digital democracy. Software is like castles in the sky. Really, we can build anything in software, including things its inventors didn’t want. This is at the core of hacking: maximizing what can be done with software and hardware. Pioneers will always find a way to push the tools and tech in a new direction in order to work around the established order. I do think that Jobs and Wozniak had that in mind, at least at the start. Many other people have done serious and important work in making computers positive tools for society. A personal favorite of mine is Seymour Papert, an educator and a co-inventor of the LOGO programming language. He believed that learning to program could help children to think better, giving society real hope for the future. Papert’s work and writing provided a computer epiphany for me in my own explorations with LOGO. It was he who made me realize that software was “castles in the sky” without limits. Papert was South African.
Microsoft, a company which grew up in an open and competitive environment with thousands of small players, came to dominate the business computer landscape and leave only a handful. But you can only drink so much of your own medicine. Today, no one seriously thinks of Microsoft as a long term factor in the computer industry. They’ve stopped innovating. Today, Microsoft aligns itself with Linux, with open source software, and even with gaming platforms like Unity that they didn’t invent. If they hadn’t, the company would be half its size already. People outside the Microsoft philosophy created other ways of doing things and those ways might win.
Can software return to some of its former values of user-centered control and openness? We’ve seen the world’s biggest companies exploit open source software to make billions without having to pay anyone for it, yet it’s the paid products of those companies that most people buy rather than the open source tools themselves. It’s an “embrace and extinguish” approach. But if techno monstrosities can use open source software to create upheaval, so can anyone else.
One area of tech where companies have embraced their retro values is in electronic music. My favorite synth company, Roland, has a full line of vintage synths based on modern technology, complete with their knobs and buttons and noticeably lacking screens or “software.” You can even go back to plug-out synths with real cables if you’d like and still use your modern DAW. Roland has also begun partnering with other companies on circuit design and re-creation. You see, unlike Apple, Roland can’t lock you into an ecosystem. Moving to another synth is as easy as putting your hands on the keys. Musical instruments have to earn the player’s business because they have to be loved.
Does anyone love software anymore? I love retrocomputing, the use of old systems for entertainment. Of course, nostalgia has a lot to do with it. But I play with emulators for many machines I never used in real life. They’re so blissfully free of updates, internet connections, and other modern computer baggage that you can actually do something entertaining with them. You can program them. And, of course, you can add all that modern stuff to them if you want to. In 2018, you can program a Wang 700 calculator by loading software from a web address.
You know what? In Lisp, you still can, though you’d write '(HELLO WORLD.) in that language. I think that if we’re going to go back to our retrocomputing values we need to start there, with a self-contained machine that’s immediately responsive to simple commands. It should be insulated from the outside world (not dependent on it), yet able to draw resources from there when the user chooses.
One thing early computing pioneers didn’t address was encryption, an interesting oversight since it is literally the foundation of all modern computing in the form of Alan Turing’s work and inventions. Could it be they didn’t want the plebeians to have privacy? In any case, a modern computing system focused on the user would need to have end-to-end encryption. In other words, only “garbage data” would be stored on the system itself, requiring a key to unscramble it into something that could be read or changed.
The use of encryption to protect data from unwanted spying or changes is part of the blockchain technology that powers Bitcoin. Bitcoin also relies on a network of peers to provide a big set of backups, a quality that could be useful to anyone for saving his own work. Today’s systems don’t have enough speed or storage to shard everyone’s files and replicate them in a distributed way. But the demands of software always exceed the capacity of hardware and those systems will be developed.
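To make “shard and replicate” a little more concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not how Bitcoin or any real peer-to-peer system does it. The function names (shard, place_replicas, survives), the round-robin placement, and the peer names are all my own assumptions for the example.

```python
import hashlib


def shard(data, shard_size):
    """Split a bytes object into fixed-size pieces (the last may be shorter)."""
    return [data[i:i + shard_size] for i in range(0, len(data), shard_size)]


def place_replicas(shards, peers, copies):
    """Assign every shard to `copies` distinct peers.

    Returns a list of (index, sha256_hex, holders) tuples. The digest lets
    anyone verify a shard they get back from a peer. Placement is a simple
    round-robin, a hypothetical choice made for this sketch.
    """
    if copies > len(peers):
        raise ValueError("need at least as many peers as copies")
    plan = []
    for index, piece in enumerate(shards):
        digest = hashlib.sha256(piece).hexdigest()
        holders = [peers[(index + k) % len(peers)] for k in range(copies)]
        plan.append((index, digest, holders))
    return plan


def survives(plan, lost_peer):
    """True if every shard still has at least one holder after a peer vanishes."""
    return all(any(h != lost_peer for h in holders) for _, _, holders in plan)
```

With two copies of every shard, losing any single peer still leaves the whole file recoverable. With only one copy, it doesn’t. That is the whole argument for a network of peers as a “big set of backups.”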
I can’t bring up “future computing” without mentioning Magic Leap, who this week released the Creator Edition of its augmented reality headset. The system projects 3D digital objects and characters into your eyes in a way that makes them appear to be in real space. Sort of. Magic Leap founder Rony Abovitz is convinced that his system is how we’ll want to interact with computers in the future. And Samsung thinks you want to talk to your washing machine. I’m not so sure.
It seems to me that people might prefer that a future computer act more like one from the past while retaining some of the great inventions of today. The internet has to be there, along with encryption. We need modern devices and user interfaces. But underneath that we need privacy, reliability, simplicity, and control. We need comprehension and systems tailored to our needs, not those of platform providers.
August 11, 2018
I love designing logos for other people, but I usually hate doing it for myself. In the case of the Mimix logo, there were three immediate inspirations. I wanted the logo to have a sixties or historic feel because the ideas behind Mimix came from that era, and so did I.
One was the Apple /// logo with three forward slashes. When I was growing up in Silicon Valley, the neighbor across the street had an early Apple ][ while the other neighbor next door had one of the first IBM PCs (and a gorgeous card punch machine the size of an executive desk!) I like the idea of “turning computing around” from what it has become today, so let’s turn Apple’s model name into three backslashes.
Secondly, I wanted the Mimix symbol to look like any other math symbol that might appear in an equation. This is a nod to Alonzo Church and his lambda calculus, which in the 1930s offered a new and more advanced way (over Turing) to represent computer operations. It was to become the basis of symbolic computation and the Lisp language that underlies Mimix. It has been said that any sufficiently powerful computer language is nothing more than a subset of Common Lisp, and the more you study that language, the more it appears to be true. The reason is that Church solved the basic problem of how to represent symbolic functions. Everything after that is just an abstraction over his work.
Finally, for a color palette I turned to the PDP-11, part of the DEC family of machines on which so much early Lisp ran. It was certainly gutsy and sexy as hell in its orange and purple colorways straight out of a PSA interior. PDPs caused serious computer lust in the hallways of the universities that had them. While that DEC model was a bit too early for me to be involved, I did get to program one of their VAX machines while hanging out as a latchkey kid at my Mom’s office. The VAX’s printed manuals in their orange vinyl covers definitely had their design roots in the PDP color family.
The final logo has a drop shadow, but not a soft, modern one. The shadow itself is almost mathematical, like something that would be produced by an old computer. Everything about Mimix is something that was envisioned for an old computer. Sadly, that computer has not yet come to be.