Is there any alternative approach to the design of future interactive software, somewhere between commercial “user journeys” – efficiently scripted interactions toward objectives that somebody else has defined for you – and interaction with machine learning-based AI: stochastic parrots that, while replicating your actions or those of others, have no goals at all beyond consuming your attention?
When computer systems allow people to discover and pursue their own goals, such systems support two essential outcomes of conscious attention: the opportunity to create a unique self, and the agency to influence the course of one’s life. Creativity and agency are not achieved by software offering a set of defined steps toward a defined goal, or by training a machine learning algorithm to imitate the actions of other people. Reducing mechanical repetition can be useful, because repetition in itself may be an obstacle to creativity. Indeed, repetition often comes about in the first place through the obligation to blindly follow rules. However, I don’t want to suggest that every computer user should be constantly alert, obliged to innovate, looking for opportunities to improve efficiency or invent new things. The intention of Moral Codes is to give people a choice, so that they can do those things if and when they want to.
This perspective, focusing on goals that the user might create by themselves, requires a different way of thinking about user interface design. The earliest theories of human-computer interaction described the human user as if a person were also an AI algorithm, having clearly defined goals as in a game, and only needing to be shown the correct steps to take in order to win. Although the terminology sounds liberating at first (who wouldn’t want to be a “winner” and achieve their life goals?), the meaning of the word “goal” in problem-solving AI is limited to predefined rules and criteria. An AI goal must be measurable, for the algorithm to assess whether it has been satisfied, and it must be specified in terms of a well-defined problem space – the rules of some particular game that define what moves you are allowed to make.
These things are sometimes true, for example in board games, where much of the earliest AI research was carried out. Some of the most acclaimed achievements of AI have been playing board games, as with DeepMind’s AlphaGo. However, board games are themselves algorithms – a set of intentionally unambiguous rules, including one rule to calculate the winning score. When an AI system plays a board game, this is just one algorithm playing another one.
AI algorithms can be effective in other areas of life, but only in parts of society that are predictably structured. Defined goals and rule-based algorithms do not support the potential for creative self-expression, or the agency to influence the course of your own life. Furthermore, algorithms that follow the rules, no matter how well they do so, won’t help us to change rules that are unjust or outdated. Changing the rules also requires creativity, as well as access to the knowledge tools – or codes – that are used to define the rules[1]. The most significant problems faced by humanity (climate change, war, inequality) result from our current system of rules. Early critics of algorithmic problem solving, Horst Rittel and Melvin Webber, called these “wicked problems,” wicked because they did not meet the basic requirements for AI-style approaches[2]. In a wicked problem, the goals are not clearly specified, and the rules of the game are not fully defined.
It is this close dependency between social rules and technical algorithms that means access to code is a moral enterprise. If the purpose of AI systems is to recognise, repeat and facilitate established kinds of behaviour – machine-learning from the past in order to do more of the same thing in future – then alternative knowledge tools are needed for other human purposes.
A user interface designed as a knowledge tool for creative agency would not be restricted to specific goals or pre-defined journeys, but would instead provide a space for potential exploration. In this way of thinking, the task of a tool designer is not to facilitate a defined sequence of actions, but to enable a kind of exploration in some abstract world of possible futures, whose potential is determined by the language of its code and structure of the space this language describes. This distinction might be imagined by comparison to the kind of video games where playing the game involves rote learning and precise repetition, in contrast to open world games where the player can explore freely. Even more creative are world-building games, where players are given the codes to construct their own world. Players who create new worlds can invent, play, and share new games. The freedom to invent rules is one of the most basic kinds of social play – when children play freely together, they often spend more time inventing the rules of the game than they do following them!
This kind of agency, in the world of software, depends on abstraction. The conventional game-style model of an optimum journey toward a known goal must be replaced by some kind of language that describes many possible journeys, navigating an abstract space, offering structured expressions of the ways that those journeys might vary. World-building offers freedom within some combination of physical behaviour, material constraints, and socially-negotiated interactions. The potential kinds of structure in that world will be determined by the structure and properties of the abstract language that we invent to describe it.
As computers were becoming more powerful and pervasive during the 1950s and 60s, researchers recognised that the mathematical abstractions inside the computer must be related to the outside world. Abstract descriptions of the world tell the computer how things are (or should be) related. Once a pattern of relations has been encoded, algorithmic rules can follow those relations from one thing to another. These two elements – relations and rules to follow them – are the foundational principles of education in computer science, often described as “data structures plus algorithms”.
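To make the pairing concrete, here is a minimal sketch in Python (the page names and the `reachable` function are invented for illustration, not drawn from any real system): a data structure encodes relations between things, and an algorithm follows those relations from one thing to another.

```python
# A data structure encoding relations: each page names the pages
# it links to. (Invented example data.)
links = {
    "home": ["about", "news"],
    "news": ["archive"],
    "about": [],
    "archive": [],
}

def reachable(start, relation):
    """An algorithm that follows the relations: collect everything
    that can be reached from the starting point."""
    seen, frontier = set(), [start]
    while frontier:
        item = frontier.pop()
        if item not in seen:
            seen.add(item)
            frontier.extend(relation[item])
    return seen

print(reachable("home", links))  # {'home', 'about', 'news', 'archive'}
```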
The first programming language, FORTRAN, was created to be an automatic FORmula TRANslator, designed to help mathematicians translate handwritten formulas into typewritten code readable by the computer, from which the computer itself could produce a sequence of operations to do the calculations. The structure of the language was a knowledge tool, reflecting the thought processes of a mathematician as already seen in their paper and pencil notation[3]. But a radical change came with later programming languages that were not simply mathematical notations, but direct descriptions of relationships between things in the real world.
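The character of that translation can be seen in even a trivial case. This sketch (written in Python rather than FORTRAN, with the quadratic root formula chosen purely for illustration) renders a handwritten formula almost symbol-for-symbol as executable code:

```python
import math

# The handwritten formula x = (-b + sqrt(b^2 - 4ac)) / 2a,
# translated almost symbol-for-symbol into executable code.
def quadratic_root(a, b, c):
    return (-b + math.sqrt(b ** 2 - 4 * a * c)) / (2 * a)

print(quadratic_root(1, -5, 6))  # 3.0, a root of x^2 - 5x + 6
```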
Two famous pieces of software have shaped our thinking. The first was a programming language called SIMULA[4], created in Norway in the 1960s, specifically to build simulations. Simulation languages are now everyday tools for engineering, logistics planning, climate science, economics, epidemiology, and many other fields where local calculations are used to predict the large-scale consequences of interaction between many individual components. The fundamental principle of a language like SIMULA is to define what an object is, what properties it has, and how it relates to other objects. SIMULA can be considered the ancestor of the videogame, since all games rely on the underlying logic (or “game mechanics”) of objects in the game and the interactions between them.
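SIMULA has its own syntax, but the principle translates directly into any modern object-oriented language. A hypothetical Python sketch, using the classic simulation scenario of customers queuing at a counter (the class and method names are my own illustration):

```python
# The SIMULA principle: define what an object is, what properties
# it has, and how it relates to other objects.

class Customer:
    def __init__(self, name, items):
        self.name = name      # properties of this object
        self.items = items

class Counter:
    def __init__(self):
        self.queue = []       # a relation to other objects

    def join(self, customer):
        self.queue.append(customer)

    def serve_next(self):
        customer = self.queue.pop(0)
        return f"served {customer.name} ({customer.items} items)"

till = Counter()
till.join(Customer("Ada", 3))
till.join(Customer("Ole", 1))
print(till.serve_next())  # served Ada (3 items)
```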
The second piece of software is Ivan Sutherland’s Sketchpad, an intelligent drawing editor created 20 years before the first Graphical User Interface (GUI) products, when Sutherland was a PhD student at the Lincoln Laboratory in Massachusetts. Sketchpad used a “light pen” (which Sutherland also invented, as described in a chapter of his 1963 dissertation[5]) to draw and select points of light on a cathode ray screen. All GUIs today rely on the core principle of referring directly to points within a 2-dimensional image, whether by touching the screen, using a mouse, tablet, pad or joystick. But specifying structure via an image, rather than with teletype text, represented a radical change for the field of computer science. The implications of that change are still being worked out today, underpinning many of the themes in this chapter and elsewhere in the book.
Sutherland saw his work as supporting “understanding of processes … which can be described with pictures,” but he wrote this at a time when computers were based only on linguistic and mathematical abstractions. It might be more natural today to think of user-centric interaction design in exactly the opposite way – that interactive computers work by understanding of pictures, which can be described with abstract mathematical processes (whether the abstract symbolic conventions of the computer desktop and phone touchscreen, or the realistic imagery of video games, virtual reality and CGI film production).
The relationship between pictures, computational abstractions, and the real social and physical worlds is absolutely critical to the creation of Moral Codes, but is often more subtle than straightforwardly reading off what we see on a screen. Computer imagery offers the potential for direct visualisation of data structures, in ways that might be more or less diagrammatic, but also for depiction of affairs in the world, in ways that might be more or less pictorial. Often these two aspects are combined into a single display, resulting in very different design opportunities and potential, but still providing computational power equivalent to the textual and symbolic codes discussed in previous chapters.
Take a look at the picture in Fig 6.1 b), which is included in Ivan Sutherland’s PhD thesis[6].
Ivan Sutherland’s rendering of a bridge force diagram in Sketchpad relies on the direct correspondence between the physical organisation of the bridge and the abstract paths along which force is transferred. This diagram can be interpreted as though it is a picture of a bridge (although not a very realistic one), because we might imagine that the bridge is constructed out of straight elements, each one corresponding to one of the lines representing a force in the diagram. This interpretation of the diagram relies on the fortunate coincidence that the lines of force to be calculated roughly correspond to the shape and position of girders in an iron bridge. The white space on the page between them looks that way because air is invisible (so doesn’t need to appear in a picture), and air doesn’t usually transfer much force (so doesn’t need to appear in the diagram).
In cases like this, where structure relates to visible components, it doesn’t hurt to interpret an abstract diagram as if it were a picture, in effect treating the map as the landscape[7]. However, there are many kinds of digital information where important structural elements are genuinely invisible. Family relationships, legal contracts, financial data and many other parts of our social reality are structured in the sense that we require of a meaningful code language, while also invisible apart from their representation in documents or symbols. Whenever we design digital codes, we need to consider how the established symbolic structure that has become embedded in the use of physical paper documents will be preserved, represented, and made negotiable or changeable[8].
Invisible abstractions can often be represented with diagrams showing important properties (such as the amount of a bank balance, or the name of an uncle), and abstract relationships (for example, a transfer between accounts, or which children share the same parents). There might sometimes be a resemblance between how an object would look in a photograph, and the abstract properties and relations that would appear in a diagram, but digital representations like the Sketchpad bridge disguise any differences. Insidious moral problems seem to come where previously visible features of power and inequality (huge houses, large estates and smoke-belching factories) are replaced by invisible ones (offshore money, social influence and the mysterious “cloud” where data is processed and stored[9]).
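A sketch of how such invisible structure can still be encoded as data might look like this in Python (the names, accounts and family members are all invented for illustration): properties are stored values, and relations are facts that can be changed or derived even though nothing visible corresponds to them.

```python
# Properties: stored values with no visible appearance.
balances = {"alice": 250, "bob": 40}

def transfer(frm, to, amount):
    """An abstract relationship: nothing visible moves, but the
    structure of the data changes."""
    balances[frm] -= amount
    balances[to] += amount

# Relations: two children are siblings if they share the same
# parents, a structural fact that no photograph could show.
parents = {"carol": {"dana", "eli"}, "chris": {"dana", "eli"}}

def siblings(a, b):
    return parents[a] == parents[b]

transfer("alice", "bob", 100)
print(balances)                    # {'alice': 150, 'bob': 140}
print(siblings("carol", "chris"))  # True
```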
Some people are sceptical of any kind of diagram, especially where these are associated with technocratic control and personal disempowerment, but structured diagrammatic representations can also be welcome. A classic example is the London Underground diagram (often described, against the wishes of its inventor, as the Underground “Map”)[10]. Before 1931, published guides to the Underground railways of London were actual maps, with the routes of each track drawn exactly as they ran on the ground, among rivers and roads. Henry Beck was an electrical draughtsman who realised that what most people needed to know was the connections between the lines, rather than the precise routes they followed – the same information a circuit diagram conveys. So he reorganised the image in the style of an electrical schematic, making it easier to see sequences, connections and intersections, although at the expense of accurate scale and position. The result of this structural rendering was to give greater freedom to users of the diagram. Compare this to an algorithm specifying the steps toward a goal. A diagram does not tell you which journey you have to make, in the way that a user journey toward a predefined goal would do, but provides you with a new visual language – a More Open Representation for Accessible Learning, or M.O.R.A.L. Code that helps people to work out for themselves what the logical possibilities might be.
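Seen as a code, the diagram amounts to a structure of connections that supports many journeys rather than prescribing one. A minimal Python sketch, using an invented fragment of the network, makes the point: the structure can be asked for every possible route, and the traveller chooses among them.

```python
# An invented fragment of the network: stations and connections.
lines = {
    "Victoria": ["Green Park", "Oxford Circus"],
    "Green Park": ["Victoria", "Oxford Circus", "Westminster"],
    "Oxford Circus": ["Victoria", "Green Park"],
    "Westminster": ["Green Park"],
}

def journeys(here, destination, visited=()):
    """Enumerate every route from here to destination, never
    revisiting a station: the diagram offers the possibilities,
    and the traveller chooses among them."""
    if here == destination:
        yield visited + (here,)
        return
    for nxt in lines[here]:
        if nxt not in visited + (here,):
            yield from journeys(nxt, destination, visited + (here,))

for route in journeys("Victoria", "Westminster"):
    print(" -> ".join(route))
# Victoria -> Green Park -> Westminster
# Victoria -> Oxford Circus -> Green Park -> Westminster
```

Nothing in the structure says which journey to take; that choice remains with the person reading it, which is exactly the freedom Beck’s diagram affords.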
Both Dahl and Nygaard’s SIMULA and Sutherland’s Sketchpad shared a computational idea that is important when presenting an open space for exploration: formal representation of objects, and the relations between those objects. In the case of SIMULA, this was used primarily to describe (and simulate) entities in the world, while in the case of Sketchpad it was used primarily to describe the behaviour of elements to be rendered on the screen. The question of how the screen relates to the world would take far longer to resolve (I discuss the problematic concept of the “desktop metaphor” in chapter 9), but much credit for the mathematical abstraction belongs to Doug Ross, a researcher at Lincoln Lab who had invented a computational technique that he called the “plex” for “an interwoven combination of parts in a structure”; a network that allowed data, types and behaviour to be named and manipulated as a conceptual unit[11].
Ross was a key architect of early programming languages for computer-aided design and robot control, and he saw this idea as a fundamental requirement for any robot that could operate in the real world. The potential of the plex idea influenced both Ivan Sutherland, and also the programming language researchers who created the “record” structure in the theory-based language ALGOL, applied in SIMULA as the fundamental representation of things in the world[12]. All of these are early examples of the abstract data type, now a fundamental basis of the many object-oriented programming languages and databases underpinning the internet and all our personal devices.
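The spirit of the plex and the record can be suggested in a few lines of modern code. This Python sketch is an illustrative stand-in, not Ross’s technique as he implemented it: data, type and behaviour named and manipulated as one conceptual unit, here echoing Sketchpad’s points and lines.

```python
from dataclasses import dataclass

@dataclass
class Point:                 # a named type with named data fields
    x: float
    y: float

@dataclass
class Line:                  # a unit interweaving other units
    start: Point
    end: Point

    def length(self) -> float:   # behaviour bundled with the data
        return ((self.start.x - self.end.x) ** 2
                + (self.start.y - self.end.y) ** 2) ** 0.5

beam = Line(Point(0, 0), Point(3, 4))
print(beam.length())  # 5.0
```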
This relationship between the abstract data type, visual screen representations, and the structure of real world affairs was integral to the work of GUI inventor Alan Kay. Kay has described his own experience as a new PhD student, encountering both SIMULA and Sketchpad, and recognising the potential of the concepts they shared[13]. Kay is famous for his many achievements of visionary imagination, not least the Dynabook, a portable computer concept that anticipated by decades the laptops, tablets and smartphones of today. The Dynabook project at Xerox PARC set out to create a “personal computer for children of all ages”, designed for creative play rather than consumption of bureaucratic software[14].
The Smalltalk environment and programming language created by Kay’s team both integrated and hugely surpassed the object-structured simulation logic of SIMULA on one hand, and the interactive graphical elements of Sketchpad on the other. Kay’s student David Canfield Smith described how drawings on a pixel-based screen could themselves be used as the elements of an abstract programming language. Moving beyond the geometric diagrams and electronic schematics that had inspired Kay’s work in Sutherland’s Sketchpad, Smith’s system allowed the drawing of new computational abstractions – symbolic components of a visual language that Kay and Smith called “icons”. The system was named Pygmalion (after the mythical sculptor) for its apparent potential to interact intelligently, and Smith hoped it would satisfy the ambition of research funders at that time for man-machine symbiosis[15].
These projects by Kay, Smith and others in development teams at Xerox laid the foundations for the elements of the GUI that became familiar in the 1980s through the Apple Macintosh and Microsoft Windows, including icons, menus, folders, windows, dialogs and so on[16]. The early promotion of these products did associate the pictorial screen with freedom and creativity, most famously in the 1984 Super Bowl advertisement depicting the Macintosh as a revolutionary product that would shatter the computer industry’s Big Brother-like control over rows of grey-suited corporate clones.
The early ambitions of Kay’s KiddiComp/Dynabook and Smalltalk projects included many creative tools: painting programs, editors for children to publish their own writing in professional-style typefaces, and music composition software. However, despite creating the most famous computer science laboratory of its time, inventing the personal computer, the laser printer, and the ethernet hardware that jump-started the Internet, Xerox was a photocopier company which saw its real business opportunities in document processing and support for bureaucracy[17]. The pictorial rendering capabilities and object-oriented structural abstractions created by Alan Kay’s team were transformed into simulations of the office-worlds where Xerox customers worked, having in- and out-trays for email, filing cabinets to store folders in, and waste paper bins to discard unwanted papers.
The visual style of the Xerox office workstations became the basis of the GUI for everybody – whether “creative” Apple Mac users or corporate adopters of Microsoft Windows. In order to explain these new ways of thinking about user interaction, both companies published guidelines explaining how other developers should create new software applications with pictorial icons that followed the visual identity rules of the operating system supplier, while supposedly being easy to understand because of the “desktop metaphor” – the explanatory analogy by which all these pictures on the screen worked the same way as familiar office accessories.
I’ll come back to some problematic issues with those design principles in chapter 9, but at this point I want to consider further how the design insights of SIMULA, Sketchpad, and Henry Beck’s London Underground diagram bring together two things that are more often considered separately: how software is built, and how users experience it. The standard business assumption is that building software is a professional activity, carried out by trained programmers and software engineers, and that everyone else in the world will be a software user, not a software creator. Many areas of society are structured this way. Some people build cars, others drive them. Some people build houses for others to live in. Some make food, others eat it. When it comes to software, do we also have to assume that some people naturally make rules, while other people naturally follow them?
In the digital world of today, most of us can only hope that software builders will not make our lives too unpleasant with new and badly designed rules. There has been an unfortunate habit among software engineers to regard this requirement as a specialist problem called “usability,” which can be taken care of after the basic functionality is in place, and not really the responsibility of the original inventor. Even usability specialists are often instructed simply to specify a “user journey”, taking for granted that the user’s goals will be determined by some business model, rather than chosen freely by the users themselves.
However, the evidence presented in this book so far has shown a number of ways in which this traditional separation between coding and experiencing software does not need to be so definitive. Even those analogies I have drawn to other areas of modern life are not as clear as they first seem, if we think of them more critically. Certainly I did not build the house that I live in, but I do “construct” pieces of it – including painting walls, building shelves, fixing doors and so on. And of course most people both cook food and eat it – sometimes at the same time! Why should software be any different? Why shouldn’t it be possible both to use software and to make it work differently? My discussion of methods for programming by example with machine learning techniques showed how users of such systems are also re-coding them, supplying training examples to be replayed as “intelligent” behaviour, predictive text, or content recommendations.
So the habit of seeing software as something that you either make or use, in which the world is divided into programmers and end-users, into rule-makers and rule-followers, is no more than a social and business convention, and is not technically determined by the fundamental nature of software. On the contrary, software, because it lacks physical substance, is more changeable than cars, houses, or even food. There is no technical reason why every user of software should not also be a builder of software, with all the opportunities of creativity, freedom and personal agency that this would entail.
The examples earlier in this chapter, including Sketchpad, SIMULA and the Underground diagram, show the essential principles for building more freely with software. First, there must be some kind of abstract structuring of the world, representing relevant aspects as separate parts with relationships between them. Secondly, there must be some kind of language, representation or notation that allows us to look at that structure, describe it, criticise it, perhaps adjust and modify it. Thirdly, there must be the potential to formulate your own plans in relation to a representation of that structure, just as the Underground diagram allows many possible journeys, and a diagram of forces in a bridge allows you to either read off summary values, inspect local components, or follow logical paths of reasoning as your gaze moves over the connections between the parts.
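These three principles can be illustrated in miniature. All names in this Python sketch are invented; it stands in for no particular system: a structure of parts and relations, a representation that can be inspected and modified, and a question of one’s own formulated against it.

```python
# 1. Abstract structure: relevant parts, and relations between them.
#    (An invented toy bridge, not Sutherland's.)
bridge = {"deck": ["pier_a", "pier_b"], "pier_a": [], "pier_b": []}

# 2. A representation we can look at, criticise, and modify.
print(bridge)                    # inspect the structure
bridge["cable"] = []             # adjust it: introduce a new part
bridge["deck"].append("cable")   # relate the new part to the deck

# 3. Formulate plans of your own against the structure, rather than
#    following one prescribed journey: ask it your own question.
def supports_of(part, structure):
    """Follow the relation from a part to whatever holds it up."""
    return structure[part]

print(supports_of("deck", bridge))  # ['pier_a', 'pier_b', 'cable']
```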
These ways of working with abstract structure are fundamental to information technology, and must be the core elements of design, if we want to offer creativity and agency to humans. Computer science conventionally thinks of abstract structures as being manipulated either via programming languages or via user interfaces, but not both. However, the distinction between programming languages and diagrams is not so clear. Sutherland’s Sketchpad and David Canfield Smith’s Pygmalion blurred the boundaries, but we shouldn’t have become so fixated on those inventions as the only possibilities.
All these ways of describing structure are part of a larger class of notational systems, joining the whole history of creative knowledge tools – music, poetry, printed books, mathematics, GUIs and even paintings and photographs – all combining aspects of language and of visual representation. All are kinds of codes, all are possible ways that we could initiate more meaningful conversations with AI, and all are fundamentally neglected in current international fashions of AI research that focus only on words and verbal language, forgetting how much progress we have made by inventing special codes that are better than words for many purposes. AI researchers worry about the lack of trust and explainability in AI systems, but sometimes act as though this is something that can be fixed after their algorithms are finished (like the old-fashioned view of usability), rather than a moral imperative to consider alternative ways of coding them.
[1] Graeber, The Utopia of Rules.
[2] Horst W.J. Rittel and Melvin M. Webber. "Dilemmas in a general theory of planning." Policy Sciences 4, no. 2 (1973): 155-169.
[3] Ian Arawjo, "To write code: The cultural fabrication of programming notation and practice," in Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. (2020).
[4] Ole-Johan Dahl and Kristen Nygaard. "SIMULA: an ALGOL-based simulation language." Communications of the ACM 9, no. 9 (1966): 671-678.
[5] Ivan E. Sutherland, Sketchpad: A Man-Machine Graphical Communication System (1963). Facsimile of the original PhD thesis, with an introduction by Alan F. Blackwell and Kerry Rodden. Technical Report 574, Cambridge University Computer Laboratory, 2003. http://www.cl.cam.ac.uk/TechReports/UCAM-CL-TR-574.pdf
[6] Sutherland, Sketchpad.
[7] In contrast to Sutherland’s (rather simplistic) drawing of bridge loads corresponding to visible mechanical components, it might be argued that the notorious Tacoma Narrows bridge disaster of 1940 occurred in part because the wind forces that caused it to collapse had not been sufficiently “visible” in the representations used by engineers during the design process.
[8] Abigail J. Sellen and Richard H.R. Harper, The myth of the paperless office. (Cambridge, MA: MIT Press, 2003).
[9] Crawford, Atlas of AI.
[10] Ken Garland, Mr Beck’s Underground Map (Capital Transport Publishing, 1994).
[11] Douglas T. Ross, "A generalized technique for symbol manipulation and numerical calculation." Communications of the ACM 4, no. 3 (1961): 147-150.
[12] See the editors’ introduction to the online facsimile of Sutherland, Sketchpad.
[13] Alan C. Kay. “The early history of Smalltalk”. ACM SIGPLAN Notices 28, no. 3 (1993): 69–95. Reprinted in Thomas J. Bergin Jr and Richard G. Gibson Jr, eds. History of programming languages II. (Reading, MA: Addison-Wesley, 1996), 511–578.
[14] Alan C. Kay, "A personal computer for children of all ages," in Proceedings of the ACM Annual Conference. (1972).
[15] David C. Smith, Pygmalion: A Computer Program to Model and Stimulate Creative Thought. (Basel, Stuttgart: Birkhäuser, 1977).
[16] Jeff Johnson, Teresa L. Roberts, William Verplank, David Canfield Smith, Charles H. Irby, Marian Beard, and Kevin Mackey, "The Xerox Star: A retrospective." Computer 22, no. 9 (1989): 11-26.
[17] Douglas K. Smith and Robert C. Alexander, Fumbling the future: How Xerox invented, then ignored, the first personal computer. (New York: William Morrow, 1999).