Chapter 2: Would you like me to do the rest? When AI makes code

“Would you like me to do the rest?” If you’ve worked in one of those bullshit jobs that David Graeber wrote about[1], you’ll know what a relief it is when someone takes over a mind-numbing repetitive task. Even around my house, those words are a fairly sure way to make somebody else happy.

When you think about the things that we do with computers every day, there are countless repetitive actions that we have to grind through, but that would be so much easier if we could just say to the computer “you do the rest”. That simple phrase, if it worked, would be such an improvement to many tasks. This chapter analyses the design approaches that take us toward this simple vision of how AI might improve our lives rather than threaten them.

●      I need to get to London this afternoon. You do the rest.

●      This is how to peel a potato, and I’m busy. You do the rest.

●      Time to send out the monthly invoices.  You do the rest.

●      This year’s tax return will be the same as last year. You do the rest.

●      12-bar blues with a reggae beat, modulating to major in the third verse. You do the rest.

●      I’m ready for bed. You do the rest.

Pretty much any time we would really like some AI in our lives, the simple instruction “you do the rest” expresses the essence of what we would like a computer to do automatically. Rather than wasting our conscious hours on mechanical action, being made into a machine, saying “you do the rest” is empowering - it is about helping you to do what you want. You give the instructions. Some adjustment may be necessary, which means that real empowerment also has to include “not like that,” or even “I’ve changed my mind”. And it’s tiring to have to take all the responsibility all the time. So we would appreciate an AI that spots an opportunity to ask: Can I help you with that? Would you like me to do the rest?

If I had the biggest AI research budget in the world, this would go a long way toward my ideal computer. Just make it so that I can say “you do the rest”, and things will happen the way that I want. 

Some of us are lucky enough that we have really had this experience – although not with a computer. Very wealthy people have always hired butlers, or aides-de-camp, or personal assistants – those kinds of servant who know so much about their boss's preferences, lifestyle and needs that they can indeed “do the rest” whenever asked, or even when not. 

The fantasy that we could all have personal servants or robots has been a common theme in the kind of corporate advertising video that imagines a future where a company’s products are going to solve all your life problems[2]. Imagine asking your “smart house” (or “smart speaker,” in the less ambitious products of today) voice assistant to do any of these things.

Some of the examples I gave at the top of the chapter have already received attention from AI researchers, and there are useful products that do part of the job. Getting to London suggests some kind of autonomous vehicle, while generative music algorithms often have libraries of standard rhythms and chord progressions. But when I say “you do the rest”, I don’t really want a team of researchers to start work for 15 years, before selling me a product that will respond to just that command. What I’d love is an intelligent assistant able to help me out with all kinds of mindless and boring tasks. I’m actually quite prepared to explain what I want, although after I’ve explained this once, I don’t want to repeat it.

Fortunately, computers are good at remembering instructions. The difficult part for a computer is to understand what natural language instructions mean. In conventional computer science, explaining to a computer what you want it to do in the future is called programming, and must be done using a programming language. Creating or modifying a program, or changing your mind to make it do something else, involves editors, compilers, debuggers, revision systems and many other specialist tools in a software development environment of the kind that will be discussed in more detail in chapters 6, 7, 10 and 11. These are complex tools, designed for use by professional engineers, often coordinating the work of large teams so that they can break down a complex problem into small parts for each engineer to work on.

It is far less common for regular computer users, who are not trained as software engineers, to explain to a computer what they want it to do in the future. The small research field concerned with achieving this possibility describes it as “end-user programming” (EUP), “end-user development” (EUD) or “end-user software engineering” (EUSE)[3]. Although I started my own research career as an AI engineer, much of the content of this book builds on my later research, as described in research publications over the past 20 years for other researchers in the EUP/EUD/EUSE community.

These fields start to overlap more closely with AI research if we would like the computer to volunteer “Can I help you with that?” or “Would you like me to do the rest?” The idea of an AI system that sometimes offers to help, rather than simply waiting for instructions, has been described as mixed-initiative interaction, where new actions are sometimes initiated by the user, and sometimes by the computer. The phrase “mixed-initiative” is most often associated with a research paper by Eric Horvitz, now Microsoft’s Chief Scientific Officer[4]. But the basic idea was demonstrated almost a decade earlier, in a research prototype created by Allen Cypher, who was then working in Apple’s research division.

The story of Eager – the little AI that could have been

Like many people who work on the boundary between AI and user experience, Allen Cypher is a mathematician who is really interested in people. After finishing a PhD in theoretical computer science, he used his teenage programming skills – a rare qualification in the 1970s – to teach a class at Yale in “computer programming for poets”, so that he could hang out with the pioneering AI researchers working on natural language understanding in Roger Schank’s group. AI in the 1980s involved writing out knowledge as rules, and machine learning methods were in their infancy, associated with “connectionist” neural networks and brain science. That’s what started to turn Allen from a mathematician/programmer/poet into a psychologist, when he was awarded a Sloan Fellowship to join the brand new cognitive science group at the University of California San Diego (UCSD), working with pioneering researchers such as David Rumelhart and Donald Norman who were studying human and machine learning.

AI researchers like Allen often tinkered with making their own mental work more efficient through computer tools, such as a simple UNIX popup called Notepad, which he made to capture ideas quickly when he was distracted or didn’t have time to concentrate. But Allen’s thinking was turned around when UCSD pioneer Don Norman told him off for simply designing a product according to his own preferences. Norman and his team were working on the classic book User-Centered System Design (a now-forgotten punning acronym recording its origins in UCSD cognitive science). A programmer who made things for his own taste, rather than considering the needs of others, was the exact opposite of the user-centered philosophy they were campaigning for. Decades later, Allen is still embarrassed about the day Don Norman told him he should think about his users, but Notepad did become a popular tool at UCSD and more widely, echoed in other influential hypermedia apps like NoteCards from Xerox PARC.

Another project from Allen's UCSD days did not get widely used at the time, but turned out to have further-reaching effects. AI researchers were all using the UNIX command line, including its built-in function to store a history of old commands. Experts could avoid repetitive operations by scrolling back in time to repeat commands they had typed before. You might think of this as a really literalistic kind of machine learning, which “learns” by simply recording the commands it sees, and can only repeat exactly what it has seen before. The problem is that, while it’s trivial to “learn” by simply playing back previous examples, UNIX users seldom want to repeat exactly the same command twice. More often, you need another command that's very similar to, but just slightly different from, what you did last time. Allen thought about this problem mathematically – the learning algorithm should work out which parts of the command will be constant, and which parts are variables. In fact, this has been the fundamental problem of machine learning ever since: how can a system learn to generalise from a set of examples, creating an abstract representation that defines the range of likely future variations?
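
As a concrete (and much simplified) illustration of that constant-versus-variable problem, here is a toy sketch in Python. It is not Allen Cypher’s algorithm, and the command strings and the “<VAR>” marker are invented for the example, but it shows the basic move: line up two similar commands from a history, keep the parts that match, and treat the parts that differ as variables.

    # Toy sketch: generalise from two similar shell commands by treating
    # matching tokens as constants and differing tokens as variables.
    def infer_template(cmd_a: str, cmd_b: str) -> list[str]:
        template = []
        for a, b in zip(cmd_a.split(), cmd_b.split()):
            template.append(a if a == b else "<VAR>")
        return template

    history = ["convert report1.txt report1.pdf",
               "convert report2.txt report2.pdf"]
    print(infer_template(*history))
    # ['convert', '<VAR>', '<VAR>'], i.e. the command is constant, the filenames vary

A real programming-by-example system has to do much more than this, deciding how the variable parts relate to each other and when it has seen enough examples to predict the next one, but this is the generalisation step in miniature.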

This problem stayed in the back of his mind while Allen went to work as an AI consultant during the 1980s “expert system” boom. Although we didn’t fully appreciate it at the time (I also started work as an AI engineer then), the excitement of that particular boom was going to fade away because of usability problems. In order to turn expert knowledge into executable programs, somebody has to translate human knowledge into code[5]. The big problem at IntelliCorp, where Allen worked, as at other companies in the expert system business, was that AI programming systems were too unusable for the experts to get their knowledge encoded. The generation of AI that might have resulted from more usable programming never arrived, and the systems never worked as well as hoped.

All these experiences came together when Allen was hired to work at the Apple Advanced Technology Group (ATG), where he had the freedom to see how learning algorithms could provide an alternative to unusable programming, by figuring out the mathematical abstractions that would automate repetitive tasks (to “do the rest”), just as he had imagined for his UNIX command history.

The Eager prototype that Allen built at Apple was a compelling demonstration of how this would work. Eager’s algorithms watched all the user’s actions in the background, waiting for sequences of keystrokes and mouse clicks that seemed to be partly repetitive. Eager would wait until it was able to predict what the user might want to do next, at which point a friendly icon of a little cat would pop up and offer to help. That might have seemed a bit creepy at the time, but nowhere near as bad as the situation we have today, when many companies really do watch every keystroke and mouse click, keeping that information to themselves! Sadly, despite the level of software surveillance we have today, it’s still unusual to get an offer of help, outside of simple text prediction[6].

Apple’s ATG had research funds to sponsor graduate students working on technically related approaches, and Allen even convened a conference, paying for other researchers with the same interests to come and share their ideas and inventions. That conference turned into a classic book, edited by Allen, called Watch What I Do[7] – a motto expressing the ambition to make people’s lives better with a combination of machine learning and end-user automation, and paving the way for computers to offer “would you like me to do the rest?” as in the start of this chapter.

Allen Cypher’s work is still famous among the community of people who want to make computers more efficient by automating repetitive tasks. Contributors to the Watch What I Do book have gone on to make some of the world’s most usable and successful machine learning approaches. And Allen himself has been hired by companies like IBM and Microsoft, still with the promise that machine learning might help reduce how many hours of our finite conscious attention get wasted on bad IT systems. Allen has contributed to the work of groups who build products like IBM’s CoScripter and Microsoft’s FlashFill – tools that still show the promise of systems that could help us when they watch what we do. I’ll be discussing FlashFill in more detail in a later chapter. But more often, capabilities like this get hidden away, or only deployed for use by professional programmers, in part because these features rely on different software engineering approaches from the deep learning and big data architectures that focus on extracting value from people rather than helping with their lives. Many machine learning researchers don’t spend a lot of time thinking about user interfaces beyond their own tools, and unlike the days of Don Norman’s UCSD, today’s AI researchers are not even in the same room with people who will tell them otherwise.

So what happened to Eager? Every year, two ideas were selected from ATG research projects to be built into Apple products. Eager was selected as one of those ideas. But the routine obstacles of corporate management and communication got in the way. A product manager was busy on other things, Eager wasn’t urgent, a year went by, the window of opportunity closed, and so Apple computers are not quite as helpful as they might have been. Thirty years ago, companies were not expecting big opportunities to come from machine learning. Perhaps Eager was just (a lot) before its time?

Sharing the mental load

There are some important design principles in the creation of mixed-initiative end-user programming systems such as Eager, which can be understood in relation to a few basic mechanisms of human cognition. We can think of the human user and the AI assistant working together to achieve some shared goal. This human-plus-AI is an engineering hybrid (or “symbiosis” as it was once called[8]), where the design goal is for the human to do the parts most suitable for humans, and the computer to do the parts most suitable for computers. Obviously, in the “you do the rest” scenario, we want the computer part of the hybrid to take care of repetition of the same actions in the future, since this is what computers are good at, and since computers are not going to suffer from boredom, stress injuries, and all the other adverse consequences when humans are made to do needlessly repetitive mechanical work.

One important principle in getting this balance right can be thought of as “attention investment,”[9] building on the observations I made in chapter 1 about the finite hours of conscious attention that we have in our lives, and Herb Simon’s famous definition of the attention economy[10]. Imagine that I’ve written a novel featuring the character Liz, but I decide to change her name to Elizabeth. It would take a long time to read through every page, looking for each occurrence of the word, but the search-replace function allows me to do the whole thing in a few steps. Unfortunately, it turns out that there is a passage where she is speaking to her mother, who doesn’t use the short version of her name. After my search and replace, the “Elizabeth” in that passage turns into “Eelizabethabeth” (something similar happened in one of my research papers, where an automated search and replace by a production editor enforcing the journal’s house style meant that the printed article repeatedly referred to famous AI researcher Seymour Papert as “Seymour Articlet”).
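
Here is a minimal Python sketch of that search-and-replace trap. The sentence is invented for illustration, and the exact garbled result depends on the match settings, but the underlying problem is the same: a naive substitution also rewrites the “Liz” hidden inside “Elizabeth”, while a whole-word pattern (the same fix that would have saved Seymour Papert) leaves it alone.

    import re

    passage = 'Her mother never shortened it: "Elizabeth," she said, not Liz.'

    # Naive replacement: also matches the "liz" hidden inside "Elizabeth".
    naive = re.sub("liz", "Elizabeth", passage, flags=re.IGNORECASE)

    # Whole-word replacement: \b word boundaries leave "Elizabeth" untouched,
    # and would likewise leave "Papert" alone when replacing "paper".
    careful = re.sub(r"\bLiz\b", "Elizabeth", passage)

    print(naive)    # ... "EElizabethabeth," she said, not Elizabeth.
    print(careful)  # ... "Elizabeth," she said, not Elizabeth.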

Whenever we ask a computer to repeat things in future – programming – we need to anticipate the circumstances where our instructions might go wrong, or be inappropriate, or misunderstood. A human servant is able to use common sense and general knowledge (and if we are fortunate enough to have a personal servant like a butler or PA, an intimate understanding of our own habits and intentions), to avoid stupid mistakes like these. When the servant has no common sense or social context, as in the case of computers, all the responsibility of imagining unintended results falls instead on the person giving the instruction[11]. We have to spend time thinking carefully about the consequences of what we are asking, and refining the instruction we give, so that it results in exactly the effects we want, nothing more and nothing less[12].

This thinking effort can be defined as investment of attention. If the automated actions are going to save us from further mechanical concentration in future, then the investment will be paid back[13]. But this can’t be guaranteed to happen. In fact, many programmers have had the experience of imagining that some repetitive task could in principle be automated, spending several hours writing the code to do this, and then never needing to do the same thing ever again. Cognitive psychologist Lisanne Bainbridge described this problem as one of the “ironies of automation”[14]. The opposite mistake is also common - many people are so determined to focus on finishing the task in front of them that they never pause to ask whether it could be achieved more effectively – what psychologists Jack Carroll and Mary Beth Rosson called the “paradox of the active user”[15].

Mixed-initiative interaction could be a key technique to mitigate these errors of judgement in attention investment. An AI system is able to detect repeated patterns and opportunities for automation, perhaps drawing on records of other users who have faced similar situations in the past. However, there have also been cases where this design strategy has gone badly wrong. The most notorious example is the Microsoft Office Assistant “Clippy”, which annoyingly offered to help users of Microsoft Word when they seemed to be drafting letters. The basic idea (originally coming out of an AI research project at Microsoft) was sound, but the problem was that the template letters being offered seldom corresponded to what the user was actually trying to write. Extremely formulaic letters may have their place, but perhaps not as often as Microsoft imagined. (In the style of the examples that introduced this chapter, can you imagine the implications of saying: “I need to write a love letter. You do the rest”?)

Even more annoyingly, Clippy was designed to consume the user's attention, because the animated character popped up over the top of the document you were working on, jiggling around, changing shape, and blinking its eyes until you responded to it. In the era of the attention economy, we do understand that companies want to gain our attention – and we resent them for it. Although Clippy is never mentioned in Eric Horvitz’s paper on mixed-initiative interaction, his previous research studying reasoning under resource constraints and human response to interruptions must have made it pretty clear to him, as to all other Microsoft researchers at the time, that the company’s designers needed to be more sensitive to the dangers of saying “Can I help you with that?”. Having served as President of the American Association for Artificial Intelligence, Horvitz is now both an influential campaigner for ethical AI and Microsoft’s Chief Scientific Officer, so we have grounds for optimism that the mixed-initiative perspective will continue to be influential – as I said at the start of the chapter, this is what I would do if I had a big enough research budget.

End-user programming without machine learning

New mixed-initiative systems following the example of Allen Cypher’s Eager would do a lot to help people complete repetitive tasks, but computers have already excelled at repetitive tasks for many years. Professional programmers are in the habit of writing small pieces of code to help them with routine personal tasks that happen to be repetitive – traditionally described as “scripts” or “macros”. Because professional programmers also have to use computers like regular people – saving their files, doing their tax returns, writing documents and so on – they do encounter the kinds of repetitive task that anybody else does. The difference is that programmers have a way to escape the repetition, by writing a program to do it for them. But writing programs requires a programming language. If you were making repeated changes to a list of addresses in a Microsoft Word document, and you wanted this to be done automatically, you could write a program to instruct Microsoft Word what things to repeat. If you were repetitively renaming or moving around a whole bunch of different files on your personal computer, you could write a program to instruct the operating system what things to repeat.
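
As a sketch of the kind of small, throwaway script this describes, here is how a programmer might rename a batch of files in Python rather than doing it by hand. The folder name and the naming scheme are invented for the example.

    # Rename every .txt file in a (hypothetical) "invoices" folder so that it
    # carries a "2022-" prefix, instead of renaming each file by hand.
    from pathlib import Path

    folder = Path("invoices")
    for old_path in sorted(folder.glob("*.txt")):
        new_path = old_path.with_name("2022-" + old_path.name)
        print(old_path.name, "->", new_path.name)
        old_path.rename(new_path)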

Because standard word processing products and home operating systems are made by programmers, those programmers have already thought about how to make things more convenient for themselves, building special facilities into many standard products in the form of internal programming languages that can be used to avoid the repetitive operations that annoy them when they are off duty at home. However, software companies don’t want to give the impression that their products are unnecessarily complex or hard to use, so those convenient internal languages are often hidden away, in places that you wouldn’t find unless you specifically went looking for them.

Many people are surprised to learn that Microsoft Word has had a programming language built in since 1987, and that anyone can use this to write code that automates repetitive changes to a document. Word even has a trivial form of “learning”, where you can press a record button to capture a sequence of editing actions that you want to repeat. This macro recording can be replayed over and over. But more usefully, each of your key presses and menu commands is translated into the relevant Visual Basic code that would automatically make the same change to the document. These lines of macro code can be called up in a programming editor, right inside Microsoft Word, and modified to behave differently in different contexts, to repeat some actions automatically, or extended with additional business logic or calculations. Of course, this could be pretty dangerous if someone else had written macro code that was going to automatically change your document, in ways that you might not be expecting. That’s why many email programs, operating systems and security applications specifically test Word documents to see if they contain macros which might be a security risk. As a result, many people have seen warnings about the dangers of macros, even if they have never considered creating a macro themselves.
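
Word’s own route to this is a recorded Visual Basic macro, edited inside Word itself. Purely as an illustration of the same idea in the Python used for the other sketches in this chapter, here is roughly what such a repetitive edit looks like once it is code rather than keystrokes. It relies on the third-party python-docx library, and the document name is invented; this is a sketch of the pattern, not of Word’s actual macro language.

    # The same renaming job as the novel example above, expressed as code
    # that can be replayed, inspected and modified.
    from docx import Document  # third-party python-docx library, assumed installed

    doc = Document("novel.docx")
    for paragraph in doc.paragraphs:
        for run in paragraph.runs:   # runs preserve character formatting
            run.text = run.text.replace("Liz", "Elizabeth")
    doc.save("novel-renamed.docx")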

Almost all personal computers have built-in ways to automate repeated requests to the operating system, including changes to files and folders. The traditional way to do this builds on the functionality of the command line or terminal window. Keyboard-driven operating system commands such as “ls -l”, “del *.*” or “rm -rf[16]” look like lines of code because they are lines of code. It’s possible to put any sequence of commands into a script or batch file that can be replayed many times. And these scripting languages are powerful programming languages in their own right. They can be made to repeat operations thousands of times, test conditions and make decisions. Once again, most programmers are familiar with these kinds of capability, because other programmers have put the capabilities into products for their own convenience. But while it is reasonably easy for a non-programmer to make their own macro in Microsoft Word by recording a sequence of keystrokes, few operating systems offer a similarly easy way to record and repeat file operations. The closest alternatives look strikingly similar to what Allen Cypher was doing in the UNIX shell over 30 years ago – manually looking back at the history of commands, and then assembling those into a script.

Although I’ve been talking as though there are only two kinds of people in the world, programmers who have the power to avoid repetitive tasks by writing code, versus regular people who are doomed to repeat the same mechanical operations over and again because they don’t have access to these hidden features, in fact there are many gradations in between[17]. Many people do write small pieces of code, even though they might not be trained in programming. In addition to that, a more experienced programmer might help their colleagues by creating a small script or macro that will make their lives easier. Or else people might ask their programmer friends for assistance. Or else, if a repetitive task is particularly annoying, and the user is feeling curious, they might investigate how to write some code themselves, and perhaps even share this with their own colleagues if it turns out to be useful.

These many kinds of end-user programmer, with their fine gradations and variety of professional and social contexts, can also be understood in relation to the attention investment principles that I have used to explain mixed-initiative interaction. Repeating the same mechanical commands over and over, whether these are editing actions in a Word document or moving files around, consumes many precious hours of our conscious lives, when there are many other things that we would rather be doing than attending to a computer screen. The promise of programming is that we can write code to automate those actions, and save ourselves the need to attend to screens so much.

The catch is that automating those actions with code also involves spending time with the screen – time spent figuring out how to write the necessary code. This is how attention investment got its name. End-user programming is an investment decision, where you invest a certain amount of attention in advance, in order to get a longer-term payback of attention saved through automation. The return on investment is the ratio between the amount of time you would spend writing (or finding) the necessary code, and the number of times that the original action was going to be repeated. Writing code to avoid a million repetitions is a clear win. In fact, nobody is going to repeat a million actions with a computer, so if you have a job that does need to be repeated a million times, there is no way you are going to do it by hand. You’ll either have to figure out how to automate it, or (more likely if there is so much time to be saved) hire a professional programmer to do it for you. At the other end of the scale, if you have a simple action that is only going to be repeated 50 times, you should probably just take a deep breath and hammer at the keyboard for a couple of minutes. Even a professional programmer would struggle to write a script faster than this.
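
A deliberately crude way to put numbers on that trade-off is sketched below; all of the figures are invented, and the following paragraphs explain why real decisions are much harder than this simple comparison suggests.

    # Simplified model: automation pays off when the time spent coding is less
    # than the time the repetitions would otherwise consume.
    def worth_automating(minutes_to_code, minutes_per_repetition, repetitions):
        return minutes_to_code < minutes_per_repetition * repetitions

    # 50 quick edits: just hammer at the keyboard.
    print(worth_automating(minutes_to_code=120, minutes_per_repetition=0.5,
                           repetitions=50))         # False
    # A million repetitions: automate, or hire someone who will.
    print(worth_automating(minutes_to_code=120, minutes_per_repetition=0.5,
                           repetitions=1_000_000))  # True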

Attention investment decisions are unfortunately harder than they seem, because it is not just a simple matter of calculating the effort to be saved through automation on one hand, and the time needed to write the code on the other. Writing code for a new automation problem requires some original thought, analysing which parts of the repeated operation will be the same every time, which parts will be different, and how. Even where many of the repeated cases are very similar, there could be exceptions, like the example of ‘Seymour Articlet’ that I described earlier. And if the automation code fails to do the right thing with exceptions, as in my own case with ‘Articlet’, it may take a lot more attention to fix up the embarrassing consequences later[18].

So investments of attention are associated with uncertainty and risk. Uncertainty about exactly how much time it will take to write (and debug) automation code, and risk that the investment might not pay off. There is even risk in estimating how big the payoff is going to be. I was writing as though people always know in advance whether a particular action is going to be repeated 50 times or a million times. But often, you haven’t yet seen the future situations in which your script or macro might be used. It’s a common experience that a company imagines some situation that they expect is going to occur millions of times, hires programmers to automate it, and then finds that this situation never happens again. Such things can happen because there turns out to be no demand for a product, or perhaps because government regulations change, meaning that a certain algorithmic calculation, totally compulsory until a certain date, never needs to be done by anybody again after that day[19].

There is also a lot of uncertainty in estimating how hard it is going to be to write a particular piece of code. Writing code can sometimes be very hard, for example if you need to use a programming language you have never used before, meaning that you have to acquire a lot of background knowledge before even starting. On the other hand, it could be trivially easy, for example if it turns out that somebody else on the internet has written exactly the piece of code you need, with instructions on how to use it, so that your first search result solves the problem immediately[20]. Even professional programmers are very cautious about estimating in advance the amount of time required to write a new piece of code.

This means that attention investment decisions involve a trade-off between two kinds of activity. One of them varies in a relatively predictable way according to the size of the job (the number of actions to be repeated). The other varies in an unpredictable way, according to the amount of expertise and previous experience that you might have. In practice, people make their attention investment decisions on the basis of their personal biases and expectations – thinking fast, rather than thinking slow, in the terminology introduced by behavioural economist Daniel Kahneman[21].

Some decisions are perfectly sensible, for example to hire a programmer for that million-task problem. But other personal biases routinely result in the wrong decision. A common experience among working programmers is to get frustrated by some repetitive task, and decide “I could write a program to do this”. Five hours later, the program is finished, but they could have finished the original task in 30 minutes. Even more common, there is some moderately annoying task where you think “I never want to do this again”, so you spend the time writing a program to do it instead. And then nobody ever does ask you to do it again (or perhaps they do, but you’ve forgotten where you stored the program), so the whole investment was pointless.

The opposite biases can be found among people who are not trained as programmers, or perhaps have some coding skills but don’t feel confident and don’t enjoy it. These people may persist with repetitive operations for far too long, perhaps because they over-estimate the costs of learning to solve their problem with code, or perhaps because they under-estimate how much time they could save. Ironically, this problem gets worse through feedback effects, because although every experience of successfully solving a problem with code improves your skills and reduces the time it will take you to solve problems with code in future, choosing the opposite path of avoiding code means that you gain no skills, and are likely to continue losing hours of your life to mindless repetition. This self-defeating feedback loop is familiar to educational psychologists as lack of “self-efficacy”, defined as the belief that you will be capable of an intellectual or practical task[22]. In mathematics education, student performance is in large part determined by whether the student believes they are good at mathematics. Students who don’t believe they can do maths don’t persist long enough to find out if they could. The belief that you won’t understand something initiates cognitive processes that result in you really not understanding it.

There is a lot of research confirming that the same thing happens with computers, and even with end-user programming[23]. If somebody believes that they are not going to understand computers, they are less likely to actually understand them. In part, this is because understanding is dependent on experience. Most people learn to use computers (like any piece of machinery) by fiddling around, figuring out what things work and what things don’t. Unsurprisingly, if you never spend the time to play around, you never develop the skills, you don’t acquire any confidence in your ability, so you suffer from low self-efficacy, meaning that you don’t enjoy playing, with the result that you never spend the time to play around. This is a self-reinforcing loop that, as many teachers know, is hard to escape once a student falls into it. 

Sadly, the starting point for self-efficacy is not your own ability, but the expectations of people around you. If people have low expectations of you, you don’t acquire skills in maths or in coding, as in many other technical areas. This is a really significant problem, because many societies have low expectations, in both maths and coding, of women, people of colour, working class people, and other excluded groups. We can’t solve it by just telling people to believe in themselves, which would suggest it is their own fault for not believing hard enough. Of course we have to look at the root causes, meaning that race and gender campaigners focus on the fundamentals of inequity, leaving things like maths and coding skills to be mentioned only as symptoms rather than causes of injustice.

Those exclusions are a lot more significant than whether you are able to save yourself a few hours by writing a macro in Microsoft Word. This is about getting computers to do what you want. In a society where many basic human rights, including the right to even be a citizen, are delivered via computer, self-efficacy in getting computers to do what you want looks rather fundamental. Coding – the ability to express your own wishes to a computer – has become a moral capacity.

It’s also not an accident that self-efficacy applies both to end-user programming and to mathematics. Attention investment is fundamental to both, because both programming and maths share two important cognitive foundations – abstraction and notation. Programming and maths both rely on the power of abstraction. As a casual definition puts it, mathematics is the skill of calling different things by the same name. Abstract language (whether computer code or human speech) gives us the power to refer to many different things or situations with a single word. The abstract significance of names is an important topic that I will be returning to in chapter 10.

That basic principle of attention investment – devote a bit of effort to thinking about something in advance in order to avoid mundane repetition in future – is also the basic power of abstraction[24]. Each repetition is an individual concrete case that, without abstraction, must be handled one at a time. A program that describes how all the cases should be handled is necessarily an abstraction, describing abstractly what must be done, describing abstractly how the actions might need to vary slightly in different circumstances, and defining abstractly what those circumstances are. For an experienced programmer (whether professional or end-user), once you have learned how to use the basic programming tools, the attention investment for each new task is an investment in abstraction. Using abstractions is all about investing attention, and self-efficacy (believing that it’s worthwhile to even start) is a fundamental prerequisite for this kind of abstract strategy.

The other thing that mathematics and programming have in common is that they are both fundamentally about the use of notation - “codes”, as I call them in this book. We need notation (or a recorded language of some kind[25]) in order to use abstraction. It’s possible to talk about a single concrete object by simply touching it as we speak, but talking about a whole lot of objects requires abstraction and names. This is the difference between on one hand touching a single object while saying “pick this up,” and on the other hand saying “pick up all those Lego bricks you’ve left around the house”. More complex requirements involve more sophisticated instructions: ‘delete all the emails from John, but only the ones that I’ve already replied to’, or ‘change all occurrences of “paper” to “article”, except where “paper” is a sequence of letters inside a person’s name’. It’s this complexity of communicating instructions that makes coding hard. Programming languages are the notations - codes - that give us ways to manage this complexity by constructing and applying abstractions.
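
To see how a notation carries that kind of abstraction, here is a small Python sketch of the email requirement above. The inbox structure and field names are invented for illustration; the point is that the awkward English sentence becomes one precise, repeatable expression.

    # "Delete all the emails from John, but only the ones that I've already
    # replied to", expressed as a filter over a toy inbox.
    inbox = [
        {"sender": "John", "subject": "Minutes", "replied": True},
        {"sender": "John", "subject": "Invoice", "replied": False},
        {"sender": "Ada",  "subject": "Lego",    "replied": True},
    ]

    inbox = [mail for mail in inbox
             if not (mail["sender"] == "John" and mail["replied"])]
    print(inbox)  # keeps John's unanswered invoice and Ada's message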

Years ago, typical computer users were more familiar with abstractions than they are today. Before the graphical user interface (GUI) of the Apple Macintosh and Microsoft Windows, regular interaction used the programming-like notation of the command line, and it was routine to avoid repetitive operations through abstract specifications such as “DEL *.BAK” (the asterisk is a wildcard, indicating an abstract specification of all files whose names end in the letters BAK, which was a convention for backup files). Using such commands usually involved some simple attention investment: learning how to use the wildcard, deciding on the correct specification, assessing the risk if it goes wrong and so on.

The GUI replaced the command line with the principle of “direct manipulation”, where files were represented by iconic pictures, and the mouse was used to act on those pictures as if they were physical objects, selecting and dragging them to their destination. Direct manipulation of individual icons involved less up-front investment of attention, just as it’s easier to pick up one Lego brick than explain to somebody else that you want them to do it. However, direct manipulation is not itself a notation, meaning that it was no longer possible to describe more abstract requirements, such as applying a certain operation to all the files of a particular type, or to multiple files that are in different folders. Direct manipulation is relatively easy to think about because it doesn’t involve any notational codes, but for the same reason it does not provide any powerful abstractions. But direct manipulation by itself, however easy and convenient, can’t be the only approach to interaction with computers. In an information economy and a computerised society, abstraction is the source of control over your own life, and potentially the source of power over others. Moral codes are abstract codes.



[1] Graeber, Bullshit Jobs

[2] Eric Bergman, Arnold Lund, Hugh Dubberly, Bruce Tognazzini, and Stephen Intille. "Video visions of the future: a critical review." In CHI'04 extended abstracts on Human factors in computing systems (2004), pp. 1584-1585

[3] Andrew J. Ko, Robin Abraham, Laura Beckwith, Alan Blackwell, Margaret Burnett, Martin Erwig, Joseph Lawrance, Henry Lieberman, Brad Myers, Mary Beth Rosson, Gregg Rothermel, Chris Scaffidi, Mary Shaw, and Susan Wiedenbeck, "The state of the art in end-user software engineering," ACM Computing Surveys (CSUR) 43, no. 3 (2011): 1-44.

[4] Eric Horvitz, "Principles of mixed-initiative user interfaces". In Proceedings of the SIGCHI conference on Human Factors in Computing Systems (1999), 159-166

[5] Diana Forsythe, Studying those who study us: An anthropologist in the world of artificial intelligence. (Redwood City, CA: Stanford University Press, 2001).

[6] My student Tanya Morris demonstrated how the capabilities of a large language model could help people struggling with special text formats like email addresses by monitoring the attention needs of older users through their keystrokes, and giving helpful advice at the time they need it most: Tanya Morris and Alan F. Blackwell, "Prompt Programming for Large Language Models via Mixed Initiative Interaction in a GUI," in Proceedings of the Psychology of Programming Interest Group (PPIG) (2023).

[7] Allen Cypher, ed., Watch What I Do: Programming by Demonstration (Cambridge, MA: MIT Press, 1993).

[8] Joseph C. R. Licklider, "Man-Computer Symbiosis". IRE Transactions on Human Factors in Electronics. HFE-1 (1960):4–11. doi:10.1109/THFE2.1960.4503259.

[9] Alan F. Blackwell, "First steps in programming: A rationale for Attention Investment models," in Proceedings of the IEEE Symposia on Human-Centric Computing Languages and Environments (2002), 2-10; Alan F. Blackwell, Jennifer A. Rode, and Eleanor F. Toye, “How do we program the home? Gender, attention investment, and the psychology of programming at home,” International Journal of Human Computer Studies 67 (2009): 324-341.

[10] Simon, "Designing organizations for an information-rich world".

[11] Paul Dourish, "What We Talk About When We Talk About Context," Personal and Ubiquitous Computing 8, no. 1 (2004): 19-30.

[12] The consequences of getting this wrong often make reference to the classic scene in Disney’s Fantasia where Sorcerer’s Apprentice Mickey Mouse tries to automate a repetitive task, but gets his magic “programming” wrong.

[13] A particular irony for me, as I have just spent most of my free time for the past two weeks mechanically converting the academic citations in these footnotes from the APA and ACM styles common in psychology and computing research, to the distinctive “Chicago” style mandated by MIT Press.

[14] Lisanne Bainbridge, "Ironies of automation," Automatica 19, no. 6 (1983): 775-779. It was my PhD supervisor Thomas Green who first drew my attention to how this related to programming: Thomas R.G. Green and Alan F. Blackwell, "Ironies of Abstraction," in Proceedings of the 3rd International Conference on Thinking (British Psychological Society, 1996).

[15] John M. Carroll and Mary Beth Rosson, "Paradox of the active user," in Interfacing thought: Cognitive aspects of human-computer interaction, ed. John M. Carroll (Cambridge, MA: MIT Press, 1987), 80-111.

[16] Do *not* try these commands out on your own computer! They are intended for use when deleting large amounts of data.

[17] For one of the first comprehensive investigations of this topic, see Bonnie A. Nardi, A small matter of programming: perspectives on end user computing. (Cambridge, MA: MIT press, 1993). A pioneering example introducing the concept was Allan MacLean, Kathleen Carter, Lennart Lövstrand, and Thomas Moran, "User-tailorable systems: pressing the issues with buttons," in Proceedings of the SIGCHI conference on Human factors in computing systems, (1990), 175-182.

[18] In a rare breach of academic publishing protocol, the journal publisher did apologise for their automated error, retracting and modifying the online digital version of that paper so that the electronic archive differs from the supposedly authoritative paper one. If your library has a printed paper copy of Volume 13, Issue 4 of ACM Transactions on CHI from 2006, you can still find Seymour Articlet immortalised in print. A few seconds of automation by one person has resulted in years of corrective aftermath - including the time I have wasted writing this footnote, and the time you have wasted reading it!

[19] An example from September 2022, when I write this note, was the first official action of the UK Conservative government led by Prime Minister Liz Truss. Before this date, the EU Capital Requirements Directive meant it was illegal for staff in the UK banking industry to receive bonuses more than 200% of their nominal salary. The Truss government simply removed the regulation, making any further calculation of this amount unnecessary.

[20] See James Noble and Robert Biddle’s Notes on Postmodern Programming, which I discuss again later in this book. James Noble and Robert Biddle, "Notes on postmodern programming," in Proceedings of the Onward Track at OOPSLA 02, the ACM conference on Object-Oriented Programming, Systems, Languages and Applications, Seattle, USA, 2002, ed. Richard Gabriel (2002), 49-71. http://www.dreamsongs.org/

[21] Daniel Kahneman, Thinking, Fast and Slow (New York: Farrar, Straus and Giroux, 2011).

[22] Albert Bandura, Self-Efficacy: The Exercise of Control. (New York: W.H. Freeman, 1997)

[23] Laura Beckwith, Cory Kissinger, Margaret Burnett, Susan Wiedenbeck, Joseph Lawrance, Alan Blackwell, and Curtis Cook. "Tinkering and gender in end-user programmers' debugging." In Proceedings of the SIGCHI conference on Human Factors in computing systems. (2006), 231-240.

[24] Green and Blackwell, “Ironies of abstraction”

[25] The question of whether language is itself a notation enters tricky philosophical ground. Written language definitely is a notation. Spoken language, being transient, might not look like one. But a voice recording can be used in many of the same ways as written text - it’s just a lot less convenient to look forward and back, make changes, and so on. I discuss these user experiences of notation in Chapter 14. Even more speculatively, perhaps all language is encoded in the listener’s brain as a neural recording that can itself be “replayed”, “edited” and so on. If our brain contains language symbols, perhaps thinking is an internal kind of “reading” and “writing”? This is much the same intuitive principle of operation as Turing’s other famous thought experiment - the “Turing Machine”.
