AI.6 Add Concluding Sentences To The Essay

Concluding sentences link the current paragraph’s topic to the topic of the next paragraph.

My next step in developing the scaffold was to add concluding sentences to each paragraph’s index card (Section 2.12). Concluding sentences solidify a manuscript’s narrative structure. A concluding sentence relates to the topic sentence of its own paragraph, but also relates to the topic sentence of the next paragraph.

I created concluding sentences for my essay by paying attention to the topic sentence on the index card I was about to modify while also paying attention to the topic sentence written on the next card. The two topic sentences provide powerful constraints on what concluding sentence can be written, and I tried to pay attention to the constraints as much as possible. However, I still remembered I was writing a first draft. So, rather than writing the best concluding sentence possible, I tried to speed things up by writing a reasonable sentence. I wrote the concluding sentence on the lower part of the lined side of the current index card, then reached for the next card and repeated the process.

After writing a concluding sentence on an index card, I could add additional material (very short notes) in the space between the card’s topic sentence and concluding sentence. The notes I added were reminders to include particular material when I added supporting sentences later. During my reading and my outlining, I created additional cards. For example, when I read papers on large language models, I created quote cards by writing quotes down on pink index cards. I placed a quote card after the paragraph card which cited the quote; the note on the paragraph card reminded me to look ahead in my cards to retrieve the quote.

The paragraphs which follow provide the topic sentence and the concluding sentence for each paragraph card. Any material in bold font provides additional notes which I added to the index card to help write supporting sentences.

After I created my concluding sentences, I went through the cards one-by-one to read the manuscript in outline form. The manuscript’s meaning should be communicated by having a topic sentence and a concluding sentence for each paragraph. When reading the sentences, I identified additional material which did not seem to work. I have left the material in the following paragraphs but have crossed the words out to indicate I discarded the index card before fleshing out the first draft.

A scaffold created with the Chapter 2 method produces most of a paper before any real writing begins. The paragraphs which follow represent the outline which I moved from my index cards into my word processor (Section 2.14). The paragraphs below contain 1268 words. Given my goal was to write a short essay, my sense was I already had most of my essay – before I had even created a first draft. Table AI-4 provides my paragraph-by-paragraph outline.

Table AI-4. The outline (topic sentence, notes, concluding sentence) for each paragraph in the essay. Notes are indicated by bold text between topic and concluding sentences.
Paragraph Outline
1 We live in an artificial intelligence (AI) revolution fueled by a new invention called a large language model (LLM). Give basic properties. LLMs are trained on a huge amount of text taken from the internet and learn to predict which new words should follow those presented to an LLM as a stimulus.
2 LLMs are revolutionary because they can generate long, detailed, meaningful responses to short text prompts. Give examples. “What is clear is that these models use language in a way that is remarkably human” (Piantadosi, 2023, p. 4, his italics).
3 LLMs’ performance has generated many questions in both the popular press and scholarly journals (M & K ref). Example questions, NYT headline. Such questions appear to be very polarizing; Mitchell and Krakauer report 51% of scholars believe LLMs understand language.
4 Speaking as a cognitive scientist, I feel such questions miss the key point. I am interested in a different question: ‘Can LLMs inform cognitive science?’
5 Below, I argue LLMs may indeed be able to inform cognitive science – but only if researchers expend considerable effort to study the internal structure of LLMs in order to discover how LLMs produce their amazing behavior. LLMs may provide new theories to cognitive science, but only if we look for a new theory inside an LLM.
6 Modern AI’s excitement and controversy comes from an LLM’s ability to generate paragraphs of meaningful sentences in response to short prompts or questions. 304 eg? LLMs consistently generate long, well-written, interpretable and surprising responses to short, vague prompts.
7 Cognitive science has studied human language for decades. Cognitive science’s most influential account proposes human language involves specialized rules or processes manipulating complex mental representations of sentences. Such a theory is called a generative grammar. Give eg of components – phrase marker, transformation. Chomsky’s generative grammar, which explained language by appealing to special rules and symbolic structures, not only transformed linguistics but also shaped the theories in a broader discipline, cognitive science, for many decades.
8 Cognitive scientists who believe human language is the rule-governed manipulation of symbols do not believe LLMs inform cognitive science. Quote from Chomsky essay. Chomsky’s position mirrors a motto often stated by my own PhD supervisor, Zenon Pylyshyn: no cognition without computation. The motto claims we can only explain cognition by appealing to symbols and rules.
9 UC Berkeley psychologist Steven Piantadosi agrees LLMs do not use grammatical rules. However, he then proceeds to argue an LLM’s high level performance without using rules refutes Chomskyan linguistics. “The success of large language models is a failure for generative theories because it goes against virtually all of the principles these theories have espoused. In fact, none of the principles and innate biases that Chomsky and those who work in his tradition have long claimed necessary needed to be built into these models” (Piantadosi, 2023, pp. 14-15, his italics).
10 My own research concerns cognitive science’s foundations, with particular interest in the relation between theories based on rules and symbols and theories based on artificial neural networks. I therefore recognize a historical precedent for Piantadosi’s position on Chomskyan theory. I argue below the historical precedent is relevant to answering the question of whether LLMs can inform cognitive science.
11 In the mid-1980s, cognitive science found itself in the midst of what is now called its connectionist revolution. The new networks, called multilayer perceptrons, were powerful enough to serve as theories about human cognitive phenomena. R and M quote about hidden units.
12 The rise of multilayer perceptrons in cognitive science caused a revolution because proponents of artificial neural networks attacked traditional theories which appealed to the rule-governed manipulation of symbols. Grab stuff from What is Cognitive Psychology.
13 My interest in the connectionist revolution focused on a curious aspect of the revolutionaries’ argument: they assumed networks abandoned symbols and rules, but never provided evidence to support their assumption, or to show what their networks used to replace symbols and rules. I call their approach gee whiz connectionism (ref). I tried to distance myself from gee whiz connectionism by training multilayer perceptrons on various tasks, and by conducting detailed analyses of the internal structure of my trained networks.
14 When I looked inside my trained networks, I discovered lots of structures which resembled theories based on symbols and rules. Logic network. Mushroom network. I believed my results revealed surprising similarities between network models and symbolic models, blurring the distinctions between the two approaches.
15 Importantly, I did not usually find network structure which replicated existing formal theories. Instead, I found new formal structures which could inform a cognitive science based on symbols and rules. Music network example. In short, when I looked inside my networks, I found new kinds of formal structures for cognitive science to explore.
16 I feel my research on interpreting the structure of artificial neural networks demonstrates the peril of gee whiz connectionism.
17 The moral of my story about my own work is my suspicion LLMs will only inform cognitive science when researchers abandon mere assumptions about what makes LLMs different from rule and symbol models, and instead seek evidence about both the similarities and the differences between the two types of models.
18 Why must we look inside LLMs to inform cognitive science? Cognitive scientists have long known psychologically plausible performance can be produced by methods completely unrelated to the processes of human cognition. ELIZA example. Examples like ELIZA show why cognitive scientists are more concerned about comparing processes than comparing performance.
19 Indeed, I strongly suspect LLMs use methods radically different from those used by humans because they represent stimuli and responses with encodings unrelated to any proposed by cognitive scientists. Representation eg? If LLM representations are unrelated to human cognition, then LLMs do not refute Chomsky’s approach. Instead, they refute the applicability of Chomsky’s approach to the explanation of LLMs!
20 Because I suspect LLMs use methods unrelated to human cognition, I also believe Piantadosi’s (2023) claim ‘LLMs refute Chomskyan linguistics’ illustrates the danger of being seduced by network performance. Again, cognitive scientists recognize human-like performance is not sufficient to establish human-like processing. Language development eg – NETTALK? An LLM’s high performance signals potential relevance to human cognition. However, the signal means ‘look inside the LLM to see how it functions’.
21 LLM proponents recognize LLMs use some method to produce a remarkable facility with language. Piantadosi quote. To inform cognitive science, to defend claims like ‘LLMs refute Chomsky’, researchers must do the hard work to discover what methods LLMs use, and to compare the discovered methods to those discovered by research on human cognition.
22 However, understanding how LLMs convert stimuli into responses is extremely challenging, because LLMs are intimidatingly large and complex systems. How big is ChatGPT; how big is its training set? Mitchell and Krakauer (2023, p. 1) note “the inner workings of these networks are largely opaque; even the researchers building them have limited intuitions about systems of such scale.” Piantadosi (2023, p. 8, his italics) concurs: “In fact, we don’t deeply understand how the representations these models create work.”
23 Fortunately, researchers recognize the need to extract potentially novel theories or representations from LLMs and are developing new techniques to understand an LLM’s internal structure. Manning et al. examples. End with Manning 2022 p. 131 quote.
24 My hope is more work of this sort is on the horizon. As researchers explore LLM representations, as well as how the representations are used to generate responses, we move closer to relating LLMs to human cognition.
25 Piantadosi (2023, p. 30) claims “large language models rewrite the philosophy of approaches to language.” Do LLMs refute Chomsky’s approach? Do LLMs represent a new connectionist revolution for cognitive science? I believe we can’t answer such questions – yet.