The things in proofs are weird: a thought on student difficulties

By Ben Blum-Smith, Contributing Editor

“The difficulty… is to manage to think in a completely astonished and disconcerted way about things you thought you had always understood.” ― Pierre Bourdieu, Language and Symbolic Power, p. 207

Proof is the central epistemological method of pure mathematics, and the practice that most distinguishes it from other disciplines. Reading and writing proofs are essential skills (the essential skills?) for many working mathematicians.

That said, students learning these skills, especially for the first time, find them extremely hard.[1]

Why? What’s in the way? And what are the processes by which students effectively gain these skills?

These questions have been discussed extensively by researchers and teachers alike,[2] and they have personally fascinated me for most of my twenty years in mathematics education.

In this blog post I’d like to examine one little corner of this jigsaw puzzle.

Imported vs. enculturated

To frame the inquiry, I posit that there are imported and enculturated capacities involved in reading and writing proofs. Teachers face corresponding challenges when teaching students about proof.

Capacities that are imported into the domain of proof-writing are those that students can access independently of whether they have any mathematics training in school or contact with the mathematical community, let alone specific attention to proof.[3] Capacities that are enculturated are those that students do not typically develop without some encounter with the mathematics community, whether through reading, schooling, math circles, or otherwise. Examples of imported capacities are the student’s capacity to reason, and fluency in the language of instruction. Enculturated capacities include, for example, knowledge of specific patterns of reasoning common to mathematics writing but rare outside it, such as the elegant complex of ideas behind the phrase, “without loss of generality, we can suppose….”

For imported capacities involved in proof, the teaching challenge is to create conditions that cause students to actually access those capacities while reading and writing proofs.

For enculturated capacities, the prima facie teaching challenge is to inculcate them, i.e., to cause the capacities to be developed in the first place. But there is also a prior, less obvious challenge: we have to know they’re there. Since many instructors are already very well-enculturated, our culture is not always fully visible to us. If we can’t see what we’re doing, it’s harder to show students how to do it. (This challenge has the same character as that mentioned by Pierre Bourdieu in the epigraph, although he was writing about sociology.)

When my personal obsession with student difficulty with proofs first developed, I focused on imported capacities. I had many experiences in which students whom I knew to be capable of very cogent reasoning produced illogical work on proof assignments. It seemed to me that the instructional context had somehow severed the connection between the students’ reasoning capacities and what I was asking them to do. I became very curious about why this was happening, i.e., what types of instructional design choices led to this severing, and even more curious about what types of choices could reverse it.

My main conclusion, based primarily on experience in my own and others’ classrooms, and substantially catalyzed by reading Paul Lockhart’s celebrated Lament and Patricio Herbst’s thought-provoking article on the contradictory demands of proof teaching, was this: It benefits students, when first learning proof, to have some legitimate uncertainty and suspense regarding what to believe, and to keep the processes of reading and writing proofs as closely tied as possible to the process of deciding what to believe.[4]

I stand by this conclusion, and more broadly, by the view that the core of teaching proof is about empowering students to harness their imported capacities (in the above sense) to the task, rather than learning something wholly new. That said, in the last few years I’ve become equally fascinated by the challenges of enculturation that are part of teaching proof reading and writing. If I’m honest, my zealotry regarding the importance of imported capacities blinded me to the importance of the enculturated ones.

What I want to do in the remainder of this blog post is to propose that a particular feature of proof writing is an enculturated capacity. It’s a feature I didn’t even fully notice until fairly recently, because it’s such a second-nature part of mathematical communication. I offer this proposal in the spirit of the quote by sociologist/anthropologist Pierre Bourdieu in the epigraph: to think in a completely astonished and disconcerted way about something we thought we already understood. Naming it as enculturated has the ultimate goal of supporting an inquiry into how students can be explicitly taught how to do it, though this goes beyond my present scope.

The things in proofs are weird

I recently encountered an article by Kristen Lew and Juan Pablo Mejía-Ramos, in which they compared undergraduate students’ and mathematicians’ judgements regarding unconventional language used by students in written proofs.[5] One of their findings was that, in their words, “… students did not fully understand the nuances involved in how mathematicians introduce objects in proofs.” (2019, p. 121)

The hypothesis I would like to advance in this post is offered as an explanation for this finding, as well as for a host of student difficulties I’ve witnessed over the years:

The way we conceptualize the objects in proofs is an enculturated capacity.

These objects are weird. In particular, the sense in which they exist, what they are, is weird. They have a different ontology than other kinds of objects, even the objects in other kinds of mathematical work. An aspect of learning how to read and write proofs is getting accustomed to working with objects possessing this alternative ontology.[6] If this is true, then it makes sense that undergraduates don’t quite have their heads wrapped around the way that mathematicians summon these things into being.

The place where this is easiest to see is in proofs by contradiction. When you read a proof by contradiction, you are spending time with objects that you expect will eventually be revealed never to have existed, and you expect this revelation to furthermore tell you that it was impossible that they had ever existed. That’s bizarro science fiction on its face.

But it’s also true, more subtly perhaps, of objects appearing in pretty much any other type of proof. To illustrate: suppose a proof begins,

Let $\Lambda$ be a lattice in the real vector space $\mathbb{R}^n$, and let $v\in\Lambda$ be a nonzero vector of minimal (Euclidean) length in $\Lambda$.

Question. What kind of a thing is $v$?

[The camera pans back to reveal this question has been asked by a short babyfaced man wearing a baseball cap, by the name of Lou Costello. His interlocutor is a taller, debonair fellow with a blazer and pocket square, answering to Bud Abbott.]

Abbott: It’s a vector in $\mathbb{R}^n$.
Costello: Which vector?
Abbott: Well, it’s not any particular vector. It depends on $\Lambda$.
Costello: You just said it was a particular vector and now it’s not a particular vector?
Abbott: No, well, yes, it’s some vector, so in that sense it’s a particular vector, but I can’t tell you which one, so in that sense it’s no particular vector.
Costello: You can’t tell me which one?
Abbott: No.
Costello: Why not?
Abbott: Because it depends on $\Lambda$. It’s one of the vectors that’s minimal in length among nonzero vectors in $\Lambda$.
Costello: Which vector?
Abbott: No particular vector.
Costello: But is it some vector?
Abbott: Naturally!

Costello: You said it depends on $\Lambda$. What’s $\Lambda$?
Abbott: A lattice in $\mathbb{R}^n$.
Costello: Which lattice?
Abbott: Any lattice.
Costello: Why won’t you say which lattice?
Abbott: Because I’m trying to prove something about all lattices.
Costello: You mean to say $\Lambda$ is every lattice???
Abbott: No, it’s just one lattice.
Costello: Which one?!

For any readers unfamiliar with the allusion here, it is to “Who’s on First?”, legendary comedy duo Abbott & Costello’s signature routine.[7] What’s relevant to the present discussion is that the skit is based on Costello asking Abbott a sequence of questions about a situation to which he is an outsider and Abbott is an insider. Costello becomes increasingly frustrated by Abbott’s answers, which make perfect sense from inside the situation, but seem singularly unhelpful from the outside. Abbott for his part maintains patience but is so internal to his situation—enculturated, as it were—that he doesn’t address, or even seem to perceive, the ways he could be misunderstood by an outsider.[8]

My goal with this little literary exercise has been to dramatize the strangeness of the “arbitrary, but fixed” nature of the objects in proofs. Most things we name, outside of proof-writing, don’t have this character. Either they’re singular or plural; one or many; specific or general; not both. Every so often, we speak of a singular that represents a collective (“the average household”, “a typical spring day”), or that is constituted from a collective (“the nation”), but these are still ultimately singular. They are not under the same burden as mathematical proof objects, to be able to stand in for any member of a class. Proof objects aren’t representative members of classes but universal members. This makes them fundamentally unspecified, even while we imagine and write about them as concrete things.

There’s an additional strangeness: proof objects, and the classes of which they are the universal members, are themselves often constituted in relation to other proof objects. We get chains, often very long, where each link adds a new layer of remove from true specificity, but we still treat each link in the chain, including the final one, as something concrete. I was trying to hint at this by posing the question “what is it?” about $v$, rather than $\Lambda$, in the example above. As consternated as Costello is by $\Lambda$, $v$ is doubtless more confounding.

I think there are at least two distinct aspects of this that students new to proof do not usually do on their own without some kind of enculturation process. In the first place, the initial move of dealing with everything in a class of objects simultaneously by postulating a “single universal representative” of that class just isn’t automatic. This is a tool mathematical culture has developed. Students need to be trained, or to otherwise catch on, that a good approach to proving a statement of the form “For all real numbers…” might begin, “Let $x$ be a real number.”[9]
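
As a small illustration of this move, here is how a universally quantified claim is typically proved by introducing a single “arbitrary, but fixed” $x$. The claim itself is an assumed example chosen for simplicity; it does not come from the studies cited here.

```latex
% Illustrative example only (an assumption, not taken from the cited sources).
\textbf{Claim.} For all real numbers $x$, we have $x^2 - 2x + 1 \ge 0$.

\textbf{Proof.} Let $x$ be a real number. Then
\[
  x^2 - 2x + 1 = (x - 1)^2 \ge 0,
\]
since the square of a real number is nonnegative. Because $x$ was
arbitrary, the inequality holds for all real numbers.
```

Nothing in the computation uses any property of $x$ beyond its being a real number, and that is what licenses the final sentence.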

But secondly, when we work with these objects, their “arbitrary, but fixed” character forces us to hold them in a different way, mentally, than we hold the objects of our daily lives, or even the mathematical objects of concrete calculations. When you read, “Let $f$ be a smooth function $\mathbb{R}\rightarrow\mathbb{R}$,” what do you imagine? A graph? Some symbols? How does your mental apparatus store and track the critical piece of information that $f$ can be any smooth function on $\mathbb{R}$? Reflecting on my own process, I think what I do in this case is to imagine a vague visual image of a smooth graph, but it is “decorated” (in a semantic, not a visual, way) by information about which features are constitutive and which could easily have been different. The local maxima and minima I happen to be imagining are stored as unimportant features while the smoothness is essential. Likewise, when I wrote, “Let $\Lambda$ be a lattice in the real vector space $\mathbb{R}^n$,” what did you imagine? Was there a visual? If so, what did you see? I imagined a triclinic lattice in 3-space. But again, it was somehow semantically “decorated” by information about which features were constitutive vs. contingent. That I was in 3 dimensions was contingent, but the periodicity of the pattern of points I imagined was constitutive. I’m positing that students new to proof do not usually already know how to mentally “decorate” objects in this way.[10]


Here are some specific instances of student struggle that seem to me to be illuminated by the ideas above.

  • In the paper of Lew and Mejía-Ramos mentioned above, eight mathematicians and fifteen undergraduates (all having taken at least one proof-oriented mathematics course) were asked to assess student-produced proofs for unconventional linguistic usages. The sample proofs were taken from student work on exams in an introduction to proof class. One of these sample proofs began, “Let $\forall n\in\mathbb{Z}$.” Seven of the eight mathematicians identified the “Let $\forall$…” as unconventional without prompting, and the eighth did as well when asked about it. Of the fifteen undergraduate students, on the other hand, only four identified this sentence as unconventional without prompting, while even after being asked directly about it, six of the students maintained that it was not unconventional. I would like to understand better what these six students thought that the sentence “Let $\forall n\in\mathbb{Z}$” meant.
  • Previously on this blog, I described the struggle of a student to wrap her head around the idea, in the context of $\varepsilon$-$\delta$ proofs, that you are supposed to write about $\varepsilon$ as though it’s a particular number, when she knew full well that she was trying to prove something for all $\varepsilon>0$ at once.
  • A year and a half ago, I was working with students in a game theory course. They were developing a proof that a Nash equilibrium in a two-player zero-sum game involves maximin moves for both players. It was agreed that the proof would begin by postulating a Nash equilibrium in which some player, say $P$, was playing a move $A$ that was not a maximin move. By the definition of a maximin move, this implies that $P$ has some other move $B$ such that the minimum possible payout for $P$ if she plays move $B$ is higher than the minimum possible payout if she plays $A$. The students recognized the need to work with this “other move” $B$ but had trouble carrying this out. In particular, it was hard for them to keep track of its constitutive attribute, i.e., that its minimum possible payout for $P$ is higher than $A$’s. They were as drawn to chains of reasoning that circled back to this property of $B$ as a conclusion, as they were to chains of reasoning that proceeded forward from it.
  • In the same setting as the previous example, there was a student who, in order to get her mind around what was going on, very sensibly constructed some simple two-player games to look at. I don’t remember the examples, but I remember this: I kept expecting that when she looked at the fully specified games, “what $B$ was” would click for her, but it didn’t. Instead, I found myself struggling to be articulate in calling her attention to $B$, precisely because its constitutive attribute was now only one of the many things going on in front of us; nothing was “singling it out.” I found myself working to draw her attention away from the details of the examples she’d just constructed in order to focus on the constitutive attribute of $B$. My reflection on this student’s experience was what first pointed me toward the ideas in this blog post: I mean really, what is $B$, anyway, that recedes from view exactly when the situation it’s part of becomes visible in detail?
  • This semester I taught a course on symmetry for non-math majors. It involved some elementary group theory. An important exercise was to prove that in a group, $xz=yz$ implies $x=y$. One student produced an argument that was essentially completely general, but carried out the logic in a specific group, with a specific choice of $z$, and presented it as an example. Here is a direct quote, edited lightly for grammar and typesetting. “For example [take] $x\cdot R90=y\cdot R90$; if we will operate on both sides the inverse of $R90$ we will get $(x\cdot R90)R270=(y\cdot R90)R270$. As we have proven that $(AB)C$ always $= A(BC)$, we can change the structure of the equation to $x(R90\cdot R270)=y(R90\cdot R270)$, [which] shows that $x$ has to be equal to $y$.” The symbols $R90$ and $R270$ refer to one-quarter and three-quarters rotations in the dihedral group $D_4$. From my point of view as instructor, the student could have transformed this from an illustrative example to an actual proof just by replacing $R90$ and $R270$ with $z$ and $z^{-1}$, respectively, throughout. What was the obstruction to the student doing this?
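
For comparison, here is one way the argument in the last example might read once $R90$ and $R270$ are replaced by $z$ and $z^{-1}$. This is only a sketch of the substitution just described, not the student’s own text, and the symbol $e$ for the identity element is notation introduced here for illustration.

```latex
% Sketch of the generalized argument; $e$ (the identity element) is
% notation assumed here and does not appear in the student's work.
Let $x$, $y$, $z$ be elements of a group, and suppose $xz = yz$. Then
\begin{align*}
  (xz)z^{-1} &= (yz)z^{-1} && \text{(multiply both sides on the right by $z^{-1}$)} \\
  x(zz^{-1}) &= y(zz^{-1}) && \text{(associativity)} \\
  xe         &= ye         && \text{(definition of inverse)} \\
  x          &= y          && \text{(definition of identity).}
\end{align*}
```

Line for line, this follows the student’s computation in $D_4$; the only change is that $z$ no longer names any particular element.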


My claim is that the mathematician’s skill of mentally capturing classes of things by positing “arbitrary, but fixed” universal members of those classes, and then proceeding to work with these universal members as though they are actual objects that exist, is an enculturated capacity.[11] I think it’s a little bit invisible to us; at least, it was so to me, for a long time. My purpose in advancing this claim is that making the skill visible invites an inquiry into how we can explicitly lead students to acquire it. I hope those of you who have given attention to how to train students in this particular aspect of proof (reading and) writing will offer some thoughts in the comments!

Notes and references

[1] I trust that any reader of this blog who has ever taught a course, at any level, that serves as its students’ introduction to proof, has some sense of what I am referring to. Additionally, the research literature is dizzyingly vast and there is no hope to do it any justice in this blog post, let alone this footnote. But here are some places for an interested reader to start: S. Senk, How well do students write geometry proofs?, The Mathematics Teacher Vol. 78, No. 6 (1985), pp. 448–456 (link); R. C. Moore, Making the transition to formal proof, Educational Studies in Mathematics, Vol. 27 (1994), pp. 249–266 (link); W. G. Martin & G. Harel, Proof frames of preservice elementary teachers, JRME Vol. 20, No. 1 (1989), pp. 41–51 (link); K. Weber, Student difficulty in constructing proofs: the need for strategic knowledge, Educational Studies in Mathematics, Vol. 48 (2001), pp. 101–119 (link); and K. Weber, Students’ difficulties with proof, MAA Research Sampler #8 (link).

[2] Again, I cannot hope even to graze the surface of this conversation in a footnote. The previous note gives the reader some places to start on the scholarly conversation. A less formal conversation takes place across blogs and Twitter. Here is a recent relevant blog post by a teacher, and here are some recent relevant threads on Twitter.

[3] This and the following sentence should be treated as definitions. I am indulging the mathematician’s prerogative to define terms and expect that the audience will interpret them according to those definitions throughout the work. In particular, while I hope I’ve chosen terms whose connotations align with the definitions given, I’m relying on the reader to go with the definitions rather than the connotations in case they diverge. I invite commentary on these word choices.

[4] This is an argument I have made at length in the past on my personal teaching blog (see here, here, here, here, here), and occasionally in a very long comment on someone else’s blog (here). Related arguments are developed in G. Harel, Three principles of learning and teaching mathematics, in J.-L. Dorier (ed.), On the teaching of linear algebra, Dordrecht: Kluwer Academic Publishers, 2000, pp. 177–189 (link; see in particular the “principle of necessity”); and in D. L. Ball and H. Bass, Making believe: The collective construction of public mathematical knowledge in the elementary classroom, in D. Phillips (ed.), Yearbook of the National Society for the Study of Education, Constructivism in Education, Chicago: Univ. of Chicago Press, 2000, pp. 193–224.

[5] K. Lew & J. P. Mejía-Ramos, Linguistic conventions of mathematical proof writing at the undergraduate level: mathematicians’ and students’ perspectives, JRME Vol. 50, No. 2 (2019), pp. 121–155 (link).

[6] Disclaimer: although I am using the word “ontology” here, I am not trying to do metaphysics. The motivation for this line of inquiry is entirely pedagogical: what are the processes involved in students gaining proof skills?

[7] Here’s a video; it’s a classic.

[8] One of the keys to the humor is that the audience is able to see the big picture all at once: the understandable frustration of Costello, the uninitiated one, apparently unable to get a straight answer; the endearing patience of Abbott, the insider, trying so valiantly and steadfastly to make himself understood; and, the key idea that Costello is missing and that Abbott can’t seem to see that Costello is missing. I’m hoping to channel that sense of stereovision into the present context, to encourage us to see the objects in a proof simultaneously with insider and outsider eyes.

[9] Annie Selden and John Selden write about the behavioral knowledge involved in proof-writing, and use this move as an illustrative example. A. Selden and J. Selden, Teaching proving by coordinating aspects of proofs with students’ abilities, in Teaching and Learning Proof Across the Grades: A K-16 Perspective, New York: Routledge, 2009, p. 343.

[10] The ideas in this paragraph are related to Efraim Fischbein’s notion of “figural concepts”; see E. Fischbein, The theory of figural concepts, Educational Studies in Mathematics Vol. 24 (1993), pp. 139–162 (link). Fischbein argues that the mental entities studied in geometry “possess simultaneously conceptual and figural characters” (1993, p. 139). Fischbein’s work in turn draws on J. R. Anderson, Arguments concerning representations for mental imagery, Psychological Review, Vol. 85 No. 4 (1978), pp. 249–277 (link), which, in a broader (not specifically mathematical) context, discusses “propositional” vs. “pictorial” qualities of mental images. The resonance with the dichotomy I’ve flagged as “semantic” vs. “visual” is clear. I’m suggesting that the particular interplay between these poles that is involved in conceptualizing proof objects is a mental dance that is new to students who are new to proof. (Actually, it is not entirely clear to me that the dichotomy I want to highlight is “semantic” vs. “visual” as much as “general” vs. “specific”; perhaps it’s just that visuals tend to be specific. However, time does not permit me to develop this inquiry further here.)

[11] Because this circle of skills involves taking something strange and abstract and turning it into something the mind can deal with as a concrete and specific object, it strikes me as related to some notions well-studied in the education research literature: Anna Sfard’s reification and Ed Dubinsky’s APOS theory (both ways of describing the interplay between process and object in mathematics learning), and the more general concept of compression (see, e.g., D. Tall, How Humans Learn to Think Mathematically, New York: Cambridge Univ. Press, 2013, chapter 3).
