Programming Language Processing

Do humans process code in the same way that they process natural languages?

A relatively new question in the study of human language processing is how humans process programming languages, or code. Programming has primarily been viewed through the lens of problem-solving, and its “language” component is often overlooked; however, natural and programming languages are similar in many ways. As programming becomes more widespread, with proposals even to recognize programming languages as a type of “foreign language” in secondary education, researchers have recognized the need to explore the cognitive processes involved in processing programming languages.

Our research uses psycholinguistic methods to determine whether the processing strategies humans apply to natural language are also used when they process programming languages. As a first step toward answering this question, we test whether regularization of dispreferred constituent orderings, a phenomenon common in human processing of natural language, also occurs in the processing of programming languages. Our future research will then test whether humans experience processing difficulty when reading less predictable code, much as they do when reading less predictable natural-language text. The outcomes of this research will clarify how humans process code and will have implications for existing pedagogical practices in computer science classes.