Discussions of George Cobb’s Mere Renovation is Too Little ...ing a scripting language such as R or python. Particular data wrangling tools such as the use of regular expressions - [PDF Document] (2024)


Discussions of George Cobb’s “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum from the Ground Up”,

Special issue of The American Statistician on "Statistics and the Undergraduate Curriculum"

(November, 2015, pp 266-282).

Cobb, 2015 preprint: Mere renovation is too little too late: we need to rethink our undergraduate curriculum from the ground up

Contents

Albert and Glickman: Attracting undergraduates to statistics through data science

Chance, Peck, and Rossman: Response to mere renovation is too little too late

De Veaux and Velleman: Teaching statistics algorithmically or stochastically misses the point: why not teach holistically?

Fisher and Bailer: Who, what, when and how: changing the undergraduate statistics curriculum

Franklin: We need to rethink the way we teach statistics at K-12

Gelman and Loken: Moving forward in statistics education while avoiding overconfidence

Gould: Augmenting the vocabulary used to describe data

Holcomb, Quinn, and Short: Seeking the niche for traditional mathematics within undergraduate statistics and data science curricula

Kass: The gap between statistics education and statistical practice

King: Training the next generation of statistical scientists

Lane-Getaz: Stirring the curricular pot once again

Notz: Vision or bad dream?

Ridgway: Data Cowboys and Statistical Indians

Temple Lang: Authentic data analysis experience

Utts: Challenges, changes and choices in the undergraduate statistics curriculum

Ward: Learning communities and the undergraduate statistics curriculum

Wickham: Teaching Safe-Stats, not statistical abstinence

Wild: Further, faster, wider

Zieffler: Teardowns, historical renovation, and paint-and-patch: curricular changes and faculty development

Rejoinder by George Cobb


Attracting Undergraduates to Statistics Through Data Science

Jim ALBERT and Mark GLICKMAN

We agree with George Cobb that statisticians need to rebuild their undergraduate curricula in statistics in the wake of big data and the many opportunities for employment in Data Science. As Cobb notes, our statistics curricula are currently facing several threats such as big data in computer science and analytics in business, and we agree that it is high time for statisticians to seriously rethink our undergraduate curriculum.

In particular, we believe the one-year sequence in probability and mathematical statistics, the standard introduction to statisticians for the past 50 years, is no longer a suitable foundation for training for a modern applied statistician. At Bowling Green, one of us has been fortunate to participate in the creation of a new major in Data Science within a department of mathematics and statistics. At Harvard, where one of us has been visiting faculty for 10 years, a new Data Science track for the statistics concentration is actively under development. Here we focus on what we believe are the important components of a data science program/track that can attract majors and provide a good foundation for employment as a data scientist.

Introduce Statistics Through Exploratory and Visualization Methods

For students with minimal statistics prerequisites, a good foundation course in a data science major focuses on computation with data. A version of this course is already in existence at Harvard, and Baumer (2015) and Hardin et al. (2014) describe similar data science courses. The student learns basic methods for importing, manipulating, and exploring data using a scripting language such as R or Python. Particular data wrangling tools such as the use of regular expressions for textual data play an important role since text data is representative of modern data with a different structure from the traditional “data frame” rectangular grid. This course provides a good opportunity to learn about categorical and quantitative variables, as well as data management that aids in accessing elements of large data sets. Many of the themes of Tukey’s exploratory data analysis can be introduced in this computing with data course. Additionally, such a course can emphasize communication of results through visualization and interpretable summaries, including generation of hypotheses by simple data explorations.
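As an illustration of the kind of wrangling exercise such a course might include, the following is a minimal sketch in Python that uses a regular expression to turn free-text lines into rectangular records. The sample lines, the field names (`date`, `temp`, `site`), and the helper `parse_lines` are all hypothetical and invented for this example; they are not taken from any of the courses described above.

```python
import re

# Hypothetical free-text records, invented purely for illustration.
raw_lines = [
    "2015-03-02  temp=21.5C  site: Oxford",
    "2015-03-03  temp=19.0C  site: Athens",
    "no reading today",
]

# Named groups pull a date, a numeric reading, and a label out of each line.
pattern = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2})\s+"
    r"temp=(?P<temp>-?\d+(?:\.\d+)?)C\s+"
    r"site:\s*(?P<site>\w+)"
)


def parse_lines(lines):
    """Return one dict per matching line; lines that do not match are skipped."""
    rows = []
    for line in lines:
        match = pattern.search(line)
        if match is None:
            continue  # unstructured line with no recognizable fields
        row = match.groupdict()
        row["temp"] = float(row["temp"])  # convert the quantitative variable
        rows.append(row)
    return rows


rows = parse_lines(raw_lines)
for row in rows:
    print(row)
```

Each resulting dict plays the role of one row of a data frame, with `site` as a categorical variable and `temp` as a quantitative one; the line that does not fit the pattern is dropped rather than guessed at.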

Statistical Programming

Much of the work of a data scientist consists of the process of data collection and data wrangling and it seems clear that the student needs sufficient training in a statistics programming language. One required course in the new Bowling Green data science program is “Statistical Programming”. This course is an in-depth look at data types and containers, and the students get experience writing scripts and functions for different data science tasks. The tools for collecting, managing, and visualizing data are changing quite rapidly and the student with a solid foundation in a statistics language such as R will likely be able to adapt to these new data science tools.

Online discussion of “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum From the Ground Up,” by George Cobb, The American Statistician, 69. Jim Albert is Professor of Statistics, Bowling Green State University, Bowling Green, OH. Mark Glickman is Research Professor of Health Law, Policy and Management, Boston University, Boston, MA.

The Importance of Context

The computing with data course uses several interesting big data applications to demonstrate the value of statistical thinking. Nolan and Lang (2015) describe a number of real-life data science projects, and ideas from these projects can help the instructor in building interesting homework assignments and projects. Ideally, a curriculum in data science can prepare the student to work on an extended data science project in collaboration with a faculty member from an applied discipline. Each undergraduate program needs to develop a network of internships, advisers and summer programs that can help in the development of these capstone projects.

From Statistical Inference to a Broad View of Statistical Algorithms

We agree with Cobb that the statistics curriculum needs to move away from traditional statistical inference courses with their emphasis on testing hypotheses and normal error and independence assumptions. But what should a modern statistical inference course look like? One possibility is to offer a statistical learning course. One of the required courses in the Bowling Green data science program is a course based on James et al. (2013) that gives an applied overview of various statistical learning algorithms together with lab exercises on interesting datasets. Another alternative would be one based on generalized linear models (GLM), but this would require the students to have knowledge of a variety of probability distributions. In the Harvard GLM course that one of us teaches, which assumes students have had exposure to basic probability but not mathematical statistics, we have incorporated a data prediction competition on Kaggle as a course project. This type of exercise provides students an opportunity not only to engage in model criticism and refinement, but also to explore machine learning prediction algorithms. Students are exposed through this project to both the stochastic and algorithmic cultures that Breiman (2001) identified. Given the increasing popularity of Bayesian methods, we think the time is right for the development of an applied Bayesian course. Link and Barker (2009) is one example of an applied Bayesian text that illustrates basic concepts within the context of interesting examples from a particular discipline.

©2015 Jim Albert and Mark Glickman. The American Statistician, Online Discussion.

The Intro Statistics Course?

We believe the most challenging task is to redesign the introductory statistics class. Too many classes focus on learning recipes and the student leaves the course with a distaste for the subject and, more troublesome, a lack of appreciation of the discipline of statistics. One of us has incorporated magic tricks (Lesser and Glickman 2009) and class-participatory demonstrations (Gelman and Glickman 2000) as ways to enhance interest in introductory statistics concepts. One of us has had fun experimenting (Albert 2003) with a baseball version of our introductory class. The course arguably succeeds in part since the students are genuinely interested in the sports application and the statistics concepts make more sense when discussed in this context. Generally, any type of project in which the students get to implement all of the steps of a statistics investigation is one of the best ways of making the discipline real for the students.

References

Albert, J. (2003), Teaching Statistics Using Baseball, Mathematical Association of America.

Baumer, B. (2015), “A Data Science Course for Undergraduates: Thinking with Data,” available online at arXiv:1503.05570v1.

Breiman, L. (2001), “Statistical Modeling: The Two Cultures” (with comments and a rejoinder by the author), Statistical Science, 16, no. 3, 199–231.

Cobb, G. W. (2015), “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum from the Ground Up,” The American Statistician, 69(4), doi:10.1080/00031305.2015.1093029.

Gelman, A., and Glickman, M.E. (2000), “Some Class-Participation Demonstrations for Introductory Probability and Statistics,” The Journal of Educational and Behavioral Statistics, 25, 84–100.

Hardin, J., Hoerl, R., Horton, N., and Nolan, D. (2014), “Data Science in the Statistics Curricula: Preparing Students to ‘Think with Data,’” available online at arXiv:1410.3127v2.

James, G., Witten, D., Hastie, T., and Tibshirani, R. (2013), An Introduction to Statistical Learning: With Applications in R, Springer.

Lesser, L.M., and Glickman, M.E. (2009), “Using Magic in the Teaching of Probability and Statistics,” Model Assisted Statistics and Applications, 4 (4), 265–274.

Link, W., and Barker, R. (2009), Bayesian Inference: with Ecological Applications, Academic Press.

2 Online Discussion: Special Issue on Statistics and the Undergraduate Curriculum


Response to: “Mere Renovation Is Too Little Too Late: We Need To Rethink Our Undergraduate Curriculum From The Ground Up”

Roxy PECK, Beth CHANCE, and Allan ROSSMAN

First, we thank and congratulate George for a thought-provoking article. We whole-heartedly agree that implementation of the new guidelines will require considerable holistic thinking about the undergraduate curriculum for statistics majors. Rather than tweaking existing courses, departments should embrace the challenge to rethink everything from the first course, to progression and scaffolding through the curriculum, to assessment of learning objectives. For example, rather than continuing to compartmentalize computing, theory, and applications, courses should address the junction of all three areas. Algorithmic thinking needs to be explicitly taught, though not at the complete expense of mathematical underpinnings. We believe that Breiman’s two cultures can complement and reinforce each other.

The tricky trade-off is of course in practice. What is feasible for departments to do in the near term? Continuing George’s “tear-down” metaphor, which requires an expensive and time-consuming process, a key question is where to live in the meantime? Statistics departments cannot simply suspend their programs for a few years and then admit students once their new programs are established, so is it realistic to pursue drastic innovations such as attempting to break down departmental barriers and abandon teaching “subjects” entirely?

Other questions abound, such as: How will we know our new structure is feasible? Do we test and evaluate before we tear down and build, or do we just tear down, build, and hope for the best? Do we need to reach consensus within and among our departments first? How do we prepare current faculty, the vast majority of whom were taught in the probabilistic culture, to develop courses that teach skills from both cultures?

Perhaps we should begin with more modest renovations. For example, at Cal Poly, we have introduced a 4-unit “orientation” course for entering first-quarter students that begins our majors’ discussions of historical roots of the discipline, ethics, future directions, “big data,” computing in R, communication skills, and collaboration strategies. This is followed by an applied introductory two-course sequence that focuses on the statistical investigation process as a whole, working with messy data and using simulation to motivate mathematical theory of statistical inference. We want our statistics majors to immediately apply their knowledge to genuine research studies, rather than waiting to finish courses in calculus and probability first. This sequence is followed by an applied regression course, where we are currently adding more topics on predictive modeling, but in a manner that complements the modelling culture while focusing on overarching principles of statistical thinking. We are still collectively revising and hope to go further (especially as more course materials and texts become available), but what topics can/should now be omitted and just how far should we go?

Online discussion of “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum From the Ground Up,” by George Cobb, The American Statistician, 69, DOI: 10.1080/00031305.2015.10930. Roxy Peck is Professor Emerita, California Polytechnic State University–Statistics, P.O. Box 7088, Los Osos, CA 93412. Beth Chance is Professor, Cal Poly, 1 Grand Ave, Department of Statistics, San Luis Obispo, CA 93047. Allan J. Rossman is Professor and Chair, Department of Statistics, Cal Poly San Luis Obispo, CA 93407.

We hope the ASA will continue to provide resources and support for such changes. We are encouraged by the special issue’s focus on rethinking the undergraduate curriculum in a collaborative manner, sharing resources and lessons learned, and learning from how individual institutions uniquely balance between cultures. Only by changing how undergraduate statistics majors are taught now will we be able to positively impact the discipline in the future.

©2015 Roxy Peck, Beth Chance, and Allan Rossman. The American Statistician, Online Discussion.


Who, What, When and How: Changing the Undergraduate Statistics Curriculum:

A Discussion of “Mere Renovation is Too Little Too Late”

Thomas J. FISHER and A. John BAILER

First, and foremost, we largely agree with the underlying themes presented by Professor Cobb: the need for a curriculum that attracts, inspires and engages students, preparing them for the data and analysis questions they will face in their future; that the curriculum of statistics needs to evolve to include more modern tools; our profession (and discipline) is in danger of being superseded by computer science, business analytics and bioinformatics; and the need for a curriculum that is dynamic in the face of the evolving role of data science, mathematics and computing.

The manuscript is a provocative and interesting piece that generated a good conversation between us and we expect that it will do the same for our colleagues. The four threads outlined are particularly noteworthy and the analogy to the fast food industry is chillingly accurate. With all that said, the manuscript raised several intertwined questions about the implications of such a dramatic shift to the curriculum. These can be summarized in three main discussion points. The first is on the role of curriculum, the second is a question of scale and the third is on the topic of competition and collaboration.

As described by Professor Cobb, historically intro courses were based on sampling distribution theory and required some mathematical chops to grasp the probabilistic and statistical rationale. Applications with real data were a secondary consideration and many statistical results were taught as a mathematical recipe for different types of theoretical data. Although the standard intro course has evolved to a more data driven approach and to include various computing techniques, the same build up from probability through inference is typically taught. Before we can completely abandon this approach, we need to address a fundamental question: What is the role of introductory statistics classes in the curriculum? From our perspective, our introductory courses serve three distinct clientele: all students (think the general public), statistical doers (majors requiring skills with data and analysis, i.e., the sciences) and proto-statisticians (those majoring in statistics, mathematics or computer science or who may get an advanced degree in the area). The teaching goals for each of these groups can be quite different and can be summarized as literacy, rationale and comprehension, respectively. The discussed revolution of the curriculum appears to largely concentrate on the rationale of statistics, moving to a more intuitive algorithmic approach, which in our mind largely serves the ‘statistical doers.’ How will this affect our other students? That is, what are the repercussions for the general public and the proto-statistician with such a dramatic change in teaching? Even with the ‘statistical doers’, the methods courses taught in their home departments often still expect a certain level of established tools (the two-sample t-test for instance). The issue is not merely in a change of our curriculum, but an entire shift in thinking with our client departments, the general public and our graduate programs as well.

Online discussion of “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum From the Ground Up,” by George Cobb, The American Statistician, 69. Thomas J. Fisher is an Assistant Professor of Statistics at Miami University, Oxford, OH 45056. A. John Bailer is a University Distinguished Professor and Chair in the Department of Statistics at Miami University, Oxford, OH 45056.

Although the curriculum shift should benefit many aspects of statistical literacy (understanding the statistical comparison of groups using a randomization or a classification tree type approach, for instance), we potentially lose several important components in the traditional curriculum. The redesign efforts abandon much of formal probability for the sake of flattening the prerequisites. Even if this has an overall benefit and makes the subject matter more attainable, we must not lose sight of some of the key goals in statistical literacy such as a general understanding of uncertainty, randomness, chance and (shall we say) luck. For instance, in her 7 topics for the Educated Citizens, Jessica Utts highlights an understanding of natural variability (what is normal versus what is the average) as a fundamental element in statistical literacy (Utts 2003). Although we believe such topics can be discussed even with the flattening of prerequisites, the discipline needs to decide how to include such pertinent elements and how much time to dedicate to such topics.
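For readers unfamiliar with the randomization approach to comparing groups mentioned above, the following is a minimal sketch in Python, not a method taken from Cobb's article or from any particular course. The function name `permutation_test`, the two small samples, and the number of shuffles are assumptions made for illustration only.

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def permutation_test(group_a, group_b, n_shuffles=5000, seed=0):
    """Estimate a two-sided p-value for the difference in group means.

    Group labels are shuffled repeatedly; the p-value is the share of
    shuffles whose mean difference is at least as large in absolute
    value as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        # small tolerance guards against floating-point round-off in ties
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / n_shuffles


# Hypothetical measurements for two groups, invented for illustration.
treatment = [12.1, 14.3, 13.8, 15.0, 14.1]
control = [10.2, 11.5, 12.0, 10.9, 11.8]

observed, p_value = permutation_test(treatment, control)
print(f"observed difference in means: {observed:.2f}")
print(f"estimated two-sided p-value: {p_value:.4f}")
```

The sketch needs no probability distributions beyond the idea of shuffling labels, which is the sense in which such an approach flattens the prerequisites; the trade-off discussed above is that ideas such as natural variability still have to be taught somewhere.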

Our second point is largely an issue with scale but does connect to the role of curriculum. We commend Professor Cobb and colleagues at the liberal arts colleges for providing numerous innovations in the statistics curriculum; but such changes create a set of challenges that need to be discussed. At many large research universities, more than 1000 students may take the intro stat course every semester, with many of these classes taught by adjunct faculty to large lecture sessions (~100 students). There have been many calls for more active learning experiences in introductory classes in the recent past. Consider Project INGenIOuS or the MAA/ASA/SIAM/AMS Common Vision Project, for instance. When faculty currently teaching intro stat are more comfortable with the traditional format, how do we convince them to use different pedagogical methods? How do we create space for faculty to innovate in a climate where we are already being asked to do more with fewer resources? At universities with undergraduate statistics programs, changes to the foundational courses will require major changes in upper level courses as well. Many of our students enter universities with AP credit; how do we incorporate those trained through a more traditional model at the high school level? Will a need for the traditional first course in statistics, as currently formulated, exist in the near future? How will the change in curriculum translate with other universities and graduate programs that build off the traditional model? And, although a good problem, are we ready for the additional demand a more attractive curriculum may create?

©2015 Thomas J. Fisher and A. John Bailer. The American Statistician, Online Discussion.

The last point we would like to make is our disagreement with the assertion regarding self-interest and the dangers to our field. Although we agree that other disciplines are infringing on traditional statistics material (the so-called fast food), where Professor Cobb appears to suggest the need to protect our field against external threats, we believe the current climate is one of opportunity rather than turmoil. At Miami University, the Department of Statistics partnered with the Farmer School of Business in developing a Co-Major in Analytics: students pick up data management proficiency, statistical methods for predictive modeling, data visualization, communication and teamwork skills, all with an eye towards application. This program has grown substantially since its creation in 2013 with roughly 70 current students. Recently, the Department partnered with the Department of Information Systems & Analytics, Computer Science and Marketing to form the Center for Analytics and Data Science as an interdisciplinary effort to address the analytics skills gap in the current workforce and foster collaborations in the areas of analytics and data science. Rather than going on the defensive, we encourage cooperation with other disciplines to build stronger programs. If implemented correctly, these associations can foster interdisciplinary research, strengthen the importance of computing, disseminate the skills our students need, and secure and preserve statistics’ place as a major partner with other data-oriented disciplines.

References

Cobb, G. W. (2015), “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum from the Ground Up,” The American Statistician, 69(4), doi:10.1080/00031305.2015.1093029.

Utts, J. (2003), “What Educated Citizens Should Know About Statistics and Probability,” The American Statistician, 57, 74–79.



We Need to Rethink the Way We Teach Statistics at K–12

Response to “Mere Renovation is Too Little Too Late: We Need to Rethink the Undergraduate Curriculum from the Ground Up”

Christine FRANKLIN

Kudos to George Cobb for again writing an elegant, visionary, and timely article about teaching our field of Statistics as we move forward in the 21st century. I support George’s core beliefs as to where we need to advance as statistics educators. I would like to offer the following thoughts coming from two perspectives: one as faculty in a large university statistics department and one as someone involved with promoting the integration of statistics at the K–12 school level.

One statement stood out for me in this article: “In our profession as we practice it, we wait to learn what we need to know until we need to know it, and we focus our learning on what we need to know. Why shouldn’t the ways we teach our subject follow the approach we use in practice?” At UGA, from my perspective as undergraduate coordinator, I have observed the legitimacy of this quote with the success of our two semester Capstone course for statistics undergraduate majors. You can read more about this course in The American Statistician article, “A Capstone Course for Undergraduate Statistics Majors” (Lazar et al., 2011). During the exit interviews with our graduating students, it is this one course that students most often identify as having the biggest impact on their learning of statistics. They spend a year practicing statistics with a client. They learn such attributes as how to deal with messy data, new computing and database skills, new statistical techniques not in the standard statistics courses, soft skills of writing reports, conversing with clients who don’t necessarily have a statistics background, and presenting their projects as posters to the statistics faculty. After graduating, the students frequently communicate with the department that the Capstone course has helped them most with their careers. This is not to say the students don’t need understanding of core statistical concepts taught in standard classroom courses; our goal as statisticians needs to be identifying those concepts for helping students develop sound statistical reasoning skills. This development should begin at the school level, not post secondary. This leads me to my second perspective.

With the implementation of the Common Core Mathematics State Standards in the U.S. that includes statistics at grades 6–12, we are at a crossroads where we have the opportunity to embrace letting young students explore and take ownership of the wealth of data that surrounds them and that they help generate. With technology, young students can explore data visually, learn database management skills at an early age, utilize programming skills, use simulation for modeling, and appreciate the way data impacts their lives. With the accessibility of data, formal inference is often not applicable although much emphasis is still placed on inference in our current teaching. There is so much to be learned from simply exploring the data and telling a story. I would like to advocate that we work toward implementing many of George’s suggestions at the school level and not wait until the university level. The field of Computer Science is showing innovation with the new Advanced Placement Course, Computer Science Principles, a course that introduces students to such topics as programming, abstractions, algorithms, and large data sets to address real-world problems. We as statisticians need to follow the lead of Computer Science. I observed first hand during the first 6 months of 2015 how New Zealand is attempting to implement a school level curriculum where students explore real world data using technology to carry out the statistical investigative process and let the data tell an informative story.

Online discussion of “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum From the Ground Up,” by George Cobb, The American Statistician, 69. Christine Franklin, University of Georgia, Athens GA.

The two biggest challenges I see regarding George’s suggestions are building a culture that advocates this as the direction we should travel and the teacher preparation needed (both at the school level and the post secondary level). Even well respected statisticians don’t necessarily want to change the way they teach or what has always been the traditional mathematical based curriculum. Moving the teachers to empower a new culture of teaching statistics has been advocated for at least 50 years, since the championing of exploratory data analysis by John Tukey. We are still struggling to simply teach statistical topics that are more real world and conceptually based versus the more procedural mathematical statistics. Fortunately, the American Statistical Association has teacher preparation at K–16 as one of its priorities.

References

Cobb, G. W. (2015), “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum from the Ground Up,” The American Statistician, 69(4), doi:10.1080/00031305.2015.1093029.

Lazar, N. A., Reeves, J., and Franklin, C. (2011), “A Capstone Course for Undergraduate Statistics Majors,” The American Statistician, 65, 183–189, DOI: 10.1198/tast.2011.10240.

©2015 Christine Franklin. The American Statistician, Online Discussion.


Moving Forward in Statistics Education While Avoiding Overconfidence

Andrew GELMAN and Eric LOKEN

As demonstrated in his provocative article, George Cobb has strong views about statistics education and would like to see big changes. In these respects he is typical; indeed, we don’t know if we’ve ever met anyone who feels satisfied with how statistics is taught at most colleges, whether in statistics departments or elsewhere. What makes Cobb’s thinking worth engaging with is his decades of experience working on these problems as a textbook writer, a committed teacher, and a participant in many committees on the teaching and learning of statistics.

We begin our discussion by emphasizing the parts of Cobb’s article we can unequivocally stand behind: we also recommend the substitution of computing in the place of mathematics, and we are moving this way in our own teaching: not just having students learn a statistics package, but having them do real (if simple) programming to manipulate, graph, and analyze data, and to simulate random processes.

And we also agree that introductory statistics should better match good statistical practice rather than the current standard focus on null hypothesis significance testing and toy math problems such as the sampling distribution of the sample mean, which we have long felt is an unnecessary stumbling block in the standard curriculum.

That said, developing a forward-thinking approach to teaching is not so easy, given the diversity of modern statistical approaches and the diversity of application areas. On one hand, Cobb supports the teaching of regression models while making no assumptions about probability models; on the other hand, he notes the increasing popularity of Bayesian methods, which of course are all about probability models. Should an introductory course gain some coherence by covering just one of these approaches, or would it be better to have a little of each?

One place we disagree with Cobb is in his linking of algorithmic thinking (which we support) with a particular anti-probability-modeling ideology espoused by Breiman in his 2001 article. Probability modeling is just as algorithmic as any other approach to statistics, and it seems to us naive to think that data manipulations are somehow cleaner if they are expressed without reference to generative models for data. Of course, that’s just our perspective based on our teaching and applied research, just as Cobb is offering his own perspective. The challenge for all of us is to decide what to make of all of our personal views on these matters, and to decide where and how we want to teach in a huge and evolving market.

One challenge in dealing with Cobb’s recommendations, and others of this sort, is figuring out who the “we” is. With metaphors ranging from the California real estate market to the fast food industry, Cobb worries about defending “our turf” and the incursion of “others” who teach statistics in unhealthy “Happy Meals.” Cobb seems to be concerned with the future of traditional statistics departments with their undergraduate and graduate curricula. But at many universities undergraduate statistics programs are flourishing, indicating that students are attracted to the current system, or at least to the statistics label. Beyond this, there are much bigger forces in play redefining the traditional notions of student and university, and so Cobb’s Reformation analogy might apply more to higher education in general than to the state of one particular discipline.

Online discussion of “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum From the Ground Up,” by George Cobb, The American Statistician, 69. Andrew Gelman, Department of Statistics and Department of Political Science, Columbia University, New York. Eric Loken, Department of Human Development and Family Studies, Pennsylvania State University.

As teachers, statisticians have the opportunity to serve broad and evolving populations, including adult workers returning for online masters programs, students taking online courses, and traditional undergraduate and graduate students from across the current university structure. At the undergraduate level, non-statistics majors outnumber statistics majors by a huge factor in introductory courses at many universities. At the graduate level, statisticians again are generally only a small fraction of students taking a reasonably in-depth sequence of statistics courses when one accounts for psychology, political science, sociology, education, nutrition, kinesiology, engineering and so many other departments. It makes sense that training programs will rise up to meet the demand. Penn State’s Department of Human Development and Family Studies, for instance, offers about a dozen courses in linear modeling, experimental design, longitudinal methods, Bayesian methods, data mining, and dynamic systems analysis, and we don’t think the content and teaching of these courses should be described as unhealthy fare, relative to what might be offered in a pure statistics program.

At Harvard, Columbia, and Penn State (to take the three institutions where we teach), undergraduate statistics programs are growing, and the influx is already forcing adaptation and rethinking of curricula. With the process of change well under way, statistics departments will find new ways to serve their own growing student bodies, and also the exponentially larger external market. And this will be achieved by balancing different instructional strategies to meet different demands.

We have the impression that attitudes on statistics education come much more from views about statistics, and personal experiences in the classroom, than from systematic studies of what works and in what context. We admit this regarding our own views (Gelman and Loken, 2012), and we think it’s the case for Cobb as well, given that his article has over 100 references, only one of which addresses empirical research in educational effectiveness.

From a psychological point of view, we can think of our general tendency to understate uncertainty and to discount alternative views; or, from a statistical perspective, we can recognize that effects vary. A teaching style that works well for George Cobb’s students at Mount Holyoke College might not be so effective in the hands of other instructors teaching working adults, or nurses, or MBA students, or sociologists, or political scientists.

©2015 Andrew Gelman and Eric Loken, The American Statistician, Online Discussion

References

Cobb, G. W. (2015), “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum from the Ground Up,” The American Statistician, 69(4), doi:10.1080/00031305.2015.1093029.

Gelman, A., and Loken, E. (2012), “Statisticians: When We Teach, We Don’t Practice What We Preach,” Chance, 25(1), 47–48.

2 Online Discussion: Special Issue on Statistics and the Undergraduate Curriculum


Augmenting the Vocabulary Used to Describe Data

Robert GOULD

This is one of the most exciting papers I have read in a while. George Cobb has, as he has before, clearly identified a challenge to our statistics community that many of us have been aware was lurking somewhere on the margins, but have not seen so clearly until now. There’s much to comment on here, and I expect there will be years of discussion, but I’d like to emphasize two topics mentioned in this paper: data and curricula.

Here was my very first introduction to data as an undergraduate math major: “Let $X_1, X_2, \ldots, X_n$ denote $n$ random variables that have the joint p.d.f. $f(x_1, x_2, \ldots, x_n)$.” (Hogg and Craig, 1978, p. 122). Some of you who enjoyed a similar introduction to data are probably marveling that Hogg and Craig were so far-sighted as to introduce the topic as early as page 122. Today, most data used in examples and homework problems in introductory courses, while represented less abstractly than in my course and possessing, thank goodness, some level of “realness,” are derived from random samples or studies that applied random assignment. When they are not, homework questions often begin “Assume that these are from a random sample” because without that assumption there’s not much we can ask students to do. These probabilistic-culture data, I suggest, represent a very small fraction of data that our students encounter in life and maybe in their careers. This attention to only one type of data in our classrooms risks making our profession insignificant.

Despite being the “science of data,” the statistics classroom has a narrow vocabulary for describing data. Let me expand this vocabulary by two terms. The first, “opportunistic data,” was coined, to the best of my knowledge, by Amy Braverman, a statistician at Jet Propulsion Labs. The second is my own: “algorithmic data.” Algorithmic data are data collected through an algorithm. Sensors collect data algorithmically, for example. The algorithmic trigger might be an occasional event, such as when a sensor detects motion, or might be a semi-continuous event, such as when sensors on a satellite are programmed to collect a stream of measurements. Opportunistic data are often collected by sensors, but more generally are data sets that are collected and await an opportunity for analysis. This category includes large, national databases which continue to foster research for purposes not originally foreseen by those who collected the data. (Think of the NHANES dataset.)

Opportunistic and algorithmic data challenge educators because they do not fit into the inference box; these approaches usually do not produce random samples, and a naive approach can lead to philosophical and scientific mistakes. (For example, see “The Parable of Google Flu: Traps in Big Data Analysis,” Lazer et al. 2014.) And yet these data play a pivotal role in students’ lives and so provide a platform in which the science of data analysis can be introduced to a very wide audience.

Online discussion of “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum From the Ground Up,” by George Cobb, The American Statistician, 69. Robert Gould, UCLA, Department of Statistics, Math Sciences Building, Los Angeles, CA 90095-1554.

When designing curricula, we should keep in mind this motto: Data First. We should design curricula that help all students understand all data, including algorithmic and opportunistic data. As George recommends, we should order topics in the order that best helps them understand data and not because we are supporting a “beautiful structure.” We should exclude topics that do not help students understand data.

I would like to depart from George’s recommendations, though, and urge us to think, when designing curricula, not in terms of semesters, but years. How should students learn about data from Kindergarten through retirement? This question had an easy answer when learning statistics meant learning mathematics. (Answer: wait until they’ve learned calculus.) However, as George points out, many useful and important tools can be understood through algorithms, and are more accessible at younger ages. In addition, educational technology can provide students with experiential access to abstractions such as random samples or repeated sampling, and so many topics now taught in graduate school or in the last months of a bachelor of science program can be introduced much earlier.
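As one illustration of the experiential access to repeated sampling mentioned above, a student might run something like the following minimal sketch. It is an assumption-laden example, not part of any curriculum discussed here: the skewed population, the sample sizes, and the helper name `sample_means` are all hypothetical choices made for illustration.

```python
import random
import statistics


def sample_means(population, sample_size, n_samples=1000, seed=1):
    """Draw repeated simple random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical skewed population (exponential values), assumed for illustration.
_rng = random.Random(0)
population = [_rng.expovariate(1.0) for _ in range(10_000)]

if __name__ == "__main__":
    print("population mean:", round(statistics.fmean(population), 3))
    for size in (5, 50):
        means = sample_means(population, size)
        # The spread of the sample means shrinks as the sample size grows.
        print(size, round(statistics.fmean(means), 3), round(statistics.stdev(means), 3))
```

Comparing the spread of the sample means for the two sample sizes lets students see sampling variability directly, without first deriving it mathematically.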

I have some experience with this first-hand. As the principal investigator of Mobilize, an NSF-funded project dedicated to bringing a “data science” curriculum to high schools, I’ve been struggling with the challenges of helping high school teachers teach their students to find meaning in data that do not belong to the probability culture. Our students use their cell-phones to engage in “participatory sensing campaigns,” a form of algorithmic data collection in which they strive to gain insight into their lives and their communities. The data they collect are rich. They include geocoded locations, dates, photos, text, as well as answers to survey questions that fall into the more mundane categories of categorical and numerical.

From the Mobilize project, I’ve learned a few lessons about designing curricula. The most important: emphasize the statistical investigation process, as outlined in the GAISE K-12 report (Franklin et al. 2007). This consists of four stages: Ask Questions, Examine/Collect data, Analyze, Interpret. Most statistics curricula I’ve seen emphasize only the last two stages. Most high school science curricula emphasize the first two stages. Future citizens need all four stages. This investigation process works well in either of Breiman’s two cultures and it keeps us focused on what matters: understanding of our lives, community, world.

The second important lesson I’ve learned is that engaging students in this cycle is not easy and requires considerable professional development for teachers. Our community needs to engage seriously in the preparation of teachers, not just through hosting workshops, but through changing teacher preparation at the undergraduate, graduate, and credentialing levels. Both science and math teachers are, with some exceptions, frightfully unprepared to teach students to engage meaningfully with data.

“Big Data” and the algorithmic data culture provide a way for us to move forward to reach more students and to reach them through engagement in authentic analysis of data. George is to be applauded for shoving us in the right direction.

©2015 Robert Gould, The American Statistician, Online Discussion

References

Cobb, G. W. (2015), “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum from the Ground Up,” The American Statistician, 69(4), doi:10.1080/00031305.2015.1093029.

Franklin, C., Kader, G., Mewborn, D., Moreno, J., Peck, R., Perry, M., and Scheaffer, R. (2007), “Guidelines for Assessment and Instruction in Statistics Education (GAISE) Report: A Pre-K-12 Curriculum Framework,” [online] The American Statistical Association. Available at www.amstat.org/education/gaise/

Hogg, R., and Craig, A. (1978), Introduction to Mathematical Statistics (4th ed.), London: Macmillan.

Lazer, D., Kennedy, R., King, G., and Vespignani, A. (2014), “Big Data: The Parable of Google Flu: Traps in Big Data Analysis,” Science, 343(6176), 1203–1205. DOI: 10.1126/science.1248506.


Seeking the Niche for Traditional Mathematics within Undergraduate Statistics and Data Science Curricula

John P. HOLCOMB, Linda QUINN, and Thomas SHORT

George Cobb’s wonderful paper stimulates thought regarding the undergraduate statistics curriculum. The historical insights and the use of metaphor are illuminating. The list of references provides an entire seminar course curriculum on seminal work in statistics and how it can be taught.

In our commentary, we focus on the complex role mathematics plays in the undergraduate curriculum as it relates to the teaching of statistical science. In Section 3 of his paper, Cobb presents his case for “How we got stuck: the evolving role of mathematics.” Our experiences in departments of mathematics at Cleveland State University (a comprehensive urban state university of approximately 12,000 undergraduates) and John Carroll University (a private university of approximately 3,000 undergraduates) indicate that many students get “stuck” in their quests to learn more statistics because they are limited by deficient knowledge of traditional mathematics.

At Cleveland State University, we have structured an undergraduate minor in statistics with access points for students from a variety of majors, including psychology and business. Our minor consists of a general introductory course, a second course, and separate courses in regression, design, and consulting. When non-mathematics majors become excited about statistics and wish to take additional coursework, they run into the mathematical wall. The minor at John Carroll University is similar, but differs in that students can count up to two quantitative methods courses within their home major. The statistics minor for students outside of the mathematics major does not include enough mathematical content to support graduate work in most traditional statistics programs. By this we mean programs that need students to have understanding of multivariable calculus or linear algebra in required courses at the masters or Ph.D. level.

For many students who do not take college algebra or calculus in their first year (most likely because they took a more appropriate introductory statistics course as their general education mathematics course), Cobb’s summary of access to statistical ideas becomes a much longer chain of courses:

College Algebra → Trigonometry → Calc I → Calc II → Calc III → Probability → Math Stat

We propose a thought experiment, inspired by Cobb’s three-part triage in Section 2.2 on how we work with data and clients. Whom should we allow to major in statistics (per the “Curriculum Guidelines for Undergraduate Programs in Statistical Science”) or go on to graduate school in statistics?

1. Are you a third-year college student who has already taken up through multivariable calculus and linear algebra? If not, go away.

2. Are you ready, are you willing, and do you have the time before graduation to take calculus, linear algebra, real analysis, and Markov chain probability (and all the prerequisites for those courses)? If not, go away.

3. Do you have the written and oral communication skills that employers repeatedly say that they really want in their employees? If not, that is okay because it is really only mathematical aptitude that matters for admittance into our programs (even though we will do little to help you acquire those needed communication skills).

Online discussion of “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum From the Ground Up,” by George Cobb, The American Statistician, 69. John P. Holcomb, Cleveland State University, 2121 Euclid Avenue, Cleveland, OH 44115. Linda Quinn, Cleveland State University, Mathematics, 2121 Euclid Avenue RT1515, Cleveland, OH 44115. Thomas Short, John Carroll University.

To embrace Cobb’s use of metaphor, we wonder if the mathematical jewels worn on the necklace around the statistician’s neck have turned that jewelry into a noose that is choking access to our field.

We argue that the budget crises facing both state and private colleges and universities should make faculty aware of enrollment figures in their programs. A data-informed culture of higher education administrators is looking very closely at student enrollment data and the numbers of majors in programs. “Program prioritization” is being used across the country to help to determine where resources should flow for growth and where programs should be cut. Even state legislatures are demanding to know how established and successful universities are preparing students for immediate employment upon graduation. Statistics as a program has a tremendous opportunity to attract a great many students. Word is reaching both traditional-age and non-traditional students that there is an abundance of high-paying jobs available for those who possess data skills. Mathematics, statistics, and data science departments cannot afford to turn potential majors away. We believe that there are talented undergraduates who want to explore statistical science, but find their lack of mathematical training a hurdle that cannot be overcome in time to graduate within a reasonable window. Thus, we agree with Cobb’s contention to “flatten prerequisites” at the undergraduate level, but worry that students in these courses will still be too mathematically deficient to enroll in graduate programs in statistics.

On the other hand, undergraduate mathematics programs are stocked with large numbers of mathematically talented students who need to find employment upon graduation. Inviting these students to take more statistics courses in order to broaden their skill sets will provide them with an easier path to employment and is one way to help to meet the high demand for data workers. We believe that we need to caution our mathematical colleagues that encouraging their students to go on to graduate school in mathematics might be a disservice. The popular goals of becoming a community college instructor with a master’s degree in pure mathematics or earning a tenure-track position at a college or university with a Ph.D. in pure mathematics are very difficult to attain. Recently, John Carroll University had approximately 500 applications for a single tenure-track position in mathematics where the area of specialization was not specified.

©2015 John P. Holcomb, Linda Quinn, and Thomas Short, The American Statistician, Online Discussion

Statistics has the potential to help bridge this mismatch of highly talented students and employment needs in government, industry, and academia. We in statistics should advocate to our mathematical colleagues that advising students to pursue a double major in mathematics and statistics (or some major/minor combination in mathematics, statistics, and computer science) ultimately serves the students better than earning graduate degrees in mathematics. This will not be easy, as so much of personal identification is tied up in an individual’s focus of study.

We leave it to others to debate whether the undergraduate curriculum needs to be a “tear down” as Cobb suggests, but we would argue that at the very least, our statistical house needs an additional wing. We need coursework and curricula that invite students who begin in other majors (including mathematics!) to acquire data analysis skills that in turn provide an avenue to high-paying and satisfying careers.


The Gap Between Statistics Education and Statistical Practice

Robert E. KASS

As I write this response to George Cobb’s call to rebuild the statistics curriculum, I am returning from a symposium, “Statistics in the 21st Century,” aimed at helping to define goals of a new center for statistics at MIT (which has been an outlier among premier U.S. universities in not having a statistics department). To me, the most striking aspect of the symposium was the consistency among its speakers in their admiration for the discipline of statistics, which focuses on the foundation of science and engineering: the use of data to provide information about the world. Maintaining this foundation as technology advances is a noble endeavor and, in the past few years, partly due to the advent of Data Science and Big Data, the importance of statistics has become much more widely appreciated.

The teaching of statistics has evolved more slowly than statistical practice. In diagnosing the problem with undergraduate statistics education, Cobb returns to Leo Breiman’s “two cultures” article and makes some important points. I completely agree with him when, consistently with Breiman’s earlier sentiment, Cobb warns against ceding to others “all methods of analysis that do not rely on a probability model.” Tukey’s profoundly important emphasis on the distinction between exploratory and confirmatory (inferential) methods, including the corruption of operating characteristics due to exploratory preprocessing, remains central to modern statistics. Furthermore, Cobb rightly suggests that computation should play a big role throughout the curriculum.

In Brown and Kass (2009, B&K hereafter), after criticizing our profession’s lag in adapting training programs to contemporary statistical sensibility, we tried to move things forward by focusing on the highest level goal: to help students think more like expert statisticians. Our understanding of the way statisticians think was based on our experience in neuroscience. We said, “In the course of perusing many, many articles over the years . . . we have found ourselves critical of much published work [in neuroscience]. Starting with vague intuitions, particular algorithms are concocted and applied, from which strong scientific statements are made. Our reaction is too frequently negative: we are dubious of the value of the approach, believing alternatives to be much preferable; or we may concede that a particular method might possibly be a good one, but the authors have done nothing to indicate that it performs well. In specific settings, we often come to the opinion that the science would advance more quickly if the problems were formulated differently, formulated in a manner more familiar to trained statisticians.” We asked ourselves, What is it that differentiates expert statisticians from other mathematically and computationally sophisticated data analysts? Our conclusion was that, roughly speaking, “statistical thinking uses probabilistic descriptions of variability in (1) inductive reasoning and (2) analysis of procedures for data collection, prediction, and scientific inference.” It was not our intention to confine statistical education to those topics that involve statistical models, and I again agree with Cobb (as we argued also in B&K) that there is too much emphasis on the subtleties of mathematics-based statistical logic in many statistics courses. However, I would not back off the B&K formulation of what differentiates statistical approaches to data analysis, and I continue to advocate it as an overarching guide when considering what curricula can accomplish. In fact, my continuing experience as an active member of the Machine Learning Department in Carnegie Mellon’s School of Computer Science has only strengthened my conviction on this point, and deepened my feeling about Breiman’s article: Breiman coupled some valid concerns with the bad advice that we should all think much more like 20th century practitioners of artificial intelligence.

Online discussion of “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum From the Ground Up,” by George Cobb, The American Statistician, 69. Robert E. Kass is Professor, Carnegie Mellon University, Department of Statistics, Machine Learning Department, and Center for the Neural Basis of Cognition, 5000 Forbes Ave., Pittsburgh, PA 15213. Support for this work was provided by NIMH grant R01 064537.

An anecdote may be helpful. Some years ago, in the process leading up to Carnegie Mellon’s creation of its Machine Learning Department, from the outset a joint enterprise of statistics and computer science, we held a retreat to explore shared interests and develop a vision. At one point, a computer science colleague said, “I’ve figured out the difference between statisticians and computer scientists: statisticians attack problems with 10 parameters and want to get it right; computer scientists attack problems with 10 million parameters and want to get an answer.” This was a telling remark. Yet, in the intervening time, the two perspectives have largely merged, as we are all trying to do the best job we can with very large data sets, and complex models; in fact, the statistical perspective has been largely victorious in the sense of being fully integrated into every major machine learning conference and journal. At Carnegie Mellon, our Department of Statistics has incorporated computation extensively across our undergraduate offerings, as well as requiring students to engage with real, complex data sets, and we have just started a new major in statistical machine learning. I will urge my colleagues to distribute details about their laudable efforts.

Cobb is concerned exclusively with the undergraduate curriculum. But the biggest challenge in statistics education arises from the difficulty humans have in accepting ambiguity and acting reasonably in the presence of uncertainty. Together with cognitive psychologists, we should devise educational strategies for helping people grapple with this predicament, beginning at an early age. In addition, we should recognize the extraordinary expectations we place on those who teach elementary statistics, especially in high school. To teach the process of “thinking with data,” one must not only comprehend the basics of statistical reasoning (which is notoriously difficult) but also have some experience with the way such reasoning is used in drawing conclusions from data analysis. I fear we have not penetrated far into schools of education, where teachers are trained, and I hope we can find creative ways to do better in the future.

©2015 Robert E. Kass, The American Statistician, Online Discussion

I presume this special issue will offer many constructive suggestions for advancing statistics education, which is a very good way for committed teachers to share ideas. I am less clear about the impact of the hand-wringing by both B&K and Cobb: we wrote, “The concerns we have articulated above are not minor matters to be addressed by incremental improvement. Rather, they represent deep deficiencies requiring immediate attention.” And Cobb frames his plea for reform with the “tear-down” metaphor. My guess is that, despite our undeniably compelling arguments, which undoubtedly convinced the vast readership of these articles, change across the country as a whole will continue to evolve incrementally, and often more slowly than at institutions such as Carnegie Mellon (where our Department of Statistics has a pretty unified view of our teaching mission, a substantial campus presence, and a great deal of autonomy within our institution). I strongly endorse efforts to create modern, forward-looking online materials that can be used by statistics teachers everywhere. Meanwhile, those of us in Ph.D.-granting departments must remain vigilant as we train the students who will populate diverse environments, and will shape statistics education in the future.

References

Breiman, L. (2001), “Statistical Modeling: The Two Cultures,” Statistical Science, 16(3), 199–231.

Brown, E. N., and Kass, R. E. (2009), “What is Statistics?” (with discussion), The American Statistician, 63, 105–123.

Cobb, G. W. (2015), “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum from the Ground Up,” The American Statistician, 69(4), doi:10.1080/00031305.2015.1093029.


Training the Next Generation of Statistical Scientists

Eileen C. KING

George Cobb presents a provocative paper titled “Mere Renovation is Too Little Too Late: We Need to Rethink our Undergraduate Curriculum From the Ground Up.” He discusses the need for a total revamp to the undergraduate program. Essentials to this renovation of the curriculum include the following: “exploit context, embrace computation, seek depth, flatten prerequisites, and teach through research.”

I have been a statistician involved in team science since the early 1980s and have seen a major evolution of our profession. As a manager in industry, I determined the correct mix of educational training for our employees which included BS/BA, MS, and PhD prepared statisticians. At any point in time, approximately 30% of our employees were BS/BA prepared, many of whom continued their education while still working. The BS/BA statisticians would be very active and highly regarded members of multi-functional project teams and collaborated with data managers, physicians, senior statisticians and other team members on study design, data collection and cleaning, data analysis, and data presentation and interpretation.

In order to prepare our undergraduates to contribute to their maximum potential, a new approach to the educational process needs to be undertaken. In the sections that follow, I propose one method for preparing undergraduate students for a rewarding career as a statistical scientist that incorporates the imperatives laid out by Dr. Cobb and is in close alignment with the recently issued “ASA Guidelines for Undergraduate Programs in Statistical Sciences” (http://www.amstat.org/education/curriculumguidelines.cfm). I also discuss the need for providing opportunities for undergraduates to advance their educational training throughout their career by having access to top statistics programs without the need to relocate or leave their current position.

Training the BS/BA Statistical Scientists

Recruitment of top talent needs to happen at the high school level and within the AP calculus, statistics, computer science, and science classes. These students enter their post-high school educational process well prepared to become successful statistical scientists. Students expressing an interest in statistical science as an undergraduate major should be required to identify 1–2 areas of collaboration by the end of their first semester, e.g., biology, psychology, medicine, business. To facilitate this requirement, the following two-step process is recommended: 1) each student would be required to attend a seminar series with speakers from other divisions who present on their research so the student can identify a collaborating division, and 2) each student would write a proposal and make a presentation on his/her interest in working with the specified division. This requirement provides the first of many opportunities for the development of effective writing and presentation skills.

Online discussion of “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum From the Ground Up,” by George Cobb, The American Statistician, 69. Eileen C. King, Research Associate Professor, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH.

At the beginning of the second semester, students would then be paired with an allied department for their collaboration experience, whereby they will be involved in working on projects with researchers in those areas for their entire undergraduate program (Cobb: “Exploit context: use research as a vehicle for teaching statistics”). Students could have the opportunity to change areas after their first two years, but it is important that students learn the scientific area in which they consult so that they can be effective. A faculty member and/or advanced MS or PhD student within the statistics department must provide technical oversight and training to the undergraduate student. Based on this initial assignment, the curriculum for the student can be developed; hence, “context dictates content” [Cobb]. For example, if the collaboration requires knowledge of approaches for analyzing “big data” such as in the business world or informatics, then courses specific to this area can be taken (e.g., multivariate statistics, exploratory data analysis, data visualization).

Of course, all students will need to take a set of core courses including statistical theory. However, the method, prerequisites and timing of these courses within the curriculum should take into account the individual student’s progression through the research experience [Cobb: flatten prerequisites]. This approach will provide the context for learning [Cobb: “practice usually leads and theory follows”]. Technology today allows for intuitive ways of learning important theoretical concepts through simulation and algorithmic methods as opposed to “formulas that you can easily compute by hand” [Cobb: Embrace computation]. For example, concepts of p-values and confidence intervals are still necessary for making inferences, especially since our scientific colleagues rely on them (although I often question if they understand them), but the approach for constructing and interpreting them can be made more intuitive [Cobb: Seek depth]. Sample size calculations through simulations should be a mainstay. Undergraduates must know statistical programming languages very well (e.g., SAS, R) to be marketable. Structured programming skills, and the ability to produce well-documented and validated programs that lead to reproducible results, are imperative. Undergraduate students must also master the “soft skills” such as good oral and written communication skills, effective teamwork and collaboration, and the capability of making effective presentations. The proposed program will provide ample opportunities for development of these skills.
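To make the idea of a simulation-based sample size calculation concrete, here is a minimal Python sketch. It is an illustration only, not part of the proposal: the function names, the standardized effect of 0.5, the normal data model, and the large-sample cutoff of 1.96 are all assumptions chosen for the example.

```python
import random
import statistics


def simulated_power(n_per_group, effect, sd=1.0, n_sims=2000, z_crit=1.96, seed=0):
    """Estimate the power of a two-group mean comparison by simulation.

    Assumptions (illustrative): normal data, equal group sizes, and a
    large-sample z cutoff instead of an exact t quantile.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        treated = [rng.gauss(effect, sd) for _ in range(n_per_group)]
        # Standard error of the difference in sample means.
        se = (statistics.variance(control) / n_per_group
              + statistics.variance(treated) / n_per_group) ** 0.5
        z = (statistics.fmean(treated) - statistics.fmean(control)) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_sims


def smallest_n(effect, target=0.80, candidates=range(10, 201, 10), **kwargs):
    """Return the first candidate group size whose simulated power meets the target."""
    for n in candidates:
        if simulated_power(n, effect, **kwargs) >= target:
            return n
    return None


if __name__ == "__main__":
    print("Suggested n per group:", smallest_n(effect=0.5))
```

The same loop adapts to designs with no convenient formula: only the data-generating lines and the test statistic need to change, which is what makes the approach attractive for teaching.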

Internships after the sophomore and junior years should be available and strongly encouraged for the students pursuing a BS/BA in statistics. The proposed program structure provides early opportunities to develop the skills required for effective collaboration within multi-functional teams. As a result, students will be prepared for these internships and will add value for the sponsoring organizations.

© 2015 Eileen C. King, The American Statistician, Online Discussion

Challenges to Implementation

I recognize there will be many challenges to this approach. First, in order to provide the opportunity for students to pair with a department upon entry into the program, the college and/or university must have access to active research programs within or external to their institution. Another challenge is the availability of faculty within the statistics departments to provide oversight to these undergraduate students. The faculty member must be proficient in collaborative research with other departments in order to be an effective mentor for these students. The biggest challenge will be the willingness of other departments to provide research opportunities for undergraduate statistics students. These departments must see the reward of participation through effective and value-added statistical collaborations that lead to publication and increased funding.

Attracting top talent that can develop into highly effective statistical collaborators is a further challenge. It is my perception that students typically enter their undergraduate programs with the desire to “change the world.” It is important that these students see the impact they can make as a statistical scientist on producing high quality research that can, in fact, “change the world” for the better. They must experience “team science” from the beginning of their educational program.

Supporting the Career of the BS/BA Trained Statistical Scientist

Part of the renovation should also include consideration of the entire professional career for BS/BA trained statistical scientists, including opportunities for obtaining advanced degrees from highly regarded programs. It is common that these individuals will, after a few years, want to further their education without leaving a position that is fulfilling, provides a good income, and oftentimes provides money for advanced education. There are many opportunities through distance learning for students to obtain an MS degree. However, there are few, if any, options available for students who wish to continue the educational process by obtaining a PhD degree from a top program without relocating to the specific university. Today’s technology allows for synchronous and asynchronous learning and effective collaborations with thesis advisors at all levels of the educational process.

Conclusion

The curriculum for undergraduates in statistical science must be renovated. I have proposed one approach that augments Dr. Cobb’s discussion with the aim of attracting top talent into our programs and producing well-prepared and “in demand” BS/BA graduates.


Stirring the Curricular Pot Once Again

Sharon J. LANE-GETAZ

A Look Back, Then Forward

Rethinking the Introductory Course

I had high hopes of a major upheaval in statistics education after George Cobb’s 2005 USCOTS talk and paper (Cobb, 2007). I naively believed that statistics educators would find new insights for teaching statistical concepts as they integrated technology more routinely into their teaching. I believed that introductory students would embrace a randomization-based curriculum to deepen inferential understanding. I believed that teaching and learning randomization-based inference would promote statistical thinking and would be an impetus to transform our overall approach to introductory statistics education. It would just take some time.

What I see a decade later is that a dedicated minority of statistics educators embraced the development and teaching of randomization-based inference and simulation (e.g., Chance and Rossman, 2006; Lock, Lock, Morgan, Lock, and Lock, 2013; Tintle, Chance, Cobb, Rossman, Roy, Swanson, and VanderStoep, 2015; West, 2009). However, the vast majority of statistics educators seem skeptical. Some teachers report having tried a randomization or simulation demonstration or two. Others inserted a few class activities as well. Such minor changes are not likely to produce a lasting, measurable change in students’ inferential reasoning. And the research comparing learning outcomes from randomization-based courses to the normal-based status quo has enough limitations and confounding factors that the skeptic can remain unconvinced.
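For readers unfamiliar with what a randomization-based approach looks like in practice, the following minimal Python sketch shows a two-group randomization (permutation) test. It is offered as an illustration under stated assumptions: the function name is hypothetical, the data in the usage lines are invented, and the test statistic (absolute difference in means) is only one of several reasonable choices.

```python
import random
import statistics


def permutation_p_value(group_a, group_b, n_shuffles=5000, seed=0):
    """Approximate two-sided p-value for a difference in group means.

    The group labels are reshuffled many times; the p-value is the share of
    reshuffles whose absolute mean difference is at least as large as the
    observed one (with the usual +1 correction for the observed arrangement).
    """
    rng = random.Random(seed)
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return (extreme + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    # Invented scores for two hypothetical class sections.
    section_1 = [72, 85, 90, 66, 78, 88]
    section_2 = [70, 65, 74, 60, 69, 71]
    print(permutation_p_value(section_1, section_2))
```

The logic requires no reference distribution: students can read the p-value directly as "how often would shuffling alone produce a difference this large."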

Rethinking the Entire Undergraduate Curriculum

Having been a proponent of randomization-based methods, I was at first taken aback by Cobb’s newest “shaking of his finger.” I took a deep breath. I reread one of Cobb’s recent papers (Cobb, 2011) and reminded myself that he has been calling for curricular change on a regular basis (Cobb, 2011, 2007, 2000, 1992; Cobb and Moore, 1997). He has proposed that we teach inference using the randomization test (Cobb, 2007) for some and start with statistical modeling for the more mathematically minded student (à la Kaplan, 2012). He seems to periodically stir the pot to challenge us to better teaching and learning of statistics. “We can advance the cause of statistics teaching and learning by identifying and questioning unexamined assumptions about what we do, why we do it, and when we do it” (Cobb, 2011, p. 31).

Online discussion of “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum From the Ground Up,” by George Cobb, The American Statistician, 69. Sharon J. Lane-Getaz is Associate Professor, Department of Mathematics, Statistics and Computer Science and the Dept. of Education, St. Olaf College.

Cobb now asks statistics educators to take a more dramatic step, to think outside of our comfortable curricular box. Think creatively about how we can better prepare our students to wrangle the truth from data. Rethink our curriculum in order to make room for:

• algorithmic and computational techniques,

• data science methods,

• Bayesian inference, and

• an authentic research experience.

Cobb is challenging us to admit that not all outcomes are equiprobable; many statistical questions cannot be modeled with a known reference distribution; and all probabilities are conditional. And to do something about it!

One Alternate Path

At St. Olaf we have experimented with teaching topics outside the more traditional sequence (i.e., outside of z-tests, t-tests, multiple regression, logistic regression, etc.). Our intermediate course for social science research includes topics that students might encounter in graduate school (Lane-Getaz, 2012). The course is designed to meet the preparedness of students who have only taken our statistical literacy service course. The syllabus includes four weeks of ANOVA and ANCOVA methods (one-way, two-way and interaction), two weeks of measurement topics (scale development, reliability, validity, bias and discrimination) and six weeks of dimension reduction analyses: principal components analysis (PCA) and exploratory factor analysis (EFA). During the final three weeks students choose, develop and present an activity-based lesson to the class. After the inaugural course offering, one student wrote that her favorite aspect of the course was “learning PCA and EFA. I had never done anything like this before, and I was really excited to learn it because it is so applicable in psychology research! I think we learned it in a way that made it comprehensible at the undergraduate level” (Lane-Getaz, 2012).

Fall of 2015 is the third scheduled offering of the course. After reading Cobb’s article, I ask “Was I bold enough?”

I am reminded of a bright student in Statistical Modeling, the foundational course for statistics concentrators. This student proposed to do his final project using decision tree classification. I was hesitant. The course topics were multiple and logistic regression. Besides, how would I evaluate his work? Despite my misgivings, his final presentation to the class was sound, easy to follow and quite extensive, including animated colorful graphics. Regretfully, his final exam was less than stellar. He hadn’t attended to learning the intended course topics. To this day I feel that he deserved a better grade than he earned, mathematically. The student had followed his curiosity, taught himself a new procedure, and introduced the class to classification. The topic was an accessible, useful alternative to logistic regression for his dataset.

© 2015 Sharon J. Lane-Getaz, The American Statistician, Online Discussion


This story points out how we, as statistics educators, need to join our students in the dance of curiosity and experiment. We need to let go of our fear of change and the unknown. We need to remind ourselves that teaching and learning is a never-ending process for the students and for us. If we get stuck in a rut of teaching the same things we learned, the same way we learned them, and assessing the same way we always have, then we, and the field of statistics, will be left behind.

Cobb’s Five Imperatives

With that in mind, I wish to applaud Cobb’s five summative imperatives that might guide our re-thinking of undergraduate statistics curricula:

1. Flatten the pre-requisites. Mathematics prerequisites serve as a barrier to entry to non-mathematical but bright thinkers. Many of the students in our introductory level service course are good critical thinkers but are not well prepared mathematically. Of these students, a small number do choose to take a second course in statistics, our intermediate level statistics course for social science research (Lane-Getaz, 2012). These students have an avenue to deepen their statistical thinking and gain confidence in their statistical ability. The course is a step up for those who plan to attend graduate school in statistics or a related field.

2. Embrace algorithmic/computational thinking. My ambitious statistical modeling student serves as an example to me to be fearless. A logical, step-by-step, computational approach to a problem is valid when the data do not conform to our standard paradigm: (1) same data source, (2) row and column format, (3) probability model fits. Our data-driven society demands that we grapple with these new types of data and that we make this new content accessible to a broader array of student interests and needs (Horton, 2015). [Flashing back to my story and my new course, could I find a place in my new course for decision tree classification?]

3. Seek depth. In the tradeoff between depth and breadth, we teachers tend to cling to breadth. But students may better remember the deeper experiences. For example, we teach ANOVA designs and analysis using data that students have collected from a day of launching gummy bears (adapted from Cobb and Miao, 1998). The first day of class is dedicated to time-consuming data collection. It is well worth it. Students are actively engaged, rolling up their sleeves in teams, and discussing design issues to the degree they can, from day one. For homework they are asked to start organizing their data to compare launch distances for the various conditions. They are primed to learn two-way ANOVA, blocking, main effects and interaction. They remember.

4. Exploit context. As a mathematics undergraduate, I saw context as ancillary to the problem, even bothersome. Cobb (2015) reminds us “in applied data analysis context provides meaning.” This statement is blatantly obvious to my social science students. In fact, they are more comfortable dealing with issues related to the context of our case studies than they are with the statistical issues. Students in our introductory level service course tend to take on big, controversial questions for their final class projects. They typically analyze data from the General Social Survey and Youth Risk Behavior Survey, among others. Recent topics explored relationships between: Gender, Alcohol and Depression; Mental Health, Drugs and Physical Activity; Racial Discrimination in Employment; and Drugs, Mental Health and Sexual Behavior. Context motivates.

5. Teach through research. Similarly, authentic research experiences motivate and teach students what statisticians really do. The expanded Center for Interdisciplinary Research (eCIR) has proved to be a great maturation ground for our statistics concentrators (Legler, Roback, Zieglar-Graham, Scott, Lane-Getaz, and Richey, 2010). The eCIR lab promotes creative approaches to data analysis, in collaboration with faculty from across the college. These research collaborations foster closer relationships among the eCIR fellows (our students), between the fellows and faculty, and among the faculty as well. Most importantly, the eCIR fellows are inspired to pursue additional statistical studies.

Conclusion

“For every action there is an equal and opposite reaction.” If Newton’s law applies, we can expect a big reaction to Cobb’s call for curricular change. We need to temper this reaction and heed the call to rethink our content and our teaching. We need to promote the statistical thinking required to analyze today’s data. The modern students sitting in our classrooms now have dramatically different data-related experiences than in years past (Gould, 2010). With the heightened expectations of the modern student, our traditional courses are sure to disappoint. Rethinking the undergraduate curriculum is imperative. Cobb has laid out some essential ingredients for our consideration as we stir the curricular pot once again.

References

Chance, B. L., and Rossman, A. J. (2006), Introduction to Statistical Concepts, Applications and Methods, Belmont, CA: Thomson/Brooks/Cole.

Cobb, G. W. (1992), “Teaching Statistics,” in Heeding the Call for Change: Suggestions for Curricular Action, ed. L. A. Steen, MAA Notes (Vol. 22), Washington, DC: Mathematical Association of America, pp. 3–43.

Cobb, G. W. (2000), “Teaching Statistics: More Data, Less Lecturing,” in Resources for Undergraduate Instructors: Teaching Statistics, ed. T. L. Moore, MAA Notes (Vol. 52), Washington, DC: Mathematical Association of America, pp. 3–8.

Cobb, G. W. (2007), “The Introductory Statistics Course: A Ptolemaic Curriculum?,” Technology Innovations in Statistics Education, 1(1). Available online at http://repositories.cdlib.org/uclastat/cts/tise/.

Cobb, G. W. (2011), “Teaching Statistics: Some Important Tensions,” Chilean Journal of Statistics, 2(1), 31–62. Available online at http://chjs.deuv.cl/Vol2N1/ChJS-02-01-03.pdf.

Cobb, G. W. (2015), “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum from the Ground Up,” The American Statistician, 69(4), doi:10.1080/00031305.2015.1093029.


Cobb, G. W., and Miao, W. (1998), “Bears in Space: Activities to Introduce Basic Ideas of Design,” in Proceedings of the Fifth International Conference on Teaching Statistics, ed. L. Pereira-Mendoza, Voorburg, the Netherlands: ISI.

Cobb, G. W., and Moore, D. S. (1997), “Mathematics, Statistics, and Teaching,” American Mathematical Monthly, 104(9), 801–823.

Gould, R. (2010), “Statistics and the Modern Student,” International Statistical Review, 78(2), 297–314.

Horton, N. J. (2015), “Challenges and Opportunities for Statistics and Statistical Education: Looking Back, Looking Forward,” The American Statistician, 69(2), 138–145.

Kaplan, D. T. (2012), Statistical Modeling: A Fresh Approach (2nd ed.), Charleston, SC: CreateSpace Independent Publishing.

Lane-Getaz, S. J. (2010), “Linking the Randomization Test to Reasoning about p-values and Statistical Significance,” in Proceedings of the International Conference on Teaching Statistics (ICOTS-8), Ljubljana, Slovenia, July 11–16, 2010. Available online at http://iase-web.org/documents/papers/icots8/ICOTS8_C210_LANEGETAZ.pdf.

Lane-Getaz, S. J. (2012), “Developing a Second Course in Statistics for Students in the Social Sciences,” in Proceedings, Statistical Education Section, Alexandria, VA: American Statistical Association, pp. 2757–2763.

Legler, J., Roback, P., Zieglar-Graham, K., Scott, J., Lane-Getaz, S., and Richey, M. (2010), “A Model for an Interdisciplinary Undergraduate Research Program,” The American Statistician, 64, 59–69.

Lock, R. H., Lock, P. F., Morgan, K. L., Lock, E. F., and Lock, D. F. (2013), Statistics: Unlocking the Power of Data, Hoboken, NJ: Wiley.

Tintle, N., Chance, B., Cobb, G., Rossman, A., Roy, S., Swanson, T., and VanderStoep, J. (2015), Introduction to Statistical Investigations, Hoboken, NJ: Wiley.

West, R. W. (2009), “Social Data Analysis with StatCrunch: Benefits to Statistical Education,” Technology Innovations in Statistics Education, 3(1). Available online at http://escholarship.org/uc/item/67j8j18s.


Authentic Data Analysis Experience

Duncan TEMPLE LANG

I’d like to applaud and thank George for a very stimulating and entertaining paper, and the challenge to reinvent the statistics undergraduate curriculum. I am very hopeful that it will lead to real discussions, experimentation and, importantly, significant changes.

Do we need such radical changes to our undergraduate curriculum? To answer this, I hope every instructor will seriously reflect on whether their graduates have the capabilities to perform good data analyses, and whether we could be doing substantially better in this regard. My focus here is on data analysis, broader than statistics, as that is what the majority will do with what they have learned, and is increasingly in high demand.

In my opinion, most of our students are not prepared for data analysis after their statistics major. My explanation is simple: they have done very little actual data analysis. Instead, they have learned methods, solved homework problems corresponding to the method taught that week, and perhaps done a project or a single capstone course. They think there is “one correct answer” and that the data analysis process starts by being told to fit a model or perform a hypothesis test, and ends by reporting the parameter estimates or a p-value. For a model, they may have generated the expected diagnostic plots, but not necessarily have really interpreted them.

Many students enjoy the mathematics, and others lose sight of the “why” of the methods due to the mathematical and computational details. Many see only the deterministic aspects of the methods, and not the variability in the data and the approximations and rough precision needed for the insights and qualitatively solving real problems. They get drawn into the details of a test, only to forget to ask if the observations form the population or a sample, or are dependent. The statistical methods are important elements of data analysis, but there is so much more to the data analysis process than these methods and we don’t spend much time teaching these other components. Exploring data is something they may feel compelled to do because that is “recommended”, but is an obligation before the “real statistics” are done. Indeed, George says we teach EDA in a first course, but typically, this is labeled “Graphical Summaries” or “Descriptive Statistics” in textbooks. Unfortunately, the very important steps of cleaning and exploring of both the data and the problem are not emphasized as being essential parts of the data analysis process.

While a laudable goal is statistical reasoning, students need to develop, at least, a sense of intelligent data analysis, the ability to frame a data analysis question and identify the goals, and the skills to express the necessary computations and create graphical displays. They develop these by repeating the process multiple times, not just once in a capstone course. They must learn the process by first watching how data analysts actually work, via guided case studies. The ideas and motivation of common methods can be introduced at this point without the details. This is, as many have written, quite a different learning experience from presenting a long list of methods and their underpinnings that we typically teach in courses without the actual data analysis context and connecting the thinking to the question. The NSF-funded Explorations in Statistical Research (ESR) workshops (Nolan and Temple Lang 2015) that Deborah Nolan (UC Berkeley) and I organized exposed students to the data analysis process. While very short, they illustrate the need and potential for quarter/semester-long immersion with real data analysis.

Online discussion of “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum From the Ground Up,” by George Cobb, The American Statistician, 69. Duncan Temple Lang, University of California, Davis.

Students enjoy exploring data when they understand what is being measured (e.g., cost of apartments in different cities, car traffic patterns, airline delays, climate change, social network patterns). Students can acquire reasonable computational skills by exploring data. These are the skills they will need to process and analyze data. Given these computational skills, they can then simulate data and explore the characteristics of statistical methods. This can augment, or substitute for, the mathematical understanding for different students.
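As one illustration of simulating data to explore the characteristics of a method, the minimal Python sketch below estimates how often a textbook large-sample 95% interval for a mean actually covers the true mean when the data are skewed. The function name, the exponential population, and the sample sizes are assumptions made for the example; they do not come from the discussion itself.

```python
import random
import statistics


def interval_coverage(n, n_sims=2000, level=0.95, seed=0):
    """Estimate the coverage of the interval mean +/- z * s / sqrt(n).

    Data are drawn from an Exponential(rate=1) population (true mean 1.0),
    a deliberately skewed choice made for illustration.
    """
    rng = random.Random(seed)
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    true_mean = 1.0
    hits = 0
    for _ in range(n_sims):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        centre = statistics.fmean(sample)
        half_width = z * statistics.stdev(sample) / n ** 0.5
        if centre - half_width <= true_mean <= centre + half_width:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    for n in (5, 30, 200):
        print(n, interval_coverage(n))
```

Running the loop over several sample sizes lets students see for themselves that the nominal level is only approached as n grows, a property usually stated rather than observed.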

On reading George’s paper, I was led to Friedman’s 1997 article on Data Mining and Statistics (Friedman 1997), which led me back to Tukey’s “The Future of Data Analysis” (Tukey 1962), and of course to Brown and Kass (2009), and other historical calls for changes to the curriculum. We should reflect on these papers and see how much has changed. Indeed, George writes about adoption of Bayesian statistics: “Statisticians read the arguments, followed the proofs, nodded in agreement, and continued in their pursuit of incoherence.” All of these calls for change are important, and I believe statistics education is slowly changing. However, it is too slow and there are many other fields that are providing alternatives, and for teaching the actual practice of data analysis, this may be a good thing.

I recently became the director of the Data Sciences Initiative at University of California, Davis. While there are many logistical challenges, I feel liberated and less constrained. We have an opportunity to develop programs from the ground up, as George is encouraging. There are many constituents hungry for Data Sciences education and research skills that span the entire data pipeline, including data identification, acquisition, cleaning, exploration, visualization, analysis and dissemination of insights. This demand for data pipeline knowledge is something we should embrace. We should work with various different fields (both consumers and producers of data sciences methods) to create students with the essential fundamentals and problem solving capabilities needed in data science. Data science requires data analysis, computational reasoning, and actual experience and practice.

Deciding if and how we should change the curriculum involves clearly articulating and prioritizing our goals. For me, the abilities to be creative and independent, to solve problems, to work in a team, to map ideas into computations and results, and, most importantly, to make sense of data are important for the majority of our students. Whether this involves more or less mathematics, computing, methods, ... is up for debate and, importantly, experimentation and evaluation. Learning “just-in-time” or “on-demand” is an important skill for problem solving, and can help the students escape the “multiple-choice/one-correct-answer” mindset into which they are led from high school through college. A mode of teaching that leads students to “discover” traditional statistical methods, rather than just being told them, will be much richer.

© 2015 Duncan Temple Lang, The American Statistician, Online Discussion

George’s suggestion of a “Teacher’s corner” that presents innovations and experiences in teaching is a good idea. Helping instructors to develop and share case studies and projects would also be very useful. Having these recognized as scholarly contributions could help. While change in the curriculum is hopefully proceeding, we can act more quickly by having students participate in data analysis challenges. These might be as short as the weekend-long DataFest™ that has emerged through UCLA, the ASA, and others, or 6–7 week long ongoing problem solving activities centered around real problems (e.g., extracurricular team-based data analysis competitions).

Let’s embrace the new opportunities that data science presents and include Statistics in the next generation of data analysis.

References

Brown, E. N., and Kass, R. E. (2009), “What is Statistics?” (with discussion), The American Statistician, 63, 105–123.

Cobb, G. W. (2015), “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum from the Ground Up,” The American Statistician, 69(4), doi:10.1080/00031305.2015.1093029.

Friedman, J. H. (1997), “Data Mining and Statistics: What’s the Connection,” in Proceedings of the 29th Symposium on the Interface Between Computer Science and Statistics.

Nolan, D., and Temple Lang, D. (2015), “Explorations in Statistics Research: An Approach to Expose Undergraduates to Authentic Data Analysis,” The American Statistician.

Tukey, J. W. (1962), “The Future of Data Analysis,” Annals of Mathematical Statistics, 33(1), 1–67.


Vision or Bad Dream?

William I. NOTZ

And afterward, I will pour out my Spirit on all people. Your sons and daughters will prophesy, your old men will dream dreams, your young men will see visions. (Joel 2:28)

I thank George Cobb for a thought-provoking and prophetic paper. The following are a few thoughts that occurred to me as I read the paper.

To begin, by the undergraduate curriculum I mean the entire body of courses and programs we offer for undergraduates, from introductory service courses to programs for majors. By statistics department I mean any department that employs faculty with advanced degrees in statistics to teach courses and offer programs in statistics. I will use “the science of data” (a phrase used by Moore (1992)) to include bioinformatics, data mining, data analytics, data science, and big data. Not everyone agrees that these are part of statistics and I use the terms “statistics” and “the science of data” to emphasize this discrepancy.

1. Arguments Concerning Change

1.1 Arguments Against Rethinking our Curriculum from the Ground Up?

• Most departments already periodically rethink (and even formally assess) their curriculum. Are we really in dire straits?

• Graduate programs and employers expect a basic level of competency, but prefer to provide training in additional skills. Perhaps all we need do is provide a basic level of competency, something far less ambitious than what George Cobb proposes.

• We teach (but do not always practice) that extrapolation from the present into the future is dangerous. Will changes we make now be outdated tomorrow?

If statisticians taught all courses in the science of data, George Cobb’s case might be less compelling. Unfortunately, the growing popularity of the science of data and the increase in courses taught by faculty not trained as statisticians make this both the best of times and the worst of times for statistics departments.

1.2 The Best of Times?

Reports about the science of data appear regularly in the media. Success stories abound. The “internet of everything” will generate massive amounts of data to be mined. The science of data is suddenly “sexy.” Students are flocking to our courses and majors. STEM initiatives and employers seeking people with skills in the science of data increase the chances of funding for statistics departments. Responding to this growing interest in and demand for people who can extract information from data will force us to reexamine our undergraduate curriculum.

Online discussion of “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum From the Ground Up,” by George Cobb, The American Statistician, 69. William I. Notz is Professor, Department of Statistics at The Ohio State University, 1958 Neil Avenue, Columbus, OH 43210-1247.

1.3 The Worst of Times?

There is a great deal of confusion about exactly what is the science of data. Researchers in many disciplines claim expertise, and hence the right to offer courses and programs in the science of data. Has Breiman’s stochastic culture dominated the way we teach and practice statistics, so that Breiman’s algorithmic culture is not regarded by outsiders as statistics? If so, it is not surprising that others do not believe they are encroaching on our turf. To quote George Cobb, “do we really want to cede to them all methods that do not rely on a probability model?” To address this, we need to seriously rethink our undergraduate curriculum.

Others have noted that this is both the best and worst of times in statistics. See, for example, Wasserstein (2015), who provides a more thorough discussion of why it is a great time to be a statistician, as well as challenges facing our profession.

2. Miscellaneous Thoughts

2.1 All Curriculum is Local

No department can be all things to all people. Graduate programs vary from one department to another, reflecting the strengths and interests of the faculty. Should the undergraduate curriculum exhibit similar flexibility? For example, in spite of claims that Bayesian inference is an advanced topic, Bayes methods are discussed in all the introductory service courses at Duke University. Dalene Stangl offers a short course in teaching Bayesian methods for teachers in secondary education. Descriptions of courses taught by Mine Cetinkaya-Rundel can be found at https://stat.duke.edu/~mc301/teaching. Not all departments will want to emulate Duke, but I hope we avoid overprescribing what the undergraduate curriculum should look like.

2.2 Practice What We Preach

Many of my colleagues insist that all, or nearly all, courses about statistical methods should be taught by faculty in a statistics department. However, in an era of scarce resources, we cannot meet demands for new courses while accommodating growing enrollments in existing courses.

©2015 William I. Notz, The American Statistician, Online Discussion

We preach the interdisciplinary nature of statistics, but do we practice it in our curriculum? Should we pursue collaborative teaching with faculty in other departments, or more radically, encourage faculty in other departments to teach courses in the science of data? Such a suggestion is controversial, and if George Cobb is to be identified with Martin Luther, I may be regarded as an Anabaptist, despised by Catholics, Lutherans, and Reformed. A more cooperative approach to the undergraduate curriculum could involve the following.

• At The Ohio State University, general education courses in data analysis must be approved by a committee. Any department can propose to teach a general education course in data analysis, but must meet guidelines (developed by the Department of Statistics) regarding the content (see https://asccas.osu.edu/files/ASC CurrAssess Operations Manual.pdf). At institutions where there is a process for approving new courses, faculty in statistics departments could suggest basic standards that all courses in the science of data must meet to ensure some level of quality (see the GAISE guidelines for introductory courses at http://www.amstat.org/education/gaise/GaiseCollege Full.pdf and the ASA/MAA statement on qualifications for teaching statistics at http://www.amstat.org/education/pdfs/TeachingIntroStats-Qualifications.pdf). With the growing emphasis on assessing courses and curriculum, statistics faculty could develop the rubrics for assessing courses in the science of data.

Faculty in statistics departments could provide workshops for or informal gatherings with faculty in other departments to discuss the teaching of courses in the science of data, share ideas, syllabi, resources, or provide training in the use of statistical software. One of my colleagues in the Department of History told me that such workshops were held annually at Grinnell College when he was a faculty member there.

Faculty in statistics departments could offer to mentor those wishing to teach a course in the science of data. This might involve helping draft a proposal for a course, guiding such a proposal through the approval process, suggesting resources, helping develop a syllabus, or even providing evaluations for the faculty member.

Faculty in statistics departments could offer to team-teach courses. This would bring together statistical and subject matter expertise.

• Should we consider dual majors and minors for which we share responsibility? For example, at Miami University one can major in statistics as well as in data analytics (the latter is a co-major shared by Information Systems and Analytics, and Statistics).

• What about multiple programs? Rather than a single major, departments could offer multiple majors with different prerequisites, different core competencies, and different objectives.

2.3 Challenges

• Can our faculty agree on what is fundamental? Not too long ago undergraduate majors in statistics were rare. Requirements for acceptance into a graduate program are much less stringent than the requirements for an undergraduate major. What is essential for an undergraduate planning to pursue a PhD is different than what is essential for a student seeking a job upon graduation.

Even for introductory service courses, what is essential is not clear. One recommendation has been to minimize the discussion of probability. However, to introduce Bayes thinking, students need to know something about probability distributions, conditional probability, and Bayes theorem.

• Most introductory courses in statistics are taught by people without advanced degrees in statistics (for example, instructors in secondary schools and community colleges). We must train such teachers how to incorporate the changes we propose in their teaching. Failure to do so will be a failure to institutionalize change.

As an example of potential problems, many introductory statistics textbooks begin with exploratory data analysis. One reason is to expose students from the outset to the experience of exploring data in order to learn what the data are saying (see Moore (1992)). This motivates later discussions about the pitfalls that can occur when drawing conclusions from data and how these pitfalls might be avoided. Unfortunately, many instructors teach the sections on data analysis as descriptive statistics, perhaps because this is what they experienced in their first course. They emphasize the process of calculating numerical summaries and making graphical displays, rather than using these as tools to explore what the data are saying. How many instructors expose their students to exploratory tools such as brushing and slicing or linked displays? Introducing algorithmic methods could suffer a similar fate without training others to teach these methods.

What resources are needed for teachers to teach effectively? Can one introduce students to computationally intensive methods when one is unable or reluctant to require students to use computers? Can one simply rely on graphing calculators? How do we examine students on computer intensive methods when we fear the use of computers on exams will encourage cheating? Shouldn’t our methods of assessment match the material we emphasize in class?

• I like George Cobb’s suggestions for disseminating changes. Publication of innovations (new courses, new teaching methods, new curricula) in journals, exposing innovations to testing and refinement by the larger statistical community, and eventually institutionalizing change is an excellent strategy. The challenge is finding journals willing to publish such papers. As a former editor of the Journal of Statistics Education, I suspect that the journal would be receptive to a teachers’ corner section.

3. Conclusions

Do we need to rethink our curriculum from the ground up? I believe we do. This will be challenging, but there are both faculty and programs already thinking about and engaged in this exercise who can serve as resources for the rest of us. I again thank George Cobb for his call to arms. Others will decide whether his is a vision for the future or just a bad dream.

REFERENCES

Cobb, G. W. (2015), “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum from the Ground Up,” The American Statistician, 69(4), doi:10.1080/00031305.2015.1093029.

Moore, D. S. (1992), “What is Statistics?” in Perspectives on Contemporary Statistics, eds. D. Hoaglin and D. Moore, Mathematical Association of America, Washington, pp. 1–18.

Wasserstein, R. (2015), “Communicating the Power and Impact of our Profession: A Heads up for the Next Executive Directors of the ASA,” The American Statistician, 69, 96–99.


A Response to “Mere Innovation is Too Late”: Data Cowboys and Statistical Indians

Jim RIDGWAY

“Mere Innovation is Too Late” is an important paper calling for reflection and constructive discussions about the future of statistics education. George Cobb offers a metaphor from California real estate, namely that serviceable properties are often rebuilt by their owners to bring them up to date. However, I fear that he has mapped out the most positive scenario for the future of statistics. A “middle ground” metaphor is that of Indian tribes being moved from their reservations to less desirable and less fertile ground (here, statisticians being displaced by data scientists from their sacred territory); a darker metaphor is the fate of the Indian tribes in California in the Mission and post-Mission eras, condemned to servitude and random acts of genocide. Data cowboys are unlikely to shoot statisticians, but then they don’t need to, because they seem to take over core territory with rather little effort. Data science is seen to be sexy; it uses data that everyone actually generates (Twitter streams, purchasing data, mobile phone locations), and creates applications that are really useful and part of everyday life in developed countries: fingerprint access, speech recognition, weather forecasts, shopping and vacation advice. So there is an existential crisis for statistics: if you can ride the data revolution, who needs a statistician? Well, some statisticians think you do.

“Most real life statistical problems have one or more nonstandard features. There are no routine statistical questions; only questionable statistical routines” (Cox, quoted in Chatfield, 1991, p. 240). “All models are wrong, but some are useful” (Box and Draper, 1987, p. 424).

Statistics has its origins in solving novel, practical problems. Both the Royal Statistical Society and the American Statistical Association were established by heterogeneous collections of individuals united by a common goal to tackle exciting problems by inventing methods and mathematics (see Pullinger, 2014). A problem for statistics education is that the curriculum devotes too much time to modeling well-understood problems with traditional (1920s) methods, and too little time modeling unfamiliar ones, thereby ignoring the raison d’être of the discipline. Here, I consider introductory statistics courses.

Online discussion of “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum From the Ground Up,” by George Cobb, The American Statistician, 69. Jim Ridgway, School of Education, Durham University, Leazes Road, Durham DH1 1TA, UK.

Many introductory courses focus on one- and two-variable problems, work with small samples, and use made-up data. This runs real risks, pedagogically, namely reinforcing the common notion that every sample is representative of the population from which it is drawn, and that small samples are as representative as large ones (i.e. Tversky and Kahneman’s (1974) “representativeness” heuristic). Starting from one- and two-sample problems makes the leap to understanding multivariate data, and notions of interaction, rather difficult. Similarly, the emphasis on correlation and linear relationships makes the idea of nonlinear relationships hard. The range of applications of this approach is narrow (assuming additivity and linearity even in school science would be a big mistake). Some positive alternative approaches can be found in Ridgway (2015).
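The risk noted above, that small samples come to be seen as being as representative as large ones, can be made concrete with a short simulation. What follows is a minimal sketch and not part of the original discussion: the made-up skewed population, the sample sizes, and the helper name spread_of_sample_means are all illustrative assumptions.

```python
import random
import statistics


def spread_of_sample_means(population, sample_size, n_samples=1000, seed=0):
    """Standard deviation of the sample mean over repeated random samples.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]
    return statistics.stdev(means)


# A made-up, right-skewed population (exponential with mean 1).
pop_rng = random.Random(42)
population = [pop_rng.expovariate(1.0) for _ in range(20_000)]

small_spread = spread_of_sample_means(population, sample_size=5)
large_spread = spread_of_sample_means(population, sample_size=500)

print(f"spread of means, n=5:   {small_spread:.3f}")
print(f"spread of means, n=500: {large_spread:.3f}")
```

Under these assumptions the means of samples of size 5 scatter far more widely around the population mean than the means of samples of size 500, which is the behaviour the representativeness heuristic overlooks.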

I agree with George Cobb that a focus on mathematical technique hides statistical ideas, and with his assertion that accessible ideas (e.g. Bayesian inference) are made obscure by dressing them up in heavy mathematics. Rakow et al. (2015) reported presenting funnel plots on outcomes from child heart surgery to 172 participants (a quarter of whom had no education beyond compulsory schooling) who (predominantly) were related to a child under the care of a specialist cardiac unit. The researchers asked questions which required an understanding of the funnel plot, and questions about which hospital or surgeon to choose. Around 90% of the responses were correct. They concluded that “... funnel plots can be readily understood ...” (p. 327). Our own work supports the assertion that formal mathematics is not necessary for the understanding of important statistical ideas. We used Rasch scaling to show that computer-based problems involving three variables can be easier (in psychometric terms) than one and two variable problems presented on paper (Ridgway, McCusker and Nicholson, 2003). We have also shown that statistically naïve students aged 16 years can express sophisticated ideas such as interaction, nonlinear functions, and piecewise functions when asked to describe patterns in data presented in interactive displays (here, on alcohol use by young people, sexually transmitted diseases, and photosynthesis). Data visualisation makes it possible for students to explore large, authentic data sets and to reason about complex situations using these data. For example, we created the Constituency Explorer in collaboration with the House of Commons Library. This presents data on 150+ variables (demographics, health, voting patterns in two national elections etc.) relevant to every constituency in the UK in a visualization that runs on desktops and mobile devices. The primary target audience was politicians and their aides, but the resource has been used in the teaching of political science, social history, and geography, as well as in school statistics classes. Is this approach strong on short-term gratification and weak on long-term need? It is strong on short-term “wow”, but it can also bring core statistical ideas into introductory courses (data provenance, metadata descriptions, nonlinearity, discontinuities, and effect size, as well as means, medians and spread).

©2015 Jim Ridgway, The American Statistician, Online Discussion

The range of phenomena that need to be modeled has expanded, and students know it. Linear additive models are not the only game in town. For example, analysis of Wikipedia use has been used to predict stock exchange movements (Moat et al., 2013). The key words associated with upward and downward movements are in the public domain. Should you use these keywords when making investment decisions? Students have created imaginary traffic jams on Satnavs (see Bilton, 2014). These examples illustrate situations where agents can “game” important systems for their own purposes. Modeling systems is hard; modeling systems undergoing change is harder; now students are familiar with systems undergoing change that are “self aware”. Should we be teaching students 1920s mathematics, or introducing big statistical ideas relevant to their own lives? Can we cede all algorithmic thinking to data science? A modest set of targets for statistics education is: use data visualisations of big open data sets to teach big statistical ideas; teach about the statistics that underpin the lived experiences of technology-savvy students (notably pattern recognition in its many applications); introduce modeling, early. And make friends with the data cowboys.

References

Bilton, N. (2014), “Friends, and Influence, for Sale Online,” available online at http://bits.blogs.nytimes.com/2014/04/20/friends-and-influence-for-sale-online/?emc=edit ct 20140424&nl=technology&nlid=66226932.

Box, G., and Draper, N. (1987), Empirical Model-Building and Response Surfaces, New York: Wiley.

Chatfield, C. (1991), “Avoiding Statistical Pitfalls,” Statistical Science, 6(3), 240–268.

Cobb, G. W. (2015), “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum from the Ground Up,” The American Statistician, 69(4), doi:10.1080/00031305.2015.1093029.

The Constituency Explorer, http://www.constituencyexplorer.org.uk/. Accessed 7 August 2015.

Moat, H.S., Curme, C., Avakian, A., Kenett, D.Y., Stanley, H.E., and Preis, T. (2013), “Quantifying Wikipedia Usage Patterns Before Stock Market Moves,” Nature Scientific Reports (3), 1801. doi:10.1038/srep01801.

Pullinger, J. (2014), “Statistics Making an Impact,” Journal of the Royal Statistical Society, Series A, 176 (4), 819–839.

Rakow, T., Wright, R.J., Spiegelhalter, D.J., and Bull, C. (2015), “The Pros and Cons of Funnel Plots as an Aid to Risk Communication and Patient Decision Making,” British Journal of Psychology, 106, 327–348. doi:10.1111/bjop.12081.

Ridgway, J. (2015), “Implications of the Data Revolution for Statistics Education,” International Statistical Review, doi:10.1111/insr.12110.

Ridgway, J., McCusker, S., and Nicholson, J. (2003), “Reasoning with Evidence - Development of a Scale,” IAEA Conference proceedings. Available online at https://www.dur.ac.uk/resources/smart.centre/Publications/ReasoningwithEvidence-Developmentofascale.pdf.

Tversky, A., and Kahneman, D. (1974), “Judgment Under Uncertainty: Heuristics and Biases,” Science, 185, 1124–1131. doi:10.1126/science.185.4157.1124


Challenges, Changes and Choices in the Undergraduate Statistics Curriculum

Jessica UTTS

George Cobb has (yet again) written a thought-provoking and entertaining article that should be required reading for anyone involved in the training of statisticians. My reactions when reading the article ranged from agreement with the main ideas to pessimism about the viability of implementing them. My pessimistic side predicts that the recommendations will meet the fate Cobb describes for Bayesian methods before computing made them tractable, namely “Statisticians read the arguments, followed the proofs, nodded in agreement, and continued in their pursuit of incoherence.”

What changed the landscape for Bayesian methods was not only that computers made them tractable, but that a few innovative leaders made it easy for others to implement and teach them by writing textbooks and computer programs that could be used in the classroom and the consulting room. We need those innovators if we are to implement the widespread changes recommended in Cobb’s article.

My optimistic side kicked in when I realized how much and how quickly things have changed during my academic career. It’s hard to remember that it wasn’t until at least 10 years after I started teaching that faculty members were given individual computers, rather than relying on single mainframe computers that served the whole campus! And in 1987 when I took a one-year leave of absence to work at SRI International I became one of the first in my academic circle to have an email address. Commercial email providers didn’t become popular until the mid-1990s. We have come a very long way in a very short time. And the pace is quickening.

In the remainder of this commentary I address a variety of unrelated issues brought to mind when reading Cobb’s article. The first is a reminiscence of the workshop and article that led to my first published commentary (Utts, 1986), in Volume 1 of Statistical Science, in response to an article titled “Computers in Statistical Research” (Eddy, 1986). Next is an exploration of why undergraduate degrees in statistics should be offered at all. And third is a discussion of what happens after graduation, and indeed, to those who have graduated already.

The aforementioned Statistical Science paper was a commentary on an article (Eddy, 1986) about how academic statistics departments were beginning to acquire their own computers, and speculating on how this would affect the future of research in statistics. The article was the culmination of a workshop on the topic, and should be required reading for anyone who thinks the statistics profession has not changed in the past 30 years! The relevance of my 1986 commentary to the Cobb article is that I outlined a science fiction story then that I feared would be upon us 30 to 40 years hence, in other words, just about now. The essence of the story was that statisticians were no longer needed because the “black box” was able to do everything automatically: spitting out p-values, confidence intervals and conclusions without any need for the thoughtful input of a human. But eventually someone realized that results were being generated that made no sense. When they tried to figure out what was in the black box it was impossible to do so, until they located some ancient statisticians who actually remembered the reasoning that used to be part of the decision-making that accompanied the algorithms. Let’s make sure we don’t go there!

Online discussion of “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum From the Ground Up,” by George Cobb, The American Statistician, 69. Jessica Utts, University of California, Irvine.

Any discussion of what should be covered in the undergraduate curriculum needs to take into account the fact that the reason for having an undergraduate degree in statistics is changing rapidly. What will become of our undergraduates? A small percentage of them are likely to attend graduate school, but the rest are likely to get jobs that involve working with data. What do they need to know to be hired, what do we want them to know once they start working, and are those the same? We need information on what kinds of jobs our bachelors’ level graduates are getting, but anecdotal evidence indicates that the jobs they are getting require more computing skills than high-level statistical thinking. I agree with Cobb’s view that we need to train our students to combine algorithmic thinking with probabilistic thinking, even if it is not immediately obvious that they need the latter for these data-crunching jobs. Otherwise, it is too likely that my science fiction story will become reality.

The biggest challenge our graduates will face (eventually) is the same one that probably faces most professionals in this era, and that is keeping current with changing technology and methodology. As a profession, I think we need to vastly increase our continuing education offerings. I’ve restricted my comments to the undergraduate level because that’s the focus of Cobb’s article, but I think we need more continuing education at all levels. I agree with Brown and Kass (2009) that no one can be expected to know all areas of statistics anymore; there are simply too many specialties, and effective statisticians need to learn a good deal about the disciplines in which they collaborate in addition to keeping current with developments in statistics. As a profession we need to develop mechanisms for offering continuing education in addition to the ones currently available (such as short courses and webinars).

©2015 Jessica Utts, The American Statistician, Online Discussion

One final note is that I think the emergence of undergraduate programs in data science is a good step forward. It is easier to think about implementing change when it’s viewed as part of a new major than a revision of a current one. But even within existing statistics majors, I don’t think radical change is needed to implement the ideas put forth by Cobb. Changes to existing courses could easily be made that would accomplish much of what Cobb recommends. And it is only on this final point that I disagree with Cobb. I think an appropriate amount of “mere” renovation would be sufficient to create the effect he wants. But we need innovators to make it happen.

References

Brown, E.N., and Kass, R. E. (2009), “What is Statistics?,” The American Statistician, 63, 2, 105–110.

Cobb, G. W. (2015), “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum from the Ground Up,” The American Statistician, 69(4), doi:10.1080/00031305.2015.1093029.

Eddy, W. F. (1986), “Computers in Statistical Research,” Statistical Science, 1, 4, 419–437.

Utts, J. (1986), “Comment,” Statistical Science, 1, 4, 437–439.


Teaching Statistics Algorithmically or Stochastically Misses the Point: Why not Teach Holistically?

Richard DE VEAUX and Paul F. VELLEMAN

“It takes judgment, brains, and maturity to score in a balk-line game, but I say that any fool can take and shove a ball in a pocket ... you’ve got Trouble ...” (“Professor” Harold Hill, The Music Man)

We agree with Professor Cobb on the need to reconstruct the undergraduate statistics curriculum. But if we are to “preach what we practice,” we must first examine what we practice. Viewed as a whole, the practice of statistics hasn’t really changed, although methods continually evolve. We should not allow debates about methods, whether algorithmic or stochastic, to distract us from teaching a holistic understanding of what statisticians actually do. The challenge of sound statistical practice has been discussed for decades, but apparently with less impact on statistics education than we might hope for. Consider, for example, these two wise comments, each decades old.

Feinstein (quoted by Zahn 1985) said nearly 50 years ago:

A clinician is taught to identify and formulate patients’ problems in a carefully structured manner; but he is then left to develop diverse tactics of “judgment” for managing the outlined problems. A statistician is taught a carefully organized set of mathematical structures for managing an outlined problem; but he is left to develop diverse judgmental methods for identifying and formulating the problem. The clinician may emerge able to express the right questions but unable to find the answers; the statistician may emerge with the right answers but unable to select the questions.

William Hunter (1981) advocated a solution to this challenge: Statisticians should work as colleagues with scientists and others with whom we consult. In that role, we must comprehend the entire enterprise. Hunter pointed out that the first step in collaboration is for the statistical consultant to work with the client to formulate the best question.

Therefore, it is of the utmost importance that in each new situation the consultant try to discover what the real problem is. To avoid the mistake of solving the wrong problem ... (p. 73)

Kimball (1957) called finding the right answer to the wrong problem a Type III error. We must teach our students to avoid such errors.

Online discussion of “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum From the Ground Up,” by George Cobb, The American Statistician, 69. Richard De Veaux, Department of Mathematics and Statistics, Williams College. Paul F. Velleman, Department of Statistical Sciences, Cornell University.

Undergraduate statistics education often focuses too much on methods rather than taking this holistic view. It matters little whether the analysis tests the equality of two means with a classical t-test or with a resampling approach if the conclusions of the test are invalid from the beginning. Rather than debating the choice of t-test versus a decision tree, shouldn’t we first ensure that the comparison is a scientifically valid one?
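To make the comparison of methods concrete, here is a minimal sketch (not from the original discussion) that applies a classical two-sample statistic and a resampling approach to the same made-up data. The data, the helper names welch_t and permutation_p_value, and the use of a large-sample normal approximation in place of the exact t distribution are all assumptions chosen for illustration.

```python
import random
import statistics
from statistics import NormalDist


def welch_t(a, b):
    """Welch's two-sample t statistic (unequal variances)."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return (statistics.fmean(a) - statistics.fmean(b)) / se


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign group labels
        diff = abs(
            statistics.fmean(pooled[: len(a)]) - statistics.fmean(pooled[len(a):])
        )
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Made-up data: two groups whose true means differ.
data_rng = random.Random(1)
group_a = [data_rng.gauss(10.0, 2.0) for _ in range(40)]
group_b = [data_rng.gauss(13.0, 2.0) for _ in range(40)]

t_stat = welch_t(group_a, group_b)
# Normal approximation to the t reference distribution (an assumption;
# reasonable here only because both samples are moderately large).
p_normal = 2 * (1 - NormalDist().cdf(abs(t_stat)))
p_perm = permutation_p_value(group_a, group_b)

print(f"t = {t_stat:.2f}, approximate p = {p_normal:.4f}")
print(f"permutation p = {p_perm:.4f}")
```

Both routes address the same narrow question about a difference in means, and on data like these they would be expected to agree. Neither says anything about whether the two groups are comparable in the first place, which is the judgment the authors argue must come first.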

To follow Hunter’s advice we must ask questions such as whether the data allow generalization to a larger population, whether their structure can be meaningfully described with the models we wish to fit, and whether important subgroups or individuals were excluded from the data. In the decades since Hunter’s article, we have seen the development of graphical and diagnostic tools that make it even easier for the statistician to probe data to see whether a model is appropriate and to identify unusual or influential groups and cases.

The answers to these questions (or, of greater concern, the violations of the naïve assumptions our methods have been making) often emerge during a careful statistical analysis. Because of this it is essential that the statistician participate in the analysis. It is statistical malpractice to turn the data over to an automated algorithm, to, as Professor Hill would say, simply shove the data into the pocket of a particular analysis.

Exceptions, anomalies, outliers, and subgroups are best recognized and understood in the context of the question being addressed. That is why the statistician must be, as Feinstein would want, fully conversant with both the right question and the statistical methods being applied. And that is why fully automated methods cannot be trusted to produce statistical analyses on their own; computers don’t (yet?) understand the real world sufficiently to take a holistic view of the analysis. But the trained human mind and eye are remarkably effective tools for spotting unanticipated patterns and exceptions and understanding what they might mean in the context of the question being investigated. So that training is essential.

Rather than focus on the methods used to solve the problem, we must teach the entire process by which the statistician probes for the correct problem formulation, translates that problem into a statistical question, finds an appropriate method to solve that problem, and then communicates the result back to the scientist. The methods themselves are important. Indeed, as the American Dental Association Seal of Acceptance says (to paraphrase), they “have been shown to be of significant value when used in a conscientiously applied program of [data] hygiene and regular professional care.” But conscientious application is the requisite element. Simply replacing a stochastic method by an algorithmic one will not help. We believe that the current statistics curriculum focuses too much on the method rather than on the conscientious application of methods in the context of the question to be addressed, and that “data science” exacerbates this trend.

© 2015 Richard De Veaux and Paul F. Velleman. The American Statistician, Online Discussion.

It is certainly wise to provide our students with a full quiver of methodological arrows. But (to mix our sports metaphors a bit) we must not ignore the target. We must teach the entire process of developing a question that can be addressed with the available data and examining the data before, during, and after an analysis in nearly every undergraduate course we offer. By the time statistics majors come to the capstone course, typically during their final year, the approach should be second nature, not something they see for the first time.

This process requires “judgment, brains, and maturity.” Perhaps that is why after decades, our statistics courses still have not moved sufficiently in this direction. We should try to teach judgment, and our students certainly have fine brains, but maturity comes only with experience. We must take the time to give students practical experience in data analysis.

Ignoring the more difficult parts of this process and concentrating only on the algorithmic part (as the worst practices of data science do) is an abdication of our responsibility as statisticians and educators.

References

Cobb, G. W. (2015), “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum from the Ground Up,” The American Statistician, 69(4), doi:10.1080/00031305.2015.1093029.

Hunter, W. G. (1981), “The Practice of Statistics: The Real World is an Idea Whose Time Has Come,” The American Statistician, 35, 72–76.

Kimball, A. W. (1957), “Errors of the Third Kind in Statistical Consulting,” Journal of the American Statistical Association, 52, 133–142.

McCulloch, C. E., Boroto, D. R., Meeter, D., Polland, R., and Zahn, D. A. (1985), “An Expanded Approach to Educating Statistical Consultants,” The American Statistician, 39, 159–167.


Learning Communities and the Undergraduate Statistics Curriculum: A Response to “Mere Renovation Is Too Little Too Late”

Mark Daniel WARD

George Cobb urges the Statistics discipline to “rethink” the entire undergraduate curriculum. As I was reading his excellent, thought-provoking article, I was asking myself, “Why do students continue learning Statistics on college campuses at all?” After all, Cobb points out many excellent books, which treat Statistics and Probability from broad points of view. Students can also fall in love with Statistics through data-driven projects, many of which can be explored on the Internet, during internships, or in research opportunities related to data analysis. So what distinguishes the undergraduate academic experience in Statistics on a college campus from what a well-motivated student could learn on her/his own, from the excellent books, journal articles, and online resources, including those that Cobb cites in his paper? Answer: A sense of community. Learning communities are sometimes called “communities of practice” (see Lave and Wenger, 1991, p. 49). They provide an excellent environment for students to practice what they are learning (e.g., to apply statistical methods) at a much earlier point in their studies, much as Cobb urges. Although Cobb also points out that “all curriculum is local” (p. 34), I believe that some of the initiatives we recently began at Purdue could be implemented much more broadly, in Statistics departments nationwide. The Purdue Statistics Living-Learning Community blends the academic, research, residential, and professional development experiences of 20 sophomore students per year. We briefly discuss Purdue’s new initiative here, but we also encourage interested readers to visit http://llc.stat.purdue.edu and to contact me directly.

Students in a learning community have a sense of comfort and confidence that is often missing from the undergraduate STEM experience. This comfort promotes retention. Cobb does not explicitly mention retention in his article, but if we are completely revamping the undergraduate program, we are obligated to keep retention in mind. If learning communities were implemented more broadly in Statistics undergraduate programs, our discipline could potentially increase the numbers of women, minorities, and persons with disabilities who are pursuing (and completing) undergraduate degrees in Statistics. This would yield a broader and more diverse pipeline of students into graduate programs in Statistics. Project INGenIOuS (Zorn et al., 2014) has a great vision for broadening the pipeline in the mathematical and statistical sciences. Our discipline could also benefit from best practices learned in Computer Science about attrition and students’ comfort levels; Margolis and Fisher (2002) give a helpful starting point to this literature.

Online discussion of “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum From the Ground Up,” by George Cobb, The American Statistician, 69. Mark Daniel Ward, Department of Statistics, Purdue University. This material is based upon work supported by the National Science Foundation under Grant No. 1246818.

Cobb states, “statistics suffers from the difficulty of its challenge to integrate abstract deductive thinking with interpretation in context.” To address this, faculty can offer pairs of Statistics courses in block-scheduled patterns, such as theoretical probability paired with data analysis. When students take multiple courses together as a cohort, they have more opportunities to discuss complementary ideas outside the classroom. Residential life staff members can work with faculty on ways to supplement the academic learning experience. Professional development helps too, and it can consist of simply a series of weekly seminars. Also, faculty who take the time to dine with students in the residence hall cafeterias are richly rewarded with good discussions and increased insight into the student experience in Statistics. These are just a few extracurricular ideas; many others are possible.

Cobb celebrates diverse means of presenting the material to students. Innovations such as “flipped” classroom experiences in Statistics can help us to better engage with students. At Purdue, I use video content and online modules in both my probability and my data analysis courses. Instead of lectures, the students spend class periods working on problems in probability or projects in data analysis. So that our questions are appealing to students, Ellen Gundlach and I asked undergraduate students to design the majority of the examples that we included in our recent textbook, Ward and Gundlach (2015).

Cobb calls for students to get involved in research at an earlier stage in their studies. For large data analysis projects, students can benefit from immersion in a yearlong project with a research mentor from another discipline. Students learn not only about the data set to be studied, but more broadly, they learn about the terminology, customs, literature, and traditions in the applied discipline. Such an interdisciplinary view gives the students a renewed appreciation of the concepts that they learn in their Statistics classes. I believe that these research projects exemplify what Cobb means when he discusses the crucial role of “context as a source of understanding” (p. 21).

Immersive research experiences will require improved computational facilities dedicated to student use. Cobb alludes to this need. Some departments will require a strong push (for internal or external funding) to secure sufficient computational resources.

Academic advisors should be invited to the discussion about the kind of curricular overhaul that Cobb is advocating in Statistics departments. Academic advisors are often uniquely positioned to guide students who are pursuing double (or triple) major programs of study, or minor programs that complement their main areas of interest. Students in Statistics can naturally be encouraged to pursue double majors, since Statistics complements many applied areas of study. Cobb discusses the need to minimize prerequisites. This goal is accomplished more easily when academic advisors are able to give direct input to the curriculum design. They understand a student’s view of the overall curricular structure at the university or college.

© 2015 Mark Daniel Ward. The American Statistician, Online Discussion.

The ASA DataFest is emerging as a way that the American Statistical Association is working with faculty on several campuses to give students a very exciting annual data analysis experience. Among the many activities in Statistics at Purdue this year, the students cited the ASA DataFest as one of the most rewarding and enjoyable events. I encourage the ASA to continue recruiting more departments to host ASA DataFest events for undergraduate students.

I applaud George Cobb for his vision about the entire undergraduate curriculum in Statistics. His paper is full of insights and innovative ideas! I also thank Cobb for pointing out several very recent innovations in the undergraduate Statistics curriculum, including Kuiper and Sklar (2013) and Wagaman (2013). Finally, I heartily thank Nolan and Temple Lang for their workshops on “Integrating Computing into the Statistics Curricula”; see Nolan and Temple Lang (2010) for an overview of their efforts to implement changes. My participation in their workshops inspired me to start Purdue’s Statistics Living-Learning Community, with large research projects for sophomores as a unifying element. Cobb is urging all of us to do more, to try new things, and to continue the dialogue about how innovations can support our students.

List of references that are not found in George Cobb’s paper:

References

Cobb, G. W. (2015), “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum from the Ground Up,” The American Statistician, 69(4), doi:10.1080/00031305.2015.1093029.

Lave, J., and Wenger, E. (1991), Situated Learning: Legitimate Peripheral Participation, Cambridge: Cambridge University Press.

Margolis, J., and Fisher, A. (2002), Unlocking the Clubhouse, Cambridge, MA: MIT Press.

Nolan, D., and Temple Lang, D. (2010), “Computing in the Statistics Curricula,” The American Statistician, 64, 97–107.

Ward, M. D., and Gundlach, E. (2015), Introduction to Probability, New York: W. H. Freeman.

Zorn, P., Bailer, J., Braddy, L., Carpenter, J., Jaco, W., and Turner, P. (2014), The INGenIOuS Project: Mathematics, Statistics, and Preparing the 21st Century Workforce, Washington, DC: Mathematical Association of America.


Teaching Safe-Stats, Not Statistical Abstinence

Hadley WICKHAM

I thoroughly agree with George Cobb that it is time to rethink the statistics curriculum. We are at grave risk of becoming irrelevant, not because we are useless, but because we have focussed too much on doing the right thing, rather than doing something that works. As a field, statistics tends to lean towards abstinence-based outreach: you should only do statistics if you’re in a committed long-term relationship with a professional statistician. If you experiment on your own (or with friends), you will hurt yourself and others. Abstinence-based approaches don’t work because people will take the risk anyway, and most of the time they will safely enjoy themselves. Abstinence-based statistics is particularly problematic as there are simply not enough professional statisticians to go around.

Rather than stigmatizing amateur statistics, we should be doing our best to provide tools to make it safer. We can’t force people to use our tools (as much as we’d like to!), so those tools must be more fun and more empowering than the alternatives. The undergraduate statistics curriculum is a vital place to develop and promulgate these tools. As statistics grows ever more popular, we must be able to provide a curriculum that gives students the skills they need to practice safe-stats for the rest of their lives.

To me, one of the keys to teaching safe-stats is to develop grammars of data analysis. A grammar is a framework that lays out the minimal set of independent components and a means of composing them to solve a wide range of problems within a domain. Much of my own work has been in this area: How can we provide an accessible grammar of graphics (Wilkinson 2005) that makes it easy to create graphics tailored to the problem at hand (Wickham 2009)? What are the verbs of data manipulation that allow us to solve 95% of problems (Horton, Baumer, and Wickham 2015)? What are the key components of tidy data, and how can we tidy messy data (Wickham 2014)?
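As a rough illustration of what "a minimal set of independent components and a means of composing them" can look like in code, the sketch below defines a handful of data-manipulation verbs over plain Python lists of dictionaries and chains them with a small `pipe` helper. The verb names (`filter_rows`, `mutate`, `summarise_by`, `arrange`), the `pipe` helper, and the flight records are all hypothetical and chosen only for illustration; they are not the API of any package cited in this discussion.

```python
from collections import defaultdict
from statistics import mean


def filter_rows(rows, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [r for r in rows if predicate(r)]


def mutate(rows, **new_columns):
    """Return copies of the rows with extra columns computed from each row."""
    return [{**r, **{name: fn(r) for name, fn in new_columns.items()}} for r in rows]


def summarise_by(rows, group, **summaries):
    """Collapse the rows to one row per value of `group`."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[group]].append(r)
    return [
        {group: key, **{name: fn(members) for name, fn in summaries.items()}}
        for key, members in groups.items()
    ]


def arrange(rows, key, descending=False):
    """Sort the rows by one column."""
    return sorted(rows, key=lambda r: r[key], reverse=descending)


def pipe(rows, *steps):
    """Apply each step to the output of the previous one."""
    for step in steps:
        rows = step(rows)
    return rows


# Hypothetical data: one record per flight.
flights = [
    {"carrier": "AA", "distance_km": 1200, "delay_min": 15},
    {"carrier": "AA", "distance_km": 300, "delay_min": 5},
    {"carrier": "UA", "distance_km": 2400, "delay_min": 40},
    {"carrier": "UA", "distance_km": 800, "delay_min": 0},
]

# Mean delay per carrier for flights of at least 500 km, worst carrier first.
result = pipe(
    flights,
    lambda rows: filter_rows(rows, lambda r: r["distance_km"] >= 500),
    lambda rows: summarise_by(
        rows, "carrier", mean_delay=lambda g: mean(r["delay_min"] for r in g)
    ),
    lambda rows: arrange(rows, "mean_delay", descending=True),
)
print(result)
```

Each verb does one thing and returns new rows, so the same pieces can be recombined for a different question (for example, adding a `mutate` step to compute delay per kilometre before summarising) without writing any new machinery.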

The best grammars are both flexible and constraining. Once you’ve seen the grammar used to solve two problems, you should be able to recombine the pieces to solve a third, new, problem. But a grammar also needs to be constraining: unmitigated freedom is both overwhelming and potentially dangerous. Constraints can guide users towards better outcomes, but cannot guarantee them. A system that prevents the user from doing the wrong thing must necessarily prevent many right things. (A personal example is the use of rather elegant ggplot2 graphics in the paper discussed by Singal 2015.)

Online discussion of “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum From the Ground Up,” by George Cobb, The American Statistician, 69. Hadley Wickham, RStudio, 1719 Drew, Houston, Texas 77004.

To have that needed flexibility, a grammar must be embedded in a programming language. This offers an escape clause: each grammar need solve only the 90% of most common problems, leaving the long-tailed 10% to other parts of the language. This implies that statistics students must be taught to program.

Teaching programming even in the first statistics course is eminently achievable, as Baumer et al. (2014) and others have shown. The key is to focus on immediate pay-offs. You don’t need to study the foundations of programming before you can start to program data analyses. Students can start with recipes, code snippets that they can use and adapt, to show immediately useful (and interesting) tools. Over the length of the course, students can grow from simple duplication to creative rearrangement of entire components.

We must give statistics students the skills to dive into the data ocean. Yes, there are sharks and jellyfish and rip tides, but we cannot be paralyzed by all the potential dangers. Students will go swimming with or without us, and all we can do is prepare them as best we are able.

References

Baumer, B., Cetinkaya-Rundel, M., Bray, A., Loi, L., and Horton, N. J. (2014), “R Markdown: Integrating a Reproducible Analysis Tool into Introductory Statistics,” Technology Innovations in Statistics Education, https://escholarship.org/uc/item/90b2f5xh.

Cobb, G. W. (2015), “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum from the Ground Up,” The American Statistician, 69(4), doi:10.1080/00031305.2015.1093029.

Horton, N. J., Baumer, B. S., and Wickham, H. (2015), “Setting the Stage for Data Science: Integration of Data Management Skills in Introductory and Second Courses in Statistics,” ArXiv E-Prints, February, http://chance.amstat.org/2015/04/setting-the-stage/.

Singal, J. (2015), “The Case of the Amazing Gay-Marriage Data: How a Graduate Student Reluctantly Uncovered a Huge Scientific Fraud,” NyMag, May, http://nymag.com/scienceofus/2015/05/how-a-grad-student-uncovered-a-huge-fraud.html.

Wickham, H. (2009), ggplot2: Elegant Graphics for Data Analysis, Berlin: Springer.

Wickham, H. (2014), “Tidy Data,” The Journal of Statistical Software, 59, http://www.jstatsoft.org/v59/i10.

Wilkinson, L. (2005), The Grammar of Graphics (2nd ed.), Berlin: Springer.

© 2015 Hadley Wickham. The American Statistician, Online Discussion.


Further, Faster, Wider

Chris J. WILD

Cobb (2015) is right on the money when he says, “For our profession, the valuable territory is the science of data.” It is a territory we once had largely to ourselves but, as it expands rapidly and becomes more densely populated, we are sharing it with waves of immigrants and are in danger of becoming just one more indigenous people subsisting invisibly in the shadows of their homeland. Our competition, Cobb says, “takes place in the marketplace of ideas.” It also takes place in the marketplace of credibility, in the marketplace of influence, in the marketplace of great jobs for graduates, in the marketplace of access to the best and brightest young minds, and in the marketplace of educational curriculum share. When business and organisations, as well as science, need to gain valuable insights from data, “Who you gonna call?”

As I have been reading Cobb (2015) I have also been reading the ASA Curriculum Guidelines (2014) and Diggle (2015), Peter Diggle’s Royal Statistical Society Presidential Address paper. In stark contrast to the expansive vision of undergraduate statistics in Cobb (2015) and the ASA Curriculum Guidelines, Diggle says, “I would like to see less statistics in undergraduate mathematics degrees.” This matters because in the UK, “most academic statistics groups now sit within departments or schools of mathematics or mathematical sciences.” He did go on to talk about less statistics in undergraduate degrees being “counterbalanced by a radical expansion of postgraduate statistics teaching.” But this is our old model, the conversion model whereby one obtains students to fund the discipline and replace its aging research-academics by attracting people into the fold who thought they were heading someplace else, generally into mathematics. It has been a long time since statistics graduate programmes in western countries have been able to attract sufficient of their own nationals to be sustainable. They have been saved from withering only by an influx of students from poorer countries, mainly in Asia. Is this forever sustainable? Is this what “success” looks like?

This clash of viewpoints is a manifestation of big-tent versus small-tent statistics; see Rodriguez (2013); the “wider view” of Marquardt (1987; see also Wild, 1994, 2015) versus his narrower views; the “greater statistics” of Chambers (1993) versus his “lesser statistics,” and the “wider field” of Bartholomew (1995). Big-tent statistics is increasingly favoured by the ASA as both its strategic and its service-to-society stance. It also underpins whether you see your role as educating a small number of great scientists in a fairly narrow if pure tradition, or education on a much grander scale to build a statistically capable society: as building a narrow tower or a broad-based pyramid, a select priesthood or a mass movement.

Online discussion of “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum From the Ground Up,” by George Cobb, The American Statistician, 69. Chris J. Wild, University of Auckland.

Does this really matter? It matters because statistics has an ancestry of great thinkers whose wisdom needs to be preserved and passed on. It matters because statistics has important messages for all of society, not just a few scientists. It matters because, “Those who ignore statistics are condemned to reinvent it” (attributed to Brad Efron by Friedman, 2001) and their ignorance can do real damage in the meantime. It matters because statistics is in a bad position to promulgate those messages because of the pitifully small market share it has in the educational curriculums of almost all countries. It matters because data is currently seen as exciting and valuable, and the people who know how to gain value from it are highly sought after. It matters because there has never been a better time for getting attention and market share for teaching a modern, accessible, data-centric statistics than there is right now.

Statistics education has an opportunity to help a wide cross-section of students come to a much broader appreciation of what data is and what it can do for them and society. It has an opportunity to help all students to make better sense of their world using data, to be not-easily-misled, and to prepare for a burgeoning job market. It has an opportunity to harness the power of visualization to greatly enhance the statistical understanding of a much wider spectrum of society. We can do so much more than just passive “statistical literacy”; we can build significant statistical capability.

Where are most of the innovations fostering big-tent statistics education coming from? Mainly from elite U.S. liberal arts colleges like George Cobb’s. They are a distinctive U.S. national treasure. With a few rare exceptions, research-universities are reactionary forces. They slow down any expansion of statistical vision because they are squeezed between the pincers of ranking systems, research assessment frameworks and research funding systems that favor traditional areas of strength and a narrow range of journals. To get promoted or funded, it is much safer to stick to your (predominantly mathematical) knitting. At this time of heightened opportunity, the needs of the wider society and the employment marketplace (more data, more accessibility to statistical capability) and the short-term health of many research-university academic units (more mathematics) seem to be pulling in opposing directions.

As with all of Cobb’s writings, this paper is a joy to read both for the vividness of his imagery and his provocative messages. We have just started to settle down and get comfortable after his last onslaught when along comes George Cobb again, poking us once more with a sharpened stick. So we will pick ourselves up and lurch forward, making a little more progress.

I agree entirely with Cobb’s thesis of a radical re-thinking of the curriculum from the ground up, particularly for intro and early applied statistics courses. The standard intro course has passed its use-by date. It reveals far too little of the exploding world of data and does it far too slowly. In the early stages of statistical education, I favor actually outdoing the fast-food competition by providing even faster food. The early stage relationship should have more in common with “courtship” than “eat-it-because-it’s-good-for-you” gruel. To summon up a phrase I use in talks, “Don’t make students crawl over broken glass until a desire has been aroused for what’s on the other side.”

© 2015 Chris J. Wild. The American Statistician, Online Discussion.

I am all for the ambitious experimentation Cobb praises and advocates to push the limits of what we can do without (or with very few) pre-requisites. Where students will take many of these courses it may, fittingly in this big-data age, enable educational timespans to be shortened by massive parallelization using divide and recombine (see Cleveland and Hafen, 2014)!

But the end aim should not be a smorgasbord of choose-one intro courses each covering a small number of “advanced” topics at an intro level. The data world is exploding in scope and potential and we need to efficiently convey as much as possible of the scale and excitement of this in intro courses as well as providing personal empowerment, to create a sense of possibility and potential “for me in my life,” not just something for some high-powered PhDs somewhere. We should target “What I can do with data and what data can do for me” to build a desire to learn more. Beginning experiences of data analysis should feel like driving a shiny sports car at breakneck speed along the Riviera, sliding around hairpin bends overlooking thrilling vistas. We spend too much time working in windowless workshops with our heads stuck under the hood. This is not fanciful reverie. I have done enough prototyping with my iNZight (http://www.stat.auckland.ac.nz/~wild/iNZight/) and VIT (http://www.stat.auckland.ac.nz/~wild/VIT/) software projects and “Data to Insight” MOOC (http://www.stat.auckland.ac.nz/~wild/d2i/4StatEducators/) to believe that this is within our grasp. If struggling to get the right stuff out of software chews up a significant proportion of intro students’ time then it’s the wrong software.

And let’s stand some common practices on their heads. Let’s excite students about what can go right before moderating that with “keep yourself safe” messages about what can go wrong. Let’s distinguish between the fundamental statistical messages and enabling skills. Where time is short, let’s concentrate on fundamentals and strive for the fastest ways to convey them. I fully favor the expanded curriculum that empowers students to speak the enabling languages used by statisticians (computing/algorithmic, graphics/visualisation, and mathematics). But they are not the fundamentals. They are great enabling skills for helping people on their way once they know where they are going. Visualisation stands out from the others because it can provide a fast track to understanding fundamentals.

I applaud the more “data-science” agenda for those making a serious commitment to statistics. Integrity (Section 2.2) also demands that undergraduate statistics programmes provide good employment skills for the majority who will not go on to further study. There are more of these in the “data science” aspects of statistics than in most of what we have traditionally stressed. But we have to consider the striptease of what to reveal and when. Not everything we newly think is important has to be revealed straight away.

I worry about starting too early with wrangling messy data. Yes, we need to teach statistics majors to deal with messy data. But extracting jewels from gloop is not something most people do because they love messing around in gloop. They want the jewels. But first they have to know (i) that jewels exist, and (ii) they might be in there. So let’s first have them discover jewels in places where they are easier to find. Also coming into vogue for intro courses is working in “reproducible-research” modes. Yes, statisticians should work like this, and yes, there is a stage in a program where it should become the standard way of operating. But all these things slow down what you can see and how fast you can see it. There should be a sniff test. Is this an enticing element of courtship? Or do I feel the skin-pricks of glass shards? So should we save it for after marriage? Or at least till after moving in?

Can statistics secure a central position in the new data world? Is there a will to find a way? Sadly I think statisticians in the best liberal arts colleges and Bob Hogg’s BIG (Business, Industry, and Government) offer more hope than most research universities. The former want it. As to the latter, I’m not so sure.

References

ASA Curriculum Guidelines (2014), “American Statistical Association Undergraduate Guidelines Workgroup,” 2014 Curriculum Guidelines for Undergraduate Programs in Statistical Science, Alexandria, VA: American Statistical Association, http://www.amstat.org/education/curriculumguidelines.cfm.

Bartholomew, D. (1995), “What is Statistics?” Journal of the Royal Statistical Society, Series A, 158, 1–20.

Chambers, J. M. (1993), “Greater or Lesser Statistics: A Choice for Future Research,” Statistics and Computing, 3, 182–184.

Cleveland, W. S., and Hafen, R. (2014), “Divide and Recombine (D&R): Data Science for Large Complex Data,” Statistical Analysis and Data Mining, 7, 425–433.

Cobb, G. W. (2015), “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum from the Ground Up,” The American Statistician, 69.

Diggle, P. J. (2015), “Statistics: A Data Science for the 21st Century,” Journal of the Royal Statistical Society, Series A, 178, 1–18.

Friedman, J. H. (2001), “The Role of Statistics in the Data Revolution,” International Statistical Review, 69, 5–10.

Marquardt, D. W. (1987), “The Importance of Statisticians,” Journal of the American Statistical Association, 82, 1–7.

Rodriguez, R. N. (2013), “Building the Big Tent for Statistics,” Journal of the American Statistical Association, 108, 1–6.

Wild, C. J. (1994), “On Embracing the ‘Wider View’ of Statistics,” The American Statistician, 48, 163–171.

Wild, C. J. (2015), “On Locating Statistics in the World of Finding Out,” http://www.arxiv.org/abs/1507.05982.


Teardowns, Historical Renovation, and Paint-and-Patch: Curricular Changes and Faculty Development

Andrew ZIEFFLER and Nicola JUSTICE

When Nicola and I agreed to write a discussion for George Cobb’s paper, “Mere Renovation is too Little too Late: We Need to Rethink Our Undergraduate Curriculum from the Ground Up,” we knew that one of the more difficult tasks would be to respond in under 1000 words. The first thought we had after reading Cobb’s paper was, “we agree that the current consensus undergraduate curriculum George refers to is out-of-date and needs to be updated.” The second thought, almost instantaneously thereafter, was, “how will such a change happen, particularly given all of the potentially affected stakeholders?” (faculty, students, alumni, administration, client disciplines, etc.) And the third, much more cynical thought, after several minutes of discussion was, “this may be impossible.”

As we cogitated over Cobb’s vision, we kept returning to the seemingly difficult question of how to get stakeholders to buy in to the immense amount of work involved in curricular revision. Curriculum of any kind is a statement about the values and cultural norms of a discipline. It is the educational process through which aspiring members of the profession gain knowledge, skills, values, habits, and attitudes. Curricular revision is at its best difficult, and can be quite controversial due to the conflicting views that inevitably emerge, and in some cases may even be divisive (e.g., the reading and mathematics wars; see Schoenfeld and Pearson, 2009). All that being said, George has an uncanny ability to illuminate the large problems that need solving in the discipline and motivate others to rise to the challenge of making the “impossible” possible.

We envision that realistically, any kind of lasting change ofthe type Cobb is proposing will occur, initially, at the local level.If so, then perhaps it is fitting to ask: (1) does my institution’scurriculum need changing; and (2) if it does, what level of cur-ricular revision is palatable to local stakeholders?

In answering the first question, ASA’s Curriculum Guidelinesfor Undergraduate Programs in Statistical Science (Horton etal., 2014) and Cobb both offer compelling reasons that most in-stitutions’ statistics curricula need revision. But, we hope it isalso in the nature of any particular statistics program to collectthe evidence needed to evaluate their own situation; as Reagansaid, “trust. . . but verify.” There are several questions that canguide the collection of data, depending on what the local stake-holders value. For example (adapted from Preskill and Catsam-bas, 2006, pp. 101–103),

• What are the objectives for the curriculum, and is it achieving those objectives?

Online discussion of “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum From the Ground Up,” by George Cobb, The American Statistician, 69. Andrew Zieffler, University of Minnesota, 167 EdSciB, 56 East River Rd., Minneapolis MN 55455. Nicola Justice, University of Minnesota, Minneapolis, MN.

• What are the needs of those closest to the program (e.g., students, faculty, etc.), and is the current curriculum meeting those needs?

• What are all the effects of the current curriculum on students, including any side effects?

• What are the local and more global arguments for and against the current curriculum (cost–benefit)?

• Would an educated consumer (student) choose to study under the current curriculum?

If the data collected support curricular revision (and we sincerely believe it most often will), then it follows to consider the second question, regarding extent of the needed revision.

Using his real-estate metaphor, Cobb proposes a “tear-down” of the undergraduate curriculum: a complete gutting and rebuilding. Unfortunately, as Robin Lock reminded us in his 2005 discussion at the Joint Statistical Meetings, most attempts at curricular revision are not complete tear-downs, but rather, “paint-and-patch,” fixing a few things that didn’t work quite so well; mostly just sprucing things up. Cobb recognizes this, pleading that, “we do more than just graft a single new ‘big data’ unit onto an existing course.” Of course, there is also something between these two extremes, a “remodel” akin to leaving the over-arching structure in place, while updating some things, rebuilding others. There is potential for “historical preservation”: trying to save structures and architecture, while at the same time updating and renovating, all while spending more resources than it would have taken to tear-down and rebuild. Granted, such an approach may be easier to navigate, politically.

When choosing a model of curricular change, it may be appropriate to revisit some of the evaluation questions listed above, but with more focused consideration on changes in the curriculum. For example,

• What are the needs of those closest to the program, and what scale of curriculum changes could meet those needs?

• What are all the potential effects of changing the curriculum on students (and faculty), including any side effects?

Other guiding questions may be:

• What can our faculty and staff afford in terms of time?

• Is there financial help available from the institution?

• Are there parts of the curriculum that can be saved or retained?

• What kind of construction debris are we willing to accept while we do the revision?

©2015 Andrew Zieffler and Nicola Justice. The American Statistician, Online Discussion.


• How will the changes eventually impact curricula to prepare students (high school) and to follow up (graduate studies)?

It is imperative in making and considering curricular changes that we also consider how those changes could affect the development of current and future teachers of statistics. There are many academics that are comfortable teaching the consensus curriculum. How will the community help them to teach potentially new courses that include content with which they are drastically less familiar? And, are those faculty members willing to engage in this preparation? It also may be that the changes in content need to be accompanied by changes in pedagogy; faculty may need to transition from “let me, the expert, tell you” to an approach of “let’s learn this together,” an approach we acknowledge is far less comfortable for many instructors.

Finally, the rapid change we have observed in the discipline, especially in the last 10 years, makes the curricular changes Cobb proposes more urgent and, at the same time, more difficult: it is easy to imagine that by the time a curricular revision is finished it could be almost immediately out of date. It may well be worthwhile to consider how to establish flexibility within the curricular structure to accommodate the inevitable changes that will continue to accompany the evolving discipline.

While the challenges Cobb lays out are numerous, the goal is admirable, and is worthy of deep thought and reflection. Using the tools of our discipline, data and analysis, we should be able to critically evaluate our current statistics curricula and make quality, informed improvements to them. How this occurs will no doubt be at the heart of many future conversations and scholarly debate, but with this paper, George has certainly begun that discussion.

References

Cobb, G. (2015), “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum From the Ground Up,” The American Statistician, 69.

Horton, N., Chance, B., Cohen, S., Grimshaw, S., Hardin, J., Hesterberg, T., Hoerl, R., Malone, C., Nichols, R., and Nolan, D. (2014), “Curriculum Guidelines for Undergraduate Programs in Statistical Science,” American Statistical Association. Available online at http://www.amstat.org/education/curriculumguidelines.cfm.

Preskill, H., and Catsambas, T. T. (2006), Reframing Evaluation through Appreciative Inquiry, Thousand Oaks, CA: Sage.

Schoenfeld, A. H., and Pearson, P. D. (2009), “The Reading and Math Wars,” in Handbook of Education Policy Research, eds. G. Sykes, B. Schneider, and D. Plank, New York: Routledge, pp. 560–580.


Rejoinder to Discussion of “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum from the Ground Up”

George W. COBB

1. Introduction

What a failure! I worked so hard to be provocative only to have Gelman and Loken call me “typical,” have De Veaux and Velleman dismiss me as “beside the point,” and have Notz fantasize that I might be only “a bad dream.”¹ Wild, obliquely through his title, sneers at my challenges: not far enough, not fast enough, and not broad enough. Clearly my race is run. I was right to retire five years ago.

Of course my sample of characterizations is grossly biased, both by selection and by quotation out of context. In reality, I feel honored that so many colleagues whose work I have long admired have taken the time to read what I wrote, to think deeply in response, and to write such a variety of original comments. What a picnic! I won’t go there, however; at least not in musteline clothing. But picnic, yes: I and all the respondents agree in wanting to make our blanket-buffet of statistics both more attractive and more substantive. In particular, we can agree with Fisher and Bailar that ours is a time of opportunity.

In what follows, rather than respond individually to the 19 sets of authors one at a time noting points of agreement and disagreement, I attempt to synthesize and respond by categories of topics. Accordingly, Section 2 summarizes and celebrates the variety of innovative courses and programs described in the responses. Then Sections 3 and 4 address two apparent misunderstandings and two major disappointments. Next, Section 5 offers a summary of the challenges and strategies for reform suggested by the respondents, and Section 6 follows with a call to redirect research in statistics education to help meet those challenges. Section 7 outlines a triangular tension: our subject, our university departments, and our U.S. liberal arts colleges. A valedictory Section 8 concludes on a note of constrained optimism.

2. Our Resplendent Picnic of New Programs

A disadvantage of the response/rejoinder format is that points of disagreement tend to get more space in the rejoinder than do points of agreement. There is far more to celebrate in the many innovations described in the responses than the length of this section might suggest. The academic levels of these innovations span the range from K–12 through graduate school (and beyond: Utts urges “continuing education” and Gould urges “K-retirement”). The institutions impacted include grade schools and high schools, all levels at four-year colleges, and graduate programs at universities. The scope ranges from single courses to entire programs, many of them interdisciplinary. The approaches are equally varied. Many could be seen as anticipations of the Horton report (ASA, 2015) in their emphasis on data science and/or their reliance on experience working with real data to address an applied challenge of genuine import.

Online discussion of “Mere Renovation is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum From the Ground Up,” by George Cobb, The American Statistician, 69. George Cobb is Senior Research Professor, Department of Mathematics and Statistics, Mount Holyoke College, South Hadley, MA 01075.

¹ Here and after, when I cite invited comments, I omit the redundant 2015.

To take space to summarize each of the innovations I applaud would duplicate what the respondents have already written, and so instead I urge any readers who have not read these responses to do so. Here, briefly, are summaries of five innovations, one that serves to illustrate the core recommendations of the Horton report, and four others that fall outside its convex hull. I list them from the one I consider most at the center to the one I consider extreme.

• Chance/Peck/Rossman: A new kind of introductory course. To paraphrase, California Polytechnic Institute at San Luis Obispo now offers a course for entering first quarter students that begins their majors’ discussions of the historical roots of the discipline, of ethics, and of future directions, while introducing big data, computing in R, communication, and collaboration skills. Is it any wonder that for me the trio Chance/Peck/Rossman suggests “CPR” for our beginning course?

• Gould: “bringing a ‘data science’ curriculum to high school.” Gould’s project “Mobilize” is funded by the National Science Foundation. For me what stands out here is Gould’s emphasis on teaching statistics within the context of the entire process of scientific investigation at the high school level. This goal resonates with the Horton report’s attention to statistics as an integral part of the scientific enterprise, but also puts Gould’s project in the vanguard of those who would challenge the entrenched Advanced Placement curriculum, its lingering obeisance to probability, and its tradition of mathematically oriented teachers.

• King: A voice from industry. The teaching of statistics has long been the beneficiary of colleagues who work in business, industry and government, and who care also about education. I am happy to salute King for contributing to this important and valued tradition. I enthusiastically support what I see as her three main imperatives: (1) Recruit early, in high school, from students in Advanced Placement courses in calculus, statistics, and computer science; (2) encourage early, at the sophomore level in college, a commitment to an applied subject; and (3) expect and face the challenges of implementation.

©2015 George W. Cobb. The American Statistician, Online Discussion.

• Ward: Learning communities. Alone among the responses, Ward’s focus is on sociology more than cognition, and describes a program already in place at Purdue University, one that anticipates my thesis in Section 6 that we need a new direction in our research on statistics education. Ward’s learning community at Purdue “could be implemented nationwide. The Purdue Statistics Living Learning Community blends the academic, research, residential, and professional development experiences of 20 sophomore students per year.” Clearly, there are issues of staffing and scale, but the Ward model is an inspiring example.

• Wickham: A grammar of statistics.² Wickham argues that our goal should be to develop a “grammar of statistics” as a framework that will allow us to broaden the “safe” use of statistics to a vast population of statistical “amateurs.”

Modern theory of finance identifies a curve, the “efficient frontier,” that gives the maximum expected return for a given level of risk and so defines the tradeoff between the two. As one might anticipate, the higher the acceptable risk, the greater the expected return. I regard the efficient frontier as a metaphor that applies broadly to all human investments in our future, and applies in particular to our efforts in education reform. Thus I see a tradeoff between trying for a small change with a large chance of enlisting a major following, and, at the other extreme, trying for a much larger change with a correspondingly smaller chance of broad-based impact short term. In brief, attempts at reform are constrained: the bigger the step forward, the smaller the audience that will choose to follow.³ In this context I regard Wickham’s proposal for a “grammar of statistics” as the most ambitious of all the proposed innovations. It has the potential to revolutionize the way we think about data analysis and about how we teach it. At the same time, because it promises to be such a major step in a new direction, it may prove to be too far ahead of its time to gain traction short term. (See footnote 3.)

In choosing these previous five innovations to single out, I do not mean to downplay my admiration for any of the others. They may be closer to the mainstream of reform as set out in the Horton report, but for that very reason they may be more likely to attract followers in the short term, and thus more likely to have broader impact short-term.

I now turn to a pair of points where I wasn’t clear enough, and another pair where I was disappointed.

² When it comes to statistics education, Christopher Wild advocates “courtship” and Hadley Wickham urges “safe sex.” Exploring such metaphors is beyond the scope of this rejoinder, and so is left as an optional exercise for the reader.

³ As a salient example, consider Peter Nemenyi, a Hungarian-born mathematician who fled the Nazis and became a U.S. civil rights activist. Few teachers of statistics know his name, but in the 1960s he created and taught a randomization-based introduction to inference at the historically black Hampton Institute in Virginia, now Hampton University. Decades after Nemenyi, Gottfried Noether at University of Connecticut and Frederick Mosteller at Harvard wrote textbooks for courses in the same spirit. Only in the past decade, 50 years after Nemenyi, decades after Noether and Mosteller, has the idea of such a course begun to gain traction.

3. Two Failures to Communicate: “I Thought I Was”

Several comments brought to mind an old story about the notoriously taciturn Thomas Dewey, meeting the press as part of his 1948 run for President. “Smile, Governor,” a photographer pleaded, to which the dour Dewey, taken aback, replied, “I thought I was.” In a similar spirit, I was somewhat taken aback by two sets of comments I thought I had anticipated and addressed in response to helpful comments by reviewers of my initial draft. In short, “I thought I was.”

• The tear-down. It is not our curriculum that is the tear-down, but rather, less drastically short term, but more ambitiously long term, it is our thinking about curriculum that needs to start from the ground up. Unless I misread, not one of the respondents wants us to continue to debride the skins of our noses on the same old curricular grindstone. We all want change. All the same, our thinking about change is too somnolent.

The distinction between how we think and what we do needs to be recognized more explicitly. What we do is constrained by reality. How we think is not, and should not be. To borrow from Robert Browning (1855), our reach should exceed our grasp. None of the respondents struck me as clear enough about this difference. Some accuse me of wanting to tear down our existing curriculum. I don’t. What I do want is for us to seek out and question some lurking assumptions that shape the way we teach. As I see it, one of the biggest general issues in statistics education, one that may prove pivotal for our future, is the tension between mathematics and data, between abstraction and context, between theory and story. We statisticians have come mainly from mathematics: Mathematics has been our computational engine, our source of underlying theory, and in the undergraduate curricula since the 1950s, our path to respectability. By tradition, our allegiance is to mathematics. But mathematics insists that we understand, learn, and teach top down: Theory dominates, data merely illustrates. This too-often-unconscious hierarchy of priorities in our thinking about curriculum is the tear-down.

To recycle a simile, the tear-down is our mathematically driven tendency to treat topics and courses as structured like a pyramid: Knowledge comes in hard rectilinear blocks. You have to complete the first layer of blocks before you start the second one; you can’t talk about xyz until you’ve talked about rst and uvw, all the way back to abc. Transferring this logic from the mathematics curriculum to, for example, how we learn to read exposes its flaws for teaching how we learn from data. We don’t learn to read three letters at a time; rather, we learn the way De Veaux and Velleman want us to teach, holistically. In learning to read, once we get past the stage of what for my generation was “See Spot run. Oh, oh, look,” we come to experience learning as a kind of archeological dig in which we put the pieces together as we gain depth. Different students learn different things in each of their courses, depending on their backgrounds, but for each student, the threads of new information and new ways of thinking get woven into a pre-existing and ever-evolving tapestry of understanding.

I’ve responded here to those for whom I failed to make clear my sense of what it is that needs to be reimagined from the ground up. Other respondents put their focus on obstacles and challenges. Although I agree with their concerns, I would be sorry to have the genuine boulders in our uphill path distract us from trying to anticipate the thrill of the long view from above tree line.

• Breiman’s dichotomy. Of course De Veaux and Velleman are right to say that we should teach “holistically,” but their “beside the point” is beside the point. To recognize a neglected polarity as Breiman has done is not to argue for a forced choice. As I read Breiman, his main point was not that we must choose one or the other, not that we should abandon probability-based analysis, but rather, that our current synthesis fails to recognize the importance of the kind of thinking he (unfortunately) named “algorithmic.” Breiman was not arguing for either/or, but for balance, integration, and attention to the relationship between the available data and the goal of the analysis. So was I. So are De Veaux and Velleman.

I continue, as urged by Chance, Peck, and Rossman, to explore the fruitful tensions between Breiman and B&K: In thinking about curriculum, how can we best understand the role of probability? The comments of Kass help push this exploration forward, and I agree with his censure of authors who “have done nothing to indicate that [their method] performs well.” Are there alternatives to probability-based models for this purpose? In this connection I also find it helpful to keep in mind, as I tried to point out myself but Loken and Gelman state more clearly, that “algorithmic” is not the same as “anti-probability.” Finally, I wonder whether and how we can justify a probability model for what Gould calls “algorithmic data.”

4. Two Disappointments: A Pier Into the Future?

In James Joyce’s Ulysses, Stephen Dedalus defines a pier as a “disappointed bridge,” leaving implicit its failure to reach the intended destination (Joyce, 1922). Sadly, two of my more heartfelt attempts at a bridge to the future remain largely disappointed, short of the shore I had hoped to reach. They are: flattening prerequisites and teaching Bayes early.

• Flattening prerequisites. Few of the respondents commented directly and in substance on my argument that we can teach many core concepts and methods of our subject (applied Bayes, design and ANOVA, regression, and what is traditionally mislabeled “mathematical” statistics) without most of the traditional prerequisites.

I salute Lane-Getaz for being an exception. In her response, she describes a second statistics course at St. Olaf that teaches students of the social sciences much of the content they might otherwise learn in a graduate-level course in quantitative methods.

Even more radically, Ridgway encourages us to recognize that computers and visualization can engage students early and directly with multidimensional and nonlinear relationships. We should not assume that students can only understand multivariate data if they climb our familiar step-ladder, one dimension at a time. Step one, histograms. Step two, scatterplots. Step three, two-way additive models. Step four, interaction. And on and up. Ridgway shows us a different path.

We should all embrace the concern of Holcomb and Moreno that for many students across the country, as at their two institutions, mathematical expectations can be a major obstacle. Unless we flatten prerequisites, we bar the way to such students. But: If we do flatten prerequisites, where and when will students who aim for graduate school learn the mathematics they will need for admission? Finding answers to this question is critical. Here are three possibilities: (1) Change some of the ways we teach mathematics to students of statistics. (2) Change some of the expectations of graduate programs. (3) Recognize that more and more careers with data don’t require a traditional graduate program. In more detail: (1) For some students, and at some institutions, it may work to teach applications first, and use that background to support the teaching of mathematics. For example, many teachers of mathematical statistics urge students to take Stat 101 first. More radically, as described in my article, it is possible to teach the concepts of mathematical statistics with only a prerequisite of a single semester of calculus. Probability can come later. (2) Many traditional PhD programs want incoming students to have learned mathematics through the level of a rigorous course in real analysis. The need for these students continues undiminished, but the need for other data-oriented graduate students without that background continues to grow. (3) As Utts points out, “the reason for having an undergraduate degree in statistics is changing rapidly,” with more opportunities for jobs right after graduation, without the need for graduate courses.

In the spirit of Holcomb and Moreno, although I applaud Albert and Glickman’s urging us to teach a course based on generalized linear models, I was disappointed to read that “this would require students to have knowledge of a variety of probability models.” Why not teach those models, as needed, in the context of analyzing real data?

I agree with Wild that our aim should not be just to offer a “smorgasbord” of courses each covering “a small number of ‘advanced’ topics at an intro level.” That aim, however, was not the point of my examples. In no way did I mean to suggest that we should revise our curriculum just by offering more flavors of Stat 101. Quite the contrary. My point was and is that if only we choose to, we can for many strong students skip Stat 101 altogether and offer instead courses that allow good students to learn important areas and real applications at the level of a second course, teaching the necessary Stat 101 concepts along the way. In short, we can, with modest effort and thought, revise existing intermediate courses to teach the same content without a statistics prerequisite. If we statisticians don’t do it ourselves, others will do it for us. (I may be guilty of a manufactured disagreement here, a la reality TV. Anyone who knows inZight (Wild, 2015) knows that Wild wants to flatten prerequisites as much as I do.)

• Teaching Bayes early. I was doubly disappointed here. First and foremost, I was disappointed that only 3 responses out of 19 highlighted the importance of teaching Bayesian thinking. Surely the fraction 3/19 far under-represents the role of applied Bayes in our current practice. I agree emphatically with Albert and Glickman that “the time is right for the development of an applied Bayesian course,” but why not a Bayesian version of Stat 101? Thus I was also disappointed that the few responses that did mention Bayes were heavy in their emphasis on prerequisites. I disagree with Notz that “to introduce Bayesian thinking students need to know something about probability distributions, conditional probability, and Bayes theorem.” As I see it, this untested assumption (a call for education research!) bears substantial responsibility for the failure of past attempts to teach a Bayesian elementary course. As I have suggested, we can get by with far less formal probability than is usually assumed, and what little is truly essential can be taught along the way as needed in a Stat 101 course with a Bayesian orientation.

Albert and Glickman cite the textbook by Link and Barker (1999) as the basis for a “nice applied Bayesian course” and so, full of hope, I ordered a copy. Sure enough, it is a lovely book with an impressive collection of deep and interesting applications, but it implicitly requires three semesters of calculus and a semester of probability.

Utts is right that new textbooks can lead the way. Our profession urgently needs a new textbook for teaching applied Bayes at the introductory level. I hope for a book along the lines I have suggested, one that relies on Laplace’s version of the likelihood principle to avoid the need for any of the usual formal mathematics. No marginal probabilities, no Bayes Theorem, and no calculus. Laplace’s eighteenth century genius together with our twenty-first century computer simulations reduce the basic idea to a simple fraction $P(\theta \mid y) = \frac{\#(\theta \mid y)}{\#y}$. Adjustments for continuous distributions and prior probabilities are straightforward. (More research, please.)
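One way to read the simple fraction above is as a counting exercise over simulated data, and the following is a minimal sketch of that reading, not the author's own procedure. It assumes a small grid of candidate values for θ, each equally likely beforehand, and binomial data (successes in a fixed number of trials). The function name `simulate_posterior`, the grid, and the numbers 7 and 10 are hypothetical choices made only for illustration. The denominator #y is the number of simulated runs that reproduce the observed data, and the numerator is how many of those runs used a given θ.

```python
import random
from collections import Counter


def simulate_posterior(y_obs, n_trials, thetas, n_sims=50_000, seed=0):
    """Approximate P(theta | y) by counting simulated runs (illustrative only)."""
    rng = random.Random(seed)
    kept = Counter()  # counts of theta among runs that reproduce the observed y
    for _ in range(n_sims):
        # Assumption: every candidate theta is equally likely beforehand.
        theta = rng.choice(thetas)
        # Simulate the data: number of successes in n_trials at this theta.
        y = sum(rng.random() < theta for _ in range(n_trials))
        if y == y_obs:
            kept[theta] += 1
    n_matching = sum(kept.values())  # this plays the role of #y
    # Each fraction below plays the role of #(theta | y) / #y.
    return {theta: kept[theta] / n_matching for theta in thetas}


# Hypothetical example: 7 successes observed in 10 trials.
posterior = simulate_posterior(y_obs=7, n_trials=10, thetas=[0.1, 0.3, 0.5, 0.7, 0.9])
for theta, share in posterior.items():
    print(f"theta = {theta}: {share:.3f}")
```

Nothing here uses a marginal probability, Bayes Theorem, or calculus: the whole computation is simulate, filter, and count. Unequal prior probabilities could be handled by drawing θ with unequal weights, which is one possible form of the adjustment the text calls straightforward.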

I regard both of my two disappointments as strong support for my assertion that our thinking about curriculum is indeed a tear-down. We have the content already, but our thinking about how to make that content accessible to talented and motivated undergraduates remains immobilized in a spider web of old assumptions. Tear down that web!

5. Implementation: Principles, Obstacles, Challenges, and Strategies

My goal in what I wrote originally was to float high and take a long view, as a counterpoint to the Horton report’s appropriately more practical and tethered emphasis. Many of those who wrote in response to my article have chosen to drop lead in my basket and drag my balloon back to earth. They emphasize obstacles to change and challenges to making change happen. Lane-Getaz reports on her own very real experience with resistance to change. Gelman and Loken are right that “developing a forward-thinking approach is not so easy” and I share with Utts her “pessimism about implementing” calls for change. Here again, however, we need to render unto reality that which is real, but only that much and no more. We should not let shadows of the short-term darken our vision of what we might accomplish with time and effort. Thus I am cheered that Utts, our ASA President elect, is no pessimist: “ . . . we have come a very long way in a very short time. And the pace is quickening.” Franklin suggests, and I agree, that our “two biggest challenges” are “building a culture that advocates this” and “the teacher preparation needed.”

Teacher preparation—this one of Franklin’s (University ofGeorgia) two challenges is clearly a major issue, one that manyother respondents echo. Fisher and Bailar (Miami University)raise the same concern in connection with issues of scale andteaching thousands of students at an institution that relies onadjunct faculty who are paid too little to learn to depart fromtheir familiar traditional course. Although Fisher and Bailar fo-cus their concern on the introductory course, I think their worryabout implementing change in fact applies to faculty at all lev-els. For example, at the high school level, Gould (UCLA), inconnection with his NSF-funded Mobilize project, which seeksto engage students with the role of statistics in the process ofscientific investigation, cites the challenge of teacher prepara-tion. At the graduate level, Kass (Carnegie Mellon University)notes that “we have not penetrated into schools of education.”Notz (Ohio State) also identifies teacher preparation as a ma-jor issue. Although other respondents chose to focus on otherissues, it is hard to imagine that any of them would disagree.

Changing the culture is closely tied to teacher preparation,in that teachers help shape the culture, and culture helps shapeteacher preparation. Along with Franklin, several others cham-pion the need to reshape the culture of statistics education.Fisher and Bailar point out the need to enlist client departments.Holcomb and Moreno urge us to publicize employment oppor-tunities. Temple-Lang has found that “data analysis competi-tions” are effective in getting students actively engaged.

More broadly, Kass writes that “the biggest challenge instatistics education arises from the difficulty humans have inaccepting ambiguity and acting reasonably in the presence ofuncertainty.” I agree wholeheartedly. At the same time, I can’tresist a chance to re-engage: I find it useful to make a sharpdistinction between uncertainty, which can be described us-ing a probability model, and ambiguity, which cannot. Randomsamples and randomized experiments lead without ambiguity tomodels for uncertainty. For data from other sources, the connec-tion to any possible model for uncertainty is ambiguous. Gouldarticulates the challenge to our profession: How can we “findmeaning in data that do not belong to the probability culture”?

Also on a general level, Zieffler and Justice ask: “how to get stakeholders to buy into the immense amount of work involved in curricular revision.” At the risk of appropriating their ideas for my own purposes, I suggest that they offer an answer to their own question: by “using the tools of our profession – data and analysis.”

6. Research in Statistics Education: Time for a New Direction?

To start with my punch line, I suggest here that based on the available evidence, our profession would benefit from a shift in the direction of research in statistics education, away from the cognitive psychology of understanding probability and its discontents, toward the social psychology of institutional change and its resistances. Gelman and Loken note (using Zieffler and Justice’s “tools of our profession”) that my article “has more than 100 references, only one of which addresses empirical research in educational effectiveness.” As a matter of principle I try not to argue with data, and so I plead guilty as charged. This observation and its implications resonate with a question posed more explicitly by Chance, Peck, and Rossman, who ask, “Do we test and evaluate before we tear down and build, or do we just tear down, build, and hope for the best?” I regard the question as deliberately crafty in its phrasing, and my short answer is “Neither one,” but I take the question seriously, and so I devote the rest of this section and much of the next one to a longer response.

In my experience research in educational effectiveness is useful in helping us to understand which of our existing practices are more effective, which ones are less effective, and why. But in my experience, also, such research, though important for evaluating what already exists, tends not to be a source of new courses. As an example, I borrow from Utts: “What changed the landscape for Bayesian methods” was “a few innovative leaders who made it easy for others” to implement and teach these methods “by writing textbooks and computer programs.” As I recall, a certain Gelman was one of those innovative leaders and authors who helped ease the way to Bayes for the rest of us.

To pursue the point, I think the history of our subject supports Utts’s thesis that innovative textbooks are our engines of change. Here, with apologies for many omissions, is a severely pruned list, offered more as a provocation than as data. For more detail, I refer readers to my original article. In this list, I choose one major innovation for each of the last several decades, and one influential textbook author for each.

• 1950s: Making the teaching of statistics legitimate at the elementary level. Frederick Mosteller (1961).

• 1960s: Teaching us to teach with real data, before computers. John Tukey (1977).

• 1970s: Interactive computing and data analysis. Francis Anscombe (1981).

• 1980s: Real data in the first inference course. Freedman, Pisani, and Purves (1978); Moore and McCabe (1984).

• 1990s: Activity-based statistics. Richard Scheaffer et al. (1996).

• 2000s: Randomization-based inference. Peter Nemenyi and others. (Note here that Nemenyi developed and taught his randomization-based course in the 1960s. It has taken us 50 years to catch up. For more, see footnote 3.)

What stands out for me is that none of these pivotal innovations originated from research in statistics education. Such research does not ordinarily lead directly to innovations; rather, it documents whether and in what ways existing approaches do or do not help students learn. At the same time, it is important to be clear. I do not mean to disparage the importance of this research, only to sharpen our sense of its role. Such research has been instrumental in advancing our profession. Although I assert that it has not been a direct source of innovative textbooks, I do credit that research as a source of new thinking that can lead to innovative textbooks. (A prime example is Chance and Rossman 2015.)

To conclude this section, I offer a four-point summary.

• Salute: The importance (past and continuing) of research in statistics education. This research has played an essential role both in supporting and in shaping our teaching. It has helped all of us who teach to choose approaches that help students understand, and to avoid approaches that reinforce misunderstanding. It has helped us to understand which approaches enlist student interest and enthusiasm. It has, I am convinced, pushed us in those directions that have powered our extraordinary growth in student enrollments. In addition, this research has been essential to the success of proposals to granting agencies for funding in statistics education, and helpful to those agencies in their decisions about which projects to fund. Looking to the future, it seems clear that our need for research continues.

• Premise: As fellow statisticians, our colleagues who specialize in research on the teaching and learning of statistics should, following Zieffler and Justice, rely on data to direct their talents and efforts. What are the important open questions?

• Open questions: What do we need to know at this point in our development? I think the responses to my original provocation offer a clear consensus. According to the two dozen statisticians in our admittedly biased sample, the main challenges to statistics education at this point are not matters of cognition, but matters of implementation, as set out in Section 5. This matches the concern of funding agencies with “dissemination.”

• A new direction? Based on the available data, I suggest that where we most need research is in the area of implementation rather than cognition. I make this suggestion with diffidence, knowing that so many colleagues who do this research have come mainly from a background in cognitive psychology, in the tradition of Kahneman and Tversky. For many of them, the attraction of research is the challenge of trying to understand the way students learn the probability-based aspects of statistical thinking. Nevertheless, I think this vein of research offers a dwindling source of new nuggets, and that evidence supports my argument for change. Sadly, perhaps, a background in cognitive psychology is no longer the best preparation for research that will help shape the future of statistics education.

As I see it, the argument for change is even more compelling when it comes to university graduate programs in statistics.

7. A Triangular Tension: Our Subject, Our Graduate Departments, and Our U.S. Liberal Arts Colleges

Gelman and Loken observe that I seem “to be concerned with the future of traditional statistics departments...” and they are right: I failed to be clear about the difference between our subject and our graduate departments. Their comment is especially clarifying in the context of observations from Wild. Together, Gelman, Loken, and Wild suggest a tension involving our subject, our university departments, and our U.S. liberal arts colleges. In this section (1) I argue, with Gelman and Loken, that the future of our subject is assured, (2) I agree with Wild that many university departments are bastions of conservatism, under siege and at risk, and (3) I expand on Wild’s hope that liberal arts colleges can help lead our way into the future.

Our subject. The success of our subject, learning from data, is no more in jeopardy than is the role of money in politics. Data, like money, will always be in demand. You can’t have too much of it unless you don’t know what to do with it.

University departments. As recently as two decades ago, our subject and our university departments were aliased. No longer. Our subject threatens to outgrow many of its graduate departments. Wild quotes from Peter Diggle’s (2015) address as President of the Royal Statistical Society: “I would like to see less statistics in undergraduate mathematics degrees.” In effect, our subject may be important, but graduate schools don’t want you to study it. Diggle’s declaration calls to mind the Shakers, a now-extinct New England sect that expected to advance its agenda despite a ban on sexual reproduction. Can university departments survive if they rely exclusively on students who major in mathematics, study little or no statistics, but nevertheless choose to pursue a PhD in a subject they barely know? If you apply to graduate school in molecular genetics, you are expected to know something about molecules and genes. If you apply in astrophysics, the more you know about galaxies and quantum theory the better. Not so for statistics. Only in our subject, alone among the sciences, do we hear, “Please learn as little as possible.” The future of our subject may be assured, but the future of university departments in our subject is a different matter. We can all take a lesson from mules, who are both stubborn and unable to propagate.

Liberal arts colleges. As Wild points out, these “elite” four-year colleges are unique to the U.S. They stand out in many ways:

• Faculty tend to come from PhD programs at top research universities.

• Teaching load is comparatively light, four to six courses per year, as opposed to as many as ten per year at U.S. two-year colleges.

• Greater emphasis on teaching excellence, in comparison with universities and two-year colleges.

• Curricular development is recognized as a form of scholarship.

• Comparatively small programs offer flexibility and opportunities for change. Our subject is changing rapidly. Large programs, like ocean liners, can respond only slowly. Liberal arts college programs, like kayaks, can turn quickly to take advantage of the currents.

• Curricular emphasis remains tied to the ancient trivium and quadrivium, with an enduring commitment to process over content.

As my former dean, now president of New College (Florida), used to say, U.S. liberal arts colleges are the places where cutting-edge research from universities is brought into the undergraduate curriculum (O’Shea, 2005). Faculty at these colleges have the connections to colleagues at research universities, the time, and the institutional support to create new courses and programs.

But wait: There’s more! The commitment of liberal arts colleges to putting critical thinking first (process ahead of content, skeptical reflection ahead of vocational training) creates a particular resonance with statistical thinking. Statistics is about how we learn from what we can observe, and how we communicate what we have learned, the two fundamental goals of the ancient liberal arts. This value of the liberal arts was recognized decades back when the school of business at the University of Chicago established a fellowship program based on a study that found that a disproportionate number of CEOs of the Fortune 500 companies came from liberal arts colleges. The fellowship program, which offered a summer internship and guaranteed admission, was open only to students at a select 50 liberal arts colleges, and only to students whose major was not in economics or business.

8. Conclusion

I look forward, with constrained optimism, based on the following four-point summary:

• Although the future of our university departments and their entrenched graduate programs may be uncertain, the future of our subject is assured.

• Education research can best serve our subject and its teaching by a shift in emphasis away from the old cognitive psychology of probability and its discontents toward the social psychology of institutional change and its resistances.

• Undergraduate programs at liberal arts colleges will likely be nimble enough to respond to trends in statistical practice and to relevant contributions from education research, regardless of what happens in university departments.

• Best of all, across the board and at all levels, our profession’s shared commitment to reliance on data will keep us working together.

Finally, I am grateful to the cited authors whose thinking prompted me to write, especially to my fellow statisticians who devoted their time to help create the Horton report (ASA 2015); to the editor and associate editors who helped with my thinking and writing, and who arranged for the online discussion; to the many reviewers whose comments advanced my thinking and clarified my understanding; and to the discussants who have joined with me and with those I have relied on in contributing to what I am sure will continue to be an ongoing discussion. I am confident that readers will agree: our profession can count on thoughtful efforts such as theirs. In the spirit of Fisher and Bailar, although our time may be one of turmoil, in good hands turmoil creates opportunity.

Additional References

References listed in the original article are not repeated here.

Anscombe, F. J. (1981), Computing in Statistical Science Through APL, New York: Springer.

Browning, R. (1855), “Andrea del Sarto,” in Men and Women.

Chance, B. L. and Rossman, A. J. (2015), Investigating Statistical Concepts, Applications, and Methods (3rd ed.).

Joyce, J. (1922), Ulysses, The Egoist Press.

Kahneman, D. and Tversky, A. (1974), “Judgment Under Uncertainty: Heuristics and Biases,” Science, vol. 185, no. 4157, pp. 1124–1131.

Wild, C. (2015), iNZight software, available online at https://www.stat.auckland.ac.nz/~wild/iNZight/getinzight.php.
