Undergraduate Learning Outcomes for Achieving Data Acumen

It is imperative to foster data acumen in our university student population in order to respond to an increased attention to statistics in society and in the workforce, as well as to contribute to improved career preparation for students. This article discusses 13 learning outcomes that represent achievement of undergraduate data acumen for university level students across different disciplines.


Introduction
Recently the value of an undergraduate degree has been challenged (Estes 2011), and demands for greater accountability in higher education have emanated from prospective students, their parents, business leaders, and politicians. The economic climate and employment considerations are central to many of these concerns. The percentage of students who report that their decision to go to college has been strongly shaped by a desire "to get a better job" has increased in recent years, and in 2012, 88% of first-year students reported this factor as very important (Eagan et al. 2014). Colleges and universities have repeatedly been called upon to do a better job in preparing students for careers.
CONTACT Anna Bargagliotti abargag@gmail.com Department of Mathematics, Loyola Marymount University, 1 LMU Drive, Los Angeles, CA 90045.
One area that needs to be strengthened in response to the career climate is student preparation in statistics and data science. The Chronicle of Higher Education recently listed the growth of data science programs as a key trend in higher education. However, it also noted that data science programs are being added without careful attention to what a data science curriculum should look like. Moreover, because data and statistics play an important role in all disciplines, undergraduate curricula in statistics and data science may be embedded within different disciplinary contexts. As such, there is a need for a set of comprehensive learning outcomes to help guide data learning across the disciplines. Such learning outcomes will help departments across institutions, administrators, and individual faculty better understand how statistics and data courses across departments fit together to provide a coherent curriculum. Data education entails ensuring that students not only have sound computing, data analysis, and statistical skills, but also have good communication skills and the ability to work as part of a team (Zorn et al. 2014; Holdren and Lander 2012; Davenport and Patil 2012). As noted by Horton and Hardin (2015), "the idea that an undergraduate statistics [major] develops general problem solving skills to use data to make sense of the world is powerful." This is what offerings in colleges and universities in statistics should strive to achieve: nimble computing data problem solvers (Nolan and Temple Lang 2010; Nolan and Temple Lang 2015). Some data science education recommendations and guidelines already exist in the literature. For example, in 2018, the Two-Year College Data Science Summit published a report outlining recommendations for data science programs at community colleges (Gould et al. 2018).
Recommended program outcomes are organized into four categories: computational, statistical, data management and curation, and mathematical. The program outcomes are further partitioned into foundations, thinking, and modeling outcomes (p. 16). Overall the guidelines provide community colleges housing data science programs a set of explicit learning outcomes to organize their programs around. Also in 2018, the National Academies Press put forth the Envisioning the Data Science Discipline: The Undergraduate Perspective Interim Report (National Academies of Sciences et al. 2018b). This report defined the term "data acumen" as the ability to make good judgments and decisions with data. It also notes that data acumen is "not a final state to be reached but rather a skill that data scientists develop and refine over time" (p. 12). To develop data acumen, mathematical foundations, computational thinking, statistical thinking, data management, data description and curation, data modeling, ethical problem solving, communication and reproducibility, and domain-specific considerations are needed (p. 33).
The learning outcomes presented in this article differ from those in these reports in a few ways. A main goal here is not to provide statistics and data science guidelines for a specific program dedicated to data science but instead to present outcomes for working toward data acumen across university courses and across disciplines. Responding to the call in the National Academies report that data science requires participation from all different disciplines, with the understanding that the degree to which different disciplines develop the components of data acumen varies (National Academies of Sciences et al. 2018a), this article presents a cross-disciplinary study that was undertaken to develop baseline learning outcomes for statistical and data learning at a university. A second National Academies report, Data Science for Undergraduates, notes the difficulty in furthering data science education across disciplines through upper division courses due to the varying topics in introductory courses in different disciplines. It also notes a need for cross-disciplinary coordination and collaboration from a wide spectrum of disciplines (p. 39). While the Two-Year College report and the first National Academies Report discuss learning outcomes for specific programs in data science, this article outlines a series of common learning outcomes valid across disciplines for working toward data acumen on a university campus.
Following the recommendations of the American Statistical Association put forth in the Curriculum Guidelines for Undergraduate Programs in Statistics (ASA, Undergraduate Guidelines Workgroup 2014), this article discusses how statistics and data education bridge many disciplines and how the different disciplinary approaches can be integrated into one set of coherent learning outcomes for undergraduate education in statistics and data education. Overall, to fulfill the growing needs of the workforce, students graduating from college need to be prepared to tackle problems using technology, work with real data, and communicate their ideas.

Statistics and Data Science Education across Disciplines at Universities
Universities across the U.S. typically have many different statistics course offerings across campus. Because it is very common to have statistics courses housed in different disciplines (e.g., mathematics, computer science, psychology, economics), the ASA and Mathematical Association of America (MAA) offer guidelines for teaching introductory statistics targeted at nonstatistics departments (ASA/MAA Joint Committee on Undergraduate Statistics 2014). Oftentimes these courses overlap and yet their prerequisite structures do not allow a student to move from a statistics course offered in one department to a more advanced course offered by another department. Departments often rightfully argue that the type of statistical techniques needed are discipline specific and thus necessitate the offering of a course within a specific discipline. Although specific techniques do vary from discipline to discipline, certain basic themes of working with data should be present in all courses. Three important, fundamental, and particularly timely themes are that students need to (1) employ technology, (2) explore real datasets, and (3) practice communicating statistical ideas and results.
Scholarly articles and recommendations of professional organizations concerning undergraduate preparation in various disciplines, in addition to the ASA sponsored documents already discussed, align with these themes. For example, skills in statistics and the ability to work with data and technology are increasingly recognized as core components of an education in sociology (Wilder 2010). A 2010 report published by the American Psychological Association recommends that psychology students complete coursework in statistics and research methods as early as possible and that the knowledge and skills gained from these courses be reinforced throughout the curriculum. A national study of undergraduate business education conducted by The Carnegie Foundation for the Advancement of Teaching concluded with recommendations that programs provide a stronger linkage between business, arts, mathematics, and science curricula and that programs promote courses that incorporate complex and ambiguous real-world issues and three essential modes of thinking: Analytical Thinking, Multiple Framing, and Reflective Exploration of Meaning (Colby et al. 2011). Statistics courses that incorporate the statistical thinking process of formulating a question, collecting appropriate data, choosing an appropriate analysis technique, and interpreting results (Franklin et al. 2007) promote these modes of thinking. Teaching statistics as an interrogative process is also stressed in both the GAISE college report (Everson et al. 2016) and the ASA Undergraduate Guidelines Workgroup (2014).
Several important reports have stated the need for students to work with real data. The Committee on the Undergraduate Program in Mathematics Curriculum Guide 2015 (Mathematical Association of America 2015) states "Working mathematicians often face quantitative problems to which analytic methods do not apply. Solutions often require data analysis, complex mathematical models, simulation, and tools from computational science." This report recommends that all mathematical sciences major programs include concepts and methods from data analysis and computing. The Guidelines for Assessment and Instruction in Statistics Education (GAISE) college guidelines also include working with real data as one of the six necessary components of structuring an introductory statistics course (ASA ?). In addition, the recommendations of the ASA on undergraduate programs in data science include Real Applications and Problem Solving as two of their Background and Guiding Principles. They state programs should "emphasize concepts and approaches for working with complex data and provide experiences in designing studies and analyzing real data (defined as data that have been collected to solve an authentic and relevant problem)" (ASA, Undergraduate Guidelines Workgroup 2014).
As data science has been described as an intersection of statistics with computer science, when considering undergraduate preparation, one must consider how the use of software interplays with statistics. Regardless of the discipline, technological fluency has become a must for success in the workforce. Therefore, university statistics and data science courses must incorporate heavy use of technology and computing.
The material commonly taught in introductory statistics courses often merely focuses on techniques. However, such methods are often "necessary but not sufficient" for modern data science (Ridgeway 2016). Instead, an undergraduate education should focus on the unifying themes of working with technology, working with real data, and communicating results for all course offerings across campuses. Moreover, if a model existed for explicit learning outcome goals of an undergraduate education in statistics and data related courses, then the door may be open to creating a coherent curriculum for students seeking statistics and data education beyond what their own departments offer.

Undergraduate Data Pathways (UDaP) Study
The National Science Foundation (NSF)-funded project (NSF Grant No. 1712296), Undergraduate Data Pathways (UDaP), focused on understanding differences and similarities of statistics and data related course offerings across different disciplines. The project carried out a rigorous study to develop a set of learning outcomes for statistics and data related courses at the undergraduate level that integrated the data-related goals put forth by several different disciplines. This paper reports on that study and presents a set of learning outcomes (LOs) that work toward data acumen for university level students across different disciplines. If a student meets all of the LOs, the student will have achieved an introductory level of data acumen appropriate to the undergraduate level. The LOs not only reflect cross-disciplinary goals but they also reflect societal needs of data analysis. The study took place at Loyola Marymount University (LMU), a mid-sized comprehensive university in Los Angeles, California. Faculty from eight departments across campus carried out the study.

Methods
Five steps were undertaken to better understand the differences and commonalities of statistics and data education across disciplines and subsequently develop a unifying set of learning outcomes for undergraduate statistics and data education.
As a first step, a faculty working group consisting of LMU faculty from mathematics, economics, biology, psychology, sociology, business, and education was formed. While LMU does not have a department dedicated to statistics or data science, the Department of Mathematics, Department of Biology, Department of Engineering, Department of Economics, Department of Political Science, Department of Psychology, Department of Sociology, the School of Business, and the School of Education offer courses related to statistics and data analysis.
The formation of a working group of invested change agents was no easy task. The Associate Dean for Undergraduate Studies urged faculty across departments who were invested in statistics and data analysis to join the group. In addition, members of the research team personally reached out to faculty in other departments to encourage them to join the working group. A total of 10 faculty members were selected for the working group. The working group was centered around understanding the processes and support needed to implement the themes of communication, technology, and real data in statistics courses across the disciplines. Four meetings per semester were conducted over the course of two academic years. The purpose of the working group discussions was to gather qualitative data on how different disciplines articulated the importance of statistics and data analysis and to determine what all of the disciplines had in common.
A second step in the process was to develop and administer a 36-question survey to the working group. The survey asked about software platforms, data sources, types of class assignments offered (e.g., statistics investigations in the form of projects, problem sets), and the types of activities used in the classroom (e.g., students using computers in a lab setting, group work). The survey included questions from the Statistics Teaching Inventory (STI) developed by Zieffler et al. (2012) focusing on teaching practice, assessment practice, technology use, teaching, and assessment beliefs. The goal of the survey was to gather in-depth information about the statistical habits of the working group faculty across disciplines. See Appendix A for the entire survey.
Using the discussions and internal survey results, an initial set of learning outcomes was developed. Each disciplinary representative researched and brought forth any guiding documents from their discipline related to data education. In addition, each disciplinary representative gathered syllabi and University bulletin course descriptions for all of the courses taught within their discipline. Using a blinded exercise, the working group sorted the course descriptions by similarities: all course names were removed from the descriptions and the working group worked in pairs to organize the descriptions into groups according to the topics covered within the courses. The names and departments of the courses were then revealed. This exercise was a catalyst for the working group to summarize common themes that were present across courses at the institution. These themes were outlined and noted. Syllabi were then reviewed to determine how many courses highlighted the themes and whether other themes were present that were not touched upon in the course descriptions. The working group members were asked to review the syllabi and then discuss the recurring themes present within and across disciplines. Courses were sorted into basic courses, introductory courses, application courses, and beyond courses. The discussions were guided by two of the PIs of the project (Bargagliotti and Larson). Bargagliotti and Larson guided the group in readings of papers in the literature discussing data acumen and readings of guidelines and reports from the different disciplines. They also presented enrollment data for specific courses at LMU (see Bargagliotti et al. 2020, for presentation of enrollment results) to help provide student context to the discussions. In addition, discussion questions focused on technology use and necessities were posed to the working group at each meeting session. This process led to the formulation by the group of explicit learning outcomes.
The process was completed over the course of one year through monthly meetings and email and phone conversations in between the in-person meeting times.
To validate these learning outcomes, a third step in the development process was to administer a larger-scale survey to the greater community, both academic and nonacademic, to garner thoughts on the necessary learning outcomes for data education at the university level. This survey gathered data on whether respondents agreed, were neutral, or disagreed that a learning outcome was important for achieving data acumen at the undergraduate level. More specifically, the survey asked: "For each statement below, please mark whether you agree, disagree, or are neutral that the statement describes a data analysis skill that you believe a college graduate in today's society should have." The full survey is included in Appendix C.
A fourth step carried out by the research team was to review position statements, policy documents, and curriculum guidelines put forth by professional organizations (e.g., ASA, APA, AEA) regarding data education proficiency to understand whether there was common ground between the disciplines.
Based on the information gathered in these four steps, the culminating step of the work was to develop a final set of learning outcomes to represent appropriate data acumen at the undergraduate level across disciplines.

Working Group Survey Findings and Working Group Discussion
Nine members of the working group completed the internal survey. The survey was administered at the first group meeting before any discussion took place. The purpose of the survey was to gauge satisfaction with the manner in which the teaching and learning of statistics and data related topics was approached at the University as well as to gather baseline data on the typical statistical and data analytical processes used across disciplines. These data would then serve as a starting point of conversation to develop a set of learning outcomes for data education that would bridge the disciplines. Of the nine respondents, only one reported they were happy with the course offerings and curriculum related to data, four responded they were somewhat happy, three were unable to judge, and one said they were not happy. Two main issues were identified as those keeping faculty from being satisfied or being able to change the course offerings and curriculum to their liking. Both reflected a general feeling that the institution did not support current statistical needs: first, in providing access to the technology or materials needed to teach statistics and data analysis properly, and second, in providing enough faculty lines to cover the growing needs.
Respondents were asked what they would like to change about the course offerings and curriculum related to statistics if they had all of the resources needed. The responses were:
• Our students can take basic stats and they can take more advanced Biostats, though it isn't taught very frequently (once every 2 years). More gradations might be good, as well as more frequency. Also the class is co-taught with Bio and Math faculty, which is great.
• I am most familiar with the statistics requirement in the psychology dept and less aware of the offerings elsewhere. I know that in the psych dept in the past we didn't have enough people to cover stats and often relied on visiting professors or adjunct professors. This is changing now though. I will also say that it is difficult to get access to lab classrooms with computers for all of the stats sections that we offer.
• [The] math department has some solid courses, but it would be nice to have at least one advanced data science class.
• More computational statistics required [of students in order to graduate]
• I believe in the social sciences all students should be required to take an introductory statistics course, an empirical research methods course, and a qualitative methods course.
• Students do not have the ability to further their statistical knowledge past their own department offerings.
• Overlapping topics; no interactions between various departments
Several themes were present in the responses and these themes continued to emerge throughout subsequent discussions. These themes included: a general siloed approach to data education curriculum across departments, frustration over a lack of advanced courses, and a lack of understanding of what is happening in other departments.
Although our group of nine faculty was identified as the primary professors teaching data-related courses in their disciplines, there was repeated evidence that indicated that we had difficulty thinking of statistics past our own departments and across the university as a whole. For example, in multiple instances throughout discussions, the conversations revolved around single departments and single courses. There were many statements such as "in my department, in my course, we…" While this reaction was to be expected, there was extensive effort made to keep the cross-disciplinary goal in mind as the development of learning outcomes progressed. This cross-disciplinary focus was thus identified as a main take-away for the working group as we strove to make adjustments over the course of the next year.
Subsequent working group meetings consisted of discussions and group exercises that took a closer look at the University course descriptions of all courses offered at the University related to statistics and data analysis. The working group identified a total of 29 courses offered at LMU, spanning 11 different departments (see Table 1 for the list of courses and Appendix B for brief descriptions of each course). The courses therefore have a wide reach across the University. As shown in Table 3, the College of Liberal Arts offers 12 courses related to data, the College of Science and Engineering offers 12, the College of Business Administration offers 4, and the School of Education offers 1 course. Of the 29 courses offered, 9 are lower division courses (shown in light gray) and 20 are upper division courses (shown in dark gray). Of these upper division courses, five were special reading courses offered in small settings.
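The course counts reported above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python, using only the figures stated in the text; the variable names and the `share` helper are illustrative and not part of the study.

```python
# Course counts exactly as reported in the text.
courses_by_college = {
    "College of Liberal Arts": 12,
    "College of Science and Engineering": 12,
    "College of Business Administration": 4,
    "School of Education": 1,
}
courses_by_level = {"lower division": 9, "upper division": 20}

# Both breakdowns should account for the same 29 courses.
total_by_college = sum(courses_by_college.values())
total_by_level = sum(courses_by_level.values())


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Illustrative derived figure: share of courses that are upper division.
upper_share = share(courses_by_level["upper division"], total_by_level)

print(total_by_college, total_by_level, upper_share)
```

Both breakdowns sum to the 29 courses identified by the working group, and roughly two thirds of those courses are upper division.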
Using the courses, the working group participated in an exercise where the course descriptions of all of the 29 courses were placed on 3×5 cards but without course names and titles. Each group member paired up with another member, with no pair coming from the same department. Each pair had to sort the cards by similarity of course content as well as difficulty. All pairs agreed that there appeared to be several basic and introductory statistics courses being taught across the University that had similar content. In addition, two other course types were identified: a research methods type course (where statistical methods were applied) and an advanced level course. The introductory courses could be further distinguished by those courses that covered regression and/or ANOVA versus those that did not. The classification exercise led to an attempt by the working group to create a set of interchangeable courses: interchangeable intro level statistics courses, interchangeable research methods courses, and interchangeable advanced courses. For example, if two courses within a group were deemed interchangeable, then those courses would fulfill the same requirements. The idea of creating some type of interchangeability course map was grounded in the belief that students might then be provided more ways to reach advanced content. It was also at this third meeting that the working group determined that a set of learning outcomes for a complete data pathway explicitly needed to be defined. All meetings that followed focused solely on the purpose of determining these learning outcomes.

Table 3. Initial learning outcomes put forth by the working group (outcomes 4 to 12)
4. Students should be able to understand, carry out and interpret basic inferential procedures for one or two samples.
5. Students should be able to understand, carry out and interpret statistical procedures for predicting future data (predictive inference)
6. Students should carry out and communicate results from an extensive data-driven project that is related to a real-life problem (extensive means that the project takes more than two weeks to complete and is worth at least 25% of the final grade)
7. Students should be able to communicate their analyses and the interpretations of their results in a manner that is appropriate to their discipline in the context of the data
8. Students should be able to select appropriate methods for data analysis and explain limitations of their analyses and interpretations
9. Students should be able to formulate questions about multivariate data, collect multivariate data/consider multivariate data, analyze multivariate data, and interpret results
10. Students should be able to use current statistical software, or statistical packages appropriate to the discipline and context beyond basic Excel or a calculator
11. Students should be able to write a program (using a programming language) to analyze data
12. Students should study at least one type of advanced data-analytic method such as (not limited to) generalized linear models, Bayesian analysis, advanced probability theory and stochastic processes, non-linear models, machine learning, advanced study-design, big data analysis, econometrics, or statistical computing
To guide the creation of the learning outcomes, further analyses were done on the survey discussed above in order to identify the types of statistical techniques that were frequently used in each discipline. The results were categorized into five groups: Descriptive, Visualization, Inferential, Predictive, and Application. Based on these results, the working group defined a set of 12 learning outcomes with the idea that certain learning outcomes might be developed within a category of interchangeable courses. Table 3 presents the initial 12 learning outcomes put forth by the working group.
The learning outcomes highlighted in yellow describe the Descriptive bullet, the blue describes the Visualization, the purple describes Inferential, and the green describes Predictive. Several outcomes, highlighted in orange, focused on Application. The remaining outcomes characterized data processes. Using these 12 learning outcomes as a guide, an external community survey was administered.

Community Survey
To validate the 12 learning outcomes, a community survey was administered online. The goal of this survey was to assess whether peers at other universities and in industry would also view this set of outcomes as adequately representing the skills that a university graduate should have today. The online survey was sent out by members of the working group to peers and to listserves for several disciplines (specifically sent out by the American Statistical Association and CAUSEWeb). It was also posted on several listserve forums (e.g., isostat). A total of 367 people opened the survey and 287 people completed the survey within the allotted time frame of one week. Table 4 shows the distributions of backgrounds of people who completed the survey. The survey was largely dominated by College and University Faculty, who made up 82% of the total respondents. Industry scientists, researchers or consultants made up the next largest category at approximately 6% of the respondents. A total of 14 disciplinary backgrounds were represented in the respondents as noted in Table 5. Statisticians were the largest group of respondents, with mathematicians and psychologists being the second and third largest.
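The completion rate implied by these figures is not stated in the text but follows directly from them. A minimal sketch in Python, assuming only the two reported counts (the function name is illustrative):

```python
def completion_rate(opened: int, completed: int) -> float:
    """Return the fraction of people who completed a survey after opening it."""
    if opened <= 0:
        raise ValueError("opened must be positive")
    if not 0 <= completed <= opened:
        raise ValueError("completed must be between 0 and opened")
    return completed / opened


# Counts reported for the community survey.
rate = completion_rate(opened=367, completed=287)
rate_percent = round(100 * rate, 1)

print(f"Completion rate: {rate_percent}%")
```

With 287 completions out of 367 opens, a little over three quarters of those who opened the survey finished it within the one-week window.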
Because the American Statistical Association (ASA) helped in the distribution of the survey, it was expected that statistics would have a large response rate. The working group members all sent the survey to their personal contacts; however, due to feasibility, the only large organization to actively post and distribute the survey was the ASA. Despite the imbalance in discipline representation of the survey respondents, the responses were still varied. Table 6 shows the percentage of survey respondents that agreed, were neutral, or disagreed with the statement that the learning outcome was an important skill that a university student must acquire.
Responses grouped the LOs into roughly three categories. Of the 12 learning outcomes, four seemed especially important, as 90% of respondents agreed that they are important skills that a university student should acquire. These included univariate statistics, descriptive statistics, graphs and visualizations, and communicating in context. For five other learning outcomes, a large majority of respondents stated that they agreed or were neutral. This category included inferential statistics, predictive statistics, discussion of limitations, multivariate statistics, and use of software. Only three learning outcomes drew large disagreement. This third category included having a large project, writing a program to analyze data from scratch, and studying advanced statistical methods.
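The three-way grouping described above can be sketched as a simple rule applied to each outcome's response percentages. The following Python sketch is illustrative only: the 90% agreement cutoff comes from the text, but the 20% disagreement cutoff, the category labels, and the example percentages are assumptions and are not the values reported in Table 6.

```python
def categorize(agree: float, neutral: float, disagree: float,
               strong_agree: float = 90.0, high_disagree: float = 20.0) -> str:
    """Assign a learning outcome to one of three categories.

    agree, neutral, disagree are percentages of respondents.
    strong_agree (90) follows the text; high_disagree (20) is an assumed cutoff.
    """
    if abs(agree + neutral + disagree - 100.0) > 1.0:
        raise ValueError("percentages should sum to roughly 100")
    if agree >= strong_agree:
        return "broad agreement"
    if disagree >= high_disagree:
        return "contested"
    return "agree or neutral"


# Hypothetical response percentages (agree, neutral, disagree), for illustration.
example_responses = {
    "hypothetical outcome A": (93.0, 5.0, 2.0),
    "hypothetical outcome B": (70.0, 22.0, 8.0),
    "hypothetical outcome C": (45.0, 20.0, 35.0),
}

categories = {name: categorize(*pcts) for name, pcts in example_responses.items()}

for name, category in categories.items():
    print(f"{name}: {category}")
```

Checking the agreement cutoff first means an outcome with near-universal agreement is never labeled contested, which matches the order in which the categories are described in the text.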
Based on these data, the working group agreed that a student meeting all 12 learning outcomes would be deemed to have undergraduate data acumen. Due to the disagreements on three learning outcomes, subsequent levels of data acumen were then defined (see Bargagliotti et al. 2020 for a description of the categorizations of levels of acumen).

Policy Documents
To further validate the learning outcomes, nine curriculum guidelines from various professional organizations were reviewed by the working group. These guidelines specifically discussed students' necessary data acumen skills for a given discipline. The working group identified different disciplines that had position statements or curriculum guidelines that mentioned statistics or data education explicitly. Those disciplines represented in the policy documents were: mathematics, statistics, psychology, economics, sociology, science, engineering and medicine. The policy documents reviewed were:
Table 7 illustrates that seven of the 12 learning outcomes were discussed in all of the policy documents. The remaining learning outcomes were supported by most of the documents. Interestingly, the policy documents all mentioned a learning outcome that was not included in the hypothesized 12 outcomes. That is: Students should become critical consumers of statistically based results reported in popular media, recognizing whether reported results reasonably follow from the study and analysis conducted.
Due to its inclusion in all of the policy documents from the various disciplines, the working group felt that it should be added to the 12 developed learning outcomes. This LO aligned with growing societal needs of being able to merely ingest the news and participate in the information age. Because it was included in all eight of the policy documents, the project team opted to include it as an explicit LO. A total of 13 LOs were then proposed. Table 8 presents the final 13 Undergraduate Data Pathways (UDaP) learning outcomes that were established as important for students to meet at the university level today. Several edits to the original learning outcomes were undertaken. They were:

Final Learning Outcomes
• LO6 emphasizes that the project must count for a large portion of the final grade but does not specify an arbitrary percentage
• LO11 articulates that software be used to manipulate data, extract information from data, and carry out statistical analyses with data
• LO12 is rewritten to better reflect the data tasks a student would undertake using a software program

These adjustments were made based on the open comments received in the community survey, feedback from reviewers of this article during the revision process, feedback from audiences when the paper was presented in three different settings, discussions among the PIs on how to incorporate the comments, and approval from the working group members in writing the final LOs.

Table 8. Learning Outcomes
1. Students formulate and/or address questions about univariate data, collect/consider univariate data, analyze univariate data, and interpret results
2. Students understand, calculate, and interpret descriptive measures for quantitative and/or categorical variables to describe characteristics of the data
3. Students create and interpret basic data visualizations for quantitative and categorical variables
4. Students understand, carry out, and interpret basic inferential statistical procedures for one or two samples
5. Students understand, carry out, and interpret results from estimating statistical models for bivariate data (e.g., linear regression, interpolation, extrapolation, predictive inference)
6. Students carry out and communicate results from extensive data-driven project(s) related to a real-life problem (Extensive means that a single project takes more than two weeks to complete or a series of projects take more than two weeks to complete and are worth a large percentage of the final grade.)
10. Students formulate and/or address questions about multivariate data, collect/consider multivariate data, analyze multivariate data, and interpret results
11. Students use current statistical software or statistical packages that are appropriate to the discipline and context beyond basic Excel or a calculator to manipulate data, extract information from data, and carry out statistical analyses with data
12. Students write a program (using a programming language) to manage and curate data by finding, manipulating, analyzing data or extracting information from the data
13. Students study at least one type of advanced data-analytic methods such as (but not limited to): generalized linear models, Bayesian analysis, advanced probability theory and stochastic processes, non-linear models, machine learning, advanced study-design, big data analysis, econometrics, or statistical computing

Students meeting these 13 learning outcomes are deemed to have undergraduate data acumen. The UDaP learning outcomes span both content and process. The important themes of using real data, communication with data, and technology are well-represented within the learning outcomes as well. These outcomes are meant to be broad and cross-disciplinary so they can serve as benchmarks across all disciplines offering statistics and data education courses on a university campus. These learning outcomes stemmed from two years of discussions within the working group as well as the review of the policy documents and the community survey.

Discussion and Future Research
While there has been a large increase in data science and statistics majors and minors across the US over the past several years (Pierson 2018), explicit learning outcomes to govern such programs are relatively new (see Gould et al. 2018; National Academies of Sciences et al. 2018a and 2018b). Furthermore, while there is consensus that data education reaches across disciplines, the wide reach and importance of data across disciplines make it difficult to put forth coordinated efforts for student learning. In the cross-disciplinary context across departments and disciplines, no set of coordinated learning outcomes exists as a bridge to data education. There is an important need to acknowledge that data education is not taught solely in statistics or computer science departments, or within a single data science program; instead, working with data is present in most disciplines and is often intertwined with disciplinary content. Therefore, although guidelines exist that specify recommendations on how to teach statistics courses (GAISE, ?) and guide data science specific programs (Gould et al. 2018; National Academies of Sciences et al. 2018a, 2018b), these guidelines are not designed to bridge the interdisciplinary context.
Several challenges emerge as data education is conceptualized across disciplines. Perhaps a first step in advancing this conceptualization is an agreement on some basic content and process outcomes that students should acquire. The implementation of such outcomes necessitates departmental agreements and a concerted effort to create opportunities for students to advance their data acumen despite potentially limited departmental offerings.
To develop goals for data education at the undergraduate level, the UDaP project explicitly considered the cross-disciplinary nature of coursework related to data as well as the overall learning goals for students driven by current workforce and societal needs. As society pushes toward being more data-driven, it is important to understand and characterize what education should be doing in response. Moreover, cross-disciplinary demands are increasingly emerging in society, with data being embedded in policy and discussions across all subjects. As such, how we conceptualize data acumen at the undergraduate level must be flexible enough to bridge many contexts and serve students with diverse academic backgrounds. This differs from the way the literature has conceptualized data science as a three-circle Venn diagram of computer science, statistics, and context; instead, undergraduate data acumen aims to be flexible and broad enough to span disciplines. In other words, a sociology major must have the opportunity to gain data acumen just as much as a computer science major.
The UDaP learning outcomes presented in this paper can be used by colleges and universities that plan to assess their capacity across disciplines to produce undergraduates with data acumen by matching existing course offerings with the learning outcomes presented here. This could provide insight about the accessibility, quantity, and difficulty of existing pathways to achieving data acumen and guide the resource-efficient development of new pathways using cross-disciplinary badges, concentrations, minors, or majors. The UDaP learning outcomes can form a basis for ongoing assessment of data-related concentrations, minors, or majors. Moreover, they could form a basis for assessment of the role of co-curricular learning through internships, campus jobs, etc., toward students earning badges around data acumen. Universities with specific statistics departments or data science programs can lead such efforts by ensuring that their offerings can meet the learning outcomes without many prerequisite costs to students. Efforts for general data education courses (much like writing requirements) required of all students at a university could fulfill such a need.
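As one way to picture the matching of course offerings to learning outcomes described above, the following minimal Python sketch checks which of the 13 LOs a proposed pathway of courses leaves uncovered. The course codes, the LO assignments, and the function names are hypothetical placeholders chosen for illustration; they are not data or tools from the UDaP project.

```python
# Hypothetical illustration: the course codes and LO assignments below are
# invented placeholders, not results from the UDaP study.
ALL_LOS = set(range(1, 14))  # the 13 UDaP learning outcomes, numbered 1-13


def covered_outcomes(pathway, course_to_los):
    """Return the set of LOs met by at least one course in the pathway."""
    covered = set()
    for course in pathway:
        # Courses missing from the mapping contribute no outcomes.
        covered |= course_to_los.get(course, set())
    return covered


def missing_outcomes(pathway, course_to_los):
    """Return the sorted list of LOs that no course in the pathway addresses."""
    return sorted(ALL_LOS - covered_outcomes(pathway, course_to_los))


# Assumed mapping from (made-up) courses to the LOs each one addresses.
course_to_los = {
    "STAT 101": {1, 2, 3, 4},
    "STAT 201": {4, 5, 6, 7, 8},
    "DATA 210": {3, 9, 10, 11, 12},
    "STAT 310": {10, 13},
}

pathway = ["STAT 101", "STAT 201", "DATA 210"]
print(missing_outcomes(pathway, course_to_los))  # [13]
```

Under these assumed assignments, the three-course pathway covers LOs 1 through 12 and leaves only LO13 unmet, which is the kind of gap a campus audit might use to decide where a new offering or a revised prerequisite structure is needed.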
Through a rigorous process, UDaP developed a set of 13 learning outcomes for undergraduate data acumen at the university level. The learning outcomes focused on three important themes: working with real data, communicating data-driven results, and working with technology. Of the developed learning outcomes, five focus on process and communication while eight focus specifically on content. This breakdown reflects the changing needs of statistics education today.
This paper offers the important initial step in finding common ground across disciplines. The creation of a working group of "change agents" from different disciplines on a university campus who are invested in furthering data acumen in students has been an invaluable asset to the project. Next steps for research could include the design of assessments and curriculum that could bridge disciplinary contexts as well as the development of curriculum and projects that foster collaboration among students and embody the learning outcomes (e.g., https://ww2.amstat.org/education/datafest/, https://www.causeweb.org/usproc/).
The authors hope that this study will persuade readers to consider doing something similar on their own campuses. The manuscript provides an example of how to have cross-disciplinary discussions, which can be invaluable to creating opportunities for students to achieve data acumen.

A.1. Internal Working Group Faculty Survey
Welcome to Project Undergraduate Data Pathways (UDaP). As an initial step in our research, we would like to gather some feedback about your opinions and thoughts about data analysis and statistics at LMU and beyond. We greatly appreciate you taking the time to answer the questions below. This survey should take approximately 10 minutes to complete.  104, 204, and 360 and I would answer question 20 in the following way:

A.2. Background Information
Indicate the type of data that you believe helps students learn statistics best.
a. All constructed data
b. Mostly constructed data
c. Equal amounts of constructed and real data (360)
d. Mostly real data (204)
e. All real data (104)
22. Indicate the type of data that you use to help students learn statistics.
a. All constructed data
b. Mostly constructed data
c. Equal amounts of constructed and real data
d. Mostly real data
e. All real data
23. Indicate the method of computing numerical solutions to problems that you believe helps students learn statistics best.

An examination of the processes by which public policy is formulated, implemented, and evaluated. Emphasis will be placed on policy planning and evaluation competencies.

PSYC 241/243/2001
Statistical Methods for Psychology (4 semester hours) Statistical concepts and methods related to psychological testing and research, including measures of central tendency, variability, hypothesis testing, analysis of variance, correlation, regression, non-parametric tests, and use of statistical software programs.
Prerequisite: Grade of C (2.0) or higher in PSYC 1000 (General Psychology).

PSYC 261/2002
Research Methods (4 semester hours) Introduces the basic principles of common psychology research methods and designs. Provides students with fundamental background for planning, conducting, and critiquing research in psychology. Emphasizes scientific writing, including APA style, and data interpretation using descriptive and inferential statistics.

This course is intended for the student who wishes to become more proficient at developing and managing database applications. It is designed to provide an introduction to the conceptual foundations underlying database management systems, with an emphasis on its applications in business and organizations. The course begins with an introduction to the fundamental principles of database design, from data modeling to the actual implementation of a business application. Particular emphasis will be placed on the careful planning and analysis of business needs, which will lead to the appropriate development of an Entity-Relationship Model. Using these principles, each student will design and implement a database application using Access. This part of the course will employ lectures describing database theory, as well as hands-on tutorials demonstrating database concepts using Access. The second part of the course will further investigate the relational model, which is the basis for the most popular DBMS products on the marketplace today (i.e., Oracle, SQL Server, MS Access, Sybase). Topics to be studied include relational algebra, Structured Query Language (SQL), and maintaining data integrity in a relational design. In addition, important managerial concerns will be covered including database administration and the management of multi-user databases.
Prerequisites: ACCT 3140 (Accounting Information Systems) or AIMS 2710 (Management Information Systems); BADM 1030 (Business Perspectives - Information Technology in Organizations) with a grade of C (2.0) or better

AIMS 4760
Analytics & Business Intelligence (3 semester hours) The course introduces students to the scientific process of understanding, displaying, and transforming data into insight in order to help managerial decision makers do their job effectively and make better, more informed decisions. The nature of data/information used in the decision making process and the role of information technology in that process is discussed. The course focuses on data preparation and transformation, descriptive and predictive analytics, data mining, and data visualization and dashboards. An overview of prescriptive analytics is presented as well as the role of business analytics in the context of business intelligence. Hands-on learning is an important feature of the course. For each topic, a case analysis will require the use of Excel and/or other specialized data mining and analytics software to reinforce the underlying theoretical concepts.

Students will gain knowledge in planning and conducting research as well as further advance their written communication skills. Students will critically evaluate published research. Students will use and apply various observation techniques such as narrative records, running records, time sampling, and event sampling to the understanding of child behavior and developmental processes. Students will demonstrate data analysis skills. Students will gain knowledge in the assessment of both typical and atypical development. Students will explore issues of professional ethics related to working with parents and teachers when special needs in children are identified and require intervention. Field experience will be required.

BIOL 367
Biological Databases (3 semester hours) Interdisciplinary course at the interface between biology and computer science focusing on how biological information is encoded in the genome of a cell and represented as data in a database. Biological concepts include DNA structure and function, the central dogma of molecular biology, and regulation of gene expression. Computer science concepts and skills include command line interaction, the structure and functions of a database, and the management of data ranging from individual files to a full relational database management system. Emphasis on science and engineering best practices, such as maintaining journals and notebooks, managing files and code, and critically evaluating scientific and technical information. Course culminates with team projects to create new gene databases.

This course is designed to teach students how to analyze and interpret quantitative data. It will demonstrate practical applications in addition to basic theory. The emphasis will be how and when to use (or not use) each method. We will apply these methods to actual data from biological, ecological, and public health applications. This course will also include the use of computer programs (SPSS, R) to apply tests to datasets. By the end of the course the student should have a good understanding of basic parametric and nonparametric statistical methods, their assumptions and applications, and how and when to apply them to different types of data.

This course provides an introduction to statistics emphasizing data analysis and applications to life sciences. Topics include: descriptive statistics, elementary probability, various discrete and continuous distributions, confidence intervals and hypothesis tests for means and proportions, correlation and linear regression, as well as analysis of variance. This course will also include the use of computer programs to analyze datasets.
This course is designed to teach students how to analyze and interpret quantitative data. It will demonstrate practical applications in addition to basic theory. The emphasis will be how and when to use (or not use) each method. We will apply these methods to actual data from biological, ecological, and public health applications. This course will also include the use of computer programs (SPSS, R) to apply tests to datasets. By the end of the course the student should have a good understanding of basic parametric and nonparametric statistical methods, their assumptions and applications, and how and when to apply them to different types of data.
Prerequisite: One year calculus and one year biology or consent of instructor.

MATH 560
Adv Topics in Probability & Stats (3 semester hours) Material to be covered will be determined by the instructor. Consult with the instructor for the specific topics in probability and statistics that will be covered in any given semester.

Basic concepts of probability and statistics that are fundamental to Design of Experiments (DOE). The key topics will include sampling, hypothesis testing [t-statistic, F-statistic, analysis of variance (ANOVA), p-value], experimental design matrices, full-factorial and fractional factorial designs, normal probability plots, factor level interactions, regression modeling. Case studies and a design project will be used to illustrate the methodology.
Prerequisite: Undergraduate Calculus I and II.
Some courses listed in the table have multiple course numbers due to cross-listings or changing course numbers during the study period.
Appendix C Community Survey. The following questions were asked in the external community survey.
For each statement below, please mark whether you agree, disagree, or are neutral that the statement describes a data analysis skill that you believe a college graduate in today's society should have.
6. Students should carry out and communicate results from an extensive data-driven project that is related to a real-life problem (extensive means that the project takes more than two weeks to complete and is worth at least 25% of the final grade)
7. Students should be able to communicate their analyses and the interpretations of their results in a manner that is appropriate to their discipline in the context of the data
8. Students should be able to select appropriate methods for data analysis and explain limitations of their analyses and interpretations
9. Students should be able to formulate questions about multivariate data, collect multivariate data/consider multivariate data, analyze multivariate data, and interpret results
10. Students should be able to use current statistical software, or statistical packages appropriate to the discipline and context beyond basic Excel or a calculator
11. Students should be able to write a program (using a programming language) to analyze data
12. Students should study at least one type of advanced data-analytic methods such as (not limited to) generalized linear models, Bayesian analysis, advanced probability theory and stochastic processes, non-linear models, machine learning, advanced study-design, big data analysis, econometrics, or statistical computing

What is your primary professional role? If other, please specify.