This concentration will equip students to learn about the world through data analytics.

The core courses related to this concentration are: CS 4740: Natural Language Processing, CS 3780: Machine Learning for Intelligent Systems, CS 4786: Machine Learning for Data Science, STSCI 3740: Data Mining and Machine Learning, and ORIE 4740: Statistical Data Mining I.

Some faculty members whose research is related to this concentration include: Solon Barocas, Cristobal Cheyre, Paul Ginsparg, Thorsten Joachims, Jon Kleinberg, René Kizilcec, Lillian Lee, David Mimno, Helen Nissenbaum, Jeff Rzeszotarski, Matthew Wilkens, Aditya Vashistha, and David Williamson.

Career Paths

  • Professional positions requiring data analytics and statistical analysis combined with communication in commercial, academic and public service settings. 
  • Ex: Data Journalist, who finds questions in the real world, collects and analyzes data to address those questions, and then explains the implications of that analysis in accessible terms.
  • Related Job Titles: data scientist, statistician, data engineer, machine learning engineer


Please refer to the Cornell Class Roster for details on the courses below.

  1. For your primary concentration, choose one course from each of categories A, B, C, and D. Exception for the Fall 2021, 2022, and 2023 semesters only: you may take one course each from A, B, and C, plus an additional course from any of the four categories in place of category D. In all other semesters, take one course from each category (A, B, C, and D). For example, if you are trying to fulfill category D in Fall 2024, courses from categories A, B, and C will not count toward it.
  2. If you count Data Science as your secondary concentration, choose one course from each of categories B, C, and D.

A. Data Analysis (choose one)

  • INFO 3300: Visual Data Analytics for the Web (previously Data-Driven Web Applications)
  • INFO 3900: Causal Inference
  • INFO 3950: Data Analytics for Information Science
  • CS 3780 (previously 4780): Machine Learning for Intelligent Systems
  • CS 4786: Machine Learning for Data Science
  • ORIE 4740: Statistical Data Mining I
  • ORIE 3741 (previously 4741): Learning with Big Messy Data
  • STSCI 3740 (previously 4740): Data Mining and Machine Learning

B. Domain Expertise (choose one) 

  • INFO 2770: Excursions in Computational Sustainability
  • INFO 3350: Text Mining for History and Literature
  • INFO 3370: Studying Social Inequality Using Data Science
  • INFO 4100: Learning Analytics
  • INFO 4120: Ubiquitous Computing
  • INFO 4300: Language and Information
  • INFO 4350: Conversations and Information
  • INFO 4940: Special Topics - How LLMs work, Their Potential, and Limitations
  • INFO 4940: Special Topics - Advanced NLP for Humanities Research
  • CS 4740: Natural Language Processing

C. Big Data Ethics, Policy and Society (choose one) 

  • INFO 3200: Technology, Behavior, and Society (previously New Media and Society)
  • INFO 3561: Computing Cultures
  • INFO 4145: Privacy and Security in the Data Economy
  • INFO 4200: Information Policy: Applied Research and Analysis
  • INFO 4240: Designing Technology for Social Impact
  • INFO 4250: Surveillance and Privacy
  • INFO 4270: Ethics and Policy in Data Science
  • INFO 4390: Practical Principles for Designing Fair Algorithms
  • INFO 4561: Stars, Scores, and Rankings: Evaluation and Society
  • INFO 4940: Special Topics - U.S. Copyright Law
  • INFO 4940: Special Topics - Technology and Social Change Practicum
  • INFO 4940: Special Topics - Building Inclusive Computing Organizations
  • INFO 4940: Special Topics - Computing on Earth: Extraction, Consumption, and the Material Ethics of Computing
  • INFO 4940: Special Topics - Law, Policy, and Politics of Cybersecurity
  • INFO 4940: Special Topics - Law, Policy, and Politics of Artificial Intelligence (AI)
  • COMM 4242: The Design & Governance of Field Experiments
  • ENGL 3778: Free Speech, Censorship, and the Age of Global Media
  • PUBPOL 3460: Culture, Law, and Politics of Information Policy
  • STS 3440: The Data Science & Society Lab

D. Data Communication (choose one) 

  • INFO 3312: Data Communication
  • INFO 4310: Interactive Information Visualization
  • COMM 3150: Organizational Communication: Theory and Practice
  • COMM 3189: Taking America's Pulse: Creating and Conducting a National Opinion Poll
  • COMM 4200: Public Opinion and Social Processes
  • COMM 4860: Risk Communication
  • COMM 4940: Data and Technology for Organizing
  • GOVT 2169: Survey Data in the Information Age
  • SOC 3580: Big Data on the Social World