Undergraduate Major in
Information and Data Sciences
Undergraduate Option Rep
Prof. Konstantin Zuev
kostia@caltech.edu
Undergraduate Option Manager
Carmen Nemer-Sirois
carmens@cms.caltech.edu
The information and data sciences are concerned with the acquisition, storage, communication, processing, and analysis of data. These intellectual activities have a long history, and Caltech has traditionally occupied a position of strength with faculty spread out across applied mathematics, electrical engineering, computer science, mathematics, physics, astronomy, economics, and many other disciplines. In the last decade, there has been a rapid increase in the rate at which data are acquired with the objective of extracting actionable knowledge in the form of scientific models and predictions, business decisions, and public policies. From a technological perspective, this rapid increase in the availability of data creates numerous challenges in acquisition, storage, and subsequent analysis. More fundamentally, humans cannot deal with such a volume of data directly, and it is increasingly essential that we automate the pipeline of information processing and analysis. All areas of human endeavor are affected: science, medicine, engineering, manufacturing, logistics, the media, and entertainment. The range of scenarios that concern a scientist in this domain is very broad: from situations in which the available data are nearly infinite (big data) to those in which the data are sparse and precious; from situations in which computation is, for all practical purposes, an infinite resource to those in which it is critical to respond rapidly and computation must thus be treated as a precious resource; from situations in which the data are all available at once to those in which they are presented as a stream.
As such, the information and data sciences now draw not just upon traditional areas spanning computer science, applied mathematics, and electrical engineering (signal processing, information and communication theory, control and decision theory, probability and statistics, algorithms) but also upon a range of contemporary topics such as machine learning, network science, distributed systems, and neuroscience. The result is an area that is new, fundamentally different from related areas like computer science and statistics, and crucial to modern applications in the physical sciences, social sciences, and engineering.
The Information and Data Science (IDS) option is unabashedly mathematical, focusing on the foundations of the information and data sciences across their roots in probability, statistics, linear algebra, and signal processing. These fields all contribute crucial components of data science today. Further, the option takes advantage of the interdisciplinary nature of Caltech by including a required set of application courses in which students learn how data touches science and engineering broadly. The flexibility provided by this sequence allows students to see data science in action in biology, economics, chemistry, and beyond.
In addition to a major, the IDS option offers a minor that focuses on the mathematical foundations of the information and data sciences, but recognizes the fact that many students in other majors across campus have a need to supplement their options with practical training in data science.
Option Requirements
 Computer Science Fundamentals. CS 1; CS 2; and CS 38.
 Mathematical Fundamentals. Ma 2; Ma 3; Ma 108a; and Ma/CS 6ab or Ma 121ab. The analytical tracks of Ma 1bc are required.
 Scientific Fundamentals. 18 units selected from the following courses: Bi 8, Bi 9, Ch 21abc, Ch 24, Ch 25, Ch 41abc, Ph 2abc, or Ph 12abc. Advanced 100+ courses in Bi, Ch, or Ph with a strong scientific component can be used to satisfy this requirement with approval from the option administrator, but cannot simultaneously be used to satisfy the “Applications of Data Science” requirement or the “Advanced Electives” requirement.
 Communication Fundamentals. SEC 10; one of SEC 11-13.

Information and Data Science Core Requirements.
 Linear Algebra: ACM/IDS 104; ACM/EE 106a.
 Probability: ACM/EE/IDS 116.
 Statistics: IDS/ACM/CS 157.
 Machine Learning: CMS/CS/CNS/EE/IDS 155 or CS/CNS/EE 156a.
 Signal Processing: EE 111.
 Information Theory: EE/CS/IDS 160.
 Applications Electives. At least 18 units from the following list: Ay 119, BE/Bi 103, Bi/BE/CS 183, BEM/Ec 150, CNS/Bi/EE/CS/NB 186, CS/EE/ME 134, EE/CNS/CS 148, Ec/SS 124, ESE 136, FS/Ay 3, FS/Ph 4, Ge/Ay 117, Ge 165, HPS/Pl/CS 110, SS 228. Other courses that include applications of data science may be substituted with approval from the option coordinator. Courses used to fulfill this requirement may not also be used to fulfill any requirement above.
 Advanced Electives. At least 54 units from the following list: IDS courses numbered 100 or above, CS/CNS/EE 156ab, ACM 106b, ACM 95/100ab. Courses used to fulfill this requirement may not also be used to fulfill any requirement above.
Courses used to fulfill the “Applications of Data Science” and “Advanced Electives” requirements cannot be used to fulfill the Institute humanities and social sciences requirements.
Units used to fulfill the Institute Core requirements do not count toward any of the option requirements. Pass/fail grading cannot be elected for courses taken to satisfy option requirements. Passing grades must be earned in a total of 486 units, including all courses used to satisfy the above requirements.
Double Majoring
Students interested in simultaneously pursuing a degree in a second option must fulfill all the requirements of the Information and Data Sciences option. Courses may be used to simultaneously fulfill requirements in both options. However, at least 54 units of “Advanced Electives” and 18 units of “Applications of Data Science” must come from courses that are not also used to fulfill a requirement of the second option. Any proposal to replace these courses must be discussed with the option administrator. To enroll in the program, students should meet with the option representative to discuss their plans. In general, approval is contingent on good academic performance and a demonstrated ability to handle the heavier course load.
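The double-counting rule above amounts to a simple unit tally. The following is a minimal sketch in Python, not an official degree-audit tool: the function names, the course labels, and the 9-unit value assigned to each elective are hypothetical placeholders chosen only for illustration. Only the 54-unit and 18-unit thresholds come from the text.

```python
# Thresholds stated in the text: units that must NOT be shared with
# the second option.
REQUIRED_UNSHARED_UNITS = {"advanced": 54, "applications": 18}


def unshared_units(courses, second_option_courses):
    """Sum the units of courses that are not also used for the second option.

    `courses` maps a course label to its unit count (hypothetical data).
    """
    return sum(
        units
        for name, units in courses.items()
        if name not in second_option_courses
    )


def meets_double_major_rule(advanced, applications, second_option_courses):
    """Return True if both unshared-unit thresholds are met."""
    totals = {
        "advanced": unshared_units(advanced, second_option_courses),
        "applications": unshared_units(applications, second_option_courses),
    }
    return all(
        totals[key] >= REQUIRED_UNSHARED_UNITS[key] for key in totals
    )


# Hypothetical plan: seven advanced electives and two applications
# electives, each assumed to be worth 9 units.
advanced = {f"Advanced elective {i}": 9 for i in range(1, 8)}
applications = {"Applications elective 1": 9, "Applications elective 2": 9}
shared = {"Advanced elective 1"}  # also counted toward the second option

print(meets_double_major_rule(advanced, applications, shared))  # True
```

In this hypothetical plan, sharing one advanced elective still leaves 54 unshared units, so the rule is met. Sharing a second one would drop the total to 45 units and the check would fail.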
Advising
Starting in the sophomore year, IDS students will be assigned a faculty advisor with whom they should meet regularly, typically once per quarter. Students in the program are advised by faculty interested in the information and data sciences from across the Institute. This includes all the CMS faculty, as well as the following faculty who pursue data science-related research and participate in IDS advising: Justin Bois, Fernando Brandao, Shuki Bruck, George Djorgovski, Laura Doval, Frederick Eberhardt, Federico Echenique, Babak Hassibi, Jonathan Katz, Victoria Kostina, Heather Knutson, Tom Miller, Pietro Perona, Antonio Rangel, Mark Simons, Omer Tamuz, Andrew Thompson, Matt Thomson, Victor Tsai, David Van Valen, Zhongwen Zhan. Students seeking an IDS advisor should contact the undergraduate option secretary at academics@cms.caltech.edu.
Study Abroad Requirements
Students interested in studying abroad must fulfill the 'Information and Data Science Core Requirements' via enrollment in courses at Caltech, i.e., these courses cannot be substituted by courses taken abroad. Substitutions of equivalent courses are allowed for other requirements with the approval of the Option Administrator.
Typical Course Schedule
Units per term

Second Year                                                                      1st  2nd  3rd
CS 1                     Intro. to Computer Programming                            9    -    -
CS 2                     Intro. to Programming Methods                             -    9    -
CS 38                    Algorithms                                                -    -    9
Ma 2                     Differential Equations                                    9    -    -
Ma 3                     Intro. to Probability and Statistics                      -    9    -
Ma/CS 6 ab               Intro. to Discrete Methods                                9    9    -
ACM/IDS 104              Applied Linear Algebra                                    9    -    -
HSS Electives                                                                      9    9    9
Scientific Fundamentals                                                            -    9    9
Other Electives                                                                    -    -    9
Total                                                                             45   45   36

Third Year                                                                       1st  2nd  3rd
SEC 10                   Technical Seminar Presentations                           -    3    -
CMS/CS/CNS/EE/IDS 155    Machine Learning & Data Mining                            -   12    -
One of SEC 11-13         Written Technical Communication in Engrng and Appl Sci    -    -    3
Ma 108 a                 Classical Analysis                                        9    -    -
EE 111                   Signal-Processing Systems and Transforms                  9    -    -
IDS/ACM/CS 157           Statistical Inference                                     -    -    9
ACM/EE/IDS 116           Intro. to Probability Models                              9    -    -
HSS Electives                                                                      9    9    9
Advanced Electives                                                                 9    9    9
Applications Electives                                                             -    9    -
Other Electives                                                                    -    -    9
Total                                                                             45   42   39

Fourth Year                                                                      1st  2nd  3rd
ACM/EE 106 a             Intro. Methods of Computational Math.                    12    -    -
EE/CS/IDS 160            Fundamentals of Information Transmission and Storage      -    9    -
Advanced Electives                                                                 9    9    9
Applications Electives                                                             9    9    -
HSS Electives                                                                      9    9    9
Other Electives                                                                    9    9   18
Total                                                                             48   45   36
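The per-term totals in the schedule above can be reproduced by summing each column. The following is a minimal sketch in Python that transcribes the unit entries of the typical schedule and adds them up. The dictionary and function names are illustrative and not part of the catalog; a 0 means the course is not taken that term, and the statistics course is labeled IDS/ACM/CS 157 as in the core requirements list.

```python
# Units per term (1st, 2nd, 3rd), transcribed from the typical schedule.
SECOND_YEAR = {
    "CS 1": (9, 0, 0),
    "CS 2": (0, 9, 0),
    "CS 38": (0, 0, 9),
    "Ma 2": (9, 0, 0),
    "Ma 3": (0, 9, 0),
    "Ma/CS 6 ab": (9, 9, 0),
    "ACM/IDS 104": (9, 0, 0),
    "HSS Electives": (9, 9, 9),
    "Scientific Fundamentals": (0, 9, 9),
    "Other Electives": (0, 0, 9),
}

THIRD_YEAR = {
    "SEC 10": (0, 3, 0),
    "CMS/CS/CNS/EE/IDS 155": (0, 12, 0),
    "One of SEC 11-13": (0, 0, 3),
    "Ma 108 a": (9, 0, 0),
    "EE 111": (9, 0, 0),
    "IDS/ACM/CS 157": (0, 0, 9),
    "ACM/EE/IDS 116": (9, 0, 0),
    "HSS Electives": (9, 9, 9),
    "Advanced Electives": (9, 9, 9),
    "Applications Electives": (0, 9, 0),
    "Other Electives": (0, 0, 9),
}

FOURTH_YEAR = {
    "ACM/EE 106 a": (12, 0, 0),
    "EE/CS/IDS 160": (0, 9, 0),
    "Advanced Electives": (9, 9, 9),
    "Applications Electives": (9, 9, 0),
    "HSS Electives": (9, 9, 9),
    "Other Electives": (9, 9, 18),
}


def term_totals(rows):
    """Return the total units for the 1st, 2nd, and 3rd terms."""
    return tuple(sum(column) for column in zip(*rows.values()))


for label, rows in [
    ("Second Year", SECOND_YEAR),
    ("Third Year", THIRD_YEAR),
    ("Fourth Year", FOURTH_YEAR),
]:
    print(label, term_totals(rows))
# Second Year (45, 45, 36)
# Third Year (45, 42, 39)
# Fourth Year (48, 45, 36)
```

The same dictionaries can be used to confirm that the schedule places 54 units of Advanced Electives across the third and fourth years, which matches the option requirement.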