CS 4641/7641 A|B: Machine Learning (Fall 2022)

Course Information

  • 4641A: Tuesdays and Thursdays, 9:30am - 10:45am EST | Location: College of Business 100
  • 4641B: Mondays and Wednesdays, 3:30pm - 4:45pm EST | Location: Clough Commons 144
  • 7641A: Tuesdays and Thursdays, 2:00pm - 3:15pm EST | Location: College of Business 100
  • Edstem: https://edstem.org/us/courses/25005
Instructor:
Mahdi Roozbahani
(mahdir@gatech.edu)
Head TA:
Ruijia Wang
(rwang@gatech.edu)
Head TA:
Rusty Utomo
(rutomo6@gatech.edu)
Head UTA:
Kevin Li
(kli361@gatech.edu)
Head UTA:
Gururaj Deshpande
(gurudesh@gatech.edu)
TA:
Sidharta Vadaparty (Sid)
(svadaparty3@gatech.edu)
TA:
Rohit Das
(rohdas@gatech.edu)
TA:
Mengyang Liu
(mengyang.liu@gatech.edu)
TA:
Sidhesh Desai
(sidhesh@gatech.edu)
TA:
Meghana Deepak
(mdeepak3@gatech.edu)
TA:
Bryan Zhao
(bzhao90@gatech.edu)
TA:
Dimitri Adhikary
(dadhikary3@gatech.edu)
TA:
Daanish M Mohammed
(dmohammed7@gatech.edu)
TA:
Patrick Crawford
(pcrawford9@gatech.edu)
TA:
Lintong Han
(lhan63@gatech.edu)
TA:
Arvin Poddar
(apoddar32@gatech.edu)
TA:
Yao Duo
(douy@gatech.edu)
TA:
Adarsh Honawad
(avishwanath30@gatech.edu)
TA:
Kasturi Gottivedu Shriniwas
(kshriniwas3@gatech.edu)
TA:
Ruohao Guo
(rguo48@gatech.edu)
TA:
Taylor Del Matto
(tmatto3@gatech.edu)
TA:
Venetis-Paraskevas Pallikaras
(vpallikaras3@gatech.edu)
TA:
Ethan Mendes
(emendes3@gatech.edu)
TA:
Jason Wu
(jwu341@gatech.edu)
TA:
Anirudh Prabakaran
(aprabakaran3@gatech.edu)
TA:
Faizan Hasan
(fhasan8@gatech.edu)
TA:
Ganesh Murugappan
(gmurugappan3@gatech.edu)
TA:
Krish Nathan
(knathan8@gatech.edu)
TA:
Charles Snider
(csnider32@gatech.edu)
TA:
Kabir Doshi
(kdoshi36@gatech.edu)
TA:
Arjun Agarwal
(aagarwal434@gatech.edu)
TA:
Mehul Kalia
(mkalia6@gatech.edu)
TA:
Ziyuan Cao
(zcao300@gatech.edu)
TA:
Srishti Jain
(sjain443@gatech.edu)

Course overview

  1. Basic math for data science and machine learning

    • Linear algebra
    • Probability and statistics
    • Information theory
    • Optimization
  2. Unsupervised machine learning for data exploration

    • Clustering analysis
    • Dimensionality reduction
    • Kernel density estimation
  3. Supervised learning for predictive data analysis

    • Tree-based models
    • Support vector machines
    • Linear classification and regression
    • Neural networks

Prerequisites for this course include (1) basic knowledge of probability, statistics, and linear algebra; (2) basic programming experience in Python.

In addition to the technical content, this class includes the following learning objectives:

  • Structuring a task into a machine learning workflow
  • Collaborating effectively on team projects in a remote environment
  • Conducting peer evaluation in a constructive format
  • Communicating technical content in a concise and effective manner

CSE students should note that CS 7641 is not allowed as a substitute for the CSE core course CSE 6740, and that they cannot get credit for both CSE 6740 and CS 7641.

Schedule

Wk Dates Topics Homework (HW) Quizzes Project Readings
1 Aug 22-26 *Course Overview ( L1)
*Data analysis toolbox - P1( L2)
*Broadcasting toolbox (L2)
   Q0 - L1&2

  GT Honor Code
 
Heilmeier catechism;
Visual Information Theory by Chris Olah;
GitHub Pages;
YAML Configuration;
NumPy Tutorial;
Matplotlib Tutorial;
seaborn: statistical data visualization;
Overleaf for GT students;
2 Aug 29-Sept 2 *Linear Algebra (*Notes, L3)
*Prob and Stats (*Notes, L4)


    Q1 - L3&4
  Correlation vs Covariance
Linear Algebra Review by Zico Kolter
More about Cross Entropy and KLD
Probability Theory Review by Andrew Moore
3   5-9 Labor Day ( Sept 5)
*Info Theory (*Notes, L5)
 A1 out
Sept 5
  Q2 - L5
  The Differences Between Data, Information and Knowledge
Cross Entropy as loss function
4   12-16 *Optimization (L6)
*Toolbox - P2 ( L7 )
   Q3 - L6&7
Project team composition due 
Sept 16 AOE
KKT for inequality constrained optimization;
Why Cross Entropy over MSE for Classification;
Gradient Descent short video
Matplotlib Tutorial
NumPy Tutorial
5 19-23 *Clustering & K-Means (*Notes, L8)
*GMM - Part 1 (*Notes, L9)
 A1 due
Sept 23 AOE
A2 out
Sept 23
 Q4 - L8 
  Curse of dimensionality (Euclidean space example);
Jupyter Notebook (Kmeans and DBSCAN);
6 26-30 *GMM - Part 2 (*Notes, L10)
*Hierarchical Clustering (*Notes, L11)
     Q5 - L9&10&11 
  Understanding the concept of Hierarchical clustering Technique
Dendrogram Visualization
GitHub Student Application
7 Oct  3-7 *DB SCAN (*Notes, L12)
*Clustering Eval (*Notes, L13)
   Q6 - L12&13 
 

Project proposal due
Oct 7 AOE
Peer Evaluation
Oct 7 AOE

Jupyter Notebook (Kmeans and DBSCAN)
8   10-14 *Density Estimation (*Notes, L14)
*Dimension Reduction (*Notes, L15)

 

 Q7 - L14&15 
  KDE interactive visualization
KDE sampling
KDE SKLearn and sampling
Jupyter Notebook Kernel Density Example
Image reconstruction using PCA
Feature extraction using PCA
PCA for images
PCA as linear combination of features
PCA and Linear Discriminant Analysis
9 17-21 Fall Break (Oct 17 and 18)
*Project Practical Advice (*Notes, L16)
A2 Due
Oct 21 AOE A3 out
Oct 21 
     
10 24-28 *Linear Regression (*Notes, L17)
*LR contd (*Notes, L18)
   Q8 - L17&18
  Simple Linear Regression in Matrix Format;
Adding Noise to Regression Predictors;
11 Oct 31-Nov 4 *Regularization (*Notes, L19)
*NB &  Logistic Reg (*Notes, L20)
   Q9 - L19 
   
12   7-11 *NB & Logistic Reg (*Notes, L21)
*Neural Networks (*Notes, L22)
A3 Due
Nov 11 AOE A4 out
Nov 11  
 Q10 - L20&21 
 Project midpoint report
Nov 11 AOE
Peer Evaluation
Nov 11 AOE
NN Playground ;
Interactive NN initialization ;
The role of a hidden layer;
Back propagation numerical example;
More detailed introduction;
13   14-18 *NN contd (*Notes, L23)
*CNN (*Notes, L24)

 Q11 - L22&23&24 
  CNN Live Demo;
A guide to an efficient way to build CNN and optimize its hyper-parameters;
Back Propagation in CNN;
Transfer learning in CNN;
Project Scoring Guidance;
14 21-25 *DT and *RF (Notes *DT & *RF, L25)
Thanksgiving break (Nov 23 to 25)
       
15 Nov 28-Dec 2 *SVM - Part1  (Notes, L26)
*SVM - Part2 (Notes, L27)
 A4 Due
Dec 2
 Q12 - L25
KKT and SVM
16   5-9 *SVM-Kernel (Notes, L28)
*Ethics in ML (L29)
Final Class days (Dec 5 and 6) 
   Q13 - L26&27&28  
Dec 6
AOE
Final Project Due and Peer Evaluation
Dec 6 AOE
 
 

Course policies

  • Attendance: Our class will be offered on campus for both Undergrad (4641) and Grad (7641). Lectures may be recorded IF the classroom has a recording system. Any class that I am able to record (recording sometimes fails even when the system is in place) will be made available to all students (both undergrad and grad) by the end of the day. Attendance is required for both undergrad and grad students. Having students in the class helps me and my students A LOT to work with each other and creates a better environment for learning. Trust me, it will be fun, and you will give me a lot of energy to teach better. Listening to the lectures without fast-forwarding me can help you learn the material much better, and you will have the chance to ask questions if you are confused anywhere in the lectures. Also, class attendance will count toward your class participation score at the end of the semester.
  • Class deliverables: All class deliverables will be handled via Gradescope, except quizzes, which will be on Canvas. The time allotted to complete the course deliverables is plentiful, and deadlines will not be extended under any circumstances. To ensure the class is fair for all students, you will receive zero credit for work submitted after the deadline. Regrade requests should be submitted directly on Gradescope within a defined period after grade publication (we will inform you of that period; we only provide a 3-day window for regrade requests). Should you find yourself at an impasse with the TA responsible for your grading, feel free to contact the head TA or course instructor on Edstem.
  • Ed Discussion:
    • Ed Discussion will be the main and only place for course discussions and announcements. If you have questions, please ask them on Ed Discussion first because 1) other students may have the same question; 2) you will get help much faster.
    • For public homework-specific questions, PLEASE use the appropriate TA-created megathreads instead of creating a new individual thread.
    • If it is something you would rather not discuss publicly on Ed Discussion, you can send private messages on Ed Discussion.
    • If course staff need to communicate with specific students (e.g., members of a project team), the Ed Chat feature of Ed Discussion will be used. Students can also use this feature to communicate with other students, e.g., to discuss forming a project team.

      IMPORTANT: Everyone must ensure that the notification setting is on for both Ed Discussion and its Ed Chat feature to stay up to date with the class requirements and prevent losing points because of missing updates and announcements on Ed Discussion.

    • Ed Discussion GOOD questions
      • I don't understand this part of the lecture, can you explain it to me?
      • This certain part of the hw is not clear to me, would it be possible to explain that more?
      • I have a question about the project ...
      • I found an issue on the website, hw or the lectures, can you clarify ...
      • Any feedback, suggestions, ... would be greatly appreciated.
      • Historically, most of the questions were good.
    • Ed Discussion BAD questions
      • Can you debug my code? [our team will not do that. You need to be specific about your question]
      • Can you find where the problem is in my code?
  • Exceptional circumstances: Any request for exceptions to these policies should be made in advance whenever possible. Requests should be due to incapacitating illness, personal emergencies, or similarly serious events. You MUST obtain a supporting letter issued by the Dean of Students before contacting us, and include it with your request.

Diversity and inclusion

Just as machine learning algorithms cannot accomplish complex tasks if trained on datasets of limited variability, our course cannot be successful without appreciating the diversity of our students. In this class we aim to create an environment where all voices are valued, respecting the diversity of gender, sexuality, age, socioeconomic status, ability, ethnicity, race, and culture. We always welcome suggestions that can help us achieve this goal. Additionally, if any of our scheduled class activities conflict with religious events, please inform the instruction team so that we can make appropriate arrangements for you.

Students with disabilities: your access to this course is extremely important to us. The institute has policies regarding disability accommodation, which are administered through the Office of Disability Services: http://disabilityservices.gatech.edu. Please request your accommodation letter as early in the semester as possible, so that we have adequate time to arrange your approved academic accommodations.

Office hours and questions

Office hours will start in the second week of classes in a hybrid mode, both online and in person. Please follow the instructions on this Excel Sheet [Undergrad] [Grad] to sign up for a 10-minute slot with one of the TAs. If you require more than ten minutes, please advise the TAs. They’ll return to your Zoom meeting once they have completed their appointments with other students. You just need to add your name, question of interest and your Zoom meeting link. Please do not change any other part of the Excel Sheet. The TA meetings are designed to be one-on-one. Please do not join another student’s Zoom meeting. The sole exception to this policy is discussions about the project, which your fellow team members can also join. In-person office hours are only available by appointment and will likely be held outdoors, in line with Georgia Tech and CDC guidelines for preventing the spread of the coronavirus.

Office Hour Rules and Guidelines

There are two types of slots in the Office Hour Spreadsheet: reserved slots and waitlist slots

  • Reserved Slots: Students are allowed to hold ONE pending reserved time slot at any time.
    • Let's say it's Tuesday night. Student A signs up for a Wednesday OH slot from 10:00--10:10 AM. Now, student A may NOT put their name on any other reserved time slot. Once the 10:00--10:10 AM OH session has finished, then student A may sign up for another available reserved OH time slot.
    • There is no limit to how many OH sessions a student can attend, but we require you to hold only one active/pending reserved time slot at a time.
  • Waitlist slots: You are allowed to sign up for multiple waitlist slots per day, but you can only sign up for one slot per TA session.
    • At the start of a TA's OH, if there are regular OH slots available and students on the waitlist, the TA may move a waitlisted student up to one of the open reserved OH slots.
    • If there are no reserved slots available, a TA may assign an estimated time slot or take the student based on their availability. It is possible that a TA cannot get to any or all of the students on a waitlist.
We have these rules in place so that students can get additional OH help if needed and also to allow availability to a larger subset of students once OH gets busier closer to HW deadlines.

Grading

  • Assignments (50%)

    • There will be four assignments. Each one is designed to improve and test your understanding of the materials. Assignments will have both programming and written analysis components.
    • You will need to submit all your assignments using Gradescope. Instructions on how to submit your code and written portions will follow with every assignment. Handwritten solutions WILL NOT BE ACCEPTED and you will not receive credit for a handwritten submission.
    • You are required to use Markdown, LaTeX (watch the tutorial created by our own team [Undergrad Access] [Grad Access] and the Overleaf LaTeX example in the video), or word processing software to generate your solutions to the written questions, because handwritten solutions WILL NOT BE ACCEPTED.
    • All assignments follow the “no-late” policy. Assignments received after the due date and time will receive zero credit.
    • [*IMPORTANT] All students are expected to follow the Georgia Tech Academic Honor Code. Because of the large size of our class, if any similarity or plagiarism (even small) is detected by GradeScope or our TAs, WE WILL DIRECTLY REPORT ALL CASES TO OSI, which may unfortunately lead to a very harsh outcome.
    • You can easily export your Jupyter Notebook to a Python file and import that to your desired python IDE to debug your code for assignments.
    • You are NOT allowed to share or discuss ANY assignment code, information or answers with other students. Edstem is the best place to have discussions regarding assignments and course topics. Discussions with other students can be at a whiteboard level, such as high-level conceptual questions (e.g., what independence means in the Naive Bayes model).
    • We have 4 big assignments in total. We do not call them projects because our class has a project as well. Consider each assignment as one individual big project. Assignments take time to finish. YOU NEED TO START WORKING ON ASSIGNMENTS AS SOON AS THEY ARE OUT. Visit this course's Canvas and GradeScope for the assignment documents. See the schedule table above for deliverable due dates. (Topics are subject to change)
      • [12.5%] HW1: Linear Algebra, Probability and Statistics, Maximum Likelihood Estimation, Optimization, Information Theory
      • [12.5%] HW2: KMeans, Expectation Maximization, Gaussian Mixture Model, Clustering Evaluation
      • [12.5%] HW3: Singular Value Decomposition, Principal Component Analysis, Linear Regression, Regularization, Naive Bayes
      • [12.5%] HW4: Decision Trees, Random Forest, Support Vector Machine, Neural Networks, CNN

  • Project (30%)

    • Proposal (5%)

      • A project proposal should be written on your GitHub page. It is also a good starter to come up with the first draft of your project.
      • You need to provide us the link to your GitHub page. Make sure your GitHub repository is private.
      • It should be less than 500 words, single spaced. References are not part of the word count.
      • A project proposal should include:
        • Introduction/Background: A quick introduction of your topic and mostly literature review of what has been done in this area. You can briefly explain your dataset and its features here too.
        • Problem definition: Why is there a problem here, or what is the motivation of the project?
        • Methods: What algorithms or methods are you going to use to solve the problem? (Note: Methods may change when you start implementing them, which is fine.) Students are encouraged to use existing packages and libraries (e.g., scikit-learn) instead of coding the algorithms from scratch.
        • Potential results and Discussion (The results may change while you are working on the project and it is fine; that's why it is called research). A good way to talk about potential results is to discuss what type of quantitative metrics your team plans to use for the project (e.g., ML Metrics).
        • At least three references (preferably peer reviewed). You need to properly cite the references in your proposal. This part does NOT count towards the word limit.
        • Add a proposed timeline from start to finish and list each project member's responsibilities. See the Fall and Spring semester sample Gantt Chart. Note that this part does NOT count towards the word limit, and the dates in the sample are NOT reflective of true deadlines. Please refer to the schedule table above.
        • A contribution table with all group members' names that explicitly provides the contribution of each member in preparing the project task. This part does NOT count towards the word limit.
      • A checkpoint to make sure you are working on a proper machine learning related project. You are required to have your dataset ready when you submit your proposal. You can change your dataset later. However, you are required to explain why you need to change it (e.g., the dataset is not large enough because it does not give good accuracy compared to another dataset, supported by an accuracy comparison between the two datasets). The explanation can be added as a section to your future project reports, such as the midterm report.
      • Your group needs to submit a presentation of your proposal. Please provide us a public link which includes a 3-minute recorded video. I found that OBS Studio and GT subscribed Kaltura are good tools to record your screen. Please make sure your visuals are clearly visible in your video presentation.
      • 3 MINUTES is a hard stop. We will NOT accept submissions which are 3 minutes and one second or above. Conveying the message easily while being concise is not easy and it is a great soft skill for any stage of your life, especially your work life.
    • Midterm report (10%)

      • A checkpoint to make sure that you have had major progress in your project. You will add information to your project Proposal and turn it into your midterm report.
      • You need to provide us the link to your GitHub page. Make sure your GitHub repository is private.
      • The midterm report does not have a word count limitation.
      • A project midterm report is quite similar to your proposal with the exception of having actual results instead of potential ones:
        • Introduction/Background
        • Problem definition
        • Data Collection
        • Methods
        • Results and Discussion
          • All groups should have their dataset cleaned at this point
          • We expect to see data pre-processing in your project, such as feature selection (forward or backward feature selection, dimensionality reduction methods such as PCA, Lasso, LDA, ...), taking care of missing features in your dataset, ...
          • We expect to see at least one supervised or unsupervised method implemented, and the results need to be studied in detail, for example by evaluating your predictive model's performance using different metrics (take a look at ML Metrics).
        • An updated contribution table with all group members' names that explicitly provides the contribution of each member in preparing the project task.
      • You do not submit any video recording for the midterm report.
    • Final report (15%)

      • You need to provide us the link to your GitHub page. Make sure your GitHub repository is private.
      • A final report should include:
        • Introduction/Background
        • Problem definition
        • Data Collection
        • Methods
        • Results and Discussion (We expect to see multiple predictive models, and your team needs to compare them and evaluate the results. If your team is working on a Deep Learning project, you could fine-tune hyperparameters and explain how it could improve the results, or you could employ different architectures or methods)
        • Conclusions
        • An updated contribution table with all group members' names that explicitly provides the contribution of each member in preparing the project task.
      • Your group needs to submit a presentation of your final report. Please provide us a public link which includes a 7- to 9-minute recorded video. I found that OBS Studio and GT subscribed Kaltura are good tools to record your screen. Please make sure your visuals are clearly visible in your video presentation.
      • Ideally, we would like to see a 7 minute video, but we understand that some groups may find this difficult. Therefore we are allowing 9 minutes as the maximum hard stop time limit for the final video. We will NOT accept submissions which are 9 minutes and one second or above.

    • Sample Projects


    • General project guidance

      • Your project will be graded based on the following criteria:
      • Was the motivation clear?
        • What is the problem?
        • Why is it important and why should we care?
      • Were the dataset and approach used effectively?
        • How did you get your dataset?
        • What are its characteristics (e.g. number of features, number of records, temporal or not, etc.)?
        • Why do you think your approach can effectively solve your problem?
        • What is new in your approach?
      • Were the experiments, results, and conclusion satisfactory?
        • How did you evaluate your approach?
        • What are the results?
        • How do you compare your method to other methods?
      • How was the presentation in general?
        • Finished on time?
        • Effective visualizations? (Are they relevant? Do they help you better understand the project's approaches and ideas?)
        • Use of text (Succinct or verbose?)
      • Undergrad students can ONLY team up with Undergrad students (either section A or B), and Grad students can ONLY team up with Grad students. If you are in a Grad student team, you are required to have both unsupervised and supervised learning in your project. I highly recommend that Undergrad students use both unsupervised and supervised learning in their project. However, if you were to pick one, please go with supervised learning.
      • In order for you to obtain hands-on experience applying the topics covered in this course, you are expected to complete a term project utilizing real-world data. The project will encompass both unsupervised and supervised learning.
      • Each project needs to be completed in a team of five people (you will form your team on your own; in case you cannot find a team, we will randomly assign you to one). Team members need to clearly state their contributions in the project report. Once your teams have been formed and you have selected a topic, you will be assigned a mentor, who will provide you with general guidance on your project. It is important to note that your team will lead the project effort: obtaining the data, researching data-driven approaches to accomplish your project goal, and coordinating your own activities. The role of the mentor is solely to advise you, should you find yourself stuck and unable to make progress. We also accept a team of four if you really cannot find a fifth team member.
      • You will create a GitHub page for your project, which you will use to publish your main deliverables. There will be three deliverables published to your GitHub: a proposal, a midterm checkpoint, and a final report.
      • Seminars: To help you conduct your project successfully, we have project seminars where one or two TAs will present their ML projects, prior students' projects, or research or industrial projects. This will give you a good sense of what is being done in both academia and industry. Besides that, students can ask general questions about their class project and how to improve it in each seminar. Seminars will be streamed online and recorded, and the recordings will be published on the course website. As with the class lectures, please make sure that you join these seminars and familiarize yourself with the practical and real-world applications of ML. We will make an Edstem post for each seminar with its exact time and joining information.
      • Google Colaboratory allows free access to run your Jupyter Notebook. I strongly suggest you use it for your project, especially for teams that are going to employ Deep Learning. Don't forget to take advantage of Google Cloud Platform and AWS Educate as well.

    • Project Peer Evaluation

      • More information will be added on how Project Peer Evaluation will affect each team member's final project grade. Stay tuned.

  • Quizzes (15%)

    • There will be 13 quizzes throughout the semester.
    • We will consider your top 10 quiz scores. Each counted quiz is worth 1.5% of your final score.
    • [*IMPORTANT] All quizzes are mandatory, even if they do not count toward your final grade. If you miss a quiz, we will deduct points from your Class Participation score. Let's say you miss one quiz; we will deduct 1.5% from your class participation score. If you miss 4 quizzes, you will lose all of your class participation score, which is 5%, and we will NOT go beyond that (if you miss 5 quizzes, we only deduct 5% from your class participation score, not 7.5%). Your class participation score can be zero at its lowest, and it won't go to a negative number.
    • The topic of each quiz will coincide roughly with the content covered in class on that week.
    • Quizzes will have a duration of seven minutes for Undergrad students and six minutes for Grad students. Each quiz will have five multiple choice questions. All quizzes will be released weekly on Thursdays at 6:00 pm EST and the deadlines will be on Fridays AOE. As of now Quiz 13 is the only exception, but please check the class website every week for any updates. To check quiz deadlines, refer to the class schedule table. Any changes to quiz dates will be reflected on our course schedule page. Please make sure to check our class website before taking the quiz. Quizzes have a 48-hour "grace period" without any penalty.
      • If a student decides to make a submission during the grace period, they are responsible for all issues associated with that submission.
      • Course staff support is not guaranteed during the grace period; We provide help only when available.
      • You do not need to ask before using the grace period.
    • Quizzes measure your understanding of the topics and they will be mostly conceptual questions.
    • Quiz answers will be released as soon as all our students have taken them, including our ODS students. Please do not ask any questions on Edstem about a quiz you have just taken before we release the answers.
    • Quiz questions are selected randomly from our question bank, which means that students will not receive the same questions for their quiz.
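
The quiz scoring rules above (the top 10 quiz scores count at 1.5% each, and each missed quiz deducts 1.5% from class participation, up to the 5% participation weight) can be sketched in Python as follows. This is only an illustrative sketch, not the official grade calculation: the function names, the use of `None` to mark a missed quiz, and the assumption that each quiz score is a fraction between 0 and 1 are all hypothetical.

```python
from typing import Optional, Sequence

QUIZ_WEIGHT = 1.5            # percent of the final grade per counted quiz
COUNTED_QUIZZES = 10         # only the top 10 quiz scores count
PARTICIPATION_WEIGHT = 5.0   # percent; missed-quiz deductions stop at this cap


def quiz_contribution(scores: Sequence[Optional[float]]) -> float:
    """Percent of the final grade earned from quizzes.

    Each score is assumed to be a fraction in [0, 1]; None marks a missed quiz.
    """
    taken = sorted((s for s in scores if s is not None), reverse=True)
    return QUIZ_WEIGHT * sum(taken[:COUNTED_QUIZZES])


def participation_after_missed_quizzes(
    participation: float, scores: Sequence[Optional[float]]
) -> float:
    """Participation percent left after the missed-quiz deduction (never below 0)."""
    missed = sum(1 for s in scores if s is None)
    deduction = min(QUIZ_WEIGHT * missed, PARTICIPATION_WEIGHT)
    return max(participation - deduction, 0.0)


# Hypothetical student: 13 quizzes, two of them missed.
example_scores = [1.0] * 9 + [0.8, 0.5, None, None]
quiz_points = quiz_contribution(example_scores)
participation_left = participation_after_missed_quizzes(5.0, example_scores)
```

In this hypothetical example the lowest taken score (0.5) is dropped, and the two missed quizzes cost 3% of participation.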

  • Class participation (5%)

    • Edstem has statistics which give us many measurements of how much a student has been involved in Edstem activities, such as viewing posts, answering questions, asking questions and so on. We use these to determine your Class Participation score. We will also add class attendance to this score. At the end of the semester, we will define a minimum and maximum level of involvement across all the students, and your grade will be defined based on that.
    • We will RELEASE the class participation score on the last day of class, when we have all the scores for projects, quizzes and assignments. If you ask us for your participation score before the last day of class, we will say we do not know. So please be patient.

  • Bonus points (up to 8%)

    • About bonus points: Bonus points will be counted to always be beneficial for your final grade. More information on bonus points for assignments will be provided as the semester progresses. If it becomes necessary to curve grades, bonus points will be applied after curving, not before.
    • Undergrad and grad: You can obtain up to 5% bonus points by answering the challenging questions we may have in some of the HWs.
    • Undergrad: You will notice that all the HWs have bonus questions that grad students are required to answer but that are optional for undergrad students. You will receive up to 3% if you answer those questions. Note that these are different from the challenging questions, which are bonus for both grad and undergrad students.
    • How does it work? For example, hw 1 may have 30 bonus points, hw 2 may have 20 bonus points and so on. If you receive all the bonus points for all your HWs, we will add 5% to your final grade. If you are an undergrad and you answer all the challenging questions and all the grad-student questions, you will receive 8%.
    • Note and Example: There is a cap on how much extra credit you can get, so the bonus is scaled by (bonus points earned)/(total bonus points available throughout the entire semester). Let's say that by the end of the semester there was a total of 100 "bonus for all" points (100 is just a number we are randomly choosing here) across hw1, hw2, hw3, and hw4, and you earned 20 "bonus for all" points over the whole semester. Then at the end of the semester your grade will be bumped up by 5% * 20/100 = 1% from the "bonus for all" points. The calculation is similar for the 3% bonus for undergrad points.
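
The bonus scaling in the example above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the official calculation: the function and constant names are hypothetical, and the 30-point undergrad total used in the example is an invented number.

```python
BONUS_FOR_ALL_CAP = 5.0    # percent; challenging questions open to undergrad and grad
UNDERGRAD_BONUS_CAP = 3.0  # percent; grad-required questions answered by undergrads


def bonus_percent(earned: float, available: float, cap_percent: float) -> float:
    """Scale a bonus cap by the fraction of available bonus points earned."""
    if available <= 0:
        return 0.0
    fraction = min(max(earned / available, 0.0), 1.0)  # keep within [0, 1]
    return cap_percent * fraction


# The example from the text: 20 of 100 "bonus for all" points earned,
# so the bump is 5% * 20/100 = 1%.
bonus_for_all = bonus_percent(20, 100, BONUS_FOR_ALL_CAP)

# Hypothetical undergrad who also earned all 30 of 30 undergrad bonus points.
undergrad_total = bonus_for_all + bonus_percent(30, 30, UNDERGRAD_BONUS_CAP)
```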

  • Grade Calculator

    • Grade calculation can be slightly complicated considering we have different types of bonus questions. Students from last semester created this Grade Calculator Excel Sheet. Please give it a try to calculate your grade along the way.
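
As a rough illustration of how the grading weights listed above combine, here is a minimal Python sketch. It is not the Grade Calculator Excel Sheet and makes several assumptions: the dictionary keys and function name are hypothetical, each component score is a fraction between 0 and 1 of that component's weight, and any curving (which the policy says would be applied before bonus points) is ignored.

```python
# Weights in percent of the final grade, taken from the Grading section.
WEIGHTS = {
    "hw1": 12.5, "hw2": 12.5, "hw3": 12.5, "hw4": 12.5,            # Assignments (50%)
    "proposal": 5.0, "midterm_report": 10.0, "final_report": 15.0,  # Project (30%)
    "quizzes": 15.0,                                                # Quizzes (15%)
    "participation": 5.0,                                           # Participation (5%)
}
MAX_BONUS = 8.0  # percent; bonus points are capped at 8%


def final_grade(fractions: dict, bonus: float = 0.0) -> float:
    """Weighted sum of component fractions plus capped bonus, in percent."""
    missing = set(WEIGHTS) - set(fractions)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    base = sum(WEIGHTS[name] * fractions[name] for name in WEIGHTS)
    return base + min(max(bonus, 0.0), MAX_BONUS)


# Hypothetical student with 80% on every component and 1% of bonus.
example_grade = final_grade({name: 0.8 for name in WEIGHTS}, bonus=1.0)
```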

COVID-19 Policy

This semester is challenging due to the ongoing Covid-19 pandemic and a growing awareness of inequities. Please review the most up-to-date information related to specific services and guidelines for courses during this semester on the TECH Moving Forward website and in the Academic Restart Frequently Asked Questions.

Resources

No textbook will be required for this course; however, you are strongly encouraged to complete the readings indicated for each class. You may also find the following books very helpful:

Other resources, such as machine learning toolboxes and datasets, will be provided throughout the course.

Dataset Ideas (may need API, or scraping) - Thanks to Polo and everyone who contributed with suggestions to these datasets