CSCA 5622: Introduction to Machine Learning: Supervised Learning
Work you complete in the non-credit experience will transfer to the for-credit experience when you upgrade and pay tuition. See How It Works for details.
Cross-listed with DTSA 5509
Important Update: Machine Learning Specialization Changes
A previous version of the Machine Learning: Theory and Hands-On Practice with Python Specialization (taught by Professor Geena Kim) has been retired and replaced with a new and improved version (taught by Professor Daniel Acuna) that reflects the latest advancements in the field. The new version is available in the Spring 1, 2026 session.
- Course Type: Breadth (MS-CS) Pathway | Breadth (MS-AI)
- Specialization: Machine Learning: Theory & Hands-On Practice with Python
- Instructor: Dr. Daniel Acuna
- Prior knowledge needed:
- Programming languages: Basic to intermediate experience with Python, Jupyter Notebook
- Math: Basic level Probability and Statistics, Linear Algebra
- Technical requirements: Windows, Mac, or Linux; Jupyter Notebook
Learning Outcomes
- Use modern machine learning tools and Python libraries.
- Explain how to deal with linearly inseparable data.
- Compare logistic regression’s strengths and weaknesses.
- Explain what a decision tree is and how it splits nodes.
Course Grading Policy
| Assignment | Percentage of Grade | AI Usage Policy |
|---|---|---|
| 5 Quizzes | 40% (8% each) | Conditional |
| Programming Assignments (5) | 40% (8% each) | Conditional |
| Final Exam | 20% | No AI Use |
Course Content
Module 1 (Duration: 6 hours, 45 minutes)
Welcome to Introduction to Machine Learning: Supervised Learning. In this first module, you will begin your journey into supervised learning by exploring how machines learn from labeled data to make predictions. You will learn to distinguish between supervised and unsupervised learning, and understand the key differences between regression and classification tasks. You will also gain insight into the broader machine learning workflow, including the roles of predictors, response variables, and the importance of training versus testing data. By the end of this module, you will have a solid foundation in the goals and mechanics of supervised learning.
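The workflow described above can be sketched in a few lines of scikit-learn. This is an illustrative example, not a course assignment: the iris dataset, the k-NN classifier, and the 25% test split are all assumptions chosen to show the roles of predictors, response, and the training/testing split.

```python
# Minimal supervised-learning workflow sketch (illustrative choices):
# predictors X, labeled response y, train/test split, fit, then evaluate.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)            # X: predictors, y: response labels
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)    # hold out unseen test data

model = KNeighborsClassifier(n_neighbors=5)  # any classifier works here
model.fit(X_train, y_train)                  # learn from labeled training data
accuracy = model.score(X_test, y_test)       # estimate performance on new data
```

The key idea is that `accuracy` is computed only on data the model never saw during fitting, which is what makes it an honest estimate of generalization.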
Module 2 (Duration: 3 hours, 31 minutes)
In this module, you will expand your understanding of linear models by incorporating multiple predictors, including categorical variables and interaction terms. You will learn how to interpret partial regression coefficients and assess the fit of your models using metrics like R² and RMSE. As you build more complex models, you will also explore the risks of overfitting and the importance of model validation. By the end of this module, you will be equipped to build and evaluate multiple linear regression models with confidence.
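As a hedged sketch of these ideas, the snippet below fits a multiple linear regression with one numeric and one dummy-coded categorical predictor on synthetic data, then scores it with R² and RMSE. The variable names (`size`, `city`) and the data-generating coefficients are invented for illustration.

```python
# Illustrative multiple linear regression (synthetic data, assumed names):
# y depends on a numeric predictor and a 0/1 categorical predictor.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, mean_squared_error

rng = np.random.default_rng(42)
n = 200
size = rng.uniform(50, 200, n)                   # numeric predictor
city = rng.integers(0, 2, n).astype(float)       # dummy-coded categorical
price = 3.0 * size + 40.0 * city + rng.normal(0, 10, n)

X = np.column_stack([size, city])                # design matrix with 2 columns
model = LinearRegression().fit(X, price)

pred = model.predict(X)
r2 = r2_score(price, pred)                       # proportion of variance explained
rmse = np.sqrt(mean_squared_error(price, pred))  # error in units of the response
# model.coef_ holds the partial regression coefficients: the effect of each
# predictor holding the other fixed.
```

Because the data were generated with a slope of 3.0 on `size`, the fitted partial coefficient `model.coef_[0]` should land close to that value.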
Module 3 (Duration: 5 hours)
In this module, you will transition from predicting continuous outcomes to modeling categorical ones. You will learn how logistic regression models binary outcomes, like whether a customer will default on a loan, using probabilities and odds, and how to interpret the results. You will also explore k-Nearest Neighbors, a flexible, non-parametric method that classifies observations based on their proximity to others in the dataset. To evaluate your models, you will use tools like confusion matrices, accuracy, and precision/recall, gaining insight into how well your classifiers perform. This module lays the groundwork for tackling real-world classification problems with confidence and clarity.
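A minimal sketch of this module's two classifiers, assuming a synthetic binary task (the dataset, `n_neighbors=7`, and the split are illustrative, not course specifics):

```python
# Logistic regression vs. k-Nearest Neighbors on a synthetic binary task,
# evaluated with a confusion matrix and precision/recall (illustrative setup).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix, precision_score, recall_score

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

logit = LogisticRegression().fit(X_tr, y_tr)               # models the log-odds
knn = KNeighborsClassifier(n_neighbors=7).fit(X_tr, y_tr)  # non-parametric

y_hat = logit.predict(X_te)
cm = confusion_matrix(y_te, y_hat)     # rows = true class, columns = predicted
precision = precision_score(y_te, y_hat)
recall = recall_score(y_te, y_hat)
```

Note the contrast: logistic regression produces probabilities from a fitted linear log-odds model, while k-NN simply votes among the nearest training points and needs no fitted coefficients at all.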
Module 4 (Duration: 4 hours, 30 minutes)
In this module, you will learn how to evaluate your models more reliably and improve their generalization to new data. You will explore resampling methods like k-fold cross-validation and the bootstrap, which help estimate test performance without needing a separate test set. You will also be introduced to the regularization techniques Ridge and Lasso that prevent overfitting by constraining model complexity. Using cross-validation, you will learn how to select the optimal regularization strength, balancing predictive accuracy with model simplicity. These tools are essential for building models that perform well not just in theory, but in practice.
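The selection step described above can be sketched as a grid search over Ridge penalties with 5-fold cross-validation. The candidate `alpha` grid below is an assumption for illustration; the course may use different values or Lasso instead.

```python
# Picking the Ridge regularization strength by 5-fold cross-validation
# (illustrative alpha grid and synthetic data).
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=200, n_features=20, noise=5.0,
                       random_state=0)

grid = GridSearchCV(Ridge(),
                    {"alpha": [0.01, 0.1, 1.0, 10.0, 100.0]},
                    cv=5)                  # 5-fold cross-validation per alpha
grid.fit(X, y)

best_alpha = grid.best_params_["alpha"]    # strength with best CV performance
cv_score = grid.best_score_                # mean R^2 across held-out folds
```

Each candidate `alpha` is scored on data held out from fitting in every fold, so the chosen strength balances fit against complexity without touching a separate test set.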
Module 5 (Duration: 3 hours, 39 minutes)
This module introduces you to one of the most intuitive and interpretable machine learning models: decision trees. You will explore how trees split the feature space into regions, how to read their structure, and why they are prone to overfitting if left unchecked. Trees are just the beginning; this module also introduces ensemble techniques that elevate predictive accuracy by combining many models. You will get a first look at methods like bagging, random forests, and boosting, and see how they compare to the models you have already studied. By the end, you will understand when and why tree-based models can outperform simpler approaches, especially in capturing complex, non-linear relationships.
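A small sketch of the tree-versus-ensemble comparison, under assumed settings (synthetic data, 200 trees in the forest); it is meant only to show the API shape, not to reproduce any course result:

```python
# A single decision tree vs. a random forest ensemble on the same split
# (illustrative data and hyperparameters).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=600, n_features=10, n_informative=5,
                           random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

tree = DecisionTreeClassifier(random_state=1).fit(X_tr, y_tr)
forest = RandomForestClassifier(n_estimators=200,
                                random_state=1).fit(X_tr, y_tr)

tree_acc = tree.score(X_te, y_te)      # one deep tree tends to overfit
forest_acc = forest.score(X_te, y_te)  # averaging many trees reduces variance
```

An unpruned single tree can fit the training data perfectly yet generalize poorly; the forest averages many decorrelated trees, which typically reduces variance on the test set.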
Final Exam (Duration: 30 hours)
Final Exam Format: Exam
This module contains materials for the final exam. If you've upgraded to the for-credit version of this course, please make sure you review the additional for-credit materials in the Introductory module and anywhere else they may be found.
Notes
- Cross-listed Courses: Courses that are offered under two or more programs. They are considered equivalent when evaluating progress toward degree requirements; you may not earn credit for more than one version of a cross-listed course.
- Page Updates: This page is periodically updated. Course information on the Coursera platform supersedes the information on this page. Click the View on Coursera button above for the most up-to-date information.