All Ratings and Reviews for Yizhou Sun
Beware of the demographics of students enrolled. There are a lot of graduate students in this class, who’ve often had 3.8+ GPAs in undergrad. Even the undergrad students tend to be geniuses - smarter than your typical UCLA CS students. Many of these students also had prior exposure to data mining and ML - I like to think that the word “introduction” in the course name is misleading, because this class isn’t an introduction for most students who enroll.
First of all, let’s talk about the grade distribution. Looks nice, right? Well, I’ll tell you that it’s misleading, and I took the bait. Why so many A’s? Because of the above kinds of students who enroll in this class, who skew the grade distributions of the exams and project. If you’re not one of those students, then this class is actually challenging (due to the people you have to compete against).
The content in the class involves a lot of math. If you are thinking of getting into data mining or ML, be prepared for that. I honestly found that I didn’t like this area of CS, which made this class less enjoyable for me. If you’re passionate about it, then go for it; that passion can certainly make this class feel easier.
The lectures are boring - it’s basically the professor reading through slides full of convoluted math equations that she’ll breeze through as if everyone understands them. I guess the prof caters to the geniuses in this class.
The assignments and exams themselves were reasonable, if not slightly on the challenging side. What ruined it for me was the people in this class I had to compete with.
In this class, you get six (6) HW assignments. In each assignment, you fill in boilerplate code (typically 10-ish lines), run the program, and write a 2-3 page report answering questions about the topic and about your program results. The top five assignment scores are counted toward your grade. One assignment also has extra credit (in my quarter, it was HW3, the neural network and k-nearest neighbors assignment). The HW can have some inconsistencies - for example, one assignment may require Python 2.7 while the next one will require 3.7.
On the midterm, you can bring a one-page cheat sheet, and on the final, you can bring two. You are also allowed to bring simple calculators for both exams. Most of the questions were reasonable and involved knowing the details (for example, the strengths and weaknesses) of different algorithms and how to apply them. But again, the geniuses in this class will skew the exam grade distributions to be really high.
There is a group term project (along with a report) in which, given a data mining problem, you write an algorithm to solve it, and compete with others on Kaggle. Like I said, many students in this class have done data mining and ML before, so you’d better group up with those people or else you’ll get screwed over.
That being said, this class is challenging due to the people you are competing against for your grade. If you hear someone tell you, “this class is easy”, don’t listen to them. This class is not easy; they just think it’s easy because they’re a CS genius, they’ve done data mining before, or are at least passionate about the subject.
But if you enjoy data mining and ML, then by all means take this class.
This class is absolutely not an easy ride like CS161/146/188. The workload for this class is about 2 * CS146 or 3 * CS188.
Actually, this class should make CS146 a prereq because the first 5 weeks basically cover the entire quarter of 146, and at a deeper level. I think those who haven't taken CS146 and have no ML background will surely have a hard time.
Plus, there's a challenging exam, a challenging project where even reaching the baseline isn't easy, and time-consuming HW. I would say the workload is closer to classes like CS144 or CS118, not as crazy as CS111 or CS131 though.
So, adjust your expectations. This class is much more challenging than the other ML/AI themed classes in the CS department.
I had 0 experience with ML in general going into this class, but Professor Sun's generous (project-heavy) grading scheme and great extra credit opportunities allowed me to survive this class. Professor Sun is super nice and is more than happy to explain concepts again for better understanding, but I found it kinda hard to pay attention in class over Zoom. The slides are super formula-dense in places, and the useful information can be difficult to parse at times.
The homework is definitely challenging, even though it's basically fill-in-the-blanks code. The course project was by far my favorite part of this class, and I found that experimenting with different models was how I learnt the material best. The midterm was harder (and worth more) than the final, but imo they were fair tests. Overall would recommend this class.
I really enjoyed taking this class with Prof. Sun – this was one of my favorite classes in the program. While the material can be confusing especially for students with no prior knowledge, Sun made it a point to address all questions that were asked during lecture, and discussion sections were really helpful in seeing examples worked out. Many of the more complex derivations are more for students that are interested in the topics (as touched on in older reviews), but I think that the exams were pretty fair to those that understood the high level concepts. There is substantial overlap with M146, but I think the teaching staff received that feedback and is working to change that in future iterations. I would definitely recommend taking this course!
FYI regarding the previous reviews: There was a very vicious review about the course and Professor Sun, and I believe it has been deleted.
Now, my two cents:
1. The difficulty of the course:
Difficulty is a subjective measure. I would say this course is relatively easy. As an undergrad who has not taken CS146, I did pretty well on both the midterm and final exams, so I believe no prior knowledge is required to perform well in the exams. Of course, there will be people who have taken CS146 and/or have prior experience in machine learning, and some may think it is unfair and adds positive skew to the grade distribution. Face the truth: life is never fair. Don't blame this on the course or Professor Sun. All I want to assert is that you can do well (at least in homework, exams, and quizzes, >=75%) without any prior knowledge about machine learning.
The group project varies from quarter to quarter. I had a pretty hard time trying to contribute to the group project and did feel bad when the trained model did not yield good results. With that being said, it does not mean I did not have fun. Getting an ML notebook instance with 16 vCPUs + 108GB and 1 Nvidia Tesla P100 GPU with CUDA drivers, and trying different frameworks, is pretty fun by my standards. At the time of this review, the grade for the group project has not been released, so I am afraid there is no way I can tell how much you will get if your model does well/okay/a little bit badly.
2. Clarity of lecture
The lectures are pretty clear, although you have to be familiar with the math and notation to be able to understand the material. If you have trouble with matrix multiplication or understanding things you learned in Stats 100A, it is better to get them straight before taking this course.
I don't like the homework. The major difficulty I faced was filling in other people's code and dealing with the pandas and numpy libraries. If you are unfamiliar with those libraries and Python (in general) like me, I strongly recommend studying those libraries because they can help you a lot.
If you are interested in the academic track, talk to Professor Sun.
Writing this because of the other review, to provide a different perspective. The concepts in this class aren't too difficult, especially if you've taken 146, or any other machine learning class before. Professor Sun does a good job of explaining the technical details of each algorithm covered, and is open to anyone that needs clarification during lecture. I'm not sure what this person is saying, but I don't think there is a single grad student in our class right now, at least that I know of. The course project is dumb, imo. You're allowed to use any framework you'd like, and doing real data mining is sort of pointless, as the best scores are mostly just 11 lines of Python. It was really fun researching all of the methods we could use though, and working with a team. The homeworks are jank. Great concepts, but ill-defined and built terribly, but again I don't think this is Professor Sun's fault, and I think this is the first quarter they've had homeworks. Overall this is a pretty average CS elective. I wouldn't say it's a must-take, but if you haven't taken 146 I'd recommend it. Definitely easier than 111.
Professor Sun taught this class as if the whole course was a review session for multiple large and complicated ML concepts. She went extremely fast because she had to cover all of M146's material in the first half of the quarter + new material. We had HW due almost every week that had pretty vague instructions and required a lot of time to figure out how the code/calculation works. But, if you're willing to put in the time and effort to do the HW yourself, I think you'll understand the material pretty well. Each HW took me about a day of my weekend. She expects students to know how to use dataframes and matrix operations in numpy very well, as your code will likely take hours to run if it is not optimized.
The course project was a group project to predict the number of COVID cases and deaths, which had literally almost nothing in common with any topics taught in this course. So, to do well on the project, you either find teammates who are already pretty experienced with implementing different ML models or you spend hours trying to figure it out yourself. The number of hours put into the project does not really correlate with the performance of your model. You could spend hours processing your data and using some complicated model from some research paper online and get significantly worse results than simply using a LINEAR REGRESSION on the data provided without using any of the other features. Hopefully the projects will be easier in future years.
The exams weren't extremely difficult if you actually understand the material. BUT, to understand the material, it takes WAY more than just attending her lectures (unless you're some kind of ML genius), as her lectures were very fast and often made a simple concept (ex: Apriori algorithm) extremely complicated. You have to put a lot of time and effort outside of class into deciphering her slides or just reading other people's online explanations. Good thing was that exams were open book, so you didn't have to memorize a lot of definitions and formulas. But the professor did use Respondus.
Nevertheless, looking back at the amount of effort I put in to understand the material, I did learn a lot of useful knowledge in this class + significantly improved my Python skills. The material was interesting, specifically in the second half where we learned frequent pattern mining and text data mining. The professor, despite her often confusing lectures, did always try her best to answer any questions students may have.
tl;dr: professor not great at explaining. this class is a lot of effort and self-studying, but you do learn a lot if you're willing to put in that effort.
I personally felt like this class was extremely theoretical and the professor wasn't great at explaining most concepts. She would start each concept with a lot of theory and complicated math and derivations and then after everyone is extremely confused, she'll go through an example which will seem so simple that it's impossible to relate to the theory that she just went through. She's not too great at explaining most concepts which is why I was very confused by most of the theoretical portions. Honestly, I've done M146 and I felt like I got even more confused about certain concepts after doing this class. I would not recommend taking this class with this professor.