Introduction to Data Mining
Most Helpful Review
Beware of the demographics of students enrolled. There are a lot of graduate students in this class, who’ve often had 3.8+ GPAs in undergrad. Even the undergrad students tend to be geniuses - smarter than your typical UCLA CS students. Many of these students also had prior exposure to data mining and ML - I like to think that the word “introduction” in the course name is misleading, because this class isn’t an introduction for most students who enroll.
First of all, let’s talk about the grade distribution. Looks nice, right? Well, I’ll tell you that it’s misleading, and I took the bait. Why so many A’s? Because of the above kinds of students who enroll in this class, who skew the grade distributions of the exams and project. If you’re not one of those students, then this class is actually challenging (due to the people you have to compete against).
The content in the class involves a lot of math. If you are thinking of getting into data mining or ML, be prepared for that. I honestly found that I didn’t like this area of CS, which made this class less enjoyable for me. If you’re passionate about it, then go for it; that passion can certainly make this class feel easier.
The lectures are boring - it’s basically the professor reading through slides full of convoluted math equations that she’ll breeze through as if everyone understands them. I guess the prof caters to the geniuses in this class.
The assignments and exams themselves were reasonable, if not slightly on the challenging side. What ruined it for me was the people in this class I had to compete with.
In this class, you get six (6) HW assignments. In each assignment, you fill in boilerplate code (typically 10-ish lines), run the program, and write a 2-3 page report answering questions about the topic and about your program results. The top five assignment scores are counted toward your grade. One assignment also has extra credit (in my quarter, it was HW3, the neural network and k-nearest neighbors assignment). The HW can have some inconsistencies - for example, one assignment may require Python 2.7 while the next one will require Python 3.7.
On the midterm, you can bring a one-page cheat sheet, and on the final, you can bring two. You are also allowed to bring simple calculators for both exams. Most of the questions were reasonable and involved knowing the details (for example, the strengths and weaknesses) of different algorithms and how to apply them. But again, the geniuses in this class will skew the exam grade distributions to be really high.
There is a group term project (along with a report) in which, given a data mining problem, you write an algorithm to solve it, and compete with others on Kaggle. Like I said, many students in this class have done data mining and ML before so you better group up with those people or else you’ll get screwed over.
That being said, this class is challenging due to the people you are competing against for your grade. If you hear someone tell you, “this class is easy”, don’t listen to them. This class is not easy; they just think it’s easy because they’re a CS genius, they’ve done data mining before, or are at least passionate about the subject.
But if you enjoy data mining and ML, then by all means take this class.
Most Helpful Review
"Data Mining" sounds like a great class, until you realize that it means sitting through 1.5 hours of research papers converted into incomprehensible PowerPoint slides. Professor Wang is the epitome of a professor who does not know how to teach, because she clearly never received training in it. Examples are not worked out; complex diagrams, definitions, and tables are breezed by before they can even be fully examined; and every slide has a million items on it. Her style is to drop an unintelligible stream of academic jargon and acronyms before saying, "Okay? Great, let's move on." People have no idea where to even begin asking questions.
The tests are all multiple-choice, which sounds easy for the pattern-matching powers of most modern college students. Except they usually require working out long algorithms. Now try repeating 10-minute computations for 20 questions in a row with only 100 minutes for the midterm. The average and median came out to 13/20. You do the math, because she clearly didn't.
This is a common predicament in the math lowerdivs: a non-helpful professor and bad presentation of material. In that case, you could turn to the textbook (or better yet, Khan Academy). In a CS upperdiv like this, you can only hopelessly Google for slightly more comprehensible slides from other universities. The professor also doesn't follow the textbook (which is similarly jargon- and math-heavy). Your only saving grace is the TAs, and not even all of them were helpful enough.
To top it all off, you have a group project to struggle through for the last half of the quarter. Overall, CS 145 is a stinking example of all that is wrong with the theory-heavy CS curriculum at UCLA. Avoid it at all costs and go do something useful with your time. The world has many real problems to solve.