Python and Other Technologies for Data Science

Description: Lecture, three hours; discussion, one hour. Enforced requisite: course 20. Covers use of Python and other technologies for data analysis and data science. Focus on programming with Python and selection of its libraries: NumPy, pandas, matplotlib, and scikit-learn, for purpose of data processing, data cleaning, data analysis, and machine learning. Other technologies covered include Jupyter notebook and Git. Intended for Data Theory majors as introduction to Python language and libraries most frequently used in data science. Letter grading.

Winter 2022 - **The quarter I took this class was partially online due to COVID, so your experience may be different.**

Overall, Miles Chen is a pretty good, clear lecturer. The topic itself is also useful, and what you learn here should help jumpstart your knowledge for data analyst and data science positions, so I think it's a good class to take. Note, however, that because this class has prerequisite Stats courses, there's much less guidance than in, say, Stats 20 (at least this was my experience). As someone who wasn't a very experienced programmer compared to some of my peers, I struggled a bit more than I'd like to admit during my time in this class.

The grading scheme was as follows: 15% lecture-viewing quizzes, 20% DataCamp homework, 36% homework, 25% final, and 4% Campuswire participation.

Viewing quizzes, DataCamp, and Campuswire were basically free points, so please do NOT forget about them and waste them! The lecture-viewing quizzes weren't related to class content; they were random letters, so even if you're a data science expert you can't just skirt the quizzes. DataCamp was graded on completion and was easy because there are tutorials on there. For Campuswire you just needed to get 150 points, with an opportunity for extra credit if you went further (I think to 200?). Also, the premium DataCamp subscription you get for this class is valid for 6 months, so you can keep using it the quarter after you take this class, which is excellent.

I mentioned before that you have to be more proactive and independent in this class than in your other Stats lower divs, and I think the homework reflects this. The first assignment was pretty straightforward, and it got more challenging from there. Not everything was straightforward, and you sometimes had to think beyond the lecture examples, which admittedly are pretty clear but don't always directly relate to the homework.
That said, reading the online documentation for the different libraries and data science technologies you're using is pretty helpful, as is talking things out with peers. I didn't go to office hours too often, but both Professor Chen and my TA Lucy seemed quite approachable and helpful, so I'm sure that's a good way to go. Also, there were a lot of issues on campus that quarter that caused people mental distress, so Professor Chen was kind enough to give everyone an extension on homework at that time.

The final was absurdly challenging, and after talking with peers afterward, I don't think I was alone in not having completed the whole exam. The topics that appeared were also not what I expected; for example, there was a much bigger focus on object-oriented programming than I had anticipated. Personally, I think the final was the worst part of this class. That said, Professor Chen was pretty understanding and adjusted the grading upward on the final.

Overall, I'd take this class again with Professor Chen and would recommend that you take it as well.
