CS221: Artificial Intelligence

Intro

CS221 (along with the sheer number of other classes and extracurriculars) is what made this term hard.

The Material

I learned a lot about working with a group and developing a large project. In terms of AI content, here is the course material:

We started out with basic search. Most AI problems can be phrased as search problems. The only issue is that basic searches are extremely time consuming, which makes them impractical for all but the simplest problems. We learned about a generic search algorithm: maintain a queue; when you 'expand' a node, remove it from the queue and add its connected nodes to the queue; and when you expand a node, check whether it's a goal node, in which case you can return that node immediately. The only difference between searches is the type of queue: first in, first out --> breadth first search; last in, first out --> depth first search; and cost-based or heuristic-based searches like A* use a priority queue. Beyond that, we covered A* search itself, a search that uses a divide-and-conquer heuristic, and greedy hill climbing and optimization search.
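To make the "only the queue changes" idea concrete, here is a minimal sketch in Python. It is my own illustration rather than the course's code: the function name `generic_search`, the `strategy` strings, and the toy `GRAPH` are all made up for this example. With the default zero heuristic, the priority-queue variant is uniform-cost search; passing an admissible heuristic would make it A*.

```python
import heapq
from collections import deque
from itertools import count

# Hypothetical toy graph: node -> list of (neighbor, step_cost).
GRAPH = {
    "A": [("B", 1), ("C", 5)],
    "B": [("C", 1), ("D", 5)],
    "C": [("D", 1)],
    "D": [],
}

def generic_search(graph, start, goal, strategy="bfs", heuristic=lambda node: 0):
    """Return (path, cost) to the goal, or None if the goal is unreachable.

    strategy: "bfs" (FIFO queue), "dfs" (LIFO stack), or "priority"
    (priority queue ordered by cost so far plus heuristic).
    """
    tie = count()  # tie-breaker so the heap never has to compare paths
    # Each entry: (priority, tie, cost_so_far, node, path_to_node)
    first = (heuristic(start), next(tie), 0, start, [start])
    frontier = [first] if strategy == "priority" else deque([first])
    expanded = set()

    while frontier:
        # The only strategy-specific step: which entry leaves the queue.
        if strategy == "bfs":
            entry = frontier.popleft()
        elif strategy == "dfs":
            entry = frontier.pop()
        else:
            entry = heapq.heappop(frontier)
        _, _, cost, node, path = entry

        if node == goal:          # goal check happens at expansion time
            return path, cost
        if node in expanded:      # skip nodes we already expanded
            continue
        expanded.add(node)

        for neighbor, step in graph.get(node, []):
            if neighbor in expanded:
                continue
            new_cost = cost + step
            new_entry = (new_cost + heuristic(neighbor), next(tie),
                         new_cost, neighbor, path + [neighbor])
            if strategy == "priority":
                heapq.heappush(frontier, new_entry)
            else:
                frontier.append(new_entry)
    return None
```

On this toy graph, breadth first search finds a path with the fewest edges, while the priority-queue version finds the cheapest path, which is longer in edges but lower in total cost.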

We spent a little bit of time on applying search to solving constraint satisfaction problems and phrasing slightly more complicated problems in CSP terms (e.g., is there a way to color a given map so that no two adjacent countries have the same color if you only have 4 colors to work with?). Specifically, we learned about consistency checking, forward checking, and arc consistency / constraint propagation, in addition to the heuristics of choosing the most constrained variable or the least constraining value.
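The map coloring question can be sketched as a small backtracking search. This is an illustration under my own assumptions, not the course's implementation: `color_map`, `_backtrack`, and the `AUSTRALIA` adjacency map are names I picked for the example. It uses the most-constrained-variable heuristic and prunes neighbors' remaining colors after each assignment (often called forward checking); it does not do full arc consistency or least-constraining-value ordering.

```python
# Hypothetical example map: region -> set of adjacent regions.
AUSTRALIA = {
    "WA": {"NT", "SA"},
    "NT": {"WA", "SA", "Q"},
    "SA": {"WA", "NT", "Q", "NSW", "V"},
    "Q": {"NT", "SA", "NSW"},
    "NSW": {"Q", "SA", "V"},
    "V": {"SA", "NSW"},
    "T": set(),
}

def color_map(neighbors, colors):
    """Return a dict region -> color with no adjacent clash, or None."""
    domains = {region: set(colors) for region in neighbors}
    return _backtrack({}, domains, neighbors)

def _backtrack(assignment, domains, neighbors):
    if len(assignment) == len(neighbors):
        return assignment
    # Most constrained variable: the unassigned region with the
    # fewest colors still available.
    region = min((r for r in neighbors if r not in assignment),
                 key=lambda r: len(domains[r]))
    for color in sorted(domains[region]):
        # Copy the domains so a failed branch can be undone for free.
        new_domains = {r: set(d) for r, d in domains.items()}
        new_domains[region] = {color}
        dead_end = False
        for other in neighbors[region]:
            if other not in assignment:
                new_domains[other].discard(color)  # prune the neighbor
                if not new_domains[other]:         # neighbor has no color left
                    dead_end = True
                    break
        if dead_end:
            continue
        result = _backtrack({**assignment, region: color}, new_domains, neighbors)
        if result is not None:
            return result
    return None
```

Calling `color_map(AUSTRALIA, ["red", "green", "blue"])` returns a valid coloring, while two colors are not enough because WA, NT, and SA all border each other.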

The next part of the class was about 'supervised machine learning.' With supervised learning algorithms, you know the correct answers, you let the machine make a guess, and you tell the machine how wrong it was. Then, it changes the way that it makes its guesses based on how wrong it was. This works well if you know exactly how the machine should have guessed in a number of cases and you want it to be able to use those cases to predict future cases. Specifically, we learned about linear and logistic regression using a gradient descent algorithm, and basic, bagged, and boosted decision trees.
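Here is a minimal sketch of that guess-and-correct loop for linear regression with gradient descent. The function name `fit_line`, the learning rate, the step count, and the toy data are my own assumptions for illustration. The machine's guess is w * x + b, "how wrong it was" is the mean squared error, and each step nudges w and b against the gradient of that error.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # How wrong the current guess is on each training example.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradient of the mean squared error with respect to w and b.
        grad_w = (2 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2 / n) * sum(errors)
        # Move a small step in the direction that reduces the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Toy training data generated from y = 2x + 1 (the "correct answers").
XS = [0.0, 1.0, 2.0, 3.0, 4.0]
YS = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(XS, YS)
```

On this data the fitted w and b end up very close to 2 and 1. The learning rate matters: too large and the updates overshoot and diverge, too small and training takes far more steps.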

We then moved on to reinforcement learning. This is good when you don't know the correct answer to a given problem, but you know what outcome you want. For example, in a chess game, you don't know what move is most strategic on move 34, but you know that you want to win, and you might be able to tell the machine how far away it is from winning on any given move (e.g., how many pieces it has left versus how many pieces the opponent has). We spent a lot of time on learning about Markov Decision Processes.
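As a rough illustration of what a Markov Decision Process looks like in code, here is a sketch of value iteration, one standard way to solve an MDP when the transitions and rewards are known. I am not claiming this is the algorithm the course emphasized; the `CHAIN` world, the function names, and the discount `gamma` are all assumptions made up for the example.

```python
# Hypothetical 4-state chain: state -> action -> list of
# (probability, next_state, reward). State 3 is terminal (no actions),
# and the only reward is 1.0 for stepping into it.
CHAIN = {
    0: {"left": [(1.0, 0, 0.0)], "right": [(1.0, 1, 0.0)]},
    1: {"left": [(1.0, 0, 0.0)], "right": [(1.0, 2, 0.0)]},
    2: {"left": [(1.0, 1, 0.0)], "right": [(1.0, 3, 1.0)]},
    3: {},
}

def action_value(outcomes, values, gamma):
    """Expected reward plus discounted value of the next state."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)

def value_iteration(transitions, gamma=0.9, tol=1e-9):
    """Repeat Bellman updates until no state's value changes by more than tol."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:  # terminal state keeps value 0
                continue
            best = max(action_value(o, values, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values

def greedy_policy(transitions, values, gamma=0.9):
    """Pick, in each non-terminal state, the action with the best value."""
    return {
        s: max(actions, key=lambda a: action_value(actions[a], values, gamma))
        for s, actions in transitions.items()
        if actions
    }

VALUES = value_iteration(CHAIN)
POLICY = greedy_policy(CHAIN, VALUES)
```

In this chain the values shrink by a factor of `gamma` for each step away from the goal, and the greedy policy is to move right everywhere, which is the same "know what you want, not which move is right" idea on a tiny scale.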

That brought us about halfway through the term. In the last half, we spent more time on specific problems in AI and on more advanced models. Since we already knew the basics of each model, the emphasis was more on learning a new algorithm to solve a task more efficiently rather than having to learn what the task even is. Some of the areas that we talked about were computer vision and optical recognition, medical diagnosis, and natural language processing. Some of the new models we learned about were Bayesian networks, Hidden Markov Models, and Particle Filters.

Interspersed throughout the course were cool tidbits of things that AI has done. For instance, Stanford taught a helicopter to fly upside down, taught a car to do tricks like turn around really quickly, drive sideways, drive on ice, and drive long distances without any human input, and taught a machine to classify objects and respond to voice commands. AI can do some pretty cool stuff. And InSTEDD was founded on using AI to do natural language processing and infer disease outbreaks (now they do lots of other cool things too).

Assignments (They Were Hard)

Starting out, all of the material was very basic. It was all a review of things that I learned about search in CS106B or about probability in CS109. As a result, the first problem set hit hard. It may have been an all-nighter between coding and proofs (there were only four assignments, plus the final assignment, in the term, but each assignment had both a coding portion and a more mathy portion) -- I don't actually remember. The reason that it was so difficult is that, rather than testing the basics like most classes do, they assumed that you understood the basics, extracted the most difficult parts of the reading and lecture, and came up with a conceptually difficult problem. They pulled it off masterfully, though. They did a tremendous amount of the work for us on the coding portions, which meant that if we understood everything we had to do (i.e., if we understood the material), it would be a simple task to add in what we needed to do. It was the same way with the problem set portion. The problems were really hard to understand, but once we understood them, the entire problem set was only a few pages. Also, because the class was so geared towards applied AI techniques, it was extremely evident that each coding assignment and problem set was only a few steps removed from an actual AI project that is either an unsolved problem or a problem that was only solved a few years ago, so there was never a question of why we were doing the work that we were doing. Everything was meaningful.

In comparison to my other classes: I was also very satisfied with the assignments in CS107. CS107 was a programming class, which meant that it was applied rather than theoretical ("can you do this?" versus "do you know this?"), and in those classes, you don't really learn the material until you have programmed it. Because CS221 was more theoretical, they couldn't quite make it so that you would learn about a type of machine learning or search space by doing, but the way they laid out their assignments was a very concise way of testing conceptual understanding. I was less satisfied with the assignments in CS103, a very theoretical class. I guess since it's the introductory theoretical CS class, they have to teach the mechanics of doing proofs and such, but because I had already picked up most of that before, I felt like I had to do a lot of work on the assignments even after I understood all of the material.

Later on in the term, the assignments weren't quite so torturous. They were still very hard, but I guess we adapted to them or learned about working harder. I think that a large part of it was that we got better at group work. In CS221, all of the assignments were group-based. You could program with 2 other people, and you could discuss how you would solve the problems on the problem set with anyone as long as you wrote it up on your own. With the first few assignments, my group was spending a lot of time talking about the problem set portion; on the later assignments, we did the problem sets on our own and just conferred with each other when we had a problem (e.g., the class assumed knowledge of multivariable calculus, which I didn't have, so Brennan taught me multivariable calculus in 10 minutes for the problem set that used it) or to check our answers, which ended up going quite a bit faster. We also got better about distributing work and thinking as a group. Or perhaps the assignments were just as torturous towards the end, but because we had completed the earlier ones, we knew that it could be done, so it wasn't a desperate race to finish. Rather, it was a steady progression of work towards a goal. That goal may not come until 4 or 7am, but it will come. I also realized the extent to which group work on difficult projects lends itself to all-nighters. With individual work, you can work an hour during lunch, an hour between classes, an hour at 3am, and keep on like that until finished. Because it is hard to find a time when three busy students can meet to work on a project, and we often didn't begin until 10pm, it is much harder to break up the assignment into multiple work sessions.

Also, there were a ton of 'late days' (in the CS department, classes give out 'late days,' which are basically pre-approved extensions. Most classes will have between 2 and 4 late days. CS221 had 7.) Thus, if we needed more time on an assignment, it didn't feel like wasting a precious resource to turn it in late. In other words, our all nighters were the result of having other projects that we needed to do the next day and the day after that, not the result of any cruelty on the part of CS221.

Arrogance

Because the CS221 assignments were all conceptually difficult, I went into my other classes with a good deal of arrogance. Yes, people say that CS107 is a hard class, but it is by no means CS221, and I'm doing fine in CS221, so CS107 is easy. And that arrogance served me well. It is the same way of thinking that I got from doing debate. Yes, people say that an IB class or that taking an AP test without taking the class, or that taking a full class schedule and a full extracurricular schedule is hard, but it is by no means as hard as debate, so all of that is easy. Once I have climbed a high mountain, all lower mountains become mere hills. CS221 was hard, but it helped me with everything else that I did by giving me a more arrogant attitude.

It was arrogance, not rational confidence. Confidence is entering territory that has been explored. Arrogance is treating unexplored territory as if it had been explored. My term was hard, and there wasn't much wiggle room. If I had gotten badly sick, it would have been hard to make the time to recover. Rushing headlong into the hard terms seems to have worked out so far, though.

Arrogance helps in some situations. It gives me all of the benefits of self-confidence even when I shouldn't be so confident. That means that I take risks and that I realize that, with stupid rules, the people who make them are often willing to relax them. I would not be as successful as I have been with only rational self-confidence.

However, arrogance is arrogance, a vice rather than a virtue. Being arrogant with my course load turned out well, but humility would probably have been better in my classes. In CS221, I was humble. It was the first class that I had taken that was really hard. Thus, I was fine when I missed a few points. In CS107, I was arrogant, so when I missed some points, I sent out grade-grubbing emails even though I was doing fine in the course and I could see why they would have given me the grade that I got. 

The problem is that I haven't found a suitable substitute for arrogance. When I think of problems as hard, they are much harder to solve. I have had a series of technical interviews. The first few were the section leading interviews in the past. The last few have been the section leading interviews this term and internship interviews this term. In the past, I was asked questions that I knew, but I was nervous, and I choked. This term, I felt good about myself, and the arrogance overshadowed the nervousness, and I succeeded. 

I guess you need to make a choice when living in a world with imperfect information.

Midterm

CS221 also had a midterm. The midterm was in week 8 of 10. I guess they made the midterm so late because they had a final project rather than a final test, so they wanted to test as much as they could. The midterm was fairly similar to the homework problems -- conceptually hard, but not too bad if you knew the material very well. I didn't do too well on the practice midterm, but I studied a lot and made a good notes sheet, and I did well on the midterm itself.

The midterm was out of 140. The median was in the 80s. I was in the 100s. Because of the humility that I discussed in the previous section, I can accept that I was imperfect. However, this is telling of the mental difference between arrogance and humility: with arrogance, my initial reaction, in the gray area, is to blame others; with humility, my initial reaction, in the gray area, is to blame myself. I still strive to be better, so I am frustrated with myself that there was a problem that I did very poorly on. Arrogance can act as a shield, preventing me from beating myself up.

I do try to be humble when comparing myself with others. The only virtues that I publicly ascribe to myself are that I care about helping people and that I work hard because those virtues are accessible to everyone around me and are much less the result of the luck of birth than things like 'intelligence' or 'skill' (see Camus' discussion of heroes in "The Plague" for more on this point). Thus, I find it very awkward when someone else ascribes the virtue of intelligence to me when they work just as hard as I do. I guess it forces me to think about how lucky I am and how there's nothing that makes me deserve the success that I get. So I was embarrassed when a friend saw my midterm and said that I was a better coder than him.

Final Project

The class was based around a group project. We worked on computer vision -- going through a video and classifying the objects in it as one of several types. I worked with the same two friends who were in my group for the assignments.

It was hard. Apparently, sight is complicated. One challenge that we didn't face as much on the assignments was that everything took a long time. That is, in order to test how good our design was, we would have to let the machine train for a long time and then take 10 minutes to let it classify a video, and only then would we find out how well we did. Some of my CS107 skills came in handy for this: at the end, 8 hours only got my neural network 1/3 of the way done training. Thankfully, I was running it inside of GDB, so I stopped it and saved it in the middle of training.

Mostly, what I learned from the project was an appreciation of the difficulties of coordinating with a group to work on a big, experimental project. A lot of the projects that I had done in the past had a definite solution. With computer vision, it's an unsolved problem. Thus, it was just about trying new things and seeing what helped and what didn't. In other words, the projects that I had done in the past were just about programming; in this class, I was actually working on advancing the state of the art of computer science. It wasn't about the code; it was about the writeup at the end where we evaluated our trials and errors.