How to Think Like a Computer Scientist

Peter Norvig noticed that a lot of books purported to teach someone how to program in hours or days.  He responded with a post titled "Teach Yourself Programming in Ten Years."  Since Norvig's logic has me at half a programmer, I'll refrain from telling you how a computer scientist thinks and instead give you some challenges to try out over the next decade.

  • Practice Personally.  The question of how to think "like a computer scientist" implies that there's one right way to think like a computer scientist.  Nope!  There are stereotypes, though, so let me call some of them out, make them explicit, and reject them: being a young, rich, white man with his eyesight and a long beard who lives in Silicon Valley and spends his days in his basement drinking Red Bull will not make you a better programmer.  I'm an okay computer scientist because I think a lot about philosophy and social change (thus my focus on technology for nonprofits and education).  Some people are good computer scientists because they think about music, art, or public policy.  You won't get better by trying to think like someone else.  You'll become a good computer scientist by thinking like yourself and figuring out cool things that no one else has thought about before.  Yes, some people have more obstacles than others, but there are also communities and resources to help you surmount those obstacles.
  • Study Soft Skills.  Computer scientists are people first and coders second.  If you implement systems that will change the world, it is your duty as a human to think about the ethical implications.  If you work with a team (hint: most programming involves teamwork), you need to be good at talking with people and resolving differences.  If you want people to take your ideas seriously, you should probably practice public speaking and writing.
  • Teach these Topics.  Teaching something is a good way to learn it.  Even if you aren't a TA, you have plenty of options!  You could make videos teaching a subject that you just learned (like Sal Khan).  You could give a tech talk.  You could mentor a high school student.  You could talk with your friends.  Options abound!
  • Don't Defer Design.  There's a stereotype that "real" programmers do either systems or algorithms work and that making a clean user interface is something that can be deferred to other people.  No!  You will make better products if you keep the user in mind.  That includes frontend graphic design and knowing how to make clean HTML/CSS/JS, but it also means showing real users prototypes and mockups to get their feedback.  A really fast algorithm isn't much good if it doesn't do what the user wants.
  • Deconstruct Data Structures.  When I led a section for Stanford's intro data structures class, I wouldn't just ask "how does an {array, linked list, map, set, tree, [priority] queue, stack, ...} work?" I would follow up with "and how else could you implement it?"  How would the {big-oh, best case, average case, performance on real hardware} {memory usage, runtime} change for a particular operation if you implemented a particular data structure with a different primitive?  A class like Stanford's CS106B (available for free online) or a book like Cracking the Coding Interview will get you started.
  • Apply Algorithms.  There are a few usual suspects in algorithms, and you need to know all of the major types.  Stanford's algorithms class is online (http://online.stanford.edu/course/algorithms-design-and-analysis-part-1 and http://online.stanford.edu/course/algorithms-design-and-analysis-part-2) and pretty good.  It will introduce you to things like divide and conquer, greedy, graphs, dynamic programming, linear programming, and NP complete problems, and it will also show you the canonical examples for each of them.  After that, you might want to challenge yourself to learn about some new algorithm, implement a new data structure, or discover a new dark corner every week.
  • Control the Command Line.  If you're programming in Windows, you're giving yourself a disadvantage.  Even if you have a Linux box that you SSH into to program, if you do your day to day computing in Windows, then you are volunteering to speak the command line as a second language rather than speaking it natively.  If you need to keep Windows for video games, then dual boot, but your default should be Linux.  Not only should you use Linux, but you should make the command line your home.  Invest some time into your shell and text editor config files.  Learn grep.  I taught a class on Practical Unix that's available online (CS1U - Practical Unix), and the content will get a bit more polished in time.  Also, if you want some dotfiles, you can use mine (https://code.google.com/p/samking-config-files/source/checkout).  I prefer Free Software like Linux to proprietary software like Macs (I also think the Ubuntu user interface is more intuitive in many ways), but a Mac isn't horrible.  Prepare yourself for an uncanny valley if you switch between Mac and Linux, though.
  • Delve Into Dark Corners.  A big part of being a better programmer is confidence.  Learning about some dark corners of C might not make you a much better programmer, but it will probably make you feel like a better programmer.  Plus, what's more entertaining at a cocktail party than hearing about how many ways you can break someone's program with macros in C?  (Answer: most things).
  • Practice Paradigms.  If you aren't familiar with object oriented programming (50+ hours of OOP at a minimum), then fix that.  After that, you should play with some functional programming to get better at recursion and map / reduce / filter.  And, uh, thinking functionally, I guess.  Declarative programming might feel like it isn't programming at all; it will encourage you to think very differently, which is good.  Doing a bit of HDL (hardware description language) stuff in a context like Nand to Tetris (http://www.nand2tetris.org/) might do the trick.
  • Learn Languages.  You should know a fast and reasonably memory efficient OOP language (C++ or Java), a backend web language (Python, PHP, or Ruby), and frontend web languages (JavaScript, HTML, and CSS).  Personally, my favorite is Python (I have spent months programming in Ruby and hated just about every minute).  Norvig's article recommends a few others.
  • Master Math.  Linear algebra is useful for dealing with big matrices (like PageRank) and geometric stuff (like everything in computer graphics).  Probability and statistics are useful for machine learning and for making real systems efficient.  I used some calculus in a proof in an artificial intelligence exam once.  The stuff on Khan Academy for these topics is reasonably good, though it isn't specific to computer science.  The mathematical foundations of computing (formal logic, regular expressions, Turing machines, the halting problem, P/NP, reductions...) are also important.  You can see some archives of Stanford's class on this at http://www.keithschwarz.com/cs103/, but I don't think there's a full online course (with lectures and such) yet.
  • Scale Systems.  Building something cool and building something cool that works at scale are two almost completely different challenges.  To make something scale, you need to be familiar with things like distributed systems.  You might need to get down to bits and bytes and optimizing spatial and temporal locality of memory to reduce cache misses.  You probably will want to be familiar with machine hardware, C, and profiling code.
  • Scale Software.  Software engineering isn't just about making code that runs fast.  It also means thinking about well designed architectures that will make the programming scale -- a code base that is hundreds of thousands of lines long is small in industry, and even that can be a nightmare to deal with if the code isn't well architected.  Version control (and a philosophy behind it like Git Flow: http://nvie.com/git-model/), unit testing everything, and using clean abstractions and encapsulation are all required.
  • Wow with the Web.  Making your own website from the ground up is a good way to learn about a bunch of different things and it gives you a way to show off to your friends.  
    1. Get a domain and a server.  I use http://Gandi.net because they're socially conscious, have good service, and cost the same as everyone else, but plenty of people use Amazon Web Services or Google App Engine.  Full disclosure: Gandi provides free hosting for my nonprofit (http://www.gandi.net/supports) and I work at Google.
    2. Install Apache (or nginx or some other server) and serve some static content.
    3. Get Python, PHP, or Ruby working.
    4. Get MySQL (or some other database engine) working.
    5. Get an MVC framework working (Django, CodeIgniter, or Rails).
    6. Get a CMS up (Drupal, WordPress, or Joomla).
  • Synchronize Selectively.  Computer science as a discipline evolves when we create abstractions that make hard problems easier.  E.g., object oriented programming makes encapsulation easier.  Right now, there isn't a clean solution that makes concurrent programs both efficient and easy to reason about.  Threading and synchronization are hard, so you need to practice and get good at them.  The canonical program to learn threading is implementing a bank.  A fun followup is to implement the lower level synchronization primitives yourself in an operating system context like Pintos (an instructional operating system that was developed at Stanford and is used in Stanford's operating systems class).
  • Attack Adversaries (as a white hat).  There is a lot of insecure code out there because a lot of programmers haven't ever thought about writing secure code.  Stanford's crypto course is free online and very good (http://online.stanford.edu/course/cryptography).  The security class isn't online yet, and I'm not sure exactly when it will get there, but the syllabus is online (https://courseware.stanford.edu/pg/courses/lectures/349991) and some of the homework is, too (http://crypto.stanford.edu/cs155/).  You could also try finding some sample code that is vulnerable to buffer overflows, integer overflows, double frees, etc, and exploiting those vulnerabilities.  And there are probably some resources on XSS, CSRF, SQL injection, and clickjacking.
  • Discover a Domain.  The topics described above are applicable across domains, so figure out how they apply to at least one particular domain!  My undergraduate concentration was biocomputation, and there are a ton of big problems in every one of these areas there.  How can we efficiently implement computation across hundreds or thousands of human genomes (probably using an approximation algorithm for an NP complete problem, distributed over hundreds of machines), each of which is 3 billion base pairs long, in a way that respects privacy (e.g., resistant to many attack vectors) and is intuitive for a physician?  That one project touches on everything I've talked about so far, and it's just one problem!  Every domain (e.g., graphics, NLP, AI...) will have plenty of big issues to sink your teeth into.
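
To make the "implement a bank" exercise from the synchronization item a bit more concrete, here is a minimal sketch in Python.  Everything in it (the Account class, transfer, run_demo, the starting balances) is a made-up illustration rather than code from any particular course.  It assumes one lock per account and always takes the two locks in account-id order, which is one common way to avoid deadlock when two threads transfer in opposite directions at the same time.

```python
import threading


class Account:
    """A hypothetical bank account guarded by its own lock."""

    def __init__(self, account_id, balance):
        self.account_id = account_id
        self.balance = balance
        self.lock = threading.Lock()


def transfer(source, target, amount):
    """Move money between two accounts; return True if the transfer happened."""
    if source is target:
        return False
    # Acquire locks in a fixed global order (by account id) so that two
    # opposite transfers can never each hold one lock and wait on the other.
    first, second = sorted((source, target), key=lambda a: a.account_id)
    with first.lock, second.lock:
        if source.balance < amount:
            return False  # insufficient funds; nothing changes
        source.balance -= amount
        target.balance += amount
        return True


def run_demo(num_threads=8, transfers_per_thread=1000):
    """Hammer two accounts from several threads and return the final balances."""
    alice = Account(1, 1000)
    bob = Account(2, 1000)

    def worker(index):
        # Even-numbered threads send alice -> bob, odd-numbered send bob -> alice.
        src, dst = (alice, bob) if index % 2 == 0 else (bob, alice)
        for _ in range(transfers_per_thread):
            transfer(src, dst, 1)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return alice.balance, bob.balance


if __name__ == "__main__":
    print(run_demo())
```

The thing to check is the invariant, not the exact balances: however the threads interleave, the two balances should still add up to the 2000 the accounts started with, and neither should go negative.  A good next step is to delete the locks, see whether you can make that invariant break, and then think about why.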

You'll get better with time.  There isn't one trick for changing your way of thinking.  There's just elbow grease and musing about how you fit into the big picture.  

As Macklemore says, "The greats weren't great because at birth they could paint; the greats were great because they paint a lot."

Note: this post was inspired by a Quora question, and it's cross posted there.

Experience Date: 
Wednesday, July 31, 2013