Wednesday, October 25, 2006

Introductory CS Sequences

At most universities, the first-year CS courses tend to cover programming. These courses differ from first-year sequences in other departments in several important ways.

In most disciplines, the topics of the introductory sequence have not significantly changed since I went to college, or even since my father went to school. In CS, our introductory courses seem to change every few years. I don't think anyone currently teaches PL/C (or any other PL/1 variant), which I learned in CS 101 at Cornell. Computer Science didn't even exist as a field when my father went to college in the 50's.

In most disciplines any professor in the department deeply knows the material taught in the introductory sequence. Any professor could teach the intro sequence, or if they can't it's not because they don't know the material. This is certainly not true in computer science.

A professor at a state university noted that their CS majors had internships after their first year and commented "CS is the only discipline where we have to make the students employable after the first year."

In non-CS scientific disciplines, different universities generally teach the same material to their first-year students. Different physics departments teach with different books, at different levels, and maybe cover material in different orders, but there is general agreement on which basic concepts of physics first-year students should know.

Go to any computer science conference and you'll hear discussion about what programming language gets taught at their schools. Nearly every department has people disagreeing about which programming language to teach to the first years. I tend to stay out of these fights for it is a lose-lose proposition. Win the argument and you'll end up teaching the class.


  1. You seem to skirt around what I find very disappointing about the whole structure. First, students are taught how to hack out crufty code; then they get taught the high-level concepts. IMO, it would be much better if the students somehow learned how to think first, then actually did major programming. Now, I don't know exactly how one would structure such a thing, however.

  2. What I see coming from CS majors is a lot of incredibly bureaucratic code that I can't enjoy reading. In many cases, the programmers go overboard with what they consider good or even mandatory style. This may be an improvement over old-fashioned "spaghetti code", but not much of one. Some of it can be called "politician code".

    If I taught intro programming, I would like to see each student write a 30-line program in 30 lines of clear, pleasant code. I do not want to see 800 lines of de rigueur code. I don't care so much about the language. Personally I like Python a lot, but there are other perfectly livable choices.

    That said, I think that it's very good that CS departments first teach students how to program. Without that, the entire major could turn fake. Programming could be replaced by courses in running software and philosophizing about the Internet -- and I have heard that at some colleges, they have been. Although this may not be a problem at Chicago, I would have the same concern about intro programming as about calculus. Namely, that the syllabus might evolve so that a memorizing student can get by with only pretending to learn how to program.

    The interesting half-exception to my hard line is CS students who are seriously on the theory track. You can be a very good CS theorist, really a kind of mathematician, even if you don't enjoy programming. In my view, it's like writing a great romance novel without ever having had sex. Indeed it has been done, but it's not how I like to live.

  3. One department wants to teach C# to the first years. I think it is a bad idea because it is a proprietary language. What do other people think?

  4. To anonymous number 3: You actually can program in C# using Free Software:

    On the other hand, C# might still be too narrow with regard to target audience. Truthfully, I am not sure whether in the first year one should go for what is used in practice (i.e. C) or for what is a good learning experience (don't laugh, PASCAL, perhaps also Python although I am not well acquainted with that). Being a theoretician I might go with the latter option :-)

  5. "I think it is a bad idea because it is a proprietary language. What do other people think?"

    So was Java until quite recently and no one complained. This suggests that you just may be trying to rationalize your (perfectly valid) anti-Microsoft sentiments in the wrong way.

  6. I actually take issue with the fact that introductory CS classes are programming classes, because I think that this has led people to equate computer science with programming. This is especially true given that computer science is generally not taught in high schools. In first (high school) courses in other sciences, students are introduced to the most important big ideas in the field in their first year. In an introductory biology class, students are exposed to the theory of evolution, the basics of genetics and how DNA carries and replicates the genetic code, etc. These ideas are exciting, and they give a good sense of what the important and motivating ideas in the field are. In contrast, intro CS courses tend to teach Java programming, and prepare students for web-design internships. I think the first computer science course that a student takes should hardly involve programming at all, except perhaps as a teaching aid. Introduce students to recursion, randomization, dynamic programming, maybe a bit of undecidability and complexity -- the key insights of computer science that make it exciting. Then, once students understand -why- they should study computer science, you can teach them how to program so that they can use what they have learned and become employable.

  7. One more thought --

    Part of the problem is that the term "computer science" covers so many very different things. What an introductory course should do is provide some common ground from which all other subfields can be thought of as sprouting.

    I think a good intro course might sell computer science as physics. The thesis: Information is a fundamental quantity in the universe, just like mass and energy, and computation is the force that acts upon it. I think everything really follows from this.

  8. I've taken a few programming courses in high school and college, and so far none of them teach it right. I had a huge advantage over my classmates because I knew how to use a debugger. That should be one of the first things taught in a programming class. Why isn't it?

    And why are so few schools offering the Software Engineering degree? Most CS students are going on to become software engineers, not computer scientists.

  9. Introductory biology and chemistry classes also have labs! When I did my undergraduate work, they tried to teach data structures and algorithms in intro courses. These courses taught things like asymptotic notation, Dijkstra's algorithm, B-trees, etc.

    We did lab work in Java for the first year and C++ for the second, but these courses were not _about_ Java or C++. If you were confused, you could ask questions in recitation. And people would stop the professor in class when he did something nifty in emacs or Eclipse or the debugger or whatever. By the end of the courses, you knew C++ and Java.

    One lab I remember studied the complexity of insertions, deletions, and queries to sorted and recency lists vs. balanced and unbalanced binary trees. (I think.) We figured out what the complexity should be and then tested our hypotheses empirically using various input sizes. (using the Java interface to the system clock)

    So if the question is theory or application, I think the answer is yes.
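A lab like the one described in the comment above could be sketched roughly as follows. This is a minimal Python sketch, not the original Java assignment: the names `UnbalancedBST`, `time_inserts`, and `run_experiment` are hypothetical, only insertions into a sorted list and an unbalanced binary tree are covered, and `time.perf_counter` stands in for the Java system-clock call.

```python
import bisect
import random
import time


class Node:
    __slots__ = ("key", "left", "right")

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


class UnbalancedBST:
    """Plain binary search tree with no rebalancing (duplicates go right)."""

    def __init__(self):
        self.root = None

    def insert(self, key):
        # Iterative insert, so a degenerate (list-like) tree cannot hit
        # Python's recursion limit.
        if self.root is None:
            self.root = Node(key)
            return
        node = self.root
        while True:
            if key < node.key:
                if node.left is None:
                    node.left = Node(key)
                    return
                node = node.left
            else:
                if node.right is None:
                    node.right = Node(key)
                    return
                node = node.right

    def height(self):
        # Level-order walk; returns the number of levels (0 if empty).
        levels = 0
        frontier = [] if self.root is None else [self.root]
        while frontier:
            levels += 1
            frontier = [child
                        for node in frontier
                        for child in (node.left, node.right)
                        if child is not None]
        return levels


def time_inserts(insert, keys):
    """Return wall-clock seconds taken to insert every key, in order."""
    start = time.perf_counter()
    for key in keys:
        insert(key)
    return time.perf_counter() - start


def run_experiment(sizes=(1_000, 2_000, 4_000), seed=0):
    """For each input size, time inserts into a sorted list and a BST."""
    rng = random.Random(seed)
    rows = []
    for n in sizes:
        keys = [rng.random() for _ in range(n)]
        sorted_list = []
        tree = UnbalancedBST()
        # bisect.insort finds the slot in O(log n) but shifts O(n) elements.
        t_list = time_inserts(lambda k: bisect.insort(sorted_list, k), keys)
        t_tree = time_inserts(tree.insert, keys)
        rows.append((n, t_list, t_tree, tree.height()))
    return rows


if __name__ == "__main__":
    for n, t_list, t_tree, height in run_experiment():
        print(f"n={n:>6}  sorted list: {t_list:.4f}s  "
              f"BST: {t_tree:.4f}s  tree height: {height}")
```

The hypothesis to check is how each column grows as the input size doubles. Constant factors can dominate at small sizes, so measured times will not necessarily match the asymptotic prediction; feeding already-sorted keys to the tree (height equal to the number of keys) shows the unbalanced worst case.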

  10. By the way, Java runs on many more platforms than .NET does, and the open implementation of it is much more complete. (Or it was last time I checked.) This could be an issue when doing labs--it's easy to ssh in and do labs from home if you're using Java on a Unix/GNU system.

    But they're both similar and you can go back and forth easily.

  11. I personally don't see the problem in equating CS with programming, since CS is programming. Dijkstra described himself as a programmer.

  12. To ppmitra: You are incorrectly generalizing. Because one member of set A is also in set B doesn't mean A = B (here A is "computer scientists" and B is "programmers").

  13. I was making a point about the character of CS. I don't understand why we need to disassociate ourselves from what is the foundation of our field.

  14. "I don't understand why we need to disassociate ourselves from what is the foundation of our field."

    We don't. Indeed, programming does NOT constitute in any terms the foundation of the science of computing. Hence, programming courses should be taught only as optional 2nd/3rd-year courses.

    This is of course if we aim to be scientists.
    I can't see how programming has anything to do with science.
    Physics or biology labs, for instance, have a crucial role in their scientific fields, as they are empirical fields. CS is not an empirical scientific field.

    Therefore, CS has two opposed directions: either becoming a part of engineering (hence, NOT a science); or forming itself as a scientific branch related to Mathematics (hence, NOT having programming as its foundation).
    The best option (scientifically speaking) is probably to split the field into two diverse fields.

  15. "The best option (scientifically speaking) is probably to split the field into two diverse fields."

    This is the worst option, scientifically speaking. Divorcing the science from its applications is a sure path to irrelevance, both scientific and in practice.

  16. In my opinion teaching CS without teaching programming is like teaching algebra without teaching arithmetic.

  17. A few random thoughts:

    1. Differences in programming languages don't make intro courses significantly distinct. You can teach essentially the same material in C as in Pascal (or C++ vs. Java, etc).

    2. Programming is a basic skill needed in many advanced CS courses that require "lab" work, but unlike similar basic skills in mathematics (arithmetic, trigonometry, ...), we still cannot rely on all high school graduates to know how to program.

    3. It's also hard to discuss computational models with people that have no intuition or experience with what physical computing devices can and can't do. In that respect, teaching first an assembly language would have made some sense.

    4. I've never heard of an intro to math, intro to physics, intro to chemistry, intro to biology, intro to EE course. Why do we call our introductory programming course intro to CS?

    5. Most CS departments I know also have some introductory math for CS course, covering some aspects of discrete math. The variations are far greater than those of the intro to CS course.

  18. I looked up what the Computing Sciences Accreditation Board says, and it's remarkably vague.

    "... a core of at least 16 semester hours of algorithms, data structures, software design, concepts of programming languages, and computer organization and architecture." and "Students must be proficient in at least one high-level programming language and exposed to a variety of languages."

    So there we have a suggestion for the core courses:
    1. programming
    2. algorithms and data structures
    3. software design
    4. programming languages
    5. computer architecture.

    Programming languages and computer architecture seem to be more advanced courses to me, so that leaves 1, 2, and 3 for the introductory sequence. One could imagine separating 2 into two pieces, or combining 1 and 3 (or 1 and 2). But aside from MIT, where are the introductory courses chosen from something other than 1, 2, and 3?

    Peter Shor

  19. I take serious issue with Anon14. As an undergrad, I had extensive discussions with friends about the "computer scientist" versus "programmer" monikers (where I was on the "there's a big difference" side), but I still think that this is not such an easy issue to dismiss.

    First, many aspects of CS are empirical fields. An easy example is machine learning: the no free lunch theorems tell us that no learning algorithm will be universally good. This means that one almost always needs empirical validation to see if the algorithms we have are good on the range of problems that actually exist in the real world. I'm sure there are other examples. Second, it's somewhat unclear if math and computer science (the "science" part that has nothing to do with programming) are even sciences, at least according to the stance of studying natural phenomena in controlled ways (I think Feynman made this argument a while back).

  20. "Second, it's somewhat unclear if math and computer science (the "science" part that has nothing to do with programming) are even sciences, at least according to the stance of studying natural phenomena in controlled ways"

    This shows great ignorance of (i) what a science is and (ii) the fact that math is not an arbitrary construct, but rather well founded in nature. In fact certain novel theorems are "proven" by physical arguments well before mathematicians can develop a proof that does not depend on, say, the principle of conservation of energy (Witten has quite a few of these).

  21. More comments:

    1. "programming", as far as I understand it, includes formalizing the problem, and designing algorithms and data structures (or understanding which existing algorithms and data structures to use). I don't think anyone can say that this is not an essential part of computer science.

    2. If an intro course focuses on these essential issues (modeling a problem, algorithms and data structures), rather than details of a particular language or execution platform, then it does not really matter which programming language is used. (Although some languages and environments may make it easier to avoid getting bogged down in details.)

    Not coincidentally, these essential issues are the ones that every professor knows (even if not all of them are masters of, say, designing GUI programs in C# on the Windows operating system).

    3. Especially since today's students are often masters of "programming" their cell phones, iPods, web pages, etc., the course should focus on abstract thinking rather than details that they can pick up themselves by reading a user manual.

    Boaz Barak