2012.0513 permalink

Selection bias in new technologies

Each new thing that comes out seems to improve the state of technology. OOP "conquered complexity", FP "conquered buggy state", etc. Yet once enough people adopt a new paradigm, we start seeing the same old problems pop up again; today's OOP is just as unmaintainable as yesterday's GOTO-laden spaghetti code.

I think there's a selection bias that makes new technologies look artificially effective. Who adopts the latest unproven technologies? Probably enthusiasts, for the most part. Programmers who are inherently curious about things. And it should come as no surprise that, on average, a curious programmer is probably more effective than a non-curious one.

The real test is whether the new technology holds up after it becomes mainstream.

2012.0508 permalink

The programming language quiz

I've just written an impeccable measuring tool for useless knowledge. The programming language quiz.

2012.0416 permalink

Goto isn't evil

Constructs which are easy to misuse are branded as evil, despite the fact that using them is generally a conscious decision. goto is the canonical example of this. Mutability is another, in FP circles. Neither of these is in itself evil, but they become harmful when misused by incompetent programmers. Similarly, OOP and immutability aren't universal solutions. Either of these can be misused by similarly incompetent programmers to form codebases that are unmaintainable or impractical.

By analogy, cars aren't evil despite the fact that careless misuse of them kills so many people. Nor are explosives, firearms, chainsaws, or any number of other things that end up hurting people. They may be treacherous and dangerous, but such is the world we live in sometimes. (Programmers are no exception; look at Javascript.)

Incompetence is the thing that's evil. Capable programmers make informed, responsible choices about how they implement things.

2012.0411 permalink

The box problem

It can be argued that Java and C++ have roughly comparable performance for CPU-bound tasks. Yet Java is justifiably considered to be a much slower language than C++. What gives?

Most programs aren't CPU-bound. They're IO-bound against memory. An L2 cache miss is often estimated at 100 clock cycles. This is comparable to two IDIV instructions on a fast 64-bit processor (page 28). This delay is so significant that processors can use HyperThreading to create virtual cores just so that they have something to do while waiting for data to load.

In Java, objects are manipulated through references and primitive values are passed by value. However, there's an important exception for primitives: a generic class's type parameter must be a reference type. That is, List<Double> stores references to "boxed" doubles rather than the doubles themselves. Java avoids this representational overhead in certain cases by providing primitive arrays, which store the values directly.

Boxed values are a problem for four reasons:

  1. Accesses to boxed values require an additional pointer dereference, which may miss the cache.
  2. Allocating boxed values takes up a lot of additional space and therefore requires the GC to run more often.
  3. Boxed values are additional entities that must be garbage-collected, so having many of them makes it slower to find the live set every time the GC runs. (Also, the GC will incur its own cache misses traversing them.)
  4. In Java, changing a boxed value often involves allocating a new one and garbage-collecting the old one; changing a primitive value can be done in a single instruction.
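The extra dereference in points 1 through 3 isn't unique to Java. As a rough JavaScript analog (a sketch only; boxed, unboxed, sumBoxed, and sumUnboxed are names made up for illustration), compare an array of wrapper objects with a Float64Array, which stores raw 64-bit doubles back to back:

```javascript
// Illustrative only: wrapper objects stand in for Java's boxed Double.
const N = 1000;

// "Boxed": each element is a separate heap object holding one number,
// so reading it means following a reference first.
const boxed = [];
for (let i = 0; i < N; ++i) boxed.push({value: i * 0.5});

// "Unboxed": a typed array stores the doubles themselves, contiguously.
const unboxed = new Float64Array(N);
for (let i = 0; i < N; ++i) unboxed[i] = i * 0.5;

function sumBoxed(xs) {
  let total = 0;
  for (let i = 0; i < xs.length; ++i) total += xs[i].value;  // reference, then field
  return total;
}

function sumUnboxed(xs) {
  let total = 0;
  for (let i = 0; i < xs.length; ++i) total += xs[i];        // direct element load
  return total;
}
```

Both functions compute the same sum; only the memory layout differs. How much that matters in practice depends on the engine and the size of the data.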

Compilers generally optimize imperative instruction sequences but not data representation. Boxed values are generally an artifact of language or VM limitations, primarily due to the weak type erasure semantics that most likely became mainstream with the introduction of object-oriented programming (they had been present in dynamically-typed languages for some time, however). Until compilers get better at optimizing the way data structures are implemented in memory, there will be a strong case for using low-level languages when performance matters.

2012.0403 permalink

Zero-risk consulting

I'm going to try something new. I have no idea how it will turn out. It occurs to me that there's probably a lot of risk associated with hiring people in various capacities: Full-time employees are a very high risk, descending to contractors from managed places like eLance. But with any of these mechanisms, there's enough overhead that you generally don't have a market for micro-work; that is, stuff that might benefit from a specialist but that only takes a few minutes.

So my idea as it stands is to become a zero-risk consultant. Let me know what you think.

2012.0401 permalink

Why microbenchmarks are misleading

The Great Computer Language Shootout contains a fairly prominent disclaimer that microbenchmarks are not an accurate measure of real-world performance. The first time I read this, I remember thinking: "Why not? What about these applications isn't 'real-world'?"

I think the problem isn't that microbenchmarks are somehow fake; they obviously solve very real problems. The real issue lies in the various processes used to write different kinds of software. For example, suppose you're writing a gzip encoder. There's probably a known-optimal solution with a low instruction count and good cache locality, and which is known to maximize performance across a wide range of architectures. And it's probably worth tuning the algorithm like this (maybe even per architecture) because gzip faces such a horizontal market.

On the other hand, consider something like Eclipse. It is written in one of the fastest languages in the world, yet it is one of the slowest pieces of software I have ever used. Nothing about what Eclipse is doing should take nearly as long as it does, but my guess is that the codebase is large and complex enough that effective optimization isn't realistic. Once code gets to be that large, complexity precludes good optimization. The language's runtime performance is made irrelevant by high-level problems like ineffective multithreading, heavy indirection, and swapping to disk.

In other words, microbenchmarks are useful only when the language runtime is the limiting factor. Most real-world applications are limited by concerns like maintainability, good style, sensible abstractions, short timelines, and developers who are not compiler experts. For these projects, the applicability of the language's paradigm to the problem will contribute more to good performance than the vast majority of low-level runtime optimizations.

2012.0327 permalink

Pure functions don't exist

State is evil. Such is what we are led to believe by functional programming devotees who justifiably value purity in code. And it continues: referential transparency is crucial, embrace pure functions, use immutable data structures, etc.

I don't have much of a problem with the philosophical values that motivate this. However, I think there is a particular brand of idiocy that has sprung up from the dogma. It is delusional to claim that a function has no side-effects, or that pure-functional programming languages as they are implemented today even allow you to write such functions. Here's an example in Haskell:

-- Program 1: A pure function.
-- Load this up in ghci and evaluate 'identity 5'.
identity = id

-- Program 2: A similarly pure function.
-- (Keep this in a separate file, since it also defines 'identity'.)
big_identity 0 = id
big_identity n = id . big_identity (n - 1)
identity = big_identity 100000000000000

The first program will quickly return 5. The second program will use up 100% of your CPU for a few seconds, then start swapping to disk, then die after it exhausts all of the available memory. (I had to reboot because I couldn't interrupt the process due to swapping.) On the face of it, this is patently ridiculous; any human will tell you that it doesn't matter how many times you compose the identity function onto itself, it remains the identity function. It's also not exactly obvious what this code is even doing; the identity function is defined not to do anything at all.

The real fallacy, I think, arises from the fact that time, memory usage, etc. are all side effects, and they can cause just as many bugs as excessive state-sharing in imperative code. At the end of the day, the best way to eliminate undesirable side effects is probably to have a deep knowledge of what your compiler is doing. Abstraction, purity, and correctness don't absolve the programmer from the responsibility of using a real CPU, real memory, and having real performance constraints.

Put differently, languages don't eliminate undesirable side effects. You can easily write pure-functional C if you want to. Programmers eliminate undesirable side effects by making high-level decisions about things that compilers are light-years away from understanding. Purity is a property of design, not implementation.

2012.0323 permalink

Solving the wrong problem?

Source code is represented as text. On the face of it, this is a strange format to use; it requires complex parsing algorithms, makes it easy to write nonsensical statements, and introduces a large layer of indirection when debugging. Yet I am unaware of any remotely palatable non-text programming environment.

This, of course, bugs me. I would love to see text be replaced by something closer to a program's true representation. Yet more than fifty years after the development of the first programming languages, we are still using text exclusively. Why?

My guess is that the proposed solutions don't get at the real problem with using text to represent code. For example, structural solutions that I've seen (and I'm no expert, so I may be overgeneralizing) have worse ergonomics than text, generally take up far more space to represent the same amount of information, and don't provide any particularly compelling advantage other than constraining the programmer to known-valid constructs.

What about a structural representation that did things differently? How about one which let the programmer move the code through invalid states, used less space than pure text, and had superior ergonomics? Given the power of editors like vim and emacs, this may not be possible. And arguably it has already been achieved by IDEs like Eclipse.

Put differently, maybe people like to communicate with computers like we communicate with other people: in writing. Just as we would never diagram our sentences as an alternative to speaking or writing them, maybe it's unrealistic to expect programmers to explicitly state the structure of their code.

2012.0319 permalink

The value of heuristics

Some problems don't have clean, elegant solutions. One example of this is object traversal in Javascript. I was working on a serialization library that crawled through a series of Javascript objects, serializing each one and encoding the object graph as a series of references that could later be reconstructed. In order to do this, you need to be able to reliably mark each object and be able to reverse-map objects to their IDs. And the catch is that Javascript only lets you use strings as object keys.

The hard but correct way to do this is to keep an array of [ID, object] pairs that you could then search each time you encountered an object. But this is time-consuming; O(n^2) overall. Better is to find some way to store the ID directly on each object. This can be done if you choose an extremely improbable name, like extremely_improbable_name_for_serialization_library. But this approach breaks down once you want to serialize the source tree of your serializer (if you ever did want to do such a thing), since the name is present verbatim.

I ended up generating a pseudorandom name using at least 128 bits of entropy. This isn't too hard; Javascript identifier characters (A-Z, a-z, 0-9, $, _) encode 6 bits each, so 22 characters contain 132 bits of information. I chose 128 because for a long time this was considered cryptographically secure; if you could guess the 128-bit key two people were using, you could read everything they were saying to each other. Pseudorandom data isn't cryptographically secure, but it is difficult enough to predict that it seemed like a reasonable choice.
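Here's roughly what that looks like. This is a minimal sketch, not the library's actual code: improbableName, ID_KEY, and assignIds are names invented for the example, and Math.random is just the pseudorandom source assumed here.

```javascript
// 26 + 26 + 10 + 2 = 64 characters, so each one carries 6 bits.
const ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789$_';

// Hypothetical helper: 22 characters * 6 bits = 132 bits of name space.
// Math.random is pseudorandom, not cryptographically secure.
function improbableName(length = 22) {
  let name = '';
  for (let i = 0; i < length; ++i)
    name += ALPHABET.charAt(Math.floor(Math.random() * ALPHABET.length));
  return name;
}

const ID_KEY = improbableName();

// Walk an object graph (cycles included) and store each object's ID on the
// object itself, so the reverse lookup is a property access, not a search.
// Note: defineProperty throws on frozen or non-extensible objects.
function assignIds(root) {
  const objects = [];
  const visit = (o) => {
    if (o === null || typeof o !== 'object') return;
    if (Object.prototype.hasOwnProperty.call(o, ID_KEY)) return;  // already seen
    Object.defineProperty(o, ID_KEY,
      {value: objects.length, enumerable: false, configurable: true});
    objects.push(o);
    for (const k of Object.keys(o)) visit(o[k]);
  };
  visit(root);
  return objects;
}
```

The marker is non-enumerable so that the traversal doesn't trip over its own bookkeeping; that's a design choice made for this sketch.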

The beauty of this solution is that it isn't correct, but it always works. If you can put an upper bound on the degree of pathology of your problem domain, then you don't have to implement something that is universally functional; you just need to handle the most pathological case that comes your way.

2012.0315 permalink

Too little structure

Caterwaul's parser and syntax trees are much looser than the Javascript language specification. I did this mostly out of laziness; it seemed simpler to write an ad-hoc parser and stick to operator-precedence parsing as much as possible. It turns out that my laziness enabled a whole range of very cool stuff to be done. For example, caterwaul can parse expressions like this:

S[for (var L[i] = R[10]; R[i < 10]) S[{_body}]]

This means that you can annotate arbitrary subtrees, including ones in statement mode (!), with markers and then pattern-match against those markers and their contents. Caterwaul has used this for some time; the seq library uses it to descend through subtrees and parse out the sequence operators. And because the caterwaul parser handles all of the cases uniformly, markers can be seamlessly integrated into qs and qse forms.

The interesting thing about this is that I wouldn't have even thought of the marker approach had I been using a highly-structured parser and AST representation, and I would have worked harder to write the parser and AST in the first place. So I've got a new approach: Write APIs with too little structure and fix it later.

2012.0312 permalink

Type theory considered distracting

Assembly language is untyped. By the time you're writing assembly programs, all types have been erased into instruction patterns that happen to reflect the semantics of what you're doing. If you're in a higher-level language, those semantics could even vary depending on runtime variants; the fixity of in-memory representation is, from what I can tell, primarily a historical nod to low-level type erased languages that depend on it being a compile-time invariant. (And compacting garbage collectors loosen this restriction somewhat by making pointers opaque, so we're slowly losing aspects of this codependency even in situations where it seems like it should be difficult.)

C inherits its type system from the constraints of assembly language: Types were not particularly used to guarantee correctness or to prove complex compile-time invariants; rather, they dictated the memory layout in a consistent way. And C has a desirable property because of this. All of its types can be erased at compile-time; there is no runtime type information. This isn't always the fastest way to do things because of the overhead associated with x86 calling conventions and indirect jumps, but the type system arguably leaves off where it makes sense -- before the program is run.

Fast forward to the development of the JVM Hotspot compiler, which includes an impressive array of optimizations including inline caching that deal specifically with compile-time unknowns whose runtime impact is generally mitigated by heuristic observation. There isn't anything wrong with heuristic optimization; generational GC, for instance, makes a tremendous amount of sense. But it's worth thinking a bit about the kinds of optimizations that didn't happen even when they would be relatively trivial to implement.

For example, loop invariant-analyzing compilers do not, to the best of my knowledge, hoist invariants across function boundaries, even when the function is provably monomorphic. This matters in languages like Javascript; for example:

var process = function (data, x) {
  return (data instanceof Array ? f1 : f2)(data, x);
};
for (var i = 0; i < xs.length; ++i)
  process(constant_array, xs[i]);

Here, process and constant_array are loop invariants, but even sophisticated compilers like V8 are unlikely to include an optimization step that inlines the loop body into the single expression f1(constant_array, xs[i]) even after executing it many times. However, if the dispatch were made implicit by using prototype methods, the inline cache would kick in and provide a significant benefit.
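To make the missed optimization concrete, here's the hoisting done by hand. This is only a sketch: f1 and f2 are hypothetical stand-ins, and process is renamed processItem so it doesn't shadow Node's global.

```javascript
// Hypothetical stand-ins for f1 and f2 from the snippet above.
const f1 = (data, x) => data.length + x;
const f2 = (data, x) => x;

const processItem = (data, x) => (data instanceof Array ? f1 : f2)(data, x);

// As written in the post: the instanceof check runs on every iteration.
function unhoisted(constantArray, xs) {
  const out = [];
  for (let i = 0; i < xs.length; ++i) out.push(processItem(constantArray, xs[i]));
  return out;
}

// Hand-hoisted: the dispatch depends only on constantArray, so do it once.
function hoisted(constantArray, xs) {
  const f = constantArray instanceof Array ? f1 : f2;
  const out = [];
  for (let i = 0; i < xs.length; ++i) out.push(f(constantArray, xs[i]));
  return out;
}
```

Both versions return the same results for the same inputs; the second just states the invariant explicitly instead of hoping the compiler finds it.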

2012.0310 permalink

Becoming a dad

Adam Tipping was born two days ago at about 8 in the morning. He's our first, so Joyce and I went from a low-maintenance child-free existence to something totally different. I haven't thought much about technology since Adam was born, so for the next little while I'll post cute baby pics instead.

2012.0307 permalink

Developing without unit tests

About a year ago I removed caterwaul's unit tests. Now it has only a large functional test: can it bootstrap-compile itself? I think the change has been really good for the project, all things considered.

Tests are a time investment. You hope that the effort you spend maintaining tests is less than the amount of effort you would spend debugging/fixing regressions that would have occurred without them. Whether or not this is true depends on a few factors, a significant one of which is how quickly the project requirements change. Caterwaul's API doesn't change very often, but its internals change more often.

Unit tests are also protection against other developers. I deliberately wrote caterwaul to be difficult for other developers to modify so I wouldn't need to worry about this (and because writing software this way is just more fun IMO). I'm convinced that team development has some significant game theory problems, and unit tests mitigate these problems to some extent.

Finally, caterwaul has had some interesting philosophical changes since I removed unit tests. It has gotten simpler because I needed to keep it manageable. If I tried to write some ill-defined, complex library, something would break somewhere and it would take a lot of time to fix. The result is that the code is roughly uniform in its distribution of fragility * modification frequency.

2012.0302 permalink

A Caterwaul bug

Not all bugs are created equal. They range from the stupidly obvious mistakes of late-night coding to the intriguingly subtle nonlocal consequences of reasonable design decisions. In my experience, most bugs tend towards the obvious side of this scale. But sometimes I run into a legendary bit of pathology that is so convoluted or mysterious that it qualifies as a work of art. Yesterday I was fortunate enough to encounter one of these in Caterwaul.

I was a little worried about it because I had seen it happen at least once before and had no idea why. The symptom was that, for some input programs that made heavy use of syntax quotation, waul would die with output similar to the following:

node.js:201
      throw e; // process.nextTick error, or 'error' event on first tick
            ^
TypeError: Object , has no method 'reach'
    at eval at <anonymous> (waul-1.2:140:119)
    at [object Object].each (eval at <anonymous> (waul-1.2:140:119))
    at [object Object].reach (eval at <anonymous> (waul-1.2:140:119))
    at eval at <anonymous> (waul-1.2:140:119)
    at [object Object].each (eval at <anonymous> (waul-1.2:140:119))
    at [object Object].reach (eval at <anonymous> (waul-1.2:140:119))
    at eval at <anonymous> (waul-1.2:140:119)
    at [object Object].each (eval at <anonymous> (waul-1.2:140:119))
    at [object Object].reach (eval at <anonymous> (waul-1.2:140:119))
    at eval at <anonymous> (waul-1.2:140:119)

One of the problems with writing your own programming language is that you also have to write your own debugger. Right now Caterwaul has no debugger, so figuring out why this was happening was a real challenge. Fortunately, I had a hunch: I suspected that I was building a syntax tree with a string child instead of a syntax tree child. Caterwaul lets you use either as a convenience, but the push() method of syntax trees doesn't do that conversion. So I refactored push() to use the same logic as the constructor and figured that would take care of it.

It didn't.

Instead, I saw this in the generated output:

... .call(this, ..., , , , , , , , );

Node was understandably unhappy. Also strange was that my .qse literal expression refs had been replaced by variables called e1, e2, etc (normally they'd be called qse1, qse2, ...). At first I thought I had some macro that was re-expanding something and generating anonymous expression refs, so I poked around. To my surprise, no macro like this existed in the Caterwaul standard library. This meant that the bug was in the Caterwaul core itself.

I copied the failing program and started trimming it down until the problem went away. I found out that constructs like 'g[_x]'.qse /-f/ y would cause an e variable to be created. As soon as I got rid of the /-f/ y, the qse would behave normally. I then tried it out on the REPL and found the same strange behavior:

$ waul -c js_all
waul> '"foo".qse /-f/ bar'.qse.toString()
'f(qse_h_azCSgoP86KF16yaWYRruza)'             <- notice: no 'bar'!
waul> '"foo".qse / z /-f/ bar'.qse.toString()
'f(qse_j_azCSgoP86KF16yaWYRruza,z,bar)'       <- now it's there

At this point it was clear that it was some strange case of the infix function syntax. After some more poking around and random guesses, I found that calling flatten() on expression refs didn't work correctly. It turns out that tree coercion had a subtle bug in this function (it's written on one line in the Caterwaul source):

as: function (d) {
  return this.data === d
    ? this
    : new this.constructor(d).push(this);
}

This appeared to be doing the right thing: Either return the node or wrap it in a new d node and return that. It would be appropriately generic, too: new this.constructor would be closed over the node type.

And that's when everything made sense. Expression refs aren't instantiated like regular nodes. Most syntax trees take the "data", or their name, as the first argument. So if you said new caterwaul.syntax('foo'), you'd have a syntax tree representing the identifier foo. But the expression ref and closure ref constructors are front-loaded to take the value before the name. The name is entirely optional; normally Caterwaul will just use a gensym.

The result of this was that when flatten was called on an expression ref, the expression ref was parenting itself with another expression ref, not a regular syntax node. Expression refs don't have children, so this caused the original expression ref to be hidden and replaced by a nullary comma node. Expression refs serialize to their data, so the nullary comma would be serialized as just ',', hence the syntactically incorrect output.

Long story short, it turns out that as is an anomaly among tree transform methods in that it shouldn't respect type closure. The fix was to write it this way:

as: function (d) {
  return this.data === d
    ? this
    : new caterwaul_global.syntax(d).push(this);
}
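The mechanism is easy to reproduce in miniature. What follows is a toy model, not Caterwaul's real classes: SyntaxNode, ExpressionRef, gensym, buggyAs, and fixedAs are all invented for illustration, and the only assumption carried over from above is that refs take their value before their name.

```javascript
// Toy model only; these classes are NOT Caterwaul's real implementation.
let gensymCounter = 0;
const gensym = () => 'gensym_' + (++gensymCounter);

class SyntaxNode {
  constructor(data) { this.data = data; this.children = []; }
  push(child) { this.children.push(child); return this; }

  // Buggy: assumes every subclass constructor takes the data first.
  buggyAs(d) {
    return this.data === d ? this : new this.constructor(d).push(this);
  }

  // Fixed: always wrap in a plain syntax node, ignoring the subclass.
  fixedAs(d) {
    return this.data === d ? this : new SyntaxNode(d).push(this);
  }
}

// Refs take the referenced value first; the name is optional.
class ExpressionRef extends SyntaxNode {
  constructor(value, name) {
    super(name === undefined ? gensym() : name);
    this.value = value;
  }
}

const ref = new ExpressionRef({some: 'value'}, 'qse1');
const wrong = ref.buggyAs(',');  // another ExpressionRef whose *value* is ','
const right = ref.fixedAs(',');  // a plain ',' node with ref as its child
```

In the toy version, wrong ends up as a second ref with a generated name and ',' as its value, while right is the comma node that was actually requested.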

If only all bugs could be this cool.

2012.0229 permalink

Performance as a side effect

Someone on Twitter posed a good question recently: "What is state?" (that is, statefulness vs statelessness in code). I thought about that for a while. It's a difficult thing to pin down because you could argue that any stored data is really state. My answer was that state is an implicit dependency on time.
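A tiny illustration of what I mean (add, makeCounter, and next are just example names):

```javascript
// Stateless: the result depends only on the arguments.
const add = (a, b) => a + b;

// Stateful: the result also depends on how many calls came before,
// which is to say on *when* you call it.
function makeCounter() {
  let count = 0;
  return () => ++count;
}

const next = makeCounter();
const first = next();   // 1
const second = next();  // 2: same call, later time, different answer
```

Nothing about the call next() tells you which answer you'll get; you have to know where you are in time.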

If it's true, this answer has some strange ramifications. For example, performance is another variable that involves time. And I would argue that something with unpredictable performance is, in production scenarios, as dangerous as something that randomly emits side-effects. In aggregate, it wouldn't surprise me if Java and C++ were actually the two languages that offer the highest level of abstraction for a generally low total time dependency.

2012.0226 permalink

Indirect jumps in GNU assembler

I'm not great at assembly-level programming, but I've been getting into it recently to implement a new programming language that I'll probably never finish. While working on it, I decided to use some continuation-passing style patterns to minimize the amount of stack manipulation that would need to happen. So I started out with some code like this:

movq continuation, %r12
jmp f
...
f:
...
jmp *%r12
...
continuation:

If you're familiar with AT&T assembler syntax, you probably won't be surprised to hear that this segfaulted and gdb showed %r12 as having a value way outside of the program's code space. After a lot of reading online, it turns out that you need to do it this way:

movq $continuation, %r12

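Linguistically, this makes sense: you're jmping to the code itself (which is similar to a dereference), but you're talking about the code's address when you put the value into a register. Without the $, the assembler treats continuation as a memory operand, so movq loads the eight bytes stored at the label (the instructions themselves) rather than the label's address.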

2012.0222 permalink

The wrong tool

As a programmer who likes functional paradigms, I have a hard time accepting the fact that Java is so popular. But it is, and so much so that it's noteworthy. Great software is built in Java, C++, C, and other very non-functional languages with tons of mutable state and edge-triggering. (For example, my OS, window manager, editor, terminal emulator, and browser are all written in one of these languages.)

I think there are some interesting dynamics behind Java's popularity and empirical success (and Haskell's empirical non-success, despite being significantly more academically meritorious). What is it that Java has that Haskell doesn't? My best guess is that Haskell gives you the ability to create tools easily; this is the default way of developing. So in order to make progress, you are continuously defining abstractions. Put differently, you have the freedom, and the burden, of choosing your tools.

Java is simple. It gives you one mediocre abstraction that leads to slow, complex, bug-ridden software. Its model of objects is, in my opinion, the wrong tool for any job. However, and this is the genius of it, you don't have to think about your toolset. You just write stuff using the wrong tool, and slowly but surely you make progress without solving unnecessary problems.

Another way to look at it is that this is a reflection of the software process as a whole: Most failures occur in the requirements phase, not design or implementation. By making programmers think about their own requirements during the development phase, languages like Haskell create another point of failure. What interests me particularly is that this point of failure is evidently so large that the average corporate programmer (even at Google!) is better off using the wrong tools than they are taking a shot at inventing their own. Maybe the true measure of a programmer's usefulness is not their ability to write software, but their ability to know what needs to be written.

2012.0222 permalink

Perfection is irrecoverable

Haskell emulates something close to perfection on an imperfect platform. Rather than embracing the platform's idiosyncrasies and imperfections, it attempts to create a world where they don't exist. C, on the other hand, doesn't hide the fact that the platform is imperfect. It simply refines the world into a less imperfect one than you would face writing normal assembly code.

In general, I don't think it's possible to cover up significant amounts of imperfection. It's possible to use the Church encoding for numbers, for instance, but we don't do it because the cost of introducing this abstraction is far too high. I would argue, too, that it's unnecessary. Purity is beautiful, but we're better off asymptotically approaching it than we are living in a world where it's the only option.
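For a sense of that cost, here's a small sketch of Church numerals in Javascript. The helper names (churchAdd, fromNumber, toNumber, calls) are made up for the example; the encoding itself is the standard one.

```javascript
// A Church numeral n is a function that applies f to x exactly n times.
const zero = (f) => (x) => x;
const succ = (n) => (f) => (x) => f(n(f)(x));
const churchAdd = (m) => (n) => (f) => (x) => m(f)(n(f)(x));

// Build the numeral for k by applying succ k times.
const fromNumber = (k) => {
  let n = zero;
  for (let i = 0; i < k; ++i) n = succ(n);
  return n;
};

// Convert back by counting, and record how many times the increment runs.
let calls = 0;
const toNumber = (n) => n((k) => { ++calls; return k + 1; })(0);

const five = churchAdd(fromNumber(2))(fromNumber(3));
const result = toNumber(five);  // 5, reached through 5 calls to the increment
```

Reading the number back takes one function call per unit of its value, where the machine's native addition is a single instruction.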

2012.0217 permalink

Syntax as a constraint

Someone at some point said something like "constraints breed creativity." It's an interesting thing to say considering that constraints also limit your options. But sometimes the burden of choice is much more significant than the flexibility it provides.

I'm running into this with Mulholland and its very complex operator precedence model. The language doesn't present enough of an opinion to guide library design, and as a result I'm not sure what to do with it. Caterwaul didn't have this problem because working inside Javascript's syntax was very limiting; there was rarely more than one way to do something. And it's one of the most useful (to me) things I've written. The necessary compromises haven't held it back much at all.

Put differently, syntax should be friendly both to library designers and to library users. Users, obviously, should be able to write what they mean and have that translated into something useful. But more subtly, designers should have a set of constructive constraints so they aren't making decisions in a vacuum.

2012.0215 permalink

Software reliability

When I was working for startups, it seemed like something was always burning down. The database would spontaneously become read-only, the middleware would spontaneously fail, or something similarly catastrophic, and everyone would be in red-alert mode to try to fix the problem. It isn't hard to see why, either: we used brand-new technologies that weren't mature.

Big companies don't seem to do it this way, and it isn't hard to see why. They can't afford the risk. (Using immature technologies is a huge risk, by the way; when we used Tokyo Tyrant, for example, it would have started silently losing our data after we put more than 40GB into it.) Startups are lured into making risky choices by the promise of more rapid software development.

At the same time, though, I wonder how many startups fail because they jumped the gun and tried to be more technologically savvy (in the use-what's-cool way) than their competitors, only to end up with a technology Jenga tower that collapsed at some crucial moment (or worse, a development team whose performance was crippled by fighting fires). The ideal risk isn't blind, it's calculated.

2012.0209 permalink

Not invented here

I have a terrible case of NIH. This means I reimplement stuff that other people have written because I think I can do a better job (or based on other shaky reasoning). It does end somewhere; I haven't yet written an operating system, browser, terminal emulator, machine-code compiler, or even a replacement for jQuery. But I have written a couple of programming languages, data structure implementations, serialization formats, and other stuff that I could have gotten for free with open-source software.

Strangely enough, I'm not sure I want to outgrow this bad habit just yet. I've learned a lot by reinventing the wheel. It's also been productive in some cases; when I find a problem in my stack I can fix it quickly and move on.

2012.0207 permalink

Occam's Razor

In real life, true facts are often both counterintuitive and simplifying. Finding out that the earth is round is a significant mental leap considering that it looks flat from every angle. But it simplifies our perception of deeper questions like gravity, and it explains how airplanes can keep going west and end up where they started.

The same is true of great platforms and frameworks. Rather than obscuring things or creating noise, they use a less-than-obvious presentation of something that ends up making your world a simpler place.

2012.0205 permalink

Great abstractions

jQuery didn't take long to become the dominant Javascript library, and it's easy to see why. For someone who already knows DOM programming, it's trivial to learn and provides great ways to get stuff done with less work. jQuery also managed to achieve a kind of minimalism; you can't remove much without changing something fundamental. And for many purposes, it has replaced the DOM.

Another platform like this is the C programming language. It allows you to write assembly-level code far faster than you could in assembly, and it conveniently obscures the details of architecture/platform-specific instructions and ABIs. The vast majority of C programmers feel no need to look underneath the hood of GCC, the linker, the calling conventions, the standard C library, or any number of other complex pieces of C's infrastructure.

Each of these abstractions has an interesting property: people who use them rarely find the need to work around them. Put differently, these abstractions don't leak. Most abstractions, of course, aren't like this. For instance, X11 implementations bypass the network protocol for local connections, rendering through shared memory to provide hardware acceleration. All of the rendering could be done over the network protocol, but it would be dog-slow and nobody would use it.

Good abstractions vanish

CPU-bound programs written in Perl or Ruby usually take much longer to run than the same programs written in C. As such, there's a case to be made against, for example, writing ray-tracers or programming language interpreters in Perl. C, on the other hand, is not appreciably slower than assembly language for the vast majority of tasks. For most purposes, you'd never know the result wasn't written in assembly to begin with, except that C is so cool that nobody would bother writing the assembly by hand. (Actually, this isn't true. C imposes a lot of structure on the resulting assembly that makes it easy to detect. But none of this structure impedes the program's functionality or performance very much.)

Like C, jQuery doesn't add much overhead. Last time I measured it, the cost of creating a Javascript object was 1% of the cost of creating an HTML element. Although jQuery isn't erased per se, it is such a thin layer, and it's so aggressively optimized, that jQuery itself is rarely the cause of performance problems.

Here's where things get slightly counterintuitive: both jQuery and C will generally result in a net performance increase, even though each adds some overhead compared to the perfect hand-coded solution. Not only do the abstractions vanish, but they result in a faster end product. And we use these abstractions because low performance is one of the most harmful side-effects that an application can have.

Put differently, asking the question "is using this abstraction making my application slow?" is just as problematic as asking "is my app failing because this abstraction has a bug?"

Good abstractions are culturally accessible

jQuery uses CSS syntax to select things, and while it did invent (as far as I know) its own accessor/mutator convention, this was so unsurprising and easy to use that things like Java's getter/setter pattern looked clunky and outdated by comparison. C allowed programmers to keep doing the kinds of cool stuff you could do in assembly (pointer typecasting, computed jumps), but made it easier and less error-prone to write programs with consistent structure.

Also, and just as importantly, jQuery didn't present some grand unified replacement for the DOM. Instead, it leveraged one of the DOM's core properties (the structured node hierarchy) and made it accessible in a reliable, cross-browser way. C didn't try to manage memory, create a new paradigm, or ignore the fact that you're ultimately writing machine language. Instead, it embraced these things and made them work consistently across every major platform (at least as far as the end-programmer is concerned).

The last, and perhaps most important, component of accessibility is ergonomics. C is much terser than assembly language for common use cases, just as jQuery is often much terser than direct DOM programming.

C and jQuery aren't the only abstractions that have great ergonomics, of course. The UNIX filesystem and shell are also so useful that there isn't a decisively better alternative. Ditto for text editors as IDE components, despite how non-textual most programs are. Whatever design flaws these systems have, their ease of use is so compelling that we don't want to switch away.

Great abstractions withstand adaptation

Adaptation comes in two forms. One is when the abstraction author changes something, often breaking code that uses the abstraction. This happened with Ruby; 1.9.x isn't backwards-compatible with 1.8.x, yielding workarounds like RVM. Perl and Javascript are examples of the opposite; they have preserved horrible design flaws throughout their evolution so that old programs would keep working without modification.

The other, and more interesting, form of adaptation is when people start misusing something. A great example of this is the C++ template system, which was probably in no way designed with the idea that people would be using it for general-purpose metaprogramming. The fact that this misuse is so reliable that it has become commonplace is a significant compliment to C++ templates. They have held up to unforeseen use where a lesser system would have broken down. The Web is perhaps the greatest example of misuse; what started as a simple linked document management system has ended up nearly replacing desktop applications.
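To make the template case concrete, here's a minimal sketch of the kind of misuse I mean, assuming a C++11 compiler. The name Factorial is just something I picked for illustration; it isn't from any library. The compiler computes the result by recursively instantiating the template, so nothing is left to calculate at runtime.

```cpp
// Compile-time factorial via recursive template instantiation.
// "Factorial" is a hypothetical name chosen for this sketch.
template <unsigned N>
struct Factorial {
    // Instantiating Factorial<N> forces the compiler to instantiate
    // Factorial<N - 1>, and so on down to the specialization below.
    static constexpr unsigned long long value = N * Factorial<N - 1>::value;
};

// Explicit specialization for 0 ends the recursion.
template <>
struct Factorial<0> {
    static constexpr unsigned long long value = 1;
};

// These are checked by the compiler; if either were false,
// the program would fail to compile.
static_assert(Factorial<0>::value == 1, "0! should be 1");
static_assert(Factorial<5>::value == 120, "5! should be 120");
```

Templates are being used here as a small functional language: the specialization plays the role of a base case, and instantiation plays the role of a function call.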