
Deception, bias and programming language communities

After listening to Robert Trivers' talk on deception, I began wondering about self-deception, bias and programming languages. By bias I mean favoring arguments that support your view and discounting those that don't.

The bias is worst for things you have already made up your mind about. Say you are choosing a college. Before choosing one, you probably weigh the alternatives quite critically and might even experience some anxiety over the choice. But once you've made up your mind, i.e. chosen to go to a certain college, you tend to ignore everything negative about it and embrace every positive tidbit you hear about it.

This all happens mostly unconsciously. Even if you know about it, it's extremely hard to take it into account. (Let alone if you don't know about it.) You'd have to be consciously aware of this bias all the time and deliberately factor it into your thinking.

Another example would be the so-called we-bias. You belong to a group and then there's everyone else. If a member of your group does something positive, say, gives a donation to a charity, you tend to generalize: "Bob is such a generous guy." Whereas if it's not a member of your group, you tend to be very specific: "Alice gave a small donation." And vice versa for negatives. Member of your group: "Bob bumped into Ned." Not a member of your group: "Alice is so hostile."

Again, this happens mostly unconsciously. You probably have to be a researcher, or at least an outside observer, to tap into this phenomenon. You're very unlikely to notice your own bias. (Unless you specifically look for it; a rare occasion I would assume.) Self-deception is a very strong power. As Trivers noted, it's likely that a selective force was in favor of genuine self-deception, because one is undoubtedly a better liar if one isn't aware of one's lying. And, boy, is lying good for your breeding prospects or what!

All this deception and group bias must be going on in programming language communities, too. And I'd argue that there isn't a simple cure for it. You might think you're immune to such deception, but, then again, you might just be deceiving yourself. You see, you are prone to deception and bias if you have taken sides, so to speak. In other words: you are biased if you have a favorite programming language.

No matter how hard you try to be objective, you probably still aren't. No matter how diverse you think your views are, if you still have a favorite programming language (or two), you are probably biased.

I think the deception and especially the bias are apparent in the way programming language communities are hostile to other, inferior languages. Once you've made your personal decision (which you might of course change later) on a preferred programming language, you tend to become dismissive about the relative weaknesses of your language and to emphasize its strengths. What's more, you probably become extremely critical of other languages.

People can look for good ideas and strengths in other languages, but they do think that "their language" is in some sense the "best language", at least for them. They feel that if there were a genuinely all-around better language out there, they'd switch to it immediately. But I would assume that this is not the case.

There are exceptions to this rule, of course, but if you're now thinking "yeah, that's me", I wouldn't be so sure about it. I am pretty sure I am biased and self-deceptive, especially on this subject!

And frankly, I don't know if this is such a bad thing, after all. Sure, self-deception and bias distort your objectivity and decisions, but it is so hard to always fight against self-deception. And once you've made your decision, you're probably going to stick with it for a while, and knowing your language very well is good for your productivity. Furthermore, you can always explore other languages; it's just that you probably won't make objective and balanced observations of them. Get used to it.



I've always found Google's "related" feature (i.e. finding similar pages) pretty unsatisfying, but it can be kind of flattering, too:

(Not that I would think it's true, but anyhow.)



Reddit and The Wisdom of Crowds

I've been using reddit as a source for finding "what's new online", but it's not always that satisfying. It turns out that I didn't really understand what reddit is all about. I thought that reddit can't quite utilize what James Surowiecki calls the wisdom of crowds, but I just might have it all wrong. (Or, at least, there's a very good possibility for reddit to evolve into a very good collector of the wisdom of crowds.)

The wisdom of crowds that Surowiecki is talking about consists of the rather counter-intuitive idea that a crowd, given the right circumstances, can be smarter than any of its individuals, however brilliant they might be. I won't reiterate the evidence for the wisdom of crowds that Surowiecki has gathered. You'll have to read his book if you're not convinced by me just saying that.

The right circumstances for crowds to accumulate wisdom boil down to four principles. The first is that the crowd has to be decentralized. As Surowiecki puts it: "people are able to specialize and draw on local knowledge." Second, the choices that the individuals make must be independent of each other: "people's opinions are not determined by the opinions of those around them." The opinions must also be diverse, thus: "each person should have some private information." And finally there must be some mechanism for aggregating the information into a collective decision.
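To make the aggregation idea a bit more concrete, here is a minimal sketch in Python. It is my own toy model, not something from Surowiecki's book: the function name `crowd_vs_individuals`, the Gaussian noise and the choice of a plain average as the aggregation mechanism are all assumptions made for illustration.

```python
import random
import statistics


def crowd_vs_individuals(truth, n_people, noise, seed=0):
    """Return (crowd_error, mean_individual_error) for independent guesses.

    Toy model (an assumption, not Surowiecki's): every person guesses the
    true value plus their own private, independent Gaussian noise.
    """
    rng = random.Random(seed)
    # Independence and diversity: each guess uses its own random draw.
    guesses = [truth + rng.gauss(0, noise) for _ in range(n_people)]
    # Aggregation: here simply the average of all guesses.
    crowd_guess = statistics.mean(guesses)
    crowd_error = abs(crowd_guess - truth)
    mean_individual_error = statistics.mean(abs(g - truth) for g in guesses)
    return crowd_error, mean_individual_error


crowd_error, typical_error = crowd_vs_individuals(100.0, 500, 20.0)
print(crowd_error, typical_error)
```

In this model the error of the averaged guess can never exceed the average error of the individual guesses, and with independent noise it is usually far smaller. If the guesses were copied from each other, that advantage would shrink, which is why independence matters so much.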

Applying these principles to the way reddit works, I think decentralization, diversity, and aggregation are somewhat given. Reddit's quite decentralized, users are sufficiently diverse (though the lack of total diversity might be a bit of a problem too), and, well, reddit is an aggregator.

The question becomes, then: are the choices of reddit users independent of each other?

You see, when it comes to the wisdom of crowds, the independence of choices is very important. If choices are not independent, what you get is an "information cascade": individuals imitate the choices of others instead of deciding for themselves.

A simple example of an information cascade: consider two restaurants (call them A and B) near each other and a situation where both are empty. Let's say they are somewhat similar in quality and not too different otherwise. Then a person comes to eat and chooses restaurant A, for whatever reason. And let's say the next person also goes to restaurant A.

Now, when the next person comes and tries to decide which of the restaurants to go to, his decision is likely to be influenced by the fact that restaurant B is empty and that there are a few people in restaurant A. Once this process has gone on for a while, there's a good chance that no one goes to restaurant B, because it's empty. People figure that there must be some reason why others have chosen restaurant A over restaurant B. But those reasons might not have anything to do with the preferences of the person trying to decide which restaurant to go to!
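The restaurant story can be simulated in a few lines of Python. This is only a sketch under my own assumptions: the name `simulate_diners`, the coin-flip "private taste" and the rule "follow the crowd once one restaurant leads by `herd_lead` diners" are made up for illustration.

```python
import random


def simulate_diners(n_diners, herd_lead=2, seed=0):
    """Return the list of choices ('A' or 'B') made by successive diners."""
    rng = random.Random(seed)
    counts = {"A": 0, "B": 0}
    choices = []
    for _ in range(n_diners):
        # Private taste: the restaurants are equally good, so it's a coin flip.
        own_pick = rng.choice("AB")
        lead = counts["A"] - counts["B"]
        if lead >= herd_lead:
            pick = "A"  # A looks popular: ignore own taste, follow the crowd
        elif -lead >= herd_lead:
            pick = "B"  # same thing, but B is the popular one
        else:
            pick = own_pick  # the crowd says nothing yet: use own taste
        counts[pick] += 1
        choices.append(pick)
    return choices


choices = simulate_diners(50)
print("".join(choices))
print("A:", choices.count("A"), "B:", choices.count("B"))
```

As soon as one restaurant gets two diners ahead, every later diner picks it, even though the two are equally good by construction. Which restaurant wins depends only on the first few coin flips.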

Okay, then. Back to reddit.

I have always thought that the way reddit gives "points" based on people's choices is a gross violation of Surowiecki's independence principle. The front page, the "hot" page, shows only links that have at least a moderate amount of points, and some people only follow the front page. Thus, their choices are not completely independent. Furthermore, the articles that get a high "score" are more likely to be checked out by people who are not going through all the links (i.e. most if not all users). Thus, the choice, again, is not completely independent. There's also the psychological (mostly unconscious) effect of following what others do, "herding" you might call it.

Note that I am not saying that reddit absolutely mustn't violate that principle. The violation matters only from the perspective of wisdom of crowds. I think that's important, but you might not.

What I think can cure this is to think of reddit as a personal filter. (Paul Graham himself has been suggesting the idea to the reddit developers. Max Khesin recently realized this too.) The ups and downs ("hot" and "cold") you give to the articles do not assign a general score to the given article; they are a hint to the reddit system that you would like to see more similar stuff [1]. So it is somewhat irrelevant to you as a user which articles get the most points; you're only there to use it for your own needs.

But things could be even better. (Even if I used it like that, most of the other users might be suffering from information cascades, thus affecting what I see, too.)

I'd suggest, like Paul did way before me, that reddit get rid of the visible scoring on the front page and only show how relevant the articles are to you (with some kind of 0-100 point scale or something). If reddit were explicitly and directly a personal filter for every user, I think it would have all four necessary conditions for fully utilizing the wisdom of crowds. Every user would independently use reddit just for their own needs, and as a side effect it would be a better aggregator collectively.
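To show what a personal 0-100 relevance score could look like, here is a minimal Python sketch. It is not how reddit actually works; the functions `similarity` and `relevance`, the +1/-1 vote encoding and the example users are all hypothetical. The idea is simply to weight other users' votes by how often they have agreed with you before.

```python
def similarity(votes_a, votes_b):
    """Agreement between two users on shared articles, in [-1, 1]."""
    shared = set(votes_a) & set(votes_b)
    if not shared:
        return 0.0
    # Votes are +1 (up) or -1 (down), so a product of +1 means agreement.
    return sum(votes_a[x] * votes_b[x] for x in shared) / len(shared)


def relevance(user, article, all_votes):
    """Estimate how relevant `article` is to `user` on a 0-100 scale."""
    weighted_sum = 0.0
    total_weight = 0.0
    for other, votes in all_votes.items():
        if other == user or article not in votes:
            continue
        weight = similarity(all_votes[user], votes)
        weighted_sum += weight * votes[article]
        total_weight += abs(weight)
    if total_weight == 0:
        return 50  # no evidence either way
    # weighted_sum / total_weight lies in [-1, 1]; rescale it to 0-100.
    return round(50 * (1 + weighted_sum / total_weight))


# Hypothetical votes: +1 is an "up", -1 is a "down".
all_votes = {
    "me": {"lisp": 1, "perl": -1},
    "ann": {"lisp": 1, "perl": -1, "scheme": 1},
    "bob": {"lisp": -1, "perl": 1, "scheme": -1, "cobol": 1},
}
print(relevance("me", "scheme", all_votes))  # 100
print(relevance("me", "cobol", all_votes))   # 0
```

Note that a down vote from someone who always disagrees with me counts in favor of the article, so no single global score is ever needed or shown.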


[1] Whether the way reddit finds similar stuff, through cross-linked scoring, is really effective is another question.



Random book from Amazon

I revived my age-old application that gives you (quasi-)random books from Amazon.

I just needed to update the pyamazon library and rewrite the application in CherryPy.

I find the application a bit addicting. For a little while, at least.

(And one day I'll get around to ordering a real domain for myself, too.)



Argh. Google Reader doesn't quite have the technical quality we've come to expect from Google's services. In short, it kinda sucks.

I think I'll just have to find a better RSS/news reader. (I don't want to go back to Bloglines.)

Sigh.




Pychyl's iProcrastinate Podcast

Another solid and informative audio show would be Timothy Pychyl's (or the Procrastination Research Group's) iProcrastinate Podcast. The topic revolves, not too surprisingly, around procrastination. Pychyl himself researches procrastination in academia, so his audio show should prove to be accurate, and hopefully still helpful.