Monday, December 17, 2007

Using the internet as a broker between data processing projects and volunteer resources

Consider this situation. An institution has some worthwhile project they want to undertake, but it has huge data-processing requirements, and they just don't have the resources to fulfill them. Maybe they'd like to simulate protein folding, for medical research.

In an increasing number of cases like this, the internet is being used as a broker to enable such projects, by providing a pool of volunteer resources: either people's computers, or their own time for manual data-processing. There are huge pools of resources out there - a whole world of computers in homes and potential volunteers.

So far, this strategy has been quite successful, as reported in this article in the Economist.

Here's a summary of the details:

Automated data-processing

Using the spare processing cycles on people’s home computers (and other devices, like PlayStation 3s).

Folding@home - simulating protein folding and mis-folding -- a cause of diseases such as Alzheimer's.
In September, it reached a combined computing capacity of one petaflop -- a quadrillion mathematical operations per second -- something supercomputer designers have dreamed of for several years.

SETI@home - analysing data for signs of the existence of extra-terrestrial civilisations
The BOINC platform has been developed to support such processing.

Manual data-processing ("distributed thinking")
Galaxy Zoo - volunteers help astronomers to classify the shapes of galaxies from powerful telescope images. "Thanks to the exquisite pattern-recognition capabilities of the human brain, amateurs with just a little training can distinguish between different types of galaxy far more efficiently than computers can."
Had more than 100,000 volunteers classify over 1 million galaxies in a few months.

Stardust@home - volunteers spot the tell-tale tracks left by microscopic interstellar dust grains in tiles of porous aerogel from a probe sent into space.
Enlisted some 24,000 volunteers, who in less than a year performed more than 40 million searches (about 1,700 searches per person).

Herbaria@home - volunteers document plant specimens from images drawn from the dusty 19th-century archives of British collections. Already, some 12,000 specimens have been documented.

Africa@home - volunteers extract useful cartographic information -- the positions of roads, villages, fields and so on -- from satellite images of regions in Africa where maps either do not exist or are hopelessly out of date. This will help regional planning authorities, aid workers and scientists documenting the effects of climate change.

Distributed Proofreaders - volunteers help to proofread OCR'd scans of pages from old books (not mentioned in the Economist article, but in a referring Slashdot article).
The BOSSA platform (Berkeley Open System for Skill Aggregation) has been developed to support such processing.

What if the non-expert volunteers do a poor job at the data processing? This isn't actually a problem, as redundancy is used to ensure quality. For example, a particular image used by Galaxy Zoo is classified by thirty different people, and it turns out that this is enough to get a highly accurate answer.
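To make the redundancy idea concrete, here's a rough sketch of how many independent answers for one image could be combined into a single result by simple majority vote. This is only my illustration: I don't know how Galaxy Zoo actually combines its classifications, and the function name, the labels and the 50% agreement threshold are all made up for the example.

```python
from collections import Counter


def consensus_label(votes, min_agreement=0.5):
    """Combine redundant volunteer answers for one item by majority vote.

    Returns (label, share), where share is the fraction of volunteers who
    chose the most common label. The label is None when the share does not
    exceed min_agreement (a hypothetical threshold, not a real project setting).
    """
    if not votes:
        raise ValueError("need at least one vote")
    label, count = Counter(votes).most_common(1)[0]
    share = count / len(votes)
    return (label if share > min_agreement else None), share


# Thirty made-up classifications of a single galaxy image.
votes = ["spiral"] * 22 + ["elliptical"] * 6 + ["merger"] * 2
label, share = consensus_label(votes)
print(label, round(share, 2))
```

With enough volunteers per image, a few careless or mistaken answers get outvoted, and images where no answer wins clearly could be flagged for an expert to look at.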

Some thoughts

Some ideas not touched on in the Economist article.

First, will there always be a need for such volunteer resources? Or might Moore's Law make computing power so cheap and abundant that even the most processing-hungry project can easily satisfy its own needs?

Second, I wonder if there might be a time in the future when such projects are well known to the public, and people think of them the way they now think of volunteering for a community group or giving money to a charity?

Lastly, the manual data-processing projects might be a good way for school children, or members of the general public, to gain a gentle introduction to the world of science, and learn a bit more about how science is done and how the scientific community works. I'm not saying that volunteering on such projects is an education in these things, but it might still provide a little bit of a feel and familiarity.

Thursday, December 13, 2007

Dharmesh Shah: Why Some Software Is Not Simpler, Just Suckier

Last year, Dharmesh Shah wrote some posts on the topic of simplicity in software, trying to distinguish between good and bad types of simplicity (as I mentioned here).

The other day, he wrote a pretty good followup. Its main point is that

The goal for software developers should not be to make things simpler just by reducing features. The goal of software should be to make it simpler for the user to do what they are trying to do.
and the post elaborates on what this means in practice.

Tuesday, December 04, 2007

A climate-change leadership opportunity


"I believe that this nation should commit itself to achieving the goal, before this decade is out, of landing a man on the moon and returning him safely to the earth."
John F. Kennedy, thirty-fifth President of the United States.

That sort of leadership might be what the world needs for tackling climate change.

The leadership to do what needs to be done, and not make excuses. To set the sort of example that'd make the rest of the world follow.

It'd need a brave leader, but they'd also be taking the opportunity for themselves and their country to be heroes in the eyes of the world.

The more that the rest of the world drags its feet, and the bleaker the future comes to look, the greater the rewards for taking such leadership will become.

BTW, the text of JFK's speech can be found here.

Monday, December 03, 2007

TextMate text editor, and a screencasting idea

The TextMate text editor looks quite good. I haven't used it -- my laptop runs Windows, and it's only on Mac -- but going by these screencasts, it's got some nifty features and overall seems quite impressive. Looks like it might provide a nice way to edit structured data like XML, while still retaining the free-form feel of a text editor -- see this screencast.

If you watch that screencast, you'll notice that it's often difficult to tell exactly what the person is doing to perform the operations they show, because they're using some sort of keystroke combination to invoke them. That made me think that software for recording screencasts could have a feature to add a 'virtual keyboard display' to the video: a little picture of a keyboard that shows which keys are being pressed as things happen.

That is, the user can record their screencast as per usual, and while they are doing so, the recording software keeps track of which keys they pressed and when. Then, it adds a little animated keyboard picture somewhere in the screencast recording, showing when the different keys were pressed.
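Here's a minimal sketch of the bookkeeping side of that idea, assuming the recorder has already logged each key press and release with a timestamp. The event format and the function name are hypothetical, and actually capturing keystrokes and drawing the keyboard are platform-specific jobs that aren't shown. Given the log, working out which keys to highlight in any particular video frame is straightforward:

```python
def keys_held_at(events, t):
    """Return the keys held down at video time t (in seconds).

    events is a list of (timestamp, key, action) tuples sorted by timestamp,
    where action is "down" or "up". This log format is made up for the
    example; a real recorder would have its own.
    """
    held = []
    for timestamp, key, action in events:
        if timestamp > t:
            break  # later events have not happened yet at time t
        if action == "down" and key not in held:
            held.append(key)
        elif action == "up" and key in held:
            held.remove(key)
    return held


# A made-up log of someone pressing Ctrl+Shift+W.
events = [
    (1.0, "ctrl", "down"),
    (1.2, "shift", "down"),
    (1.3, "w", "down"),
    (1.4, "w", "up"),
    (1.6, "shift", "up"),
    (1.7, "ctrl", "up"),
]

print(keys_held_at(events, 1.35))  # keys to highlight in the frame at 1.35s
```

The overlay would call something like this once per frame and light up the returned keys on the keyboard picture. In practice you'd probably want keys to stay lit for a moment after release, so that quick taps are still visible to the viewer.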