Although you won’t believe me for a few paragraphs (*cough* tangent *cough*), this article is about something called the Amazon Mechanical Turk service. But now that I have promised to walk off the path, I wouldn’t want to disappoint – so let’s begin with some history.
About 280 years ago a fellow named Wolfgang von Kempelen was born in the Kingdom of Hungary. He became an accomplished linguist, architect, artist, poet and mechanical engineer. He designed and patented various inventions (even including a “speaking machine”) but his most famous invention by far was the “Mechanical Turk”.
Von Kempelen completed construction of the device in 1770, when he was 36 years old. It was a spring-driven automaton, shaped like a desk with an attached mechanical torso. The machine’s function was to play chess, and it defeated a long line of challengers, including Napoleon Bonaparte and Benjamin Franklin. It was also a brilliant hoax: “The Turk” was elaborately designed to conceal a human operator who controlled the game. The secret went unexposed for more than 50 years as the machine and its increasingly wealthy owner toured Europe and North America (even reaching Richmond, VA, where a young Edgar Allan Poe saw it and wrote a long essay exposing the fraud). At the end of the day the machine was only moving the chess pieces - the human inside was doing the “thinking”.
So when Amazon launched its crowdsourcing service in late 2005, it chose the name “Mechanical Turk” (a.k.a. MTurk) because, like von Kempelen’s machine, it uses humans to do what machines cannot. When we talk about human–computer interaction (HCI) we usually think of the computer as the “tool”. But in the world of Amazon’s Mechanical Turk, the human is the tool being used by the computer application. Critical responses to this paradigm have ranged from “virtual sweatshop” to “awesome” to “awesome – a virtual sweatshop!” Salon ran an early story when the marketplace launched that provides a solid overview. My personal experience is that most workers are using free time to make extra money. There is a freely available academic paper on MTurk worker demographics.
The service has two types of customers: “requestors” and “workers”. Requestors are people or programs that want to pay for work; workers are people who are willing to work for pay. Requestors create HITs (Human Intelligence Tasks) by using either the web interface or developer tools (API and command-line tools) and specifying a title, description, required completion time, reward, and results type. Requestors can optionally require that workers “qualify” by completing boilerplate or customized certification tasks before becoming eligible to work on HITs – this is useful for highly skilled tasks and to filter out weak performers. For example, a qualification task could require workers to correctly identify differences in pictures or demonstrate proficiency in written English.
Once created, HITs are published to the Amazon marketplace, where “workers” can browse and accept them, perform the work, and submit their results. The requestor then reviews the results, approving or rejecting each completed HIT. Requestors only pay for approved submissions, and Amazon takes an additional 10% commission on completed transactions.
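For requestors going the programmatic route, the same properties described above are supplied as request parameters. Here is a minimal sketch in Python – the helper function and its field names are my own illustration in the spirit of the requestor API, not Amazon’s actual SDK, whose calls and parameter names may differ:

```python
def build_hit_request(title, description, reward_usd, duration_minutes, max_assignments):
    """Assemble the properties of a HIT: title, description, reward,
    required completion time, and number of assignments."""
    return {
        "Title": title,
        "Description": description,
        "Reward": f"{reward_usd:.2f}",  # rewards are typically sent as strings
        "AssignmentDurationInSeconds": duration_minutes * 60,
        "MaxAssignments": max_assignments,
    }

# The editorial-review HIT used later in this article:
# three assignments at $2.36 each.
hit = build_hit_request(
    title="Editorial review of a short article",
    description="Read the article and suggest concrete improvements.",
    reward_usd=2.36,
    duration_minutes=60,
    max_assignments=3,
)
```

A real requestor client would then submit this request to the marketplace, after which the HIT appears for workers to browse and accept.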
To illustrate how this works, I used Amazon’s Mechanical Turk service for editorial review of my own article. There is certainly something amusing about using the MTurk service to write a blog about itself. However, I’m not a programmer, so instead of using the API or command-line tools I used the website to create a HIT manually.
Step 1: Enter the properties - I entered some basic information about my HIT template, including a name, title, description, monetary compensation, and worker requirements (such as a 95% approval rate). I’ll be publishing three assignments and paying $2.36 for each one that I approve.
Step 2: Preview and publish the HIT - I previewed my HIT, then clicked the 'Finish' button and the 'Publish HITs' button on the next screen.
Step 3: Monitor - The marketplace takes me automatically to the 'Manage Batches' screen, where I can monitor dynamically updated results, including elapsed time since publication, average time per assignment, assignments completed, estimated completion time, and the effective hourly rate I am paying. All three assignments were completed within an hour.
Step 4: Review - The average time to complete an assignment was 11 minutes. The effective hourly rate for this submission was $12.83. If I were using the marketplace for a high-volume project and approving most submissions, this rate would be too high and I could save money by lowering the reward. Programmatic requestor interfaces to the marketplace can adjust rewards dynamically. I approved all three assignments, and the total cost to me including commissions was $7.79.
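The arithmetic behind those numbers is straightforward. A small Python sketch using this article’s figures and Amazon’s 10% commission (the function names are mine, for illustration only):

```python
def batch_cost(reward, assignments_approved, commission_rate=0.10):
    """Total requestor cost: the reward for each approved assignment
    plus Amazon's commission on top."""
    rewards = reward * assignments_approved
    return round(rewards * (1 + commission_rate), 2)

def effective_hourly_rate(reward, minutes_per_assignment):
    """What a worker earns per hour at a given reward and pace."""
    return round(reward * 60 / minutes_per_assignment, 2)

# Three approved assignments at $2.36 each, plus the 10% commission.
print(batch_cost(2.36, 3))  # 7.79
```

(The $12.83 hourly figure above reflects the marketplace’s unrounded average completion time; plugging in a flat 11 minutes gives a slightly different number.)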
Step 5: Consume the results - I used all of the editorial comments but the best one was, “I guess if I had to sum up my one way to improve the article it would be to expand upon the idea of the ideology of the human being as the tool, and the positive and negative implications. I think you could put the information after your first paragraph to lay the foundation for the blog.” This is great advice. I added more history and some polarized opinions on the service. I also paused to examine the philosophical implications of a software-based tool having just used a human being to suggest that I consider the implications of software tools using humans to do work.
At the time I wrote this there were about 23,000 groups of HITs containing 247,033 active HITs. Rewards ranged from $0.01 to $100. Typical work types include market research, audio transcription, data cleaning, editing, business process documentation, comparable product identification, translation, and content classification. The mturk.com website highlighted 15 commercial applications of the MTurk marketplace, including CastingWords.com (transcription), Easyusability.com (testing), and Pickfu.com (market research). MTurk is still in Beta but is fully functional and accepts new registrants.