This assignment counts as a deliverable toward your final project. The goal of this assignment is to generate several ideas that you could refine into your final project. You’ll create video pitches for three ideas, and get feedback from your classmates about how exciting and how feasible they think your ideas are.
Your final project must be done in a group. The size of the group can be 3-5 people. Find some awesome people. Do you remember what James Surowiecki says about groups? They’re best if they are composed of a diverse set of people. So maybe you should try to pick out people who aren’t already your friends. Use the Piazza message board to advertise that you’re looking for teammates.
Your group should meet up and brainstorm ideas for the project. You can come up with ideas on your own, or use some of my ideas as a starting point. Your ideas are probably better than mine.
Here are some different kinds of projects that you could do:
As a group, pick the 3 ideas that you like the most and start fleshing them out.
Here are some considerations that you should take into account when selecting your shortlist of ideas. The final version of your project should:
Your initial ideas don’t have to do all of these things yet, but they should be ideas that you can extend in that way.
You will be asked to provide the following information:
You should submit your answers to the questions on the project pitch questionnaire. Please submit a separate form for each of your 3 ideas.
You should produce a 90-second to 2-minute video for each of your ideas. The video should include slides that describe your project, along with a narration that explains the project in an exciting, cogent way. There's no need for Hollywood-level production values in this video; a clear description of your proposed project is preferable to an overly creative one.
You will each be assigned a set of projects to review, and you’ll give feedback to your classmates on their ideas. We will provide a set of criteria for you to use when evaluating the project ideas. We will give you the questionnaires and videos for several projects for you to evaluate. Peer grading will constitute 5% of your overall grade. There will be several assignments that you will peer-review: this assignment, the company profiles, and the other parts of the final project.
This assignment is due before class on Friday, March 4, 2016. You must work in groups on this project. You must declare who is in your group when you turn in your assignment. Everyone in your group will receive the same grade on the assignment.
This assignment is worth 7 points of your final project grade.
Here are a few final project ideas. You are welcome to adapt one of these ideas into your final project, or to come up with your own idea. My expectation is that your final project will represent a substantial amount of work, and that it will be something that you’re proud of and that you would like to show off to potential employers or to graduate schools.
Come up with a human computation algorithm that helps people find a better match in online dating. Some people have tried to use machine learning or crowdsourcing to optimize their dating experience on OKCupid. Can you come up with a better way of matching people up via crowdsourcing? Maybe you can have the crowd act as Cyrano de Bergerac, feeding users better lines than they could think of themselves. Maybe you could have people in a social network nominate people who they think would be good matches.
The International Children’s Digital Library is a collection of children’s books from around the world. Volunteer translators have translated a subset of their books into different languages. We could try to translate many more of the books using crowdsourcing. There could be different tasks for monolingual speakers and bilingual speakers. Monolinguals could transcribe the text of the books (which is usually embedded in images). Bilinguals could translate it. Monolinguals could edit the translations.
Create a human computation algorithm to convert prose into poetry. Your algorithm should model two aspects of poetry: rhyme and meter. NLP researchers have been working on text-to-text generation algorithms that can rewrite sentences in many different ways. This software can generate a huge number of alternatives, some of which may fit the constraints of a poem. However, the software is currently poor at determining which of the generated sentences are grammatical versus ungrammatical, and which correctly retain the original meaning. Your job will be to incorporate humans into the process to make those decisions. Meet with your professor to learn more about the NLP software (it takes quite a lot of effort to learn), and then design a set of MTurk HITs to filter generated sentences down to ones that are poetic, grammatical, and mean the same thing as the original prose.
The Mechanical Turk interface for workers sucks. Although it lets workers search by HIT value and maximum duration, it lacks one critical piece of information: expected hourly rate. Write a web plug-in to help workers track their hourly rate. Design a database that records these stats for multiple workers and reports the average hourly rate for different requesters and/or tasks. Possible extensions could track the approval/rejection rate of each requester, the average time from doing a HIT to getting paid for it, and a log of all of the reasons for rejection.
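The core bookkeeping here is simple. As a sketch (all names are hypothetical, and this assumes the plug-in logs each completed HIT's requester, reward, and time spent):

```python
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class CompletedHit:
    requester: str       # requester name as shown on MTurk
    reward: float        # reward in dollars
    seconds_worked: int  # time the worker actually spent on the HIT

def hourly_rates(hits):
    """Compute the average effective hourly rate per requester."""
    totals = defaultdict(lambda: [0.0, 0])  # requester -> [dollars, seconds]
    for h in hits:
        totals[h.requester][0] += h.reward
        totals[h.requester][1] += h.seconds_worked
    return {req: dollars / (secs / 3600.0)
            for req, (dollars, secs) in totals.items() if secs > 0}
```

For example, a requester whose HITs pay $0.10 and take a minute each works out to $6.00/hour, which is exactly the number the current worker interface hides.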
Design an adjudication system for work rejected by Mechanical Turk Requesters. The system should allow Workers to appeal rejections, and should have a mechanism for deciding whether the rejection was fair (in which case it would stand) or unfair (in which case it should be overturned, and the Worker should be paid). Possible ideas: design a second-pass HIT that has other Turkers review the work and decide whether it is acceptable or not. As part of this project you should specify what constraints to place on the original HIT design to allow easy second-pass reviewing and highlighting/explanation of why an assignment was rejected. You should also quantify the expected increase in costs to Requesters, based on variables like rejection rate, original reward amount, and estimated reviewing cost.
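One simple model for that cost increase (the function and variable names are illustrative; the formula assumes every rejection is reviewed, and overturned rejections also pay out the original reward):

```python
def expected_cost_increase(rejection_rate, reward, review_cost, overturn_rate):
    """Expected extra cost per original assignment when rejected work
    can be appealed: every rejection incurs a review cost, and the
    fraction of rejections that get overturned also pays the original
    reward after all.

    rejection_rate : fraction of assignments the Requester rejects
    reward         : original reward per assignment, in dollars
    review_cost    : cost of second-pass review per appeal, in dollars
    overturn_rate  : fraction of appealed rejections found unfair
    """
    return rejection_rate * (review_cost + overturn_rate * reward)
```

With a 10% rejection rate, a $0.50 reward, a $0.20 review, and half of rejections overturned, the overhead is under five cents per assignment, which is the kind of concrete estimate your writeup should include.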
Design and implement a method for Mechanical Turk Requesters to share qualification tests and the results of who passed the tests. Write a short paper describing the value of such a system, and comparing it to MTurk’s Masters qualification. Design a few qualifications of your own that you think would be broadly useful, possibly by reviewing the tasks currently posted on MTurk and generalizing the skill sets that are needed.
Choose some aspect of cognitive science that can be tested through experiments on human subjects. One of my favorite examples of this is Lera Boroditsky’s work testing various aspects of the Sapir–Whorf hypothesis that language influences the way that we think. Read Lera’s article on how she used MTurk to test whether metaphors change the way people reason. Choose your own topic (or reimplement several classic experiments). Write a paper discussing your results, discussing whether MTurk provides a representative sample of subjects, and describing how to go about applying for Institutional Review Board approval for cognitive science experiments on MTurk.
Write a suite of HITs on Mechanical Turk to test behavioral economic theories by implementing a set of games like the “ultimatum game”. In this game two people are paired up. (They can communicate with each other, but otherwise they’re anonymous to each other.) They’re given $10 to divide between them, according to this rule: One person (the proposer) decides, on his own, what the split should be (fifty-fifty, seventy-thirty, or whatever). He then makes a take-it-or-leave-it offer to the other person (the responder). The responder can either accept the offer, in which case both players pocket their respective shares of the cash, or reject it, in which case both players walk away empty-handed. See more details in “The Wisdom of Crowds”. Note that this requires pairing two people simultaneously or simulating their interactions.
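Pairing two live Turkers (or simulating the second player) is the hard part of the ultimatum game; the payoff rule itself is easy to pin down. A minimal sketch:

```python
def ultimatum_payoffs(pot, offer_to_responder, accepted):
    """Resolve one round of the ultimatum game.

    pot                : total dollars to divide (e.g. 10)
    offer_to_responder : amount the proposer offers to the responder
    accepted           : True if the responder accepts the offer

    Returns (proposer_payoff, responder_payoff).
    """
    if not 0 <= offer_to_responder <= pot:
        raise ValueError("offer must be between 0 and the pot")
    if accepted:
        return (pot - offer_to_responder, offer_to_responder)
    return (0, 0)  # rejection: both players walk away empty-handed
```

The behavioral-economics question is then empirical: classical theory predicts responders should accept any nonzero offer, so the interesting data is how often your Turkers reject "unfair" splits like (8, 2).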
Apple uses speech recognition systems for Siri. You can develop this technology for new languages. You need an open-source speech recognition system and a bunch of training data. What sort of data? Audio files paired with their transcriptions. Where do you get the data? Crowdsourcing! You can come up with ways of collecting data, either through transcription of existing audio files, or through ‘elicitation’, where people read texts out loud and you save the recordings. You’ll need to figure out how to do good quality control, and determine to what extent quality matters when you’re training a speech recognition system for a new language.
There are a lot of food trucks in Philly. Some of them are so awesome that they move to different locations on different days. They announce their whereabouts on Twitter or Facebook. Do they really expect us to keep track of where they all are? Why not have the crowd create a map of the current whereabouts of all the food trucks. How about having the crowd keep track of their menus and prices while you’re at it? A good crowdsourcing platform to use for this project is FieldAgent. FieldAgent gave me $2000 in credit, which I can share with students.
Did you know that you can catch the flu from social media? Well, you can’t. But you can use it as a tool to track the spread of certain diseases. You could try re-creating one of the publications by this cool researcher. What sorts of health problems do you think social media can give us information about?
Wikipedia provides hour-by-hour page view statistics for every one of its pages. Write a human computation algorithm that uses these statistics as input to detect trending topics in the news. Use humans to (1) review the trending pages to say whether they describe something newsworthy, (2) cluster them into pages about the same event, and (3) write short summaries of the event that triggered them to become popular. Design good mechanisms for quality control for clustering, and for describing something as newsworthy. Read this paper about a baseline computational algorithm.
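Before any of the human steps, you need a computational filter over the page view statistics to nominate candidate pages. One plausible baseline, assuming you have already fetched hourly view counts per page (the window size and threshold here are arbitrary starting points, not part of the assignment):

```python
def trending_score(hourly_views, window=24):
    """Ratio of average views in the most recent window to the average
    over the preceding history; higher means a sharper spike."""
    if len(hourly_views) <= window:
        raise ValueError("need more history than the window size")
    recent = sum(hourly_views[-window:]) / window
    baseline = sum(hourly_views[:-window]) / (len(hourly_views) - window)
    return recent / max(baseline, 1.0)  # avoid division by zero on dead pages

def trending_pages(page_views, threshold=5.0, window=24):
    """page_views: dict mapping page title -> list of hourly view counts.
    Returns titles whose recent traffic is at least `threshold` times
    their baseline, as candidates to send to human reviewers."""
    return [title for title, views in page_views.items()
            if trending_score(views, window) >= threshold]
```

The output of `trending_pages` is exactly the input to step (1): humans then judge newsworthiness, cluster, and summarize.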
Prediction markets use collective intelligence to try to predict the outcome of future events. Prediction markets answer questions that have definite, verifiable answers on a particular date (like “Will the government shutdown still be in effect on October 31, 2013?”). They let people buy and sell shares in the outcomes, and track the value of each outcome’s shares over time. You should implement a prediction market that sets the value of the shares. You should hire workers on Mechanical Turk to make the predictions. The major design challenge will be to formulate the system so that it incentivizes Turkers to make well-considered predictions instead of random predictions. For instance, you may consider designing a HIT that pays nothing initially, but that gives people up to $10 if all of their predictions are accurate.
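One standard mechanism for setting share prices (not prescribed by the assignment, but a reasonable starting point) is Hanson's logarithmic market scoring rule (LMSR). A minimal sketch for a binary YES/NO question:

```python
import math

def lmsr_price(shares_yes, shares_no, b=100.0):
    """Instantaneous price of a YES share under the LMSR; b controls
    liquidity (larger b means prices move more slowly). YES and NO
    prices always sum to 1, so a price doubles as a probability
    estimate for the event."""
    e_yes = math.exp(shares_yes / b)
    e_no = math.exp(shares_no / b)
    return e_yes / (e_yes + e_no)

def lmsr_cost(shares_yes, shares_no, b=100.0):
    """Cost function: total dollars the market maker has collected."""
    return b * math.log(math.exp(shares_yes / b) + math.exp(shares_no / b))

def cost_to_buy(shares_yes, shares_no, delta_yes, b=100.0):
    """Dollars a trader pays to buy delta_yes YES shares from the
    current market state."""
    return (lmsr_cost(shares_yes + delta_yes, shares_no, b)
            - lmsr_cost(shares_yes, shares_no, b))
```

The nice property for an MTurk setting is that the market maker always quotes a price, so Turkers never have to wait for a counterparty; the incentive design problem (real money vs. bonus-based play money) is still yours to solve.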
The words we use to describe politicians, and public figures in general, depend a lot on their background. Pick one characteristic to track (age, gender, party, country or state of origin, relationship status, time in office, anything), figure out which words correlate most strongly with politicians who possess that characteristic, and use the crowd to assign an intensity and sentiment to some of these words. You might even design a HIT that swaps out the names and pronouns of one politician for another and asks the Turker to assess the clarity and cohesion of the article, to see how background affects descriptions in the media.
The Guardian recently started publishing an online database of police-involved killings called The Counted. In turn, the FBI announced that it would also be publishing information about the deadly use of physical force nationwide. This information is tracked in a lot of places, including gun violence blogs and even in the projects of students who took NETS213 last year. Using the crowd to identify duplicates and supplement details in one place could yield interesting information about which areas are best at reporting violence, which news sources are least accurate, or any other problem you’d like to study. Automatic reconciliation of conflicting data and classification of the type of data would likely require some strong HIT design.
Penn’s School of Engineering and Applied Science has graduated over 1,400 PhD students over the past 25 years, but it has lost track of them. Kostas Daniilidis, the SEAS Associate Dean for Doctoral Education, would like to gather information about how their careers turned out after they graduated from Penn. For each of the students, we have their name, the year that they received their PhD, their department, and (sometimes) their advisor’s name.
Dean Daniilidis would like us to lead a crowdsourced effort to find out the following information about each SEAS PhD graduate:
For this project, I recommend using the UWork platform, rather than MTurk or CrowdFlower.
Professor Eskenazi and her team are constructing a spoken dialog system that gives out information on a variety of topics (places to eat, fine arts, hotels, transportation and places to shop). In order to enable the system to respond correctly, they need to know the types of things that people would say to it. Since people speak differently than they write or type, they need to collect their answers in the form of recorded speech. This collection of speech utterances will then be automatically labelled and used to train a natural language understanding system (NLU) that the spoken dialog system uses to extract meaning from an unknown user utterance.
They would like to show the worker a sentence (system prompt) on the screen and ask them to record five possible responses for each one. We would like to have 100 workers see each system prompt.
To sum up, a worker sees a randomly-chosen system prompt (from the set below) on the screen. The worker records herself saying one response, listens to the recording and verifies that it is correct (and re-records it if it is not), then goes on to record the next one and verify it, and so on until all five responses have been successfully recorded and verified. Then the next randomly-chosen system prompt is presented.
Here are system utterances for each domain (places to eat, fine arts, hotels, transportation and places to shop). The system says: “I can give you information about places to eat. What would you like to know?”
Places to eat:
Places to shop:
Professor Eskenazi and her team are creating a spoken dialog system that finds answers to user questions in resources such as news articles, The Wall Street Journal, The New York Times, USA Today or Wikipedia.
For their task, a worker will be shown a paragraph of an article from one of these resources. The worker is to read it, think of two questions that could be answered by parts of this paragraph, and record herself saying each of them. After recording the two questions, they would like the worker to mark the sentences in the paragraph that contain the answer to each question. They have a set of 100 paragraphs for this HIT and would like five workers to see each paragraph.
The collected data will be used not only in their question answering system development, but also for the development of the automatic generation of the most informative questions for a given document. This combination of derived data will help their users get the most appropriate answers to their questions.
An example paragraph from Wikipedia: Early life of Bill Gates
Gates was born in Seattle, Washington on October 28, 1955.
He is the son of William H. Gates, Sr. and Mary Maxwell Gates. Gates’s ancestry includes English, German, Irish, and Scots-Irish.
His father was a prominent lawyer, and his mother served on the board of directors for First Interstate BancSystem and the United Way. Gates’s maternal grandfather was JW Maxwell, a national bank president. Gates has one elder sister, Kristi (Kristianne), and one younger sister, Libby. He was the fourth of his name in his family, but was known as William Gates III or “Trey” because his father had the “II” suffix. Early on in his life, Gates’s parents had a law career in mind for him. When Gates was young, his family regularly attended a Protestant Congregational church. The family encouraged competition; one visitor reported that “it didn’t matter whether it was hearts or pickleball or swimming to the dock … there was always a reward for winning and there was always a penalty for losing”.
Questions from Workers
Professor Eskenazi has provided 100 paragraphs that she would like workers to create questions for.
Recently Professor Resnik has been researching the problem of how language connects to underlying mental state, with an interest in potential applications like low-cost, wide-coverage screening for mental health conditions like depression. He would like help collecting labeled training data for posts to a mental health forum, where the goal is to design a machine learning classifier that could flag posts if a person might be a danger to self or others and require intervention. There is a related shared task on the automatic triage of posts from a mental health forum as part of an upcoming workshop on Computational Linguistics and Clinical Psychology.
Professor Resnik would like your help to crowdsource labeling of postings from various sources, possibly including:
This project would involve working with Professor Resnik to develop annotation guidelines, performing the crowdsourcing and quality control for him, and training a machine learning classifier to identify language that would prompt intervention.
Jeffrey Bigham runs a class at CMU. You can check out his list of suggested final project topics.
You will be asked to answer the following questions about each of your three term project ideas. (Be sure to save a copy of your answers for your records).