Discussion in 'Requesters' started by pilazza, Apr 29, 2018.

    Hi everyone,

    I'm a newbie requester looking to get some feedback on a series of HITs for identifying and labeling forum/blog/Q&A sites. Early prototypes of the HITs (on the worker sandbox) are up at:

    HIT #1: Identify forum sites:
    This is a simple single question with instructions. I've got keyboard shortcuts for the options for easy completion. Would it be helpful to have the site itself in an iframe so it displays inline? Or is it better as a link that opens in a new tab/window?

    HIT #2: Find forum root:
    This is a text box entry, again after looking at and manipulating a website. Same question about iframe vs. link as before. Also, is it clear enough what I mean by forum "root"? Would some examples help?

    HIT #3-5: Label index/thread/post pages:
    These involve a custom JavaScript overlay that walks you through labeling several (~20) fields on a webpage. I'm most concerned about the granularity of these HITs - I know that it's best to break HITs up into individual tasks that can be completed quickly, but in these cases, the later stages are dependent upon what's selected in early stages, and I can use software to reduce the amount of work the Turker has to do or improve the error-checking on their input. (For example, if no categories are selected, it should hide the request to label category fields, and the set of legal click targets for the field is modified by the items selected in earlier stages.) Are these too daunting & long? Should I break them up so that each "stage" is a separate HIT, moving some of the software logic that's currently in the JavaScript to the server? Or even so that each individual labeling task is a separate HIT? Or is it more convenient to be able to click through a number of fields at once all on the same document?

    I'm also interested in feedback on wording, instruction clarity, UI, and pricing, if you have any opinions on those. I'll likely change the color scheme to match the default templates and add some example screenshots in the future.

    Thanks for any input!
    @pilazza it's a bit late, but some of our other members may have a response for you. I tagged this forum's admin so he sees this tomorrow and gets back to you then; he's much more knowledgeable about the sandbox and helping requesters in general.
    First of all, welcome to TurkerHub! Secondly, props for making use of the Worker sandbox by testing your HITs and asking for feedback before going live. :emoji_hugging:

    It's pretty late for me so I won't be able to offer particularly detailed or verbose feedback at this time, but I can give you some thoughts as a worker.

    HIT #1 is clear and straightforward. Keybinds are always appreciated. I would say that most folks prefer inline embedded pages when possible. If I were to see a HIT like this in the wild, I would just turn on my quick n dirty automatic link opener that fires after the HIT's iframe loads.

    HIT #2, ha, you put MTurkCrowd as one of the sites. :p Even more straightforward. These can't really be embedded inline so a link like this is just fine. I'd say folks that do URL/content harvesting consistently often have ways to tackle opening links in a way that they're familiar with.

    HITs 3-5 are where things get dicey. You're going to get tons of people that see the instructions and nope right out. :emoji_sweat_smile: The difficult part about bigger HITs like this is that it can be harder for us to figure out if there's a good hourly. Another part is that some HITs might take 90 seconds while others take 3 minutes.
    Depending on how long of a period of time you have between these groups of HITs, you might want to qualify a smaller pool of Workers that perform well on your previous HITs. Another option is to put out a small qualification batch and assign qualifications to the top X% of Workers.

    Overall it sounds like you want to put in the work and tweak things early on to make things easier for us, which is just fantastic. :emoji_bow:

    @ChrisTurk You know what must be done. :emoji_ok_hand:
    I'm not able to do a thorough look over right now, but I did notice on the 3-5 tasks, if you accidentally click on an element, there's no way to deselect it. And further clicks just move up the hierarchy of the elements until the original parent is selected. There should probably be a way to deselect without having to refresh the page and do it over again.
    I think the link is fine on this one. Directions are simple and clear.

    This looks good to me as well, and using a link is probably better. I would include an example or two in the instructions. I know what you want, but there may be people that misunderstand them. Throw a little screenshot link in there or something.

    mTurk wouldn't load these when I tried to look, so maybe you've pulled them to work on them or something. I usually prefer short, simple tasks, but one with more steps is fine as long as the pay ends up being a decent hourly.

    I was going to say what @Melting Glacier said about considering a qualification for workers.

    Thanks for putting this amount of work into these and getting in touch with the forum about it. Remember us when you put out those qual HITs! ;)
    Thanks everyone for your responses! HITs #3-5 went down for a bit because I'd only submitted a couple of instances to the sandbox and apparently people accepted them, leaving no more work to be done. They should be back now.

    The time between HIT groups is on the order of minutes - I've got some software that listens for the completion of HITs via AWS, then does some crawling and processing based on the result and kicks off the next HIT. Ideally this'd operate continuously - where user input kicks off HIT #1, which kicks off HIT #2, which kicks off HITs #3-5, and then additional users result in more work - but right now I've just got an initial cohort of interested users that might generate a batch of ~10000 #1s, ~10% of which end up generating #2-5.
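
    As a rough illustration of the chaining described above (not the actual software), the hand-off between HIT types can be modeled as a small function from one completed HIT's result to the follow-up HITs it should spawn. The stage names, the `is_forum` and `root_url` result fields, and the `next_hits` function are all hypothetical; a real pipeline would receive completions from AWS and post the new work through the MTurk API.

```python
from typing import Dict, List

# Hypothetical stage names for the five HIT types discussed in this thread.
IDENTIFY, FIND_ROOT = "identify_forum", "find_root"
LABEL_STAGES = ["label_index", "label_thread", "label_post"]


def next_hits(stage: str, result: Dict[str, str]) -> List[Dict[str, str]]:
    """Return the follow-up HITs to create after one HIT is completed."""
    url = result["url"]
    if stage == IDENTIFY:
        # Only sites judged to be forums move on to HIT #2.
        if result.get("is_forum") == "yes":
            return [{"stage": FIND_ROOT, "url": url}]
        return []
    if stage == FIND_ROOT:
        # The forum root entered by the worker seeds HITs #3-5.
        root = result["root_url"]
        return [{"stage": s, "url": root} for s in LABEL_STAGES]
    return []  # Labeling stages end the chain in this sketch.


if __name__ == "__main__":
    print(next_hits(IDENTIFY, {"url": "https://example.com", "is_forum": "yes"}))
    print(next_hits(FIND_ROOT, {"url": "https://example.com",
                                "root_url": "https://example.com/forum"}))
```

    Keeping the hand-off as a pure function like this would let the same logic be exercised in the sandbox without posting any HITs.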

    Qualifying Turkers is an interesting idea that I hadn't considered. Does Amazon charge a higher fee for using qualifications? I admit I haven't read all the documentation for this feature - I was scared off by premium qualification prices (which is a slightly different feature, no?), though I guess if this is a larger HIT that would pay a few dollars anyway, those are proportionally lower.

    How much time do Turkers typically expect to devote to a single HIT? I'm also considering just breaking HITs #3-5 up on subtask boundaries, so that "Select all categories" is its own HIT, as is "select all forums", as is "label forum fields", and so on. That should cut the time required for it down from the 10-15min it took me when I worked through it to something more like a minute or so. It also lets me get rid of the big instructions: it should take no more than a paragraph or so to explain an individual subtask, and they'd all fit on one screenful without any "Next" buttons.

    Thanks, this is a good point. I'll go add that functionality shortly.

    Yeah, I'll add a couple example screenshots when I've finished developing - just haven't had time yet.
    Regular qualifications are different from premium. The premium qualifications are more meant for finding certain demographics. I don't think it costs anything to make and use your own.

    It really depends on your HIT and what that particular turker is willing to do it for. We usually measure by what kind of hourly rate you can get from working on the task. A higher hourly attracts more people and better work. A lower hourly does the opposite.

    There's also consideration on whether you're a reasonable requester or not. The possibility of getting rejects from a strict or unfair requester is a big deterrent as they can affect our ability to do other HITs. Other turkers will usually give you a TO/TV rating for your HITs that we use to learn about the requester. I feel like this thread already says a lot about you and your communication, so I doubt you'd have many problems in this respect.

    I'm more of a fan of splitting them up, myself. I think it'll allow people to zone in on a specific task and go through them with more focus. A lot of people underestimate the cost of stopping focus and having to switch and start on something else, so I think splitting them up would help you in this aspect as well as the instructions length.
    Hi again everyone,

    I've redone the series of HITs fairly significantly, based on Turker feedback and some additional requirements on our backend. It now looks something like this:

    Unfortunately I'm finding that Turkers are almost always getting confused about what they're supposed to do. Either they submit the HIT without selecting anything, or they don't submit anything and put it back. Ideally the Turker would select items like what are shown in the examples:

    But instead they're selecting nothing. I ran an experiment that asked a bunch of Turkers about the terminology they'd use for the highlighted items, and found that their answers were all over the map. Perhaps the problem is that there's no common vocabulary for labeling the parts of a forum site, and so when I ask them to select a certain header/topic/thread/comment/etc. they don't know what I mean? I also looked into Qualifications, but the format of these HITs isn't suitable for a Qualification Test and adding qualifications based on past results only works if some people manage to complete the past results successfully, while it seems like nearly everyone is getting confused by what I'm asking.

    Anyone have any ideas for what might be going wrong, how to clarify the instructions so that they make more sense, or other strategies I could use that would improve accuracy?
    Do you have a target that these pay in an hourly range? Honestly just looking at the example task.. it'd have to be painfully obviously well paid to get me to sit down and try to sift through that.

    I'd maybe start by breaking the tasks down by website scraped. Jumping around UIs/layouts makes it impossible to get into a groove. But that is just barely a start.. this task is.. in depth? Complicated? There is just way, way too much going on IMO.

    "with a number of items already been labeled by a previous Turker"

    Totally unrelated but I personally don't like tasks that do this. It clutters the UI even more and if the other turker labeled something incorrectly it doesn't give me a lot of confidence in submitting the task.
    Hmm, weird... I tried this HIT earlier and couldn't submit. I kept getting "Loading next HIT" on my screen after I clicked Submit HIT. So, I thought maybe I labelled something wrong. In the end, I returned the HIT.
    Thanks for both of your replies!

    My apologies for that - that was a software bug between me & Amazon that I just fixed earlier this week.

    And actually, I was just trying these HITs out again myself in the sandbox and noticed that another software bug was preventing me from getting results from them, even though they were submitting okay. That could be the source of my problems - I'll go fix it and incorporate the feedback I've received so far and see if that improves things.

    I was initially shooting for roughly $12-24/hour. It takes me between 30-60 seconds to finish one of these (granted, I know exactly what I'm looking for and have labeled about 70 sites = ~1000 HITs in the sandbox), and so I set the price at about $0.20/HIT.

    I was looking at the worker list on the site, though, and it looked like the vast majority of HITs were $0.05 and under, and in many cases required a lot more work than mine (there was one ridiculous one that paid $0.01 and asked you to label 50 images for one HIT). So I set the initial price at $0.05/HIT. People pick it up and submit at that price, so I didn't think pricing was the problem, though I'm open to changes in that.

    Is there a way to tell MTurk "prefer the Turker who previously submitted this other assignment, but if they're not active, allow other Turkers to complete this HIT"? Giving out a qualification per HIT doesn't seem like it'd work because it would totally exclude other Turkers, and may not be feasible anyway technically because that's thousands of qualifications.

    I'd initially had it so that all of the tasks on a single page were a single HIT - those were HITs #3-5 in my original message. There were a number of problems with that approach though:
    1. When broken up like that, it took me about 20 minutes to complete a single HIT, and about 40% of the HITs I did had errors in them that made the final result unusable (and that's as the requester and primary software developer, where I know exactly what I'm looking for). I didn't want to create situations for Turkers where they got halfway through and decided they didn't want to do the HIT after all (wasting 10 minutes and not getting paid for any of it), or were called away by dinner or a crying baby and had to abandon the HIT, or made a mistake on one of the subtasks and so end up with the whole HIT rejected.
    2. The instructions for this approach got so long that I had to organize them hierarchically and they scrolled all the way off the screen. I thought that would be even more confusing.
    3. It required a lot of context-switching on the part of the Turker - labeling each of 20 fields is mentally a different activity from selecting items from a repeated list. I could keep the page constant, but then the activity they're doing switched with each subtask, which subjectively seemed worse.
    4. By breaking them up, I can do a bunch of things with software that save labor for the Turker and reduce the chances of errors. If I've got all items in a repeated list selected, for example, then when it comes to labeling specific fields I can have the Turker just label one of the repetitions and use the information I got from the first task to apply it to the rest of the page, saving hundreds of clicks. I can also determine which elements are valid based on the structure of the page, and prevent the Turker from selecting ones that wouldn't work anyway.
    It's unfortunately a complex task, and I've tried to break it down in a way that the Turker can get into a groove and do a bunch of related work at once without needing to go back to the main HIT list. Many of the tasks have dependencies on previous tasks, though, like I don't even know which page to crawl until each thread has been labeled. The multi-Turker approach seems the only way to handle that, short of building in some AJAX workflow engine into the pages themselves that largely duplicates what MTurk is already offering.
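
    To make point 4 concrete, here is a minimal sketch of the label-propagation idea, assuming each repeated item (e.g. one thread row) has already been selected and its inner elements can be addressed by a path relative to the item. The data layout and the `propagate_labels` name are illustrative assumptions, not the actual implementation.

```python
from typing import Dict, List


def propagate_labels(
    items: List[Dict[str, str]],
    labeled_paths: Dict[str, str],
) -> List[Dict[str, str]]:
    """Apply labels given on one repeated item to every repeated item.

    items: one dict per repeated item, mapping a relative path inside the
        item (hypothetical format, e.g. "a.title") to that element's text.
    labeled_paths: field name -> relative path, taken from the single item
        the worker labeled.
    """
    labeled = []
    for item in items:
        row = {}
        for field, path in labeled_paths.items():
            # Skip fields missing from this particular repetition.
            if path in item:
                row[field] = item[path]
        labeled.append(row)
    return labeled


if __name__ == "__main__":
    rows = [
        {"a.title": "Welcome thread", "span.replies": "12"},
        {"a.title": "Site rules", "span.replies": "3"},
        {"a.title": "Pinned notice"},  # no reply count on this row
    ]
    # The worker labels fields on one row only; the rest are inferred.
    print(propagate_labels(rows, {"title": "a.title", "replies": "span.replies"}))
```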
    I'd prefer more than 5 cents pay, honestly. I spent around 50 seconds labeling everything and making sure everything was correct; that time includes looking at the examples and reading the instructions. It also depends on how many headers the forum has: more headers means more time spent labeling. Mine had only about 6 headers. Also, it would be better if you could make a tutorial video showing what to label, to avoid confusion.
    It is really hard to follow the actual answer to this question - you put them on production at 5c/HIT?

    FWIW You can see those other HITs because they languish there, undone, (at least in part) because they pay like crap. At best, some of the companies can get away w/ opening them up to non-American workers who have lower standards and the quality doesn't suffer because of the nature of the task. If you can manage that, by all means, we understand it's business when that happens.

    But 5c for 30-60s worth of work would be highly objectionable ($3-$6/hr is what you are intending to pay people?). Your original instinct on pricing was way more in line with what I'd bother with.
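
    For reference, the hourly figures quoted in this thread follow from simple arithmetic on pay per HIT and seconds per HIT. A small sketch (the function name is made up for illustration; pay is given in cents to keep the arithmetic exact):

```python
def hourly_rate_usd(pay_cents: int, seconds_per_hit: float) -> float:
    """Effective hourly pay in dollars for a HIT that takes seconds_per_hit."""
    hits_per_hour = 3600 / seconds_per_hit
    return pay_cents * hits_per_hour / 100


if __name__ == "__main__":
    for cents in (5, 20):
        slow = hourly_rate_usd(cents, 60)
        fast = hourly_rate_usd(cents, 30)
        print(f"{cents}c/HIT at 30-60s: ${slow:.2f}-${fast:.2f}/hr")
```

    At 30-60 seconds per HIT this reproduces the $3-$6/hr range for 5c and the $12-24/hr range for $0.20 mentioned earlier in the thread.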

    You're saying people are picking it up & submitting it ("not a problem"), but also saying the quality is not up to par and that
    It'd suggest to me people are accepting a few, doing enough to figure out the hourly is crap, and then moving on to better work (as they should).

    I have no clue though, if your data quality is fine at 5c/HIT, more power to you.
    Ditto. I think an actual video of instructions would do wonders to help those confused.
    @pilazza so, I tried this HIT of yours, "Label fields on web forum" ($0.20). Did you increase the pay? I could only label 1 field and couldn't proceed to label other similar fields. I looked through your examples, and they labeled only 1 title and 1 description. I was totally confused and submitted the HIT anyway.

    ps: you put Turkerhub as example :p
    Thanks everyone for your feedback. Yes, I've been testing a bunch of changes this weekend:
    1. I put in qualifications of > 1000 HITs completed and 99% accuracy.
    2. I bumped the HIT price back up to $0.20.
    3. I changed the wall-of-text explanation into an animated GIF screencast
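
    For anyone setting up something similar through the API, item 1 might be expressed roughly as below. This is a sketch under assumptions: that the two criteria map onto MTurk's built-in "Number of HITs Approved" and approval-rate system qualifications, and that "99%" means "at least 99%". The `build_requirements` helper is hypothetical; it only builds the data structure and makes no AWS call.

```python
from typing import Any, Dict, List

# MTurk system qualification type IDs (as documented by Amazon).
NUMBER_HITS_APPROVED = "00000000000000000040"
PERCENT_ASSIGNMENTS_APPROVED = "000000000000000000L0"


def build_requirements(min_hits: int, min_approval_pct: int) -> List[Dict[str, Any]]:
    """Build a QualificationRequirements list in the shape create_hit expects."""
    return [
        {
            "QualificationTypeId": NUMBER_HITS_APPROVED,
            "Comparator": "GreaterThan",
            "IntegerValues": [min_hits],
            "ActionsGuarded": "Accept",
        },
        {
            "QualificationTypeId": PERCENT_ASSIGNMENTS_APPROVED,
            "Comparator": "GreaterThanOrEqualTo",
            "IntegerValues": [min_approval_pct],
            "ActionsGuarded": "Accept",
        },
    ]


if __name__ == "__main__":
    # Would be passed as QualificationRequirements=... to the boto3 MTurk
    # client's create_hit call; nothing is sent to AWS in this sketch.
    for req in build_requirements(1000, 99):
        print(req)
```
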
    I only finished screencasts for the first couple of HITs in the sequence, so @queenpenguin, there wouldn't have been one on the "Label fields" task you completed. Nevertheless, you actually completed it correctly - that particular forum site only has a title & link for each topic, and labeling those correctly is enough for the software to infer how to get to the next page. Do you have any ideas how to make it less confusing? I put the example with just the title and description there so Turkers would be aware that there are some cases where all you have is a title & link and that's okay, but I also want to make sure that people will select description/#-replies/lastPostDate if those fields are present (and not just select the minimum required), since the software works a lot better when those fields are present.

    Things are working a bit better now. I'm going to go make screencasts for the other tasks now, and I've got some coding changes needed for this.
    I like the pay now. One of the GIFs did not load properly for me, so I used my best judgment on what to do with the HIT. This is the file with an error in it (filename: select_subforums_intro.gif). The GIFs are actually very helpful. I think you need to emphasize the instructions more by highlighting the key words. I just found out that sometimes I had to label the subforums and sometimes I had to label just the title and description. I went to my queue, opened a HIT, and expected to label title and description. Just my personal preference here: I'd like to be able to drag the popup to the center or anywhere else on my screen so that I can check that I didn't mislabel or miss anything. You could also add a comment box for workers to leave you feedback on how to improve the HIT.
