Thursday, October 10, 2013

Dispositional attitude and user testing



Here is another great reason to read journals outside your main focus area. I just read an article from the Journal of Personality and Social Psychology that has real implications for human factors testing. First let me tell you a little about the paper, and then I will bring it home to HF.

There are some things that just about everyone likes (maybe chocolate, love, or the smell of freshly cut grass). There are also some things that just about everyone dislikes (poverty, cruelty, missing the last piece of a 5,000-piece jigsaw puzzle). Then there are things that some people like and others dislike (smoking, roller coasters, sausage pizza). But here is a different kind of question: are there some people who, averaged across everything, come out as generally positive about things? And other people who, averaged the same way, come out as generally negative? And is this a reliable, valid, useful metric?

That is what a paper by Justin Hepler at the University of Illinois and Dolores Albarracin at UPenn set out to answer. And psychometrically, this is a really solid paper. They took great pains to develop a metric and test its convergent validity, discriminant validity, predictive validity, and reliability. Here are the takeaways I found most useful for human factors.

First, yes – there is a general tendency towards dispositional attitude. They started with 100 items and had people rate each one on a 7-point Likert scale from extremely unfavorable to extremely favorable. They included a wide variety of items, from “abortion on demand” to “mullets.” Then they were able to winnow the list down to 16 items (without the political hot buttons) that reliably predict someone’s general dispositional attitude. The final list includes some items that are generally considered strongly negative, mildly negative, mildly positive, and strongly positive (check out the paper for the full list). Using this instrument, they were able to predict people’s dispositional tendency towards new items not on the list. Not for a single item, because obviously even a positively disposed person might still hate paying taxes or sausage pizza. And a negatively disposed person might still like chocolate or World Cup soccer. But on average it can predict whether you are a fuddy-duddy or a Pollyanna.
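To make the scoring mechanics concrete, here is a minimal sketch in Python. The actual 16 items are in the paper; the ratings below are invented, and the scoring rule (a simple average of the 7-point ratings) is my straightforward reading of how such a scale would be used, not code from the authors.

```python
# Hypothetical scoring sketch for a 16-item dispositional attitude scale.
# Each item is rated 1 (extremely unfavorable) to 7 (extremely favorable);
# the dispositional attitude score is the mean rating across all items.

RATING_MIN, RATING_MAX = 1, 7
NUM_ITEMS = 16

def dispositional_score(ratings):
    """Average the 16 Likert ratings into a single dispositional baseline."""
    if len(ratings) != NUM_ITEMS:
        raise ValueError(f"expected {NUM_ITEMS} ratings, got {len(ratings)}")
    if any(not RATING_MIN <= r <= RATING_MAX for r in ratings):
        raise ValueError("all ratings must be on the 1-7 scale")
    return sum(ratings) / len(ratings)

# A generally positive rater lands well above the scale midpoint of 4:
print(dispositional_score([6, 5, 7, 6, 5, 6, 7, 5, 6, 6, 5, 7, 6, 5, 6, 6]))  # 5.875
```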

Then they tested the convergent and discriminant validity. It turns out that this dispositional attitude is positively correlated with other scales like openness, extraversion, and self-esteem. It is negatively correlated with neuroticism, behavioral inhibition, and prevention focus. And it is uncorrelated with conscientiousness and imagination. So we can start to see why this is an important finding in personality psychology.

But here is how it can be used in HF. We often test systems by asking users whether they like a design or a system – often on a similar 7-point Likert scale. If they rate it above neutral (say a 5 out of 7) we assume they like it, and if they rate it below neutral (say a 3 out of 7) we assume they dislike it. But this kind of measure can be made more accurate (sensitive, precise, and valid) if we spend two minutes getting a customized baseline for each person using the dispositional attitude scale. If a person with a dispositional baseline of 2 rates your design as a 3, that is actually positive. If a person with a dispositional baseline of 6 rates your design as a 5, that is actually negative. And perhaps more importantly, if a person with a dispositional baseline of 6 rates your design as a 2, that indicates a serious dislike you might not spot otherwise. So instead of recording the raw Likert ratings of your system, record the difference between each participant's rating of your system and their dispositional baseline. It only adds two minutes to your study, but it can make your results much more useful.
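Folding this into a test protocol is just a subtraction per participant. Here is a minimal sketch, again in Python, using the three example participants from the paragraph above (the data structure and field names are mine, purely for illustration):

```python
# Baseline-adjusted Likert ratings: subtract each participant's
# dispositional baseline from their raw rating of the design.
# Positive adjusted scores mean they like your design more than they
# like things in general; negative scores mean less.

participants = [
    {"id": "P1", "baseline": 2.0, "design_rating": 3},  # low baseline, mild like
    {"id": "P2", "baseline": 6.0, "design_rating": 5},  # high baseline, lukewarm
    {"id": "P3", "baseline": 6.0, "design_rating": 2},  # high baseline, serious dislike
]

for p in participants:
    adjusted = p["design_rating"] - p["baseline"]
    print(f"{p['id']}: raw={p['design_rating']}, adjusted={adjusted:+.1f}")

# Output:
# P1: raw=3, adjusted=+1.0
# P2: raw=5, adjusted=-1.0
# P3: raw=2, adjusted=-4.0
```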

1 comment:

Anonymous said...

While I agree that this study makes an interesting point, let me raise another one that might impact the results of a study:

Wouldn't it also make sense to evaluate how much experience, educational background, and/or familiarity a person has with what you are testing? Thus, a 6 from the expert might have more impact than a 6 from an average user. Or a 2 from the expert might overshadow five 6's from average users.

Meaning, if I were conducting a study to try to improve the usability of my website and asked 1 person, 10 people, or even 100 people to give me feedback, do you think I would get more accurate feedback from a human factors expert or from the average guy who surfs the web?

Both sides have merit. The expert might give an answer based on science and the proven track records of company sites that are very effective in achieving their intended goals with their audiences. And maybe the expert would be the person to talk to when you only have 1 person to talk to, but 100 actual users might give me a better sense of how my site presents itself to the masses (which might also impact how the expert will view things in the future).