Here is another great reason to read journals outside your
main focus area. I just read an article
from the Journal of Personality and Social Psychology that has great
implications for human factors testing.
First, let me tell you a little about the paper, and then I will bring it
home to HF.
There are some things that just about everyone likes (maybe
chocolate, love, or the smell of freshly cut grass). There are also some things that just about
everyone dislikes (poverty, cruelty, missing the last piece of a 5,000-piece
jigsaw puzzle). Then there are things
that some people like and others dislike (smoking, roller coasters, sausage
pizza). But here is a different kind of
question: averaged across everything, do some people come out as generally
positive about things, while other people come out as generally negative
about things? And is that tendency a reliable, valid, useful metric?
That is what Justin Hepler at the University of
Illinois and Dolores Albarracin at UPenn set out to answer. And psychometrically, this is a really solid
paper. They took great pains to develop
a metric and test its convergent validity, discriminant validity, predictive
validity, and reliability. Here are the
takeaways I found most useful for human factors.
First, yes – people do have a general dispositional
attitude. The authors started
with 100 items and had people rate each one on a 7-point Likert scale from
extremely unfavorable to extremely favorable.
They included a wide variety of items, from “abortion on demand” to “mullets.” Then they were able to winnow it down to a
16-item list (without the political hot buttons) that reliably predicts someone’s
general dispositional attitude. The
final list includes some items that are generally considered strongly negative,
mildly negative, mildly positive, and strongly positive (check out the paper for
the full list). Using this instrument,
they were able to predict people’s dispositional tendency towards new items not
on the list. Not for any single item,
because obviously even a positively disposed person might still hate paying
taxes or sausage pizza. And a negatively
disposed person might still like chocolate or World Cup soccer. But on average it can predict whether you are
a fuddy-duddy or a Pollyanna.
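If you want to play with this, here is a minimal sketch of the scoring in Python. The item names are placeholders rather than the published 16-item list, and it assumes the baseline is simply the mean of the 7-point ratings:

```python
# Minimal sketch of computing a dispositional-attitude baseline.
# Item names are placeholders -- see the paper for the real 16-item list.
# Assumes the baseline is the mean of the 7-point ratings
# (1 = extremely unfavorable, 7 = extremely favorable).

def dispositional_baseline(ratings):
    """Average a participant's 7-point ratings into one baseline score."""
    for item, rating in ratings.items():
        if not 1 <= rating <= 7:
            raise ValueError(f"rating for {item!r} must be 1-7, got {rating}")
    return sum(ratings.values()) / len(ratings)

# Hypothetical participant who is mildly positive about most things:
print(dispositional_baseline({"item_01": 5, "item_02": 6,
                              "item_03": 4, "item_04": 5}))  # 5.0
```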
Then they tested the convergent and discriminant
validity. It turns out that this
dispositional attitude is correlated with other scales like openness,
extraversion, and self-esteem. It is
negatively correlated with neuroticism, behavioral inhibition, and
prevention-focus. It is uncorrelated
with conscientiousness and imagination.
So we can start to see why this is an important finding in personality
psychology.
But here is how it can be used in HF. We often test systems by asking users whether
they like a design or a system – often on a similar 7-point Likert scale. If they rate it above neutral (say a 5 out of
7) then we assume they like it and if they rate it below neutral (say a 3 out
of 7) then we assume they dislike it.
But this kind of measure can be made even more accurate (sensitive,
precise, and valid) if we spend two minutes getting a customized baseline for
each person using the dispositional attitude scale. If a person who has a dispositional baseline
of 2 rates your design as a 3, that is actually positive. And if a person who has a dispositional
baseline of 6 rates your design as a 5, that is actually negative. And perhaps more importantly, if a person
with a dispositional baseline of 6 rates your design as a 2, that indicates a
serious dislike you might not otherwise catch. So instead of recording the raw Likert scale
ratings of your system, you should record the difference between their rating
of your system and the dispositional baseline.
It only adds two minutes to your study, but it can make your results much
more useful.
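To make the arithmetic concrete, here is a small sketch of that adjustment (the function name is mine, not from the paper):

```python
# Record baseline-adjusted ratings instead of raw Likert scores.
# Positive values mean the person rates your design better than they
# rate things in general; negative values mean worse.

def adjusted_rating(design_rating, baseline):
    """Difference between a design rating and the person's dispositional baseline."""
    return design_rating - baseline

# The three cases from the post:
print(adjusted_rating(3, 2))  # +1 -> actually positive
print(adjusted_rating(5, 6))  # -1 -> actually negative
print(adjusted_rating(2, 6))  # -4 -> serious dislike
```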
1 comment:
While I agree that this study makes an interesting point, let me raise another one that might impact results of a study:
Wouldn't it also make sense to evaluate how much experience, how much educational background and/or familiarity a person has with what you are testing? Thus, a 6 from the expert might have more impact than a 6 from an average user. Or a 2 from the expert might overshadow five 6's from average users.
Meaning, if I were conducting a study to try to improve the usability of my website and asked 1 person, 10 people, or even 100 people to give me feedback, do you think I would have more accurate feedback from a human factors expert or from the average guy who surfs the web?
Both sides have merits. The expert might give an answer based on science and proven track records from company sites that are very effective in achieving their intended goals with their audiences. And maybe the expert would be the person to talk to when you only have 1 person to talk to, but 100 actual users might give me a better sense of how my site presents itself to the masses (which might also impact how the expert will view things in the future).