I clearly remember the day I was sitting in a meeting of a data initiative at the University of Virginia, among colleagues with expertise in data analysis, when a faculty member known for teaching a data mining class raised his hand and asked, “What is qualitative data?” Those of us from the education school were taken aback by the question. Was it a joke? Was something else coming? It turned out to be a genuine question, asked in order to learn more, but we all wondered how someone in research could not know what qualitative data is.
If you are reading this blog you most likely know but let me try to clarify again. Qualitative data is anything that can’t be quantified into a numerical structure.
When I talk about numerical structure I mean the four main levels of measurement: nominal, ordinal, interval, and ratio. Each successive level allows more kinds of computation.
Nominal data has distinct groups with no numerical relationship between them. There is no order, and you can’t add, subtract, etc. The groups are simply separate. For instance, hair color is nominal. We can identify the colors as distinct (well, mostly), but brown hair is not two blonde hairs divided by a black hair.
Ordinal data has distinct categories that have an order. A lot of Likert scale data is in that category. For instance, “How bored are you, from 1-5?” will give you categories, and 5 is more than 1. Although it looks like there is a gap of exactly 1 between each category, we can’t say that with certainty, because it would be impossible for humans to rate their boredom so accurately.
Interval data has an order, and the differences between values are equal, so you can start to do addition and subtraction. Temperature in Celsius or Fahrenheit is an interval scale. You can’t do multiplication with it, because 60 degrees Celsius is not twice as hot as 30 degrees Celsius. That’s because 0 in Celsius is not in fact zero in terms of temperature (an absolute zero); the same is true for Fahrenheit. (The temperature scale with a true zero is Kelvin.)
Ratio data also has an absolute zero, so you can do multiplication or division on it. Money is ratio. $20 is in fact exactly twice as much as $10.
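To make the interval versus ratio point concrete, here is a minimal Python sketch using the 30 and 60 degree readings from the temperature example. The function name and variables are my own, chosen for illustration. It shows that differences survive a change of scale, while the "twice as much" ratio only makes sense once the scale has a true zero.

```python
# 0 degrees Celsius expressed in Kelvin (the offset between the two scales).
KELVIN_OFFSET = 273.15


def celsius_to_kelvin(celsius: float) -> float:
    """Convert an interval-scale Celsius reading to ratio-scale Kelvin."""
    return celsius + KELVIN_OFFSET


cool_c, warm_c = 30.0, 60.0

# Subtraction is meaningful on an interval scale: the gap is the same
# whether we measure it in Celsius or in Kelvin.
difference_c = warm_c - cool_c
difference_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# Division is not meaningful in Celsius. The arithmetic gives 2.0, but that
# number only reflects where the Celsius zero happens to sit.
celsius_ratio = warm_c / cool_c

# On the Kelvin scale, which has a true zero, the ratio is meaningful,
# and it is nowhere near 2.
kelvin_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(f"Difference: {difference_c:.1f} C, {difference_k:.1f} K")
print(f"Ratio in Celsius: {celsius_ratio:.2f}, ratio in Kelvin: {kelvin_ratio:.2f}")
```

In Kelvin, 60 degrees Celsius comes out roughly 10% warmer than 30 degrees Celsius, not 100% warmer.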
This distinction is important for qualitative data because, while interval and ratio numbers are actual numbers, ordinal and nominal data are groups that get converted into numbers. Because of this conversion, some kinds of statistical analysis can be run on them or can include them.
Qualitative data, however, is open-ended data that is mostly captured as language. The most common examples are things people say or write. Images can be qualitative data too, if someone writes or says something about the image.
If you talk to someone about how bored they are and they say, “well not too much” then you are collecting qualitative data. If you tell someone they should rate their boredom on a scale of 1-5 then you are collecting quantitative data.
In this example, which one you use depends on your goal. If you are going to compare this person to 1000 others, you need a standardized way to ask the question and collect the responses. Standardizing is harder with open-ended words, and it’s definitely much more time intensive to read through 1000 responses to your question than to get Excel to calculate the average of 1000 responses (or whatever other statistic you need).
If you are trying to understand the deeper reason behind someone’s boredom, or the way they express it, or whether they express it at all, how they behave in the environment, etc., then you need to ask this question in a qualitative way. If all you need is the answer, you can get it through any medium, even an open-ended survey. If you are looking at tone, body language, or other cues as part of your study, then you would likely see them in person. Whether you invite them to your office or visit them in a specific environment will also depend on what the goal of your visit is.
Qualitative data is deep. There is an often quoted idea that quantitative data tells you “what” and qualitative data tells you “why”. Qualitative data can also tell you “what”, depending on your budget, but quantitative data excels at collecting a very large amount of standardized information.
Now you may be wondering: can qualitative data not be standardized? To a certain extent yes, and sometimes it has to be if you involve multiple researchers. The primary method by which qualitative data can be turned into statistics of some sort (which you don’t always have to do, by the way) is a process called “coding”, meaning you elicit themes or answers to conceptual questions from the text. Converting an open-ended response to a theme involves a lot of issues like bias, language barriers, etc., so the researchers need to be aware of them. If you involve multiple people who code, you would have to check something called inter-rater reliability. This is essentially comparing notes between people to see if they thought similarly about themes.
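As a rough illustration of what "comparing notes" can look like in numbers, here is a minimal Python sketch of Cohen's kappa, one common inter-rater reliability statistic for two coders. It is my choice for the example, not the only option. The theme labels and the two coders' lists are made up for illustration.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Agreement between two coders, corrected for agreement expected by chance."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must code the same non-empty set of responses.")
    n = len(codes_a)

    # Share of responses where both coders picked the same theme.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n

    # Agreement we would expect by chance, given how often each coder
    # uses each theme overall.
    counts_a = Counter(codes_a)
    counts_b = Counter(codes_b)
    expected = sum(counts_a[theme] * counts_b[theme] for theme in counts_a) / (n * n)

    if expected == 1.0:
        # Both coders used a single identical theme throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical themes assigned by two researchers to the same six responses.
coder_a = ["bored", "bored", "engaged", "bored", "engaged", "engaged"]
coder_b = ["bored", "engaged", "engaged", "bored", "engaged", "bored"]

kappa = cohens_kappa(coder_a, coder_b)
print(f"Cohen's kappa: {kappa:.2f}")
```

Here the coders agree on four of six responses, but half of that agreement would be expected by chance, so kappa comes out at about 0.33. A value of 1 means perfect agreement and 0 means no better than chance.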
The best practices for qualitative and quantitative data can fill a book, and in fact they have filled many books. One important thing to go over here, though, is that qualitative data sometimes gets a bad reputation, as if quantitative data is somehow more pure and devoid of biases and, because it covers more people, more representative. You need to be aware that quantitative data can be VERY biased and messed up. For instance, you may give people a Likert scale with weird options that people can’t identify with, or you may misinterpret analytics by measuring the wrong things. I can, for instance, add an event listener for users moving a mouse and generate thousands of events per hour, making it seem like users are just going at it in your app.
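The mouse event example can be sketched in a few lines of Python. The event log below is entirely hypothetical (made-up user IDs, event names, and counts, not output from any real analytics tool). It only shows how a raw event total can tell a very different story from the number of people and meaningful actions behind it.

```python
from collections import Counter

# Hypothetical event log as (user_id, event_name) pairs.
events = (
    [("u1", "mousemove")] * 500
    + [("u1", "click")] * 2
    + [("u2", "mousemove")] * 300
    + [("u2", "purchase")]
)

# The headline number: looks like a very busy app.
raw_event_count = len(events)

# What is actually behind it.
events_by_name = Counter(name for _, name in events)
active_users = len({user for user, _ in events})
meaningful_events = sum(
    count for name, count in events_by_name.items() if name != "mousemove"
)

print(f"Raw events: {raw_event_count}")
print(f"Active users: {active_users}")
print(f"Events other than mouse movement: {meaningful_events}")
```

With this made-up log, 803 events boil down to two users and three actions that say anything about what they were trying to do.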
So take the time to learn more about qualitative data methods. You can’t be a UX researcher without knowing both qualitative and quantitative methods.