This article will critically analyse two types of attitude scale as data collection methods. Commonly, measurement of both patients’ and professionals’ assessments of healthcare is based on attitude scales (Cormack, 2000; Bowling, 2002; Parahoo, 2006). There are several methods that have been developed to measure attitudes. However, this article will focus on the Thurstone and Likert methods. In order to give an example of the application of both scales, the relevance of these methods to measure the attitudes of paramedics supporting the suddenly bereaved will be discussed. The author acknowledges that several sources referred to are dated. Nevertheless, they are important primary sources, written when the scales were initially developed.
What are attitudes?
It is first important to define what an attitude is. Edelmann (2000) and Bowling (2002) both define an attitude as a disposition to evaluate a phenomenon in a particular way. In the context of psychological research, people’s attitudes do not wax and wane; they are consistent beliefs and feelings about things (Aranson et al. 1994). Bowling (2002) goes on to explain that attitudes are usually evaluated in terms of cognitive, evaluative and behavioural components. These different components may or may not be consistent with each other (Edelmann, 2000).
For example, a paramedic might hold particular beliefs about death and dying (cognitive) and might feel that some patients should not be resuscitated (evaluative), but this would not necessarily influence his clinical practice (behavioural).
Attitude scales
Attitude scales are used extensively in the collection of self-report data in public health and social science research and evaluation (Bowling, 2002; Parahoo, 2006). While there are several variations in these types of scales, typically they involve a statement about the particular attitude being measured (sometimes referred to as the stem) and a response arrangement in which the respondent is asked to indicate on an ordinal range the extent of agreement or disagreement (Edelmann, 2000; Bowling, 2002).
There is an assumption on the part of the researcher when using either of these methods that the attitudes of the participant can be represented by a numerical score and that each descriptor will mean the same thing to each respondent.
Thurstone scale
When Thurstone developed his ‘method of equal appearing intervals’ in 1928, it was the first defined method of measuring attitudes (Cormack, 2000; Bowling, 2002). The method begins with a chosen attitude object, about which a wide range of belief statements, both positive and negative, is collected (Bowling, 2002). Belief statements are usually collected from literature, discussions with experts or interviews with people for whom the topic is relevant. In order to construct a scale from these statements, a large panel of ‘judges’ is involved. This can make construction of the scale a lengthy process.
The judging panel undertakes what is essentially a sorting task. Each person is asked to sort cards, each with an individual statement written on it, into eleven piles ranked from most positive to most negative. Judges are not asked to give their own opinions but are required to estimate the degree of favourableness or unfavourableness expressed by each statement (Edwards and Kenney, 1946; Barclay and Weaver, 1962; Edelmann, 2000). The middle pile represents a neutral opinion.
The results are then calculated and each statement is given a score based on where the judges placed it on the continuum, taking into account how many judges agreed on that placement. Statements that have poor inter-judge agreement are discarded. The attitude questionnaire can then be produced using statements representing each mean value. The scale will comprise 20–40 statements, each numerically equidistant from the previous and next statement. The statements are placed on the scale in random order (Edwards and Kenney, 1946).
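The judging step described above can be illustrated with a minimal sketch. It assumes that piles are numbered 1 (most unfavourable) to 11 (most favourable), that a statement's scale value is the mean pile number given by the judges, and that agreement is judged by the standard deviation of the placements. The data, the cut-off `MAX_SPREAD` and the function name are hypothetical and are not taken from the sources cited.

```python
from statistics import mean, pstdev

# Hypothetical data: the pile (1 = most unfavourable, 11 = most favourable)
# into which each of five judges sorted each statement.
judge_piles = {
    "More education is needed about end-of-life care issues": [9, 10, 9, 10, 9],
    "The legal implications involved in end-of-life care are concerning": [5, 6, 5, 6, 6],
    "Ambiguous statement the judges could not agree on": [2, 10, 6, 1, 11],
}

MAX_SPREAD = 2.0  # assumed cut-off for acceptable inter-judge agreement


def scale_values(piles, max_spread=MAX_SPREAD):
    """Return {statement: scale value} for statements the judges agreed on."""
    kept = {}
    for statement, placements in piles.items():
        spread = pstdev(placements)  # how much the judges disagreed
        if spread <= max_spread:
            kept[statement] = round(mean(placements), 2)
    return kept


values = scale_values(judge_piles)
for statement, value in values.items():
    print(f"{value:5.2f}  {statement}")
```

In this sketch the third statement is discarded because the judges' placements are widely scattered. Other measures of location and spread, such as the median and interquartile range, could be substituted without changing the overall logic.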
Participants are asked to indicate whether they agree with, disagree with or are neutral towards each attitude statement. Because each statement has a numerical value, the mean value of the statements that a participant agrees or disagrees with can then be calculated (Edwards and Kenney, 1946; Edelmann, 2000; Bowling, 2002). This score reveals whether the participant has a positive or negative attitude toward the topic in question.
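As an illustration of this scoring step, the following sketch assumes that a respondent's score is the mean scale value of the statements he or she agrees with, which is one common reading of the method. The scale values shown are invented for the example and have not been produced by a judging panel.

```python
from statistics import mean

# Hypothetical scale values (1 = most unfavourable, 11 = most favourable)
# assigned to four statements by a judging panel.
scale_values = {
    "I feel adequately prepared to provide end-of-life care": 9.5,
    "I am able to appropriately provide support to the bereaved": 8.0,
    "The legal implications of end-of-life care are concerning": 5.5,
    "I am not adequately prepared to provide end-of-life care": 2.0,
}


def thurstone_score(endorsed, values):
    """Mean scale value of the statements a respondent agreed with."""
    if not endorsed:
        return None  # no agreements, so no score can be computed
    return mean(values[s] for s in endorsed)


# A respondent who agrees with the two most favourable statements.
agreed = [
    "I feel adequately prepared to provide end-of-life care",
    "I am able to appropriately provide support to the bereaved",
]
print(thurstone_score(agreed, scale_values))
```

A score towards the upper end of the 1 to 11 range would, under these assumptions, indicate a more favourable attitude.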
Critics assert that, while Thurstone’s method was an important development in the field of attitude measurement, assumptions were made that placed the validity of the scale in question. The ranking of statements by a panel of judges is not guaranteed to be independent of their beliefs. There is no assurance that the attitudes of the panel do not have a bearing on the resulting score awarded to each statement (Edwards and Kenney, 1946).
Barclay and Weaver (1962) also asserted that Thurstone’s method was time consuming and arduous, taking a third more time to construct than a Likert scale. Edelmann (2000) concedes that for this reason the Thurstone method is out of favour in modern research practice. However, given the significance of this method in the development of attitude measurement scales, it is given a place in several modern research texts.
Table 1 illustrates an example of how a Thurstone questionnaire might be presented.
This is a hypothetical example of a Thurstone questionnaire. Note how the statements of attitude toward end-of-life care change. A validated scale would contain statements ranked and scored by the panel of experts.
Indicate whether you agree (A), disagree (D) or are neutral (leave blank) with the following statements:
___I feel adequately prepared to effectively provide care in end-of-life situations
___I am able to appropriately provide support to the bereaved
___I do not feel that I need further education about end-of-life care issues
___I feel able to make an autonomous decision on whether to withhold resuscitation
___More education about end-of-life care issues would be beneficial
___The legal implications involved in end-of-life care are concerning
___The ethical issues surrounding end-of-life care mean I am often unsure whether I do the right thing
___I am not adequately prepared to provide effective end-of-life care
___More education is needed about end-of-life care issues
___Quality end-of-life care is not an important concern for paramedics
Likert scale
Attempting to shorten this seemingly laborious procedure, Likert (1932) presented a technique which, according to him, did not need a judgment group to produce item scale values (Edwards and Kenney, 1946; Barclay and Weaver, 1962; Mueller, 1986). However, it is worth noting that, according to Ferguson (1941), Likert originally used a scale that had already been constructed using the Thurstone sifting method in order to develop his scale. This brings into question the initial claim that he did away with the need for a judging panel entirely.
The construction of a Likert scale is similar to that of a Thurstone scale in that attitude statements about a particular phenomenon are collected from relevant sources, usually literature (Edelmann, 2000; Bowling, 2002). These statements need to be carefully phrased without ambiguity. However, individual statements are not given a score and there is no assumption that the differences between responses are equal. The Likert scale can therefore report the order of respondents’ attitudes but does not measure the difference between agreeing and strongly agreeing with a particular statement.
Scoring is usually done on a five-point scale, with a higher score revealing a more positive attitude. Table 2 shows an example of a Likert questionnaire.
From your paramedic training, indicate how prepared you are in each of the following knowledge and skill areas.
Knowledge or skill area | Not at all prepared | Poorly prepared | Somewhat prepared | Well prepared
Knowing when to honour written do not attempt resuscitation orders? | 4 | 3 | 2 | 1
Knowing when not to commence resuscitation? | 4 | 3 | 2 | 1
Understanding potential grief reactions? | 4 | 3 | 2 | 1
Appropriate delivery of death notification? | 4 | 3 | 2 | 1
How important do you think the following knowledge and skill areas are?
Knowledge or skill area | Not at all important | Of little importance | Somewhat important | Very important
Knowing when to honour written do not attempt resuscitation orders? | 4 | 3 | 2 | 1
Knowing when not to commence resuscitation? | 4 | 3 | 2 | 1
Understanding potential grief reactions? | 4 | 3 | 2 | 1
Appropriate delivery of death notification? | 4 | 3 | 2 | 1
Adapted from: Stone et al (2009)
Half the statements are worded so that a ‘strongly agree’ response is favourable to the issue in question; the other half are worded so that a ‘strongly agree’ response is unfavourable. The scoring is reversed for these statements (Edwards and Kenney, 1946; Roberts et al. 1999). This method goes some way towards checking the reliability of the scale, as respondents who answer strongly positively to one set of statements should answer strongly negatively to the opposing statements.
In Likert scales, the overall score for each respondent is reported rather than the responses to individual statements. This means that respondents may produce the same score from different sets of answers. Therefore, the same score does not necessarily represent the same attitude (Edelmann, 2000). Individual statements might also be reported on in a study; these statements are known as Likert items.
However, when Likert first constructed his scale it was not the intention for items to be reported on individually; data obtained are usually summarised using the total scores (Dawis, 1987). Nevertheless, it is common practice for researchers to report on individual questions in order to clarify their data analysis, as this goes some way towards differentiating between identical scores obtained from different responses.
Reliability and validity
Attitude scales, like all other data collection tools, need to be checked for reliability and validity. Internal consistency and reliability would be supported if the various individual items correlate with each other, indicating that they belong together in assessing this attitude. In order for an attitude scale to be reliable, all statements and instructions must be unambiguous and understood in the same way by all participants. Few studies have attempted to compare the reliability of Thurstone and Likert scales directly; however, those that have done so concluded that Likert scales offered greater reliability (Edwards and Kenney, 1946; Barclay and Weaver, 1962). These studies also concluded that the use of a judging group was not necessary for the construction of a reliable attitude scale.
It is impossible to state categorically which method would be more valid; this would depend on whether the individual tool answered the research question. Validity could be assessed by determining whether the scale can differentiate between groups thought to differ on the attitude in question or by correlations with other reports that are theoretically related to the attitude object (Bowling, 2002).
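One widely used summary of internal consistency is Cronbach's alpha. It is not named in the sources cited here and is offered only as an illustration of how the correlation between items might be quantified. The sketch assumes that reverse scoring has already been applied and uses invented responses.

```python
from statistics import pvariance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (reverse scoring done)."""
    k = len(rows[0])                         # number of items
    items = list(zip(*rows))                 # transpose to per-item columns
    item_var = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - item_var / total_var)


# Hypothetical scores from five respondents on three items.
responses = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
    [3, 3, 3],
]
print(round(cronbach_alpha(responses), 2))
```

Values closer to 1 suggest that the items vary together and so may be measuring the same attitude; the threshold regarded as acceptable is a matter of convention and is not fixed by this calculation.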
In regard to his scale, Thurstone (1928) questioned whether judges could rate opinion statements without any bias; indeed a different set of judges might not arrive at the same ratings for statements. However, Upshaw (1965) determined that Thurstone’s methods provided scales that are valid.
Data types
The data commonly measured by Thurstone and Likert scales are of ordinal type. This means that data are ranked according to a certain characteristic but the difference between them cannot be measured accurately (Campbell et al. 2007). Thurstone’s equal-appearing interval scale initially appears to deliver interval data, by assigning equidistant scores to the statements. However, the resultant scores are based on the ranking of statements by a panel of judges and it is difficult to see how attitudes can be given an accurate numerical value. According to Thomas (1982) few, if any, psychological attitude scales are even-interval scales.
Ordinal type data must be analysed using non-parametric statistics. These methods are generally regarded as less powerful than those developed for use with interval or ratio data. The resulting analysis will provide descriptive data that will summarise and indicate significant points from the results. Parametric analysis should not be performed on rank-ordered data and inferences to a greater population cannot be made.
However, Bowling (2002) asserts that researchers often make assumptions about ordinal data (that intervals are equal) and apply parametric statistical analysis, albeit incorrectly. This enables the researchers to make more use of the data. However, the resulting conclusions must be questioned if this is done.
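A minimal sketch of rank-based description is given below. It summarises two invented groups of five-point responses using medians and frequency counts, and computes the Mann-Whitney U statistic as one example of a non-parametric comparison. The groups, the data and the choice of statistic are assumptions made for illustration, and no significance level is calculated.

```python
from collections import Counter
from statistics import median

# Hypothetical five-point responses from two groups of paramedics.
newly_qualified = [2, 3, 3, 2, 4, 1]
experienced = [4, 5, 3, 4, 5, 4]


def mann_whitney_u(x, y):
    """U statistic for x: pairs where x outranks y, ties counting one half."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


print(median(newly_qualified), median(experienced))
print(sorted(Counter(experienced).items()))  # frequency of each response
print(mann_whitney_u(newly_qualified, experienced))
```

Because the statistic depends only on the ordering of responses, it does not require the assumption that the intervals between response categories are equal.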
Use of the scales
In the context of studying paramedics’ attitudes towards supporting the suddenly bereaved, the decision as to which method is best to use rests with the researcher. Consideration of the aims of the study will be of paramount importance.
There is a paucity of research that considers the attitudes of paramedics to death or bereavement and there appear to be no pre-existing data collection tools specifically for this purpose.
The question remains therefore which of the two methods would be most appropriate to use in developing such a tool.
The Thurstone method might prove to be impractical to use, as a large pool of judges would be needed and the specialist nature of the paramedic role means that the judges would need to be practising paramedics. Using this group of staff to develop the tool would reduce the available population with which to conduct the study. The time consuming nature and lesser reliability of the Thurstone method account for the greater popularity of the Likert procedure for attitude measurement in health sciences (Petty and Cacioppo, 1981).
Paramedics would also be familiar with the Likert type of attitude scale given its popularity in health-related studies (Bowling, 2002). However, it is worth considering whether this popularity has bred ‘Likert scale contempt’: because there are so many attitude scale questionnaires, respondents might not read, assimilate and answer truthfully, or they may use an automatic response. This would mean that data obtained would not reflect true attitudes.
It is important to consider that measuring attitudes can be extremely difficult and respondents may not always be truthful with their responses (Parahoo, 1997). Paramedics may wish to hide their true beliefs and attitudes. This would significantly affect the data collected.
The context within which the scale is administered might be important. In the case of assessing paramedics’ attitudes toward death, it might be important for the questionnaire not to be given to them by their managers during shift time, as there may be concerns that the data obtained might be used against the paramedic. One of the limitations of both methods is the presence of the neutral or undecided response option. Respondents might use this option in order to conceal any extreme beliefs that they feel might be unpopular.
However, the number of alternatives is often manipulated: it is not unusual to see more response categories, and some researchers remove the neutral category altogether (Edelmann, 2000). These are issues that will need to be addressed by the researcher in the design of the scale and are beyond the scope of this piece to discuss fully.
Conclusion
Both of the attitude scales discussed are valid, reliable data collection methods that would be appropriate to investigate the attitudes of paramedics, but neither is without weakness. Considering the literature, it appears that the Likert scale is the superior of the two methods. However, there are other techniques that warrant consideration before deciding which to use for the purpose of the research in question.