Qatar has a population of more than 2.6 million (MDPS, 2016). Nearly 90% are migrant workers, mainly from Asia and North Africa (Goodman, 2015). Hamad Medical Corporation Ambulance Service (HMCAS) employs critical care paramedics (CCPs) and ambulance paramedics (Wilson et al, 2018), who received varied training in pain assessment and in pharmacological and non-pharmacological pain treatments in their home countries.
CCPs are recruited primarily from Western countries. In the prehospital environment in Qatar, CCPs work with ambulance paramedics from various linguistic and cultural backgrounds, including Tunisia, India, the Philippines, Jordan, Morocco, Egypt and Britain. Qatar's multinational population adds to the diversity of emergency medical care practice (Gangaram et al, 2017).
Globally, on average, four out of five (80%) of all patients seeking emergency medical service (EMS) help experience pain (Iqbal et al, 2015). Recent studies show that pain is poorly assessed in the prehospital setting (Lynde and Zorab, 2015). Delays in prehospital pain assessment and treatment are further prolonged in the emergency centre because of the initial triage processes (Hodkinson, 2016).
Researchers conducted two significant studies in California in the United States to determine the effects of an educational intervention (EI) on prehospital pain management (p<0.001) (French et al, 2006; 2013). In both studies, paramedics took part in a 3-hour EI, with surveys completed before the EI and 1 month after. In 2001, French et al (2006) reviewed 297 surveys and 439 EMS patient care reports (PCRs) with pain complaints. They found that, following the EI, paramedics' knowledge of basic pain management principles increased by 17.6 percentage points (57.3% to 74.9%). Their use of non-pharmacological pain therapies improved by 32.2 percentage points, and documentation of pain severity by 51.0 percentage points and of pain characteristics by 24.0 percentage points. Overall, assessment of pain following the EI improved by 13.0 percentage points. Even before the EI in 2007 (French et al, 2013), the researchers found improvements from 2001 in the basic knowledge of pain management by 18.2 percentage points, perceptions of pain by 9.2 percentage points and management of pain by 13.8 percentage points. The researchers concluded that continuing education in pain management is key to a more effective prehospital approach.
The appropriate assessment and treatment of pain in the prehospital setting in Qatar has been identified as a key performance indicator (KPI) by HMCAS. Currently, patients' pain is assessed by HMCAS paramedics using the Wong-Baker FACES Pain Rating Scale (Figure 1). This pain rating scale translates facial pain expression into a numerical pain scale (NPS) rating, which is then recorded on the electronic patient case record (ePCR) using a 0–10 numerical value. The tool was primarily designed for paediatric patients who are unable to verbalise their pain intensity score but can indicate their pain intensity using pictures.
HMCAS policy mandates that all patients presenting with a pain intensity score of 4 or more on the scale receive analgesia based on the clinical practice guidelines and the paramedic's scope of practice (HMCAS, 2021). However, recent HMCAS findings have indicated that the assessment of patients presenting with acute pain is suboptimal.
The use of the Wong-Baker FACES Pain Rating Scale at HMCAS has not been researched.
A plethora of evidence suggests that, once pain has been assessed and documented accurately, patients are more likely to receive appropriate analgesia. An Australian emergency centre study was conducted to assess pain score documentation and the treatment of pain (Furyk and Sumner, 2008). The researchers conducted a retrospective evaluation of 145 charts from patients with confirmed appendicitis. Pain scores were documented for 13 children and 79 adults. Eleven children and all 79 adults received intravenous morphine. The study suggested that, if pain is assessed and documented accurately, the likelihood of patients being given analgesia is increased.
A further retrospective, cross-sectional study was conducted on emergency medical service (EMS) patient care records (PCRs) after the introduction of a prehospital pain assessment protocol (McLean et al, 2004). Data extracted included verbal rating scale (VRS) scores, NPS scores and emergency call-related information. In total, 1227 PCRs were studied, of which 907 (75%) concerned non-trauma EMS transports. Two per cent (n=27) of the study population were unconscious. Pain was assessed using the EMS protocol in 1002 of 1200 (84%) patients. Of the 518 patients reporting pain, 104 (20%) completed a VRS but not an NPS. A total of 31% of patients reported moderate or severe pain. Prehospital pain assessment using a VRS and NPS was thus feasible.
In addition, studies show that ethnicity affects pain assessment (Todd et al, 2000; Tamayo-Sarver et al, 2003). This pilot research study was deemed to be of interest given the local context: cultural and language differences, the varying expectations of Qatar's population regarding treatment provided by EMS professionals and the latter's scope of practice, and suspected variations in the assessment of pain between HMCAS paramedics.
In addition, no inter-rater reliability studies that evaluated the use of the Wong-Baker FACES Pain Rating Scale on adult patients had been found. Searches of databases including Science Direct, Medline, EMBASE and CINAHL revealed that the use of the Wong-Baker FACES Pain Rating Scale has not been assessed in the prehospital setting on adult patients. Although the Wong-Baker FACES Pain Rating Scale was designed for use by paediatric patients to self-report pain intensity, at HMCAS it is also used for adult patients.
Methods
Study design
A prospective, quantitative pilot study was conducted. Primary data on the paramedics' assessment of pain was gathered using survey questionnaires following simulation-based interactions with five standardised adult patients.
Study setting, population and sample
This pilot study was conducted through HMCAS in Qatar, and approved by its medical research centre (16155/16).
To direct and coordinate emergency resources, HMCAS uses a hub-and-spoke model. The model was designed to ensure the public have rapid access to emergency care. The country has six hubs with 29 spokes (Wilson et al, 2018).
A sample size of 3.0% (35/1159) of ambulance paramedics and CCPs was deemed the minimum for this inter-rater reliability pilot study. Participant recruitment was randomised based on staff presence at the various locations during the study data collection period.
Study protocol
Five members of staff from the HMCAS training department were prepared to act as standardised adult patients presenting with different reference levels of pain. These ‘patients’ were taken to all HMCAS hubs and spokes over a period of 2 weeks.
All paramedics at these locations were invited to participate voluntarily in the study. No advance invitations were circulated, so that prospective participants could not refresh their knowledge of the Wong-Baker FACES Pain Rating Scale beforehand. On the day of data collection, information letters regarding the study were circulated to all prospective participants. Only those who consented to take part were recruited into the study.
The data collection tool included demographic questions and a section in five parts regarding pain scoring for the different cases. The order of patient presentation was determined using a randomisation table. The paramedics then had to assess the patients' pain using the Wong-Baker FACES Pain Rating Scale and record the score on the data collection tool for each case.
Participants were required to explain the procedure to patients, obtain their consent, explain the Wong-Baker FACES Pain Rating Scale, and have the patient identify their pain intensity score.
Anonymised completed questionnaires were placed in a sealed box. The data-collection process did not affect the paramedics' availability to respond to emergency calls.
The patient scenarios were validated by a focus group comprising instructors from the HMCAS training department, consultant paramedics from HMCAS and academics from the Durban University of Technology's Department of Emergency Medical Care and Rescue.
The same actors were used for the simulated adult patient scenarios throughout the data collection process. The five cases included:
Data analysis
Microsoft Excel was used as the primary analytical software. An add-in analysis tool from Real Statistics was used to supplement statistical computations. A 95% confidence interval and a significance level of α=0.05 were chosen; results were considered statistically significant where p<α.
Inter-rater reliability is the degree of agreement between raters; in this study, it was how close the pain scores (ratings) given by each participant paramedic for each patient were to each other.
The data obtained were ranked and therefore ordinal, requiring mostly non-parametric statistical measures. Furthermore, only a single rating was awarded by each rater on each patient at only one time. A confusion matrix was applied to determine sensitivity, specificity, over-rating, under-rating and degree of variance as they were important measures to give direction and perspective to the inter-rater reliability results.
As there were 35 participants, each scoring five cases, there were 175 pain scores in total to be analysed and compared.
Results
There were 30 (85.7%) male and 5 (14.3%) female participants. Thirty-two (91.4%) of the 35 participants were ambulance paramedics and three (8.6%) were CCPs. Thirteen (37.1%) of the participants had received their initial degree clinical training in Tunisia, seven (20.0%) in Jordan, five (14.3%) in India, four (11.4%) in the Philippines, two (5.7%) each in Morocco and South Africa, and one (2.9%) each in the United States and Yemen. The mean number of years of practising in Qatar was 5 (in a range of 1–14 years).
The researchers observed that the participants were scoring the standardised patients' pain intensity based on their facial expression of pain rather than the patients selecting the desired face on the Wong-Baker FACES Pain Rating Scale.
Overall, Fleiss' kappa values indicate only a poor to slight agreement of the allocated pain scores among participants (Table 1). Agreement was poor not only overall but also for the individual patient cases. Only the case where the pain score was 10/10 achieved moderate-to-good agreement between raters (74.3%) (Table 2).
Pain score | Kappa | CI (95%) |
---|---|---|
Total | 0.146 | (0.132–0.160) |
0 | No Value | |
1 | –0.005 | (–0.041–0.030) |
2 | 0.019 | (–0.016–0.055) |
3 | 0.202 | (0.166–0.238) |
4 | 0.062 | (0.026–0.098) |
5 | 0.027 | (–0.008–0.063) |
6 | 0.017 | (–0.018–0.053) |
7 | 0.076 | (0.040–0.112) |
8 | –0.004 | (–0.040–0.031) |
9 | 0.037 | (0.001–0.073) |
10 | 0.501 | (0.465–0.537) |
Pain score | All cases (n=175) n | All cases % | Case A (7/10) n | Case A % | Case B (10/10) n | Case B % | Case C (4/10) n | Case C % | Case D (2/10) n | Case D % | Case E (6/10) n | Case E %
---|---|---|---|---|---|---|---|---|---|---|---|---
0 |  |  |  |  |  |  |  |  |  |  |  |  
1 | 1 | 0.6 |  |  |  |  |  |  | 1 | 2.9 |  |  
2 | 7 | 4.0 |  |  |  |  | 2 | 5.7 | 4 | 11.4 | 1 | 2.9
3 | 28 | 16.0 | 2 | 5.7 |  |  | 13 | 37.1 | 13 | 37.1 |  |  
4 | 28 | 16.0 | 2 | 5.7 |  |  | 8 | 22.9 | 9 | 25.7 | 9 | 25.7
5 | 26 | 14.9 | 4 | 11.4 | 1 | 2.9 | 9 | 25.7 | 4 | 11.4 | 8 | 22.9
6 | 15 | 8.6 | 6 | 17.1 | 1 | 2.9 | 2 | 5.7 | 1 | 2.9 | 5 | 14.3
7 | 19 | 10.9 | 9 | 25.7 | 1 | 2.9 | 1 | 2.9 | 1 | 2.9 | 7 | 20.0
8 | 11 | 6.3 | 4 | 11.4 | 3 | 8.6 |  |  | 2 | 5.7 | 2 | 5.7
9 | 7 | 4.0 | 4 | 11.4 | 3 | 8.6 |  |  |  |  |  |  
10 | 33 | 18.9 | 4 | 11.4 | 26 | 74.3 |  |  |  |  | 3 | 8.6
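The kappa values in Table 1 can be approximated from the score distribution in Table 2. The following is a minimal sketch rather than the authors' analysis, which was run in Microsoft Excel with the Real Statistics add-in. It assumes the standard Fleiss' kappa formulas, treats each simulated case as one subject rated by 35 raters, and uses illustrative names (`COUNTS`, `fleiss_kappa`, `category_kappa`) that do not come from the study.

```python
# Ratings per simulated case, read from the score distribution in Table 2.
# List index = pain score 0..10; value = number of the 35 raters giving it.
COUNTS = {
    "A": [0, 0, 0, 2, 2, 4, 6, 9, 4, 4, 4],
    "B": [0, 0, 0, 0, 0, 1, 1, 1, 3, 3, 26],
    "C": [0, 0, 2, 13, 8, 9, 2, 1, 0, 0, 0],
    "D": [0, 1, 4, 13, 9, 4, 1, 1, 2, 0, 0],
    "E": [0, 0, 1, 0, 9, 8, 5, 7, 2, 0, 3],
}


def fleiss_kappa(rows):
    """Overall Fleiss' kappa for a subjects x categories count matrix."""
    n_subjects = len(rows)
    n_raters = sum(rows[0])
    # Observed agreement: mean share of agreeing rater pairs per subject.
    p_obs = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in rows
    ) / n_subjects
    # Chance agreement from the pooled category proportions.
    total = n_subjects * n_raters
    p_exp = sum((sum(col) / total) ** 2 for col in zip(*rows))
    return (p_obs - p_exp) / (1 - p_exp)


def category_kappa(rows, j):
    """Fleiss' kappa for one category j; None if the category was never used."""
    n_subjects = len(rows)
    n_raters = sum(rows[0])
    p = sum(row[j] for row in rows) / (n_subjects * n_raters)
    if p in (0.0, 1.0):
        return None
    disagreement = sum(row[j] * (n_raters - row[j]) for row in rows)
    return 1 - disagreement / (
        n_subjects * n_raters * (n_raters - 1) * p * (1 - p)
    )


rows = list(COUNTS.values())
kappa_total = fleiss_kappa(rows)       # compare with the "Total" row of Table 1
kappa_10 = category_kappa(rows, 10)    # compare with the pain score 10 row
print(round(kappa_total, 3), round(kappa_10, 3))
```

A score of 0 was never allocated, so `category_kappa(rows, 0)` returns `None`, which corresponds to the "No Value" entry in Table 1.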
Each simulated case had a predetermined reference pain score (i.e. rank) so the correlation of ratings distributed among these cases provided some reference point. In all cases, for both Spearman's and Kendall's correlation coefficients, values remained below 0.50, indicating a poor correlation of pain score distributions throughout all the cases (Table 3). The distribution of pain score allocations was equally varied across participants, signifying that they were equally poor at agreeing on or allocating the correct pain score throughout the group.
Correlation | A | B | C | D | E |
---|---|---|---|---|---|
Spearman's | |||||
A | - | 0.05 | 0.29 | 0.43 | 0.47 |
B | 0.05 | - | 0.02 | –0.14 | 0.20 |
C | 0.29 | 0.02 | - | 0.04 | 0.38 |
D | 0.43 | –0.14 | 0.04 | - | 0.29 |
E | 0.47 | 0.20 | 0.38 | 0.29 | - |
Kendall's | |||||
A | - | 0.04 | 0.21 | 0.32 | 0.35 |
B | 0.04 | - | 0.01 | –0.12 | 0.17 |
C | 0.21 | 0.01 | - | 0.04 | 0.31 |
D | 0.32 | –0.12 | 0.04 | - | 0.23 |
E | 0.35 | 0.17 | 0.31 | 0.23 | - |
Interclass | Raters | 0.67 (0.41–0.95) | |||
Interclass | Cases | 0.08 (0.01–0.21) |
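The rank correlations in Table 3 compare the scores that the same raters gave to two different cases. The sketch below shows one way to compute such coefficients using only the Python standard library. It assumes tie-averaged ranks for Spearman's coefficient and the tau-b variant of Kendall's coefficient; the source does not state which variants were used. The rater-level data are not published, so the example scores are hypothetical and the function names are illustrative.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither list is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall's tau-b, which adjusts for ties; assumes non-constant lists."""
    concordant = discordant = ties_x = ties_y = 0
    for i, j in combinations(range(len(x)), 2):
        dx, dy = x[i] - x[j], y[i] - y[j]
        if dx == 0 and dy == 0:
            continue  # tied on both variables: excluded from every term
        if dx == 0:
            ties_x += 1
        elif dy == 0:
            ties_y += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    pairs = concordant + discordant
    return (concordant - discordant) / sqrt((pairs + ties_x) * (pairs + ties_y))


# Hypothetical scores from the same six raters for two cases (not study data).
case_a = [7, 8, 6, 9, 7, 5]
case_e = [6, 7, 4, 8, 5, 5]
rho = spearman_rho(case_a, case_e)
tau = kendall_tau_b(case_a, case_e)
print(f"rho={rho:.2f}, tau={tau:.2f}")
```

Pain scores on a 0–10 scale produce many tied values, which is why a tie-aware form of each coefficient is sketched here.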
The null hypothesis for this study is that there is no significant difference between the raters or between the cases. If p<α (α=0.05) and the F statistic exceeds the critical F value, the null hypothesis can be rejected; otherwise, it cannot. The ANOVA results were: for raters, p=3.381×10⁻⁵ (<0.05) and F=2.667 (>F-critical of 1.516); for cases, p=5.88×10⁻³⁹ (<0.05) and F=97.479 (>F-critical of 2.438).
Therefore, the null hypothesis can be rejected as there is a significant difference in pain score allocations between the raters and between the cases. The difference between cases is expected, as a different predetermined reference standard was set for each case. The difference between raters indicates that the participants were not reliable in allocating pain scores based on the Wong-Baker FACES Pain Rating Scale.
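The ANOVA described above tests two effects at once, raters and cases. A minimal sketch of how the two F statistics arise follows. It assumes a two-way ANOVA without replication (one score per rater per case), which is an inference from the description and is not stated in the source. The small matrix is hypothetical, and `two_way_anova` is an illustrative name.

```python
def two_way_anova(ratings):
    """Two-way ANOVA without replication for a raters x cases matrix.

    Returns the F statistics and degrees of freedom for the rater (row)
    and case (column) effects. Critical values and p-values are not
    computed here; they require the F distribution.
    """
    n_raters, n_cases = len(ratings), len(ratings[0])
    grand_mean = sum(map(sum, ratings)) / (n_raters * n_cases)
    rater_means = [sum(row) / n_cases for row in ratings]
    case_means = [sum(col) / n_raters for col in zip(*ratings)]

    # Partition the total sum of squares into raters, cases and residual.
    ss_raters = n_cases * sum((m - grand_mean) ** 2 for m in rater_means)
    ss_cases = n_raters * sum((m - grand_mean) ** 2 for m in case_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_raters - ss_cases

    df_raters, df_cases = n_raters - 1, n_cases - 1
    df_error = df_raters * df_cases
    ms_error = ss_error / df_error
    return {
        "F_raters": (ss_raters / df_raters) / ms_error,
        "F_cases": (ss_cases / df_cases) / ms_error,
        "df_raters": df_raters,
        "df_cases": df_cases,
        "df_error": df_error,
    }


# Hypothetical scores: three raters (rows) by three cases (columns).
demo = [
    [6, 10, 5],
    [7, 9, 2],
    [8, 8, 8],
]
result = two_way_anova(demo)
print(result)
```

Under this assumed design, 35 raters and five cases give 34 degrees of freedom for raters, 4 for cases and 136 for the residual.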
The confusion matrix indicates similar results to what was found through the kappa and correlation statistics (Table 4). It further describes the distribution of scores as seen in Table 2. Overall, sensitivity is poor to very poor throughout, except for case B (the 10/10 pain reference score) where 74.3% is regarded as good sensitivity (Fleiss and Cohen, 1973; Landis and Koch, 1977). Sensitivity as an exclusionary measure in this instance shows how poorly the Wong-Baker FACES Pain Rating Scale was applied by these participants as they could not accurately determine the correct pain score.
Case | Sensitivity (%) | Specificity (%) | Under-score (%) | Over-score (%) |
---|---|---|---|---|
All | 29.7 | 92.9 | 32.6 | 37.7 |
A | 25.7 | 92.9 | 40.0 | 34.3 |
B | 74.3 | 95.0 | 25.7 | |
C | 22.9 | 85.7 | 42.9 | 34.3 |
D | 11.4 | 97.9 | 2.9 | 85.7 |
E | 14.3 | 92.9 | 51.4 | 34.3 |
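The values in Table 4 can be recovered from the score counts in Table 2. The sketch below is a reconstruction under stated assumptions, not the authors' spreadsheet. Sensitivity is taken as the share of a case's 35 scores that equal its reference score. Specificity is taken as the share of the 140 scores given to the other four cases that do not equal that reference score; this definition is an assumption, chosen because it reproduces the reported figures. The names `COUNTS`, `REFERENCE` and `case_measures` are illustrative.

```python
# Ratings per simulated case, read from the score distribution in Table 2.
# List index = pain score 0..10; value = number of the 35 raters giving it.
COUNTS = {
    "A": [0, 0, 0, 2, 2, 4, 6, 9, 4, 4, 4],
    "B": [0, 0, 0, 0, 0, 1, 1, 1, 3, 3, 26],
    "C": [0, 0, 2, 13, 8, 9, 2, 1, 0, 0, 0],
    "D": [0, 1, 4, 13, 9, 4, 1, 1, 2, 0, 0],
    "E": [0, 0, 1, 0, 9, 8, 5, 7, 2, 0, 3],
}
# Predetermined reference pain score for each case (Table 2 column headings).
REFERENCE = {"A": 7, "B": 10, "C": 4, "D": 2, "E": 6}


def case_measures(case):
    """Sensitivity, specificity, under-score and over-score (%) for one case."""
    counts, ref = COUNTS[case], REFERENCE[case]
    n = sum(counts)
    others = [c for c in COUNTS if c != case]
    # Scores given to other cases that coincide with this case's reference.
    false_pos = sum(COUNTS[c][ref] for c in others)
    negatives = sum(sum(COUNTS[c]) for c in others)
    return {
        "sensitivity": 100 * counts[ref] / n,
        "specificity": 100 * (negatives - false_pos) / negatives,
        "under": 100 * sum(counts[:ref]) / n,
        "over": 100 * sum(counts[ref + 1:]) / n,
    }


measures = {case: case_measures(case) for case in COUNTS}
for case, m in measures.items():
    print(case, {k: round(v, 1) for k, v in m.items()})
```

Because the reference score for case B is the top of the scale, no score can exceed it, which is why its over-score cell in Table 4 is empty.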
Specificity, on the other hand, was very good throughout as an inclusionary measure indicating poor delineation of pain through varied case presentations. The under- and over-score values coupled with the sensitivity and specificity values clearly indicate that the participants were not able to allocate the correct pain scores and, notably, their scores were spread widely throughout the Wong-Baker FACES Pain Rating Scale (Table 2).
The offsets, i.e. the number of scale levels between each allocated score and the reference standard, were clustered around the reference standard (also evident in Table 2). Most of the allocations were 1–2 pain score levels away from the predetermined reference standard, but not in a normal distribution. However, some of the higher reference pain levels received lower scores, and vice versa (Table 5).
Case | Under-score (>–2) n | Under-score (>–2) % | Under-score (–2) n | Under-score (–2) % | Under-score (–1) n | Under-score (–1) % | Correct (0) n | Correct (0) % | Over-score (+1) n | Over-score (+1) % | Over-score (+2) n | Over-score (+2) % | Over-score (>+2) n | Over-score (>+2) %
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
A | 4 | 11.4 | 4 | 11.4 | 6 | 17.1 | 9 | 25.7 | 4 | 11.4 | 4 | 11.4 | 4 | 11.4
B | 3 | 8.6 | 3 | 8.6 | 3 | 8.6 | 26 | 74.3 |  |  |  |  |  |  
C |  |  | 2 | 5.7 | 13 | 37.1 | 8 | 22.9 | 9 | 25.7 | 2 | 5.7 | 1 | 2.9
D |  |  |  |  | 1 | 2.9 | 4 | 11.4 | 13 | 37.1 | 9 | 25.7 | 8 | 22.9
E | 1 | 2.9 | 9 | 25.7 | 8 | 22.9 | 5 | 14.3 | 7 | 20.0 | 2 | 5.7 | 3 | 8.6
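The offset counts in Table 5 follow from the score counts in Table 2 by subtracting each case's reference score from every allocated score. A minimal sketch of that binning is given below, with illustrative names (`COUNTS`, `REFERENCE`, `offset_bins`). The label `<-2` in the code corresponds to the column headed ">–2" under "Under-score" in Table 5, i.e. scores more than two levels below the reference.

```python
# Ratings per simulated case, read from the score distribution in Table 2.
# List index = pain score 0..10; value = number of the 35 raters giving it.
COUNTS = {
    "A": [0, 0, 0, 2, 2, 4, 6, 9, 4, 4, 4],
    "B": [0, 0, 0, 0, 0, 1, 1, 1, 3, 3, 26],
    "C": [0, 0, 2, 13, 8, 9, 2, 1, 0, 0, 0],
    "D": [0, 1, 4, 13, 9, 4, 1, 1, 2, 0, 0],
    "E": [0, 0, 1, 0, 9, 8, 5, 7, 2, 0, 3],
}
# Predetermined reference pain score for each case.
REFERENCE = {"A": 7, "B": 10, "C": 4, "D": 2, "E": 6}
LABELS = ["<-2", "-2", "-1", "0", "+1", "+2", ">+2"]


def offset_bins(counts, ref):
    """Group one case's scores by signed distance from its reference score."""
    bins = dict.fromkeys(LABELS, 0)
    for score, n in enumerate(counts):
        # Clamp to [-3, 3]: the outer bins collect every offset beyond 2.
        offset = max(-3, min(3, score - ref))
        bins[LABELS[offset + 3]] += n
    return bins


table = {case: offset_bins(COUNTS[case], REFERENCE[case]) for case in COUNTS}
for case, bins in table.items():
    print(case, bins)
```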
Discussion
This pilot study aimed to determine the inter-rater reliability of the Wong-Baker FACES Pain Rating Scale when applied to adult patients in the prehospital setting. Participants in this study scored patients' pain intensity using the tool based on their facial expressions of pain rather than the patients themselves identifying their pain score using the Wong-Baker FACES Pain Rating Scale.
Overall, the inter-rater reliability as determined through Fleiss' kappa indicated only a poor-to-slight agreement of the allocated pain scores, as described against the reference standards.
There was a wide grouping of the pain score levels around the reference standard. Most of the allocations were 1–2 pain score levels away from the reference standard, although not in a normal distribution, with some of the higher reference pain levels receiving lower scores and vice versa. Ideally, if the standardised patients scored their pain intensity, their scores would have been the same throughout. Overall, sensitivity was poor-to-very poor throughout.
Being able to assess the intensity of pain is essential to its effective management in the prehospital setting (Garra et al, 2010). Contextually, paramedics must overcome critical barriers such as environmental factors, communication differences, cultural assumptions, bias and ineffective use of the Wong-Baker FACES Pain Rating Scale to successfully assess pain. A plethora of evidence demonstrates that paramedics can use various pain assessment tools and scales and, based on their scopes of practice, manage patients' pain appropriately.
Ambulance paramedics and CCPs in Qatar undergo extensive training on the use of the Wong-Baker FACES Pain Rating Scale during their induction programme. These clinicians are then assessed theoretically and practically on its use before they are certified to use it in operational duty. Furthermore, memory aids in the form of laminated cards are distributed to all paramedics during their induction training to assist them with effective pain assessment. The Wong-Baker FACES Pain Rating Scale is also contained in the HMCAS clinical practice guidelines, which are hosted on the ePCR system for quick reference.
Although the tool was standard, the inter-rater reliability results demonstrate that this sample had poor agreement when allocating pain scores using the Wong-Baker FACES Pain Rating Scale, except for one standardised adult patient where the agreement was moderate to good. This scale, when used with adults, and not in children as intended, resulted in a poor assessment of pain intensity in five simulated patients (French et al, 2013).
The pilot study findings could be attributable to specific barriers within the prehospital setting in Qatar. Communicating with patients is challenging in certain instances. Although HMCAS makes every effort to ensure paramedic teams are multilingual, patients from certain linguistic groups are disadvantaged. A failure to identify and understand non-verbal cues may also reduce paramedics' ability to assess pain and use the tool as intended. There are also prevalent assumptions about the pain tolerance of certain ethnic groups and nationalities; irrespective of these, clinicians should assess patients' pain according to their reported level of discomfort rather than base it on subjective assumptions.
The paramedics' knowledge of the use of the Wong-Baker FACES Pain Rating Scale for assessing pain is also in doubt. Further EIs on the correct use and interpretation of the Wong-Baker FACES Pain Rating Scale are required at HMCAS to meet the International Patient Safety Goals as set out by the Joint Commission International (Bener and Al Mazroei, 2010). Essentially, achieving effective pain relief and patient comfort is critical to efficient emergency medical care. Inaccurate pain scores and an inability to assess pain may translate into poor or incorrect treatment (Schyve, 2007).
In addition, the present study shows that the Wong-Baker FACES Pain Rating Scale may be an inaccurate tool to determine the intensity of pain levels in adults. Furthermore, if used incorrectly, it will not yield an accurate pain intensity score. More commonly used tools for assessing pain intensity in adults include NPS and VRS (Hjermstad et al, 2011).
Limitations
This was a pilot study, so it involved only a limited number of clinicians and standardised patient cases. For statistical analysis, this means there are problems when applying Fleiss' kappa and correlation coefficients; the picture can be skewed if data are limited and vary. Six of the 10 possibilities did not have a predetermined reference simulation case; thus, the ranking order was inconsistent. The results apply only to this sample of cases at a single point in time.
Further research to include a variety of reference cases inclusive of all the possible pain score allocations, and possibly multiples thereof, would be recommended to get a clearer picture of the phenomenon.
Test-retest could also be considered to factor in the effect of possible training on the use of the Wong-Baker FACES Pain Rating Scale. Although the pain rating scale is specifically included in the HMCAS clinical practice guidelines, its instructions for use during patient care need to be reviewed to ensure they are explicit that it is the patient who is meant to show the facial expression corresponding to their level of pain.
Conclusion
For this study population, participants were not accurate in determining the correct pain score. Not only were they inaccurate, they were also unable to agree on the pain score to be allocated (regardless of the predetermined reference standard) and thus were not precise.
Pain score allocations were widely spread throughout the scale, showing poor consistency and inter-rater agreement on the score to be allocated. This could be attributed to the inaccurate use of the tool. This study found that the participating paramedics were unreliable in applying the Wong-Baker FACES Pain Rating Scale to determine the pain levels of these five simulated cases.
Since pain scores have a direct impact on treatment, it is concerning that incorrect pain score allocations can lead to under- or over-management of pain and administration of analgesic therapy.
Further training for HMCAS staff on the Wong-Baker FACES Pain Rating Scale is recommended. A further study should then be conducted with a larger sample size to determine their ability to accurately use the Wong-Baker FACES Pain Rating Scale in determining the adult patient's pain intensity.
HMCAS should also consider changing to an adult pain rating scale in the prehospital care setting with a culturally diverse patient population.