Background

The Work Limitations Questionnaire-25 (WLQ-25) and the Work Instability Scale for Rheumatoid Arthritis (RA-WIS) have been used to measure at-work disability related to musculoskeletal disorders. However, a recent systematic review has shown that important psychometric properties still needed to be evaluated.

Objective

The purpose of this study was to establish the validity and responsiveness of the WLQ-25 and RA-WIS in people with chronic work-related upper-extremity disorders.

Design

Two hundred six participants with chronic upper-extremity disorders who attended a specialty clinic operated by the Workplace Safety & Insurance Board of Ontario were evaluated at their initial visit and 6 months later.

Methods

Questionnaires completed at each evaluation were the WLQ-25, the RA-WIS, the QuickDASH, the pain subscale of the Shoulder Pain and Disability Index, and the Chronic Pain Grade Questionnaire. At the 6-month evaluation, participants completed a global rating of change question. Known-group and construct convergent validity were assessed using analysis of variance and Pearson correlations, and standardized response means (SRMs) were used to assess responsiveness. Clinically important differences (CIDs) also were determined.

Results

The WLQ-25 and RA-WIS had low to moderate correlations with pain and disability scales (.28<r<.62) and discriminated among different functional categories (P<.001). For improved participants, the WLQ-25 (SRM=0.65 for summed score, SRM=0.63 for index score) and the RA-WIS (SRM=0.66) demonstrated moderate responsiveness. The CID for improvement was estimated to be 13/100 points for the WLQ-25 summed score, 5/28.6 points for the WLQ-25 index score, and 4/23 points for the RA-WIS.

Limitations

The external criterion of change was specific to change in upper-extremity condition and not to change in work ability or productivity.

Conclusions

The WLQ-25 and RA-WIS provide different information from that provided by pain and disability measures. They discriminate among functional outcome subgroups and detect improvement over time in people with chronic work-related upper-extremity disorders.

Musculoskeletal disorders often lead to work disabilities in the form of both time away from work (absenteeism) and difficulties experienced on the job (presenteeism).1,2 Presenteeism reflects the phenomenon of loss of work productivity in terms of the quantity or quality of work done due to illness or injury in people who are present at their job.3–5 Although work absenteeism is readily definable and commonly measured through administrative records,1,2 presenteeism is a more abstract concept, as it tries to quantify less than full output in work situations that are not standardized.

Presenteeism usually is assessed by self-report questionnaires.6 The Work Limitations Questionnaire-25 (WLQ-25) and the Work Instability Scale for Rheumatoid Arthritis (RA-WIS) are presenteeism scales that have been used to evaluate workers with musculoskeletal disorders.7–10 The WLQ-25 measures the impact of chronic health conditions on job performance and work productivity,11 whereas the RA-WIS is designed to assess the risk of work disability based on the degree of mismatch between functional abilities and workplace demands.12 In 2007, a systematic review was conducted to evaluate the psychometric properties of presenteeism questionnaires for musculoskeletal disorders.2 Six scales were identified, and the review concluded that none had sufficient supporting psychometric evidence.2 The WLQ-25 and RA-WIS received the lowest psychometric ratings because important properties, such as responsiveness and validity, had not yet been established. Since that systematic review, studies have shown that the WLQ-25 and RA-WIS correlate moderately (.70>r>.50) with other disability measures and differentiate among various functional levels.6,9,13–15 The WLQ-25 and RA-WIS also have shown low to moderate responsiveness (standardized response mean [SRM]=0.28–0.64) in workers with rheumatoid arthritis who reported improvement in work ability and productivity.16

Most pain and disability scales used in musculoskeletal practice focus on activities of daily life. For this reason, they may not capture important participation restrictions experienced at work. Thus, presenteeism scales may provide a useful perspective because they capture more specific and subtle difficulties that workers encounter at their job that might be overlooked by generic scales. Potential applications for presenteeism scales include determining readiness to return to work, progress during treatment, risk of absenteeism, or chronic work disablement.2 Because these different measurement purposes rely on specific measurement properties, psychometric evidence is required before clinicians can be confident in using presenteeism scales for these different purposes. Therefore, the objective of this study was to enhance psychometric evidence of the WLQ-25 and RA-WIS by determining validity and by estimating responsiveness in people with chronic work-related upper-extremity disorders.

Method

Participants and Study Design

Six hundred fourteen consecutive participants who attended upper-extremity specialty clinics were enrolled in a prospective cohort study. These clinics are part of specialty programs offered by the Workplace Safety and Insurance Board (WSIB) of Ontario that are available to injured workers who may be experiencing difficulty in their return-to-work process. The WSIB is the sole provider of workers' compensation and benefits related to work injuries or diseases for workers in Ontario. An injured worker is considered for referral to these programs if a work-related upper-extremity injury has been recognized and a timely and satisfactory recovery from the injury has not occurred. Other inclusion criteria for the study were to be able to complete questionnaires in English and to provide written informed consent.

Because the program is a tertiary-level program of the WSIB, referrals were made by the caseworker. Recommendations to the caseworker by clinicians (physical therapists or occupational therapists outside of the tertiary-level program) involved in the case, suspicions of a need for surgery by the physician, or unsuccessful attempts to return to work (not back to work 12 weeks after injury) typically would precipitate such a referral. At the clinic, all participants were assessed by an upper-extremity surgeon. Thereafter, according to the surgeon's evaluation, they were referred for surgery, physical therapy, occupational therapy, or psychotherapy.

The participants were evaluated using self-report questionnaires at their initial visit to the specialty clinic and 6 months later. The questionnaires (Tab. 1) were: (1) 2 presenteeism scales, the WLQ-2511 and RA-WIS12; (2) 1 upper-extremity disability scale, the QuickDASH17; (3) 1 pain scale, the pain subscale of the Shoulder Pain and Disability Index (SPADI-P)18; and (4) 1 chronic pain scale, the Chronic Pain Grade Questionnaire (CPG).19 Because presenteeism scales assess at-work difficulty, only participants working at the time of both assessments were included in the analysis.

Table 1

Description of the Scales [a]

| Scale | No. of Items | Range of Scores | Type of Scale | Time Frame | Item Content | Response Options | Interpretation of Scores |
|---|---|---|---|---|---|---|---|
| RA-WIS | 23 | 0–23 | Dichotomous | At the moment | Mismatch between abilities and job demands | Yes/no | Higher score=higher risk of work disability |
| WLQ-25 | 25 | 0–28.6 (index score) or 0–100 (summed score) | 5-point Likert scale, plus a “does not apply to my job” option | Previous 2 weeks | | “All of the time” to “none of the time” | Higher score=greater productivity loss at work |
| WLQ-25: Time-management demands | 5 | 0–100 [b] | | | Difficulty handling time and scheduling demands | | |
| WLQ-25: Physical demands | 6 | 0–100 | | | Ability to perform job tasks involving strength, movement, and flexibility | | |
| WLQ-25: Mental-interpersonal demands | 9 | 0–100 [b] | | | Difficulty handling cognitive job tasks and social interactions | | |
| WLQ-25: Output demands | 5 | 0–100 [b] | | | Diminished work quantity and quality | | |
| QuickDASH | 11 | 0–100 | 5-point Likert scale | Previous week | Ability to do activities or severity of symptoms | “No difficulty” or “not at all” or “not limited at all” to “unable” or “extremely” | Higher score=greater disability |
| SPADI-P | 5 | 0–100 | 11-point Likert scale | Previous week | Severity of pain | “No pain” to “worst pain imaginable” | Higher scores=greater pain |
| CPG | 7 | Grade 0–IV | 11-point Likert scale | Previous 6 months | | | Higher grades=greater chronic disability and limitations |
| CPG: Characteristic pain intensity | 3 | 0–100 | | | Intensity of pain | “No pain” to “pain as bad as could be” | |
| CPG: Disability score | 3 | 0–100 | | | Interference of pain with activities | “No interference” to “unable to carry on any activities” | |

[a] RA-WIS=Work Instability Scale for Rheumatoid Arthritis, WLQ-25=Work Limitations Questionnaire-25, SPADI-P=pain subscale of the Shoulder Pain and Disability Index, CPG=Chronic Pain Grade Questionnaire.

[b] Scores are reversed.

At the 6-month evaluation, participants completed a global rating of change (GRC) question using an 11-point rating scale (0=“much worse,” 5=“no change,” 10=“a lot better”): “Think about your injury now compared with when you completed the first questionnaire package. How would you rate the change in your problem overall?”20 Three other questions also were answered at the 6-month evaluation: (1) “How would you rate the ongoing effects of your injury?” (5-point rating scale); (2) “How would you rate your ability to do your paid work over the last week?” (7-point rating scale); and (3) “Are the effects of having an injury now at a level where you can ignore or cope with them and do whatever it is you need to do in your daily life?” (yes/no).

Outcome Measures

WLQ-25

The WLQ-25 asks respondents to rate their level of difficulty in performing or ability to perform specific job demands over the previous 2 weeks, given their current physical health and emotional problems. Those demands are grouped into 4 types8,11: time management (n=5), physical (n=6), mental-interpersonal (n=9), and output (n=5). Responses are endorsed as a percentage of time, reflecting the amount of time that the respondents felt limited in their ability to perform each category of job demands. Up to 50% missing scores are permitted for the calculation of the scores, as recommended by scale developers. Two scores were used in this study: the summed score, an overall score equal to the average of all of the items rescaled to a 0 to 100 scale, and the index score, calculated using an algorithm that converts subscale scores into an estimate of productivity loss.21 Higher scores indicate greater at-work productivity loss.
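The summed score described above can be illustrated with a short Python sketch. It is not the developers' scoring algorithm: the 0-to-4 item coding, the function name, and the treatment of "does not apply to my job" answers as missing are all assumptions made for illustration, and the 50% rule is applied here to the whole questionnaire rather than to each subscale.

```python
from typing import Optional, Sequence


def wlq_summed_score(items: Sequence[Optional[int]],
                     max_item: int = 4) -> Optional[float]:
    """Average the answered items and rescale the result to 0-100.

    Assumption (hypothetical coding): each item is already coded
    0..max_item with higher = more limitation, and None marks a
    missing or "does not apply to my job" answer.
    """
    answered = [x for x in items if x is not None]
    # Up to 50% missing answers are tolerated; beyond that, no score.
    if not answered or len(answered) * 2 < len(items):
        return None
    return 100.0 * sum(answered) / (max_item * len(answered))


if __name__ == "__main__":
    # 25 items, 5 of them missing, all answered items at mid-scale.
    print(wlq_summed_score([2] * 20 + [None] * 5))
```

The index score is not sketched because the text only states that it is derived from the subscale scores by an algorithm.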

RA-WIS

Work instability is defined as a state arising from a mismatch between an individual's functional abilities and the demands of his or her job.12 The RA-WIS consists of 23 questions with dichotomous (yes/no) response options. The scale is scored by summing the scores of all 23 items (higher scores indicate higher risk of work disability). Originally, the RA-WIS was developed for individuals with rheumatoid arthritis,12 but it has been used to evaluate workers with other musculoskeletal disorders.9,16,22,23

QuickDASH

The QuickDASH evaluates physical disability and symptoms of the upper extremity in individuals with upper-extremity disorders.17 The score ranges from 0 (no disability) to 100 (most severe disability). The psychometric properties of the QuickDASH have been established.17,24–26

SPADI-P

The Shoulder Pain and Disability Index measures pain and disability associated with shoulder pathology.18 In this study, only the pain subscale was completed. The total pain score ranges from 0 (“no pain”) to 100 (“worst pain imaginable”). The SPADI-P has been shown to be valid and reliable.27

CPG

The CPG measures the severity of chronic pain.19 It classifies respondents into hierarchical pain grades: grade 0 (pain-free) to grade IV (high disability-severely limiting). It includes subscale scores for characteristic pain intensity and disability score. The CPG has been shown to be reliable and valid.19,28–30

Statistical Analyses

Psychometric analyses related to validity and responsiveness were conducted with SPSS software, version 17.* The alpha level was set at .05. A priori hypotheses were established. They are presented after each psychometric property. Independent t tests and chi-square tests were used to compare the participants working at baseline who did or did not participate at the follow-up visit.

Validity

Floor and ceiling effects are the extent to which scores cluster near the less (floor) or more (ceiling) desirable health state extreme on the scale.31 Clustering at these extremes may indicate a problem with the scale's application to specific populations. Floor and ceiling effects were considered present when more than 15% of the participants achieved the highest or lowest possible scores.32
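The 15% criterion can be checked with a few lines of Python. This is a minimal sketch with a hypothetical function name; it reports the two scale extremes neutrally as "lowest" and "highest" because which one counts as the floor or the ceiling depends on the direction of the scale (on the WLQ-25, for example, the lowest score is the most desirable state).

```python
from typing import Dict, Sequence, Union


def floor_ceiling(scores: Sequence[float], lowest: float, highest: float,
                  threshold: float = 0.15) -> Dict[str, Union[float, bool]]:
    """Proportion of respondents at each scale extreme, flagged when
    the proportion exceeds the threshold (15% by default)."""
    n = len(scores)
    if n == 0:
        raise ValueError("no scores supplied")
    at_low = sum(1 for s in scores if s == lowest) / n
    at_high = sum(1 for s in scores if s == highest) / n
    return {
        "prop_lowest": at_low,
        "prop_highest": at_high,
        "flag_lowest": at_low > threshold,
        "flag_highest": at_high > threshold,
    }


if __name__ == "__main__":
    # Made-up data: 16 of 100 respondents at the lowest possible score.
    print(floor_ceiling([0] * 16 + [50] * 84, lowest=0, highest=100))
```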

Construct convergent validity was assessed by evaluating whether presenteeism scales correlated with each other and with pain and disability scales according to expected relationships.31 Pearson correlations among the WLQ-25, RA-WIS, CPG, QuickDASH, and SPADI-P were categorized as follows: high=≥.70, moderate=.50 to .70, and low=.26 to .50.33 A priori hypotheses were that moderate correlations (.70>r>.50) would be observed between the presenteeism scales, as they evaluate different aspects of at-work disabilities, and that low to moderate correlations (.30<r<.70) would be observed between the presenteeism scales and pain and disability scales, as larger differences in their constructs were anticipated.
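As an illustration of this classification, the following Python sketch computes a Pearson correlation and assigns it to the categories listed above. The function names are hypothetical. Two boundary choices are assumptions, because the stated ranges share their endpoints: a value of exactly .50 is treated as moderate, and values below .26 are labeled "uncategorized" since the text gives no name for them.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation of two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length (n >= 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def categorize(r: float) -> str:
    """Label the strength of a correlation using the cutoffs in the text."""
    size = abs(r)
    if size >= 0.70:
        return "high"
    if size >= 0.50:  # assumption: .50 itself counts as moderate
        return "moderate"
    if size >= 0.26:
        return "low"
    return "uncategorized"  # below .26: no label given in the text


if __name__ == "__main__":
    print(categorize(0.53), categorize(0.28))
```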

Known-group validity is the capacity of a test to discriminate between a group of individuals known to have a particular trait and a group of individuals who do not have the trait.31 Different comparisons were performed. Using a one-way analysis of variance (ANOVA) (Bonferroni post hoc test), participants reporting improvement according to the GRC question (7–10 on the GRC) were compared with those who were stable (4–6 on the GRC) or worse (0–3 on the GRC). Thereafter, using t tests, participants who had no to mild ongoing effects of their injury were compared with those who had moderate to very severe ongoing effects; participants who rated their ability to work as not difficult (0–3 on the scale) were compared with those who rated their ability to work as difficult (4–7 on the scale); and participants who could cope with their injury were compared with those who could not. Our a priori hypothesis was that presenteeism scales would discriminate among participants classified into different known groups (P<.05).
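The grouping by GRC response and the omnibus comparison can be sketched in Python as follows. This is an illustration rather than the SPSS analysis used in the study: the function names are hypothetical, only the F statistic and its degrees of freedom are computed, and the P value and the Bonferroni post hoc tests are left out.

```python
from typing import Sequence, Tuple


def grc_group(grc: int) -> str:
    """Map a 0-10 global rating of change to the three study groups."""
    if not 0 <= grc <= 10:
        raise ValueError("GRC must be between 0 and 10")
    if grc >= 7:
        return "improved"
    if grc >= 4:
        return "stable"
    return "worse"


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variability of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variability of scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Made-up follow-up scores for improved, stable, and worse groups.
    print(one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```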

Responsiveness

Longitudinal construct validity refers to the degree to which change over time correlates with other indicators of change. This analysis was performed using Pearson correlations between the change scores (score at 6-month evaluation minus score at baseline) of the presenteeism scales in comparison with each other and the change scores of the pain and disability scales. A priori hypotheses were that at least moderate correlations (r>.50) would be observed between the change scores of the presenteeism scales and that low correlations (.30<r<.50) would be observed between the change scores of presenteeism scales and change scores of the pain and disability scales.

Standardized response means are used to evaluate the ability of a measure to assess change over time. Before establishing SRMs, participants were divided into subgroups because statistical methods underlying the SRM assume that all participants change in the same direction.34 Therefore, 3 subgroups were defined according to the response to the GRC question at the 6-month follow-up: (1) those who were better (7–10 on the GRC), (2) those who were stable (4–6 on the GRC), and (3) those who were worse (0–3 on the GRC). Global rating of change questions have been used to characterize responsiveness and CID for other outcome measures.20,35–37 Thereafter, SRMs (mean change score divided by the standard deviation of the change score) were determined for the 3 subgroups. Ninety-five percent confidence intervals (95% CIs) were calculated for the SRMs.38 An SRM was considered large if ≥0.8, moderate if between 0.5 and 0.8, and small if between 0.2 and 0.5.39 A priori hypotheses were that large SRMs for the presenteeism scales would be observed for participants who had improved and that moderate negative SRMs would be observed for participants who had worsened, as previous studies showed lower indices for subgroups of individuals who had worsened in self-report scales.16,40
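The SRM calculation can be written out in a few lines of Python. This sketch makes several assumptions that the text does not settle: the change score is taken as follow-up minus baseline, the sample (n−1) standard deviation is used, the size label is based on the absolute value, and a value of exactly 0.5 is treated as moderate. The function names are hypothetical, and the 95% CI is not computed.

```python
import statistics
from typing import Sequence


def srm(baseline: Sequence[float], follow_up: Sequence[float]) -> float:
    """Mean change score divided by the SD of the change scores."""
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow-up must be paired")
    # Assumption: change = follow-up minus baseline, so improvement on a
    # scale where higher = worse gives a negative SRM.
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return statistics.mean(changes) / statistics.stdev(changes)


def srm_label(value: float) -> str:
    """Size label using the cutoffs quoted in the text."""
    size = abs(value)
    if size >= 0.8:
        return "large"
    if size >= 0.5:  # assumption: 0.5 itself counts as moderate
        return "moderate"
    if size >= 0.2:
        return "small"
    return "below small"


if __name__ == "__main__":
    # Made-up paired scores for four participants.
    value = srm([10, 12, 14, 16], [8, 9, 13, 12])
    print(round(value, 2), srm_label(value))
```

In practice the function would be applied separately to the improved, stable, and worse subgroups, for the reason given above.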

Clinically important difference is the smallest change that represents a clinically significant change for the individual patient. There are several methods to estimate CID.41 In this study, the GRC question (“Think about your injury now compared with when you completed the first questionnaire package. How would you rate the change in your problem overall?”) was used as the external criterion for establishing change. The model of Riddle and colleagues42 was followed, using receiver operating characteristic curves to determine the amount of change in the presenteeism scales that best differentiated those individuals who were moderately to greatly improved (8–10 on the GRC) from those who were stable or slightly improved (5–7 on the GRC). Receiver operating characteristic curves also were plotted to determine the amount of change that best differentiated those individuals who were moderately to greatly deteriorated (0–2 on the GRC) from those who were stable or slightly deteriorated (3–5 on the GRC). Receiver operating characteristic curves were constructed for both the WLQ-25 and RA-WIS by plotting sensitivity versus 1 − specificity for all possible cutoff values of the self-reported scales. The area under the curve (AUC) was evaluated for significance. A higher AUC represented greater ability of the measure to distinguish between patients who underwent a meaningful change and those who did not. The optimal cutoff value for detecting improvement or deterioration, that is, the value with maximal average sensitivity and specificity, was taken as the point on the plot of sensitivity versus 1 − specificity nearest to the upper left-hand corner of the graph.43
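The cutoff selection can be sketched in Python as follows. The sketch assumes that change is expressed so that larger values mean more improvement (for example, baseline minus follow-up on a scale where higher scores are worse) and that a participant is classified as changed when the change is at or above the cutoff. The function name and the returned dictionary are hypothetical, and the AUC and its significance test are not computed.

```python
import math
from typing import Dict, Sequence


def best_cutoff(changes: Sequence[float],
                changed: Sequence[bool]) -> Dict[str, float]:
    """Find the cutoff whose ROC point lies nearest the upper-left corner.

    changes: amount of change per participant (larger = more improvement).
    changed: True if the external criterion (here, the GRC) classifies the
             participant as meaningfully changed.
    """
    n_pos = sum(1 for c in changed if c)
    n_neg = len(changed) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both criterion groups must be non-empty")
    best = None
    for cutoff in sorted(set(changes)):
        true_pos = sum(1 for x, c in zip(changes, changed) if c and x >= cutoff)
        false_pos = sum(1 for x, c in zip(changes, changed) if not c and x >= cutoff)
        sensitivity = true_pos / n_pos
        specificity = 1 - false_pos / n_neg
        # Distance from the ROC point (1 - specificity, sensitivity) to (0, 1).
        distance = math.hypot(1 - specificity, 1 - sensitivity)
        if best is None or distance < best["distance"]:
            best = {"cutoff": cutoff, "sensitivity": sensitivity,
                    "specificity": specificity, "distance": distance}
    return best


if __name__ == "__main__":
    # Made-up change scores: three unchanged and three improved participants.
    print(best_cutoff([1, 2, 3, 10, 12, 15],
                      [False, False, False, True, True, True]))
```

For deterioration, the same function could be applied after reversing the sign of the change scores.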

Role of the Funding Source

This study was supported by grants from the Research Advisory Council of WSIB of Ontario (WSIB-RAC-05028 and WSIB-RAC-02011). The funding source was not involved in the study's design, conduct, or reporting.

Results

Six hundred fourteen participants were enrolled; 105 participants did not participate at the follow-up visit, resulting in 509 participants (eFig. 1; available at ptjournal.apta.org). No significant differences (at the .05 alpha level) were found in the scale scores between participants working at baseline who did or did not drop out. Although the proportions of women were not significantly different, participants who dropped out were significantly younger (eTab. 1; available at ptjournal.apta.org). Of the 509 participants evaluated at the 6-month follow-up, 206 were working at baseline and at follow-up, thus completing the WLQ-25 and RA-WIS on 2 occasions. Our analyses were performed on these 206 participants (eFig. 1). One hundred forty-two participants (69%) had shoulder pain, 99 (48%) had elbow pain, 74 (36%) had wrist pain, and 79 (38%) had hand pain.

Validity

Floor and ceiling effects

At baseline, the WLQ-25 mental-interpersonal demand scores showed a ceiling effect, as 16% of the participants achieved the best possible score. At the 6-month follow-up, the WLQ-25 mental-interpersonal and output demand scores showed ceiling effects, with 29% and 23% of the participants, respectively, achieving the best possible score. No floor or ceiling effects were observed for the RA-WIS and WLQ-25 (eFig. 2; available at ptjournal.apta.org).

Convergent construct validity

A moderate correlation was observed between the WLQ-25 and RA-WIS (r=.53 for index score, r=.54 for summed score), whereas low to moderate correlations were observed between the presenteeism scales and the pain and function scales (.28<r<.62) (Tab. 2). Moderate to high correlations were observed between the WLQ-25 and its subscales (.78<r<.89), except for the physical demands subscale, for which a weak correlation (r=.06 for index score, r=.21 for summed score) was obtained.

Table 2

Convergent Construct Validity: Correlations (r) Among Self-Report Scales at Baseline [a]

| Scale | WLQ-25 Index Score | WLQ-25 Summed Score | RA-WIS | QuickDASH | CPG | CPG Pain | CPG Disability | SPADI-P |
|---|---|---|---|---|---|---|---|---|
| WLQ-25 index score (195≤n≤202) | | .95** | .53** | .52** | .30** | .32** | .35** | .28** |
| WLQ-25 summed score (195≤n≤202) | | | .54** | .54** | .32** | .33** | .37** | .30** |
| Time-management demands (188≤n≤194) | .78** | .79** | .50** | .51** | .24* | .28** | .32** | .24* |
| Physical demands (191≤n≤198) | .06 | .21* | .08 | .01 | .02 | .04 | .12 | .01 |
| Mental-interpersonal demands (194≤n≤201) | .83** | .86** | .46** | .47** | .27** | .26** | .23* | .23* |
| Output demands (188≤n≤197) | .89** | .76** | .43** | .43** | .29** | .31** | .39** | .29** |
| RA-WIS (202≤n≤205) | | | | .62** | .36** | .28** | .37** | .32** |

[a] RA-WIS=Work Instability Scale for Rheumatoid Arthritis, WLQ-25=Work Limitations Questionnaire-25, SPADI-P=pain subscale of the Shoulder Pain and Disability Index, CPG=Chronic Pain Grade Questionnaire. Correlations above .50 are classified as moderate (.70>r>.50) to high (r>.70). * Significant at P<.05, ** significant at P<.01.

Known-group validity

At the 6-month follow-up, mean scores of the RA-WIS and WLQ-25 could differentiate between improved participants and participants who were stable or had worsened according to the GRC question (Tab. 3). Their mean scores also could differentiate between: (1) participants who had no to mild versus moderate to very severe ongoing effects of their injury, (2) participants who rated their ability to do paid work as not difficult versus difficult, and (3) participants who stated that they could versus could not cope with their injury (Tab. 4).

Table 3

Presenteeism Scale Scores at Baseline and 6-Month Follow-up [a]

| Scale | Improved (n=72): Baseline | Improved (n=72): 6-Month Follow-up | Stable (n=94): Baseline | Stable (n=94): 6-Month Follow-up | Worse (n=38): Baseline | Worse (n=38): 6-Month Follow-up |
|---|---|---|---|---|---|---|
| WLQ-25 index score | 8.9 (5.4) | 5.4 (5.6) [b,c] | 8.1 (5.5) | 7.4 (4.9) [b,d] | 10.0 (6.9) | 12.0 (6.6) [c,d] |
| WLQ-25 summed score | 35.5 (16.7) | 22.4 (18.6) [b,c] | 33.4 (17.7) | 31.0 (17.1) [b,d] | 39.5 (20.6) | 49.0 (19.9) [c,d] |
| Time-management demands | 35.8 (29.7) | 21.3 (24.1) [b,c] | 33.3 (26.0) | 37.1 (23.9) [b,d] | 43.7 (25.2) | 57.1 (29.3) [c,d] |
| Physical demands | 48.7 (25.7) | 40.1 (28.1) [b,c] | 47.6 (25.6) | 45.0 (21.0) [b] | 41.5 (24.4) | 59.9 (21.8) [c] |
| Mental-interpersonal demands | 25.5 (25.1) | 12.2 (21.5) [b,c] | 21.5 (22.4) | 20.7 (18.4) [b,d] | 28.0 (26.7) | 36.4 (28.1) [c,d] |
| Output demands | 37.4 (27.6) | 22.1 (27.2) [b,c] | 38.3 (29.9) | 29.2 (24.7) [b] | 51.1 (27.2) | 51.5 (25.7) [c] |
| RA-WIS | 12.2 (5.2) | 8.6 (6.8) [b,c] | 12.9 (5.3) | 13.6 (5.1) [b,d] | 14.1 (4.4) | 17.5 (3.6) [c,d] |

[a] Scores are presented as mean (standard deviation). RA-WIS=Work Instability Scale for Rheumatoid Arthritis, WLQ-25=Work Limitations Questionnaire-25.

[b] Significant differences at 6-month follow-up between improved participants and stable participants.

[c] Significant differences at 6-month follow-up between improved participants and participants who were worse.

[d] Significant differences at 6-month follow-up between participants who were worse and stable participants.

Table 4

Known-Group Validity of Presenteeism Scales at the 6-Month Follow-up^a

Ongoing Effects of the Injury

| Scale | No to Mild^b | Moderate to Severe^b | P | Mean Difference (95% CI) |
|---|---|---|---|---|
| WLQ-25 index score | 4.6 (5.8) | 9.6 (5.8) | <.001 | 4.9 (3.1, 6.8) |
| WLQ-25 summed score | 17.9 (18.7) | 35.7 (19.0) | <.001 | 17.7 (11.6, 23.9) |
| Time-management demands | 17.6 (24.6) | 40.2 (26.8) | <.001 | 22.7 (14.1, 31.3) |
| Physical demands | 31.7 (29.5) | 50.5 (21.2) | <.001 | 18.7 (11.1, 26.4) |
| Mental-interpersonal demands | 10.0 (22.5) | 23.9 (22.0) | <.001 | 13.9 (6.7, 21.0) |
| Output demands | 16.2 (24.1) | 34.9 (26.2) | <.001 | 18.7 (10.4, 27.1) |
| RA-WIS | 6.0 (5.9) | 14.6 (5.0) | <.001 | 8.6 (6.9, 10.3) |

Ability to Do Paid Work Over the Previous Week

| Scale | Not Difficult^b | Difficult^b | P | Mean Difference (95% CI) |
|---|---|---|---|---|
| WLQ-25 index score | 5.1 (4.7) | 11.8 (5.6) | <.001 | 6.7 (5.3, 8.2) |
| WLQ-25 summed score | 19.9 (15.6) | 43.1 (17.9) | <.001 | 23.2 (18.5, 27.9) |
| Time-management demands | 19.9 (21.9) | 49.9 (24.8) | <.001 | 30.0 (23.5, 36.6) |
| Physical demands | 36.8 (26.5) | 54.8 (18.9) | <.001 | 18.0 (11.5, 24.56) |
| Mental-interpersonal demands | 10.9 (18.2) | 30.7 (23.0) | <.001 | 19.8 (14.0, 25.6) |
| Output demands | 17.0 (20.4) | 44.6 (25.2) | <.001 | 27.6 (21.2, 34.0) |
| RA-WIS | 8.9 (6.0) | 16.2 (4.4) | <.001 | 7.3 (5.8, 8.8) |

Can Cope With the Upper-Extremity Injury

| Scale | Yes^b | No^b | P | Mean Difference (95% CI) |
|---|---|---|---|---|
| WLQ-25 index score | 6.0 (5.4) | 11.1 (6.0) | <.001 | 5.1 (3.5, 6.7) |
| WLQ-25 summed score | 23.6 (17.5) | 41.8 (19.5) | <.001 | 18.2 (13.0, 23.3) |
| Time-management demands | 26.8 (25.3) | 45.6 (28.1) | <.001 | 18.8 (11.4, 26.3) |
| Physical demands | 39.8 (24.6) | 54.0 (22.6) | <.001 | 14.2 (7.4, 20.9) |
| Mental-interpersonal demands | 12.6 (19.5) | 31.1 (23.0) | <.001 | 18.5 (12.6, 24.4) |
| Output demands | 21.6 (24.0) | 42.3 (26.3) | <.001 | 20.7 (13.6, 27.7) |
| RA-WIS | 9.8 (6.2) | 16.1 (4.6) | <.001 | 6.3 (4.8, 7.9) |

^a WLQ-25=Work Limitations Questionnaire-25, RA-WIS=Work Instability Scale for Rheumatoid Arthritis, 95% CI=95% confidence interval.

^b Mean (standard deviation).

Responsiveness

Longitudinal construct validity

A moderate correlation was observed between change scores on the WLQ-25 and the RA-WIS (eTab. 2; available at ptjournal.apta.org). Low correlations were observed between change score on the WLQ-25 and change scores on the pain and disability scales (.22<r<.41), whereas low to moderate correlations were observed between change score on the RA-WIS and change scores on the pain and disability scales (.33<r<.57).
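As a rough illustration of how longitudinal construct validity is examined, the sketch below computes change scores for two scales and then their Pearson correlation. The function names and the score lists are hypothetical and are not study data; change is taken as baseline minus follow-up, which is an assumption about sign convention.

```python
from math import sqrt

def change_scores(baseline, follow_up):
    """Return baseline minus follow-up for each participant.
    On scales where higher scores mean more limitation, positive = improvement."""
    return [b - f for b, f in zip(baseline, follow_up)]

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores for five participants (NOT study data).
wlq_change = change_scores([40, 35, 50, 20, 30], [30, 33, 35, 22, 25])
rawis_change = change_scores([14, 12, 18, 8, 11], [10, 12, 13, 9, 10])
r = pearson_r(wlq_change, rawis_change)
```

Correlating change scores rather than raw scores asks whether participants who change on one scale also change on the other.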

SRM

For improved participants, moderate SRMs were observed for the WLQ-25 and RA-WIS (Tab. 5). The SRMs ranged from small to moderate for the WLQ-25 subscales. In participants classified as stable, the SRMs for the WLQ-25 and RA-WIS fell below the level classified as a small effect (−0.20<SRM<0.20). Participants classified as worse had moderate negative SRMs on the RA-WIS and small negative SRMs on the WLQ-25.
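A minimal sketch of the SRM calculation follows, assuming the usual definition (mean change divided by the standard deviation of change). The magnitude labels use the conventional 0.20, 0.50, and 0.80 cutoffs; the 0.80 boundary and the function names are assumptions of this sketch, and confidence intervals are not computed.

```python
from statistics import mean, stdev

def standardized_response_mean(baseline, follow_up):
    """SRM = mean change / standard deviation of change.
    Change is baseline minus follow-up, so improvement is positive on
    scales where higher scores mean more limitation."""
    change = [b - f for b, f in zip(baseline, follow_up)]
    return mean(change) / stdev(change)

def effect_label(srm):
    """Label the magnitude of an SRM (cutoffs 0.20/0.50/0.80 are an assumption)."""
    size = abs(srm)
    if size < 0.20:
        return "below small"
    if size < 0.50:
        return "small"
    if size < 0.80:
        return "moderate"
    return "large"

# Hypothetical scores for three participants (NOT study data).
example_srm = standardized_response_mean([10, 12, 14], [8, 11, 10])
```

Applied to the reported values, `effect_label(0.66)` gives "moderate" and `effect_label(-0.18)` gives "below small", in line with the wording above.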

Table 5

Responsiveness (Standardized Response Means With 95% Confidence Interval) of Presenteeism Scales^a

Global Rating of Change

| Scale | Improved (n=72) | Stable (n=94) | Worse (n=38) |
|---|---|---|---|
| WLQ-25 index score | 0.63 (0.38, 0.89) | 0.15 (−0.06, 0.35) | −0.37 (−0.71, 0.03) |
| WLQ-25 summed score | 0.65 (0.40, 0.91) | 0.14 (−0.07, 0.34) | −0.49 (−0.83, −0.14) |
| Time-management demands | 0.40 (0.15, 0.64) | −0.15 (−0.36, 0.07) | −0.52 (−0.88, −0.15) |
| Physical demands | 0.24 (0.00, 0.48) | 0.10 (−0.12, 0.31) | −0.63 (−1.00, −0.25) |
| Mental-interpersonal demands | 0.52 (0.27, 0.77) | 0.03 (−0.17, 0.24) | −0.34 (−0.69, 0.00) |
| Output demands | 0.47 (0.19, 0.75) | 0.29 (0.05, 0.53) | −0.01 (−0.39, 0.37) |
| RA-WIS | 0.66 (0.40, 0.91) | −0.18 (−0.38, 0.03) | −0.72 (−1.08, −0.36) |

^a WLQ-25=Work Limitations Questionnaire-25, RA-WIS=Work Instability Scale for Rheumatoid Arthritis.

CID

For improvement, the AUC was 0.84 for the RA-WIS, 0.72 for the WLQ-25 summed score, and 0.73 for the WLQ-25 index score. For deterioration, the AUC was 0.68 for the RA-WIS and the WLQ-25 summed score and 0.80 for the WLQ-25 index score (Figs. 1 and 2). All values showed discriminative ability statistically better than chance (AUC=0.50; P<.001). For improvement, the discriminative ability was good for the RA-WIS and fair for the WLQ-25, whereas for deterioration, the discriminative ability was good for the WLQ-25 index score and poor for the RA-WIS and WLQ-25 summed score.44 For improvement, the CID, which is defined by the optimal cutoff point, was 13 points (out of 100) for the WLQ-25 summed score (sensitivity=60%, specificity=73%), 5 points (out of 28.6) for the WLQ-25 index score (sensitivity=62%, specificity=84%), and 4 points (out of 23) for the RA-WIS (sensitivity=74%, specificity=85%). For deterioration, the CID was 1 point for both the WLQ-25 summed score (sensitivity=62%, specificity=74%) and the WLQ-25 index score (sensitivity=80%, specificity=74%) and 2 points for the RA-WIS (sensitivity=60%, specificity=72%).
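The sketch below shows one way an AUC and a cutoff point can be derived from change scores and a dichotomized global rating of change. The text states only that the CID is the optimal cutoff point; choosing the cutoff that maximizes Youden's J (sensitivity + specificity − 1) is an assumption of this sketch, as are the function names and the data, which are hypothetical.

```python
def split_by_anchor(changes, improved):
    """Separate change scores of improved participants from the rest."""
    pos = [c for c, flag in zip(changes, improved) if flag]
    neg = [c for c, flag in zip(changes, improved) if not flag]
    return pos, neg

def auc(changes, improved):
    """Area under the ROC curve, computed as the probability that an improved
    participant has a larger change score than a non-improved one
    (ties count one half)."""
    pos, neg = split_by_anchor(changes, improved)
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def optimal_cutoff(changes, improved):
    """Return (cutoff, sensitivity, specificity) for the cutoff maximizing
    Youden's J. ASSUMPTION: the source does not name its optimality rule."""
    pos, neg = split_by_anchor(changes, improved)
    best = None
    for cut in sorted(set(changes)):
        sensitivity = sum(p >= cut for p in pos) / len(pos)
        specificity = sum(n < cut for n in neg) / len(neg)
        youden_j = sensitivity + specificity - 1
        if best is None or youden_j > best[0]:
            best = (youden_j, cut, sensitivity, specificity)
    return best[1:]

# Hypothetical change scores (baseline minus follow-up) and anchor ratings; NOT study data.
changes = [15, 14, 12, 8, 13, 6, 5, 3, 0, -2]
improved = [True, True, True, True, False, False, False, False, False, False]
area = auc(changes, improved)
cutoff, sens, spec = optimal_cutoff(changes, improved)
```

Other optimality rules, such as the point closest to the upper-left corner of the ROC plot, can yield a different cutoff from the same data.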

Figure 1

Receiver operating characteristic curves for the Work Limitations Questionnaire-25 (WLQ-25) and Work Instability Scale for Rheumatoid Arthritis (RA-WIS) cutoff points for differentiating clinically important improvement.

Figure 2

Receiver operating characteristic curves for the Work Limitations Questionnaire-25 (WLQ-25) and Work Instability Scale for Rheumatoid Arthritis (RA-WIS) cutoff points for differentiating clinically important deterioration.

Discussion

The present results suggest that 2 presenteeism scales, the WLQ-25 and the RA-WIS, provide information distinct from that provided by pain and disability scales, as relatively small shared variation was demonstrated. The WLQ-25 and RA-WIS are more highly correlated with the QuickDASH, which mainly addresses upper-extremity disability, than with the CPG or SPADI-P, which mainly address pain. Because shared variation between the WLQ-25 and RA-WIS also was small, the results suggest that they may not be capturing the same aspects of presenteeism. Furthermore, both measures were moderately responsive to changes for workers with chronic work-related upper-extremity disorders, suggesting the potential to evaluate change in health-related work productivity over time. Both scales also discriminated among different levels of self-rated work ability, suggesting they may be able to classify patients into subgroups according to their capacity to work. Two subscales of the WLQ-25 showed problems with floor and ceiling effects; therefore, the use of these subscales alone is not warranted for this population.

The relationships between the presenteeism scales followed expectations of moderate relationships (r=.53–.54). The correlation coefficients indicate a shared variation near 30% (coefficient of determination [R²]), suggesting that there is some overlap in the presenteeism concepts measured. However, because about 70% of the variation is not shared, the 2 presenteeism scales also appear to evaluate different aspects of at-work disabilities. This finding would be expected, given differences in the conceptual framing of these measures. According to the International Classification of Functioning, Disability and Health, most items of the WLQ-25 are related to activity limitations and participation restrictions, and most items of the RA-WIS are related to body functions.45 Furthermore, one scale requires responses on the amount of time workers have difficulty handling parts of their job,11 while the other concerns the mismatch between workers' abilities and the demands of their job.12
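The shared variation quoted above follows from squaring the reported correlation coefficients, as this short worked calculation shows:

```latex
r^2 = 0.53^2 \approx 0.28, \qquad r^2 = 0.54^2 \approx 0.29
```

That is, roughly 28% to 29% of the variance is shared between the 2 scales, which the text rounds to "near 30%".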

In workers with rheumatoid arthritis, a score less than 10 on the RA-WIS has been proposed as being indicative of low risk of work disability, a score between 10 and 17 as indicating medium risk, and a score above 17 as indicating high risk.12 In the present study, similar benchmarks were observed. Workers who rated their ability to work as not difficult or who could cope with their injury had a mean score below 10, whereas workers who rated their ability to work as difficult or who could not cope with their injury had a mean score above 16. Known-group validity can provide useful benchmarks for clinicians in making clinical judgments because scores can be used to make statements about whether the score for an individual patient is consistent with the subgroups of difficulty defined in this study. The WLQ-25 has a directly interpretable score. In the population studied, mean summed scores over 40 and mean index scores over 10 were consistent with having difficulty in meeting work demands, whereas mean summed scores of less than 20 and mean index scores of less than 6 were consistent with being able to meet work demands. A summed score of 40 means that 2 days a week (40% of the time) an employee is unable to meet the demands of the job because of health, whereas an index score of 10 means a decrease in productivity of 9.5% compared with workers who are healthy.21 A summed score of 20 means an employee is unable to meet the demands of the job 1 day a week, whereas an index score of 6 means a decrease in productivity of 5.8%.21 Depending on the work site's operational needs and the worker's health, clinicians can make judgments about reasonable cutoff points.
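The interpretation rules described above can be sketched as small helper functions. The function names are hypothetical. The RA-WIS bands and the reading of the summed score as a percentage of time are taken from the text. The productivity-loss formula is an assumption: it is not stated in this article and was chosen because it reproduces the two quoted values (9.5% for an index score of 10 and 5.8% for an index score of 6).

```python
from math import exp

def rawis_risk_band(score):
    """Risk bands proposed for workers with rheumatoid arthritis:
    below 10 = low, 10 to 17 = medium, above 17 = high."""
    if score < 10:
        return "low"
    if score <= 17:
        return "medium"
    return "high"

def days_limited_per_week(summed_score, workdays=5):
    """Read the WLQ-25 summed score (0-100) as the percentage of time the
    worker is unable to meet job demands, expressed in workdays per week."""
    return summed_score / 100 * workdays

def productivity_loss_percent(index_score):
    """ASSUMPTION: loss = 100 * (1 - exp(-index / 100)). This form is not
    given in the article; it matches the two values quoted in the text."""
    return 100 * (1 - exp(-index_score / 100))
```

For example, `days_limited_per_week(40)` returns 2.0, matching the "2 days a week" reading of a summed score of 40.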

One of the main study contributions is new evidence about responsiveness. The SRMs were similar for the improved subgroup for both the RA-WIS and WLQ-25, suggesting the scales have similar ability to detect group-level improvement in work-related disorders. The SRMs were higher for the RA-WIS for the worsened subgroup. However, the 95% CIs of the SRMs were overlapping, which could indicate that the difference was not significant. Previously, only Beaton et al16 had reported responsiveness indices for presenteeism scales, observing varying levels of responsiveness for the RA-WIS and the WLQ-25 in improved (0.28≤SRM≤0.64) and deteriorated (0.00≤SRM≤0.88) subgroups. However, comparisons between the 2 studies are limited as a result of different rating questions being used. The amount of change considered clinically important had not been previously established for these scales. Establishing CID allows rehabilitation professionals to set targets for improvement or deterioration. We used large improvement or deterioration in overall upper-extremity condition as an anchor for the CID. Individual workers exceeding the CID would have been likely to have had a definitive change in work productivity. The interpretation of CID also must be judged against the minimal detectable change (MDC),46 which describes the amount of change that is likely due to measurement error. We did not have the data to produce MDC in the current study, and this is something to work toward in future research.
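For orientation, one commonly used formulation of the MDC, which was not computed in this study, derives it from the standard error of measurement (SEM). The symbols below are assumptions of this sketch rather than quantities reported here: SD is the standard deviation of scores, R is a test-retest reliability coefficient, and 1.96 is the multiplier for 95% confidence.

```latex
\mathrm{SEM} = \mathrm{SD}\,\sqrt{1 - R}, \qquad
\mathrm{MDC}_{95} = 1.96 \times \sqrt{2} \times \mathrm{SEM}
```

Under this formulation, a CID smaller than the MDC could not be distinguished from measurement error in an individual patient.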

With these caveats, a 13-point change on the WLQ-25 summed score (13% of total score), a 5-point change on the WLQ-25 index score (17% of total score), and a 4-point change on the RA-WIS (17% of total score) may be used to estimate the minimal amount of change necessary to be considered a clinically relevant improvement. These findings mean that if a patient is able to report a favorable response on 4 more questions listed on the RA-WIS or lower the WLQ-25 summed score by 13 points, he or she has achieved an improvement. A 13-point change in the WLQ-25 summed score (eg, from 56 to 43) also means the person is now able to meet the demands of his or her job on 3 out of 5 days instead of only on 2 days. The relative change needed to reach CID on the RA-WIS and WLQ-25 compares favorably with that of upper-extremity–specific pain and disability scales, such as the Disabilities of the Arm, Shoulder, and Hand outcome measure (10% of the total score),47 the Shoulder Pain and Disability Index (13%),47 and the Simple Shoulder Test (25%).48 Although the relative change required on the RA-WIS and on the WLQ-25 index score to reach a CID was higher compared with the WLQ-25 summed score, this finding was associated with greater sensitivity and specificity to correctly classify improvement. A 4-point change in RA-WIS scores correctly classified 74% of the patients reporting moderate to great improvement, whereas 13-point and 5-point changes, respectively, in WLQ-25 summed and index scores correctly classified 60% and 64% of the patients. Patients not meeting these cutoffs agreed with no or slight improvement in 85% of cases for the RA-WIS, 73% for the WLQ-25 summed score, and 84% for the WLQ-25 index score.
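The relative sizes quoted above can be checked with a short calculation. In this sketch the CID values and scale maxima come from the text, while the dictionary and function names are hypothetical. Improvement is treated as a drop in score, because higher scores indicate more limitation.

```python
# (CID for improvement, maximum possible score), as reported in the text.
SCALES = {
    "WLQ-25 summed score": (13, 100),
    "WLQ-25 index score": (5, 28.6),
    "RA-WIS": (4, 23),
}

def cid_as_percent_of_range(cid, scale_max):
    """Express a CID as a percentage of the scale's maximum score."""
    return 100 * cid / scale_max

def meets_improvement_cid(baseline, follow_up, cid):
    """True when the score dropped by at least the CID
    (lower scores mean fewer limitations)."""
    return (baseline - follow_up) >= cid

relative = {name: cid_as_percent_of_range(cid, top)
            for name, (cid, top) in SCALES.items()}
```

With the example from the text, `meets_improvement_cid(56, 43, 13)` is true, whereas a drop from 56 to 45 would fall short of the 13-point threshold.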

In contrast, a deterioration of only 1.0 point on the WLQ-25 (index and summed scores) and 2.0 points on the RA-WIS reflects the minimal amount of change necessary to be considered a clinically relevant deterioration. These CIDs are likely less than the day-to-day variability in scores that could be estimated by the MDC. Clinicians, therefore, should apply these cutoffs with caution when tracking workers with deteriorating conditions.

The least responsive WLQ-25 subscale was physical demands. The physical demands subscale covers workers' ability to perform job tasks that involve bodily strength, movement, endurance, coordination, and flexibility.11 The questions on the physical demands subscale are worded positively, with each question beginning with “How much of the time were you able to …,” and are scored from 1 (“all of the time”) to 5 (“none of the time”). The questions of the other 3 WLQ-25 subscales are worded negatively, with each question beginning with “How much of the time did your physical health or emotional problems make it difficult for you to do … ,” and are scored from 1 (“none of the time”) to 5 (“all of the time”). It is plausible that some participants did not pay attention to the instructions and continued to respond as if the questions asked about difficulty. Future studies may include cognitive interviewing or alternate-forms assessments to determine whether differential response patterns exist. Alternatively, the target study population may be less likely to experience positive changes in physical ability over the time frame studied because the patients had complex musculoskeletal disorders. The mean scores on the physical demands subscale were higher, on average, than the mean scores of the other subscales and stayed higher over time, leading to smaller responsiveness indices. This finding makes sense, given the clinical population.

This study had limitations. First, workers attended a specialized clinic where referrals were predetermined by the caseworker at the WSIB. Although the catchment area was large, generalizability of these results to other populations is unknown. Workers considered for this study had chronic injuries with a recovery that was not timely or satisfactory, also limiting the generalizability beyond this population. However, the population represents a challenging group of workers often seen in clinics. Furthermore, the external criterion of change for the determination of SRM and CID was related to the overall change in the upper-extremity condition and not to change in work ability or productivity. Because the scales evaluate work presenteeism, different indices could have been obtained using a work-related criterion of change. Concerns could be raised over the use of a GRC question to define the CID because of possible recall bias and because a 1-item scale was used to validate multi-item scales. There are a variety of methods to provide an external criterion to establish which patients have changed clinically, including clinician-reported GRCs and measurements of physical impairment. None of these methods are without limitations, and we recognize that our methods have limitations. Self-report scales were exclusively used to establish the properties of presenteeism scales and may not reflect other approaches to evaluating at-work disability.

In conclusion, our study presents evidence of construct validity and responsiveness for instruments that aim to measure at-work instability (RA-WIS) and at-work health-related productivity loss (WLQ-25) in participants with chronic work-related upper-extremity disorders. Clinically important differences were suggested for both tools. Measurement of work outcomes is important in rehabilitation. Therefore, the study provides psychometric evidence about 2 work outcome measures that could be included in the evaluation of workers.

The Bottom Line

What do we already know about this topic?

Presenteeism scales have been developed to evaluate loss of work productivity due to illness or injury in people who are present at their job. Validity of available scales is emerging, but evidence on responsiveness remains scarce.

What new information does this study offer?

The results suggest that 2 presenteeism scales—the Work Limitations Questionnaire-25 (WLQ-25) and Work Instability Scale for Rheumatoid Arthritis (RA-WIS)—are moderately responsive to change for workers with chronic work-related upper-extremity disorders.

If you're a patient, what might these findings mean for you?

Following your treatment, 2 presenteeism scales—the WLQ-25 and RA-WIS—have the potential to measure changes in your health-related work productivity over time.

The authors acknowledge the collaboration of the staff and injured workers from the 2 WSIB specialty clinics involved in this study: the Holland Orthopaedic and Arthritis Centre Shoulder and Elbow Specialty Clinic, Toronto, Ontario, Canada, and St Joseph's Health Centre Upper Extremity Specialty Clinic, London, Ontario, Canada. The authors also acknowledge the contribution of the other investigators involved in the larger study design and implementation: Pierre Cote, Renee-Louise Franche, Sheilah Hogg-Johnson, Sonia Pagura, Robin Richards, and Claire Bombardier. Special thanks to Diana Sayers, Muge Dogan, Taucha Inrig, and Iona MacRitchie for assistance with implementing the study.

This study was approved by the institutional review boards of the University of Western Ontario, St Michael's Hospital, Sunnybrook Health Sciences Centre, and the University of Toronto.

During the conduct of this study, Dr Roy was supported by scholarships from the Fonds de la Recherche en Santé du Québec (FRSQ) and the Canadian Institutes of Health Research (CIHR), and Dr MacDermid and Dr Beaton were supported by a CIHR New Investigators award. Mr Tang is supported by a CIHR PhD fellowship, a Canadian Arthritis Network/Arthritis Society Trainee Fellowship, and a Syme Fellowship from the Institute for Work & Health.

* SPSS Inc, 233 S Wacker Dr, Chicago, IL 60606.

References

1. Prasad M, Wahlqvist P, Shikiar R, Shih YC. A review of self-report instruments measuring health-related work productivity: a patient-reported outcomes perspective. Pharmacoeconomics. 2004;22:225–244.

2. Williams RM, Schmuck G, Allwood S, et al. Psychometric evaluation of health-related work outcome measures for musculoskeletal disorders: a systematic review. J Occup Rehabil. 2007;17:504–521.

3. Aronsson G, Gustafsson K, Dallner M. Sick but yet at work: an empirical study of sickness presenteeism. J Epidemiol Community Health. 2000;54:502–509.

4. McKevitt C, Morgan M, Dundas R, Holland WW. Sickness absence and “working through” illness: a comparison of two professional groups. J Public Health Med. 1997;19:295–300.

5. Koopman C, Pelletier KR, Murray JF, et al. Stanford presenteeism scale: health status and employee productivity. J Occup Environ Med. 2002;44:14–20.

6. Sanderson K, Tilse E, Nicholson J, et al. Which presenteeism measures are more sensitive to depression and anxiety? J Affect Disord. 2007;101:65–74.

7. Denis S, Shannon HS, Wessel J, et al. Association of low back pain, impairment, disability and work limitations in nurses. J Occup Rehabil. 2007;17:213–226.

8. Lerner D, Reed JI, Massarotti E, et al. The Work Limitations Questionnaire's validity and reliability among patients with osteoarthritis. J Clin Epidemiol. 2002;55:197–208.

9. Tang K, Pitts S, Solway S, Beaton D. Comparison of the psychometric properties of four at-work disability measures in workers with shoulder or elbow disorders. J Occup Rehabil. 2009;19:142–154.

10. MacKenzie EJ, Bosse MJ, Kellam JF, et al. Early predictors of long-term work disability after major limb trauma. J Trauma. 2006;61:688–694.

11. Lerner D, Amick BC III, Rogers WH, et al. The work limitations questionnaire. Med Care. 2001;39:72–85.

12. Gilworth G, Chamberlain MA, Harvey A, et al. Development of a work instability scale for rheumatoid arthritis. Arthritis Rheum. 2003;49:349–354.

13. Macedo A, Oakley S, Gullick N, Kirkham B. An examination of work instability, functional impairment, and disease activity in employed patients with rheumatoid arthritis. J Rheumatol. 2009;36:225–230.

14. Turpin RS, Ozminkowski RJ, Sharda CE, et al. Reliability and validity of the Stanford Presenteeism Scale. J Occup Environ Med. 2004;46:1123–1133.

15. Lerner D, Chang H, Rogers WH, et al. A method for imputing the impact of health problems on at-work performance and productivity from available health data. J Occup Environ Med. 2009;51:515–524.

16. Beaton DE, Tang K, Gignac MA, et al. Reliability, validity, and responsiveness of five at-work productivity measures in patients with rheumatoid arthritis or osteoarthritis. Arthritis Care Res (Hoboken). 2010;62:28–37.

17. Beaton DE, Wright JG, Katz JN. Development of the QuickDASH: comparison of three-item reduction approaches. J Bone Joint Surg Am. 2005;87:1038–1046.

18. Roach KE, Budiman-Mak E, Songsiridej N, Lertratanakul Y. Development of a shoulder pain and disability index. Arthritis Care Res. 1991;4:143–149.

19. Von Korff M, Ormel J, Keefe FJ, Dworkin SF. Grading the severity of chronic pain. Pain. 1992;50:133–149.

20. Stewart M, Maher CG, Refshauge KM, et al. Responsiveness of pain and disability measures for chronic whiplash. Spine (Phila Pa 1976). 2007;32:580–585.

21. Lerner D, Rogers WH, Chang H. Technical report: scoring the Work Limitations Questionnaire (WLQ) and the WLQ index for estimating work productivity loss. Revised April 2003. Available from the authors with purchase of the questionnaire.

22. Tang K, Beaton DE, Lacaille D, et al. The Work Instability Scale for Rheumatoid Arthritis (RA-WIS): does it work in osteoarthritis? Qual Life Res. 2010;19:1057–1068.

23. Tang K, Beaton DE, Gignac MA, et al. The work instability scale for rheumatoid arthritis (RA-WIS) predicts arthritis-related work transitions within 12 months. Arthritis Care Res (Hoboken). 2010 June 2 [Epub ahead of print].

24. Gummesson C, Ward MM, Atroshi I. The shortened Disabilities of the Arm, Shoulder and Hand questionnaire (QuickDASH): validity and reliability based on responses within the full-length DASH. BMC Musculoskelet Disord. 2006;7:44.

25. Matheson LN, Melhorn JM, Mayer TG, et al. Reliability of a visual analog version of the QuickDASH. J Bone Joint Surg Am. 2006;88:1782–1787.

26. Fayad F, Lefevre-Colau MM, Gautheron V, et al. Reliability, validity and responsiveness of the French version of the questionnaire Quick Disability of the Arm, Shoulder and Hand in shoulder disorders. Man Ther. 2009;14:206–212.

27. Roy JS, MacDermid JC, Woodhouse LJ. Measuring shoulder function: a systematic review of four questionnaires. Arthritis Rheum. 2009;61:623–632.

28. Elliott AM, Smith BH, Smith WC, Chambers WA. Changes in chronic pain severity over time: the Chronic Pain Grade as a valid measure. Pain. 2000;88:303–308.

29. Smith BH, Penny KI, Purves AM, et al. The Chronic Pain Grade questionnaire: validation and reliability in postal research. Pain. 1997;71:141–147.

30. Salaffi F, Stancati A, Grassi W. Reliability and validity of the Italian version of the Chronic Pain Grade questionnaire in patients with musculoskeletal disorders. Clin Rheumatol. 2006;25:619–631.

31. Portney LG, Watkins MP. Foundations of Clinical Research: Applications to Practice. 2nd ed. Upper Saddle River, NJ: Prentice-Hall Health; 2000.

32. McHorney CA, Tarlov AR. Individual-patient monitoring in clinical practice: are available health status surveys adequate? Qual Life Res. 1995;4:293–307.

33. Munro BH. Statistical Methods for Health Care Research. Philadelphia, PA: JB Lippincott Co; 2000.

34. Liang MH. Longitudinal construct validity: establishment of clinical meaning in patient evaluative instruments. Med Care. 2000;38(9 suppl):II84–II90.

35. Jaeschke R, Singer J, Guyatt GH. Measurement of health status: ascertaining the minimal clinically important difference. Control Clin Trials. 1989;10:407–415.

36. Michener LA, McClure PW, Sennett BJ. American Shoulder and Elbow Surgeons Standardized Shoulder Assessment Form, patient self-report section: reliability, validity, and responsiveness. J Shoulder Elbow Surg. 2002;11:587–594.

37. Greco NJ, Anderson AF, Mann BJ, et al. Responsiveness of the International Knee Documentation Committee Subjective Knee Form in comparison to the Western Ontario and McMaster Universities Osteoarthritis Index, modified Cincinnati Knee Rating System, and Short Form 36 in patients with focal articular cartilage defects. Am J Sports Med. 2010;38:891-902.

38. Zou GY. Quantifying responsiveness of quality of life measures without an external criterion. Qual Life Res. 2005;14:1545-1552.

39. Cohen J. Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. Hillsdale, NJ: Lawrence Erlbaum Associates; 1983.

40. MacDermid JC, Drosdowech D, Faber K. Responsiveness of self-report scales in patients recovering from rotator cuff surgery. J Shoulder Elbow Surg. 2006;15:407-414.

41. Wells G, Boers M, Shea B, et al. MCID/Low Disease Activity State Workshop: low disease activity state in rheumatoid arthritis. J Rheumatol. 2003;30:1110-1111.

42. Riddle DL, Stratford PW, Binkley JM. Sensitivity to change of the Roland-Morris Back Pain Questionnaire: part 2. Phys Ther. 1998;78:1197-1207.

43. Paul A, Lewis M, Shadforth MF, et al. A comparison of four shoulder-specific questionnaires in primary care. Ann Rheum Dis. 2004;63:1293-1299.

44. Metz CE. Basic principles of ROC analysis. Semin Nucl Med. 1978;8:283-298.

45. Escorpizo R, Cieza A, Beaton D, Boonen A. Content comparison of worker productivity questionnaires in arthritis and musculoskeletal conditions using the International Classification of Functioning, Disability and Health framework. J Occup Rehabil. 2009;19:382-397.

46. Stratford PW, Binkley J, Solomon P, et al. Defining the minimum level of detectable change for the Roland-Morris Questionnaire. Phys Ther. 1996;76:359-365.

47. Schmitt JS, Di Fabio RP. Reliable change and minimum important difference (MID) proportions facilitated group responsiveness comparisons using individual threshold criteria. J Clin Epidemiol. 2004;57:1008-1018.

48. Roy JS, MacDermid JC, Faber KJ, et al. The simple shoulder test is responsive in assessing change following shoulder arthroplasty. J Orthop Sports Phys Ther. 2010;40:413-421.

Author notes

Dr Roy, Dr MacDermid, Dr Amick, Dr Shannon, Dr McMurtry, Dr Roth, Mr Tang, and Dr Beaton provided concept/idea/research design. Dr Roy, Dr MacDermid, Dr Amick, Dr McMurtry, Dr Grewal, and Dr Beaton provided writing. Dr MacDermid and Dr Beaton provided data collection and fund procurement. Dr Roy, Dr Amick, Dr Shannon, Mr Tang, and Dr Beaton provided data analysis. Dr Roy, Dr MacDermid, and Dr Beaton provided project management. Dr MacDermid, Dr McMurtry, Dr Roth, and Dr Beaton provided participants. Dr MacDermid and Dr Roth provided institutional liaisons. Dr MacDermid provided clerical support. Dr MacDermid, Dr Amick, Dr Shannon, Dr McMurtry, Dr Roth, Dr Grewal, and Mr Tang provided consultation (including review of manuscript before submission).
