Validity and Responsiveness of Presenteeism Scales in Chronic Work-Related Upper-Extremity Disorders

Table 1

Description of the Scales^a

Scale	No. of Items	Range of Scores	Type of Scale	Time Frame	Item Content	Response Options	Interpretation of Scores
RA-WIS	23	0–23	Dichotomous	At the moment	Mismatch between abilities and job demands	Yes/no	Higher score=higher risk of work disability
WLQ-25	25	0–28.6 (index score) or 0–100 (summed score)	5-point Likert scale, plus a “does not apply to my job” option	Previous 2 weeks		“All of the time” to “none of the time”	Higher score=greater productivity loss at work
Time-management demands	5	0–100^b			Difficulty handling time and scheduling demands
Physical demands	6	0–100			Ability to perform job tasks involving strength, movement, and flexibility.
Mental-interpersonal-demands	9	0–100^b			Difficulty handling cognitive job tasks and social interactions
Output demands	5	0–100^b			Diminished work quantity and quality
QuickDASH	11	0–100	5-point Likert scale	Previous week	Ability to do activities or severity of symptoms	“No difficulty” or “not at all” or “not limited at all” to “unable” or “extremely”	Higher score=greater disability
SPADI-P	5	0–100	11-point Likert scale	Previous week	Severity of pain	“No pain” to “worst pain imaginable”	Higher scores=greater pain
CPG	7	Grade 0–IV	11-point Likert scale	Previous 6 months			Higher grades=greater chronic disability and limitations
Characteristic pain intensity	3	0–100			Intensity of pain	“No pain” to “pain as bad as could be”
Disability score	3	0–100			Interference of pain with activities	“No interference” to “unable to carry on any activities”

^a RA-WIS=Work Instability Scale for Rheumatoid Arthritis, WLQ-25=Work Limitations Questionnaire-25, SPADI-P=pain subscale of the Shoulder Pain and Disability Index, CPG=Chronic Pain Grade Questionnaire.

^b Scores are reversed.

At the 6-month evaluation, participants completed a global rating of change (GRC) question using an 11-point rating scale (0=“much worse,” 5=“no change,” 10=“a lot better”): “Think about your injury now compared with when you completed the first questionnaire package. How would you rate the change in your problem overall?”20 Three other questions also were answered at the 6-month evaluation: (1) “How would you rate the ongoing effects of your injury?” (5-point rating scale); (2) “How would you rate your ability to do your paid work over the last week?” (7-point rating scale); and (3) “Are the effects of having an injury now at a level where you can ignore or cope with them and do whatever it is you need to do in your daily life?” (yes/no).

Outcome Measures

WLQ-25

The WLQ-25 asks respondents to rate their level of difficulty in performing, or their ability to perform, specific job demands over the previous 2 weeks, given their current physical health and emotional problems. Those demands are grouped into 4 types8,11: time management (n=5), physical (n=6), mental-interpersonal (n=9), and output (n=5). Responses are expressed as a percentage of time, reflecting the amount of time that the respondents felt limited in their ability to perform each category of job demands. As recommended by the scale developers, scores can be calculated when up to 50% of the item responses are missing. Two scores were used in this study: the summed score, an overall score equal to the average of all of the items rescaled to a 0 to 100 scale, and the index score, which is calculated using an algorithm that converts subscale scores into an estimate of productivity loss.21 Higher scores indicate greater at-work productivity loss.
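
As a rough illustration of the summed score described above, the following Python sketch averages the answered items and rescales the result to 0 to 100, returning no score when more than half of the items are missing. The 0-to-4 item coding, the function name `wlq_summed_score`, and the use of `None` for missing or "does not apply to my job" responses are assumptions made for illustration only. Reverse scoring and the index-score algorithm are not reproduced here.

```python
from typing import Optional, Sequence

# Assumed coding (hypothetical): 0 = "none of the time" ... 4 = "all of the time".
MAX_ITEM_VALUE = 4


def wlq_summed_score(items: Sequence[Optional[int]]) -> Optional[float]:
    """Average the answered items and rescale the result to a 0-100 scale.

    Returns None when no items are supplied or when more than 50% of the
    item responses are missing (None).
    """
    answered = [value for value in items if value is not None]
    if not items or len(answered) < len(items) / 2:
        return None
    return 100.0 * sum(answered) / (MAX_ITEM_VALUE * len(answered))


if __name__ == "__main__":
    # 25 items, 5 of them missing: the score is based on the 20 answered items.
    responses = [1] * 20 + [None] * 5
    print(wlq_summed_score(responses))
```

Averaging only the answered items, rather than imputing missing ones, is one simple way to honor the 50% rule; the developers' scoring manual may handle missing responses differently.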

RA-WIS

Work instability is defined as a state arising from a mismatch between an individual's functional abilities and the demands of his or her job.12 The RA-WIS consists of 23 questions with dichotomous (yes/no) response options. The scale is scored by summing the scores of all 23 items (higher scores indicate higher risk of work disability). Originally, the RA-WIS was developed for individuals with rheumatoid arthritis,12 but it has been used to evaluate workers with other musculoskeletal disorders.9,16,22,23

QuickDASH

The QuickDASH evaluates physical disability and symptoms of the upper extremity in individuals with upper-extremity disorders.17 The score ranges from 0 (no disability) to 100 (most severe disability). The psychometric properties of the QuickDASH have been established.17,24–26

SPADI-P

The Shoulder Pain and Disability Index measures pain and disability associated with shoulder pathology.18 In this study, only the pain subscale was completed. The total pain score ranges from 0 (“no pain”) to 100 (“worst pain imaginable”). The SPADI-P has been shown to be valid and reliable.27

CPG

The CPG measures the severity of chronic pain.19 It classifies respondents into hierarchical pain grades: grade 0 (pain-free) to grade IV (high disability-severely limiting). It includes subscale scores for characteristic pain intensity and disability score. The CPG has been shown to be reliable and valid.19,28–30

Statistical Analyses

Psychometric analyses related to validity and responsiveness were conducted with SPSS software, version 17.* The alpha level was set at .05. A priori hypotheses were established. They are presented after each psychometric property. Independent t tests and chi-square tests were used to compare the participants working at baseline who did or did not participate at the follow-up visit.

Validity

Floor and ceiling effects are the extent to which scores cluster near the less (floor) or more (ceiling) desirable health state extreme on the scale.31 Clustering at these extremes may indicate a problem with its application to specific populations. Floor and ceiling effects were considered when more than 15% of the participants achieved the highest or lowest possible scores.32
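
The 15% criterion can be sketched as follows. This is a minimal Python illustration, not the SPSS procedure used in the study; the function name `floor_ceiling` and the returned keys are hypothetical. Whether the lowest or the highest possible score corresponds to the floor or to the ceiling depends on the direction of each scale, so the sketch reports both extremes by score value.

```python
from typing import Dict, Sequence

THRESHOLD = 0.15  # an effect is flagged when more than 15% sit at an extreme


def floor_ceiling(scores: Sequence[float], lowest: float, highest: float) -> Dict[str, object]:
    """Proportion of participants at each extreme of the scale, with flags."""
    n = len(scores)
    if n == 0:
        raise ValueError("scores must not be empty")
    prop_lowest = sum(1 for s in scores if s == lowest) / n
    prop_highest = sum(1 for s in scores if s == highest) / n
    return {
        "prop_lowest": prop_lowest,
        "prop_highest": prop_highest,
        "flag_lowest": prop_lowest > THRESHOLD,
        "flag_highest": prop_highest > THRESHOLD,
    }


if __name__ == "__main__":
    # Made-up subscale scores on a 0-100 scale: 4 of 20 at the lowest score.
    example = [0] * 4 + [25] * 8 + [50] * 8
    print(floor_ceiling(example, lowest=0, highest=100))
```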

Convergent construct validity was assessed by evaluating whether presenteeism scales correlated with each other and with pain and disability scales according to expected relationships.31 Pearson correlations among the WLQ-25, RA-WIS, CPG, QuickDASH, and SPADI-P were categorized as follows: high=≥.70, moderate=.50 to .70, and low=.26 to .50.33 A priori hypotheses were that moderate correlations (.70>r>.50) would be observed between the presenteeism scales, as they evaluate different aspects of at-work disabilities, and that low to moderate correlations (.30<r<.70) would be observed between the presenteeism scales and pain and disability scales, as larger differences in their constructs were anticipated.
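
The correlation step can be illustrated with a short Python sketch that computes a Pearson coefficient and assigns it to the categories given above. The function names are hypothetical, and the handling of boundary values (for example, assigning exactly .50 to "moderate") is an assumption, because the stated ranges share their endpoints.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two paired score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


def categorize(r: float) -> str:
    """Label a correlation by its absolute size (boundary handling assumed)."""
    size = abs(r)
    if size >= 0.70:
        return "high"
    if size >= 0.50:
        return "moderate"
    if size >= 0.26:
        return "low"
    return "below low"


if __name__ == "__main__":
    # Made-up paired scores for two scales.
    scale_a = [10.0, 12.0, 15.0, 20.0, 22.0]
    scale_b = [3.0, 6.0, 5.0, 9.0, 8.0]
    r = pearson_r(scale_a, scale_b)
    print(round(r, 2), categorize(r))
```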

Known-group validity is the capacity of a test to discriminate between a group of individuals known to have a particular trait and a group of individuals who do not have the trait.31 Different comparisons were performed. Using a one-way analysis of variance (ANOVA) (Bonferroni post hoc test), participants reporting improvement according to the GRC question (7–10 on the GRC) were compared with those who were stable (4–6 on the GRC) or worse (0–3 on the GRC). Thereafter, using t tests, participants who had no to mild ongoing effects of their injury were compared with those who had moderate to very severe ongoing effects; participants who rated their ability to work as not difficult (0–3 on the scale) were compared with those who rated their ability to work as difficult (4–7 on the scale); and participants who could cope with their injury were compared with those who could not. Our a priori hypothesis was that presenteeism scales would discriminate among participants classified into different known groups (P<.05).
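
The grouping by GRC response and the ANOVA comparison can be sketched in Python as follows. The cutoffs (7–10 improved, 4–6 stable, 0–3 worse) come from the text; the function names and example data are hypothetical. The sketch computes only the F statistic; the P value and the Bonferroni post hoc comparisons reported in the study would require a statistical package and are not shown.

```python
from typing import Dict, List, Sequence, Tuple


def grc_group(grc: int) -> str:
    """Classify a 0-10 global rating of change response."""
    if not 0 <= grc <= 10:
        raise ValueError("GRC rating must be between 0 and 10")
    if grc >= 7:
        return "improved"
    if grc >= 4:
        return "stable"
    return "worse"


def split_by_grc(records: Sequence[Tuple[int, float]]) -> Dict[str, List[float]]:
    """Group (GRC rating, scale score) pairs by GRC category."""
    groups: Dict[str, List[float]] = {"improved": [], "stable": [], "worse": []}
    for grc, score in records:
        groups[grc_group(grc)].append(score)
    return groups


def anova_f(groups: Sequence[Sequence[float]]) -> float:
    """One-way ANOVA F: between-group mean square / within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need at least 2 non-empty groups and more values than groups")
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


if __name__ == "__main__":
    # Made-up (GRC rating, follow-up score) pairs.
    data = [(9, 4.0), (8, 6.0), (7, 5.0), (5, 8.0), (4, 9.0), (6, 7.0),
            (2, 12.0), (1, 14.0), (3, 13.0)]
    grouped = split_by_grc(data)
    print(anova_f(list(grouped.values())))
```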

Responsiveness

Longitudinal construct validity refers to the degree to which change over time correlates with other indicators of change. This analysis was performed using Pearson correlations between the change scores (score at 6-month evaluation minus score at baseline) of the presenteeism scales in comparison with each other and the change scores of the pain and disability scales. A priori hypotheses were that at least moderate correlations (r>.50) would be observed between the change scores of the presenteeism scales and that low correlations (.30<r<.50) would be observed between the change scores of presenteeism scales and change scores of the pain and disability scales.

Standardized response means (SRMs) are used to evaluate the ability of a measure to assess change over time. Before establishing SRMs, participants were divided into subgroups because statistical methods underlying the SRM assume that all participants change in the same direction.34 Therefore, 3 subgroups were defined according to the response to the GRC question at the 6-month follow-up: (1) those who were better (7–10 on the GRC), (2) those who were stable (4–6 on the GRC), and (3) those who were worse (0–3 on the GRC). Global rating of change questions have been used to characterize responsiveness and the clinically important difference (CID) for other outcome measures.20,35–37 Thereafter, SRMs (mean change score divided by the standard deviation of the change score) were determined for the 3 subgroups. Ninety-five percent confidence intervals (95% CIs) were calculated for the SRMs.38 An SRM was considered large if ≥0.8, moderate if between 0.5 and 0.8, and small if between 0.2 and 0.5.39 A priori hypotheses were that large SRMs for the presenteeism scales would be observed for participants who had improved and that moderate negative SRMs would be observed for participants who had worsened, as previous studies showed lower indices for subgroups of individuals who had worsened in self-report scales.16,40
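
The SRM calculation can be illustrated with the minimal Python sketch below. The change score is taken here as baseline minus follow-up so that improvement on scales where higher scores are worse yields a positive SRM; this orientation is an assumption chosen for illustration, as are the function names. The confidence interval method is not reproduced.

```python
import statistics
from typing import Sequence


def srm(baseline: Sequence[float], follow_up: Sequence[float]) -> float:
    """Standardized response mean: mean change / standard deviation of change.

    Assumed orientation: change = baseline - follow-up, so a drop in a
    "higher is worse" score gives a positive value.
    """
    if len(baseline) != len(follow_up) or len(baseline) < 2:
        raise ValueError("need paired scores for at least 2 participants")
    changes = [b - f for b, f in zip(baseline, follow_up)]
    sd = statistics.stdev(changes)  # sample standard deviation
    if sd == 0:
        raise ValueError("SRM is undefined when all change scores are equal")
    return statistics.mean(changes) / sd


def magnitude(value: float) -> str:
    """Label an SRM by absolute size using the thresholds given in the text."""
    size = abs(value)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "moderate"
    if size >= 0.2:
        return "small"
    return "below small"


if __name__ == "__main__":
    # Made-up baseline and 6-month scores for one subgroup.
    before = [12.0, 15.0, 9.0, 14.0, 11.0]
    after = [8.0, 14.0, 10.0, 9.0, 7.0]
    value = srm(before, after)
    print(round(value, 2), magnitude(value))
```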

The clinically important difference (CID) is the smallest change that represents a clinically significant change for the individual patient. There are several methods to estimate the CID.41 In this study, the GRC question (“Think about your injury now compared with when you completed the first questionnaire package. How would you rate the change in your problem overall?”) was used as the external criterion for establishing change. The model of Riddle and colleagues42 was followed, using receiver operating characteristic curves to determine the amount of change in the presenteeism scales that best differentiated those individuals who were moderately to greatly improved (8–10 on the GRC) from those who were stable or slightly improved (5–7 on the GRC). Receiver operating characteristic curves also were plotted to determine the amount of change that best differentiated those individuals who were moderately to greatly deteriorated (0–2 on the GRC) from those who were stable or slightly deteriorated (3–5 on the GRC). Receiver operating characteristic curves were constructed for both the WLQ-25 and RA-WIS by plotting sensitivity versus 1 − specificity for all possible cutoff values of the self-reported scales. The area under the curve (AUC) was evaluated for significance. A higher AUC represented greater ability of the measure to distinguish between patients who underwent a meaningful change and those who did not. The optimal cutoff value for detecting improvement or deterioration, that is, the value with the maximal average sensitivity and specificity, was taken as the point on the plot of sensitivity versus 1 − specificity nearest to the upper left-hand corner of the graph.43
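
The receiver operating characteristic procedure described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the analysis code used in the study: the function names are hypothetical, larger change scores are assumed to indicate more change in the direction of interest, and a participant is classified as changed when the change score is at or above the cutoff. The significance test for the AUC is not shown.

```python
import math
from typing import List, Sequence, Tuple

RocPoint = Tuple[float, float, float]  # (cutoff, sensitivity, 1 - specificity)


def roc_points(changes: Sequence[float], changed: Sequence[bool]) -> List[RocPoint]:
    """Sensitivity and 1 - specificity for every observed cutoff value."""
    positives = sum(1 for c in changed if c)
    negatives = len(changed) - positives
    if len(changes) != len(changed) or positives == 0 or negatives == 0:
        raise ValueError("need paired data with both groups represented")
    points = []
    for cutoff in sorted(set(changes)):
        true_pos = sum(1 for s, c in zip(changes, changed) if c and s >= cutoff)
        false_pos = sum(1 for s, c in zip(changes, changed) if not c and s >= cutoff)
        points.append((cutoff, true_pos / positives, false_pos / negatives))
    return points


def best_cutoff(points: Sequence[RocPoint]) -> float:
    """Cutoff whose ROC point is nearest to the upper left-hand corner (0, 1)."""
    return min(points, key=lambda p: math.hypot(p[2], 1.0 - p[1]))[0]


def auc(changes: Sequence[float], changed: Sequence[bool]) -> float:
    """AUC as the share of (changed, unchanged) pairs ranked correctly; ties count half."""
    pos = [s for s, c in zip(changes, changed) if c]
    neg = [s for s, c in zip(changes, changed) if not c]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up change scores; True marks participants rated 8-10 on the GRC.
    scores = [1.0, 2.0, 6.0, 3.0, 7.0, 9.0, 4.0, 8.0]
    improved = [False, False, False, False, True, True, True, True]
    print(best_cutoff(roc_points(scores, improved)), auc(scores, improved))
```

The nearest-to-corner rule and the rule of maximizing the average of sensitivity and specificity usually select similar cutoffs but are not guaranteed to coincide; the sketch implements only the former.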

Role of the Funding Source

This study was supported by grants from the Research Advisory Council of WSIB of Ontario (WSIB-RAC-05028 and WSIB-RAC-02011). The funding source was not involved in the study's design, conduct, or reporting.

Results

Six hundred fourteen participants were enrolled; 105 participants did not participate at the follow-up visit, resulting in 509 participants (eFig. 1; available at ptjournal.apta.org). No differences were found (P>.05) in the scale scores between participants working at baseline who did or did not drop out. Although the proportions of women were not significantly different, participants who dropped out were significantly younger (eTab. 1; available at ptjournal.apta.org). Of the 509 participants evaluated at the 6-month follow-up, 206 were working at baseline and at follow-up, thus completing the WLQ-25 and RA-WIS on 2 occasions. Our analyses were performed on these 206 participants (eFig. 1). One hundred forty-two participants (69%) had shoulder pain, 99 (48%) had elbow pain, 74 (36%) had wrist pain, and 79 (38%) had hand pain.

Validity

Floor and ceiling effects

At baseline, the WLQ-25 mental-interpersonal demand scores showed a ceiling effect, as 16% of the participants achieved the best possible score. At the 6-month follow-up, the WLQ-25 mental-interpersonal and output demand scores showed ceiling effects, with 29% and 23% of the participants, respectively, achieving the best possible score. No floor or ceiling effects were observed for the RA-WIS and WLQ-25 (eFig. 2; available at ptjournal.apta.org).

Convergent construct validity

A moderate correlation was observed between the WLQ-25 and RA-WIS (r=.53 for index score, r=.54 for summed score), whereas low to moderate correlations were observed between the presenteeism scales and the pain and function scales (.28<r<.62) (Tab. 2). Moderate to high correlations were observed between the WLQ-25 and its subscales (.78<r<.89), except for the physical demands subscale, for which a weak correlation (r=.06 for index score, r=.21 for summed score) was obtained.

Table 2

Convergent Construct Validity: Correlations (r) Among Self-Report Scales at Baseline^a

Scale	WLQ-25 Index Score	WLQ-25 Summed Score	RA-WIS	QuickDASH	CPG	CPG Pain	CPG Disability	SPADI-P
WLQ-25 index score (195≤n≤202)		.95**	.53**	.52**	.30**	.32**	.35**	.28**
WLQ-25 summed score (195≤n≤202)			.54**	.54**	.32**	.33**	.37**	.30**
Time-management demands (188≤n≤194)	.78**	.79**	.50**	.51**	.24*	.28**	.32**	.24*
Physical demands (191≤n≤198)	.06	.21*	.08	.01	.02	.04	.12	.01
Mental-interpersonal demands (194≤n≤201)	.83**	.86**	.46**	.47**	.27**	.26**	.23*	.23*
Output demands (188≤n≤197)	.89**	.76**	.43**	.43**	.29**	.31**	.39**	.29**
RA-WIS (202≤n≤205)				.62**	.36**	.28**	.37**	.32**

^a RA-WIS=Work Instability Scale for Rheumatoid Arthritis, WLQ-25=Work Limitations Questionnaire-25, SPADI-P=pain subscale of the Shoulder Pain and Disability Index, CPG=Chronic Pain Grade Questionnaire. Correlations above .50 (moderate [.70>r>.50] to high [r>.70] correlations) are shown in boldface type. * Significant at P<.05, ** significant at P<.01.

Known-group validity

At the 6-month follow-up, mean scores of the RA-WIS and WLQ-25 could differentiate between improved participants and participants who were stable or had worsened according to the GRC question (Tab. 3). Their mean scores also could differentiate between: (1) participants who had no to mild versus moderate to very severe ongoing effects of their injury, (2) participants who rated their ability to do paid work as not difficult versus difficult, and (3) participants who stated that they could versus could not cope with their injury (Tab. 4).

Table 3

Presenteeism Scales Scores at Baseline and 6-Month Follow-up^a

Scale	Improved (n=72): Baseline	Improved (n=72): 6-Month Follow-up	Stable (n=94): Baseline	Stable (n=94): 6-Month Follow-up	Worse (n=38): Baseline	Worse (n=38): 6-Month Follow-up
WLQ-25 index score	8.9 (5.4)	5.4 (5.6)^b,^c	8.1 (5.5)	7.4 (4.9)^b,^d	10.0 (6.9)	12.0 (6.6)^c,^d
WLQ-25 summed score	35.5 (16.7)	22.4 (18.6)^b,^c	33.4 (17.7)	31.0 (17.1)^b,^d	39.5 (20.6)	49.0 (19.9)^c,^d
Time-management demands	35.8 (29.7)	21.3 (24.1)^b,^c	33.3 (26.0)	37.1 (23.9)^b,^d	43.7 (25.2)	57.1 (29.3)^c,^d
Physical demands	48.7 (25.7)	40.1 (28.1)^b,^c	47.6 (25.6)	45.0 (21.0)^b	41.5 (24.4)	59.9 (21.8)^c
Mental-interpersonal demands	25.5 (25.1)	12.2 (21.5)^b,^c	21.5 (22.4)	20.7 (18.4)^b,^d	28.0 (26.7)	36.4 (28.1)^c,^d
Output demands	37.4 (27.6)	22.1 (27.2)^b,^c	38.3 (29.9)	29.2 (24.7)^b	51.1 (27.2)	51.5 (25.7)^c
RA-WIS	12.2 (5.2)	8.6 (6.8)^b,^c	12.9 (5.3)	13.6 (5.1)^b,^d	14.1 (4.4)	17.5 (3.6)^c,^d

^a Scores are presented as mean (standard deviation). RA-WIS=Work Instability Scale for Rheumatoid Arthritis, WLQ-25=Work Limitations Questionnaire-25.

^b Significant differences at 6-month follow-up between improved participants and stable participants.

^c Significant differences at 6-month follow-up between improved participants and participants who were worse.

^d Significant differences at 6-month follow-up between participants who were worse and stable participants.

Table 4

Known-Group Validity of Presenteeism Scales at the 6-Month Follow-up^a

Scale	Ongoing Effects of the Injury				Ability to Do Paid Work Over the Previous Week				Can Cope With the Upper-Extremity Injury
Scale	No to Mild^b	Moderate to Severe^b	P	Mean Difference (95% CI)	Not Difficult^b	Difficult^b	P	Mean Difference (95% CI)	Yes^b	No^b	P	Mean Difference (95% CI)
WLQ-25 index score	4.6 (5.8)	9.6 (5.8)	<.001	4.9 (3.1, 6.8)	5.1 (4.7)	11.8 (5.6)	<.001	6.7 (5.3, 8.2)	6.0 (5.4)	11.1 (6.0)	<.001	5.1 (3.5, 6.7)
WLQ-25 summed score	17.9 (18.7)	35.7 (19.0)	<.001	17.7 (11.6, 23.9)	19.9 (15.6)	43.1 (17.9)	<.001	23.2 (18.5, 27.9)	23.6 (17.5)	41.8 (19.5)	<.001	18.2 (13.0, 23.3)
Time-management demands	17.6 (24.6)	40.2 (26.8)	<.001	22.7 (14.1, 31.3)	19.9 (21.9)	49.9 (24.8)	<.001	30.0 (23.5, 36.6)	26.8 (25.3)	45.6 (28.1)	<.001	18.8 (11.4, 26.3)
Physical demands	31.7 (29.5)	50.5 (21.2)	<.001	18.7 (11.1, 26.4)	36.8 (26.5)	54.8 (18.9)	<.001	18.0 (11.5, 24.56)	39.8 (24.6)	54.0 (22.6)	<.001	14.2 (7.4, 20.9)
Mental-interpersonal demands	10.0 (22.5)	23.9 (22.0)	<.001	13.9 (6.7, 21.0)	10.9 (18.2)	30.7 (23.0)	<.001	19.8 (14.0, 25.6)	12.6 (19.5)	31.1 (23.0)	<.001	18.5 (12.6, 24.4)
Output demands	16.2 (24.1)	34.9 (26.2)	<.001	18.7 (10.4, 27.1)	17.0 (20.4)	44.6 (25.2)	<.001	27.6 (21.2, 34.0)	21.6 (24.0)	42.3 (26.3)	<.001	20.7 (13.6, 27.7)
RA-WIS	6.0 (5.9)	14.6 (5.0)	<.001	8.6 (6.9, 10.3)	8.9 (6.0)	16.2 (4.4)	<.001	7.3 (5.8, 8.8)	9.8 (6.2)	16.1 (4.6)	<.001	6.3 (4.8, 7.9)

^a WLQ-25=Work Limitations Questionnaire-25, RA-WIS=Work Instability Scale for Rheumatoid Arthritis, 95% CI=95% confidence interval.

^b Mean (standard deviation).

Table 4

Known-Group Validity of Presenteeism Scales at the 6-Month Follow-up^a

Scale	Ongoing Effects of the Injury				Ability to Do Paid Work Over the Previous Week				Can Cope With the Upper-Extremity Injury
Scale	No to Mild^b	Moderate to Severe^b	P	Mean Difference (95% CI)	Not Difficult^b	Difficult^b	P	Mean Difference (95% CI)	Yes^b	No^b	P	Mean Difference (95% CI)
WLQ-25 index score	4.6 (5.8)	9.6 (5.8)	<.001	4.9 (3.1, 6.8)	5.1 (4.7)	11.8 (5.6)	<.001	6.7 (5.3, 8.2)	6.0 (5.4)	11.1 (6.0)	<.001	2.1 (3.5, 6.7)
WLQ-25 summed score	17.9 (18.7)	35.7 (19.0)	<.001	17.7 (11.6, 23.9)	19.9 (15.6)	43.1 (17.9)	<.001	23.2 (18.5, 27.9)	23.6 (17.5)	41.8 (19.5)	<.001	18.2 (13.0, 23.3)
Time-management demands	17.6 (24.6)	40.2 (26.8)	<.001	22.7 (14.1, 31.3)	19.9 (21.9)	49.9 (24.8)	<.001	30.0 (23.5, 36.6)	26.8 (25.3)	45.6 (28.1)	<.001	18.8 (11.4, 26.3)
Physical demands	31.7 (29.5)	50.5 (21.2)	<.001	18.7 (11.1, 26.4)	36.8 (26.5)	54.8 (18.9)	<.001	18.0 (11.5, 24.56)	39.8 (24.6)	54.0 (22.6)	<.001	14.2 (7.4, 20.9)
Mental-interpersonal demands	10.0 (22.5)	23.9 (22.0)	<.001	13.9 (6.7, 21.0)	10.9 (18.2)	30.7 (23.0)	<.001	19.8 (14.0, 25.6)	12.6 (19.5)	31.1 (23.0)	<.001	18.5 (12.6, 24.4)
Output demands	16.2 (24.1)	34.9 (26.2)	<.001	18.7 (10.4, 27.1)	1 7.0 (20.4)	44.6 (25.2)	<.001	27.6 (21.2, 34.0)	21.6 (24.0)	42.3 (26.3)	<.001	20.7 (13.6, 27.7)
RA-WIS	6.0 (5.9)	14.6 (5.0)	<.001	8.6 (6.9, 10.3)	8.9 (6.0)	16.2 (4.4)	<.001	7.3 (5.8, 8.8)	9.8 (6.2)	16.1 (4.6)	<.001	6.3 (4.8, 7.9)

^a WLQ-25=Work Limitations Questionnaire-25, RA-WIS=Work Instability Scale for Rheumatoid Arthritis, 95% CI=95% confidence interval.

^b Mean (standard deviation).

Responsiveness

Longitudinal construct validity

A moderate correlation was observed between change scores on the WLQ-25 and the RA-WIS (eTab. 2; available at ptjournal.apta.org). Low correlations were observed between change score on the WLQ-25 and change scores on the pain and disability scales (.22<r<.41), whereas low to moderate correlations were observed between change score on the RA-WIS and change scores on the pain and disability scales (.33<r<.57).
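As an illustration of how longitudinal construct validity can be examined, the following minimal sketch correlates change scores on two scales. The function names, the sign convention (baseline minus follow-up), and the data are hypothetical and are not taken from the study.

```python
from math import sqrt

def change_scores(baseline, follow_up):
    """Change = baseline - follow-up, so positive values mean improvement
    on scales where a higher score is worse (assumed convention)."""
    return [b - f for b, f in zip(baseline, follow_up)]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for five workers (illustrative only, not study data).
wlq_change = change_scores([40.0, 35.0, 50.0, 20.0, 45.0],
                           [30.0, 33.0, 35.0, 22.0, 40.0])
rawis_change = change_scores([15, 12, 18, 8, 16],
                             [11, 12, 13, 9, 15])

r = pearson_r(wlq_change, rawis_change)
print(f"Correlation between change scores: r = {r:.2f}")
```

The same calculation would be repeated for each pair of scales to obtain a table of change-score correlations.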

SRM

For improved participants, moderate SRMs were observed for the WLQ-25 and RA-WIS (Tab. 5). The SRMs ranged from small to moderate for the WLQ-25 subscales. In participants classified as stable, the SRMs for the WLQ-25 and RA-WIS fell below the level classified as a small effect (−0.20<SRM<0.20). Participants classified as worse had moderate negative SRMs on the RA-WIS and small negative SRMs on the WLQ-25.
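The SRM is conventionally calculated as the mean change score divided by the standard deviation of the change scores. The sketch below shows that calculation for one subgroup. The sign convention (baseline minus follow-up, so that improvement is positive on scales where higher scores are worse) is an assumption, and the data are hypothetical.

```python
from statistics import mean, stdev

def standardized_response_mean(baseline, follow_up):
    """SRM = mean change / standard deviation of change.

    Change is baseline minus follow-up (assumed convention), so a positive
    SRM indicates improvement on scales where a higher score is worse.
    """
    change = [b - f for b, f in zip(baseline, follow_up)]
    return mean(change) / stdev(change)

# Hypothetical RA-WIS scores for six workers rated as improved
# (illustrative only, not study data).
baseline = [15, 12, 18, 9, 16, 14]
follow_up = [11, 12, 13, 10, 12, 13]

srm = standardized_response_mean(baseline, follow_up)
print(f"SRM = {srm:.2f}")
```

Computing the SRM separately for the improved, stable, and worse subgroups yields one row of a table such as Table 5. The confidence intervals reported there require an additional step that is not shown here.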

Table 5

Responsiveness (Standardized Response Means With 95% Confidence Interval) of Presenteeism Scales^a

Scale	Global Rating of Change
Scale	Improved (n=72)	Stable (n=94)	Worse (n=38)
WLQ-25 index score	0.63 (0.38, 0.89)	0.15 (−0.06, 0.35)	−0.37 (−0.71, 0.03)
WLQ-25 summed score	0.65 (0.40, 0.91)	0.14 (−0.07, 0.34)	−0.49 (−0.83, −0.14)
Time-management demands	0.40 (0.15, 0.64)	−0.15 (−0.36, 0.07)	−0.52 (−0.88, −0.15)
Physical demands	0.24 (0.00, 0.48)	0.10 (−0.12, 0.31)	−0.63 (−1.00, −0.25)
Mental-interpersonal demands	0.52 (0.27, 0.77)	0.03 (−0.17, 0.24)	−0.34 (−0.69, 0.00)
Output demands	0.47 (0.19, 0.75)	0.29 (0.05, 0.53)	−0.01 (−0.39, 0.37)
RA-WIS	0.66 (0.40, 0.91)	−0.18 (−0.38, 0.03)	−0.72 (−1.08, −0.36)

^a WLQ-25=Work Limitations Questionnaire-25, RA-WIS=Work Instability Scale for Rheumatoid Arthritis.

CID

For improvement, the AUC was 0.84 for the RA-WIS, 0.72 for the WLQ-25 summed score, and 0.73 for the WLQ-25 index score. For deterioration, the AUC was 0.68 for both the RA-WIS and the WLQ-25 summed score and 0.80 for the WLQ-25 index score (Figs. 1 and 2). All AUCs showed discriminative ability statistically better than chance (AUC=0.50, P<.001). For improvement, the discriminative ability was good for the RA-WIS and fair for the WLQ-25, whereas for deterioration, the discriminative ability was good for the WLQ-25 index score and poor for the RA-WIS and WLQ-25 summed score.44 For improvement, the CID, which is defined by the optimal cutoff point, was 13 points (out of 100) for the WLQ-25 summed score (sensitivity=60%, specificity=73%), 5 points (out of 28.6) for the WLQ-25 index score (sensitivity=62%, specificity=84%), and 4 points (out of 23) for the RA-WIS (sensitivity=74%, specificity=85%). For deterioration, the CID was 1 point for both the WLQ-25 summed score (sensitivity=62%, specificity=74%) and the WLQ-25 index score (sensitivity=80%, specificity=74%) and 2 points for the RA-WIS (sensitivity=60%, specificity=72%).
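To make the ROC-based approach concrete, the following sketch computes an AUC and selects a cutoff from change scores and a dichotomous anchor. It is not the authors' procedure: the cutoff is chosen here by maximizing Youden's J (sensitivity + specificity − 1), which is one common definition of the "optimal" point and is an assumption, and the data and function names are hypothetical.

```python
def auc(change, improved):
    """AUC as the probability that an improved worker has a larger change
    score than a non-improved worker (ties count one half)."""
    pos = [c for c, y in zip(change, improved) if y]
    neg = [c for c, y in zip(change, improved) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def roc_points(change, improved):
    """(cutoff, sensitivity, specificity) for every observed change score.

    A worker is classified as improved when change >= cutoff.
    """
    n_pos = sum(improved)
    n_neg = len(improved) - n_pos
    points = []
    for cutoff in sorted(set(change)):
        tp = sum(1 for c, y in zip(change, improved) if y and c >= cutoff)
        tn = sum(1 for c, y in zip(change, improved) if not y and c < cutoff)
        points.append((cutoff, tp / n_pos, tn / n_neg))
    return points

def optimal_cutoff(change, improved):
    """Cutoff maximizing Youden's J (assumed criterion, see lead-in)."""
    return max(roc_points(change, improved), key=lambda p: p[1] + p[2] - 1)

# Hypothetical change scores (baseline - follow-up) and anchor ratings
# (True = large improvement on the global rating of change). Not study data.
change = [6, 5, 4, 4, 2, 0, 3, 1, 0, -1, 1, 2]
improved = [True] * 6 + [False] * 6

area = auc(change, improved)
cutoff, sensitivity, specificity = optimal_cutoff(change, improved)
print(f"AUC = {area:.2f}, cutoff = {cutoff}, "
      f"sensitivity = {sensitivity:.0%}, specificity = {specificity:.0%}")
```

For deterioration, the same functions could be applied to the negated change scores with an anchor that flags large deterioration.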

Figure 1

Receiver operating characteristic curves for the Work Limitations Questionnaire-25 (WLQ-25) and Work Instability Scale for Rheumatoid Arthritis (RA-WIS) cutoff points for differentiating clinically important improvement.

Figure 2

Receiver operating characteristic curves for the Work Limitations Questionnaire-25 (WLQ-25) and Work Instability Scale for Rheumatoid Arthritis (RA-WIS) cutoff points for differentiating clinically important deterioration.


Discussion

The present results suggest that 2 presenteeism scales, the WLQ-25 and the RA-WIS, provide distinct information from pain and disability scales, as relatively small shared variation was demonstrated. The WLQ-25 and RA-WIS are more highly correlated with the QuickDASH, which mainly addresses upper-extremity disability, than with the CPG or SPADI-P, which mainly address pain. Because shared variation between the WLQ-25 and RA-WIS also was small, the results suggest that they may not be capturing the same aspects of presenteeism. Furthermore, both measures were moderately responsive to changes for workers with chronic work-related upper-extremity disorders, suggesting the potential to evaluate change in health-related work productivity over time. Both scales also discriminated among different levels of self-rated work ability, suggesting they may be able to classify patients into subgroups according to their capacity to work. Two subscales of the WLQ-25 showed problems with floor and ceiling effects; therefore, the use of these subscales alone is not warranted for this population.

The relationships between the presenteeism scales were moderate, as expected (r=.53–.54). The correlation coefficients indicate a shared variation near 30% (coefficient of determination [R²]), suggesting that there is some overlap in the presenteeism concepts measured. However, because about 70% of the variation is not shared, the 2 presenteeism scales also appear to evaluate different aspects of at-work disabilities. This finding would be expected, given differences in the conceptual framing of these measures. According to the International Classification of Functioning, Disability and Health, most items of the WLQ-25 are related to activity limitations and participation restrictions, and most items of the RA-WIS are related to body functions.45 Furthermore, one scale requires responses on the amount of time workers have difficulty handling parts of their job,11 whereas the other is concerned with the mismatch between workers' abilities and the demands of their job.12
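The shared variation quoted above follows directly from squaring the correlation coefficient. A short worked example (the function name is hypothetical):

```python
def shared_variance_percent(r):
    """Coefficient of determination (r squared) expressed as a percentage."""
    return 100 * r ** 2

for r in (0.53, 0.54):
    shared = shared_variance_percent(r)
    print(f"r = {r:.2f}: shared = {shared:.1f}%, not shared = {100 - shared:.1f}%")
```

Both coefficients give a shared variation just under 30%, leaving roughly 70% of the variation in one scale unexplained by the other.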

In workers with rheumatoid arthritis, a score less than 10 on the RA-WIS has been proposed as being indicative of low risk of work disability, a score between 10 and 17 as indicating medium risk, and a score above 17 as indicating high risk.12 In the present study, similar benchmarks were observed. Workers who rated their ability to work as not difficult or who could cope with their injury had a mean score below 10, whereas workers who rated their ability to work as difficult or who could not cope with their injury had a mean score above 16. Known-group validity can provide useful benchmarks for clinicians in making clinical judgments because scores can be used to make statements about whether the score for an individual patient is consistent with the subgroups of difficulty defined in this study. The WLQ-25 has a directly interpretable score. In the population studied, mean summed scores over 40 and mean index scores over 10 were consistent with having difficulty in meeting work demands, whereas mean summed scores of less than 20 and mean index scores of less than 6 were consistent with being able to meet work demands. A summed score of 40 means that 2 days a week (40% of the time) an employee is unable to meet the demands of the job because of health, whereas an index score of 10 means a decrease in productivity of 9.5% compared with workers who are healthy.21 A summed score of 20 means an employee is unable to meet the demands of the job 1 day a week, whereas an index score of 6 means a decrease in productivity of 5.8%.21 Depending on the work site's operational needs and the worker's health, clinicians can make judgments about reasonable cutoff points.
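The interpretations above can be reproduced with simple arithmetic. In the sketch below, the summed score is read as the percentage of time a worker is limited, converted to days assuming a 5-day work week. The index-to-productivity-loss transformation, 1 − exp(−index/100), is an assumption: it reproduces the 9.5% and 5.8% figures quoted in the text, but the authoritative formula is the one in the WLQ scoring report.21 Function names are hypothetical.

```python
import math

WORK_DAYS_PER_WEEK = 5  # assumption: standard 5-day work week

def days_limited_per_week(summed_score):
    """WLQ-25 summed score (0-100) read as the percentage of time limited."""
    return summed_score / 100 * WORK_DAYS_PER_WEEK

def productivity_loss_percent(index_score):
    """Estimated productivity loss relative to healthy workers.

    Assumed transformation: loss = 1 - exp(-index / 100).
    """
    return 100 * (1 - math.exp(-index_score / 100))

for summed in (40, 20):
    print(f"Summed score {summed}: limited about "
          f"{days_limited_per_week(summed):.0f} day(s) per week")

for index in (10, 6):
    print(f"Index score {index}: productivity loss about "
          f"{productivity_loss_percent(index):.1f}%")
```

Under the same assumed transformation, the maximum index score of 28.6 would correspond to a productivity loss of about 25%.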

One of the main study contributions is new evidence about responsiveness. The SRMs were similar for the improved subgroup for both the RA-WIS and WLQ-25, suggesting the scales have similar ability to detect group-level improvement in work-related disorders. The SRMs were higher for the RA-WIS for the worsened subgroup. However, the 95% CIs of the SRMs were overlapping, which could indicate that the difference was not significant. Previously, only Beaton et al16 had reported responsiveness indices for presenteeism scales, observing varying levels of responsiveness for the RA-WIS and the WLQ-25 in improved (0.28≤SRM≤0.64) and deteriorated (0.00≤SRM≤0.88) subgroups. However, comparisons between the 2 studies are limited as a result of different rating questions being used. The amount of change considered clinically important had not been previously established for these scales. Establishing CID allows rehabilitation professionals to set targets for improvement or deterioration. We used large improvement or deterioration in overall upper-extremity condition as an anchor for the CID. Individual workers exceeding the CID would have been likely to have had a definitive change in work productivity. The interpretation of CID also must be judged against the minimal detectable change (MDC),46 which describes the amount of change that is likely due to measurement error. We did not have the data to produce MDC in the current study, and this is something to work toward in future research.

With these caveats, a 13-point change on the WLQ-25 summed score (13% of total score), a 5-point change on the WLQ-25 index score (17% of total score), and a 4-point change on the RA-WIS (17% of total score) may be used to estimate the minimal amount of change necessary to be considered a clinically relevant improvement. These findings mean that if a patient is able to report a favorable response on 4 more questions listed on the RA-WIS or reduce the WLQ-25 summed score by 13 points, he or she has achieved an improvement. A 13-point change in the WLQ-25 summed score (eg, from 56 to 43) also means the person is now able to meet the demands of his or her job on 3 out of 5 days instead of only on 2 days. The relative change needed to reach CID on the RA-WIS and WLQ-25 compares favorably with that of upper-extremity–specific pain and disability scales, such as the Disabilities of the Arm, Shoulder, and Hand outcome measure (10% of the total score),47 the Shoulder Pain and Disability Index (13%),47 and the Simple Shoulder Test (25%).48 Although the relative change required on the RA-WIS and on the WLQ-25 index score to reach a CID was higher compared with the WLQ-25 summed score, this finding was associated with greater sensitivity and specificity to correctly classify improvement. A 4-point change in RA-WIS scores correctly classified 74% of the patients reporting moderate to great improvement, whereas 13-point and 5-point changes, respectively, in WLQ-25 summed and index scores correctly classified 60% and 64% of the patients. Patients not meeting these cutoffs agreed with no or slight improvement in 85% of cases for the RA-WIS, 73% for the WLQ-25 summed score, and 84% for the WLQ-25 index score.
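The percentages of total score and the use of the CID as a threshold can be checked with a few lines of arithmetic. The sketch below uses the scale maxima and improvement CIDs reported in this article; the dictionary and function names are hypothetical, and improvement is taken as a decrease in score because higher scores are worse on both scales.

```python
SCALE_MAX = {"WLQ-25 summed": 100.0, "WLQ-25 index": 28.6, "RA-WIS": 23.0}
CID_IMPROVEMENT = {"WLQ-25 summed": 13.0, "WLQ-25 index": 5.0, "RA-WIS": 4.0}

def cid_percent_of_range(scale):
    """CID for improvement as a percentage of the maximum possible score."""
    return 100 * CID_IMPROVEMENT[scale] / SCALE_MAX[scale]

def meets_cid(scale, baseline, follow_up):
    """True when the decrease in score reaches the CID for improvement."""
    return (baseline - follow_up) >= CID_IMPROVEMENT[scale]

for scale in SCALE_MAX:
    print(f"{scale}: CID is {cid_percent_of_range(scale):.0f}% of total score")

# Example from the text: a WLQ-25 summed score falling from 56 to 43.
print(meets_cid("WLQ-25 summed", 56, 43))
```

A worker whose RA-WIS score fell from 15 to 12 would not reach the 4-point threshold under this rule.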

In contrast, a deterioration of only 1.0 point on the WLQ-25 (index and summed scores) and 2.0 points on the RA-WIS reflects the minimal amount of change necessary to be considered a clinically relevant deterioration. These CIDs are likely less than the day-to-day variability in scores that could be estimated by the MDC. Clinicians, therefore, should apply these cutoffs with caution when tracking workers with deteriorating conditions.

The least responsive WLQ-25 subscale was physical demands. The physical demands subscale covers workers' ability to perform job tasks that involve bodily strength, movement, endurance, coordination, and flexibility.11 The questions on the physical demands subscale are worded positively, with each question beginning with “How much of the time were you able to …,” and are scored from 1 (“all of the time”) to 5 (“none of the time”). The questions of the other 3 WLQ-25 subscales are worded negatively, with each question beginning with “How much of the time did your physical health or emotional problems make it difficult for you to do … ,” and are scored from 1 (“none of the time”) to 5 (“all of the time”). It is plausible that some participants did not pay attention to the change in instructions and continued to respond as if the questions asked about difficulty. Future studies may include cognitive interviewing or alternate-forms assessments to determine whether differential response patterns exist. Alternatively, the target study population may be less likely to experience positive changes in physical ability over the time frame studied because the patients had complex musculoskeletal disorders. The mean scores on the physical demands subscale were higher, on average, than the mean scores of the other subscales and stayed higher over time, leading to smaller responsiveness indices. This finding makes sense, given the clinical population.

This study had limitations. First, workers attended a specialized clinic where referrals were predetermined by the caseworker at the WSIB. Although the catchment area was large, generalizability of these results to other populations is unknown. Workers considered for this study had chronic injuries with a recovery that was not timely or satisfactory, also limiting the generalizability beyond this population. However, the population represents a challenging group of workers often seen in clinics. Furthermore, the external criterion of change for the determination of SRM and CID was related to the overall change in the upper-extremity condition and not to change in work ability or productivity. Because the scales evaluate work presenteeism, different indices could have been obtained using a work-related criterion of change. Concerns could be raised over the use of a GRC question to define the CID because of possible recall bias and because a 1-item scale was used to validate multi-item scales. There are a variety of methods to provide an external criterion to establish which patients have changed clinically, including clinician-reported GRCs and measurements of physical impairment. None of these methods are without limitations, and we recognize that our methods have limitations. Self-report scales were exclusively used to establish the properties of presenteeism scales and may not reflect other approaches to evaluating at-work disability.

In conclusion, our study presents evidence of construct validity and responsiveness for instruments that aim to measure at-work instability (RA-WIS) and at-work health-related productivity loss (WLQ-25) in participants with chronic work-related upper-extremity disorders. Clinically important differences were suggested for both tools. Measurement of work outcomes is important in rehabilitation. Therefore, the study provides psychometric evidence about 2 work outcome measures that could be included in the evaluation of workers.

The Bottom Line

What do we already know about this topic?

Presenteeism scales have been developed to evaluate loss of work productivity due to illness or injury in people who are present at their job. Validity of available scales is emerging, but evidence on responsiveness remains scarce.

What new information does this study offer?

The results suggest that 2 presenteeism scales—the Work Limitations Questionnaire-25 (WLQ-25) and Work Instability Scale for Rheumatoid Arthritis (RA-WIS)—are moderately responsive to change for workers with chronic work-related upper-extremity disorders.

If you're a patient, what might these findings mean for you?

Following your treatment, 2 presenteeism scales—the WLQ-25 and RA-WIS—have the potential to measure changes in your health-related work productivity over time.

The authors acknowledge the collaboration of the staff and injured workers from the 2 WSIB specialty clinics involved in this study: the Holland Orthopaedic and Arthritis Centre Shoulder and Elbow Specialty Clinic, Toronto, Ontario, Canada, and St Joseph's Health Centre Upper Extremity Specialty Clinic, London, Ontario, Canada. The authors also acknowledge the contribution of the other investigators involved in the larger study design and implementation: Pierre Cote, Renee-Louise Franche, Sheilah Hogg-Johnson, Sonia Pagura, Robin Richards, and Claire Bombardier. Special thanks to Diana Sayers, Muge Dogan, Taucha Inrig, and Iona MacRitchie for assistance with implementing the study.

This study was approved by the institutional review boards of the University of Western Ontario, St Michael's Hospital, Sunnybrook Health Sciences Centre, and the University of Toronto.

During the conduct of this study, Dr Roy was supported by scholarships from the Fonds de la Recherche en Santé du Québec (FRSQ) and the Canadian Institutes of Health Research (CIHR), and Dr MacDermid and Dr Beaton were supported by a CIHR New Investigators award. Mr Tang is supported by a CIHR PhD fellowship, a Canadian Arthritis Network/Arthritis Society Trainee Fellowship, and a Syme Fellowship from the Institute for Work & Health.

* SPSS Inc, 233 S Wacker Dr, Chicago, IL 60606.

References

1. Prasad M, Wahlqvist P, Shikiar R, Shih YC. A review of self-report instruments measuring health-related work productivity: a patient-reported outcomes perspective. Pharmacoeconomics. 2004;22:225–244.

2. Williams RM, Schmuck G, Allwood S, et al. Psychometric evaluation of health-related work outcome measures for musculoskeletal disorders: a systematic review. J Occup Rehabil. 2007;17:504–521.

3. Aronsson G, Gustafsson K, Dallner M. Sick but yet at work: an empirical study of sickness presenteeism. J Epidemiol Community Health. 2000;54:502–509.

4. McKevitt C, Morgan M, Dundas R, Holland WW. Sickness absence and “working through” illness: a comparison of two professional groups. J Public Health Med. 1997;19:295–300.

5. Koopman C, Pelletier KR, Murray JF, et al. Stanford presenteeism scale: health status and employee productivity. J Occup Environ Med. 2002;44:14–20.

6. Sanderson K, Tilse E, Nicholson J, et al. Which presenteeism measures are more sensitive to depression and anxiety? J Affect Disord. 2007;101:65–74.

7. Denis S, Shannon HS, Wessel J, et al. Association of low back pain, impairment, disability and work limitations in nurses. J Occup Rehabil. 2007;17:213–226.

8. Lerner D, Reed JI, Massarotti E, et al. The Work Limitations Questionnaire's validity and reliability among patients with osteoarthritis. J Clin Epidemiol. 2002;55:197–208.

9. Tang K, Pitts S, Solway S, Beaton D. Comparison of the psychometric properties of four at-work disability measures in workers with shoulder or elbow disorders. J Occup Rehabil. 2009;19:142–154.

10. MacKenzie EJ, Bosse MJ, Kellam JF, et al. Early predictors of long-term work disability after major limb trauma. J Trauma. 2006;61:688–694.

11. Lerner D, Amick BC III, Rogers WH, et al. The work limitations questionnaire. Med Care. 2001;39:72–85.

12. Gilworth G, Chamberlain MA, Harvey A, et al. Development of a work instability scale for rheumatoid arthritis. Arthritis Rheum. 2003;49:349–354.

13. Macedo A, Oakley S, Gullick N, Kirkham B. An examination of work instability, functional impairment, and disease activity in employed patients with rheumatoid arthritis. J Rheumatol. 2009;36:225–230.

14. Turpin RS, Ozminkowski RJ, Sharda CE, et al. Reliability and validity of the Stanford Presenteeism Scale. J Occup Environ Med. 2004;46:1123–1133.

15. Lerner D, Chang H, Rogers WH, et al. A method for imputing the impact of health problems on at-work performance and productivity from available health data. J Occup Environ Med. 2009;51:515–524.

16. Beaton DE, Tang K, Gignac MA, et al. Reliability, validity, and responsiveness of five at-work productivity measures in patients with rheumatoid arthritis or osteoarthritis. Arthritis Care Res (Hoboken). 2010;62:28–37.

17. Beaton DE, Wright JG, Katz JN. Development of the QuickDASH: comparison of three-item reduction approaches. J Bone Joint Surg Am. 2005;87:1038–1046.

18. Roach KE, Budiman-Mak E, Songsiridej N, Lertratanakul Y. Development of a shoulder pain and disability index. Arthritis Care Res. 1991;4:143–149.

19. Von Korff M, Ormel J, Keefe FJ, Dworkin SF. Grading the severity of chronic pain. Pain. 1992;50:133–149.

20. Stewart M, Maher CG, Refshauge KM, et al. Responsiveness of pain and disability measures for chronic whiplash. Spine (Phila Pa 1976). 2007;32:580–585.

21. Lerner D, Rogers WH, Chang H. Technical report: scoring the Work Limitations Questionnaire (WLQ) and the WLQ index for estimating work productivity loss. Revised April 2003. Available from the authors with purchase of the questionnaire.

22. Tang K, Beaton DE, Lacaille D, et al. The Work Instability Scale for Rheumatoid Arthritis (RA-WIS): does it work in osteoarthritis? Qual Life Res. 2010;19:1057–1068.

23. Tang K, Beaton DE, Gignac MA, et al. The work instability scale for rheumatoid arthritis (RA-WIS) predicts arthritis-related work transitions within 12 months. Arthritis Care Res (Hoboken). 2010 June 2 [Epub ahead of print].

24. Gummesson C, Ward MM, Atroshi I. The shortened Disabilities of the Arm, Shoulder and Hand questionnaire (QuickDASH): validity and reliability based on responses within the full-length DASH. BMC Musculoskelet Disord. 2006;7:44.

25. Matheson LN, Melhorn JM, Mayer TG, et al. Reliability of a visual analog version of the QuickDASH. J Bone Joint Surg Am. 2006;88:1782–1787.

26. Fayad F, Lefevre-Colau MM, Gautheron V, et al. Reliability, validity and responsiveness of the French version of the questionnaire Quick Disability of the Arm, Shoulder and Hand in shoulder disorders. Man Ther. 2009;14:206–212.

27. Roy JS, MacDermid JC, Woodhouse LJ. Measuring shoulder function: a systematic review of four questionnaires. Arthritis Rheum. 2009;61:623–632.

28. Elliott AM, Smith BH, Smith WC, Chambers WA. Changes in chronic pain severity over time: the Chronic Pain Grade as a valid measure. Pain. 2000;88:303–308.

29. Smith BH, Penny KI, Purves AM, et al. The Chronic Pain Grade questionnaire: validation and reliability in postal research. Pain. 1997;71:141–147.

30. Salaffi F, Stancati A, Grassi W. Reliability and validity of the Italian version of the Chronic Pain Grade questionnaire in patients with musculoskeletal disorders. Clin Rheumatol. 2006;25:619–631.

31. Portney LG, Watkins MP. Foundations of Clinical Research: Applications to Practice. 2nd ed. Upper Saddle River, NJ: Prentice-Hall Health; 2000.

32. McHorney CA, Tarlov AR. Individual-patient monitoring in clinical practice: are available health status surveys adequate? Qual Life Res. 1995;4:293–307.

33. Munro BH. Statistical Methods for Health Care Research. Philadelphia, PA: JB Lippincott Co; 2000.

34. Liang MH. Longitudinal construct validity: establishment of clinical meaning in patient evaluative instruments. Med Care. 2000;38(9 suppl):II84–II90.

35. Jaeschke R, Singer J, Guyatt GH. Measurement of health status: ascertaining the minimal clinically important difference. Control Clin Trials. 1989;10:407–415.

36. Michener LA, McClure PW, Sennett BJ. American Shoulder and Elbow Surgeons Standardized Shoulder Assessment Form, patient self-report section: reliability, validity, and responsiveness. J Shoulder Elbow Surg. 2002;11:587–594.

37. Greco NJ, Anderson AF, Mann BJ, et al. Responsiveness of the International Knee Documentation Committee Subjective Knee Form in comparison to the Western Ontario and McMaster Universities Osteoarthritis Index, modified Cincinnati Knee Rating System, and Short Form 36 in patients with focal articular cartilage defects. Am J Sports Med. 2010;38:891–902.

38. Zou GY. Quantifying responsiveness of quality of life measures without an external criterion. Qual Life Res. 2005;14:1545–1552.

39. Cohen J. Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. Hillsdale, NJ: Lawrence Erlbaum Associates; 1983.

40. MacDermid JC, Drosdowech D, Faber K. Responsiveness of self-report scales in patients recovering from rotator cuff surgery. J Shoulder Elbow Surg. 2006;15:407–414.

41. Wells G, Boers M, Shea B, et al. MCID/Low Disease Activity State Workshop: low disease activity state in rheumatoid arthritis. J Rheumatol. 2003;30:1110–1111.