* SPSS syntax for calculating the Newcombe-Wilson hybrid score confidence interval
* for the difference between independent proportions.
* Dr. M. van Leerdam
* Bare Statistics 2019 (www.barestatistics.nl)
* Method described in:
* Newcombe, R. G. (1998). Two-sided confidence intervals for the single proportion: comparison of seven methods. Statistics in Medicine, 17, 857-872.
* Newcombe, R. G. (2013). Confidence Intervals for Proportions and Related Measures of Effect Size. Boca Raton: CRC Press.

* First, a copy is made of your data file.
DATASET NAME Original.
DATASET COPY Copy.
DATASET ACTIVATE Copy.
DATASET CLOSE Original.

* Now, replace MYINDEPENDENTVARIABLE and MYDEPENDENTVARIABLE below by the names of your own independent and dependent variables; they are renamed to "IV" and "DV".
RENAME VARIABLES MYINDEPENDENTVARIABLE = IV.
RENAME VARIABLES MYDEPENDENTVARIABLE = DV.
FREQUENCIES IV DV.

* From here, you can choose Run All from the menu!

* DV must be binary coded (1, 0).
* If this is not the case (for example if the variable is coded 2 and 1), the syntax below will recode the larger value to 1 and the smaller value to 0.
* For reversed coding, switch (ELSE=0) and (ELSE=1).
* ELSE IF is used so that a case recoded in the first branch is never recoded again in the second.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=
  /DV_min=MIN(DV)
  /DV_max=MAX(DV).
DO IF (DV = DV_min).
RECODE DV (ELSE=0).
ELSE IF (DV = DV_max).
RECODE DV (ELSE=1).
END IF.
VARIABLE LEVEL DV (NOMINAL).
VARIABLE LABELS DV 'Outcome'.
FORMATS DV (F8.0).
EXECUTE.
DELETE VARIABLES DV_min DV_max.

* IV must have values (1, 2).
* If this is not the case (for example if the variable is coded 1 and 0), the syntax below will recode the larger value to 2 and the smaller value to 1.
RENAME VARIABLES IV=IV_original.
AUTORECODE VARIABLES=IV_original /INTO IV.
FORMATS IV (F15.0).
* For reversed coding, use: AUTORECODE VARIABLES=IV_original /INTO IV /DESCENDING.

* Calculate proportions.
* S_1 and S_2 are the numbers of cases with DV = 1 in each group; N_1 and N_2 are the group sizes.
COMPUTE filter_$=(IV = 1).
FILTER BY filter_$.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=
  /S_1=SUM(DV)
  /N_1=NU(DV).
COMPUTE filter_$=(IV = 2).
FILTER BY filter_$.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=
  /S_2=SUM(DV)
  /N_2=NU(DV).
FILTER OFF.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES OVERWRITEVARS=YES
  /BREAK=
  /S_1=MEAN(S_1)
  /N_1=MEAN(N_1)
  /S_2=MEAN(S_2)
  /N_2=MEAN(N_2).
DELETE VARIABLES filter_$.
COMPUTE P_1 = S_1 / N_1.
COMPUTE P_2 = S_2 / N_2.
FORMATS S_1 S_2 N_1 N_2 (F8.0).
FORMATS P_1 P_2 (F8.3).
VARIABLE LABELS P_1 'Proportion Group 1'.
VARIABLE LABELS P_2 'Proportion Group 2'.
VARIABLE LABELS N_1 'Sample size Group 1'.
VARIABLE LABELS N_2 'Sample size Group 2'.
EXECUTE.

* Choose the confidence level (C) by setting the tail probability p = (1 - C/100) / 2 in IDF.NORMAL below.
* The default is p = .025, which gives 95% confidence.
* C = 99%: p = .005; C = 95%: p = .025; C = 90%: p = .050; C = 80%: p = .100; C = 50%: p = .250.
COMPUTE Z = ABS(IDF.NORMAL(.025,0,1)).

* Store the confidence level as a text label for the output tables.
COMPUTE C = 100 * ((2 * CDF.NORMAL(Z,0,1)) - 1).
FORMATS C (F8.0).
ALTER TYPE C (A2).
STRING Cstring (A3).
COMPUTE Cstring = CONCAT(C,"%").
VARIABLE LABELS Cstring 'Confidence Level'.

* Calculate lower (LL) and upper (UL) confidence limits.
* LL_1, UL_1, LL_2 and UL_2 are the Wilson score limits for the proportion in each group.
COMPUTE LL_1 = ((2 * S_1 + Z**2) - Z * SQRT(Z**2 + 4 * S_1 * (1 - P_1))) / (2 * (N_1 + Z**2)).
COMPUTE UL_1 = ((2 * S_1 + Z**2) + Z * SQRT(Z**2 + 4 * S_1 * (1 - P_1))) / (2 * (N_1 + Z**2)).
COMPUTE LL_2 = ((2 * S_2 + Z**2) - Z * SQRT(Z**2 + 4 * S_2 * (1 - P_2))) / (2 * (N_2 + Z**2)).
COMPUTE UL_2 = ((2 * S_2 + Z**2) + Z * SQRT(Z**2 + 4 * S_2 * (1 - P_2))) / (2 * (N_2 + Z**2)).
* LL and UL are the Newcombe-Wilson hybrid score limits for the difference D.
COMPUTE D = P_1 - P_2.
COMPUTE LL = P_1 - P_2 - SQRT((P_1 - LL_1)**2 + (UL_2 - P_2)**2).
COMPUTE UL = P_1 - P_2 + SQRT((UL_1 - P_1)**2 + (P_2 - LL_2)**2).
FORMATS Z P_1 P_2 LL_1 UL_1 LL_2 UL_2 D LL UL (F8.3).
VARIABLE LABELS D 'Difference between proportions'.
VARIABLE LABELS LL_1 'Lower Limit'.
VARIABLE LABELS LL_2 'Lower Limit'.
VARIABLE LABELS LL 'Lower Limit'.
VARIABLE LABELS UL_1 'Upper Limit'.
VARIABLE LABELS UL_2 'Upper Limit'.
VARIABLE LABELS UL 'Upper Limit'.
EXECUTE.
OUTPUT CLOSE *.

* Table with counts, proportions, and confidence limits.
* The Case Processing Summary tables are suppressed, and only the first case is listed because the computed values are the same on every case.
OMS
  /SELECT TABLES
  /IF COMMANDS=['Summarize'] SUBTYPES=['Case Processing Summary']
  /DESTINATION VIEWER=NO.
USE 1 THRU 1.
SUMMARIZE
  /TABLES=Cstring N_1 P_1 LL_1 UL_1 N_2 P_2 LL_2 UL_2
  /FORMAT=VALIDLIST NOCASENUM TOTAL
  /TITLE='Wilson score confidence intervals for single proportions'
  /MISSING=VARIABLE
  /CELLS=NONE.
SUMMARIZE
  /TABLES=D Cstring LL UL
  /FORMAT=VALIDLIST NOCASENUM TOTAL
  /TITLE='Newcombe-Wilson hybrid score confidence interval for difference between independent proportions'
  /MISSING=VARIABLE
  /CELLS=NONE.
OMSEND.
USE ALL.

* Cross-tabulation of the outcome by group, with a chi-square test.
CROSSTABS
  /TABLES=DV BY IV
  /FORMAT=AVALUE TABLES
  /STATISTICS=CHISQ
  /CELLS=COUNT COLUMN
  /COUNT ROUND CELL.

* Clean up: remove the helper variables and restore IV with its original coding.
DELETE VARIABLES IV TO UL.
RENAME VARIABLES IV_original=IV.
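
* Optional self-check, added as a minimal sketch and not part of the original syntax.
* It repeats the formulas above on summary counts, assuming hypothetical data of
* 56 successes out of 70 cases in group 1 and 48 out of 80 in group 2, at 95% confidence.
* Running it creates a small separate one-case dataset, so leave it out if you only want the tables above.
* The listed limits should come out at about .052 and .334 for a difference of .200.
DATA LIST FREE / S_1 N_1 S_2 N_2.
BEGIN DATA
56 70 48 80
END DATA.
COMPUTE Z = ABS(IDF.NORMAL(.025,0,1)).
COMPUTE P_1 = S_1 / N_1.
COMPUTE P_2 = S_2 / N_2.
* Wilson score limits for each group.
COMPUTE LL_1 = ((2 * S_1 + Z**2) - Z * SQRT(Z**2 + 4 * S_1 * (1 - P_1))) / (2 * (N_1 + Z**2)).
COMPUTE UL_1 = ((2 * S_1 + Z**2) + Z * SQRT(Z**2 + 4 * S_1 * (1 - P_1))) / (2 * (N_1 + Z**2)).
COMPUTE LL_2 = ((2 * S_2 + Z**2) - Z * SQRT(Z**2 + 4 * S_2 * (1 - P_2))) / (2 * (N_2 + Z**2)).
COMPUTE UL_2 = ((2 * S_2 + Z**2) + Z * SQRT(Z**2 + 4 * S_2 * (1 - P_2))) / (2 * (N_2 + Z**2)).
* Newcombe-Wilson hybrid score limits for the difference.
COMPUTE D = P_1 - P_2.
COMPUTE LL = D - SQRT((P_1 - LL_1)**2 + (UL_2 - P_2)**2).
COMPUTE UL = D + SQRT((UL_1 - P_1)**2 + (P_2 - LL_2)**2).
FORMATS P_1 P_2 LL_1 UL_1 LL_2 UL_2 D LL UL (F8.3).
LIST VARIABLES=D LL UL.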