Multiple-baseline studies are prevalent in behavioral research, but questions remain about how to best analyze the resulting data. Monte Carlo methods were used to examine the utility of multilevel models for multiple-baseline data under conditions that varied in the number of participants, number of repeated observations per participant, variance in baseline levels, variance in treatment effects, and amount of autocorrelation in the Level 1 errors. Interval estimates of the average treatment effect were examined for two specifications of the Level 1 error structure ($\sigma^2 I$ and first-order autoregressive) and for five different methods of estimating the degrees of freedom (containment, residual, between-within, Satterthwaite, and Kenward-Roger). When the Satterthwaite or Kenward-Roger method was used and an autoregressive Level 1 error structure was specified, the interval estimates of the average treatment effect were relatively accurate. Conversely, the interval estimates of the treatment effect variance were inaccurate, and the corresponding point estimates were biased.
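
To make the simulated conditions concrete, the following is a minimal sketch of how one multiple-baseline data set with the manipulated features (number of participants, number of observations, variance in baseline levels, variance in treatment effects, and first-order autocorrelation in the Level 1 errors) might be generated. The function name `simulate_multiple_baseline`, every default parameter value, and the rule for staggering intervention start points are assumptions made for illustration; they are not the conditions or the data-generating code used in the study.

```python
import random


def simulate_multiple_baseline(n_participants=4, n_obs=20, mean_effect=2.0,
                               sd_baseline=1.0, sd_effect=0.5, rho=0.3,
                               sd_error=1.0, seed=0):
    """Generate one hypothetical multiple-baseline data set.

    Returns a list of (participant, time, phase, y) tuples, where phase is
    0 during baseline and 1 during treatment. All defaults are illustrative
    assumptions, not values taken from the study.
    """
    rng = random.Random(seed)
    rows = []
    for j in range(n_participants):
        # Level 2: participant-specific baseline level and treatment effect.
        baseline = rng.gauss(0.0, sd_baseline)
        effect = rng.gauss(mean_effect, sd_effect)
        # Staggered intervention start across participants (assumed rule).
        start = (j + 1) * n_obs // (n_participants + 1)
        # Level 1: AR(1) errors, e_t = rho * e_{t-1} + u_t.
        # The innovation SD is scaled so the marginal error SD is sd_error.
        sd_innov = sd_error * (1.0 - rho ** 2) ** 0.5
        e = rng.gauss(0.0, sd_error)
        for t in range(n_obs):
            if t > 0:
                e = rho * e + rng.gauss(0.0, sd_innov)
            phase = 1 if t >= start else 0
            rows.append((j, t, phase, baseline + effect * phase + e))
    return rows
```

Setting `rho=0.0` in this sketch yields independent Level 1 errors, which corresponds to the $\sigma^2 I$ error structure; a nonzero `rho` corresponds to the first-order autoregressive structure. Fitting the multilevel model and applying a degrees-of-freedom method such as Satterthwaite or Kenward-Roger would be done in separate mixed-model software and is not shown here.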