bin2stg               package:missreg               R Documentation

_B_i_n_a_r_y _r_e_g_r_e_s_s_i_o_n _f_o_r _t_w_o-_p_h_a_s_e _s_a_m_p_l_e_d _d_a_t_a

_D_e_s_c_r_i_p_t_i_o_n:

     Fits binary regression models to data with the two-phase
     missingness structure.  This class includes stratified
     case-control data.

_U_s_a_g_e:

     bin2stg(formula, weights = NULL, xstrata = NULL, 
             obstype.name = "obstype", data, fit = TRUE, 
             xs.includes = FALSE, linkname = "logit", 
             start = NULL, Qstart = NULL, int.rescale = TRUE, 
             off.set = NULL, control = mlefn.control(...), 
             control.inner = mlefn.control.inner(...), ...)

_A_r_g_u_m_e_n_t_s:

 formula: A symbolic description of the model to be fitted. If only one
          non-NA level of the response variable is present in the
          data, that level is treated as "failure" (control).

 weights: An optional vector of weights to be used in the fitting
          process. Should be 'NULL' or a numeric vector.

 xstrata: Specify names of the stratification variables to be used,
          e.g. '"vname"' or 'c("vname1","vname2",...)'. Strata are
          defined by cross-classification of all levels.

obstype.name: Name of the variable specifying labels for observations
          by sampling and variable type: '"uncond"', '"retro"',
          '"xonly"', '"y|x"' or '"strata"'.

    data: A data frame containing all the variables required for
          analysis, including those for 'xstrata' and 'obstype.name'.

     fit: If 'FALSE', only the stratum report is generated and no
          model is fitted.

          This is useful as a data check, or for finding the
          internal ordering of the 'xstrata' so that 'yCuts' can be
          specified consistently with this ordering.

xs.includes: 'TRUE' if 'weights' specified for observations labelled as
          '"strata"' include those observed at the second phase (i.e.
          '"retro"' or '"uncond"' observations). 

linkname: A specification for the model link function. Three choices
          are provided: '"logit"', '"probit"' or '"cloglog"'. The
          default is '"logit"'.

   start: Starting values for the regression parameters. These can be
          compulsory in situations where the program cannot produce
          valid starting values itself.

          When only some of the starting values are provided, the
          names of these parameters (if specified) are used to
          match them to the design matrix, and zeros are used as
          starting values for all other parameters. This is useful
          when updating an existing fit.

  Qstart: An optional starting matrix for Pr(Y=i|Xstratum=j). The first
          row should correspond to the successes (cases) and the
          second to the failures (controls). It can be compulsory in
          situations where the program cannot produce valid starting
          values itself.

int.rescale: If 'TRUE', all X variables are standardised before the
          model is fitted.

 off.set: Specify an 'a priori' known component to be included in the
          predictors. Should be 'NULL' or a numeric vector.

 control: Control parameters for the iterations in the 'mlefn' call.
          See 'mlefn' for details.

control.inner: Control parameters for the inner iterations nested
          within the 'mlefn' call. See 'mlefn' for details.

     ...: Further arguments passed to or from related functions.
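
     The following base R sketch illustrates two of the conventions
     described above: strata formed by cross-classifying all levels of
     the 'xstrata' variables, and partial named starting values padded
     with zeros. It does not call 'bin2stg'; the toy data frame and
     the names 'd', 'strata', 'partial' and 'start' are hypothetical,
     and the internal ordering that 'bin2stg' uses for strata may
     differ from that of 'interaction()'.

```r
# Toy data; variable names are illustrative only.
d <- data.frame(stage = factor(c(1, 1, 2, 2, 1)),
                inst  = factor(c("a", "b", "a", "b", "a")),
                x1    = c(0.2, 1.5, -0.3, 0.8, 1.1))

# Cross-classify all levels of the stratification variables.
strata <- interaction(d$stage, d$inst, drop = TRUE)
table(strata)

# Partial, named starting values matched to design-matrix columns;
# parameters without a supplied value start at zero.
X <- model.matrix(~ x1 + stage, data = d)
partial <- c(x1 = 0.5)
start <- setNames(numeric(ncol(X)), colnames(X))
start[names(partial)] <- partial
start
```

     Running 'bin2stg' with 'fit = FALSE' remains the way to check the
     ordering the package itself uses.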

_D_e_t_a_i_l_s:

     This function fits binary regression models, with a choice of
     link functions, to the various types of observations collected
     under different two-phase sampling schemes. More detailed
     descriptions of the function and its applications can be found in
     "Description of the 'missreg' Library" (Wild and Jiang).
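
     As a rough illustration of what the three link choices mean, the
     base R sketch below maps a linear predictor (plus an a priori
     known offset) to probabilities using 'make.link' from the 'stats'
     package. The values of 'eta' and 'off' are made up, and this is
     not a description of how 'bin2stg' evaluates the links internally.

```r
eta <- c(-2, 0, 1.5)   # hypothetical linear predictor values
off <- c(0, 0.5, 0)    # hypothetical a priori known component

links <- c("logit", "probit", "cloglog")

# Apply the inverse of each link to the offset-adjusted predictor.
probs <- sapply(links, function(nm) make.link(nm)$linkinv(eta + off))
round(probs, 3)
```

     Each column holds the probabilities implied by one link, so the
     columns can be compared row by row.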

_V_a_l_u_e:

missReport: Matrix containing information on deleted records with
          missing observations.

StrReport: Cross tabulation of counts for different levels of 'obstype'
          and Y-values by X-strata.

xStrReport: Cross tabulation of counts for 'obstype' by X-strata when
          'obstype="xonly"'.

     key: Detailed classification for each of the X-strata.

    yKey: The Y variable and the level of it for which the model is
          constructed.

     fit: 'TRUE' or 'FALSE', as given in the 'fit' argument.

   error: The error code returned by the 'mlefn' call. Non-zero values
          indicate an unsuccessful fit.

coefficients: The coefficients matrix with estimates, standard errors,
          z values and associated p-values.

   loglk: Log-likelihood returned from final 'mlefn' call.

   score: Score vector returned from final 'mlefn' call.

     inf: Observed information matrix returned from final 'mlefn' call.

  fitted: The fitted values of Y obtained by transforming the linear
          predictors  by the inverse of the link function.

     cov: The asymptotic covariance matrix (inverse of the information
          matrix).

     cor: The asymptotic correlation matrix.

    Qmat: The estimated Pr(Y=i|Xstratum=j) from the last iteration.
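
     The relationships among 'inf', 'cov', 'cor' and the columns of
     'coefficients' can be sketched in base R as follows. The
     estimates and information matrix are invented numbers, the object
     names are hypothetical, and the two-sided normal p-value is an
     assumption about how the reported p-values are formed.

```r
# Hypothetical estimates and observed information matrix.
est <- c("(Intercept)" = -1.2, x1 = 0.8)
inf <- matrix(c(50, 10, 10, 40), 2, 2,
              dimnames = list(names(est), names(est)))

cov.mat <- solve(inf)        # asymptotic covariance = inverse information
cor.mat <- cov2cor(cov.mat)  # asymptotic correlation

se <- sqrt(diag(cov.mat))    # standard errors
z  <- est / se               # z values
p  <- 2 * pnorm(-abs(z))     # two-sided normal p-values (assumed)

cbind(Estimate = est, "Std. Error" = se, "z value" = z, "Pr(>|z|)" = p)
```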

_N_o_t_e:

     The function 'summary.bin2stg' provides a complete summary of the
     regression results, including the Wald tests and a regression
     panel. The related output functions ('print.bin2stg',
     'summary.bin2stg' and 'print.summary.bin2stg') do not currently
     have help files.

_A_u_t_h_o_r(_s):

     Chris Wild, Yannan Jiang

_R_e_f_e_r_e_n_c_e_s:

     Description of the 'missreg' Library, Wild and Jiang, 2007.

_E_x_a_m_p_l_e_s:

     data(leprosy1)
     leprosy1$age.trans <- 100 * (leprosy1$age + 7.5)^-2
     z1 <- bin2stg(leprosy ~ age.trans + scar, data=leprosy1, weights=counts,
                   xstrata="age", xs.includes=TRUE)
     summary(z1)

     data(leprosy2)
     leprosy2$age.trans <- 100 * (leprosy2$age + 7.5)^-2
     z2 <- bin2stg(cbind(case,control) ~ age.trans + scar, data=leprosy2,
                   xstrata="age", xs.includes=TRUE) 
     summary(z2)

     data(leprosy3)
     leprosy3$age.trans <- 100 * (leprosy3$age + 7.5)^-2
     z3 <- bin2stg(leprosy ~ age.trans + scar, data=leprosy3, weights=counts,
                   xs.includes=TRUE)

     data(wilms.sub)
     z4 <- bin2stg(cbind(case,control) ~ stage*hist, xstrata=c("stage","inst"), 
                   xs.includes=TRUE, data=wilms.sub)
     summary(z4)

     data(trawl)
     attach(trawl)
     # 265 of the 787 fish in the fine net have length over 35 (caught37=NA)
     # 353 of the 738 fish in the test net have length over 35 (caught37=1)
     # So the 738 caught came from an estimated 353*787/265 fish that entered.
     # Estimated Pr(caught), assuming all fish with length over 35 are caught:
     phat <- 738 / (787*353/265)

     z5 <- bin2stg(caught37 ~ I(length-35), weights=count, data=trawl,
               start=c(log(phat/(1-phat)),0), Qstart=matrix(c(phat,1-phat)))
     summary(z5)

     data(lowbirth.bin)
     z6 <- bin2stg(sgagp~mumht+bmi+I(bmi^2) + ethnicdb + factor(occ)+ hyper + smoke,
               weights=counts, xstrata=c("ethnicdb","smokedb"),
               obstype.name=c("instudy"), data=lowbirth.bin, xs.includes=FALSE)
     summary(z6)

