# Some functions to debug or improve # John Fox 18 June 2009 lregIWLS <- function(X, y, n=rep(1,length(y)), maxIter=10, tol=1E-6){ # bugged! # X is the model matrix # y is the response vector of observed proportion # n is the vector of binomial counts # maxIter is the maximum number of iterations # tol is a convergence criterion X <- cbind(1, X) # add constant b <- bLast <- rep(0, ncol(X)) # initialize it <- 1 # iteration index while (it <= maxIter){ if (max(abs(b - bLast)/(abs(bLast) + 0.01*tol)) < tol) break eta <- X %*% b mu <- 1/(1 + exp(-eta)) nu <- as.vector(mu*(1 - mu)) w <- n*nu z <- eta + (y - mu)/nu b <- lsfit(X, z, w, intercept=FALSE)$coef bLast <- b it <- it + 1 # increment index } if (it > maxIter) warning('maximum iterations exceeded') Vb <- solve(t(X) %*% diag(w) %*% X) list(coefficients=b, var=Vb, iterations=it) } # This function fits the ordered logit (proportional-odds) and probit models pom <- function(X, y, model=c("logit", "probit"), ...) # very slow version! # X: model matrix, excluding constant # y: response vector: factor or ordered factor # model: "logit" or "probit" # ...: arguments to be passed to optim() { negLogL <- function(coef){ a <- coef[lev1] # first nlev - 1 coefficients are intercepts b <- coef[-lev1] # the rest are "slopes" B <- matrix(b, p, nlev - 1) # except for intercepts all columns of B are the same B <- rbind(a, B) # add row of intercepts P <- cbind(1, cumprob(X %*% B)) # compute fitted cumulative probabilities P <- cbind(-t(apply(P, 1, diff)), P[,nlev]) # find individual-category probabilities P[P <= 0] <- sqrt(.Machine$double.eps) # avoid warnings in log() - sum(log(P[cbind(rows, y)])) # negative of log-likelihood } model <- match.arg(model) cumprob <- if (model == "logit") plogis else pnorm levels <- levels(y) nlev <- length(levels) y <- as.numeric(y) n <- length(y) rows <- 1:n p <- ncol(X) X <- cbind(1, X) # attach constant lev1 <- 1:(nlev - 1) b <- rep(0, p) # start values for "slopes" marg.p <- rev(cumsum(rev(table(y)/n)))[-1] a <- log(marg.p/(1 - marg.p)) # start values for intercepts start <- c(a, b) result <- optim(start, negLogL, method="BFGS", hessian=TRUE, ...) result <- list(thresholds= -result$par[lev1], coefficients=result$par[-lev1], var=solve(result$hessian), deviance=2*result$value, converged= result$convergence == 0) result } # This function fits the cumulative logit and probit models cumlogit <- function(X, y, model=c("logit", "probit"), ...) # very slow version! { negLogL <- function(b){ B <- matrix(b, p + 1, nlev - 1) # arrange parameter vector into matrix of regr. coefs. P <- cbind(1, cumprob(X %*% B)) # compute fitted cumulative probabilities P <- cbind(-t(apply(P, 1, diff)), P[,nlev]) # find individual-category probabilities P[P <= 0] <- sqrt(.Machine$double.eps) # avoid warnings in log() - sum(log(P[cbind(rows, y)])) # negative of log-likelihood } model <- match.arg(model) cumprob <- if (model == "logit") plogis else pnorm levels <- levels(y) nlev <- length(levels) y <- as.numeric(y) n <- length(y) rows <- 1:length(y) p <- ncol(X) X <- cbind(1, X) # attach constant B <- matrix(0, p + 1, nlev - 1) # start values for "slopes" marg.p <- rev(cumsum(rev(table(y)/n)))[-1] a <- log(marg.p/(1 - marg.p)) B[1,] <- a # start values for intercepts start <- as.vector(B) # ravel start values to a vector result <- optim(start, negLogL, method="BFGS", hessian=TRUE, ...) result <- list(coefficients=matrix(result$par, p + 1, nlev - 1), var=solve(result$hessian), deviance=2*result$value, converged= result$convergence == 0) result result } reducedRowEchelonForm <- function(A){ # bugged! n <- nrow(A) m <- ncol(A) i <- j <- 1 while (i <= n && j <= m){ while (j <= m){ currentColumn <- A[,j] currentColumn[1:n < i] <- 0 # find maximum pivot in current column at or below current row which <- which.max(abs(currentColumn)) pivot <- currentColumn[which] if (pivot == 0) { # check for 0 pivot j <- j + 1 next } if (which > i) A[c(i, which),] <- A[c(which, i),] # exchange rows A[i,] <- A[i,]/pivot # pivot row <- A[i,] A <- A - outer(A[,j], row) # sweep A[i,] <- row # restore current row j <- j + 1 break } i <- i + 1 } # 0 rows to bottom zeros <- which(apply(A[,1:m], 1, function(x) all(x == 0))) if (length(zeros) > 0){ zeroRows <- A[zeros,] A <- A[-zeros,] A <- rbind(A, zeroRows) rownames(A) <- NULL } A } runningMedian <- function(x, length=3){ # bugged! # x: a numeric vector # length: the number of values for each running median, defaults to 3 n <- length(x) X <- matrix(x, n, length) for (i in 1:length) X[1:(n - i + 1), i] <- x[-(1:(i - 1))] apply(X, 1, median)[1:(n - length + 1)] }