Wednesday, 20 June 2007

String manipulation: inserting delimiters

From the list, as usual:

I want to be able to insert delimiters, say commas, into a string
of characters at uneven intervals such that:

foo<-c("haveaniceday")# my string of character
bar<-c(4,1,4,3) # my vector of uneven intervals
my.fun(foo,bar) # some function that places delimiters appropriately
have,a,nice,day # what the function would ideally return


1)

paste(read.fwf(textConnection(foo), bar, as.is = TRUE), collapse = ",")
[1] "have,a,nice,day"


2)

my.function <- function(foo, bar){
  # construct a matrix with start/end character positions
  start <- head(cumsum(c(1, bar)), -1) # drop the last cumulative sum
  sel <- cbind(start = start, end = start + bar - 1)
  # extract each piece, then join the pieces with commas
  strings <- apply(sel, 1, function(x) substr(foo, x[1], x[2]))
  paste(strings, collapse = ',')
}

my.function(foo, bar)
[1] "have,a,nice,day"
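
3)

A possible third variant, sketched here as an assumption rather than taken from the list: base R's substring() is vectorized over start and end positions, so the apply() step can be dropped. The names insert_delims, widths and delim are made up for illustration.

```r
# Hypothetical helper (not from the original thread): split a string into
# pieces of the given widths and join the pieces with a delimiter.
insert_delims <- function(s, widths, delim = ",") {
  ends <- cumsum(widths)        # last character position of each piece
  starts <- ends - widths + 1   # first character position of each piece
  paste(substring(s, starts, ends), collapse = delim)
}

foo <- "haveaniceday"
bar <- c(4, 1, 4, 3)
insert_delims(foo, bar)
```

Passing a different delim, for example insert_delims(foo, bar, delim = "-"), changes only the separator.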

Friday, 8 June 2007

Back to back histogram

library(Hmisc)
age <- rnorm(1000,50,10)
sex <- sample(c('female','male'),1000,TRUE)
out <- histbackback(split(age, sex), probability=TRUE, xlim=c(-.06,.06), main = 'Back to Back Histogram')
# add color by overplotting the two halves of the histogram
barplot(-out$left, col="red" , horiz=TRUE, space=0, add=TRUE, axes=FALSE)
barplot(out$right, col="blue", horiz=TRUE, space=0, add=TRUE, axes=FALSE)


Monday, 4 June 2007

How do you get the most common row from a matrix?

If I have a matrix like this:

array(1:3,dim=c(4,5))

     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    3    1    2
[2,]    2    3    1    2    3
[3,]    3    1    2    3    1
[4,]    1    2    3    1    2


in which rows 1 and 4 are identical, I want to find that repeated row, c(1,2,3,1,2).

library(cluster)
x <- array(1:3,dim=c(4,5))
dissim <- as.matrix(daisy(as.data.frame(x)))
dissim[!upper.tri(dissim)] <- NA
unique(x[which(dissim == 0, arr.ind=TRUE), ])


or

count <- table(apply(x, 1, paste, collapse=" "))
count[which.max(count)]
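
The table() approach returns the winning row as a pasted label together with its count. If the row itself is wanted as a vector, one way (a sketch only; the names keys, counts, top and most_common are made up here) is to look the label up again among the pasted rows:

```r
x <- array(1:3, dim = c(4, 5))

# One text key per row, e.g. "1 2 3 1 2"
keys <- apply(x, 1, paste, collapse = " ")

# Count how often each key occurs and take the most frequent one
counts <- table(keys)
top <- names(counts)[which.max(counts)]

# Recover the original row from the first position where that key occurs
most_common <- x[match(top, keys), ]
most_common
```

When several rows tie for the highest count, which.max() picks the first label in table()'s sorted order, so only one of them is returned.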

Friday, 1 June 2007

R number output format

I'd like to save the number 0.0000012 to a file just as it appears, rather than in scientific notation (1.2e-06):
?formatC
formatC(.000000012, format='fg')
[1] "0.000000012"
also
?sprintf
sprintf("%.10f", 0.0000000012)
[1] "0.0000000012"
or
format(.0000012, scientific=FALSE)
[1] "0.0000012"
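
None of the calls above actually writes a file. A minimal sketch of that last step, assuming a temporary file is an acceptable stand-in for the real destination (out_file is a made-up name): format the number as a string first, then write the string.

```r
x <- 0.0000012

# Temporary path used only for illustration; substitute the real file name
out_file <- tempfile(fileext = ".txt")

# format() returns a character string, so writeLines() stores it verbatim
writeLines(format(x, scientific = FALSE), out_file)

readLines(out_file)
```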