Tuesday, 31 March 2009

Multiple plots in a single image using ImageMagick

Sometimes you need to arrange several plots/images, by row or by column, on a single page/sheet.
If you generate all your plots with R base graphics, you can easily accomplish the task using the par() function, e.g., calling par(mfrow=c(2,2)) and then drawing 4 plots of your choice.
However, if you need to create a single image built up from different sources (e.g. external images, plots not compatible with R base graphics, etc.), you can create/retrieve the individual images and then merge them together using the tools from the ImageMagick suite, available on Unix-like systems (Linux, Mac OS X, etc.).
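For the base-graphics case, a minimal sketch of the par(mfrow=...) approach could look like the following. The four plots and the output file name (grid_base.png, written to a temporary directory) are arbitrary choices made for illustration only.

```r
# Assumption: the file name and the four plots are placeholders for illustration.
out <- file.path(tempdir(), "grid_base.png")

png(out, width = 800, height = 800)
op <- par(mfrow = c(2, 2))     # 2 x 2 layout, filled row by row

plot(density(rnorm(1000)), main = "Density")
hist(rnorm(1000), main = "Histogram")
plot(1:10, main = "Scatter")
boxplot(rnorm(100), main = "Boxplot")

par(op)                        # restore the previous graphical parameters
dev.off()
```

This only works when every panel is drawn with base graphics; grid-based plots ignore the mfrow setting.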

## Example
# generate a couple of example plots
library(seqLogo)
## the first plot is taken from the seqLogo help ( ?seqLogo )
## I selected this example on purpose because the seqLogo function is based on grid graphics
## and is coded in such a way that it doesn't allow the use of the par() function
mFile <- system.file("Exfiles/pwm1", package="seqLogo")
m <- read.table(mFile)
pwm <- makePWM(m)
png("seqLogo1.png", width=400, height=400)
seqLogo(pwm)
dev.off()
## a second, totally unrelated plot drawn with base graphics
png("plot1.png", width=400, height=400)
plot(density(rnorm(1000)))
dev.off()

Then you can type:
system("convert \\( seqLogo1.png plot1.png +append \\) \\( seqLogo1.png plot1.png +append \\) -background none -append final.png")

Remember that inside an R string the backslash must itself be escaped with another '\', so the shell's \( is written as \\( !

Or, alternatively, from the command line:
convert \( seqLogo1.png plot1.png +append \) \( seqLogo1.png plot1.png +append \) -background none -append final.png

See man convert and man ImageMagick for the full story.
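If you need grids of different sizes, the command string can be assembled in R. The sketch below is only an illustration: build_convert_cmd is a hypothetical helper (not part of R or ImageMagick) that reuses the same +append / -append options shown above, and it assumes the convert executable is on your PATH.

```r
# Hypothetical helper: build a 'convert' command that tiles images in a grid.
# 'files' is consumed row by row; 'ncol' is the number of images per row.
build_convert_cmd <- function(files, ncol, out = "final.png") {
  # group the files into rows of 'ncol' images each
  rows <- split(files, ceiling(seq_along(files) / ncol))
  # each row becomes: \( img1 img2 ... +append \)
  # note that "\\(" in R source is the two characters \( seen by the shell
  row_args <- vapply(rows, function(r) {
    paste("\\(", paste(shQuote(r), collapse = " "), "+append", "\\)")
  }, character(1))
  # stack the rows vertically with -append
  paste("convert", paste(row_args, collapse = " "),
        "-background none -append", shQuote(out))
}

cmd <- build_convert_cmd(
  c("seqLogo1.png", "plot1.png", "seqLogo1.png", "plot1.png"),
  ncol = 2
)
cat(cmd, "\n")   # inspect the command before running it
# system(cmd)    # uncomment to actually run ImageMagick
```

Printing the command first makes it easy to check that the backslashes reach the shell as intended.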

Wednesday, 25 March 2009

Alternative implementations using ggplot2

Here and here, you can find alternative implementations of two plots (1, 2) I created some time ago using R base graphics. The author recreates the plots taking advantage of the excellent ggplot2 package.

Thursday, 12 March 2009

no "Infinities"

Thanks to Pierre-Yves for the useful tip below!

If you have a dataset from which you want the max or min, but the result has to be a real number and not "Inf" or "-Inf", there is a way to do it:

data <- c(-Inf, 1,2,3,4,5,6,7,8,9,10, Inf)
max(data)
# returns Inf
min(data)
# returns -Inf
# To ignore the infinite values, use range() with finite=TRUE:
range(data, finite=TRUE)
# Then you can extract the two values:
myMinimum <- range(data, finite=TRUE)[1]
myMaximum <- range(data, finite=TRUE)[2]
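
As a variation on the tip above (not part of the original suggestion), the same result can be obtained by filtering with is.finite() first. This is a small sketch using the same example vector. Note that is.finite() is also FALSE for NA and NaN, so those values are dropped too.

```r
data <- c(-Inf, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, Inf)

# keep only the finite values (drops Inf, -Inf, NA and NaN)
finiteData <- data[is.finite(data)]

myMinimum <- min(finiteData)
myMaximum <- max(finiteData)

# both approaches should agree
all.equal(c(myMinimum, myMaximum), range(data, finite = TRUE))
```

Filtering first is handy when you want to reuse the cleaned vector for other summaries such as mean() or sd().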

Sunday, 8 March 2009

Dealing with missing values

Two new quick tips from 'almost regular' contributor Jason:

Handling missing values in R can be tricky. Let's say you have a table
with missing values you'd like to read from disk. Reading in the table
with,

read.table( fileName )

might fail. If your table is properly formatted, then R can determine
what's a missing value by using the "sep" option in read.table:

read.table( fileName, sep="\t" )

This tells R that all my columns will be separated by TABS regardless of
whether there's data there or not. So, make sure that your file on disk
really is fully TAB separated: if there is a missing data point you must
have a TAB to tell R that this datum is missing and to move to the next
field for processing.
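
To make this concrete, here is a small self-contained sketch. The file content and the column names (id, weight, height, age) are invented for illustration. Two numeric fields are left empty between TABs and read.table turns them into NA.

```r
# Assumption: a made-up, fully TAB-separated file with two empty fields.
tmp <- tempfile(fileext = ".txt")
writeLines(c("id\tweight\theight\tage",
             "a\t1.5\t10\t30",
             "b\t\t12\t31",      # weight is missing
             "c\t2.5\t\t32"),    # height is missing
           tmp)

# With the default whitespace separator the rows appear to have different
# numbers of fields, so this call is expected to stop with an error.
res <- try(read.table(tmp, header = TRUE), silent = TRUE)
inherits(res, "try-error")

# With sep = "\t" every TAB delimits a field, and empty numeric fields become NA.
tab <- read.table(tmp, sep = "\t", header = TRUE)
print(tab)
is.na(tab$weight)
```

Empty fields in character columns are read as empty strings rather than NA unless you also pass na.strings = "".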

Lastly, don't forget the "header=TRUE" option if you have a header line in
your file.

Here's the 2nd tip:

Some algorithms in R don't support missing (NA) values. If you have a
data.frame with missing values and quickly want the ROWS with any
missing data to be removed then try:

myData[rowSums(is.na(myData))==0, ]

To find NA values in your data you have to use the "is.na" function, because a comparison such as x == NA always returns NA.
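
A short sketch of the row-filtering tip on a toy data.frame follows. The data are invented for illustration, and complete.cases() is shown only as a base R alternative for comparison.

```r
# Assumption: a made-up data.frame with one NA in each of two rows.
myData <- data.frame(x = c(1, NA, 3, 4),
                     y = c("a", "b", NA, "d"))

NA == NA                      # NA, not TRUE: this is why is.na() is needed

is.na(myData)                 # logical matrix, TRUE where a value is missing
rowSums(is.na(myData))        # number of missing values per row

# keep only the rows with zero missing values
clean <- myData[rowSums(is.na(myData)) == 0, ]
print(clean)

# complete.cases() selects the same rows
same <- myData[complete.cases(myData), ]
```

The rowSums() form is easy to adapt, e.g. using <= 1 instead of == 0 to tolerate one missing value per row.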