snpStatsWriter is a package to allow “flexible writing of snpStats objects to flat files”. It should help write snpStats objects to disk in formats suitable for reading by snphap, phase, mach, IMPUTE, beagle, and (almost) anything else that expects a rectangular format. All the writing and conversion is done in C, so it is fast, even for large datasets.
It looks like my ggplot2 heatmap function gets most traffic on this blog. That’s a bit unfortunate, because it’s the first function I wrote in earnest using ggplot2 and ggplot2 itself has undergone some updates since then, meaning my code is clunky, outdated and, er, broken.
So, with a bit more knowledge of ggplot2 and grid gained over the last few months, I have updated it today, and it is working. I hope it’s useful!
Get the code from github, and run it with
    ## simulate data
    library(mvtnorm)
    sigma <- matrix(0, 10, 10)
    sigma[1:4, 1:4] <- 0.6
    sigma[6:10, 6:10] <- 0.8
    diag(sigma) <- 1
    X <- rmvnorm(n = 100, mean = rep(0, 10), sigma = sigma)

    ## make plot
    p <- ggheatmap(X)

    ## display plot
    ggheatmap.show(p)
The result should look something like
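As a rough numerical check on the structure the heatmap should reveal, here is a minimal base R sketch that does not use ggheatmap or mvtnorm. It rebuilds the same covariance matrix, draws the data through a Cholesky factor instead of rmvnorm(), and clusters the columns on correlation distance. The seed, the choice of hclust() with its default linkage, and the cut into three groups are my assumptions for illustration, not part of the heatmap code.

```r
## Assumed seed, only so the sketch is reproducible
set.seed(42)

## Same block covariance as in the simulation above
sigma <- matrix(0, 10, 10)
sigma[1:4, 1:4] <- 0.6
sigma[6:10, 6:10] <- 0.8
diag(sigma) <- 1

## chol() returns upper-triangular R with t(R) %*% R == sigma,
## so the rows of Z %*% R have covariance sigma
Z <- matrix(rnorm(100 * 10), nrow = 100, ncol = 10)
X <- Z %*% chol(sigma)

## Cluster columns using 1 - correlation as the distance
hc <- hclust(as.dist(1 - cor(X)))
groups <- cutree(hc, k = 3)
print(groups)
```

If the simulation behaves as intended, columns 1 to 4 and columns 6 to 10 should tend to fall into two separate groups, with column 5 on its own, which is the block pattern to look for in the heatmap and its dendrograms.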
I feel like I’ve arrived late to a party. Idly googling, I discovered roxygen2, a “Doxygen-like in-source documentation system for Rd, collation, and NAMESPACE”. With roxygen2, you can use the functions in devtools to automate lots of the painful parts of writing/updating/testing packages. To convert existing packages to roxygen format, there’s Rd2roxygen.
A common denominator to all these is Hadley Wickham, who is quickly becoming my R hero.
So, how does this work? I decided to take an existing package of mine that needed a little update, and have a go.
After backing up my package directory, I began by roxygenize-ing an existing package.
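For readers who have not seen the format, this is roughly what a roxygen2-documented function looks like. The function add() and the package path "mypkg" are hypothetical, made up purely for illustration; the commented-out calls at the end are a sketch of the conversion and regeneration steps, not a record of the exact commands used for this package.

```r
#' Add two numbers
#'
#' A hypothetical function, used only to show roxygen2 comment syntax.
#'
#' @param x A numeric vector.
#' @param y A numeric vector.
#' @return The elementwise sum of \code{x} and \code{y}.
#' @export
#' @examples
#' add(1, 2)
add <- function(x, y) {
  x + y
}

## Not run here. From the directory containing the package, the steps
## would be along these lines ("mypkg" is a hypothetical path):
## Rd2roxygen::Rd2roxygen("mypkg")  # existing Rd files -> roxygen comments
## devtools::document("mypkg")      # roxygen comments -> Rd files and NAMESPACE
```

The point of the format is that the documentation lives directly above the code it describes, so the Rd files and NAMESPACE can be regenerated rather than edited by hand.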
Coloc version 1.11 is now on CRAN. This version adds two options to deal with variable selection in performing colocalisation testing. You can now either
- summarize the genetic variation in a region using principal components, and perform a (likely high df) test over a substantial proportion of these, see functions
- use Bayesian Model Averaging to average over all possible SNP selections of a fixed size, with some options to trim the model space according to posterior probability of a SNP being involved in either trait in a univariate analysis, see function
coloc.test() has become a workhorse function, rather than something that is typically called directly, although that option remains available. The BMA approach means the BMA library is now a dependency.
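To make the first option above more concrete, here is a minimal base R sketch of summarising a region by its principal components and keeping enough of them to cover a chosen share of the variance. It is not coloc's implementation: the simulated genotype matrix, the seed, and the 80% threshold are all assumptions for illustration.

```r
## Assumed seed, only so the sketch is reproducible
set.seed(1)

## Hypothetical genotype matrix: 200 samples x 20 SNPs, coded 0/1/2
G <- matrix(rbinom(200 * 20, size = 2, prob = 0.3), nrow = 200, ncol = 20)

## Principal components of the centred genotypes
pca <- prcomp(G, center = TRUE, scale. = FALSE)

## Cumulative proportion of variance explained by the leading components
var.explained <- cumsum(pca$sdev^2) / sum(pca$sdev^2)

## Keep the smallest number of components reaching the assumed threshold
threshold <- 0.8
n.pcs <- which(var.explained >= threshold)[1]
pcs <- pca$x[, seq_len(n.pcs), drop = FALSE]

dim(pcs)
```

The retained component scores in pcs would then stand in for the individual SNPs as predictors, which is why the resulting test is likely to have many degrees of freedom.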