MavericK has been replaced by rmaverick!
As of August 2018, MavericK has been superseded by the R package rmaverick. This new version of the program has a number of advantages, including…
- Faster implementation
- More accurate evidence estimates
- Better cross-platform compatibility
- Better visualisation of results
rmaverick can be downloaded directly from here. The original MavericK program is now deprecated and support will be limited, but please feel free to read on to learn what the program does (rmaverick does essentially the same thing).
What is MavericK?
MavericK is a program for inferring population structure on the basis of genetic information. The mixture modelling framework used by MavericK is identical to that used in the program STRUCTURE by Pritchard et al. (2000), which remains one of the most powerful and widely used programs in population genetics. However, STRUCTURE struggles with the issue of estimating the number of demes (denoted K) in a reliable way. The chief strength of MavericK lies in its ability to estimate K using a technique known as thermodynamic integration (TI). Estimates of the evidence for K obtained in this way have been found to be orders of magnitude more accurate and precise than those based on traditional methods (Verity & Nichols, 2016).
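As a rough illustration of the idea behind thermodynamic integration, the sketch below assumes a ladder of MCMC chains, each targeting a "power posterior" in which the likelihood is raised to a power beta between 0 and 1. The log-evidence is then the integral over beta of the expected log-likelihood under each chain, approximated here with the trapezoidal rule. This is not MavericK's own code: the function name `ti_log_evidence`, the choice of quadrature rule, and the numbers are all hypothetical and chosen purely for illustration.

```python
def ti_log_evidence(betas, mean_loglike):
    """Trapezoidal estimate of the log-evidence from power-posterior chains.

    betas        -- strictly increasing powers, starting at 0 and ending at 1
    mean_loglike -- mean log-likelihood of the draws from the chain run at
                    each beta (one value per rung)
    """
    if len(betas) != len(mean_loglike) or len(betas) < 2:
        raise ValueError("need at least two rungs and one mean per rung")
    if betas[0] != 0.0 or betas[-1] != 1.0:
        raise ValueError("betas must run from 0 to 1")

    total = 0.0
    for i in range(1, len(betas)):
        width = betas[i] - betas[i - 1]
        if width <= 0:
            raise ValueError("betas must be strictly increasing")
        # Area of the trapezoid between neighbouring rungs
        total += 0.5 * width * (mean_loglike[i] + mean_loglike[i - 1])
    return total


# Hypothetical per-rung means, for illustration only
betas = [0.0, 0.25, 0.5, 0.75, 1.0]
mean_loglike = [-140.0, -120.0, -112.0, -108.0, -106.0]
log_evidence = ti_log_evidence(betas, mean_loglike)
```

In practice the accuracy of such an estimate depends on how many rungs are used and how they are spaced, since the expected log-likelihood typically changes fastest near beta = 0.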
Posterior allocation of individuals to demes
The main output of the STRUCTURE software is a matrix known as the ‘Q-matrix’, which gives the posterior probability of each individual belonging to each deme. These allocation probabilities can be visualised using a variety of downstream tools, such as the program distruct, the web-based service CLUMPAK, or even a simple R script.

Figure 1 – Posterior allocation plot produced directly from Q-matrix output using a simple R script.
The implementation of the TI methodology in MavericK involves generating a series of draws from the posterior allocation of individuals to demes. As a result, MavericK can also produce Q-matrix output and can be used to produce plots such as the one in Figure 1 (i.e. MavericK does not need to be run in addition to STRUCTURE). MavericK also implements some extra features that make it possible to measure confidence in our results, and to create allocation plots at the level of individual gene copies.
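To make the link between posterior draws and the Q-matrix concrete, here is a minimal sketch for the without-admixture case, in which each draw assigns every individual to a single deme. The Q-matrix entry for an individual and a deme is simply the fraction of draws placing that individual in that deme. The function name `q_matrix_from_draws` and the toy draws are hypothetical, and the sketch assumes that deme labels are already consistent across draws (i.e. label switching has been dealt with), which real output may not guarantee.

```python
def q_matrix_from_draws(draws, K):
    """Estimate the Q-matrix from posterior allocation draws.

    draws -- draws[r][i] is the deme (0 .. K-1) of individual i in draw r
    K     -- number of demes
    Returns a list of rows, one per individual, each summing to 1.
    """
    n_draws = len(draws)
    n_ind = len(draws[0])
    counts = [[0] * K for _ in range(n_ind)]
    for draw in draws:
        for i, deme in enumerate(draw):
            counts[i][deme] += 1
    # Convert counts to posterior allocation probabilities
    return [[c / n_draws for c in row] for row in counts]


# Hypothetical draws: 4 MCMC draws, 4 individuals, K = 2
draws = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 0],
]
Q = q_matrix_from_draws(draws, K=2)
```

Each row of `Q` can then be drawn as one stacked bar to give a plot of the kind shown in Figure 1.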
Estimating K
By running multiple closely related MCMC chains and combining the output, MavericK is able to estimate K in a reliable way. Outputs include estimates of the model evidence (i.e. the marginal likelihood of the data given K) for each K in log space, as well as in ordinary linear space after normalising to sum to 1. The latter is equivalent to an estimate of the posterior distribution of K. Estimates obtained in this way have been found to be far more accurate and reliable than those obtained by other methods (Figure 2).
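The step from log-evidence to a normalised distribution over K can be sketched as follows. Log-evidence values are usually large negative numbers, so exponentiating them directly would underflow; subtracting the maximum first avoids this. The sketch assumes an equal prior weight on every K, and the function name `posterior_over_K` and the log-evidence values are hypothetical.

```python
import math


def posterior_over_K(log_evidence):
    """Map {K: log-evidence} to {K: posterior probability}.

    Assumes an equal prior on every value of K. Subtracting the maximum
    before exponentiating keeps the calculation numerically stable.
    """
    m = max(log_evidence.values())
    weights = {K: math.exp(v - m) for K, v in log_evidence.items()}
    total = sum(weights.values())
    return {K: w / total for K, w in weights.items()}


# Hypothetical log-evidence estimates for K = 1..4
log_evidence = {1: -1250.0, 2: -1180.0, 3: -1178.5, 4: -1181.0}
posterior = posterior_over_K(log_evidence)
best_K = max(posterior, key=posterior.get)
```

Because only differences in log-evidence matter after normalisation, a gap of a few log units between two values of K already translates into a large difference in posterior probability.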

Figure 2 – Example of the posterior distribution of K as estimated by various methods. Grey bars give the exact evidence, calculated by brute force (only possible for small sample sizes).
In addition to the TI approach, MavericK has options for estimating K using some standard model comparison statistics, such as the AIC, BIC, and DIC.
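For reference, the textbook definitions of these statistics can be written as below. This is a generic sketch rather than a description of how MavericK computes them: in particular, how the number of free parameters is counted for a given K is left to the caller, and the DIC shown is the common form based on the deviance D = -2 × log-likelihood, with the effective number of parameters taken as the mean deviance minus the deviance at the posterior mean. All input values are hypothetical.

```python
import math


def aic(max_loglike, n_params):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * max_loglike


def bic(max_loglike, n_params, n_obs):
    """Bayesian information criterion: k log(n) - 2 log L."""
    return n_params * math.log(n_obs) - 2 * max_loglike


def dic(mean_deviance, deviance_at_mean):
    """Deviance information criterion: mean deviance + effective parameters."""
    p_d = mean_deviance - deviance_at_mean  # effective number of parameters
    return mean_deviance + p_d


# Hypothetical inputs, for illustration only
aic_value = aic(max_loglike=-1000.0, n_params=10)
bic_value = bic(max_loglike=-1000.0, n_params=10, n_obs=100)
dic_value = dic(mean_deviance=2010.0, deviance_at_mean=2000.0)
```

For all three statistics, lower values indicate a preferred model, so they would be compared across candidate values of K.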
Evolutionary models
At the moment MavericK only supports the three basic evolutionary models present in STRUCTURE: 1) the without-admixture model, 2) the admixture model with alpha (the admixture parameter) fixed, and 3) the admixture model with alpha free to vary (given a Uniform(0,10) prior). More advanced models, such as models that make use of location information or that assume correlated allele frequencies between locations, are not yet implemented. Future versions of MavericK are likely to contain a larger repertoire of models.
As mentioned above, the original MavericK program has now been deprecated, and replaced by rmaverick. The legacy version of the program and documentation can be downloaded for both Mac and PC from here. For related files and code that may be of use check out the Additional files page. All code is open source, and is available via GitHub. If you have any specific queries then please drop me an e-mail at r.verity@imperial.ac.uk.
Verity, R. & Nichols, R. A. (2016). Estimating the number of subpopulations (K) in structured populations. Genetics 203(4), 1827-1839.