R extensions and the tidyverse

The basic features of R are incredibly useful for someone coming from a non-R background, but experienced R users have sought faster and more effective data manipulation and improved plotting abilities, as well as multimedia options. Hadley Wickham’s contribution to the R universe includes his ‘tidyverse’ concept and the associated set of packages. These… Continue reading R extensions and the tidyverse

Published
Categorized as Computing

R extensions and the data.table

One alternative to R’s native data frames is the data.table package, first released to CRAN in 2006. data.table extends base R’s data.frame and was authored principally by Arun Srinivasan and Matt Dowle. One of the central features of data.table was that it replaced the focus on data.frame rownames with the concept of a… Continue reading R extensions and the data.table

Published
Categorized as Computing

Prosecutor’s Fallacies and Relative Probabilities

In the United Kingdom some years ago a mother of two deceased children was convicted, in part, because evidence was given that the probability of her two children dying through natural means or cot death (her defence) was very low – as low as 1 in 73 million. The woman was Sally Clark, a solicitor… Continue reading Prosecutor’s Fallacies and Relative Probabilities

R extensions and interactivity

Yihui Xie is a creative software engineer who works at RStudio. His R packages are innovative and useful. 1. shiny. Shiny enables you to build interactive web applications straight from R. 2. knitr and animation. Yihui Xie created a literate programming package for R, called knitr. knitr enables R code to be embedded in other… Continue reading R extensions and interactivity

Published
Categorized as Computing

R extensions: Grammar of Graphics, tidyverse and ggplot2

The Grammar of Graphics (the ‘gg’ in ‘ggplot2’). Hadley Wickham is the author of several popular R packages and an employee of RStudio. In developing his graphical plotting package for R, called ggplot2, he has applied many of the ideas in Leland Wilkinson’s influential Grammar of Graphics concept and book. This 2010 article… Continue reading R extensions: Grammar of Graphics, tidyverse and ggplot2

Published
Categorized as Computing

Data science literacy

The data life cycle and software. Working professionally with data requires not only mathematical knowledge but also familiarity with data processing hardware and software. Software packages can be useful through many stages of the life cycle, from data acquisition and cleaning through to formatting and, eventually, data visualisation. Data-focussed scientists are… Continue reading Data science literacy

Published
Categorized as Computing

Data science software

This is a working list of software that is useful for data science and visualisation projects. I’ll have a separate page for useful data libraries. R: some useful R packages from the Institute of Bioinformatics, Johannes Kepler Uni, Linz are available at http://www.bioinf.jku.at/software/. RStudio. Jupyter Notebooks (a dynamic/interactive programming environment for Python, R and other… Continue reading Data science software

Published
Categorized as Computing

R software

R is a software environment for statistical computing and graphics (https://www.r-project.org/). It is well established, and scripting languages like Python easily interface with its data structures (in fact, the Python library pandas reproduces many of them), but R has its own universe of ideas and extensions as well. One related application is RStudio (www.rstudio.com), which is… Continue reading R software

Published
Categorized as Computing