The basic features of R are incredibly useful for someone coming from a non-R background, but experienced R users have sought faster and more effective data manipulation and improved plotting abilities, as well as multimedia options. Hadley Wickham’s contribution to the R universe includes his ‘tidyverse’ concept and the associated set of packages. ...
Month: July 2018
R extensions and the data.table
One alternative to R’s native data frames is the data.table package, first released to CRAN in 2006. data.table extends base R’s data.frame. It was authored principally by Arun Srinivasan and Matt Dowle. One of the central features of data.table was that it replaced the focus on data.frame rownames with the concept of...
Prosecutor’s Fallacies and Relative Probabilities
In the United Kingdom some years ago a mother of two deceased children was convicted, in part, because evidence was given that the probability of her two children dying through natural means or cot death (her defence) was very low – as low as 1 in 73 million. The woman was Sally Clark, a...
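The idea of comparing relative probabilities can be sketched with a small calculation. The 1 in 73 million figure is the one quoted above; the probability assigned to the competing explanation is a hypothetical number chosen purely for illustration, and the sketch assumes the two explanations are the only ones under consideration and are equally consistent with the evidence.

```python
from fractions import Fraction


def probability_of_natural_cause(p_natural: Fraction, p_alternative: Fraction) -> Fraction:
    """Probability of the natural explanation, given that one of the two
    rare explanations must have occurred (assumes no other explanations)."""
    return p_natural / (p_natural + p_alternative)


# Figure given in evidence: chance of two natural deaths in one family.
p_two_natural_deaths = Fraction(1, 73_000_000)

# HYPOTHETICAL figure for the competing explanation, for illustration only.
p_two_unnatural_deaths = Fraction(1, 500_000_000)

p_natural_given_deaths = probability_of_natural_cause(
    p_two_natural_deaths, p_two_unnatural_deaths
)

# Fraction(500, 573), roughly 0.87
print(p_natural_given_deaths, float(p_natural_given_deaths))
```

Under these assumed numbers the natural explanation remains the more probable one, even though its standalone probability is tiny. The fallacy lies in reading "1 in 73 million" as the probability of innocence without weighing it against how rare the alternative is.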
Python’s adoption of R’s ‘data frame’
Due to the influence of R on Python, some of the terminology is similar and could become confusing, but this is avoided by understanding that many of the concepts originated with R and have since been incrementally picked up in the Python universe or elsewhere. Here are some examples: A ‘data frame’ was introduced...
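As a rough illustration of the shared terminology, the sketch below builds a small data frame with the pandas library and notes the approximate R equivalents in comments. The column names and values are hypothetical toy data, and pandas is assumed to be installed.

```python
import pandas as pd

# Hypothetical toy data, for illustration only.
df = pd.DataFrame(
    {
        "species": ["setosa", "versicolor", "virginica"],
        "petal_length": [1.4, 4.7, 6.0],
    }
)

# Columns can hold different types, as in an R data.frame.
n_rows, n_cols = df.shape  # roughly dim(df) in R

# Row filtering by a condition on one column,
# roughly df[df$petal_length > 4, ] in R.
long_petals = df[df["petal_length"] > 4.0]

print(long_petals)
```

The two are not identical in behaviour (indexing rules and missing-value handling differ, for example), so the R comments are only approximate equivalents.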
Live Coding Overview
Live Coding is a topic raised when software developers talk about the tools they like to use, or might like to use in the future. The main idea behind live coding is to allow a software developer to illustrate the state of variables and other data in the code as coding is carried...
R extensions and interactivity
Yihui Xie is a creative software engineer who works at RStudio. His R packages are innovative and useful. 1. shiny. Shiny enables you to build interactive web applications straight from R. 2. knitr and animation. Yihui Xie created a literate programming package for R, called knitr. knitr enables R code to be embedded in...
R extensions: Grammar of Graphics, tidyverse and ggplot2
The Grammar of Graphics (the ‘gg’ in ‘ggplot2’): Hadley Wickham is the author of several popular R packages and an employee of RStudio. In developing his graphical plotting package for R, called ggplot2, he has applied many of the ideas in Leland Wilkinson’s influential Grammar of Graphics concept and book. This 2010...
Data science literacy
The data life cycle and software: Working professionally with data requires not only mathematical knowledge but also familiarity with data processing hardware and software. Software packages can be useful through many stages of the life cycle, from data acquisition and cleaning, through to formatting and eventually, data visualisation. Data-focussed scientists...
Data science software
This is a working list of software that is useful for data science and visualisation projects. I’ll have a separate page for useful data libraries. R: Some useful R packages from the Institute of Bioinformatics, Johannes Kepler Uni, Linz are available at http://www.bioinf.jku.at/software/. RStudio. Jupyter Notebooks (a dynamic/interactive programming environment for Python, R and...
R software
R is a software environment for statistical computing and graphics (https://www.r-project.org/). It is well established, and scripting languages like Python easily interface with its data structures (in fact, the Python library pandas reproduces many of them), but R has its own universe of ideas and extensions as well. One related application is RStudio (www.rstudio.com), which...