Data science software

This is a working list of software that is useful for data science and visualisation projects.  I’ll have a separate page for useful data libraries.

R

Some useful R packages from the Institute of Bioinformatics at Johannes Kepler University, Linz, are available at http://www.bioinf.jku.at/software/.

RStudio

Jupyter Notebooks (a dynamic, interactive programming environment for Python, R and other languages). It provides that environment through notebooks: documents displayed in the browser that can run scripts and handle an input-output flow for data. It can also be used in a shared way, by setting up a JupyterHub to deliver the same notebook document to many users. There is a gallery of interesting Jupyter notebooks, including one that is an introduction to Python.
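To make the idea of a notebook a little more concrete: on disk, a notebook file (.ipynb) is stored as JSON, which the browser interface then renders. The sketch below builds a minimal one-cell notebook using only the standard library. The helper name `make_notebook` and the cell contents are my own illustration, and I have assumed format version 4.4 to keep the structure small.

```python
import json


def make_notebook(code_lines):
    """Build a minimal notebook (assumed format 4.4) with a single code cell."""
    cell = {
        "cell_type": "code",
        "execution_count": None,  # the cell has not been run yet
        "metadata": {},
        "outputs": [],            # filled in by Jupyter once the cell runs
        "source": code_lines,     # list of source lines for the cell
    }
    return {
        "cells": [cell],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


notebook = make_notebook(["print('hello from a notebook')"])

# The JSON text is what would be saved in a .ipynb file.
notebook_text = json.dumps(notebook, indent=1)
```

Saving `notebook_text` to a file with an .ipynb extension should give something Jupyter can open, although in practice you would normally create notebooks from the Jupyter interface itself.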

If you wish to work with Jupyter Notebooks and Python in an integrated environment, one option is the Anaconda data science distribution, which features a recent version of Python and most of the useful data science libraries for analytics (NumPy, SciPy and pandas), visualisation (Matplotlib) and machine learning (Theano, TensorFlow). There are some additional setup suggestions at Yale.
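A quick way to see which of these libraries a given Python environment actually has is to ask the import system. This is a minimal sketch using only the standard library. The list of import names is my assumption about how each project is imported, and the helper name `available` is purely illustrative.

```python
import importlib.util

# Import names for the libraries mentioned above (assumed; the import
# name can differ from the project name, e.g. TensorFlow -> tensorflow).
LIBRARIES = ["numpy", "scipy", "pandas", "matplotlib", "theano", "tensorflow"]


def available(names):
    """Map each import name to True if it can be found in this environment."""
    found = {}
    for name in names:
        try:
            found[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Treat anything the import system cannot resolve as missing.
            found[name] = False
    return found


status = available(LIBRARIES)
for name, ok in status.items():
    print(f"{name:12s} {'installed' if ok else 'missing'}")
```

Running this inside a notebook cell reports on the environment the notebook kernel is using, which may not be the same as the Python on your command line.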

Static web pages in the form of HTML files can be created from Python projects (including IPython notebooks) using Sphinx, a documentation generator originally written for the Python documentation.
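Sphinx is configured through a Python file called conf.py in the documentation folder. Below is a rough sketch of a minimal one. The project name and author are placeholders, and I am assuming the third-party nbsphinx extension is installed to handle notebooks, since Sphinx does not read .ipynb files on its own.

```python
# conf.py: a minimal Sphinx configuration (a sketch, not a complete setup)

project = "Data science notes"  # hypothetical project name
author = "Your Name"            # placeholder

extensions = [
    "nbsphinx",  # assumption: third-party extension that renders .ipynb files
]

# Skip build output and Jupyter's autosave folders when collecting sources.
exclude_patterns = ["_build", "**.ipynb_checkpoints"]

html_theme = "alabaster"  # the theme Sphinx uses by default
```

With this file next to an index.rst, running `sphinx-build -b html . _build/html` from that folder should write the static HTML pages into _build/html.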

BORIS (Behavioural Observation Research Interactive Software).

I discovered BORIS while talking to Erika at HackyHour in June 2018. Ethology and ethograms are very practical scientific concepts for a range of topics: something I’ve unknowingly been doing as part of my cricket scoring apps!

 
