ggplot and python with conda

Update: ggplot for Python is no longer regularly maintained, so it is best to work with the R version in RStudio.

SUMMARY – ggplot (python) within conda

This is a guide to setting up the ggplot package for use with python in a conda virtual environment.

This description uses the ‘conda’ application. Conda is not Anaconda, but usually comes with the Anaconda distribution.

Step 1: install Anaconda, or install Miniconda to get conda on its own without the full distribution.

Step 2: create a new conda environment

Step 3: install any other python packages you need

Once a conda environment has been set up (e.g. test1, an environment you created earlier), you can type this at the command prompt (on conda 4.4 and later, ‘conda activate test1’ also works):

source activate test1

and you will be running Python 3.5 with the ggplot package straight away, regardless of which Python version is installed on your system.

Conda virtual environments

Conda is a package manager AND an environment manager. It comes with the Anaconda distribution, so there may be some confusion, but conda is a program on its own, which will independently help you take care of the installs of packages yourself in each environment you create.

Conda has been around since 2012. For its history see http://technicaldiscovery.blogspot.com/2013/12/why-i-promote-conda.html and https://jakevdp.github.io/blog/2016/08/25/conda-myths-and-misconceptions/

Think of conda as something like git for programming environments: each environment is like a git branch that you can switch between, to use different package combinations and versions. As a package manager it is closer to apt or yum, and it is not Python-specific in that regard.

(nb: ‘pip’ is Python’s own package manager, but conda can install Python packages too, and does so from pre-compiled binaries rather than from source. If you want to use pip alongside conda, install pip inside whichever environment is active.)

Create conda environment

To create and activate a specific environment:
conda create -n newenv
source activate newenv

The current environment appears in parentheses before your command prompt
e.g. (newenv)

conda info --envs or conda env list

(either command lists the environments that have been set up)
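From inside Python you can also confirm which environment the interpreter belongs to. This is a minimal sketch, assuming that conda sets the CONDA_DEFAULT_ENV variable when an environment is activated; the function name describe_environment is made up for illustration.

```python
import os
import sys


def describe_environment(environ=os.environ, prefix=sys.prefix):
    """Return a one-line description of the active environment."""
    # Assumption: conda exports CONDA_DEFAULT_ENV on activation.
    env_name = environ.get("CONDA_DEFAULT_ENV", "(no conda environment active)")
    # sys.prefix is the directory the running interpreter was installed into.
    return "environment: {} | interpreter prefix: {}".format(env_name, prefix)


if __name__ == "__main__":
    print(describe_environment())
```

If the name printed here does not match the name shown in parentheses at your prompt, the python on your PATH is probably not the one from the activated environment.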

Refs:
https://conda.io/docs/user-guide/getting-started.html
https://conda.io/docs/_downloads/conda-cheatsheet.pdf

Create a python 3.5 environment for ggplot

The conda-forge version of ggplot is not compatible with Python 3.6 (see the file list at https://anaconda.org/conda-forge/ggplot/files). Be aware that running ‘conda create --name test python=3’ will probably create a Python 3.6 (or later) environment by default, so you will need to force it to use an earlier version. First create an environment and specify the Python version to use:

conda create --name test1 python=3.5
source activate test1
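Once the environment is active, a quick check from Python confirms that the interpreter is old enough for ggplot. This is only a sketch: the cut-off of "earlier than 3.6" is taken from the compatibility note above, and the function name ggplot_friendly is hypothetical.

```python
import sys


def ggplot_friendly(version_info=sys.version_info):
    """Return True if the Python version is earlier than 3.6."""
    # Assumption from the text above: the conda-forge ggplot build
    # does not work on Python 3.6, so anything below (3, 6) passes.
    return tuple(version_info[:2]) < (3, 6)


if __name__ == "__main__":
    print("Python {}.{}".format(*sys.version_info[:2]))
    print("suitable for ggplot:", ggplot_friendly())
```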

To see all the packages installed in the current conda environment:

conda list (or conda list --explicit)

You can redirect this list to a file in the current working directory with:

conda list --explicit > specs.txt
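If you want to inspect such a specs file programmatically, something like the following may help. It is a rough sketch that assumes the explicit format is a list of package URLs whose file names follow a name-version-build pattern, with comment lines starting with # and an @EXPLICIT marker line. The sample URLs and build strings are invented placeholders, not real packages.

```python
def parse_explicit_specs(text):
    """Extract (name, version, build) tuples from explicit spec text."""
    packages = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments and the @EXPLICIT marker.
        if not line or line.startswith("#") or line.startswith("@"):
            continue
        # Drop any trailing "#hash" and keep only the file name.
        filename = line.split("#", 1)[0].rsplit("/", 1)[-1]
        for suffix in (".tar.bz2", ".conda"):
            if filename.endswith(suffix):
                filename = filename[: -len(suffix)]
                break
        parts = filename.rsplit("-", 2)
        if len(parts) == 3:
            packages.append(tuple(parts))
    return packages


# Hypothetical sample; the URLs do not point at real files.
SAMPLE = """\
# platform: linux-64
@EXPLICIT
https://example.invalid/channel/linux-64/ggplot-0.11.5-py35_0.tar.bz2
https://example.invalid/channel/linux-64/python-3.5.6-0.tar.bz2
"""

if __name__ == "__main__":
    for name, version, build in parse_explicit_specs(SAMPLE):
        print(name, version, build)
```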

Refs:
https://conda.io/docs/user-guide/tasks/manage-environments.html

Now install ggplot into the conda environment

Install a specific ggplot version into that conda environment:

conda install -c conda-forge ggplot=0.11.5

Alternatively, since the environment is already pinned to Python 3.5, conda will restrict the ggplot version it installs to a compatible one, so you can just type:

conda install -c conda-forge ggplot

Conda will take care of downgrading any installed numpy and pandas to the required versions. It may be better to install ggplot in the conda environment before numpy and pandas, because the latest numpy and pandas releases are suited to Python 3.6 and would otherwise be downgraded to work with ggplot. The install also pulls in matplotlib-2.0, qt, cairo, patsy, pyparsing, icu, cycler, libxml, pyqt, scipy (18MB), statsmodels-0.9 (15.8MB), pixman, glib, jpeg, gettext and others. ggplot itself is small by comparison, at 1.2MB.
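After the install finishes, you can check from Python that the packages are visible to the interpreter in the active environment. A minimal sketch using only the standard library follows; the helper name is_installed is made up for this example.

```python
import importlib.util


def is_installed(package_name):
    """Return True if the top-level package can be found for import."""
    # find_spec returns None when a top-level module cannot be located.
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    for name in ("ggplot", "numpy", "pandas", "matplotlib"):
        print("{:<12} {}".format(name, "found" if is_installed(name) else "missing"))
```

This only confirms that the package can be located; it does not import it, so it will not catch version conflicts between ggplot and pandas.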

Using a Jupyter notebook in a conda environment

If you are using the command line and want to run a jupyter notebook in a conda environment, type this:

conda install jupyter
jupyter notebook

PS: this will install quite a few packages, and may downgrade openssl in the environment. It also installs ‘pandoc 2.2.3.2’.

If you have already set up a conda environment and run Jupyter from within it, it will print an access token in the terminal and direct your browser to:
http://localhost:8888/tree

with your current directory loaded up, ready to go.

To load one of your prepared Python script files into a notebook cell (%load), or to run it directly from the file (%run):
%load filename
%run filename

(press Shift+Enter to run each cell in the notebook)
(no need for .py at the end of the filename; Jupyter will understand)

Note that in Jupyter, if you change the script in its cell, you can run the modified script by pressing the Run (>|) button that appears to the left of the cell when you put your mouse near the In [] label. However, until you resave the file the %run command will just run the script from the file, without the changes.

The % commands are IPython ‘magic’ commands.

Ref:
https://ipython.org/ipython-doc/3/interactive/magics.html

Native Forest Logging (Victoria) and Auditable Legislation

This ABC article about allegedly illegal logging in Victorian native forests highlighted the surprising reliance on low-resolution diagrams to define and administer logging activities in State forests in Victoria.

In this case, it is a State-government controlled organisation which is largely trusted to self-regulate its own economic activities.   The issues raised in the ABC article are ultimately about the present legality of some logging activities, but there is another matter which interests me: the adequacy of the spatial information used in the Allocation Order.

Summary

The Allocation Order is intended to be the basis for allocation of total timber harvesting entitlements in State forests (Victoria), and the annual amount that can be taken.   This kind of regulation is ultimately based on setting spatial boundaries to permit or prohibit certain activities.   If there are quotas, they are with reference to particular species within those spatial boundaries.

The Allocation Order content seems to be insufficient in that it:

1. provides only a low-resolution map for the two characteristic timber groups in the State forests.

2. does not contain or refer to any auditable or supporting data or coordinate system for any relevant forest stand locations or quantities.

This article reviews the sources of, and connections between, the relevant information about timber entitlements, and suggests the Allocation Order may need redrafting to make it work.

Mapping and enforcement issues

The Map that is used for the Allocation Order appears as follows (source: S343 1 October 2013 gazette):

One of the central issues raised in the ABC news article is the extent to which maps, and poorly detailed ones, are used to describe legal status over important areas of State forests in Victoria.

The relevant Allocation Order in this case was prepared only in paper-based visual form, to depict land boundaries for the transfer of legal entitlements. The choice of such a small, low-resolution map seems inappropriate; it appears that the limitations of the Gazette as a paper medium took priority over providing an adequate set of information.

The obvious question is why the Gazette notice does not refer to a primary set of digital information that defines the spatial boundaries, held by a relevant authority and formally updated when necessary. I have yet to compile a list of other legislation that prescribes some kind of electronic data source for the contents of the law.

In my view, there is no reason why electronic information cannot be used in this context, and probably already is in an unofficial administrative sense (e.g. web sites which ultimately rely on electronic data in order to prepare data visualisations).   With the introduction of mapping services delivered by internet browsers, it is now also common to use information that is both in electronic and visual form. In practice, this is what is actually needed to administer the Sustainable Forests (Timber) Act 2004 legislation, and the terms of the Allocation Order.
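To illustrate why a digital boundary is more auditable than a printed map, here is a small sketch of how an area figure could be recomputed by anyone from published coordinates. It is not how the Allocation Order or any government dataset actually works: the coordinates are invented, and it assumes boundary vertices given in a projected coordinate system measured in metres.

```python
def polygon_area_hectares(vertices):
    """Area of a simple polygon (shoelace formula), in hectares.

    vertices: list of (x, y) pairs in metres, in boundary order.
    """
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    square_metres = abs(total) / 2.0
    return square_metres / 10000.0  # 1 hectare = 10,000 square metres


# Hypothetical boundary: a 1 km by 1 km square.
EXAMPLE_BOUNDARY = [(0, 0), (1000, 0), (1000, 1000), (0, 1000)]

if __name__ == "__main__":
    print(polygon_area_hectares(EXAMPLE_BOUNDARY), "ha")
```

With boundaries published in this form, the area stated in a table could be checked against the geometry itself, which a low-resolution printed map does not allow.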

ADDENDUM – A Walk Through the Legislation

How the Victorian native forest logging system works

The Victorian legislative regime for forests and reserves is outlined on its government website.  The legislation regarding native forest logging in Victoria is the Sustainable Forests (Timber) Act 2004.  The Victorian Government website also has details of forests in its periodic State of the Forest Reports.

There are three regulatory stages to the conversion of State-owned land into Timber able to be owned and sold by VicForests:

  1. Allocation Order
  2. Timber Release Plans
  3. Licences.

Under the Act, Allocation Orders may be made which authorise harvesting of timber and related activities by VicForests. These can extend for a number of years and must be reviewed at regular intervals, for example every five years.

In section 15 of that Act (which was introduced by the 2013 amendments to the Act), relevantly, it states that an allocation order must include “in relation to the timber resources allocated”:
1. A description of the forest stands to which VicForests has access.
2. References to, or details of, the extent and location of forest stands to which VicForests has access
3. A table which is divided into three 5-year time periods listing:
(a) the total available area for each forest stand; and
(b) the area available for timber harvesting in the period of operation of the allocation order.

In this context, the definition of what ‘each forest stand’ is, and its location, becomes important. Section 15 also requires a quantification of the total available area for each ‘forest stand’.

Another matter of significance is that the Allocation Order is supposed to define clearly the areas to which title in the timber harvest passes.   That is a general consequence, even though the Timber Release Plans specify that the timber cannot be all harvested at once, but is limited to what is specified in the release plans.

The information content and concepts used in the Allocation Order

It is not clear from the ABC article what the precise terms of the Allocation Order are, and where it (or any of its previous versions) can be found.

The history of the Allocation Order is referred to in the State of the Forests Report 2013, which says at page 94:

Following a five-year review, the Allocation Order was amended in May 2010, by
the Allocation to VicForests (Amendment) Order 2010.

The primary amendment involved allocating an equal proportion of the gross area of State forest that is defined as available for timber harvesting rather than a volume-based allocation. The Allocation Order specifies the maximum gross area of Ash and Mixed Species forest stand types that can be harvested in a specified five year period. VicForests are now responsible for calculating the volume of timber that can be harvested sustainably from the allocated area of Victorian public native forests.

Notice that it refers to ‘forest stand types’, which is not the same concept as the ‘forest stand’ to which the legislation refers. The reference to “gross area of State forest” seems to treat “State forest” as singular, and “Victorian public native forests” as a set of forests, even though “a State forest” remains defined in the Sustainable Forests (Timber) Act 2004 as an individual forest.

The 2012 Briefing Note by the Environment Defender’s Office (Vic) highlighted the lack of up to date and sufficient information about the stages of Allocation Order, Timber Release Plan and Licensing.  It also noted that VicForests would be required to determine the volumes within each forest itself.

Since then, it appears that there was another Allocation Order made, in 2013.  The State of the Forests Report 2013 simply noted this as a footnote on page 96, because it was outside its reporting period, namely:

An amendment to the Allocation Order was published in October 2013.
However this is outside of the current 2013 State of the Forests Reporting
period.  http://www.dpi.vic.gov.au/forestry/public-land-forestry/timber-
allocation-order

The link address there appears to be out of date.

The Timber Release Plan factsheets (2017) also contain a web link to the allocation order.  In that web page, it is stated that allocations are no longer dealt with in Timber Release Plans, but in the Allocation Order.  This suggests that the granting of entitlements was once a detailed process.

It seems that it is necessary to go to the gazette website of the Victorian Government and do a document search to find it. By doing an advanced search, specifying the relevant legislation as the Sustainable Forests (Timber) Act 2004 and searching for the term Allocation Order, the relevant S343 1 October 2013 gazette can be located. This contains both the relevant map that the ABC article refers to and a contemporaneous Timber Release Plan, and also identifies that the relevant total area comprises the forest stands noted in the map.

This extract from the gazette contains the text allocating the timber resources:

There is a mismatch between what the clauses seem to be describing (forest stands) and what the map actually represents (a representation of the areas of two forest stand ‘types’, namely Ash and Mixed).   The forest stand types are a completely different system of classification that ignores relevant forest boundaries.

The legislation seems to suggest that more detail would be required. Within the Sustainable Forests (Timber) Act 2004 a forest stand is defined as follows:

“forest stand means a group of trees within a State forest that share common characteristics relating to eucalypt species composition and age”

On the face of it, the concept of ‘a State forest’ means there can be a set of forests, not merely one aggregate area. It follows that there should be forest stands within each forest. In turn, a “State forest” is defined in the same Sustainable Forests (Timber) Act 2004 to have the same meaning as it has in the Forests Act 1958. This Act is linked at the Victorian Government website. The Forests Act seems to further subdivide the concept of State forest into protected forests and reserved forests.

Now with that in mind, look at Table 1 in the 2013 Allocation Order.

As Appendix 1 states, it is only trying to define the location of the “types of forest stands referred to in Table 1”. It isn’t able to describe the location and types of individual forest stands.

The result is that the Allocation Order does not achieve its purpose of specifying, with any certainty, the areas that VicForests is entitled to harvest in the future. It may be incapable of doing so.

Can other information identify the entitlements of VicForests?

The Allocation Order map does a poor job of identifying individual forest stands.   Whether there are other sources is unclear.  The ABC referred to online maps that refer to Allocation Orders, in particular the Spatial Data website.  That website contains this statement:

The Allocation Order 2013 made under section 13 of the Sustainable Forests (Timber) Act 2004 allocates and vests specified timber in State forests to VicForests for the purpose of harvesting and selling. This dataset specifies the spatial extent and location of these timber resources, by forest stand type.

Again, that document may have some of the information that is missing from the Allocation Order, but it is also focussed on identifying the forest stand types, rather than individual forest stands, and the forests in which they are located.

One possible resolution of the uncertainty is to reconcile the map with the contemporaneous Timber Release Plans that were contained in the same S343 1 October 2013 gazette.   However, those release plans may only have been for a small proportion of the total area (a 5 year harvesting plan) and not the total area for which title was transferred.

An amendment to the Allocation Order was made in October 2014, as notified in the s405 Government Gazette.  Though this simplified Table 1, and modified the terms of some of the clauses of the Allocation Order accordingly, it did not alter the original map or provide any further detail of the individual forests.  The conceptual approach to the information was the same as the original 2013 Allocation Order.

The clear need here is for the Allocation Order to be revised and redrafted so as to reconcile with the initial transfer of entitlements to VicForests within individual State forests.  This is so it is capable of being the master document against which audits of the Timber Release Plans and compliance can occur, on a forest by forest basis.  The reference to the forest stand types is a further means of testing the compliance of each individual forest Timber Release Plan against the subject of the Order.

Timber Release Plans need further information to be reconciled to the Allocation Order

The process of specifying the uptake of VicForests’ timber entitlements involves specifying harvesting areas (coupes) in  Timber Release Plans. These plans are ultimately approved by the Board of VicForests itself.  Notice of changes is published in the Victorian Government Gazette.  For example, the January 2017 Timber Release Plan notification appears in this gazette document, at page 25.  The notice looks like this:

The Timber Release Plan as at 2017 is itself a PDF paper-style document with detailed data produced in table form.

Timber Release Plans are now also published in a dynamic electronic form. They used to be produced in paper or static electronic form (PDFs), but from January 2018 it seems they have moved to a dynamic map system (an “Interactive Online Map”) delivered via a web browser.

The link to the TRP appears on the map embedded in the Approved Timber Release web page.  There is a separate link to the actual Interactive Map Page. There was also a 2016 short guide to this information, prepared before the Board decided this would be the only form of information provided.

The detail in this information also suggests that a higher degree of detail in the Allocation Order is required in order that a reconciliation between entitlements and harvesting can take place.
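The kind of reconciliation I have in mind could be sketched as follows. This is an illustration only: the forest names, stand types and areas are invented, the record layout is hypothetical, and nothing here reflects how VicForests or the Department actually stores its data. The point is that if the Allocation Order listed areas per forest and stand type, the coupes in a Timber Release Plan could be summed and checked against it.

```python
from collections import defaultdict


def reconcile(allocated_ha, coupes):
    """Compare planned coupe areas with allocated areas.

    allocated_ha: dict mapping (forest, stand_type) to allocated hectares.
    coupes: list of dicts with keys "forest", "stand_type" and "area_ha".
    """
    planned = defaultdict(float)
    for coupe in coupes:
        planned[(coupe["forest"], coupe["stand_type"])] += coupe["area_ha"]

    report = {}
    for key in sorted(set(allocated_ha) | set(planned)):
        allocation = allocated_ha.get(key, 0.0)
        total_planned = planned.get(key, 0.0)
        report[key] = {
            "allocated": allocation,
            "planned": total_planned,
            # Coupes with no matching allocation are flagged as well.
            "within_allocation": total_planned <= allocation,
        }
    return report


# Hypothetical figures for illustration only.
ALLOCATED = {("Forest A", "Ash"): 100.0, ("Forest A", "Mixed"): 250.0}
COUPES = [
    {"forest": "Forest A", "stand_type": "Ash", "area_ha": 60.0},
    {"forest": "Forest A", "stand_type": "Ash", "area_ha": 50.0},
    {"forest": "Forest B", "stand_type": "Mixed", "area_ha": 10.0},
]

if __name__ == "__main__":
    for key, row in reconcile(ALLOCATED, COUPES).items():
        print(key, row)
```

In this made-up example, the Ash coupes in Forest A exceed the allocation, and the Forest B coupe has no allocation to be reconciled against at all. Neither check is possible when the master document only shows stand types on a small map.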

The need for clear information for NSW Land Rights and Native Title Claims

A recent news report [1] highlighted the potential for conflict and administration work in NSW indigenous land areas.  The work required is to determine the extent of overlap between native title claims and historical dispossession in NSW indigenous land areas, the latter being a possible basis for compensation.

The Land Rights regime created by the NSW Wran government did not anticipate the compensation claims.   If the native title claims were able to be considered at the time of creation of broader land titles, and integrated within them, it would make the extinguishment point moot.

The Aboriginal Land Rights Act 1983 (ALRA) was introduced to compensate Aboriginal people in NSW for dispossession of their land. The systems reflect different attitudes to land too: native title is not based on European ideas of control, transfer, economics.   It is based on assumptions about continuity.

The existence of Land Rights now represents a basis for economic payouts for loss[2]. However, native title is a concept based on the assumption of continuity of relation to land, not dispossession per se.

There are tens of thousands of native title claims being processed and hundreds of Local Aboriginal Land Councils (LALCs). The reconciliation of the LALC land boundaries[3] and native title areas is a big data job, most likely requiring people with local knowledge teaming up with people skilled in data processing and land or geographic information systems.

Refs:
1. ABC 16 November 2018 URL Link
2. LALC regime.  URL Link
3. LALC boundaries URL Link.

Indigenous vocabulary projects and information systems

An exciting information project has been created to digitally encode indigenous vocabularies derived from the records of Daisy Bates (from around the turn of the twentieth century). A paper explaining how the information systems were set up has been written by Nick Thieberger (University of Melbourne) and Conal Tuoh (Queensland) (1). I was interested to read that the authors of this database had concluded that valuable context could be retained by a system that was concerned first with the specific data they had available, and not with fitting the data to a preconceived relational database system. As they said (in conclusion):

“While the more usual approach to archival lexical material has been to extract lexical items into a relational database or spreadsheet, the data could not be coerced into such a form now without a significant amount of interpretation and loss of contextual information.”

My own exploration of how best to store information has led me to consider the pros and cons of relational databases, and this project illustrates why it is necessary to closely scrutinise the specific data you are dealing with, to maximise the benefits of transforming it into forms that are useful.
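To make the contrast concrete, here is a toy sketch of the triple idea in plain Python. It is not the project's actual schema or vocabulary: the identifiers, predicates and values are placeholders I have invented. Each statement is a (subject, predicate, object) tuple, so contextual facts such as the source page can sit alongside the word itself without first designing fixed table columns.

```python
# Hypothetical placeholder data; not taken from the Bates records.
TRIPLES = [
    ("entry-001", "hasForm", "<word as written in the manuscript>"),
    ("entry-001", "hasGloss", "<English gloss>"),
    ("entry-001", "recordedOnPage", "page-12"),
    ("entry-001", "hasMarginNote", "<note written beside the entry>"),
    ("page-12", "partOfManuscript", "manuscript-A"),
]


def describe(subject, triples):
    """Collect every statement about a subject, grouped by predicate."""
    description = {}
    for s, predicate, obj in triples:
        if s == subject:
            description.setdefault(predicate, []).append(obj)
    return description


if __name__ == "__main__":
    for predicate, objects in sorted(describe("entry-001", TRIPLES).items()):
        print(predicate, objects)
```

A new kind of statement (a margin note, say) is just another tuple. In a relational table it would need a new column or a new table, which is where the interpretation and loss of context the authors mention can creep in.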

I will be very interested in using the database to improve my own understanding of indigenous languages and provide a way for engaging more fully with local culture.  However, we must reflect on the fact that this exists through historical circumstances, and is both defined by those circumstances, and limited because of them.  I will be using it as a collection of language-relevant historical artifacts of information gleaned from indigenous speakers.  I am aware that Daisy Bates’ own research into the actual situation was peculiarly limited by her own perspective and views.  In 1938, she published “The Passing of the Aborigines” which reflects her erroneous views at that time.

References:

(1) Thieberger and Tuoh (2017), “From Small to Big Data: paper manuscripts to RDF triples of Australian Indigenous Vocabularies”. URL Link.