The Project

Wiki New Zealand

What do you need to connect people with data?

Few individuals possess all of these

Wiki New Zealand

Some problems:

Some solutions:

The openapi Project

Some problems:

Some solutions:

openapi

openapi is NOT visual programming

openapi is Data and Scripts

LTD404701_20140509_101154_10.csv
"Birth rates - DFMA (Annual-Dec)",""
"","Total Population"
1855,39.25
1856,37.81
1857,39.47
1858,38.24
...

birthrate-file.R
# Read in original data source from Stats NZ ...
#   'brsrcfile' ("LTD404701_20140509_101154_10.csv")
# ... and tidy it to produce nicer CSV ...
#   "birthrate.csv"
lines <- readLines(brsrcfile)
# Drop any line that does not start with a digit
writeLines(lines[grep("^[0-9]", lines)], "birthrate.csv")
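The same tidying step can be sketched in Python. This is only a rough equivalent of the R script above, not part of openapi itself; the function name `tidy_lines` and the in-memory list `raw` are illustrative assumptions, with the sample rows copied from the Stats NZ file shown earlier.

```python
import re


def tidy_lines(lines):
    """Keep only the lines that start with a digit (the data rows)."""
    return [line for line in lines if re.match(r"[0-9]", line)]


# Hypothetical in-memory copy of the first lines of the source file
raw = [
    '"Birth rates - DFMA (Annual-Dec)",""',
    '"","Total Population"',
    "1855,39.25",
    "1856,37.81",
    "1857,39.47",
    "1858,38.24",
]

tidy = tidy_lines(raw)
print("\n".join(tidy))
```

The two title lines are dropped because they start with a quote character, which leaves a plain two-column CSV with no header row.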

openapi is Modules

<?xml version="1.0"?>
<module xmlns="http://www.openapi.org/2014/" version="0.1">
  <platform name="fileSystem"/>
  <description><![CDATA[This module provides a CSV file call ...
  <output name="brsrcfile" type="external"
          ref="data/LTD404701_20140509_101154_10.csv"/>
</module>

<?xml version="1.0"?>
<module xmlns="http://www.openapi.org/2014/" version="0.1">
  <platform name="R"/>
  <description><![CDATA[This module takes a CSV file and pro ...
  <input name="brsrcfile" type="external"/>
  <output name="brfile" type="external" ref="birthrate.csv"/>
  <source ref="src/birthrate-file.R"><![CDATA[]]></source>
</module>
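Because a module is plain XML, its inputs and outputs can be read with any XML library. The following is a minimal sketch using Python's standard library, assuming the element and attribute names shown in the modules above; `describe_module` is a hypothetical helper, not an oaglue function, and the truncated description element is left out so that the sample is well-formed.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the module listings above
NS = {"oa": "http://www.openapi.org/2014/"}

MODULE_XML = """<?xml version="1.0"?>
<module xmlns="http://www.openapi.org/2014/" version="0.1">
  <platform name="R"/>
  <input name="brsrcfile" type="external"/>
  <output name="brfile" type="external" ref="birthrate.csv"/>
  <source ref="src/birthrate-file.R"><![CDATA[]]></source>
</module>
"""


def describe_module(xml_text):
    """Summarise a module: its platform, named inputs/outputs and script."""
    root = ET.fromstring(xml_text)
    return {
        "platform": root.find("oa:platform", NS).get("name"),
        "inputs": [e.get("name") for e in root.findall("oa:input", NS)],
        "outputs": [e.get("name") for e in root.findall("oa:output", NS)],
        "source": root.find("oa:source", NS).get("ref"),
    }


info = describe_module(MODULE_XML)
print(info)
```

The input and output names are what pipes refer to, which is why the script itself never needs to know where its data came from.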

openapi is Pipelines

<pipeline xmlns="http://www.openapi.org/2014/" version="0.1">
  <component name="brsource"/>
  <component name="birthrate"/>
  <component name="brplot-R"/>
  <pipe>
    <start component="brsource" name="brsrcfile"/>
    <end component="birthrate" name="brsrcfile"/>
  </pipe>
  <pipe>
    <start component="birthrate" name="brfile"/>
    <end component="brplot-R" name="brfile"/>
  </pipe>
</pipeline>
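Each pipe says that one component's output feeds another component's input, so the pipes define an order in which the components can be run. The sketch below derives such an order with a topological sort. It illustrates the idea only and makes no claim about how oaglue schedules components; `run_order` is a hypothetical helper, and `graphlib` requires Python 3.9 or later.

```python
import xml.etree.ElementTree as ET
from graphlib import TopologicalSorter

NS = {"oa": "http://www.openapi.org/2014/"}

PIPELINE_XML = """<pipeline xmlns="http://www.openapi.org/2014/" version="0.1">
  <component name="brsource"/>
  <component name="birthrate"/>
  <component name="brplot-R"/>
  <pipe>
    <start component="brsource" name="brsrcfile"/>
    <end component="birthrate" name="brsrcfile"/>
  </pipe>
  <pipe>
    <start component="birthrate" name="brfile"/>
    <end component="brplot-R" name="brfile"/>
  </pipe>
</pipeline>
"""


def run_order(xml_text):
    """Return component names ordered so that producers precede consumers."""
    root = ET.fromstring(xml_text)
    # Map each component to the set of components it depends on
    deps = {c.get("name"): set() for c in root.findall("oa:component", NS)}
    for pipe in root.findall("oa:pipe", NS):
        start = pipe.find("oa:start", NS).get("component")
        end = pipe.find("oa:end", NS).get("component")
        deps[end].add(start)
    return list(TopologicalSorter(deps).static_order())


order = run_order(PIPELINE_XML)
print(order)
```

For this pipeline the only valid order is the data source, then the tidying module, then the plot.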

openapi is a Glue System

library(oaglue)
p <- readPipeline("birthrate-pipe")
results <- runPipeline(p)
          compname         name        type       format formatType
brsrcfile "birthrate-pipe" "brsrcfile" "external" ""     "text"    
brfile    "birthrate-pipe" "brfile"    "external" ""     "text"    
brsvg     "birthrate-pipe" "brsvg"     "external" ""     "text"    
          ref                                                 
brsrcfile "data/LTD404701_20140509_101154_10.csv"             
brfile    "birthrate-pipe/Components/birthrate/birthrate.csv" 
brsvg     "birthrate-pipe/Components/brplot-R/birthrate-R.svg"

openapi is ...

An openapi example

Andrew Balemi wanted to annotate the Wiki New Zealand plot of the NZ birth rate to mark the end of World War II (the start of the baby boom).

An openapi example

birthrate-plot-custom.R
# The tidied CSV has no header row, so do not treat its first line as one
births <- read.csv(brfile, header=FALSE, col.names=c("year", "births"))

svg("birthrate-R.svg")
plot(births, type="l")
# Mark the end of World War II (2 September 1945) as a fractional year
abline(v=1945 +
       as.numeric(as.Date("1945-09-02") - as.Date("1945-01-01"))/365)
dev.off()
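The position of the vertical line is the year plus the fraction of the year that had elapsed by 2 September 1945. The arithmetic can be checked with a few lines of Python; `fractional_year` is an illustrative helper name, and the fixed divisor of 365 simply mirrors the R code above.

```python
from datetime import date


def fractional_year(d, days_per_year=365):
    """Express a date as its year plus the elapsed fraction of that year."""
    elapsed = (d - date(d.year, 1, 1)).days
    return d.year + elapsed / days_per_year


ww2_end = fractional_year(date(1945, 9, 2))
print(round(ww2_end, 3))  # 244 days into 1945, so 1945 + 244/365
```

Because the x axis holds plain numeric years, a numeric position such as this one places the annotation about two thirds of the way between 1945 and 1946.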

An openapi example

<?xml version="1.0"?>
<module xmlns="http://www.openapi.org/2014/" version="0.1">
  <platform name="R"/>
  <description><![CDATA[This module reads a CSV file and pro ...
  <input name="brfile" type="external"/>
  <output name="brsvg" type="external" ref="birthrate-R.svg"/>
  <source ref="src/birthrate-plot-custom.R"><![CDATA[]]></source>
</module>

An openapi example

<pipeline xmlns="http://www.openapi.org/2014/" version="0.1">
  <component name="brsource"/>
  <component name="birthrate"/>
  <component name="brplot-R"/>
  <component name="brplot-R-custom"/>
  <pipe>
    <start component="brsource" name="brsrcfile"/>
    <end component="birthrate" name="brsrcfile"/>
  </pipe>
  <pipe>
    <start component="birthrate" name="brfile"/>
    <end component="brplot-R" name="brfile"/>
  </pipe>
  <pipe>
    <start component="birthrate" name="brfile"/>
    <end component="brplot-R-custom" name="brfile"/>
  </pipe>
</pipeline>

Another openapi example

Scripts can be in any programming language

Another openapi example

birthrate-plot.py
import numpy as np
import matplotlib.pyplot as plt

# 'brfile' is the module input: a CSV with one "year,rate" row per line
year, births = np.loadtxt(brfile, unpack=True, delimiter=",")

# The years are plain numbers, not dates, so use plot() rather than plot_date()
plt.plot(year, births, "r-")
plt.grid(True)
plt.savefig("birthrate-py.svg")

Another openapi example

<?xml version="1.0"?>
<module xmlns="http://www.openapi.org/2014/" version="0.1">
  <platform name="python"/>
  <description><![CDATA[This module reads a CSV file and pro ...
  <input name="brfile" type="external"/>
  <output name="brsvg" type="external" ref="birthrate-py.svg"/>
  <source ref="src/birthrate-plot.py"><![CDATA[]]></source>
</module>

Another openapi example

<pipeline xmlns="http://www.openapi.org/2014/" version="0.1">
  <component name="brsource"/>
  <component name="birthrate"/>
  <component name="brplot-R"/>
  <component name="brplot-py"/>
  <pipe>
    <start component="brsource" name="brsrcfile"/>
    <end component="birthrate" name="brsrcfile"/>
  </pipe>
  <pipe>
    <start component="birthrate" name="brfile"/>
    <end component="brplot-R" name="brfile"/>
  </pipe>
  <pipe>
    <start component="birthrate" name="brfile"/>
    <end component="brplot-py" name="brfile"/>
  </pipe>
</pipeline>

Yet another openapi example

What if every young New Zealander (aged 18-24) who did NOT vote in 2011 voted for the Internet Party in 2014?

Yet another openapi example

What do I need?

Yet another openapi example

The accessibility of the data is problematic

Yet another openapi example

What if there was a nice web site that already made pictures of these data?

Yet another openapi example

What if those pictures were part of a modular and reusable framework?

Yet another openapi example

<?xml version="1.0"?>
<module xmlns="http://www.openapi.org/2014/" version="0.1">
  <platform name="fileSystem"/>
  <output name="nvfile" type="external" ref="data/non-voters.csv"/>
</module>

<?xml version="1.0"?>
<module xmlns="http://www.openapi.org/2014/" version="0.1">
  <platform name="fileSystem"/>
  <output name="popfile" type="external" 
  ref="data/TABLECODE7511_Data_821b2c90-79e3-4462-9994-4ae796f6e654.csv"/>
</module>

Yet another openapi example

<?xml version="1.0"?>
<module xmlns="http://www.openapi.org/2014/" version="0.1">
  <platform name="R"/>
  <input name="nvfile" type="external"/>
  <input name="popfile" type="external"/>
  <output name="nonvoters" type="internal"/>
  <output name="pop2013" type="internal"/>
  <output name="pop2013grouped" type="internal"/>
  <source ref="src/tidy.R"><![CDATA[]]></source>
</module>

Yet another openapi example

What if you could search for an existing script?
Or request a script?
Or request a module wrapper for an existing script?
Or write your own wrapper for an existing script?

Yet another openapi example

The purple wedge shows what proportion of the overall vote the Internet Party would get.

Yet another openapi example

Where did I get 5.51 from?
My statement is informed
We can share and remix the data and the code
Our discussion can be informed

Summary

References and links

Acknowledgements

openapi wants to be ...

openapi could be (with funding) ...

openapi challenges