Basic operation

Launching R AnalyticFlow

Install R AnalyticFlow as described in Chapter 2, Installation of preview edition. Complete the initial setup described in the section called “Setup”, then launch the program.

Create a node

First, read some sample data for analysis. This tutorial uses the "iris" data set, one of the sample data sets that ship with R. It is a famous data set of measurements of iris flowers. For details, see Fisher (1936) or type help(iris) in the R console.

In R AnalyticFlow, essentially every process is described by creating a node. Here, create a node that reads the data set. Click "Node" > "Create Simple Node" in the menu.

A Simple node represents a process that can be expressed by a single R expression. In "Code", enter

data(iris)

as follows:

Click "OK", and a node is created in the flow area.

Execute a node

Creating a node does not execute its process. To execute the process described by a node, right-click the node and click "Run". Now run the data node we created. The following output appears in the console window:

> data(iris)
> 

Running a single node sends the R code described in the node to the console, where it is executed. Here you can see that data(iris) has been executed and R is waiting for the next input.
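Because data(iris) prints nothing, it can be reassuring to confirm that the data set is really available. The following is a minimal sketch in plain R that could be typed in the console. The checks and the names loaded and size are illustrative only and are not part of the tutorial flow:

```r
# Load the sample data set into the workspace (same code as the node).
data(iris)

# Confirm that an object named "iris" exists and inspect its size.
loaded <- exists("iris")
size <- dim(iris)   # number of rows, number of columns

print(loaded)
print(size)
```

The iris data set has 150 rows and 5 columns, so a different size would indicate that another object named iris is masking the sample data.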

Execute from console

R code can be executed directly from the console window. To look into the data with the head function, enter head(iris) in the console.

Press the Enter key to execute the code. The first six rows are displayed, showing that the data set has four quantitative variables (length and width of the petal and sepal) and one qualitative variable (the iris species). For a quick check of data like this, direct execution from the console is useful.
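The variable types can also be checked directly rather than read off the printed rows. The sketch below uses only base R functions. The names col_classes, n_numeric, species_levels and first_three are introduced here for illustration:

```r
data(iris)

# Class of each column: four numeric measurements and one factor.
col_classes <- sapply(iris, class)
print(col_classes)

# Count the quantitative (numeric) columns and list the species levels.
n_numeric <- sum(col_classes == "numeric")
species_levels <- levels(iris$Species)
print(n_numeric)
print(species_levels)

# head() shows six rows by default; the n argument changes that.
first_three <- head(iris, n = 3)
print(first_three)
```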

Tip

You can also create a simple node from the console by pressing the Control key (Command key on Mac) and the Enter key together after entering code. This lets you experiment by trial and error in the console and keep only the necessary steps in a flow.

Draw an analysis flow

Next, return to the main window to add another node. Click a blank area of the flow area, and you can see the node created earlier as follows:

The brackets indicate that this node is the last one that has been executed. Click this node to select it.

If a new node is created while another node is selected, an edge (arrow) is automatically drawn from the selected node to the new node. Click "Node" > "Create Simple Node" in the menu, and enter the following code:

plot(iris[, 1:4], col = as.integer(iris$Species) + 1)

An edge is drawn automatically, resulting in the flow as follows:

Run a flow

When you run a node in a flow, all nodes on the path are executed in order, starting from the root node (the first node on the path). If the execution path contains a bracketed node, execution starts from the node after the bracketed one. Right-click the plot node we created and run it. The graphics window displays the following figure:

A scatterplot matrix of the four quantitative variables is drawn. The point colors indicate Species, which suggests that iris species may be well discriminated if these quantitative variables are used in an efficient way.
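The col argument in the plot node works because Species is a factor: as.integer() returns its internal level codes, and adding 1 shifts them to palette entries 2 to 4. A short sketch of this mapping follows. The names color_index and mapping are illustrative only, and the actual colors depend on the active palette:

```r
data(iris)

# Integer codes of the factor levels, shifted by 1 (as in the plot node).
color_index <- as.integer(iris$Species) + 1

# One palette index per species: setosa 2, versicolor 3, virginica 4.
mapping <- tapply(color_index, iris$Species, unique)
print(mapping)

# Colors these indices refer to in the current palette.
print(palette()[mapping])
```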

Tip

Now look at the console window. You can see that only the plot function was executed after the head function we executed earlier; data(iris) was not run again.

> data(iris)
> head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
> plot(iris[, 1:4], col = as.integer(iris$Species) + 1)
> 

This is because the data node is bracketed (already executed), so there is no need to execute the earlier part of the path again. If you want to execute all the nodes on the path, use "Clear and Run" instead of "Run". It clears all the objects in the workspace and then executes every node on the path, from the root node to the node that is run.
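The difference between "Run" and "Clear and Run" can be sketched as a small model in plain R. This is a hypothetical illustration of the rule described above, not R AnalyticFlow's actual implementation: the function nodes_to_run, the path vector and the assumption that "Clear and Run" behaves as if no node were bracketed are all introduced here.

```r
# Hypothetical model of a linear execution path (node name -> R code).
path <- c(
  data = "data(iris)",
  plot = "plot(iris[, 1:4], col = as.integer(iris$Species) + 1)"
)

# Return the nodes whose code would be sent to the console when `target`
# is run. `bracketed` names the last executed node (NULL if there is none,
# as assumed here for "Clear and Run").
nodes_to_run <- function(path, target, bracketed = NULL) {
  end <- match(target, names(path))
  stopifnot(!is.na(end))
  start <- 1L
  if (!is.null(bracketed) && bracketed %in% names(path)) {
    # Skip everything up to and including the bracketed node.
    start <- match(bracketed, names(path)) + 1L
  }
  if (start > end) return(character(0))
  path[start:end]
}

run_only <- nodes_to_run(path, "plot", bracketed = "data")
clear_and_run <- nodes_to_run(path, "plot")

print(names(run_only))
print(names(clear_and_run))
```

In this model, run_only contains only the plot node, while clear_and_run contains both the data node and the plot node.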

Edge operation

The scatterplot suggests that Petal.Width varies widely according to Species.
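This impression can be checked numerically before drawing anything. The following is a minimal sketch using base R's aggregate() function. The name petal_medians is introduced here for illustration:

```r
data(iris)

# Median petal measurements for each species.
petal_medians <- aggregate(
  cbind(Petal.Length, Petal.Width) ~ Species,
  data = iris,
  FUN = median
)
print(petal_medians)
```

The result has one row per species, so the medians of the petal measurements can be compared across the three groups at a glance.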

Let us draw a boxplot to examine this relation more closely. Select the plot node and create a simple node with the following code:

boxplot(Petal.Width ~ Species, data = iris, col = 3, main = "Petal.Width")

Then the flow becomes as follows:

Since the boxplot node comes after the plot node in this flow, they are executed in this order. This is natural as a process of exploratory analysis. However, it is unnecessary when you want to see each result separately.

Therefore, we rearrange this flow so that the boxplot node comes directly after the data node, just as the plot node does. There are several ways to do this; the easiest is simply to draw a new edge from the data node to the boxplot node.

First, click the data node (the source of the new edge) to select it:

Next, middle-click (or Alt + click) the boxplot node (the destination of the new edge).

A new edge is then drawn, replacing the existing edge. To make the flow easier to read, drag the boxplot node to reposition it:

The edge replacement is now complete. The plot node (the last node executed) no longer comes before the boxplot node, so when the flow is run on the boxplot node, execution starts from the data node.

Save a flow

Finally, save the flow we have drawn. Click "File" > "Save As" in the menu to save the current flow. A saved flow can be loaded later by clicking "File" > "Open".