In this chapter, you will learn to define a **single layer neural network (NN)** in Caffe2 and run it on a randomly generated dataset. We will write code to graphically depict the network architecture and to print the input, output, weight, and bias values. To follow this chapter, you should be familiar with **neural network architectures**, their **terminology**, and the **mathematics** behind them.

Let us consider that we want to build a single layer NN as shown in the image below −

Mathematically, this network is represented by the following equation −

Y = X * W^T + b

Where **X, W, b** are tensors and **Y** is the output. We will fill all three tensors with some random data, run the network and examine the **Y** output. To define the network and tensors, Caffe2 provides several **Operator** functions.

In Caffe2, **Operator** is the basic unit of computation. The Caffe2 **Operator** is presented as follows.

Caffe2 provides an exhaustive list of operators. For the network that we are designing currently, we will use the operator called **FC**, which computes the result of passing an input **X** through a fully connected layer with a two-dimensional weight matrix **W** and a one-dimensional bias vector **b**. In other words, it computes the equation given below −

Y = X * W^T + b

Where **X** has dimensions **(M x k)**, **W** has dimensions **(n x k)** and **b** is **(1 x n)**. The output **Y** will be of dimension **(M x n)**, where **M** is the batch size.
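
To make the shapes concrete, here is a minimal pure-Python sketch of the same equation. It does not use Caffe2; the helper name `fc` and the small sample values are assumptions chosen only so that the result can be checked by hand.

```python
def fc(X, W, b):
    """Compute Y = X * W^T + b for tensors stored as nested lists.

    X has M rows of k values, W has n rows of k values, b has n values.
    The result Y has M rows of n values.
    """
    k = len(X[0])
    if any(len(row) != k for row in X) or any(len(row) != k for row in W):
        raise ValueError("X and W must share the inner dimension k")
    if len(b) != len(W):
        raise ValueError("b must have one entry per row of W")
    return [
        [sum(x_i * w_i for x_i, w_i in zip(x_row, w_row)) + b_j
         for w_row, b_j in zip(W, b)]
        for x_row in X
    ]


# Illustrative values: M = 2, k = 3, n = 2
X = [[1.0, 2.0, 3.0],
     [0.0, 1.0, 0.0]]
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 1.0]]
b = [0.5, -1.0]

Y = fc(X, W, b)
print(Y)  # [[1.5, 4.0], [0.5, 0.0]]
```

Each row of **Y** corresponds to one sample in the batch, and each column to one row of **W**, which is why the output is **(M x n)**.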

For the tensors **X** and **W**, we use the **GaussianFill** operator to create some random data. For generating the bias values **b**, we use the **ConstantFill** operator.
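
As a rough illustration of what these two fill operators produce, the following sketch emulates them with Python's standard `random` module. The helpers `gaussian_fill` and `constant_fill` are hypothetical stand-ins written for this example; they are not Caffe2 functions and only mimic the idea for two-dimensional and one-dimensional shapes.

```python
import random


def gaussian_fill(shape, mean=0.0, std=1.0, rng=random):
    """Hypothetical stand-in: a rows x cols nested list of normal samples."""
    rows, cols = shape
    return [[rng.gauss(mean, std) for _ in range(cols)] for _ in range(rows)]


def constant_fill(shape, value=0.0):
    """Hypothetical stand-in: a flat list holding the same value n times."""
    (n,) = shape
    return [value] * n


rng = random.Random(0)               # fixed seed so the sketch is repeatable
X = gaussian_fill([2, 3], rng=rng)   # 2 x 3 input
W = gaussian_fill([5, 3], rng=rng)   # 5 x 3 weights
b = constant_fill([5], value=1.0)    # bias of length 5

print(len(X), len(X[0]))  # 2 3
print(len(W), len(W[0]))  # 5 3
print(b)                  # [1.0, 1.0, 1.0, 1.0, 1.0]
```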

We will now proceed to define our network.

First of all, import the required packages −

from caffe2.python import core, workspace

Next, define the network by calling **core.Net** as follows −

net = core.Net("SingleLayerFC")

The name of the network is specified as **SingleLayerFC**. At this point, the network object called net is created. It does not contain any layers so far.

We will now create the three tensors required by our network. First, we will create the **X** tensor by calling the **GaussianFill** operator as follows −

X = net.GaussianFill([], ["X"], mean=0.0, std=1.0, shape=[2, 3], run_once=0)

The **X** tensor has dimensions **2 x 3**, with a mean of **0.0** and a standard deviation of **1.0**.

Likewise, we create the **W** tensor as follows −

W = net.GaussianFill([], ["W"], mean=0.0, std=1.0, shape=[5, 3], run_once=0)

The **W** tensor is of size **5 x 3**.

Finally, we create the bias vector **b** of size 5 −

b = net.ConstantFill([], ["b"], shape=[5,], value=1.0, run_once=0)

Now comes the most significant part of the code: defining the network itself.

We define the network in the following Python statement −

Y = X.FC([W, b], ["Y"])

We call the **FC** operator on the input data **X**. The weights are specified in **W** and the bias in **b**. The output is **Y**. Alternatively, you may create the network using the following Python statement, which is more verbose.

Y = net.FC([X, W, b], ["Y"])

At this point, the network is simply created. Until we run the network at least once, it will not contain any data. Before running the network, we will examine its architecture.

Caffe2 stores the network architecture as a protocol buffer (protobuf) message, which can be examined by calling the **Proto** method on the created **net** object.

print(net.Proto())

This produces the following output −

name: "SingleLayerFC"
op {
  output: "X"
  name: ""
  type: "GaussianFill"
  arg { name: "mean" f: 0.0 }
  arg { name: "std" f: 1.0 }
  arg { name: "shape" ints: 2 ints: 3 }
  arg { name: "run_once" i: 0 }
}
op {
  output: "W"
  name: ""
  type: "GaussianFill"
  arg { name: "mean" f: 0.0 }
  arg { name: "std" f: 1.0 }
  arg { name: "shape" ints: 5 ints: 3 }
  arg { name: "run_once" i: 0 }
}
op {
  output: "b"
  name: ""
  type: "ConstantFill"
  arg { name: "shape" ints: 5 }
  arg { name: "value" f: 1.0 }
  arg { name: "run_once" i: 0 }
}
op {
  input: "X"
  input: "W"
  input: "b"
  output: "Y"
  name: ""
  type: "FC"
}

As you can see in the above listing, it first defines the operators that produce **X, W** and **b**. Let us examine the definition of **W** as an example. The type of the operator is specified as **GaussianFill**. The **mean** is defined as float **0.0**, the standard deviation is defined as float **1.0**, and the **shape** is **5 x 3**.

op {
  output: "W"
  name: ""
  type: "GaussianFill"
  arg { name: "mean" f: 0.0 }
  arg { name: "std" f: 1.0 }
  arg { name: "shape" ints: 5 ints: 3 }
  ...
}

Examine the definitions of **X** and **b** for your own understanding. Finally, let us look at the definition of our single layer network, which is reproduced here −

op {
  input: "X"
  input: "W"
  input: "b"
  output: "Y"
  name: ""
  type: "FC"
}

Here, the operator type is **FC** (Fully Connected), with **X, W, b** as inputs and **Y** as the output. This textual network definition is verbose, and for large networks it becomes tough to examine. Fortunately, Caffe2 can also produce a graphical representation of the created networks.

To get the graphical representation of the network, run the following code snippet, which apart from the imports is only two lines of Python code.

from caffe2.python import net_drawer
from IPython import display
graph = net_drawer.GetPydotGraph(net, rankdir="LR")
display.Image(graph.create_png(), width=800)

When you run the code, you will see the following output −

For large networks, the graphical representation becomes extremely helpful in visualizing and debugging network definition errors.
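
To give a feel for what such a drawing encodes, the sketch below builds a Graphviz DOT description by hand from a plain list of operators. It is only a conceptual illustration: the `ops` list and the `to_dot` helper are assumptions made for this example, and the text it prints is not what **net_drawer** itself generates.

```python
# Plain-Python description of the four operators in our network
# (a hand-written stand-in, not an actual Caffe2 object).
ops = [
    {"type": "GaussianFill", "inputs": [], "outputs": ["X"]},
    {"type": "GaussianFill", "inputs": [], "outputs": ["W"]},
    {"type": "ConstantFill", "inputs": [], "outputs": ["b"]},
    {"type": "FC", "inputs": ["X", "W", "b"], "outputs": ["Y"]},
]


def to_dot(name, ops, rankdir="LR"):
    """Return DOT text with one box per operator and one edge per blob use."""
    lines = ['digraph "{}" {{'.format(name), "  rankdir={};".format(rankdir)]
    for i, op in enumerate(ops):
        op_node = "op_{}".format(i)
        lines.append('  {} [label="{}", shape=box];'.format(op_node, op["type"]))
        for blob in op["inputs"]:
            lines.append('  "{}" -> {};'.format(blob, op_node))
        for blob in op["outputs"]:
            lines.append('  {} -> "{}";'.format(op_node, blob))
    lines.append("}")
    return "\n".join(lines)


dot_text = to_dot("SingleLayerFC", ops)
print(dot_text)
```

The printed text can be pasted into any Graphviz viewer. Operators appear as boxes and blobs as plain nodes, with `rankdir="LR"` laying the graph out from left to right.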

Finally, it is now time to run the network.

You can run the network by calling the **RunNetOnce** function of the **workspace** module −

workspace.RunNetOnce(net)

Running the network once executes every operator: the random input data is generated, fed through the **FC** operator, and the output is produced. The tensors that exist after the run are called **blobs** in Caffe2. The workspace holds all the **blobs** you create and keeps them in memory, much like the workspace in Matlab.

After running the network, you can examine the **blobs** that the workspace contains using the following **print** command −
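
The idea that blobs only come into existence when the network runs can be sketched with a toy model in plain Python. Everything here is an assumption made for illustration: `toy_workspace` is an ordinary dictionary, `run_net_once` is a hypothetical helper, and neither reflects how Caffe2 implements its workspace internally.

```python
import random


def run_net_once(ops, blobs, seed=None):
    """Toy executor: run each operator in order and store its output blob."""
    rng = random.Random(seed)
    for op in ops:
        if op["type"] == "GaussianFill":
            rows, cols = op["shape"]
            out = [[rng.gauss(op["mean"], op["std"]) for _ in range(cols)]
                   for _ in range(rows)]
        elif op["type"] == "ConstantFill":
            out = [op["value"]] * op["shape"][0]
        elif op["type"] == "FC":
            X, W, b = (blobs[name] for name in op["inputs"])
            # Y = X * W^T + b
            out = [[sum(x * w for x, w in zip(x_row, w_row)) + b_j
                    for w_row, b_j in zip(W, b)]
                   for x_row in X]
        else:
            raise ValueError("unsupported operator: " + op["type"])
        blobs[op["output"]] = out


# Operator list mirroring the network defined in this chapter
ops = [
    {"type": "GaussianFill", "output": "X", "mean": 0.0, "std": 1.0, "shape": [2, 3]},
    {"type": "GaussianFill", "output": "W", "mean": 0.0, "std": 1.0, "shape": [5, 3]},
    {"type": "ConstantFill", "output": "b", "value": 1.0, "shape": [5]},
    {"type": "FC", "inputs": ["X", "W", "b"], "output": "Y"},
]

toy_workspace = {}
print(sorted(toy_workspace))   # [] (defining the operators creates no data)

run_net_once(ops, toy_workspace, seed=0)
print(sorted(toy_workspace))   # ['W', 'X', 'Y', 'b']
```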

print("Blobs in the workspace: {}".format(workspace.Blobs()))

You will see the following output −

Blobs in the workspace: ['W', 'X', 'Y', 'b']

Note that the workspace consists of three input blobs − **X, W** and **b**. It also contains the output blob called **Y**. Let us now examine the contents of these blobs.

for name in workspace.Blobs():
    print("{}:\n{}".format(name, workspace.FetchBlob(name)))

You will see the following output −

W:
[[ 1.0426593   0.15479846  0.25635982]
 [-2.2461145   1.4581774   0.16827184]
 [-0.12009818  0.30771437  0.00791338]
 [ 1.2274994  -0.903331   -0.68799865]
 [ 0.30834186 -0.53060573  0.88776857]]
X:
[[ 1.6588869e+00  1.5279824e+00  1.1889904e+00]
 [ 6.7048723e-01 -9.7490678e-04  2.5114202e-01]]
Y:
[[ 3.2709925 -0.297907   1.2803618  0.837985   1.7562964]
 [ 1.7633215 -0.4651525  0.9211631  1.6511179  1.4302125]]
b:
[1. 1. 1. 1. 1.]
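
As a sanity check, the printed **Y** can be recomputed by hand from the printed **X**, **W** and **b** using Y = X * W^T + b. The sketch below does this in plain Python with the values copied from the output above. The tolerance of `1e-4` is an assumption that allows for the rounding of the printed single-precision values.

```python
# Values copied from the output shown above
W = [[1.0426593, 0.15479846, 0.25635982],
     [-2.2461145, 1.4581774, 0.16827184],
     [-0.12009818, 0.30771437, 0.00791338],
     [1.2274994, -0.903331, -0.68799865],
     [0.30834186, -0.53060573, 0.88776857]]
X = [[1.6588869, 1.5279824, 1.1889904],
     [6.7048723e-01, -9.7490678e-04, 2.5114202e-01]]
b = [1.0, 1.0, 1.0, 1.0, 1.0]
Y_printed = [[3.2709925, -0.297907, 1.2803618, 0.837985, 1.7562964],
             [1.7633215, -0.4651525, 0.9211631, 1.6511179, 1.4302125]]

# Recompute Y = X * W^T + b row by row
Y_check = [[sum(x * w for x, w in zip(x_row, w_row)) + b_j
            for w_row, b_j in zip(W, b)]
           for x_row in X]

# Largest absolute difference between recomputed and printed values
max_diff = max(abs(c - p)
               for c_row, p_row in zip(Y_check, Y_printed)
               for c, p in zip(c_row, p_row))
print(max_diff < 1e-4)  # True
```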

Note that the data on your machine, and in fact on every run of the network, will be different, as **X** and **W** are generated at random. You have now successfully defined a network and run it on your PC.