Manipulating the Network Packets Data Set with R


Table of Contents
Introduction
Variables
Data format
Exercises

Introduction

The data are measurements made on each packet of information that passes through the University of Auckland Internet Gateway. These measurements include a timestamp, which records the time at which the packet reached the network location, and the size of the packet in bytes. The data set is provided as a space-delimited ASCII text file. It is a very small sample of just 46 packets.


Variables

The data set contains the following variables:

timestamp - The time at which the packet arrived at the Internet Gateway, as a number of seconds since January 1st 1970.
size - The size of the packet, in bytes.
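
To make the timestamp definition concrete, here is a minimal sketch that converts one timestamp into a calendar date and time. It assumes the epoch is midnight UTC on January 1st 1970 (the usual Unix convention, which the description above does not state explicitly), and the variable names are illustrative only.

```r
# First timestamp value quoted in this document (seconds since the epoch)
firstStamp <- 1156748010.47817

# Assumption: the epoch is 1970-01-01 00:00:00 UTC
arrival <- as.POSIXct(firstStamp, origin = "1970-01-01", tz = "UTC")

# Format as a readable date-time; %S drops the fractional seconds
arrivalText <- format(arrival, "%Y-%m-%d %H:%M:%S", tz = "UTC")
print(arrivalText)
```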


Data format

The data set is provided as a space-delimited ASCII text file called packetSample.txt.

The file NetPacketsMeta.xml provides a StatDataML description of the data set.


Exercises

Q: Reading the data into R
Q: Displaying significant digits

Q: Reading the data into R

The first task is to read the data set into R. The result should look like this:

          V1   V2
1 1156748010   60
2 1156748010 1254
3 1156748010 1514
4 1156748010 1494
5 1156748010  114
6 1156748010 1514
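
One possible approach, offered as a sketch rather than the definitive answer, is to use read.table(), which splits on whitespace by default and names unnamed columns V1, V2, and so on. So that the example runs on its own, it writes the six rows shown in this document to a temporary file as a stand-in for packetSample.txt; with the real file, you would pass its path instead.

```r
# Stand-in for packetSample.txt: the six rows shown in this document,
# written to a temporary file (an assumption made so the sketch is runnable)
sampleLines <- c("1156748010.47817 60",
                 "1156748010.47865 1254",
                 "1156748010.47878 1514",
                 "1156748010.47890 1494",
                 "1156748010.47892 114",
                 "1156748010.47891 1514")
packetFile <- tempfile(fileext = ".txt")
writeLines(sampleLines, packetFile)

# The file has no header row, so R assigns the default names V1 and V2
packets <- read.table(packetFile, header = FALSE)

# Show the first few rows
print(head(packets))
```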

Q: Displaying significant digits

Take a look at the original text file. You will see that the timestamp values appear to be much more precise than the values that are shown above when we read the file into R. For example, the first timestamp value is 1156748010.47817.

The problem is not that R cannot represent these numbers precisely; the problem is that R displays numbers to only 7 significant digits by default.

Change the number of significant digits that R displays so that we can see these values with full precision. The result should look like this:

                V1   V2
1 1156748010.47817   60
2 1156748010.47865 1254
3 1156748010.47878 1514
4 1156748010.47890 1494
5 1156748010.47892  114
6 1156748010.47891 1514
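
A minimal sketch of one way to do this uses the digits setting of options(). The value 15 is an assumption chosen because these timestamps have 10 digits before the decimal point and 5 after it. The data frame is rebuilt inline from the six rows shown above (via the text argument of read.table()) so that the example is self-contained.

```r
# Rebuild the six rows shown above; with the real data you would read
# packetSample.txt instead
packets <- read.table(text = "1156748010.47817 60
1156748010.47865 1254
1156748010.47878 1514
1156748010.47890 1494
1156748010.47892 114
1156748010.47891 1514", header = FALSE)

# Raise the number of significant digits R displays, keeping the old setting
oldOptions <- options(digits = 15)
print(head(packets))

# Restore the previous display setting
options(oldOptions)
```

Because options() returns the previous settings, saving and restoring them keeps the change from affecting later output in the same session.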