Storage Format Options for the Indian Mothers Data Set


Table of Contents
Introduction
Variables
Data format
Exercises

Introduction

The data originated from the Demographic and Health Survey program. A Stata file containing data for 1000 mothers was provided by Deepankar Basu. The data set described here is provided as a space-delimited ASCII text file. It contains demographic information for 1000 Indian mothers: age, years of education, social status, and whether the mother is employed outside the home. It also records the gender of each of the mother's children and a count of how many of those children are currently alive.


Variables

The data set contains the following variables:

age - The mother's age.
edu - The number of years of formal schooling that the mother has received.
alive - How many of the mother's children are still alive.
middle - Whether the mother is middle-class.
poor - Whether the mother is poor.
work - Whether the mother works for pay outside the home.
cord1 to cord14 - The gender of the mother's first child, second child, and so on, up to the fourteenth child.


Data format

The data set is provided as a space-delimited ASCII text file called india.txt.
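
As a rough illustration of what a space-delimited layout means in practice, the following Python sketch splits each line on whitespace and pairs the fields with variable names. It rests on several assumptions that this description does not confirm: that the file starts with a header row of variable names, and, if it does not, that the columns appear in the order listed under Variables. The function name read_space_delimited and the sample rows are hypothetical and are not taken from india.txt.

```python
import io

# Variable names from the Variables section; their order in the file is an assumption.
COLUMNS = ["age", "edu", "alive", "middle", "poor", "work"] + [
    f"cord{i}" for i in range(1, 15)
]


def read_space_delimited(stream, has_header=True):
    """Parse whitespace-delimited rows into a list of dicts keyed by variable name."""
    rows = []
    # With no header row, fall back to the assumed column order.
    names = None if has_header else COLUMNS
    for line in stream:
        fields = line.split()  # split on any run of whitespace
        if not fields:
            continue  # skip blank lines
        if names is None:
            names = fields  # first non-blank line is treated as the header
            continue
        rows.append(dict(zip(names, fields)))
    return rows


# Made-up rows for illustration only; these are not values from india.txt.
sample = io.StringIO(
    "age edu alive\n"
    "30 5 2\n"
    "41 0 4\n"
)
records = read_space_delimited(sample)
print(records[0]["age"])
```

To try this on the real file, the same function could be called as read_space_delimited(open("india.txt")). All values are kept as strings here, because the description does not say how each variable is coded or how missing values are represented.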

The file indianMothersMeta.xml provides a StatDataML description of the data set.


Exercises

Q: The original storage format
Q: The storage format provided

Q: The original storage format

The data were first obtained in the form of a Stata binary .dta file.

The .dta format is described in detail online.

What are the advantages and disadvantages of having the data stored in this binary format?

Q: The storage format provided

The data are provided in this exercise in a plain text, space-delimited format.

What are the advantages and disadvantages of having the data stored in this plain text format?