Data

Categorical Data (or Qualitative Data)

Numerical Data (or Quantitative Data)

2.1.1 Table for Categorical Data

Example: A survey of company CEOs about their highest college degree gave the following responses:

MBA        MBA        Law        Law        MBA
PhD        None       Masters    Bachelors  Bachelors
MBA        Bachelors  MBA        MBA        Masters
Law        Bachelors  MBA        Bachelors  Bachelors

In this example, the categories are: Bachelors, Law, Masters, MBA, None and PhD.

The degree for each individual CEO is called an observation.

The number of observations in a category is called the frequency of the category.

We use the following frequency table to summarize the above data:

Degree       Frequency
Bachelors    6
Law          3
Masters      2
MBA          7
None         1
PhD          1
Total        20

There are 6 Bachelors in the data, so the frequency of Bachelors is 6.

Total frequency: 6 + 3 + 2 + 7 + 1 + 1 = 20
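The tally above can also be checked by machine. The following is a minimal sketch in Python, assuming the 20 observations are typed in by hand as a list (the variable names are illustrative only and are not part of the course material):

```python
from collections import Counter

# The 20 observations from the CEO survey, in the order listed above.
degrees = [
    "MBA", "MBA", "Law", "Law", "MBA",
    "PhD", "None", "Masters", "Bachelors", "Bachelors",
    "MBA", "Bachelors", "MBA", "MBA", "Masters",
    "Law", "Bachelors", "MBA", "Bachelors", "Bachelors",
]

# Counter tallies how many times each category occurs (its frequency).
frequency = Counter(degrees)

# Print one row per category (alphabetical, ignoring case), then the total.
for category in sorted(frequency, key=str.lower):
    print(f"{category:<10} {frequency[category]}")
print(f"{'Total':<10} {sum(frequency.values())}")
```

The total is computed from the counts rather than typed in, so it serves as a check that no observation was missed.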

The ratio of the frequency of a category to the total frequency is called the relative frequency of the category:

Relative frequency = (frequency of the category) / (total frequency)

We use the following table (frequency and relative frequency table) to show the frequency distribution and relative frequency distribution:

Degree       Frequency    Relative Frequency
Bachelors    6            6/20 = 0.3
Law          3            3/20 = 0.15
Masters      2            2/20 = 0.1
MBA          7            7/20 = 0.35
None         1            1/20 = 0.05
PhD          1            1/20 = 0.05
Total        20           1
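As an illustration, the relative frequency column can be computed directly from the frequency column. This is only a sketch, assuming the frequencies are entered as a Python dictionary (the names `frequency` and `relative_frequency` are illustrative):

```python
# Frequencies copied from the frequency table for the CEO survey.
frequency = {
    "Bachelors": 6,
    "Law": 3,
    "Masters": 2,
    "MBA": 7,
    "None": 1,
    "PhD": 1,
}

# Total frequency is the sum of all category frequencies.
total = sum(frequency.values())

# Relative frequency = frequency of the category / total frequency.
relative_frequency = {category: count / total for category, count in frequency.items()}

for category, count in frequency.items():
    print(f"{category:<10} {count:>2}   {count}/{total} = {relative_frequency[category]:.2f}")
```

The relative frequencies always add up to 1 (apart from rounding), which is a quick way to check a table.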

You may also present the table as below, without showing the computation of the relative frequencies.

Degree       Frequency    Relative Frequency
Bachelors    6            0.3
Law          3            0.15
Masters      2            0.1
MBA          7            0.35
None         1            0.05
PhD          1            0.05
Total        20           1

Sometimes we use percentage instead of relative frequency. The following table is a frequency and percent distribution table.

Degree       Frequency    Percent
Bachelors    6            6/20 × 100% = 30%
Law          3            3/20 × 100% = 15%
Masters      2            2/20 × 100% = 10%
MBA          7            7/20 × 100% = 35%
None         1            1/20 × 100% = 5%
PhD          1            1/20 × 100% = 5%
Total        20           100%

Note: A relative frequency distribution and a percent distribution carry the same information, because percent = relative frequency × 100%. Therefore, if we have constructed a relative frequency table, there is no need to construct a percent table, and vice versa.
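To see why the two tables are interchangeable, here is a short sketch, assuming the relative frequencies from the CEO example are entered by hand (the variable names are illustrative):

```python
# Relative frequencies from the CEO survey table.
relative_frequency = {
    "Bachelors": 0.30,
    "Law": 0.15,
    "Masters": 0.10,
    "MBA": 0.35,
    "None": 0.05,
    "PhD": 0.05,
}

# Percent = relative frequency x 100.
percent = {category: rf * 100 for category, rf in relative_frequency.items()}

# Going back: relative frequency = percent / 100.
recovered = {category: p / 100 for category, p in percent.items()}

for category in relative_frequency:
    print(f"{category:<10} {relative_frequency[category]:.2f}   {percent[category]:.0f}%")
```

Each column can be rebuilt from the other by multiplying or dividing by 100, so either one is enough.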

2.1.2 Table for Numerical Data

Example: The following are the monthly electricity bills (in dollars) for a sample of households:

130 55 45 64 155 66 60 80 102 62

58 75 111 139 81 55 66 90

For numerical data, we often use intervals to classify the data. For example, we may choose the following intervals:

40-59, 60-79, 80-99, 100-119, 120-139 and 140-159

Then we have the following frequency table:

Monthly electricity bill (dollars)    Frequency
40 – 59                               4
60 – 79                               6
80 – 99                               3
100 – 119                             2
120 – 139                             2
140 – 159                             1
Total                                 18

Note: The above shows that four households have bills between 40 and 59 (55, 45, 58, 55).
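Counting by intervals can be sketched in a few lines of Python. This sketch assumes the bills are whole-dollar amounts, so every bill falls in exactly one of the intervals given above (the variable names are illustrative):

```python
# Monthly electricity bills (in dollars) from the example.
bills = [130, 55, 45, 64, 155, 66, 60, 80, 102, 62,
         58, 75, 111, 139, 81, 55, 66, 90]

# The intervals chosen above, as (low, high) pairs; both ends are included.
intervals = [(40, 59), (60, 79), (80, 99), (100, 119), (120, 139), (140, 159)]

# Frequency of an interval = number of bills with low <= bill <= high.
frequency = {
    (low, high): sum(low <= bill <= high for bill in bills)
    for low, high in intervals
}

for (low, high), count in frequency.items():
    print(f"{low} - {high}: {count}")
print("Total:", sum(frequency.values()))
```

If the total does not equal the number of observations (18 here), some value was left out of every interval or counted twice.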

Note: For homework assignments or tests, I will give you the intervals.

The frequency and relative frequency table is as follows:

Monthly electricity bill (dollars)    Frequency    Relative Frequency
40 – 59                               4            4/18 = 0.22
60 – 79                               6            6/18 = 0.33
80 – 99                               3            3/18 = 0.17
100 – 119                             2            2/18 = 0.11
120 – 139                             2            2/18 = 0.11
140 – 159                             1            1/18 = 0.06
Total                                 18           1

Rounding Numbers

In the table above, 4/18 = 0.222.... Keeping more decimal places is more accurate but more tedious. In this course, you are required in most cases to keep at least two decimal places.

We use the following rule to round to 2 decimal places:

If the 3rd decimal digit is 5, 6, 7, 8, or 9, then increase the 2nd decimal digit by 1. For example, in the table above, 1/18 = 0.05555.... Rounded to 2 decimal places, 1/18 = 0.06.

If the 3rd decimal digit is 0, 1, 2, 3, or 4, then keep the 2nd decimal digit unchanged. For example, in the table above, 4/18 = 0.222.... Rounded to 2 decimal places, 4/18 = 0.22.
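The rounding rule can be sketched in Python as follows. Python's built-in round() follows a different convention (an exact 5 is rounded to the nearest even digit) and works on binary floating-point numbers, so this sketch uses the standard decimal module with ROUND_HALF_UP to mirror the rule stated above. The helper name round_2dp is hypothetical and is not part of the course material:

```python
from decimal import Decimal, ROUND_HALF_UP


def round_2dp(numerator, denominator):
    """Round numerator/denominator to 2 decimal places, rounding a 5 up."""
    value = Decimal(numerator) / Decimal(denominator)
    # quantize() keeps two decimal places; ROUND_HALF_UP raises the 2nd
    # decimal digit when the 3rd decimal digit is 5 or more.
    return value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Relative frequencies from the electricity bill table.
for count in (4, 6, 3, 2, 1):
    print(f"{count}/18 = {round_2dp(count, 18)}")
```

Applied to the electricity bill example, this gives the same two-decimal values as the relative frequency table.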