
How can I use categorical (class) inputs with my neural network?


by Predictor Posted May 16, 2004 (Edited Jun 25, 2004)

There are quite a few ways to present categorical inputs to neural networks (specifically, multilayer perceptrons). To be clear, a categorical variable takes on a finite number of distinct values which are not ordered. An example of a categorical variable is "Country of Origin" for automobiles, which assumes values like "USA", "Germany" and "Japan", which do not have a natural order. In contrast, ordinal variables, such as "Size", have some inherent ordering, like: "Small", "Medium" and "Large".

The simplest way to present categorical data to neural networks is using dummy variables. One 0/1 flag is created for each possible value, like this:

"Country of Origin"   "USAFlag"   "GermanyFlag"   "JapanFlag"
"USA"                 1           0               0
"Germany"             0           1               0
"Japan"               0           0               1
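
As a rough illustration, the table above could be produced with a few lines of plain Python. This is only a sketch: the function name dummy_encode and the fixed category list are assumptions made for the example, not part of any particular neural network package.

```python
def dummy_encode(values, categories):
    """Return one row of 0/1 flags per input value, one flag per category."""
    return [[1 if value == category else 0 for category in categories]
            for value in values]


# Category order fixes the flag order: USAFlag, GermanyFlag, JapanFlag.
categories = ["USA", "Germany", "Japan"]
rows = dummy_encode(["USA", "Germany", "Japan"], categories)

for country, flags in zip(categories, rows):
    print(country, flags)
```

Each row holds exactly one 1, in the position of the matching category, which is what localizes the information about each value.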

This permits categorical data to be input to a neural network (or any mathematical model) and effectively localizes the information about each categorical value. It does, however, expand the number of input variables dramatically. In most situations, it is possible to use one fewer flag than the total number of values, since the last value is implied when all of the other flags are zero: when "USAFlag" and "GermanyFlag" are both zero, the value must be "Japan". To reduce the number of dummy variables further, classes with small representation can be collected under an "Other" category flag. Although this discards the information distinguishing those small categories from one another, it can reduce the number of inputs substantially.
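
Both reductions described above can be sketched in plain Python. Everything named here is hypothetical and chosen only for illustration: the helper names, the min_count threshold, and the sample data (including "Sweden" and "Korea") do not come from a real data set.

```python
from collections import Counter


def collapse_rare(values, min_count, other_label="Other"):
    """Replace values seen fewer than min_count times with other_label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]


def dummy_encode_drop_last(values, categories):
    """One flag per category except the last, which is implied by all zeros."""
    kept = categories[:-1]
    return [[1 if v == c else 0 for c in kept] for v in values]


# Hypothetical sample: "Sweden" and "Korea" each appear only once.
sample = ["USA", "USA", "Germany", "Germany", "Japan", "Japan", "Sweden", "Korea"]
collapsed = collapse_rare(sample, min_count=2)

# "Other" is listed last, so it becomes the implied (all-zero) category.
categories = ["USA", "Germany", "Japan", "Other"]
encoded = dummy_encode_drop_last(collapsed, categories)

for value, flags in zip(collapsed, encoded):
    print(value, flags)
```

Five distinct values end up as three inputs: the two rare countries share the "Other" class, and that class needs no flag of its own because it is the case where every other flag is zero.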

A warning: For those familiar with binary numbering, it may be tempting to reduce the number of inputs by encoding the values as binary integers. This is not likely to work well. Consider an expanded "Country of Origin" example, which adds two values: "Britain" and "Italy". One might "compress" the five values into only 3 bits like this:

"Country of Origin"   "FlagA"   "FlagB"   "FlagC"
"USA"                 0         0         1
"Germany"             0         1         0
"Japan"               0         1         1
"Britain"             1         0         0
"Italy"               1         0         1

In this representation, note that different pairs of values have varying numbers of bits in common. For instance, "Japan" differs from "Britain" in all three bits, while differing from "Germany" in only one. This implies an artificial similarity between "Japan" and "Germany" that the network may try to exploit. Keep in mind that there is no implicit ordering of these values. Stick to dummy variables and avoid binary representations.
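
The uneven similarity can be checked by counting differing bits (the Hamming distance) between every pair of codes. The sketch below uses the 3-bit codes from the table and compares them with one-flag-per-value dummy coding; the helper names are assumptions made for the example.

```python
from itertools import combinations

# 3-bit codes copied from the table above (FlagA, FlagB, FlagC).
binary_codes = {
    "USA":     (0, 0, 1),
    "Germany": (0, 1, 0),
    "Japan":   (0, 1, 1),
    "Britain": (1, 0, 0),
    "Italy":   (1, 0, 1),
}


def hamming(a, b):
    """Count the positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


def one_hot(index, size):
    """Dummy code: a single 1 at position index, zeros elsewhere."""
    return tuple(1 if i == index else 0 for i in range(size))


names = list(binary_codes)
dummy_codes = {name: one_hot(i, len(names)) for i, name in enumerate(names)}

binary_distances = {
    (a, b): hamming(binary_codes[a], binary_codes[b])
    for a, b in combinations(names, 2)
}
dummy_distances = {
    (a, b): hamming(dummy_codes[a], dummy_codes[b])
    for a, b in combinations(names, 2)
}

print(binary_distances[("Germany", "Japan")])   # differs in one bit
print(binary_distances[("Japan", "Britain")])   # differs in all three bits
print(set(dummy_distances.values()))            # every pair is equally far apart
```

With dummy variables every pair of countries differs in exactly two flags, so no pair looks closer than any other. The binary codes give distances of 1, 2 or 3 depending purely on how the values happened to be numbered.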

The literature also records some success representing class inputs by the sample probability of the target class in classification problems. For instance, if the neural network is intended to classify cars as being "fuel efficient" (say, MPG >= 35) or "fuel inefficient" (MPG < 35), one can represent the categorical variable "Country of Origin" by the probability that examples from each value are "fuel efficient", like this:

"Country of Origin"   "CountryClassSummary"
"USA"                 0.45
"Germany"             0.67
"Japan"               0.83
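
A summary column like "CountryClassSummary" can be computed by counting, for each value, the fraction of examples that fall in the target class. The sketch below is one possible way to do it; the function name and the MPG figures are made up for illustration, so the resulting probabilities do not match the table above.

```python
from collections import defaultdict


def class_probability_encoding(categories, is_positive):
    """Map each category to the sample probability of the positive class."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for category, flag in zip(categories, is_positive):
        totals[category] += 1
        positives[category] += int(flag)
    return {category: positives[category] / totals[category]
            for category in totals}


# Hypothetical training examples (country, MPG); not real data.
countries = ["USA", "USA", "USA", "USA",
             "Germany", "Germany", "Germany",
             "Japan", "Japan", "Japan", "Japan"]
mpg = [38, 22, 30, 36,
       40, 36, 28,
       41, 37, 33, 39]

# Target class from the text: "fuel efficient" means MPG >= 35.
fuel_efficient = [m >= 35 for m in mpg]

encoding = class_probability_encoding(countries, fuel_efficient)
network_input = [encoding[c] for c in countries]

for country, probability in encoding.items():
    print(country, round(probability, 2))
```

The categorical variable then enters the network as a single numeric input. It is probably wise to estimate these probabilities from the training examples only, since they are derived from the target itself.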
