Machine learning - feeding categorical data to a classifier


Suppose I have a dataset in the following format:

col1    col2    col3    col4         col5 (to be predicted)
12      13      4       primary      12
1       15      2       secondary    13
5       7       8       primary      18
14      12      44      college      6

col5 needs to be predicted for the test data using col1, col2, col3, and col4.

During training, col1, col2, and col3 can be fed as they are, in an array, to the classifier, but how do I feed col4? I am aware that it is categorical and needs to be converted to a numeric type, but after assigning numbers it will still remain a nominal type.

So if primary=1, secondary=2, and college=3, the numbers 1, 2, and 3 can't be compared by magnitude, because they are still labels with no numerical significance.

So how should I proceed after this step? Should the values be normalized, or should something further be done?

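You should use one-hot encoding in such cases. Every possible categorical value creates a new binary feature. Here, col4 would become three 0/1 columns, one each for primary, secondary, and college, with exactly one of them set to 1 in every row.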
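As a rough illustration of the idea, here is a minimal dependency-free sketch that one-hot encodes col4 for the four rows from the question. The helper names `fit_categories` and `one_hot` are hypothetical, chosen only for this example, and sorting the categories alphabetically is an arbitrary assumption to keep the column order stable.

```python
# Rows from the question: col1, col2, col3 (numeric) and col4 (categorical).
rows = [
    (12, 13, 4, "primary"),
    (1, 15, 2, "secondary"),
    (5, 7, 8, "primary"),
    (14, 12, 44, "college"),
]


def fit_categories(values):
    """Return the distinct category values in a fixed (sorted) order.

    Hypothetical helper: the sorted order is an assumption, any stable
    order works as long as training and test data share it.
    """
    return sorted(set(values))


def one_hot(value, categories):
    """Return a 0/1 list with a single 1 at the position of `value`."""
    if value not in categories:
        raise ValueError(f"unseen category: {value!r}")
    return [1 if value == c else 0 for c in categories]


# Learn the category list from the training rows only.
categories = fit_categories(row[3] for row in rows)

# Keep the numeric columns as they are and append one binary column
# per category, so no ordering between categories is implied.
X = [list(row[:3]) + one_hot(row[3], categories) for row in rows]

for features in X:
    print(features)
```

Each row now has six numeric features: the three original columns plus one binary column per category, which can be passed to the classifier together. The category list should be learned from the training data and reused unchanged for the test data. In practice, libraries offer the same transformation, for example `pandas.get_dummies` or scikit-learn's `OneHotEncoder`.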

one hot encoding machine learning

