On Sat, 1 Oct 2016 18:12:29 -0700 (PDT), 38016226...@gmail.com wrote:
> I am trying to print a simple decision tree for my homework.
> The answer must be kept in this format:
>
> Top 7,4,0.95
> career gain = 100
>       1.Management 2, 3, 0.9709505944546686
>       2.Service 5, 1, 0.6500224216483541
> location gain = 100
>       1.Oregon 4, 1, 0.7219280948873623
>       2.California 3, 3, 1.0
> edu_level gain = 100
>       1.High School 5, 1, 0.6500224216483541
>       2.College 2, 3, 0.9709505944546686
> years_exp gain = 100
>       1.Less than 3 3, 1, 0.8112781244591328
>       2.3 to 10 2, 1, 0.9182958340544896
>       3.More than 10 2, 2, 1.0
>
> Here is my code:
>     features={'edu_level':['High School',
>                            'College'],
>               'career':    ['Management',
>                             'Service'],
>               'years_exp':['Less than 3',
>                            '3 to 10',
>                            'More than 10'],
>               'location':['Oregon',
>                           'California']}
>
>     print('Top 7,4,0.95')
>     for key in features:
>         print('{} gain = {}'.format(key,100))
>         attributes_list=features[key]
>         kargs={}
>         for i in range(len(attributes_list)):
>             kargs[key]=attributes_list[i]
>             low=table.count('Low',**kargs)
>             high=table.count('High',**kargs)
>             print('\t{}.{} {}, {}, {}'.format(
>                   i+1,attributes_list[i],low,high,entropy(low,high)))
>
> I have set every gain to 100 for now. But the gain must actually be
> calculated from the data below it.  For example, the career gain needs
> the data for 'Management' and 'Service'.  I don't know how to do that.
> Or can anyone suggest better logic?

I interpret your question as meaning that the value that you 
print after "gain =" should depend on features[key].  To do that,
you'll need to insert a line resembling
          gain = gain_from_features(features[key])
before the print statement.  You'll have to write the gain_from_features
function, and provide it with the numbers from which it will
compute the gain.
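
Here is a minimal sketch of what such a function might look like.  It
assumes "gain" means the usual information gain (parent entropy minus
the size-weighted entropy of the children); your homework may define it
differently.  The names entropy and gain_from_counts are illustrative
only: your own entropy function is not shown in your post, so a
stand-in is defined here, and the function takes the low/high counts
directly rather than calling table.count.

```python
from math import log2


def entropy(low, high):
    """Binary entropy (in bits) of a node holding `low` and `high` counts."""
    total = low + high
    result = 0.0
    for count in (low, high):
        if count:  # a zero count contributes nothing (0 * log2(0) -> 0)
            p = count / total
            result -= p * log2(p)
    return result


def gain_from_counts(parent, children):
    """Information gain: parent entropy minus weighted child entropy.

    parent   -- (low, high) counts before the split
    children -- list of (low, high) counts, one per attribute value
    """
    total = sum(parent)
    remainder = sum((low + high) / total * entropy(low, high)
                    for low, high in children)
    return entropy(*parent) - remainder


# Counts for 'career', copied from the expected output in the question:
# Top is 7, 4; Management is 2, 3; Service is 5, 1.
career_gain = gain_from_counts((7, 4), [(2, 3), (5, 1)])
print('career gain = {}'.format(career_gain))
```

In your loop you would collect the (low, high) pairs for each attribute
first, compute the gain from them, and only then print the "gain ="
line followed by the per-attribute lines.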

As a stylistic suggestion, note that Python allows you to break
your "features=" line into a more readable format, as I have done
above.

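Another stylistic suggestion: iterate over the keys and values together.
  for key, attributes_list in features.items():
(In Python 2, features.iteritems() does the same without building a
list; Python 3 has only items().)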
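
A sketch of the loop written that way, for Python 3 (where the dict
method is items()).  The use of enumerate to replace range(len(...)) is
an extra suggestion of mine, the features dict is shortened, and the
table.count calls are left out so that the example runs on its own.

```python
features = {'career': ['Management', 'Service'],
            'years_exp': ['Less than 3', '3 to 10', 'More than 10']}

lines = []
for key, attributes_list in features.items():
    # 100 is still the placeholder gain from the original code.
    lines.append('{} gain = {}'.format(key, 100))
    # enumerate(..., 1) yields (1, first), (2, second), ... so there is
    # no need to index with attributes_list[i] or to write i+1.
    for i, attribute in enumerate(attributes_list, 1):
        lines.append('\t{}.{}'.format(i, attribute))

print('\n'.join(lines))
```

Note that a plain dict preserves insertion order only from Python 3.7
on; on older versions the features may print in a different order.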

-- 
To email me, substitute nowhere->runbox, invalid->com.