
Big Data Mining: Assignment 2


Practical Document

Data Mining: Assignment 2

Question 1:

a) Compute the Information Gain for Gender, Car Type and Shirt Size.
b) Construct a decision tree with Information Gain.

Answer:

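a) The class attribute takes two values, C0 and C1, with 10 tuples each, so the expected information (entropy) needed to classify a tuple in D is

$$Info(D) = -\frac{10}{20}\log_2\frac{10}{20} - \frac{10}{20}\log_2\frac{10}{20} = 1$$

1. Split on Gender:

$$Info_{Gender}(D) = \frac{10}{20}\left(-\frac{4}{10}\log_2\frac{4}{10} - \frac{6}{10}\log_2\frac{6}{10}\right) + \frac{10}{20}\left(-\frac{4}{10}\log_2\frac{4}{10} - \frac{6}{10}\log_2\frac{6}{10}\right) = 0.971$$

Gain(Gender) = 1 - 0.971 = 0.029

2. Split on Car Type: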

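$$Info_{CarType}(D) = \frac{4}{20}\left(-\frac{1}{4}\log_2\frac{1}{4} - \frac{3}{4}\log_2\frac{3}{4}\right) + \frac{8}{20}\left(-\frac{8}{8}\log_2\frac{8}{8} - \frac{0}{8}\log_2\frac{0}{8}\right) + \frac{8}{20}\left(-\frac{1}{8}\log_2\frac{1}{8} - \frac{7}{8}\log_2\frac{7}{8}\right) = 0.380$$

Here the term $0 \cdot \log_2 0$ is taken to be 0.

Gain(Car Type) = 1 - 0.380 = 0.620

3. Split on Shirt Size: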

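$$Info_{ShirtSize}(D) = \frac{5}{20}\left(-\frac{3}{5}\log_2\frac{3}{5} - \frac{2}{5}\log_2\frac{2}{5}\right) + \frac{7}{20}\left(-\frac{3}{7}\log_2\frac{3}{7} - \frac{4}{7}\log_2\frac{4}{7}\right) + \frac{4}{20}\left(-\frac{2}{4}\log_2\frac{2}{4} - \frac{2}{4}\log_2\frac{2}{4}\right) + \frac{4}{20}\left(-\frac{2}{4}\log_2\frac{2}{4} - \frac{2}{4}\log_2\frac{2}{4}\right) = 0.988$$

Gain(Shirt Size) = 1 - 0.988 = 0.012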
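The three gains can be checked with a short Python sketch. The per-value class counts below are read off the fractions in the formulas above; the function names (`entropy`, `info_after_split`, `information_gain`) are illustrative choices, not part of the original answer.

```python
from math import log2


def entropy(counts):
    """Entropy (in bits) of a class distribution given as raw counts."""
    total = sum(counts)
    # Zero counts are skipped, which applies the convention 0 * log2(0) = 0.
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


def info_after_split(partitions):
    """Weighted entropy of the partitions produced by splitting on an attribute."""
    n = sum(sum(p) for p in partitions)
    return sum(sum(p) / n * entropy(p) for p in partitions)


def information_gain(class_counts, partitions):
    """Info(D) minus the expected information after the split."""
    return entropy(class_counts) - info_after_split(partitions)


# (C0 count, C1 count) for the whole data set.
CLASS_COUNTS = (10, 10)

# (C0 count, C1 count) for each value of each attribute.
SPLITS = {
    "Gender": [(6, 4), (4, 6)],
    "Car Type": [(1, 3), (8, 0), (1, 7)],          # Family, Sports, Luxury
    "Shirt Size": [(3, 2), (3, 4), (2, 2), (2, 2)],  # Small, Medium, Large, Extra Large
}

gains = {attr: information_gain(CLASS_COUNTS, parts) for attr, parts in SPLITS.items()}
for attr, gain in gains.items():
    print(f"Gain({attr}) = {gain:.3f}")

# The attribute with the largest gain becomes the root of the tree.
best = max(gains, key=gains.get)
print("Best split:", best)
```

Working from counts rather than the raw tuples keeps the sketch short; the same functions apply unchanged when choosing a split at a lower level of the tree.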

b) The results in a) show that splitting on Car Type yields the largest information gain, so Car Type is the root of the decision tree:

Car Type?
    Family: Shirt Size?
        Small: C0
        Medium, Large, Extra Large: C1
    Sports: C0
    Luxury: C1

Question 2:

(a) Design a multilayer feed-forward neural network (one hidden layer) for the data set in Q1. Label the nodes in the input and output layers.

(b) Using the neural network obtained above, show the weight values after one iteration of the backpropagation algorithm, given the training instance "(M, Family, Small)". Indicate your initial weight values and biases and the learning rate used.

a)


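[Figure: feed-forward network. Input layer: units 1 to 9 (x11, x12, x21, x22, x23, x31, x32, x33, x34). Hidden layer: units 10 and 11. Output layer: unit 12.]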

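b) From a), each input unit represents one attribute value. For the training instance (M, Family, Small), the input units are initialised as follows:

Unit   Attribute value   Input
x11    F                 0
x12    M                 1
x21    Family            1
x22    Sports            0
x23    Luxury            0
x31    Small             1
x32    Medium            0
x33    Large             0
x34    Extra Large       0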
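The encoding in the table can be reproduced with a minimal Python sketch. The names `ATTRIBUTE_VALUES` and `one_hot` are hypothetical helpers introduced for illustration; the value order follows the unit order x11 to x34 listed above.

```python
# Attribute values in the same order as the input units x11 ... x34.
ATTRIBUTE_VALUES = [
    ("Gender", ["F", "M"]),
    ("Car Type", ["Family", "Sports", "Luxury"]),
    ("Shirt Size", ["Small", "Medium", "Large", "Extra Large"]),
]


def one_hot(instance):
    """Map a (gender, car type, shirt size) tuple to the nine 0/1 network inputs."""
    vector = []
    for (name, values), value in zip(ATTRIBUTE_VALUES, instance):
        if value not in values:
            raise ValueError(f"unknown {name} value: {value!r}")
        # Exactly one unit per attribute is set to 1.
        vector.extend(1 if value == v else 0 for v in values)
    return vector


x = one_hot(("M", "Family", "Small"))
print(x)  # [0, 1, 1, 0, 0, 1, 0, 0, 0]
```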

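Because the initial weights and biases are randomly generated, the initial values are defined here as:

Weight   Value    Weight    Value    Bias   Value
W1,10     0.2     W6,10      0.1     θ10    -0.2
W1,11     0.2     W6,11     -0.2     θ11     0.2
W2,10    -0.2     W7,10     -0.4     θ12     0.3
W2,11    -0.1     W7,11      0.2
W3,10     0.4     W8,10      0.2
W3,11     0.3     W8,11      0.2
W4,10    -0.2     W9,10     -0.1
W4,11    -0.1     W9,11      0.3
W5,10     0.1     W10,12    -0.3
W5,11    -0.1     W11,12    -0.1

Net input and output of each unit, where the output is the logistic function of the net input, Oj = 1 / (1 + e^(-Ij)):

Unit j   Net input Ij   Output Oj
10       0.1            0.52
11       0.2            0.55
12       0.089          0.52

Error at each node: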
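The forward pass, followed by a single backpropagation update from these initial values, can be sketched in Python as below. Two values are assumptions made purely for illustration, since neither appears in the answer above: the target output `TARGET = 1.0` for the training instance and the learning rate `LEARNING_RATE = 0.9`. The update rules are the standard ones for sigmoid units: Err_j = O_j(1 - O_j)(T - O_j) for the output unit, Err_j = O_j(1 - O_j) Σ_k Err_k w_jk for hidden units, then w_ij += l · Err_j · O_i and θ_j += l · Err_j. Because the code does not round O10 and O11 before computing I12, it gives I12 ≈ 0.0875 rather than the tabulated 0.089.

```python
import math

# Inputs for (M, Family, Small); units 1..9 correspond to x11 ... x34.
x = {1: 0, 2: 1, 3: 1, 4: 0, 5: 0, 6: 1, 7: 0, 8: 0, 9: 0}

# Initial weights w[(i, j)] (from unit i to unit j) and biases theta[j].
w = {
    (1, 10): 0.2, (1, 11): 0.2, (2, 10): -0.2, (2, 11): -0.1,
    (3, 10): 0.4, (3, 11): 0.3, (4, 10): -0.2, (4, 11): -0.1,
    (5, 10): 0.1, (5, 11): -0.1, (6, 10): 0.1, (6, 11): -0.2,
    (7, 10): -0.4, (7, 11): 0.2, (8, 10): 0.2, (8, 11): 0.2,
    (9, 10): -0.1, (9, 11): 0.3, (10, 12): -0.3, (11, 12): -0.1,
}
theta = {10: -0.2, 11: 0.2, 12: 0.3}

HIDDEN = (10, 11)
OUTPUT = 12


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def forward(x, w, theta):
    """Return net inputs and outputs for every hidden and output unit."""
    out = dict(x)  # input units pass their value straight through
    net = {}
    for j in HIDDEN:
        net[j] = sum(w[(i, j)] * out[i] for i in x) + theta[j]
        out[j] = sigmoid(net[j])
    net[OUTPUT] = sum(w[(j, OUTPUT)] * out[j] for j in HIDDEN) + theta[OUTPUT]
    out[OUTPUT] = sigmoid(net[OUTPUT])
    return net, out


def backprop_step(x, w, theta, target, lr):
    """One iteration of backpropagation; returns errors and updated parameters."""
    _, out = forward(x, w, theta)
    err = {OUTPUT: out[OUTPUT] * (1 - out[OUTPUT]) * (target - out[OUTPUT])}
    for j in HIDDEN:
        err[j] = out[j] * (1 - out[j]) * err[OUTPUT] * w[(j, OUTPUT)]
    new_w = {(i, j): wij + lr * err[j] * out[i] for (i, j), wij in w.items()}
    new_theta = {j: t + lr * err[j] for j, t in theta.items()}
    return err, new_w, new_theta


TARGET = 1.0         # assumption: desired output for this training instance
LEARNING_RATE = 0.9  # assumption: not specified in the answer above

net, out = forward(x, w, theta)
err, new_w, new_theta = backprop_step(x, w, theta, TARGET, LEARNING_RATE)

for j in (*HIDDEN, OUTPUT):
    print(f"unit {j}: I = {net[j]:.3f}, O = {out[j]:.3f}, Err = {err[j]:.4f}")
for key in sorted(new_w):
    print(f"W{key[0]},{key[1]} = {new_w[key]:.4f}")
for j in sorted(new_theta):
    print(f"theta{j} = {new_theta[j]:.4f}")
```

Only weights leaving an active unit change: inputs that are 0 contribute nothing to the update, so for this instance W1,j, W4,j, W5,j, W7,j, W8,j and W9,j keep their initial values.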
