命题规则学习(下)
(Press ?
for help, n
and p
for next and previous slide)
\[H\leftarrow B_1 \wedge B_2 \wedge\ldots \wedge B_n.\]
也叫析取范式(Disjunctive Normal Form, DNF):用“\(\vee\)”把所有可能的正样例都包含进来 \[positive\leftarrow (\mathbf{f}_1 \wedge \mathbf{f}_2) \vee (\mathbf{f}_3 \wedge \mathbf{f}_4)\vee \ldots.\]
它可以等价地表示为一个命题逻辑规则集:
命题逻辑集是由命题逻辑规则构成的集合。
Closed World Assumption (CWA)
相当于:
也叫Sequential Covering
1: # E: 训练样本
2: function covering(E)
3: R = [] # 初始化规则集合
4: E = P, N # P/N: 正/负样例集合
5:
6: E_rest = E # 当前还未覆盖的样本
7: P_rest = P # 当前还未覆盖的负样例
8:
9: F = init_features(E) # 初始化特征集合
10: while isempty(P_rest) == false
11: # 检查是否还需要学更多规则
12: if need_more_rule(R, E_rest) == false
13: break
14: end
15:
16: # 学习单条命题规则r, 并加入规则集R
17: r = find_best_rule(E_rest, F)
18: push!(R, r)
19:
20: # 调整样本集合(例如删掉被覆盖的样例)
21: E_rest = adjust_examples(R, E_rest)
22: end
23:
24: # 后处理
25: R = post_process_ruleset(R)
26:
27: return R
28: end
精度(accuracy)vs准确率(precision)
精度(accuracy)vs准确率(precision)
1: function RISE(E)
2: # P/N为正/负样例
3: P, N = E
4:
5: # 生成bottom rules, 即只覆盖一个正样例的规则
6: R = bottom_rules(E)
7:
8: while true
9: for r in R
10: # 在P中找到:
11: # 1. 距离规则r最近(Hamming)的
12: # 2. 未覆盖的正样例
13: e = select_nearest_uncovered_neighbor(P, r)
14:
15: # 泛化规则,令r_1能覆盖r和e
16: r_1 = generalize(r, e)
17:
18: # 如果新规则不导致准确率下降,则保留
19: R_1 = R - r + r_1
20: if ! accuracy(R_1) < accuracy(R)
21: R = R_1
22: end
23: end
24:
25: # 如果满足退出条件,返回R
26: if meeting_stop_criterion(R)
27: break
28: end
29: end
30:
31: return R
32: end
与所有机器学习模型一样,在有噪声的数据中学习可能造成命题规则集过拟合(overfitting)
与决策树一样,可分为:
利用启发式函数约束,在规则(集)生长时及时停止
对于已学得的规则集进行以下行为,并评估打分函数
REP [Brunk and Pazzani, 1991]和IREP [Fürnkranz and Widmer, 1994]
“A small network illustrating the limitations of prepositional descriptions.” [Quinlan, 1990]
假设网络:
需要学习的目标概念:“两个节点相接”
只需要用一个谓词:\(\text{linked-to}(X,Y)\)
“两个节点相接”: \[ \text{linked-to-each-other}(X,Y)\leftarrow \text{linked-to}(X,Y)\wedge \text{linked-to}(Y, X) \]
“两个节点连通”:
“两个节点连通”:
初始规则1:\(\text{can-reach}(X,Y)\leftarrow\)
加入文字:\(\text{can-reach}(X,Y)\leftarrow \text{linked-to}(X,Y)\)
初始规则2:\(\text{can-reach}(X,Y)\leftarrow\)
加入文字:\(\text{can-reach}(X,Y)\leftarrow\text{linked-to}(X,Z)\)
包含变量\(\langle X, Y, Z\rangle\),扩展样例:
加入文字:\(\text{can-reach}(X,Y)\leftarrow\text{linked-to}(X,Z)\wedge\text{can-reach}(Z,Y)\)
\[ \text{正样本覆盖率}\cdot(\text{信息量}-\text{精化前信息量})=\hat{P}\cdot\left(\log_2{\frac{\hat{P}}{\hat{P}+\hat{N}}}-\log_2 c\right) \]