Extension neural network

Extension neural network (ENN) is a pattern recognition method proposed by M. H. Wang and C. P. Hung in 2003 to classify instances of data sets. It combines the artificial neural network with concepts from extension theory, using the fast, adaptive learning capability of neural networks together with the correlation estimation property of extension theory, which is obtained by calculating the extension distance.
ENN has been used in a variety of classification applications.

Extension theory was first proposed by Cai in 1983 to solve contradictory problems. Whereas classical mathematics deals with the quantities and forms of objects, extension theory transforms these objects into matter-element models:

$$R = (N, C, V) \qquad (1)$$

where, in the matter-element $R$, $N$ is the name or type, $C$ is its characteristics and $V$ is the corresponding value for each characteristic. A corresponding example is given in equation 2:

$$R = \left( N,\; \begin{bmatrix} \text{Height} \\ \text{Weight} \end{bmatrix},\; \begin{bmatrix} V_{\text{Height}} \\ V_{\text{Weight}} \end{bmatrix} \right) \qquad (2)$$

where the $\text{Height}$ and $\text{Weight}$ characteristics form extension sets. These extension sets are defined by the $V$ values, which are range values for the corresponding characteristics. Extension theory concerns the extension correlation function between matter-element models, like the one shown in equation 2, and extension sets. The extension correlation function is used to define the extension space, which is composed of pairs of elements and their extension correlation function values. The extension space formula is shown in equation 3:

$$A = \{(x, y) \mid x \in U,\; y = K(x)\} \qquad (3)$$

where $A$ is the extension space, $U$ is the object space, $K$ is the extension correlation function, $x$ is an element of the object space and $y$ is the corresponding extension correlation function output for the element $x$. $K(x)$ maps $x$ to a membership interval $\left[-\infty, \infty\right]$. The negative region represents the degree to which an element does not belong to a class, and the positive region the degree to which it does. If $x$ is mapped to $\left[0, 1\right]$, extension theory acts like fuzzy set theory. The correlation function is shown in equation 4.

$$\rho(x, X_{out}) = \left| x - \frac{c+d}{2} \right| - \frac{d-c}{2} \qquad (4)$$

where $X_{in}$ and $X_{out}$ are called the concerned domain and the neighborhood domain, and their intervals are $(a, b)$ and $(c, d)$ respectively, with $\rho(x, X_{in})$ defined analogously to equation 4 over $(a, b)$. The extended correlation function used to estimate the membership degree between $x$ and $X_{in}$, $X_{out}$ is shown in equation 5:

$$K(x) = \frac{\rho(x, X_{in})}{D(x, X_{in}, X_{out})}, \qquad D(x, X_{in}, X_{out}) = \begin{cases} \rho(x, X_{out}) - \rho(x, X_{in}), & x \notin X_{in} \\ -1, & x \in X_{in} \end{cases} \qquad (5)$$
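Equations 4 and 5 can be made concrete with a short Python sketch. This is a minimal illustration assuming the piecewise form of equation 5 reconstructed above; the function names and the example interval values are not from the original text.

```python
def rho(x, interval):
    """Extension distance between a point x and an interval (equation 4).

    Negative when x lies inside the interval, positive when outside.
    """
    lo, hi = interval
    return abs(x - (lo + hi) / 2.0) - (hi - lo) / 2.0

def K(x, x_in, x_out):
    """Extended correlation function (equation 5).

    x_in  = (a, b): concerned domain
    x_out = (c, d): neighborhood domain, assumed to contain x_in
    """
    r_in = rho(x, x_in)
    if r_in <= 0:          # x inside the concerned domain: D = -1
        return -r_in
    return r_in / (rho(x, x_out) - r_in)

# Illustrative values: membership degree of a 1.75 m height for a class
# whose Height range is (1.6, 1.9) within a wider domain (1.4, 2.1).
print(K(1.75, (1.6, 1.9), (1.4, 2.1)))   # positive: belongs
print(K(2.00, (1.6, 1.9), (1.4, 2.1)))   # negative: does not belong
```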

The extension neural network has a neural-network-like appearance. A weight vector resides between the input nodes and the output nodes. The output nodes are the representation of the input nodes, obtained by passing them through the weight vector.

The total numbers of input and output nodes are represented by $n$ and $n_c$, respectively; these numbers depend on the number of characteristics and the number of classes. Rather than using one weight value between two layer nodes, as in a classical neural network, the extension neural network architecture has two weight values. In this architecture, for instance $i$, $x_{ij}^{p}$ is the input which belongs to class $p$, and $o_{ik}$ is the corresponding output for class $k$. The output $o_{ik}$ is calculated using the extension distance, as shown in equation 6:

$$o_{ik} = ED_{ik} = \sum_{j=1}^{n} \left( \frac{\left| x_{ij}^{p} - z_{kj} \right| - \frac{w_{kj}^{U} - w_{kj}^{L}}{2}}{\left| \frac{w_{kj}^{U} - w_{kj}^{L}}{2} \right|} + 1 \right), \qquad k = 1, 2, \ldots, n_c \qquad (6)$$

where the centers $z_{kj}$ and the weights $w_{kj}^{L}$, $w_{kj}^{U}$ are defined in equations 9 and 8 below.

The estimated class is found by searching for the minimum of the extension distances calculated for all classes, as summarized in equation 7, where $k^*$ is the estimated class:

$$k^* = \arg\min_{k}(o_{ik}) \qquad (7)$$
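Under the reconstructed forms of equations 6 and 7, classification reduces to an arg-min over per-class extension distances. The following NumPy sketch assumes non-degenerate ranges ($w^U > w^L$ for every feature) and illustrative array shapes:

```python
import numpy as np

def extension_distance(x, z, w_lo, w_hi):
    """Extension distance ED_ik of one instance to every class (equation 6).

    x:          (n,)     feature vector of the instance
    z:          (n_c, n) cluster centers z_kj
    w_lo, w_hi: (n_c, n) lower/upper weights w_kj^L, w_kj^U
    Returns a vector of length n_c, one distance per class.
    """
    half = (w_hi - w_lo) / 2.0   # assumed non-zero for every feature
    return np.sum((np.abs(x - z) - half) / np.abs(half) + 1.0, axis=1)

def classify(x, z, w_lo, w_hi):
    """Estimated class k* = argmin_k ED_ik (equation 7)."""
    return int(np.argmin(extension_distance(x, z, w_lo, w_hi)))
```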

Each class is composed of ranges of characteristics. These characteristics are the input types, or names, which come from the matter-element model. Weight values in the extension neural network represent these ranges. In the learning algorithm, the weights are first initialized by searching for the maximum and minimum values of the inputs for each class, as shown in equation 8:

$$w_{kj}^{L} = \min_{i}\{x_{ij}^{k}\}, \qquad w_{kj}^{U} = \max_{i}\{x_{ij}^{k}\} \qquad (8)$$
$$i = 1, \ldots, N_p, \qquad k = 1, \ldots, n_c, \qquad j = 1, 2, \ldots, n$$

where $i$ is the instance number and $j$ is the input number. This initialization provides the classes' ranges according to the given training data.

After obtaining the weights, the centers of the clusters are found through equation 9:

$$z_{kj} = \frac{w_{kj}^{U} + w_{kj}^{L}}{2}, \qquad k = 1, 2, \ldots, n_c, \qquad j = 1, 2, \ldots, n \qquad (9)$$
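A sketch of this initialization step (equations 8 and 9), assuming NumPy arrays and integer class labels; the function and variable names are illustrative:

```python
import numpy as np

def initialize_weights(X, y, n_c):
    """Per-class feature ranges (equation 8) and cluster centers (equation 9).

    X: (N_p, n) training instances; y: (N_p,) labels in {0, ..., n_c - 1}.
    """
    n = X.shape[1]
    w_lo = np.empty((n_c, n))
    w_hi = np.empty((n_c, n))
    for k in range(n_c):
        members = X[y == k]              # instances of class k
        w_lo[k] = members.min(axis=0)    # w_kj^L = min_i x_ij^k
        w_hi[k] = members.max(axis=0)    # w_kj^U = max_i x_ij^k
    z = (w_hi + w_lo) / 2.0              # z_kj (equation 9)
    return w_lo, w_hi, z
```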

Before the learning process begins, a predefined learning performance rate is given, as shown in equation 10:

$$E_\tau = \frac{N_m}{N_p} \qquad (10)$$

where $N_m$ is the number of misclassified instances and $N_p$ is the total number of instances. The initialized parameters are used to classify instances using equation 6. If the initialization is not sufficient with respect to the learning performance rate, training is required. In the training step, the weights are adjusted to classify the training data more accurately, so the aim is to reduce the learning performance rate. In each iteration, $E_\tau$ is checked to determine whether the required learning performance has been reached, and every training instance is used for training.
Instance $i$, belonging to class $p$, is denoted by:

$$X_i^p = \{x_{i1}^{p}, x_{i2}^{p}, \ldots, x_{in}^{p}\}, \qquad 1 \leq p \leq n_c$$

Every input data point of $X_i^p$ is used in the extension distance calculation to estimate the class of $X_i^p$. If the estimated class $k^* = p$, no update is needed, whereas if $k^* \neq p$, an update is done. In the update case, the separators, which show the relationship between inputs and classes, are shifted proportionally to the distance between the centers of the clusters and the data points.
The update formulas, in which $\eta$ is the learning rate, are:

$$z_{pj}^{new} = z_{pj}^{old} + \eta (x_{ij}^{p} - z_{pj}^{old})$$

$$z_{k^{*}j}^{new} = z_{k^{*}j}^{old} - \eta (x_{ij}^{p} - z_{k^{*}j}^{old})$$

$$w_{pj}^{L(new)} = w_{pj}^{L(old)} + \eta (x_{ij}^{p} - z_{pj}^{old})$$

$$w_{pj}^{U(new)} = w_{pj}^{U(old)} + \eta (x_{ij}^{p} - z_{pj}^{old})$$

$$w_{k^{*}j}^{L(new)} = w_{k^{*}j}^{L(old)} - \eta (x_{ij}^{p} - z_{k^{*}j}^{old})$$

$$w_{k^{*}j}^{U(new)} = w_{k^{*}j}^{U(old)} - \eta (x_{ij}^{p} - z_{k^{*}j}^{old})$$
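The six formulas touch only the true class $p$ and the wrongly estimated class $k^*$; all other classes are left unchanged. A sketch of the update step, reusing the array layout from the earlier sketches (in-place updates, illustrative names):

```python
def update(x, p, k_star, z, w_lo, w_hi, eta):
    """Apply the six update formulas for a misclassified instance x.

    The center and both range weights of the true class p move toward x;
    those of the wrongly estimated class k* move away from x.
    """
    shift_p = eta * (x - z[p])         # eta * (x_ij^p - z_pj^old)
    shift_k = eta * (x - z[k_star])    # eta * (x_ij^p - z_k*j^old)
    z[p]    += shift_p
    w_lo[p] += shift_p
    w_hi[p] += shift_p
    z[k_star]    -= shift_k
    w_lo[k_star] -= shift_k
    w_hi[k_star] -= shift_k
```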

To classify the instance $i$ accurately, the separator of class $p$ for input $j$ moves closer to the data point of instance $i$, whereas the separator of class $k^*$ for input $j$ moves farther away. As an update example, assume that instance $i$ belongs to class A but is classified to class B, because the extension distance calculation gives $ED_A > ED_B$. After the update, the separator of class A moves closer to the data point of instance $i$, whereas the separator of class B moves farther away. Consequently, the extension distance calculation gives $ED_B > ED_A$, and therefore, after the update, instance $i$ is classified to class A.
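Putting the pieces together, one possible training loop is sketched below. It assumes the helper functions from the earlier sketches; `eta`, the target rate, and the epoch cap are illustrative parameters, not values from the original method.

```python
def train(X, y, n_c, eta=0.1, target_rate=0.05, max_epochs=100):
    """Train until the learning performance rate E_tau = N_m / N_p
    (equation 10) drops to the target, or the epoch cap is reached."""
    w_lo, w_hi, z = initialize_weights(X, y, n_c)
    for _ in range(max_epochs):
        n_miss = 0
        for x, p in zip(X, y):
            k_star = classify(x, z, w_lo, w_hi)
            if k_star != p:                       # misclassified: update
                update(x, p, k_star, z, w_lo, w_hi, eta)
                n_miss += 1
        if n_miss / len(X) <= target_rate:        # E_tau reached target
            break
    return w_lo, w_hi, z
```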
