penalty
Class PenaltyMinimizer

java.lang.Object
  extended bypenalty.PenaltyMinimizer
All Implemented Interfaces:
SystemMatcher

public class PenaltyMinimizer
extends java.lang.Object
implements SystemMatcher

This algorithm tries to compute the correspondence between two concept systems that minimizes a certain penalty function. The penalty function is constructed so that there is a penalty imposed whenever there is a correspondence between node a in the first concept system and node x in the second concept system, and a correspondence between node b and node y, while the relationship between a and b in the first system is not identical to the relationship between x and y in the second system.

The penalty function is defined as a function of a correspondence matrix C as follows:

Penalty(C) = MatchingPenalty(C) + kappa * NormalizationPenalty(C)
Here, MatchingPenalty(C) is designed as to minimize the degree of mismatch between the structures of the graphs (i.e., for two graphs that are isomorphic, the absolute minimum of MatchingPenalty(C)=0 will be reached on a correspondence matrix C where C(a,x)!=0 only on pairs (a,x)=(a,f(a)) for some isomorphism f). NormalizationPenalty(C) is designed to be zero on a matrix C for which the values in each row and each column sum to 1. (Or to sqrt(m/n) and sqrt(n/m), respectively, if the two systems have a different number of nodes). Namely,
MatchingPenalty(C) = 
  = K1 * sum_{a,b,x,y: linked(a,b) & !linked(x,y)}{ C(a,x)*C(b,y)} +
  + K1 * sum_{a,b,x,y: !linked(a,b) & linked(x,y)}{ C(a,x)*C(b,y)} +
  + K2 * sum_{a,x,y:  !linked(x,y)}{ C(a,x)*C(a,y)} +
  + K2 * sum_{a,b,x:  !linked(a,b)}{ C(a,x)*C(b,x)} +
  + K3 * sum_{a,x,y:  linked(x,y)}{ C(a,x)*C(a,y)} +
  + K3 * sum_{a,b,x:  linked(a,b)}{ C(a,x)*C(b,x)}.
Thus, K1 is the penalty for matching a pair of linked nodes to a pair of non-linked nodes, or vice versa; K2 is the penalty for matching a single node to two non-linked nodes, or vice versa; and K3 is the penalty for matching a single node to two linked nodes, or vice versa.
NormalizationPenalty(C) =
   = sum_{a}{ (sum_{x}{C(a,x)} - sqrt(m/n))^2 } +
   + sum_{x}{ (sum_{a}{C(a,x)} - sqrt(n/m))^2 }.

The approach used in this method, that is, minimizing MatchingPenalty(C)+kappa*NormalizationPenalty(C) on the set of all matrices with non-negative coeffciients, is, really, a cludge, and may only produce an approximate solution. To obtain an exact solution, one would want to minimize MatchingPenalty(C) over the set of all matrices for which NormalizationPenalty(C)=0. But the latter would be a somewhat more complex quadratic-programming problem.

This program uses a very traditional fastest descent method, and works as follows:

1. We take an initial point C (a random matrix with positive
coefficients);

2. We compute the gradient of the penalty functional g; we
know that C is increases in the direction of vector g, and decreases
in the direction  of -g;

3. We obtain the "restricted" ascent vector v. v is based on g,
but if the current point C alread has some zero coefficients we
may replace the corresponding component of v with 0 to ensure that
going along the direction of -v won't make that component of C negative.

4. We solve the one-dimension minimization problem, finding the
real number xi that minimzes Penalty( C - xi * v), while C'=C-xi*v
still stays in the non-negative domain.

5. Assign C := C' = C - xi*v, and continue to step 2.

This program termiates after a specified number of iterations, or when it is detected that the penalty function cannot be noticeably reduced any more. (That is, when Penalty(C)-Penalty(C') is very small, or when v is almost zero).

The general outline of the code is based on that of Absurdist class, as it stood in April 2003.

See Also:
Absurdist

Field Summary
(package private)  double accuracy
           
(package private)  double[][] correspond
           
(package private) static boolean debug
           
 double K1
          K1 and K3 are the mapping penalty parameters.
 double K2
          K1 and K3 are the mapping penalty parameters.
 double K3
          K1 and K3 are the mapping penalty parameters.
 double kappa
          Kappa is the normzlization penalty parameter.
(package private)  int[] mapNO
           
(package private)  int[] mapON
           
(package private)  double mapT
           
(package private)  int nLoop
          Maximum number of iterations.
(package private)  CSGI sysN
          Graph-style interface for systemO and systemN
(package private)  CSGI sysO
          Graph-style interface for systemO and systemN
(package private)  ConceptSystem systemN
           
(package private)  ConceptSystem systemO
           
 
Constructor Summary
PenaltyMinimizer()
           
 
Method Summary
 void adjustCoeff()
           
 void evalMap()
          Computes the accuracy of the mapping.
 double evalMap2()
          An alternative way to evaluate the accuracy of the mapping.
 void findMapping()
          Finds the strongest valid correspondence as successful mappings.
(package private)  CorrespondenceVector gradMatchingPenalty(CorrespondenceVector u)
           
(package private)  CorrespondenceVector gradNormPenalty(CorrespondenceVector u)
           
(package private)  CorrespondenceVector gradTotalPenalty(CorrespondenceVector u)
           
(package private)  CorrespondenceVector initRandom()
          Creates a ranomly-initialized vector.
 void iterateNet()
          Iterates, using the steepest descent method, to find a correspondence between two concept systems minimizing the penalty functional.
 void mapSystem(ConceptSystem systemO, ConceptSystem systemN)
          Maps orginal concept system to the one with added noise.
(package private)  double matchingPenalty(CorrespondenceVector u, CorrespondenceVector v)
           
(package private)  double normPenalty(CorrespondenceVector c, double zu)
          Computes the normalization penalty |B*c-z*b|^2
(package private)  double normPenalty(CorrespondenceVector u, double zu, CorrespondenceVector v, double zv)
          Computes the dot product of two vector combinations.
 void print(boolean toFile)
          Prints the original and noisy concept systems, the correspondence matrix, and the final mappings.
 void printCorrespond(java.io.PrintWriter out)
           
 void printMapping(java.io.PrintWriter out)
           
 void printMatrix(double[][] c, java.io.PrintWriter out)
           
 void printSystem(java.io.PrintWriter out)
           
(package private)  void restrictGrad(CorrespondenceVector g, CorrespondenceVector c)
           
(package private)  double restrictXi(double xi, CorrespondenceVector v, CorrespondenceVector c)
          Ensures that the vector c - xi*v stays in the positive octant
 void setSystem(ConceptSystem systemO, ConceptSystem systemN)
           
(package private)  void smooth(CorrespondenceVector c)
          Replaces near-zero coordinate values of the correpsondence matrix c with exact zeros.
(package private)  double totalPenalty(CorrespondenceVector u)
           
(package private)  double totalPenalty(CorrespondenceVector u, double zu, CorrespondenceVector v, double zv)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

debug

static final boolean debug
See Also:
Constant Field Values

K1

public double K1
K1 and K3 are the mapping penalty parameters. In general, the elements of the penalty matrix are as follows:
            C(a,a)  C(a,b)  C(a,c)
   C(x,x)   0       K3      K2
   C(x,y)   K3      0       K1
   C(x,z)   K2      K1      0
Here, a is a node in the first graph; b is a node in the first graph linked to a; c is a node in the first graph NOT linked to a. Similarly, x is a node in the second graph; y is a node in the second graph linked to x; z is a node in the second graph NOT linked to x.

Thus, K1 is a penalty for mapping a pair of linked nodes to a pair of distinct but not lined nodes; K2 is a penalty for mapping a single node to two distinct, not linked nodes; and K3 is a penalty for mapping a single node to two distinct linked nodes.

However, in order to be able to construct an efficient implementation, we always have K1=K2. Thus, K1=K2 is a penalty for a mapping two


K2

public double K2
K1 and K3 are the mapping penalty parameters. In general, the elements of the penalty matrix are as follows:
            C(a,a)  C(a,b)  C(a,c)
   C(x,x)   0       K3      K2
   C(x,y)   K3      0       K1
   C(x,z)   K2      K1      0
Here, a is a node in the first graph; b is a node in the first graph linked to a; c is a node in the first graph NOT linked to a. Similarly, x is a node in the second graph; y is a node in the second graph linked to x; z is a node in the second graph NOT linked to x.

Thus, K1 is a penalty for mapping a pair of linked nodes to a pair of distinct but not lined nodes; K2 is a penalty for mapping a single node to two distinct, not linked nodes; and K3 is a penalty for mapping a single node to two distinct linked nodes.

However, in order to be able to construct an efficient implementation, we always have K1=K2. Thus, K1=K2 is a penalty for a mapping two


K3

public double K3
K1 and K3 are the mapping penalty parameters. In general, the elements of the penalty matrix are as follows:
            C(a,a)  C(a,b)  C(a,c)
   C(x,x)   0       K3      K2
   C(x,y)   K3      0       K1
   C(x,z)   K2      K1      0
Here, a is a node in the first graph; b is a node in the first graph linked to a; c is a node in the first graph NOT linked to a. Similarly, x is a node in the second graph; y is a node in the second graph linked to x; z is a node in the second graph NOT linked to x.

Thus, K1 is a penalty for mapping a pair of linked nodes to a pair of distinct but not lined nodes; K2 is a penalty for mapping a single node to two distinct, not linked nodes; and K3 is a penalty for mapping a single node to two distinct linked nodes.

However, in order to be able to construct an efficient implementation, we always have K1=K2. Thus, K1=K2 is a penalty for a mapping two


kappa

public double kappa
Kappa is the normzlization penalty parameter. The larger it is, the closer the solution will be to the "normalized-matrix plane", where the values in each row and each column of the matrix sums to 1.


systemO

ConceptSystem systemO

systemN

ConceptSystem systemN

sysO

CSGI sysO
Graph-style interface for systemO and systemN


sysN

CSGI sysN
Graph-style interface for systemO and systemN


correspond

double[][] correspond

mapON

int[] mapON

mapNO

int[] mapNO

accuracy

double accuracy

nLoop

int nLoop
Maximum number of iterations. It would be wise to set this to, at least, systemO.n * systemN.n


mapT

double mapT
Constructor Detail

PenaltyMinimizer

public PenaltyMinimizer()
Method Detail

setSystem

public void setSystem(ConceptSystem systemO,
                      ConceptSystem systemN)

adjustCoeff

public void adjustCoeff()

normPenalty

double normPenalty(CorrespondenceVector c,
                   double zu)
Computes the normalization penalty |B*c-z*b|^2


normPenalty

double normPenalty(CorrespondenceVector u,
                   double zu,
                   CorrespondenceVector v,
                   double zv)
Computes the dot product of two vector combinations.

Returns:
        (B*u-z_u * b, B*v-z_v * b) =
        sum_a { (sum_x{u(a,x)} - sqrt(m/n))*(sum_x{v(a,x)} - sqrt(m/n)) }
        sum_x { (sum_a{u(a,x)} - sqrt(m/n))*(sum_a{u(a,x)} - sqrt(m/n)) }
        

matchingPenalty

double matchingPenalty(CorrespondenceVector u,
                       CorrespondenceVector v)

totalPenalty

double totalPenalty(CorrespondenceVector u,
                    double zu,
                    CorrespondenceVector v,
                    double zv)

totalPenalty

double totalPenalty(CorrespondenceVector u)

gradMatchingPenalty

CorrespondenceVector gradMatchingPenalty(CorrespondenceVector u)

gradNormPenalty

CorrespondenceVector gradNormPenalty(CorrespondenceVector u)

gradTotalPenalty

CorrespondenceVector gradTotalPenalty(CorrespondenceVector u)

initRandom

CorrespondenceVector initRandom()
Creates a ranomly-initialized vector. It is in the positive octant, but not 1-normalized


restrictGrad

void restrictGrad(CorrespondenceVector g,
                  CorrespondenceVector c)

restrictXi

double restrictXi(double xi,
                  CorrespondenceVector v,
                  CorrespondenceVector c)
            throws java.lang.Exception
Ensures that the vector c - xi*v stays in the positive octant

Throws:
java.lang.Exception

smooth

void smooth(CorrespondenceVector c)
Replaces near-zero coordinate values of the correpsondence matrix c with exact zeros. This has two purposes:
  1. it ensures that the vector c stays in the non-negative octant;
  2. it allows proper component removal in vector v in restrictGrad.

This method expects that all components of c are already almost non-negative


iterateNet

public void iterateNet()
                throws java.lang.Exception
Iterates, using the steepest descent method, to find a correspondence between two concept systems minimizing the penalty functional.

In practice, this methods often converges to a local minimum, rather than the global one, thus providing a poor quality match.

Throws:
java.lang.Exception

findMapping

public void findMapping()
Finds the strongest valid correspondence as successful mappings.


evalMap

public void evalMap()
Computes the accuracy of the mapping.


evalMap2

public double evalMap2()
An alternative way to evaluate the accuracy of the mapping.

Returns:
The number that counts edges present in one graph in absent in its "image", and vice versa. This should be 0 is the mapping is a homomorphism, i.e. a perfect matching date: April-10-2003

mapSystem

public void mapSystem(ConceptSystem systemO,
                      ConceptSystem systemN)
               throws java.lang.Exception
Maps orginal concept system to the one with added noise.

Specified by:
mapSystem in interface SystemMatcher
Parameters:
systemO - The first concept system to match. In our experiments, this is the "original" one.
systemN - The second system to match. In our experiments, this system has been obtained from the first one by adding some "noise".
Throws:
java.lang.Exception

printSystem

public void printSystem(java.io.PrintWriter out)
                 throws java.lang.Exception
Throws:
java.lang.Exception

printCorrespond

public void printCorrespond(java.io.PrintWriter out)
                     throws java.lang.Exception
Throws:
java.lang.Exception

printMatrix

public void printMatrix(double[][] c,
                        java.io.PrintWriter out)
                 throws java.lang.Exception
Throws:
java.lang.Exception

printMapping

public void printMapping(java.io.PrintWriter out)
                  throws java.lang.Exception
Throws:
java.lang.Exception

print

public void print(boolean toFile)
           throws java.lang.Exception
Prints the original and noisy concept systems, the correspondence matrix, and the final mappings.

Specified by:
print in interface SystemMatcher
Throws:
java.lang.Exception