Vector Symbolic Architectures

Friday, October 14, 2005

Random Indexing

Ross Gayler pointed out two interesting statements in An Introduction to Random Indexing (Magnus Sahlgren):

“Hecht-Nielsen (1994), … demonstrated that there are many more nearly orthogonal than truly orthogonal directions in a high-dimensional space.” (Hecht-Nielsen, R.; Context vectors; general purpose approximate meaning representations self-organized from raw data. In Zurada, J. M.; R. J. Marks II; C. J. Robinson; Computational intelligence: imitating life. IEEE Press, 1994.)
“the Johnson-Lindenstrauss lemma (Johnson & Lindenstrauss, 1984) – that states that if we project points in a vector space into a randomly selected subspace of sufficiently high dimensionality, the distances between the points are approximately preserved.” (Johnson, W. B. & J. Lindenstrauss; Extensions to Lipshitz mapping into Hilbert space. Contemporary mathematics. 26, 1984.)

Both these are very powerful properties of vector space representations, and these are particularly nice statemnts of these properties.

1 Comments:

The "curse of dimensionality" refers to the difficulty of making infernces from high-dimensional data (because there are so many possible combinatorial features). So, my take on it would be that high dimensionality is a curse when we have few data points, many dimensions, and little prior knowledge about which combinatorial features might be useful.

Isn't it quite easy to run the J-L lemma backwards, just by concatenating projections onto lower dimensional spaces? E.g., project a 10-d space onto 4 5-d spaces, and then concatenate to get a 20-d space. I'd also be suprised if it wasn't quite easy to prove the result more directly.

I think I mention this in my book -- the idea of projecting low-dimensional information into high-dimensional spaces in order to be able to work with HRRs.

By Tony Plate, at 8:19 AM, December 17, 2005

Post a Comment

<< Home

This blog is about using distributed representations in Cognitive Science and Artificial Intelligence. Vector symbolic architectures are ways of representing complex concepts in distributed representations (fixed width vectors). Holographic reduced represenation (HRR) is one example of a Vector Symbolic architecture. Don't expect to see this blog updated frequently -- it's more a collection of notes that a running commentary (I have a family and a full-time job in a different field :-).

Name: Tony Plate

View my complete profile