\(\newcommand{\abs}[1]{\left\lvert#1\right\rvert}\) \(\newcommand{\norm}[1]{\left\lVert#1\right\rVert}\) \(\newcommand{\inner}[1]{\left\langle#1\right\rangle}\) \(\DeclareMathOperator*{\argmin}{arg\,min}\) \(\DeclareMathOperator*{\argmax}{arg\,max}\) \(\DeclareMathOperator*{\E}{\mathbb{E}}\) \(\DeclareMathOperator*{\V}{\mathbb{V}}\)

This post introduces random projections as a tool for dimensionality reduction in high-dimensional spaces. The emphasis is geometric: how random linear maps approximately preserve distances, inner products, and subspace structure. Probabilistic limit theory is used only implicitly; detailed asymptotic tools are treated separately.


Motivation: Geometry in High Dimensions

In high dimensions, many datasets exhibit redundancy: although observations live in $\mathbb{R}^N$ with $N$ large, their intrinsic dimension is often much smaller. Random projections exploit this by embedding high-dimensional data into a lower-dimensional space while approximately preserving Euclidean geometry.

The central question is:

How small can the target dimension be while preserving pairwise distances?


Random Linear Maps

Let $R \in \mathbb{R}^{n\times N}$ be a random matrix and define the linear map \(f(x) = \frac{1}{\sqrt{n}} R x.\)

Common choices for $R$ include:

  • i.i.d. Gaussian entries $R_{ij}\sim \mathcal N(0,1)$,
  • i.i.d. Rademacher entries $R_{ij}\in\{\pm1\}$ with equal probability,
  • sparse or structured random matrices.

The normalization ensures that $\mathbb{E}\,\|f(x)\|_2^2=\|x\|_2^2$.
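
As a quick sanity check, here is a minimal NumPy sketch of the Gaussian and Rademacher constructions (the function names and the seed are illustrative, not part of the construction above):

    import numpy as np

    def gaussian_projection(N, n, rng):
        """f(x) = R x / sqrt(n) with i.i.d. N(0, 1) entries."""
        R = rng.standard_normal((n, N))
        return lambda x: R @ x / np.sqrt(n)

    def rademacher_projection(N, n, rng):
        """f(x) = R x / sqrt(n) with i.i.d. +/-1 entries."""
        R = rng.choice([-1.0, 1.0], size=(n, N))
        return lambda x: R @ x / np.sqrt(n)

    rng = np.random.default_rng(0)
    N, n = 10_000, 200
    x = rng.standard_normal(N)
    f = gaussian_projection(N, n, rng)

    # The normalization gives E||f(x)||^2 = ||x||^2, so the two squared norms
    # should agree up to random fluctuation of order 1/sqrt(n).
    print(np.linalg.norm(x)**2, np.linalg.norm(f(x))**2)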


Johnson–Lindenstrauss Lemma

The foundational result in random projection theory is the Johnson–Lindenstrauss (JL) lemma.

Let $Q=\{x_1,\ldots,x_q\}\subset\mathbb{R}^N$ and let $\epsilon\in(0,1)$. There exists a linear map $f:\mathbb{R}^N\to\mathbb{R}^n$ with \(n = O\!\left(\frac{\log q}{\epsilon^2}\right)\) such that, for all $u,v\in Q$, \((1-\epsilon)\|u-v\|_2^2 \le \|f(u)-f(v)\|_2^2 \le (1+\epsilon)\|u-v\|_2^2.\)

Moreover, a random projection with i.i.d. Gaussian or Rademacher entries satisfies this property with high probability.
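
The lemma is easy to probe numerically. The sketch below assumes a Gaussian projection and an illustrative constant of 12 in the dimension bound (the lemma itself only asserts $n = O(\log q/\epsilon^2)$); it reports the worst relative distortion of squared pairwise distances over a random point set.

    import numpy as np
    from itertools import combinations

    rng = np.random.default_rng(1)
    N, q, eps = 5_000, 100, 0.2
    n = int(np.ceil(12 * np.log(q) / eps**2))   # illustrative constant, not from the lemma

    X = rng.standard_normal((q, N))             # the point set Q, one point per row
    R = rng.standard_normal((n, N))
    Y = X @ R.T / np.sqrt(n)                    # f applied to every point

    # worst relative distortion of squared pairwise distances
    distortions = [
        abs(np.sum((Y[i] - Y[j])**2) / np.sum((X[i] - X[j])**2) - 1.0)
        for i, j in combinations(range(q), 2)
    ]
    print(f"target dimension n = {n}, max distortion = {max(distortions):.3f}")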


Interpretation

  • The embedding dimension $n$ depends logarithmically on the number of points, not on the ambient dimension $N$.
  • Pairwise Euclidean distances are approximately preserved.
  • The result is non-adaptive: the same random map works for all pairs in $Q$.

This phenomenon is a manifestation of concentration of measure in high-dimensional probability.


Distance and Inner Product Preservation

Distance preservation, together with norm preservation, implies approximate inner product preservation via the polarization identity \(\langle u,v\rangle = \frac{1}{2}\left(\|u\|_2^2 + \|v\|_2^2 - \|u-v\|_2^2\right);\) a numerical sketch follows the list below.

Thus, JL embeddings preserve:

  • angles between vectors,
  • cosine similarity,
  • norms of individual vectors.
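
A small numerical illustration of the inner-product claim, assuming a Gaussian projection and an arbitrarily chosen correlated pair $(u, v)$:

    import numpy as np

    rng = np.random.default_rng(2)
    N, n = 5_000, 1_000
    R = rng.standard_normal((n, N))
    f = lambda x: R @ x / np.sqrt(n)

    u = rng.standard_normal(N)
    v = 0.8 * u + 0.6 * rng.standard_normal(N)     # correlated pair, cos(u, v) ~ 0.8
    cos = lambda a, b: a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    print(u @ v, f(u) @ f(v))            # agree up to an error of order ||u|| ||v|| / sqrt(n)
    print(cos(u, v), cos(f(u), f(v)))    # hence angles and cosine similarity are nearly preserved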

Subspace Embeddings

The JL lemma extends beyond finite point sets.

Let $S\subset\mathbb{R}^N$ be a fixed $k$-dimensional subspace. A random projection $f:\mathbb{R}^N\to\mathbb{R}^n$ with \(n = O\!\left(\frac{k}{\epsilon^2}\right)\) satisfies \((1-\epsilon)\|x\|_2^2 \le \|f(x)\|_2^2 \le (1+\epsilon)\|x\|_2^2 \quad \forall x\in S\) with high probability.

Such maps are called oblivious subspace embeddings.
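
A short sketch of the subspace statement (constants again illustrative): for $x = Uc$ with $U$ an orthonormal basis of $S$, the ratio $\|f(x)\|_2^2/\|x\|_2^2$ is bracketed by the extreme squared singular values of $\frac{1}{\sqrt{n}}RU$, so it suffices to inspect those.

    import numpy as np

    rng = np.random.default_rng(3)
    N, k, eps = 5_000, 20, 0.25
    n = int(np.ceil(8 * k / eps**2))                  # illustrative constant, not from the bound

    U, _ = np.linalg.qr(rng.standard_normal((N, k)))  # orthonormal basis of a k-dimensional subspace S
    R = rng.standard_normal((n, N))

    # For x = U c in S, ||f(x)||^2 / ||x||^2 lies between the extreme squared
    # singular values of (1/sqrt(n)) R U, which therefore give the worst-case distortion.
    s = np.linalg.svd(R @ U / np.sqrt(n), compute_uv=False)
    print(f"n = {n}: ||f(x)||^2 / ||x||^2 in [{s.min()**2:.3f}, {s.max()**2:.3f}] for all x in S")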


Summary

  • Random projections are geometry-preserving, not data-adaptive.
  • They complement, rather than replace, structured decompositions like SVD.
  • JL-type results explain why high-dimensional data often behaves “as if” it were low-dimensional.