Title: New Methods for Spectral Embedding
Abstract: Data from many applications can be represented by nonnegative matrices, including the adjacency matrices of undirected and directed graphs and the "term-document" matrices used in graph and text mining applications. Low-rank decompositions of these matrices provide vector representations, known as spectral embeddings, of the nodes and edges of a graph, of the terms and documents of a term-document matrix, and even of the nonzero entries of a general nonnegative matrix. These vector embeddings can be used as input to other algorithms. We propose a statistical framework that provides interpretable, scalable spectral embeddings for both graphs and term-document matrices. We also develop relationships between Skip-Gram, Global Vectors (GloVe), and Pointwise Mutual Information (PMI) spectral embeddings, three ad hoc methods used in text mining applications.
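To make the idea of a spectral embedding concrete, the following is a minimal sketch (not the paper's proposed method) of embedding the terms and documents of a toy term-document matrix via truncated SVD; the matrix entries and the dimension `k` are illustrative assumptions.

```python
# Illustrative sketch: spectral embedding of a small term-document matrix
# via truncated SVD, using only NumPy. Not the method proposed in the paper.
import numpy as np

# Toy term-document count matrix (hypothetical data):
# rows = terms, columns = documents; A[i, j] counts term i in document j.
A = np.array([
    [3, 0, 1, 0],
    [0, 2, 0, 1],
    [1, 1, 2, 0],
    [0, 0, 1, 3],
], dtype=float)

k = 2  # embedding dimension (rank of the truncated decomposition)

# Truncated SVD: A ≈ U_k diag(s_k) V_k^T.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Split the singular values evenly between the two factors, giving one
# k-dimensional vector per term and one per document.
term_emb = U[:, :k] * np.sqrt(s[:k])
doc_emb = Vt[:k, :].T * np.sqrt(s[:k])

# term_emb @ doc_emb.T is the best rank-k approximation of A in the
# Frobenius norm (Eckart-Young), so nearby vectors reflect co-occurrence.
print(term_emb.shape, doc_emb.shape)  # (4, 2) (4, 2)
```

Analogous constructions apply to the adjacency matrix of a graph, where the factors yield vectors for the nodes rather than for terms and documents.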