Abstract
Tensor factorization methods have recently gained popularity. A key feature that renders tensors attractive is the ability to directly model multi-relational data. In this work, we propose ParaSketch, a massively parallel tensor factorization algorithm for handling large tensors. The idea is to compress the large tensor into multiple small tensors, decompose each small tensor in parallel, and combine the results to reconstruct the desired latent factors. Prior art in this direction entails potentially very high complexity in the (Gaussian) compression and final combining stages. By adopting sketching matrices for compression, the proposed method enjoys a dramatic reduction in compression complexity and features a much lighter combining step. Moreover, theoretical analysis shows that the compressed tensors inherit latent identifiability under mild conditions, establishing the correctness of the overall approach. Numerical experiments corroborate the theory and demonstrate the effectiveness of the proposed algorithm.
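To make the compress/decompose/combine pipeline concrete, below is a minimal Python sketch under stated assumptions, not the paper's implementation: the `countsketch` and `align_columns` helpers, all problem sizes, and the use of tensorly's `parafac` as the per-replica CPD solver are illustrative choices, and the replica alignment step uses ground truth purely for brevity (the paper resolves permutation and scaling without it, e.g. via shared structure across the sketches).

```python
# Illustrative sketch of a compress / decompose / combine pipeline
# (not the paper's code). Requires numpy and tensorly.
import numpy as np
from tensorly.tenalg import mode_dot
from tensorly.decomposition import parafac

rng = np.random.default_rng(0)

def countsketch(rows, cols, rng):
    """CountSketch matrix: one random +/-1 entry per column,
    so applying it costs O(nnz) rather than a dense matrix product."""
    S = np.zeros((rows, cols))
    S[rng.integers(0, rows, size=cols), np.arange(cols)] = \
        rng.choice([-1.0, 1.0], size=cols)
    return S

# Synthetic rank-R tensor X with latent factors A, B, C (I x J x K).
I, J, K, R = 60, 60, 60, 5
A, B, C = (rng.standard_normal((n, R)) for n in (I, J, K))
X = np.einsum('ir,jr,kr->ijk', A, B, C)

# Compress X into P small replicas: Y_p = X x1 S1 x2 S2 x3 S3.
P, L = 4, 20                      # number of replicas, sketch size (L >= R)
replicas = []
for _ in range(P):
    S1, S2, S3 = (countsketch(L, n, rng) for n in (I, J, K))
    Y = mode_dot(mode_dot(mode_dot(X, S1, 0), S2, 1), S3, 2)
    replicas.append((S1, Y))

# Decompose each small replica independently; this loop is
# embarrassingly parallel and could run on separate workers.
estimates = []
for S1, Y in replicas:
    _, factors = parafac(Y, rank=R)
    estimates.append((S1, factors[0]))   # estimate of S1 @ A (up to perm/scale)

def align_columns(Ahat, Aref):
    """Resolve per-replica column permutation and scaling by greedy
    matching against a reference. Here the reference is the ground-truth
    compressed factor, used only to keep this demo short."""
    perm = np.argmax(np.abs(Aref.T @ Ahat), axis=1)
    Ahat = Ahat[:, perm]
    scale = np.sum(Aref * Ahat, axis=0) / np.sum(Ahat * Ahat, axis=0)
    return Ahat * scale

# Combine: stack sketches and aligned estimates, then solve one least-squares
# problem for the full first-mode factor A (B and C are recovered analogously).
S_stack = np.vstack([S1 for S1, _ in estimates])
Ahat_stack = np.vstack([align_columns(Ah, S1 @ A) for S1, Ah in estimates])
A_rec = np.linalg.lstsq(S_stack, Ahat_stack, rcond=None)[0]
print('relative error in A:', np.linalg.norm(A_rec - A) / np.linalg.norm(A))
```

Because each CountSketch column carries a single nonzero, the compression stage touches every tensor entry only a constant number of times per replica, which mirrors the reduction in compression complexity (relative to dense Gaussian compression) that the abstract refers to.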
Original language | American English
---|---
Pages (from-to) | 365-378
Number of pages | 14
Journal | IEEE Transactions on Knowledge and Data Engineering
Volume | 34
Issue number | 1
DOIs |
State | Published - 1 Jan 2022
Bibliographical note
Publisher Copyright: © 1989-2012 IEEE.
NREL Publication Number
- NREL/JA-5D00-82847
Keywords
- block term decomposition
- identifiability
- PARAFAC
- sketching
- tensor factorization