Highlights

What are the main findings?
A mini-cluster-based optimization scheme is proposed to preserve the non-local structure of hyperspectral image (HSI) data.
A one-stage, end-to-end deep clustering network is designed to learn subspace bases under the joint guidance of local and non-local structures.

What are the implications of the main findings?
The mini-cluster optimization scheme adaptively models non-local similarity with higher efficiency than manifold-based methods relying on fixed neighbor settings.
The end-to-end framework enables local and non-local structures to jointly supervise and optimize the entire clustering process, overcoming the limitations of previous two-stage deep subspace methods.

Abstract

Subspace clustering has become widely adopted for the unsupervised analysis of hyperspectral images (HSIs). Recent model-aware deep subspace clustering methods often use a two-stage framework: a self-representation matrix is computed with O(n²) complexity, and spectral clustering is then applied to it. However, these methods are computationally intensive and generally incorporate only local or only non-local structural constraints, and those constraints fall short of effectively supervising the entire clustering process.
We propose a scalable, context-preserving deep clustering method based on basis representation, which jointly captures local and non-local structures for efficient HSI clustering. To preserve local structure (i.e., spatial continuity within subspaces), we introduce a spatial smoothness constraint that aligns clustering predictions with their spatially filtered versions. To preserve non-local structure (i.e., spectral continuity), we employ a mini-cluster-based scheme that refines predictions at the group level, encouraging spectrally similar pixels to belong to the same subspace. These two constraints are jointly optimized so that they reinforce each other. Our model is designed as a one-stage approach, in which the structural constraints are applied to the entire clustering process. The time and space complexities of our method are both O(n), making it applicable to large-scale HSI data. Experiments on real-world datasets show that our method outperforms state-of-the-art techniques.
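The two structural constraints described above can be illustrated with a minimal NumPy sketch. This is not the network or loss from the paper: the box filter, the squared-error form of both penalties, and the names `box_filter`, `spatial_smoothness_loss`, and `mini_cluster_loss` are all assumptions made for illustration. The mini-cluster labels in `groups` are assumed to be precomputed, for example by over-segmenting the pixel spectra. Each penalty touches every pixel a fixed number of times, so the cost grows linearly with the number of pixels.

```python
import numpy as np


def box_filter(P, size=3):
    """Mean-filter each class-probability map of P (H, W, K), using edge padding."""
    pad = size // 2
    padded = np.pad(P, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    H, W, _ = P.shape
    out = np.zeros_like(P, dtype=float)
    # Accumulate the size*size shifted copies of the padded map.
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + H, dx:dx + W, :]
    return out / (size * size)


def spatial_smoothness_loss(P, size=3):
    """Local term (assumed form): gap between predictions and their filtered version."""
    return float(np.mean((P - box_filter(P, size)) ** 2))


def mini_cluster_loss(P_flat, groups):
    """Non-local term (assumed form): gap between each pixel's prediction
    and the mean prediction of its mini-cluster."""
    K = P_flat.shape[1]
    n_groups = int(groups.max()) + 1
    sums = np.zeros((n_groups, K))
    np.add.at(sums, groups, P_flat)  # per-group sum of predictions
    counts = np.bincount(groups, minlength=n_groups)[:, None]
    means = sums / np.maximum(counts, 1)  # guard against empty groups
    return float(np.mean((P_flat - means[groups]) ** 2))


# Toy data: random soft assignments on an 8x8 image with K=3 subspaces.
rng = np.random.default_rng(0)
H, W, K = 8, 8, 3
logits = rng.normal(size=(H, W, K))
P = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
groups = rng.integers(0, 5, size=H * W)  # hypothetical mini-cluster labels

total = spatial_smoothness_loss(P) + mini_cluster_loss(P.reshape(-1, K), groups)
print(total)
```

In a one-stage setting, penalties of this kind would be added to the clustering objective and minimized together with it, so that both constraints act on the predictions throughout training rather than on a separately computed affinity matrix.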