Recent progress in spectral classification is largely attributed to the use of convolutional neural networks (CNNs). While a variety of successful architectures have been proposed, they all extract spectral features from portions of adjacent spectral bands. In this paper, we take a different approach and develop a deep spectral feature fusion method that extracts both local and interlocal spectral features, thereby also capturing the correlations among non-adjacent bands. To our knowledge, this is the first reported deep spectral feature fusion method. Our model is a two-stream architecture in which an intergroup and a groupwise spectral classifier operate in parallel. Interlocal spectral correlation features are extracted by reshaping the input spectral vectors into so-called non-adjacent spectral matrices. We introduce the concept of groupwise band convolution to enable efficient extraction of discriminative local features, with multiple kernels adapting to the local spectral content. Another contribution of this work is a novel dual-channel attention mechanism that identifies the most informative spectral features. The model is trained end-to-end with a joint loss. Experimental results on real data sets demonstrate excellent performance compared with the current state of the art.
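
As a rough illustration of the two ideas named above, the following NumPy sketch reshapes a spectral vector so that bands far apart in the spectrum become neighbors along the columns of a matrix, and applies a separate kernel to each group of bands. The row length, the equal-size grouping, the kernel values, and the function names `to_band_matrix` and `groupwise_conv` are assumptions made for illustration only; the actual construction of the non-adjacent spectral matrices and of the groupwise band convolution may differ.

```python
import numpy as np


def to_band_matrix(spectrum, row_len):
    """Reshape a 1-D spectrum of B bands into a (B // row_len, row_len) matrix.

    Each row holds adjacent bands; each column holds bands that are
    row_len apart, so a 2-D kernel sliding over the matrix also sees
    non-adjacent bands. (Hypothetical layout, assumed for illustration.)
    """
    spectrum = np.asarray(spectrum)
    if spectrum.ndim != 1 or spectrum.size % row_len != 0:
        raise ValueError("spectrum must be 1-D with length divisible by row_len")
    return spectrum.reshape(-1, row_len)


def groupwise_conv(spectrum, kernels):
    """Split the bands into len(kernels) equal groups and convolve each
    group with its own 1-D kernel (assumed form of a groupwise operation)."""
    spectrum = np.asarray(spectrum, dtype=float)
    groups = np.split(spectrum, len(kernels))  # raises if groups are unequal
    return [np.convolve(g, k, mode="valid") for g, k in zip(groups, kernels)]


# Toy spectrum with 12 bands, numbered 0..11.
bands = np.arange(12)

# 3 x 4 matrix: column 0 contains bands 0, 4 and 8.
matrix = to_band_matrix(bands, row_len=4)

# Three groups of four bands, each with its own (here identical) kernel.
features = groupwise_conv(bands, [np.array([1.0, -1.0])] * 3)
```

In a trained model the kernels would be learned and would differ between groups; fixed difference kernels are used here only to keep the sketch self-contained.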