  author    = {Hang Su and
               Varun Jampani and
               Deqing Sun and
               Orazio Gallo and
               Erik Learned-Miller and
               Jan Kautz},
  title     = {Pixel-Adaptive Convolutional Neural Networks},
  booktitle = {Proceedings of the IEEE Conference on Computer 
               Vision and Pattern Recognition (CVPR)},
  year      = {2019}




Convolutions are the fundamental building block of CNNs. The fact that their weights are spatially shared is one of the main reasons for their widespread use, but it is also a major limitation, as it makes convolutions content-agnostic.

We propose a pixel-adaptive convolution (PAC) operation, a simple yet effective modification of standard convolutions in which the filter weights are multiplied by a spatially varying kernel that depends on learnable, local pixel features. PAC generalizes several popular filtering techniques and can therefore be used in a wide range of applications. Specifically, we demonstrate state-of-the-art performance when PAC is used for deep joint image upsampling. PAC also offers an effective alternative to the fully-connected CRF (Full-CRF), called PAC-CRF, which performs competitively while being considerably faster. We also demonstrate that PAC can be used as a drop-in replacement for convolution layers in pre-trained networks, resulting in consistent performance improvements.

Pixel-Adaptive Convolution

General formula of PAC

$$\hat{\mathbf{I}}_p = \frac{1}{Z}\sum_{q \in \mathrm{nbr}(p)} \mathbf{I}_q M_q W(q-p) K(\mathbf{f}_p, \mathbf{f}_q) + \mathbf{b}$$

Normalization options

A few special cases
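The general PAC formula can be sketched in a few lines of NumPy for a single-channel image. This is a minimal illustration, not the authors' implementation. It rests on several assumptions that the page does not state: $K$ is taken to be a Gaussian of the feature difference, $K(\mathbf{f}_p, \mathbf{f}_q) = \exp(-\tfrac{1}{2}\lVert \mathbf{f}_p - \mathbf{f}_q \rVert^2)$; $M$ is treated as an optional per-pixel mask; and $Z$ is either 1 or the sum of $M_q K(\mathbf{f}_p, \mathbf{f}_q)$ over the neighborhood. The function name `pac_2d` and its parameters are hypothetical.

```python
import numpy as np


def pac_2d(image, features, weight, bias=0.0, mask=None, normalize=False):
    """Sketch of a single-channel pixel-adaptive convolution.

    image:    (H, W) input I
    features: (H, W, C) per-pixel features f
    weight:   (k, k) spatially shared filter W, k odd
    mask:     optional (H, W) mask M (assumed: 1 = valid pixel)
    normalize: if True, Z = sum_q M_q * K(f_p, f_q); otherwise Z = 1 (assumed)
    """
    height, width = image.shape
    k = weight.shape[0]
    r = k // 2
    if mask is None:
        mask = np.ones((height, width))

    # Zero-pad so every pixel has a full k x k neighborhood.
    # Padded pixels get mask 0 and therefore never contribute.
    img_p = np.pad(image.astype(float), r)
    msk_p = np.pad(mask.astype(float), r)
    feat_p = np.pad(features.astype(float), ((r, r), (r, r), (0, 0)))

    out = np.zeros((height, width))
    norm = np.zeros((height, width))
    for dy in range(k):
        for dx in range(k):
            # Neighbor q = p + (dy - r, dx - r) for all pixels p at once.
            i_q = img_p[dy:dy + height, dx:dx + width]
            m_q = msk_p[dy:dy + height, dx:dx + width]
            f_q = feat_p[dy:dy + height, dx:dx + width]
            # Assumed Gaussian adapting kernel K(f_p, f_q).
            kern = np.exp(-0.5 * np.sum((features - f_q) ** 2, axis=-1))
            out += i_q * m_q * weight[dy, dx] * kern
            norm += m_q * kern
    if normalize:
        out = out / np.maximum(norm, 1e-12)
    return out + bias
```

One special case is easy to check with this sketch: when the features are constant across the image, $K = 1$ everywhere, and with $Z = 1$ and no mask the operation reduces to a standard zero-padded convolution (in the cross-correlation convention used by CNN layers).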


Hang Su and Erik Learned-Miller acknowledge support from AFRL and DARPA (agreement #FA8750-18-2-0126) and the MassTech Collaborative grant for funding the UMass GPU cluster.