Spatial PixelCNN: Generating Images from Patches
This computer vision paper proposes Spatial PixelCNN, a conditional autoregressive model that generates images from small patches. By conditioning on a grid of pixel coordinates and on global features extracted from a Variational Autoencoder (VAE), the authors are able to train on image patches and still reproduce the full-sized image. They show that the technique not only generates high-quality samples at the same resolution as the underlying dataset, but can also up-scale images to arbitrary resolutions (tested at up to 50×) on the MNIST dataset. Compared to a PixelCNN++ baseline, Spatial PixelCNN achieves similar quantitative and qualitative performance on MNIST.
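To make the coordinate conditioning concrete, here is a minimal sketch of how a coordinate grid for one patch could be built. The function name `patch_coordinates` and the choice to scale coordinates to [0, 1] are assumptions made for illustration; the summary above does not say how the paper encodes coordinates.

```python
def patch_coordinates(top, left, patch_size, image_size):
    """Return a patch_size x patch_size grid of (y, x) pairs scaled to [0, 1].

    Scaling by (image_size - 1) is an assumed convention, not the paper's.
    """
    scale = float(image_size - 1)
    return [
        [((top + i) / scale, (left + j) / scale) for j in range(patch_size)]
        for i in range(patch_size)
    ]


# An 8x8 patch whose top-left corner sits at row 20, column 20 of a 28x28 image.
coords = patch_coordinates(top=20, left=20, patch_size=8, image_size=28)
print(coords[0][0])    # coordinates of the patch's top-left pixel
print(coords[-1][-1])  # coordinates of the patch's bottom-right pixel
```

Because each patch carries its absolute position, a model trained only on patches can in principle be queried at any position, including on a finer grid than the training images, which is the intuition behind the up-scaling result.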
Adversarial Patch
A group of Google researchers presents a method to create universal, robust, targeted adversarial image patches in the real world. The patches are universal because they can be used to attack any scene, robust because they work under a wide variety of transformations, and targeted because they can cause a classifier to output any target class. These adversarial patches can be printed, added to any scene, photographed, and presented to image classifiers; even when the patches are small, they cause the classifiers to ignore the other items in the scene and report a chosen target class.
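As a rough illustration of what "added to any scene" means, the sketch below pastes a patch at a random location in an image. Everything here is hypothetical: `apply_patch` is an illustrative helper, the 2x2 patch of ones stands in for a trained patch, and rotation, scaling, and the optimization that actually produces the patch are left out.

```python
import random


def apply_patch(image, patch, rng):
    """Paste `patch` onto a copy of `image` at a random location.

    Only random translation is modeled here; other transformations are omitted.
    """
    h, w = len(image), len(image[0])
    ph, pw = len(patch), len(patch[0])
    top = rng.randrange(h - ph + 1)
    left = rng.randrange(w - pw + 1)
    out = [row[:] for row in image]  # copy so the original scene is untouched
    for i in range(ph):
        for j in range(pw):
            out[top + i][left + j] = patch[i][j]
    return out, (top, left)


rng = random.Random(0)
image = [[0.0] * 6 for _ in range(6)]  # blank 6x6 grayscale "scene"
patch = [[1.0, 1.0], [1.0, 1.0]]       # 2x2 stand-in for a trained patch
patched, (top, left) = apply_patch(image, patch, rng)
```

Sampling the location (and, in a fuller version, the scale and rotation) at every training step is one way a single patch could be pushed to work across many scenes and viewing conditions.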
Visualizing the Loss Landscape of Neural Nets
Neural network training relies on our ability to find “good” minimizers of highly non-convex loss functions. It is well known that certain network architecture designs (e.g., skip connections) produce loss functions that are easier to train, and that well-chosen training parameters (batch size, learning rate, optimizer) produce minimizers that generalize better. However, the reasons for these differences, and their effect on the underlying loss landscape, are not well understood. This paper explores the structure of neural loss functions, and the effect of loss landscapes on generalization, using a range of visualization methods.
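One simple way to look at a loss landscape is to evaluate the loss along a random direction through a minimizer. The sketch below does this for a toy two-parameter loss; the `loss` function, the starting point, and the range of step sizes are all illustrative assumptions, and this is not necessarily the exact procedure the paper uses.

```python
import random


def loss(theta):
    """Toy non-convex loss, a hypothetical stand-in for a network's training loss."""
    x, y = theta
    return (x * x - 1.0) ** 2 + 0.5 * y * y


def loss_slice(theta, direction, alphas):
    """Evaluate loss(theta + alpha * direction) for each alpha."""
    return [
        loss([t + a * d for t, d in zip(theta, direction)])
        for a in alphas
    ]


rng = random.Random(0)
theta_star = [1.0, 0.0]                              # a minimizer of the toy loss
direction = [rng.gauss(0.0, 1.0) for _ in theta_star]  # random Gaussian direction
alphas = [k / 10.0 for k in range(-10, 11)]          # step sizes from -1.0 to 1.0
values = loss_slice(theta_star, direction, alphas)
```

Plotting `values` against `alphas` gives a one-dimensional slice; repeating the idea with two directions gives a surface plot of the same kind.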
Ray RLLib: A Composable and Scalable Reinforcement Learning Library
Reinforcement learning (RL) algorithms involve the deep nesting of distinct components, where each component typically exhibits opportunities for distributed computation. Current RL libraries offer parallelism at the level of the entire program, coupling all the components together and making existing implementations difficult to extend, combine, and reuse. This paper argues for building composable RL components by encapsulating parallelism and resource requirements within individual components, which can be achieved by building on top of a flexible task-based programming model. The authors demonstrate this principle by building Ray RLLib on top of Ray and show how to implement a wide range of state-of-the-art algorithms by composing and reusing a handful of standard components. Ray RLLib is available as part of Ray on GitHub.
Gradients explode – Deep Networks are shallow – ResNet explained
Deep Extreme Cut: From Extreme Points to Object Segmentation
This paper explores the use of extreme points in an object (left-most, right-most, top, bottom pixels) as input to obtain precise object segmentation for images and videos. The authors do so by adding an extra channel to the input image of a convolutional neural network (CNN); this channel contains a Gaussian centered on each of the extreme points. The CNN learns to transform this information into a segmentation of an object that matches those extreme points. The paper demonstrates the usefulness of this approach for guided segmentation (grabcut-style), interactive segmentation, video object segmentation, and dense segmentation annotation.
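The extra input channel can be sketched in a few lines. The helper name `extreme_point_heatmap`, the value of `sigma`, the example points, and the choice to combine overlapping Gaussians with a maximum are assumptions made for illustration rather than details taken from the paper.

```python
import math


def extreme_point_heatmap(height, width, points, sigma=2.0):
    """Build one channel holding a 2D Gaussian centered on each (row, col) point."""
    heatmap = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            for pr, pc in points:
                d2 = (r - pr) ** 2 + (c - pc) ** 2
                value = math.exp(-d2 / (2.0 * sigma ** 2))
                # Keep the strongest response where Gaussians overlap (assumed).
                heatmap[r][c] = max(heatmap[r][c], value)
    return heatmap


# Hypothetical extreme points (row, col): left-most, right-most, top, bottom.
points = [(8, 2), (7, 13), (1, 8), (14, 6)]
heatmap = extreme_point_heatmap(16, 16, points)
```

Stacking `heatmap` with the RGB channels would give a four-channel input of the kind the paragraph describes.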
Bayesian GAN
Generative adversarial networks (GANs) can implicitly learn rich distributions over images, audio, and other data that are hard to model with an explicit likelihood. This paper presents a practical Bayesian formulation for unsupervised and semi-supervised learning with GANs. Within this framework, the authors use stochastic gradient Hamiltonian Monte Carlo to marginalize the weights of the generator and discriminator networks. The resulting approach is straightforward and obtains good performance without any standard interventions such as feature matching or mini-batch discrimination. By exploring an expressive posterior over the parameters of the generator, the Bayesian GAN avoids mode-collapse, produces interpretable and diverse candidate samples, and provides state-of-the-art quantitative results for semi-supervised learning on benchmarks including SVHN, CelebA, and CIFAR-10, outperforming DCGAN, Wasserstein GANs, and DCGAN ensembles.
Deep Unsupervised Clustering Using Mixture of Autoencoders
Unsupervised clustering is one of the most fundamental challenges in machine learning. A popular hypothesis is that data are generated from a union of low-dimensional nonlinear manifolds; thus an approach to clustering is identifying and separating these manifolds. This paper presents a novel approach to solve this problem by using a mixture of autoencoders. The model consists of two parts: 1) a collection of autoencoders where each autoencoder learns the underlying manifold of a group of similar objects, and 2) a mixture assignment neural network, which takes the concatenated latent vectors from the autoencoders as input and infers the distribution over clusters. By jointly optimizing the two parts, the authors simultaneously assign data to clusters and learn the underlying manifolds of each cluster.
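The second part of the model, the mixture assignment step, can be sketched as follows. A single linear layer followed by a softmax stands in for the mixture assignment network here; that simplification, the function names, and the hand-picked weights and latent vectors are all assumptions for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def assign_clusters(latents, weights, biases):
    """Map concatenated latent vectors to a distribution over clusters."""
    z = [v for latent in latents for v in latent]  # concatenate all latents
    scores = [
        sum(w * v for w, v in zip(row, z)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)


latents = [[0.5, -1.0], [2.0, 0.1]]  # two autoencoders, 2-D latent vector each
weights = [[1.0, 0.0, 0.0, 0.0],     # hypothetical weights, one row per cluster
           [0.0, 0.0, 1.0, 0.0]]
biases = [0.0, 0.0]
probs = assign_clusters(latents, weights, biases)
```

In the joint training the paragraph describes, these cluster probabilities would be optimized together with the autoencoders' reconstruction objectives.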
Non-convex Optimization for Machine Learning
Improving Generalization Performance by Switching from Adam to SGD