Unsupervised Learning of Sparse Representations

Marc'Aurelio Ranzato (NYU), Koray Kavukcuoglu (NYU), Y-Lan Boureau (NYU), Yann LeCun (NYU)

Unsupervised learning algorithms aim to discover the structure hidden in the data and to learn representations that are better suited as input to a supervised machine than the raw input. In particular, sparse representations, i.e. representations with only a few non-zero components, are desirable because they can be learned very efficiently and can be easier to discriminate. In our work, we have proposed several unsupervised algorithms for learning sparse representations, with applications to handwriting recognition, generic visual object recognition, image denoising, and image compression.
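
As a rough illustration of what a sparse representation looks like, the sketch below encodes a small input vector over a fixed dictionary with ISTA (iterative shrinkage-thresholding), a standard textbook method for L1-penalized coding. It is not one of the algorithms proposed in this work; the dictionary `D`, the penalty `lam`, the step size, and all function names are assumptions chosen for the example.

```python
import math

def soft_threshold(v, t):
    """Shrink every entry of v toward zero by t (proximal step for the L1 penalty)."""
    return [max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0) for x in v]

def ista(D, x, lam=0.1, step=0.4, n_iter=200):
    """Approximately minimize 0.5 * ||x - D z||^2 + lam * ||z||_1 over the code z.

    D is a list of rows (one row per input dimension, one column per atom).
    The step size must not exceed 1 / (largest eigenvalue of D^T D).
    """
    n, k = len(D), len(D[0])
    z = [0.0] * k
    for _ in range(n_iter):
        # Reconstruction residual r = D z - x.
        r = [sum(D[i][j] * z[j] for j in range(k)) - x[i] for i in range(n)]
        # Gradient of the quadratic term: g = D^T r.
        g = [sum(D[i][j] * r[i] for i in range(n)) for j in range(k)]
        # Gradient step followed by shrinkage, which sets small entries to exactly 0.
        z = soft_threshold([z[j] - step * g[j] for j in range(k)], step * lam)
    return z

# Hypothetical 2-D dictionary with three unit-norm atoms (columns):
# the two axes and the diagonal direction.
s = math.sqrt(0.5)
D = [[1.0, 0.0, s],
     [0.0, 1.0, s]]
x = [1.0, 1.0]            # input lying along the diagonal atom
z = ista(D, x)
n_active = sum(1 for v in z if v != 0.0)
```

In this toy setting the input is explained by the single diagonal atom, so only one component of `z` stays non-zero even though the dictionary has three atoms.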


Learning Applied to Ground Robots

Raia Hadsell (CMU), Pierre Sermanet (NYU), Marco Scoffier (NYU, Net-Scale Technologies), Chris Crudelle (Net-Scale Technologies), Matt Grimes (NYU), Koray Kavukcuoglu (NYU), Yann LeCun (NYU), Urs Muller (Net-Scale Technologies)

The purpose of the LAGR (Learning Applied to Ground Robots) project is to design vision and learning algorithms that allow a robot to navigate in complex outdoor environments solely from camera input. Our team is one of 10 participants in this DARPA-funded program. Each LAGR team received an identical copy of the LAGR robot, built by the National Robotics Engineering Consortium at Carnegie Mellon University. The government periodically runs competitions between the teams. The software from each team is loaded and run by the government team on their robot. The robot is given the GPS coordinates of a goal to which it must drive as fast as possible. The terrain is unknown in advance. The robot is run three times through the test course, and the software can use the knowledge acquired during the early runs to improve performance on the later runs.

A complete long-range vision and navigation system was developed that dramatically improves the robot's autonomous navigation performance by learning the long-range appearance of obstacles from short-range stereo examples. The robot can thus adapt to completely new environments. In addition to learning obstacles, the robot learns its control by remembering possible trajectories, which allows smooth and optimized local obstacle avoidance. Efficient long-range mapping and path planning allow real-time planning over a very large visual space, and inexpensive visual odometry corrects for pose errors. These modules are integrated in an architecture with multiple perception/planning layers, which makes navigation reactive and collision-free while remaining long-range.
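
The idea of learning long-range appearance from short-range stereo examples can be sketched as an online classifier: nearby image patches receive obstacle/traversable labels from stereo, and a classifier trained on those labels is then applied to distant patches where stereo is unreliable. The sketch below uses plain logistic regression on made-up two-dimensional appearance features; the feature values, labels, and function names are all hypothetical and do not describe the actual LAGR system.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def train_online(samples, n_features, lr=0.5, epochs=50):
    """Fit logistic regression by stochastic gradient descent.

    samples: list of (features, label) pairs, label 1 = obstacle, 0 = traversable.
    In this sketch the labels stand in for what short-range stereo would provide.
    """
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        for feats, label in samples:
            p = sigmoid(sum(wi * fi for wi, fi in zip(w, feats)) + b)
            err = p - label                      # gradient of the log loss w.r.t. the logit
            w = [wi - lr * err * fi for wi, fi in zip(w, feats)]
            b -= lr * err
    return w, b

def obstacle_probability(w, b, feats):
    """Predicted probability that a patch with these features is an obstacle."""
    return sigmoid(sum(wi * fi for wi, fi in zip(w, feats)) + b)

# Hypothetical near-range patches: two appearance features per patch,
# labeled by (simulated) stereo as obstacle (1) or traversable (0).
near_patches = [
    ([0.9, 0.1], 1),
    ([0.8, 0.2], 1),
    ([0.2, 0.9], 0),
    ([0.1, 0.8], 0),
]
w, b = train_online(near_patches, n_features=2)

# Far-range patches have no stereo label; only their appearance is available.
p_far_obstacle = obstacle_probability(w, b, [0.85, 0.15])
p_far_ground = obstacle_probability(w, b, [0.15, 0.85])
```

Because the training labels come from the robot's own sensors, the same loop can keep running as the terrain changes, which is what would let such a classifier adapt to a new environment.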


Factor Graphs for Relational Learning

Sumit Chopra (NYU), Trivikaraman Thampy (NYU), John Leahy (NYU, Economics), Andrew Caplin (NYU, Economics), Yann LeCun (NYU)

Many interesting regression problems possess a rich underlying inter-sample relational structure. In these problems, the samples may be related to each other in such a way that the unknown variables associated with any sample depend not only on its individual attributes, but also on the variables associated with related samples. One such problem, whose importance is further emphasized by the present economic crisis, is understanding real estate prices. The price of a house clearly depends on its individual attributes, such as the number of bedrooms. However, the price also depends on the neighborhood in which the house lies and on the time period in which it was sold. Uncovering these spatio-temporal dependencies can help us better understand house prices while also improving prediction accuracy.

Problems of this nature fall in the domain of "Statistical Relational Learning". However, the drawback of most models proposed so far is that they cater only to classification problems. We propose "Relational Factor Graph" models for regression on relational data and apply them to predicting house prices and constructing house price indices. The relational aspect of the model accounts for the hidden spatio-temporal influences on the price of every house. The dataset used is industry standard, consisting of around 1.3 million transactions in Los Angeles County spanning 24 years. Experiments show that one can achieve considerably superior performance by identifying and using the underlying spatio-temporal structure associated with the problem.
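
To make the relational idea concrete, the toy sketch below predicts a house's (log) price as the sum of an attribute term and a hidden neighborhood term, where the hidden term of a house is estimated from the residuals of nearby houses. This is a deliberately simplified stand-in, not the Relational Factor Graph model itself: the additive form, the single attribute, the radius-based neighborhood, the alternating updates, and all names and numbers are assumptions made for illustration, and the temporal dimension is omitted.

```python
import math

def neighbors(loc, locs, radius):
    """Indices of training houses within `radius` of `loc`."""
    return [j for j, other in enumerate(locs)
            if math.hypot(loc[0] - other[0], loc[1] - other[1]) <= radius]

def fit(xs, ys, locs, radius=1.0, n_iter=200):
    """Alternate between two updates:
    1. hidden term d[i] = mean residual (y - w * x) over houses near house i;
    2. attribute weight w = least-squares fit of (y - d) on x.
    """
    n = len(xs)
    nbrs = [neighbors(locs[i], locs, radius) for i in range(n)]
    w, d = 0.0, [0.0] * n
    for _ in range(n_iter):
        resid = [ys[j] - w * xs[j] for j in range(n)]
        d = [sum(resid[j] for j in nb) / len(nb) for nb in nbrs]
        w = sum(x * (y - di) for x, y, di in zip(xs, ys, d)) / sum(x * x for x in xs)
    return w, d

def predict(x_new, loc_new, w, xs, ys, locs, radius=1.0):
    """Attribute term plus a hidden term borrowed from nearby training houses."""
    nb = neighbors(loc_new, locs, radius)
    d_new = sum(ys[j] - w * xs[j] for j in nb) / len(nb) if nb else 0.0
    return w * x_new + d_new

# Hypothetical data: x = number of bedrooms, y = price in arbitrary log units.
# Two spatial clusters share the same per-bedroom effect (0.5) but have
# different hidden neighborhood offsets (1.0 and 3.0).
xs = [1, 2, 3, 1, 2, 3]
ys = [1.5, 2.0, 2.5, 3.5, 4.0, 4.5]
locs = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]

w, d = fit(xs, ys, locs)
price = predict(4, (5.05, 5.05), w, xs, ys, locs)
```

A regression on bedrooms alone would have to average over the two clusters; letting each house borrow information from its neighbors recovers both the shared attribute effect and the location-dependent offset in this example.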