12 Applications

12.1 Large-Scale Deep Learning

12.1.1 Fast CPU Implementations

12.1.2 GPU Implementations

12.1.3 Large-Scale Distributed Implementations

12.1.4 Model Compression

12.1.5 Dynamic Structure

12.1.6 Specialized Hardware Implementations of Deep Networks

12.2 Computer Vision

12.2.1 Preprocessing

12.2.1.1 Contrast Normalization

12.2.1.2 Dataset Augmentation

12.3 Speech Recognition

12.4 Natural Language Processing

12.4.1 $n$-grams

12.4.2 Neural Language Models

12.4.3 High-Dimensional Outputs

12.4.3.1 Use of a Short List

12.4.3.2 Hierarchical Softmax

12.4.3.3 Importance Sampling

12.4.3.4 Noise-Contrastive Estimation and Ranking Loss

12.4.4 Combining Neural Language Models with $n$-grams

12.4.5 Neural Machine Translation

12.4.5.1 Using an Attention Mechanism and Aligning Pieces of Data

12.4.6 Historical Perspective

12.5 Other Applications

12.5.1 Recommender Systems

12.5.1.1 Exploration versus Exploitation

12.5.2 Knowledge Representation, Reasoning and Question Answering

12.5.2.1 Knowledge, Relations and Question Answering