12 Applications
12.1 Large-Scale Deep Learning
12.1.1 Fast CPU Implementations
12.1.2 GPU Implementations
12.1.3 Large-Scale Distributed Implementations
12.1.4 Model Compression
12.1.5 Dynamic Structure
12.1.6 Specialized Hardware Implementations of Deep Networks
12.2 Computer Vision
12.2.1 Preprocessing
12.2.1.1 Contrast Normalization
12.2.1.2 Dataset Augmentation
12.3 Speech Recognition
12.4 Natural Language Processing
12.4.1 $n$-grams
12.4.2 Neural Language Models
12.4.3 High-Dimensional Outputs
12.4.3.1 Use of a Short List
12.4.3.2 Hierarchical Softmax
12.4.3.3 Importance Sampling
12.4.3.4 Noise-Contrastive Estimation and Ranking Loss
12.4.4 Combining Neural Language Models with $n$-grams
12.4.5 Neural Machine Translation
12.4.5.1 Using an Attention Mechanism and Aligning Pieces of Data
12.4.6 Historical Perspective
12.5 Other Applications
12.5.1 Recommender Systems
12.5.1.1 Exploration versus Exploitation
12.5.2 Knowledge Representation, Reasoning and Question Answering
12.5.2.1 Knowledge, Relations and Question Answering