AI Resource Library: Issue 16 (20170125)
Published: 2019-05-10

This article is about 4,512 characters long; estimated reading time: 15 minutes.


1. [Blog] Deep Learning Paper Implementations: Spatial Transformer Networks - Part I

Summary:

The first three blog posts in my “Deep Learning Paper Implementations” series will cover Spatial Transformer Networks, introduced by *Max Jaderberg, Karen Simonyan, Andrew Zisserman and Koray Kavukcuoglu* of Google DeepMind in 2016. The Spatial Transformer Network is a learnable module aimed at increasing the spatial invariance of Convolutional Neural Networks in a computationally and parameter-efficient manner.

In this first installment, we’ll be introducing two very important concepts that will prove crucial in understanding the inner workings of the Spatial Transformer layer. We’ll first start by examining a subset of image transformation techniques that fall under the umbrella of **affine transformations**, and then dive into a procedure that commonly follows these transformations: **bilinear interpolation**.

In the second installment, we’ll be going over the Spatial Transformer Layer in detail and summarizing the paper, and then in the third and final part, we’ll be coding it from scratch in Tensorflow and applying it to the German Traffic Sign Recognition Benchmark (GTSRB).
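To make the two concepts concrete, here is a minimal NumPy sketch (not the blog's own code; all names are illustrative): a 2x3 affine matrix maps each output pixel back to a source coordinate, and bilinear interpolation then samples the input image at those real-valued locations.

```python
import numpy as np

def affine_grid(theta, height, width):
    """Map each output pixel to a source location via a 2x3 affine matrix theta."""
    # Normalized coordinates in [-1, 1], a common convention for sampling grids.
    xs = np.linspace(-1.0, 1.0, width)
    ys = np.linspace(-1.0, 1.0, height)
    x_t, y_t = np.meshgrid(xs, ys)
    grid = np.stack([x_t.ravel(), y_t.ravel(), np.ones(height * width)])  # (3, H*W)
    src = theta @ grid                                                     # (2, H*W)
    return src[0].reshape(height, width), src[1].reshape(height, width)

def bilinear_sample(img, x_src, y_src):
    """Sample img at real-valued (x_src, y_src) coordinates with bilinear interpolation."""
    h, w = img.shape
    # Convert normalized coordinates back to pixel coordinates.
    x = (x_src + 1.0) * (w - 1) / 2.0
    y = (y_src + 1.0) * (h - 1) / 2.0
    x0 = np.clip(np.floor(x).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, h - 2)
    x1, y1 = x0 + 1, y0 + 1
    wx, wy = x - x0, y - y0
    top = img[y0, x0] * (1 - wx) + img[y0, x1] * wx
    bottom = img[y1, x0] * (1 - wx) + img[y1, x1] * wx
    return top * (1 - wy) + bottom * wy

# Example: warp a random "image" with a 30-degree rotation using the two steps above.
angle = np.deg2rad(30)
theta = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle),  np.cos(angle), 0.0]])
img = np.random.rand(32, 32)
x_src, y_src = affine_grid(theta, 32, 32)
warped = bilinear_sample(img, x_src, y_src)
print(warped.shape)  # (32, 32)
```

In a Spatial Transformer layer the matrix `theta` is not fixed as it is here; it is predicted by a small network, which is what makes the transformation learnable.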

Original link:


2. [Blog] Attention and Memory in Deep Learning and NLP

Summary:

A recent trend in Deep Learning is Attention Mechanisms. In an interview, Ilya Sutskever, now the research director of OpenAI, mentioned that Attention Mechanisms are one of the most exciting advancements, and that they are here to stay. That sounds exciting. But what are Attention Mechanisms?

Attention Mechanisms in Neural Networks are (very) loosely based on the visual attention mechanism found in humans. Human visual attention is well-studied and while there exist different models, all of them essentially come down to being able to focus on a certain region of an image with “high resolution” while perceiving the surrounding image in “low resolution”, and then adjusting the focal point over time.

Attention in Neural Networks has a long history, particularly in image recognition. But only recently have attention mechanisms made their way into the recurrent neural network architectures that are typically used in NLP (and increasingly also in vision). That’s what we’ll focus on in this post.
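As a rough illustration of the core idea (a hedged sketch, not the post's code), soft attention scores each input position against a query, turns the scores into a softmax distribution, and returns a weighted sum of the inputs, so the model can "focus" on the most relevant positions:

```python
import numpy as np

def soft_attention(query, keys, values):
    """Basic dot-product soft attention: weight each value by how well its key matches the query."""
    scores = keys @ query                      # (T,) one score per input position
    weights = np.exp(scores - scores.max())    # softmax with max-subtraction for numerical stability
    weights /= weights.sum()
    context = weights @ values                 # (d,) weighted sum of the values
    return context, weights

# Toy example: 5 encoder states of dimension 4, attended to by one decoder query.
rng = np.random.default_rng(0)
keys = values = rng.normal(size=(5, 4))   # in the simplest case, keys == values == encoder states
query = rng.normal(size=4)
context, weights = soft_attention(query, keys, values)
print(weights.round(3), weights.sum())    # a probability distribution over the 5 input positions
```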

Original link:


3. [Blog] Game Theory reveals the Future of Deep Learning

Summary:

The idea that game theory reveals the future of Deep Learning makes intuitive sense for two reasons. The first intuition is that DL systems will eventually need to tackle situations with imperfect knowledge.

The second intuition is that systems will not remain monolithic as they are now, but will instead involve multiple coordinating (or competing) cliques of DL systems.

Original link:


4. [Tool] Visual Analysis for Recurrent Neural Networks

Summary:

Recurrent neural networks, and in particular long short-term memory networks (LSTMs), are a remarkably effective tool for sequence processing that learn a dense black-box hidden representation of their sequential input. Researchers interested in better understanding these models have studied the changes in hidden state representations over time and noticed some interpretable patterns but also significant noise.

We present LSTMVis, a visual analysis tool for recurrent neural networks with a focus on understanding these hidden state dynamics. The tool allows a user to select a hypothesis input range to focus on local state changes, to match these state changes to similar patterns in a large data set, and to align these results with structural annotations from their domain. We provide data for the tool to analyze specific hidden state properties on datasets containing nesting, phrase structure, and chord progressions, and demonstrate how the tool can be used to isolate patterns for further statistical analysis.
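As a rough sketch of the kind of hypothesis selection described above (illustrative NumPy only, not LSTMVis's actual API), one can collect per-timestep hidden states and pick the cells that stay above a threshold over a user-chosen time span:

```python
import numpy as np

def select_active_cells(hidden_states, start, end, threshold=0.5):
    """Return indices of hidden dimensions whose activation stays above
    `threshold` at every timestep in the selected range [start, end)."""
    span = hidden_states[start:end]            # (range_len, hidden_dim)
    active = np.all(span > threshold, axis=0)  # a cell must be "on" at every step in the range
    return np.flatnonzero(active)

# Toy example: 20 timesteps of a 10-dimensional hidden state with values in [-1, 1].
rng = np.random.default_rng(1)
hidden_states = np.tanh(rng.normal(size=(20, 10)))
cells = select_active_cells(hidden_states, start=5, end=9, threshold=0.3)
print(cells)  # hidden dimensions consistently active over timesteps 5..8
```

The tool's matching step then searches the rest of the data for other spans where the same cells are active, which is what lets a user test whether those cells track a property such as nesting depth.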

Original link:


5. [Paper & Code] Sequence Level Training with Recurrent Neural Networks

Summary:

Many natural language processing applications use language models to generate text. These models are typically trained to predict the next word in a sequence, given the previous words and some context such as an image. However, at test time the model is expected to generate the entire sequence from scratch. This discrepancy makes generation brittle, as errors may accumulate along the way. We address this issue by proposing a novel sequence level training algorithm that directly optimizes the metric used at test time, such as BLEU or ROUGE. On three different tasks, our approach outperforms several strong baselines for greedy generation. The method is also competitive when these baselines employ beam search, while being several times faster.
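The paper's actual algorithm is more involved, but the core REINFORCE-style idea behind sequence-level training can be sketched roughly as follows (illustrative NumPy, with a toy unigram-overlap reward standing in for BLEU/ROUGE): sample an entire sequence as at test time, score the whole sequence, and scale the log-probability gradients of the sampled tokens by the reward minus a baseline.

```python
import numpy as np

def unigram_overlap(candidate, reference):
    """Toy stand-in for BLEU/ROUGE: fraction of reference tokens that appear in the candidate."""
    return len(set(candidate) & set(reference)) / max(len(set(reference)), 1)

def sequence_level_gradient(probs, sampled, reward, baseline):
    """REINFORCE-style gradient w.r.t. the logits: (reward - baseline) * d log p(sampled) / d logits,
    which for a softmax policy is (one_hot(sampled_token) - probs) at each timestep."""
    grads = np.zeros_like(probs)
    for t, token in enumerate(sampled):
        one_hot = np.zeros(probs.shape[1])
        one_hot[token] = 1.0
        grads[t] = (reward - baseline) * (one_hot - probs[t])
    return grads

# Toy setup: a vocabulary of 6 tokens and a 4-step output distribution from some model.
rng = np.random.default_rng(2)
logits = rng.normal(size=(4, 6))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
sampled = [int(rng.choice(6, p=p)) for p in probs]         # generate a whole sequence, as at test time
reward = unigram_overlap(sampled, reference=[1, 2, 3, 4])  # score the *entire* sequence at once
grads = sequence_level_gradient(probs, sampled, reward, baseline=0.25)
print(sampled, round(reward, 2))
```

The contrast with standard training is that the reward depends on the complete generated sequence rather than on each next-word prediction in isolation, which is what lets the training objective match the test-time metric.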

Original link:

Code link:


Reprinted from: http://epdqb.baihongyu.com/
