Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Deep LSTM Reader

Natural Language Processing · Introduced 2015 · 1 paper
Source Paper

Description

The Deep LSTM Reader is a neural network for reading comprehension. The document is fed one word at a time into a Deep LSTM encoder; after a delimiter, the query is fed into the same encoder. The model therefore processes each document-query pair as a single long sequence. Given the embedded document and query, the network predicts which token in the document answers the query.
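As a concrete illustration of the input construction (the token strings and the helper name here are assumptions for the sketch; the paper writes the delimiter as |||), a document-query pair becomes one flat sequence:

```python
# Build the single sequence the encoder consumes:
# document tokens, then a delimiter token, then the query tokens.
DELIM = "|||"  # delimiter token, written "|||" in the paper

def make_input(document_tokens, query_tokens):
    """Flatten a document-query pair into one long token sequence."""
    return document_tokens + [DELIM] + query_tokens

seq = make_input(["mary", "travelled", "to", "london"],
                 ["where", "did", "mary", "travel"])
# The encoder reads all len(document) + 1 + len(query) tokens in order.
```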

The model consists of a Deep LSTM cell with skip connections from each input $x(t)$ to every hidden layer, and from every hidden layer to the output $y(t)$:

$$x'(t, k) = x(t)\,\|\,y'(t, k - 1), \qquad y(t) = y'(t, 1)\,\|\,\dots\,\|\,y'(t, K)$$

$$i(t, k) = \sigma\left(W_{kxi}x'(t, k) + W_{khi}h(t - 1, k) + W_{kci}c(t - 1, k) + b_{ki}\right)$$

$$f(t, k) = \sigma\left(W_{kxf}x(t) + W_{khf}h(t - 1, k) + W_{kcf}c(t - 1, k) + b_{kf}\right)$$

$$c(t, k) = f(t, k)\,c(t - 1, k) + i(t, k)\tanh\left(W_{kxc}x'(t, k) + W_{khc}h(t - 1, k) + b_{kc}\right)$$

$$o(t, k) = \sigma\left(W_{kxo}x'(t, k) + W_{kho}h(t - 1, k) + W_{kco}c(t, k) + b_{ko}\right)$$

$$h(t, k) = o(t, k)\tanh\left(c(t, k)\right)$$

$$y'(t, k) = W_{ky}h(t, k) + b_{ky}$$

where $\|$ indicates vector concatenation, $h(t, k)$ is the hidden state of layer $k$ at time $t$, and $i$, $f$, $o$ are the input, forget, and output gates respectively. Thus the Deep LSTM Reader is defined by $g^{\text{LSTM}}(d, q) = y(|d| + |q|)$, with input $x(t)$ the concatenation of $d$ and $q$ separated by the delimiter $|||$.
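The equations above can be sketched in NumPy. This is a minimal illustrative implementation, not the authors' code: the weight shapes, random initialisation, and the use of element-wise (diagonal) peephole terms $W_{kc\cdot}$ are assumptions, and all gates here use $x'(t,k)$ as input.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class DeepLSTMReader:
    """Deep LSTM with skip connections, following the equations above.

    Layer k receives x'(t,k) = [x(t); y'(t,k-1)], and the output
    y(t) concatenates the per-layer outputs y'(t,1), ..., y'(t,K).
    Dimensions and initialisation are illustrative assumptions.
    """

    def __init__(self, input_dim, hidden_dim, num_layers, rng=None):
        rng = rng or np.random.default_rng(0)
        self.K, self.h = num_layers, hidden_dim
        self.params = []
        for k in range(num_layers):
            # Layer 1 sees only x(t); deeper layers see [x(t); y'(t,k-1)].
            in_dim = input_dim + (hidden_dim if k > 0 else 0)
            p = {name: rng.normal(0, 0.1, (hidden_dim, in_dim))
                 for name in ("Wxi", "Wxf", "Wxc", "Wxo")}
            for name in ("Whi", "Whf", "Whc", "Who"):
                p[name] = rng.normal(0, 0.1, (hidden_dim, hidden_dim))
            for name in ("Wci", "Wcf", "Wco"):  # peepholes, assumed diagonal
                p[name] = rng.normal(0, 0.1, hidden_dim)
            for name in ("bi", "bf", "bc", "bo", "by"):
                p[name] = np.zeros(hidden_dim)
            p["Wy"] = rng.normal(0, 0.1, (hidden_dim, hidden_dim))
            self.params.append(p)

    def step(self, x, h_prev, c_prev):
        """One time step; returns y(t) and the new per-layer states."""
        h_new, c_new, ys = [], [], []
        for k, p in enumerate(self.params):
            xk = x if k == 0 else np.concatenate([x, ys[-1]])  # x'(t,k)
            i = sigmoid(p["Wxi"] @ xk + p["Whi"] @ h_prev[k]
                        + p["Wci"] * c_prev[k] + p["bi"])
            f = sigmoid(p["Wxf"] @ xk + p["Whf"] @ h_prev[k]
                        + p["Wcf"] * c_prev[k] + p["bf"])
            c = f * c_prev[k] + i * np.tanh(p["Wxc"] @ xk
                                            + p["Whc"] @ h_prev[k] + p["bc"])
            o = sigmoid(p["Wxo"] @ xk + p["Who"] @ h_prev[k]
                        + p["Wco"] * c + p["bo"])
            h = o * np.tanh(c)
            ys.append(p["Wy"] @ h + p["by"])  # y'(t,k)
            h_new.append(h)
            c_new.append(c)
        return np.concatenate(ys), h_new, c_new  # y(t) = y'(t,1)||...||y'(t,K)

    def encode(self, sequence):
        """g(d,q): run the document||query sequence, return the final y."""
        h = [np.zeros(self.h) for _ in range(self.K)]
        c = [np.zeros(self.h) for _ in range(self.K)]
        y = np.zeros(self.h * self.K)
        for x in sequence:
            y, h, c = self.step(x, h, c)
        return y
```

Feeding a sequence of embedded tokens (document, delimiter, query) through `encode` yields the fixed-size vector $y(|d|+|q|)$ that the answer-prediction layer would consume.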

Papers Using This Method

Teaching Machines to Read and Comprehend (2015-06-10)