deep learning - How is a word2vec model integrated with an LSTM model?
For a natural language processing (NLP) task, one uses word2vec vectors to embed words. However, I still don't understand how a word2vec model is integrated with an LSTM model.
Also, how should unknown words be handled when modeling an NLP task such as sentiment prediction using a long short-term memory (LSTM) network?
To use text data as input to a neural network, you typically need to convert it to numbers, and word2vec is one way to do that.
In a nutshell, it's an alternative to having a big one-hot-encoded vector per word.
But to be able to use it, you need a dictionary: a set of known words, mapped to an embedding matrix (the one generated by word2vec). The matrix has shape (dictionary size, embedding size), where the embedding size is the size of your feature vector.
The dictionary handles unknown words with a special token (e.g. <unk>) that has its own entry in the embedding matrix.
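A minimal sketch of this setup, assuming a word2vec model trained with gensim (4.x API); the toy corpus and variable names here are illustrative:

```python
import numpy as np
from gensim.models import Word2Vec

# Train (or load) a word2vec model; a toy corpus for illustration.
sentences = [["the", "fox", "jumps", "on", "the", "lazy", "dog"]]
w2v = Word2Vec(sentences, vector_size=2, min_count=1)

# Build the dictionary: special tokens first, then every known word.
dictionary = {"<eof>": 0, "<unk>": 1}
for word in w2v.wv.index_to_key:
    dictionary[word] = len(dictionary)

# Build the embedding matrix: one row per dictionary entry,
# shape (dictionary size, embedding size).
embedding_matrix = np.zeros((len(dictionary), w2v.vector_size))
for word, idx in dictionary.items():
    if word in w2v.wv:
        embedding_matrix[idx] = w2v.wv[word]
    else:
        # Special tokens (<eof>, <unk>) get random rows; they can be
        # fine-tuned later along with the rest of the network.
        embedding_matrix[idx] = np.random.normal(size=w2v.vector_size)
```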
Edit: added an example.
Let's say the input text is: the quick brown fox jumps on the lazy dog
And the dictionary (size 8):

<eof>   0
<unk>   1
the     2
fox     3
jumps   4
on      5
lazy    6
dog     7
The embedding matrix, with embedding size 2:

0 | -0.88930951349   -1.62185932033
1 | -0.11004085279    0.552127884563
2 |  0.689740990506   0.834548005211
3 | -0.7228834693     0.633890390277
4 | -1.47636106953   -0.20830548073
5 |  1.08289425079    0.211504860598
6 | -0.626065160814   0.505306007423
7 |  1.91239085331   -0.102223754095
Then you need to preprocess the input, replacing every word with its index in the dictionary. The result looks like this:
[2, 1, 1, 3, 4, 5, 2, 6, 7]
Notice that quick and brown are not in the dictionary; they are unknown words, so both map to index 1 (<unk>).
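The lookup itself is just a dictionary access with <unk> as the fallback; a minimal sketch using the example dictionary above:

```python
dictionary = {"<eof>": 0, "<unk>": 1, "the": 2, "fox": 3,
              "jumps": 4, "on": 5, "lazy": 6, "dog": 7}

text = "the quick brown fox jumps on the lazy dog"
unk = dictionary["<unk>"]

# Unknown words fall back to the <unk> index.
indices = [dictionary.get(word, unk) for word in text.split()]
print(indices)  # [2, 1, 1, 3, 4, 5, 2, 6, 7]
```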
And to use it in the network, you need to replace the indexes with their embeddings:
[[ 0.689740990506,  0.834548005211],
 [-0.11004085279,   0.552127884563],
 [-0.11004085279,   0.552127884563],
 [-0.7228834693,    0.633890390277],
 [-1.47636106953,  -0.20830548073],
 [ 1.08289425079,   0.211504860598],
 [ 0.689740990506,  0.834548005211],
 [-0.626065160814,  0.505306007423],
 [ 1.91239085331,  -0.102223754095]]
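In practice you don't do this replacement by hand: the embedding matrix is loaded into the network's embedding layer, and the LSTM consumes that layer's output. A sketch of the wiring, assuming PyTorch and reusing the embedding_matrix and indices from the snippets above (the layer sizes and class name are illustrative):

```python
import torch
import torch.nn as nn

class SentimentLSTM(nn.Module):
    def __init__(self, embedding_matrix, hidden_size=16):
        super().__init__()
        # Initialize the embedding layer from the word2vec matrix;
        # freeze=False lets the vectors be fine-tuned during training.
        self.embedding = nn.Embedding.from_pretrained(
            torch.tensor(embedding_matrix, dtype=torch.float32),
            freeze=False)
        self.lstm = nn.LSTM(input_size=embedding_matrix.shape[1],
                            hidden_size=hidden_size, batch_first=True)
        self.classifier = nn.Linear(hidden_size, 1)

    def forward(self, indices):
        embedded = self.embedding(indices)  # (batch, seq_len, embed_size)
        _, (h_n, _) = self.lstm(embedded)   # final hidden state
        return torch.sigmoid(self.classifier(h_n[-1]))  # score in (0, 1)

model = SentimentLSTM(embedding_matrix)
batch = torch.tensor([indices])             # shape (1, 9)
print(model(batch))
```

So the word2vec model is "integrated" with the LSTM simply by using its vectors to initialize the embedding layer that sits in front of the LSTM.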