Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

CBoW Word2Vec

Continuous Bag-of-Words Word2Vec

Natural Language Processing · Introduced 2000 · 6 papers
Source Paper

Description

Continuous Bag-of-Words (CBOW) Word2Vec is an architecture for learning word embeddings that predicts a target word from the $n$ future words and the $n$ past words surrounding it. The objective function for CBOW is:

$$J_\theta = \frac{1}{T}\sum_{t=1}^{T}\log p\left(w_{t} \mid w_{t-n},\ldots,w_{t-1},w_{t+1},\ldots,w_{t+n}\right)$$

In the CBOW model, the distributed representations of context are used to predict the word in the middle of the window. This contrasts with Skip-gram Word2Vec where the distributed representation of the input word is used to predict the context.
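The averaging-then-predicting step above can be sketched in a few lines of NumPy. This is a hypothetical toy implementation over a full softmax, not the original word2vec C code (which uses hierarchical softmax or negative sampling for efficiency); all variable names are illustrative.

```python
import numpy as np

# Toy CBOW sketch: average the embeddings of the 2n context words,
# score every vocabulary word, and take one SGD step on the
# log-likelihood of the centre word. Assumes a tiny in-memory corpus.
rng = np.random.default_rng(0)

corpus = "the quick brown fox jumps over the lazy dog".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
V, D, n = len(vocab), 8, 2            # vocab size, embedding dim, window radius

W_in = rng.normal(0.0, 0.1, (V, D))   # input (context) embedding matrix
W_out = rng.normal(0.0, 0.1, (D, V))  # output (prediction) weight matrix

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def cbow_step(center, context, lr=0.1):
    """One SGD step that raises log p(center | context)."""
    h = W_in[context].mean(axis=0)        # average the context embeddings
    p = softmax(h @ W_out)                # predicted distribution over vocab
    err = p.copy()
    err[center] -= 1.0                    # softmax cross-entropy gradient
    grad_h = W_out @ err                  # gradient w.r.t. the averaged context
    W_out[:, :] -= lr * np.outer(h, err)  # update output weights in place
    W_in[context] -= lr * grad_h / len(context)  # share gradient over context
    return -np.log(p[center])             # negative log-likelihood of centre word

losses = []
for epoch in range(100):
    epoch_loss = 0.0
    for t in range(n, len(corpus) - n):
        context = [idx[corpus[t + o]] for o in range(-n, n + 1) if o != 0]
        epoch_loss += cbow_step(idx[corpus[t]], context)
    losses.append(epoch_loss)
```

After training, each row of `W_in` is a word embedding; the per-epoch loss should fall as the model learns to recover centre words from their windows.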

Papers Using This Method

- HuSpaCy: an industrial-strength Hungarian natural language processing toolkit (2022-01-06)
- A Statutory Article Retrieval Dataset in French (2021-08-26)
- LU-BZU at SemEval-2021 Task 2: Word2Vec and Lemma2Vec performance in Arabic Word-in-Context disambiguation (2021-04-16)
- FarsTail: A Persian Natural Language Inference Dataset (2020-09-18)
- IP2Vec: Learning Similarities Between IP Addresses (2017-11-21)
- Efficient Estimation of Word Representations in Vector Space (2013-01-16)