Natural Language Understanding

1. Bag of words

spam_prediction_with_bagofwords.ipynb is a simple implementation that uses bag of words as the tokenization method, followed by a linear classifier. Bag of words is a limited method: it only takes into account which words appear in a sentence, ignoring word order and word co-occurrence, let alone the semantic meaning of the words. A linear classifier is not a very expressive model either. Despite these limitations, spam prediction managed to reach 95% accuracy. This probably has a lot to do with the spam prediction task itself. Spam samples are not long, and if certain words such as 'buy' or 'sold' appear in a sample, the sample is likely to be classified as spam. If we replace website addresses with 'http://XXX' and telephone numbers with '(XXX)XXX-XXXX', this model can reach higher accuracy.
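The placeholder substitution and the loss of word order described above can be sketched in a few lines of standard-library Python. This is only an illustration, not the notebook's code: the regular expressions `URL_RE` and `PHONE_RE` and the helper names `normalize` and `bag_of_words` are assumptions, and real data would likely need broader patterns.

```python
import re
from collections import Counter

# Hypothetical patterns, chosen for illustration only.
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
PHONE_RE = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")

# Tokenizer: keep the two placeholders intact, otherwise match word-like runs.
TOKEN_RE = re.compile(r"http://xxx|\(xxx\)xxx-xxxx|[a-z0-9']+")


def normalize(text):
    """Replace URLs and phone numbers with fixed placeholder strings."""
    text = URL_RE.sub("http://XXX", text)
    text = PHONE_RE.sub("(XXX)XXX-XXXX", text)
    return text


def bag_of_words(text):
    """Lowercase the text and count tokens; word order is discarded."""
    return Counter(TOKEN_RE.findall(text.lower()))


message = "Call (555) 123-4567 or visit http://spam.example.com/win now"
normalized = normalize(message)
bag = bag_of_words(normalized)

# Two sentences with different meanings but identical bags.
same_bag = bag_of_words("dog bites man") == bag_of_words("man bites dog")
```

After normalization, every URL maps to one token and every phone number to another, so the classifier sees a single strong feature instead of many rare, distinct strings. The `same_bag` comparison shows the limitation mentioned above: both sentences produce the same counts.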

Overall, bag of words is a classic tokenization method, with a multinomial or Bernoulli probability model as its mathematical backing, and it is simple and good enough to support some simple tasks.
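One way to read the multinomial backing is as a multinomial naive Bayes classifier over word counts, which is linear in the counts once written in log space. The sketch below is a minimal version under that assumption; the notebook may use a different linear classifier. The toy documents, the labels, and the function names `train_multinomial_nb` and `predict` are all made up for illustration.

```python
import math
from collections import Counter


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit log priors and Laplace-smoothed log likelihoods per class."""
    vocab = {w for d in docs for w in d}
    log_prior = {}
    log_likelihood = {}
    for c in sorted(set(labels)):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        log_prior[c] = math.log(len(class_docs) / len(docs))
        counts = Counter(w for d in class_docs for w in d)
        total = sum(counts.values()) + alpha * len(vocab)
        log_likelihood[c] = {
            w: math.log((counts[w] + alpha) / total) for w in vocab
        }
    return vocab, log_prior, log_likelihood


def predict(model, tokens):
    """Return the class with the highest log posterior score."""
    vocab, log_prior, log_likelihood = model
    scores = {}
    for c in log_prior:
        # Words never seen in training are skipped.
        scores[c] = log_prior[c] + sum(
            log_likelihood[c][w] for w in tokens if w in vocab
        )
    return max(scores, key=scores.get)


# Toy training data (hypothetical, already tokenized).
train_docs = [
    ["buy", "cheap", "pills", "now"],
    ["limited", "offer", "buy", "now"],
    ["meeting", "at", "noon", "tomorrow"],
    ["lunch", "tomorrow", "with", "team"],
]
train_labels = ["spam", "spam", "ham", "ham"]

model = train_multinomial_nb(train_docs, train_labels)
```

With this toy model, `predict(model, ["buy", "now"])` scores higher for `"spam"` and `predict(model, ["lunch", "tomorrow"])` scores higher for `"ham"`, because each class simply sums the per-word log weights it learned.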

2. BiLSTM

3. What are we talking about when we are talking about BERT

BERT is a game changer in the NLP domain.

graceyrhuang/Natural-Language-Understanding
