Repository of the content (and notebooks) covered in AV's DataHack Summit 2019 workshop on Applied NLP by Sudalai Rajkumar
- Introduction to Natural Language Processing
- Text pre-processing and Wrangling
  - Removing HTML tags/noise
  - Removing accented characters
  - Removing special characters/symbols
  - Handling contractions
  - Stemming
  - Lemmatization
  - Stop word removal
  - Project: Build a duplicate character removal module
  - Project: Build a spell-check and correction module
  - Project: Build an end-to-end text pre-processor
- Text Understanding
  - POS (Parts of Speech) Tagging
  - Text Parsing
    - Shallow Parsing
    - Dependency Parsing
    - Constituency Parsing
  - NER (Named Entity Recognition) Tagging
  - Project: Build your own POS Tagger
  - Project: Build your own NER Tagger
- Text Representation – Feature Engineering
  - Traditional Statistical Models – BOW, TF-IDF
  - Newer Deep Learning Models for word embeddings – Word2Vec, GloVe, FastText
  - Project: Similarity and Movie Recommendations
  - Project: Interactive exploration of Word Embeddings
- Case Studies for other common NLP Tasks
  - Project: Sentiment Analysis using unsupervised learning and supervised learning (machine and deep learning)
  - Project: Text Clustering (grouping similar movies)
  - Project: Text Summarization and Topic Models
- Promise of Deep Learning for NLP, Transfer and Generative Learning
- Final words and where to go from here?
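
As a rough illustration of the pre-processing steps listed in the outline above (HTML removal, accent removal, contraction handling, special-character removal, stop word removal), here is a minimal standard-library sketch. It is not the workshop's own code: the function names, the tiny `CONTRACTIONS` map and the `STOP_WORDS` set are hypothetical placeholders, and stemming and lemmatization are left out because they usually rely on a library such as NLTK or spaCy.

```python
import re
import unicodedata

# Hypothetical, deliberately tiny resources; real pipelines use far larger ones.
CONTRACTIONS = {"can't": "cannot", "won't": "will not", "it's": "it is", "i'm": "i am"}
STOP_WORDS = {"a", "an", "the", "is", "it", "and", "of", "to", "in", "i"}

# Match any known contraction as a whole word (assumes straight apostrophes).
_CONTRACTION_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(key) for key in CONTRACTIONS) + r")\b"
)


def strip_html(text):
    """Replace anything that looks like an HTML tag with a space."""
    return re.sub(r"<[^>]+>", " ", text)


def remove_accents(text):
    """Decompose accented characters and drop the non-ASCII marks."""
    normalized = unicodedata.normalize("NFKD", text)
    return normalized.encode("ascii", "ignore").decode("ascii")


def expand_contractions(text):
    """Expand contractions found in the (lowercased) text."""
    return _CONTRACTION_RE.sub(lambda match: CONTRACTIONS[match.group(0)], text)


def remove_special_characters(text):
    """Keep only lowercase letters, digits and whitespace."""
    return re.sub(r"[^a-z0-9\s]", " ", text)


def preprocess(text):
    """Run the steps in order and return the remaining tokens."""
    text = strip_html(text)
    text = remove_accents(text)
    text = text.lower()
    text = expand_contractions(text)
    text = remove_special_characters(text)
    return [token for token in text.split() if token not in STOP_WORDS]


tokens = preprocess("<p>It's a café, and I can't wait!</p>")
print(tokens)
```

The order matters in this sketch: contractions are expanded before special characters are removed, since stripping the apostrophe first would leave fragments such as `can t`.
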
- Learn and understand popular NLP workflows with interactive examples
- Work through concepts and interactive projects on cleaning and handling noisy unstructured text data, including duplicate checks, spelling corrections and text wrangling
- Build your own POS and NER taggers and parse text data to understand it better
- Understand, build and explore text semantics and representations with traditional statistical models and newer word embedding models
- Complete projects on popular NLP tasks, including text classification, sentiment analysis, text clustering, summarization, topic models and recommendations
- Get a brief look at the promise of deep learning for NLP
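
To make the "traditional statistical models" for text representation (BOW and TF-IDF) a little more concrete, here is a small pure-Python sketch. It assumes documents are already tokenized, uses one common smoothed IDF variant, `log((1 + N) / (1 + df)) + 1`, and the function names `bag_of_words` and `tf_idf` are hypothetical. It is only an illustration, not the workshop's implementation, which may rely on a library instead.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return a sorted vocabulary and a document-term count matrix."""
    vocab = sorted({token for doc in docs for token in doc})
    counts = [Counter(doc) for doc in docs]
    matrix = [[count[term] for term in vocab] for count in counts]
    return vocab, matrix


def tf_idf(docs):
    """Weight term frequencies by a smoothed inverse document frequency."""
    vocab, bow = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    doc_freq = [sum(1 for row in bow if row[j] > 0) for j in range(len(vocab))]
    # Smoothed IDF (an assumed variant; other definitions exist).
    idf = [math.log((1 + n_docs) / (1 + df)) + 1 for df in doc_freq]
    weights = []
    for row in bow:
        total = sum(row) or 1  # avoid dividing by zero for empty documents
        weights.append([(count / total) * idf[j] for j, count in enumerate(row)])
    return vocab, weights


docs = [["good", "movie"], ["bad", "movie"]]
vocab, weights = tf_idf(docs)
for term, weight in zip(vocab, weights[0]):
    print(term, round(weight, 3))
```

In this toy corpus, `movie` appears in both documents and so receives a lower weight than `good` or `bad`, which is the effect TF-IDF adds on top of raw bag-of-words counts.
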
Weblink of the workshop: