What is Natural Language Processing (NLP)?
Natural Language Processing is a branch of artificial intelligence that deals with analyzing, understanding, and generating the languages that humans use naturally. Its goal is to let people interface with computers in both written and spoken contexts using natural human languages instead of computer languages.
Components of Natural Language Processing
- Natural Language Understanding (NLU)
Natural Language Understanding deals with interpreting the meaning of the natural-language input given by the user.
- Natural Language Generation (NLG)
Natural Language Generation deals with producing written or spoken language from raw data.
Concepts of Natural Language Processing
- Tokenization
- Stemming
- Lemmatization
- Parts of Speech (POS) Tagging
- Named Entity Recognition
Tokenization
Tokenization is the process of splitting text into minimal meaningful units (tokens), such as words and punctuation marks.
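As a minimal sketch of the idea, the following splits a sentence into word and punctuation tokens with a single regular expression. The pattern and the `tokenize` function name are illustrative assumptions, not a standard; production tokenizers handle many more cases (URLs, abbreviations, hyphenation, other languages).

```python
import re

# Assumed pattern: a run of word characters, optionally followed by an
# apostrophe part (so "isn't" stays together), or one punctuation mark.
TOKEN_PATTERN = re.compile(r"\w+(?:'\w+)?|[^\w\s]")


def tokenize(text):
    """Split text into a list of word and punctuation tokens."""
    return TOKEN_PATTERN.findall(text)


print(tokenize("NLP isn't magic, it's engineering."))
# ['NLP', "isn't", 'magic', ',', "it's", 'engineering', '.']
```

Whitespace is discarded and each punctuation mark becomes its own token, which is a common but not universal design choice.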
Stemming
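Stemming is the process of reducing a word to its word stem (base form) by cutting off the beginning or the end of the word, that is, its prefixes or suffixes. The resulting stem is not always a valid dictionary word.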
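The suffix-cutting idea can be sketched in a few lines. This is a deliberately naive stemmer written for illustration, not the Porter algorithm or any library's implementation; the suffix list, the `naive_stem` name, and the `min_stem_len` parameter are all assumptions, and prefixes are not handled.

```python
# Assumed suffix list, checked in this order (first match wins).
SUFFIXES = ("ing", "ed", "ly", "es", "s")


def naive_stem(word, min_stem_len=3):
    """Strip one known suffix, keeping at least min_stem_len characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


for w in ["jumping", "jumped", "jumps", "quickly", "studies", "running"]:
    print(w, "->", naive_stem(w))
# jumping -> jump
# jumped -> jump
# jumps -> jump
# quickly -> quick
# studies -> studi
# running -> runn
```

The last two results show why stems are not always real words: blind truncation has no knowledge of the vocabulary.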
Lemmatization
Lemmatization is the process of reducing words to their lemma, or dictionary form. For example, the lemma of "mice" is "mouse".
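Unlike stemming, lemmatization relies on vocabulary knowledge rather than truncation. A minimal sketch, assuming a tiny hand-written lookup table (the `LEMMA_TABLE` entries and the `lemmatize` name are hypothetical; real lemmatizers use full lexicons and usually the word's part of speech):

```python
# Hypothetical toy lexicon mapping inflected forms to dictionary forms.
LEMMA_TABLE = {
    "is": "be", "are": "be", "was": "be", "were": "be",
    "better": "good", "best": "good",
    "mice": "mouse", "geese": "goose",
    "ran": "run", "running": "run",
    "studies": "study",
}


def lemmatize(word):
    """Return the dictionary form if known, else the lowercased word."""
    key = word.lower()
    return LEMMA_TABLE.get(key, key)


print([lemmatize(w) for w in ["Mice", "were", "running", "better", "table"]])
# ['mouse', 'be', 'run', 'good', 'table']
```

Every output here is a valid word, which is the practical difference from a stem such as "studi".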
Parts of Speech (POS) Tagging
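Traditional English grammar groups the words in a sentence or phrase into 8 parts of speech: noun, pronoun, verb, adjective, adverb, preposition, conjunction, and interjection. POS tagging is the process of marking up each word in a text (corpus) as corresponding to a particular part of speech, based on both its definition and its context.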
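To make the marking-up step concrete, here is a toy tagger that looks each word up in a small lexicon and falls back to suffix rules. The lexicon, the tag names, and the `tag` function are assumptions made for this sketch; it ignores context entirely, whereas practical taggers are statistical or neural and use the surrounding words.

```python
# Hypothetical mini-lexicon; tag names are illustrative.
LEXICON = {
    "the": "DET", "a": "DET", "an": "DET",
    "dog": "NOUN", "cat": "NOUN", "park": "NOUN",
    "chased": "VERB", "runs": "VERB", "sleeps": "VERB",
    "quick": "ADJ", "lazy": "ADJ",
    "in": "ADP", "on": "ADP",
    "and": "CONJ",
    "it": "PRON", "she": "PRON",
}


def tag(tokens):
    """Pair each token with a part-of-speech tag."""
    tagged = []
    for token in tokens:
        word = token.lower()
        if word in LEXICON:
            pos = LEXICON[word]
        elif word.endswith("ly"):
            pos = "ADV"    # crude suffix heuristic
        elif word.endswith("ing") or word.endswith("ed"):
            pos = "VERB"   # crude suffix heuristic
        else:
            pos = "NOUN"   # fallback guess for unknown words
        tagged.append((token, pos))
    return tagged


print(tag("The lazy dog sleeps quietly".split()))
# [('The', 'DET'), ('lazy', 'ADJ'), ('dog', 'NOUN'), ('sleeps', 'VERB'), ('quietly', 'ADV')]
```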
Named Entity Recognition
Named Entity Recognition (NER) is typically performed on nouns and noun phrases, most often proper nouns. It is a subtask of information extraction that seeks to locate and classify named entity mentions in unstructured text into pre-defined categories such as person names, organizations, locations, medical codes, time expressions, quantities, monetary values, and percentages.
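A rule-based sketch of locating and classifying mentions is shown here, assuming a small gazetteer (a list of known names) plus regular expressions for monetary values and percentages. All names in `GAZETTEER`, the patterns, and the `find_entities` function are hypothetical; modern NER systems are usually trained models rather than hand-written rules.

```python
import re

# Hypothetical gazetteer of known entity names and their categories.
GAZETTEER = {
    "Alice Johnson": "PERSON",
    "Acme Corp": "ORGANIZATION",
    "Paris": "LOCATION",
}

# Assumed patterns for two of the numeric categories.
PATTERNS = [
    ("MONEY", re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?")),
    ("PERCENT", re.compile(r"\d+(?:\.\d+)?%")),
]


def find_entities(text):
    """Return (mention, category) pairs in order of appearance."""
    found = []
    for name, label in GAZETTEER.items():
        for match in re.finditer(re.escape(name), text):
            found.append((match.start(), match.group(), label))
    for label, pattern in PATTERNS:
        for match in pattern.finditer(text):
            found.append((match.start(), match.group(), label))
    found.sort()  # order by position in the text
    return [(mention, label) for _, mention, label in found]


text = "Alice Johnson of Acme Corp spent $1,200 in Paris, up 15% from last year."
print(find_entities(text))
# [('Alice Johnson', 'PERSON'), ('Acme Corp', 'ORGANIZATION'), ('$1,200', 'MONEY'), ('Paris', 'LOCATION'), ('15%', 'PERCENT')]
```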
Applications of NLP
- Machine Translation (Google Translate)
- Spam filters
- Sentiment Analysis
- Chatbots
- Natural language generation
- Web Search
Conclusion
Natural language processing has close ties with artificial intelligence: many NLP problems are solved with AI techniques, and NLP in turn lets AI systems work with human language. In conclusion, natural language processing is a field of computer science and AI that focuses mainly on the interaction between computers and humans.