Natural Language Processing

If you've ever used a translation app, had predictive text spell that tricky word for you, or said the words, "Alexa, what's the weather like tomorrow?" then you've enjoyed the products of natural language processing. 


    It's no coincidence that we can now communicate with computers using human language - they were trained that way - and in this article, we're going to find out how. We'll begin by looking at a definition and the history behind natural language processing before moving on to the different types and techniques. Finally, we will look at the social impact natural language processing has had.

    Definition of Natural Language Processing

    Natural language processing (NLP) is a branch of artificial intelligence (AI) concerned with programming computers to 'learn' human languages, that is, to process, interpret, and generate them. The goal of NLP is to create software that understands language as well as we do.

    Natural language processing has roots in linguistics, computer science, and machine learning and has been around for more than 50 years (almost as long as the modern-day computer!).

    Today, we can see the results of NLP in things such as Apple's Siri, Google's suggested search results, and language learning apps like Duolingo.

    Fig 1. We can talk to 'Alexa' because of natural language processing.

    History of Natural Language Processing

    The beginnings of NLP as we know it today arose in the 1940s after the Second World War. The global nature of the war highlighted the importance of understanding multiple languages, and researchers hoped to create a 'computer' that could translate languages for them.

    The creation of such a computer proved difficult, and linguists such as Noam Chomsky identified issues regarding syntax. For example, Chomsky found that some sentences appeared to be grammatically correct, but their content was nonsense (his well-known example is "Colorless green ideas sleep furiously"). He argued that for computers to understand human language, they would need to understand syntactic structures.

    Syntactic structures - In 1957, Noam Chomsky published his highly influential book Syntactic Structures, in which he argued that syntax should be treated separately from semantics and that there must be a formal and standardized approach to analyzing syntax.

    By the 1990s, NLP had come a long way. It now focused more on statistics than linguistics, on 'learning' rather than translating, and made greater use of machine learning algorithms. Using machine learning meant that NLP developed the ability to recognize similar chunks of speech and no longer needed to rely on exact matches of predefined expressions. For example, software using NLP would understand both "What's the weather like?" and "How's the weather?".

    In 2011, Apple released Siri, one of the first widely adopted, publicly available virtual assistants built on NLP.

    How Does Natural Language Processing Work?

    You're probably wondering by now how NLP works - this is where linguistics knowledge will come in handy.

    NLP uses AI to take in real-world human language and perform processing tasks in order to turn the language into a form the computer can work with. There are two parts to this process:

    • Pre-processing (sometimes referred to as data processing) - This involves breaking the language down and converting it into data that an algorithm can work with.

    • Algorithm development - Once the language has been turned into data, an algorithm must be developed to process and use it.
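
The two parts above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular product works: the function names `preprocess` and `cosine_similarity` are hypothetical, and comparing word counts with cosine similarity is just one simple choice of 'algorithm' assumed for the example.

```python
import math
import re
from collections import Counter


def preprocess(text: str) -> Counter:
    """Pre-processing: break text into lowercase word tokens and count them."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Toy 'algorithm': score how similar two pre-processed texts are (0 to 1)."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Example sentences (the banking sentence is only a contrast case).
weather_1 = preprocess("What's the weather like?")
weather_2 = preprocess("How's the weather?")
banking = preprocess("I need to deposit some money.")

similar = cosine_similarity(weather_1, weather_2)
unrelated = cosine_similarity(weather_1, banking)
print(similar > unrelated)
```

Because the two weather questions share several tokens, they score as more similar to each other than to the banking sentence, even though they are not an exact match.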

    Let's look at some of the most common pre-processing techniques now. These techniques are rooted in linguistics and linguistic analysis. We won't be looking at algorithm development today, as this is less related to linguistics.

    Natural Language Processing Techniques

    There are two main pre-processing types: syntactic and semantic analysis. Before we dive into these techniques, let's look at some definitions for these two terms.

    Syntax - The arrangement and order of words within a sentence. In English, the most basic syntax structure is subject-verb-object (SVO).

    Semantics - The branch of linguistics that looks at the meaning, logic, and relationship of and between words.

    Syntactic Analysis

    Syntactic analysis involves looking at the structure of a sentence as a whole, and at how its words relate to each other grammatically, rather than analyzing individual words in isolation. There are several syntactic analysis techniques NLP utilizes.

    Parsing

    Parsing involves breaking a sentence down into each of its constituents. A constituent is a unit of language that serves a function in a sentence; they can be individual words, phrases, or clauses. For example, the sentence "The cat plays the grand piano." comprises two main constituents, the noun phrase (the cat) and the verb phrase (plays the grand piano). The verb phrase can then be further divided into two more constituents, the verb (plays) and the noun phrase (the grand piano).

    Conducting a parsing analysis involves representing each sentence's constituents in a parse tree, like so:

    Fig 2. Example of a parse tree

    Parse trees can show us the relationship between words in a sentence and how they work together to form constituents. For example, we can see that "the grand piano" is a constituent, but "plays the" isn't. This information can be turned into data for an NLP algorithm.
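
To show how a parse tree can be 'turned into data', here is a small sketch that stores the example sentence as nested tuples and lists its constituents. The tree is written by hand rather than produced by a parser, and the labels (S, NP, VP, Det, Adj, N, V) and function names are assumptions chosen for the illustration.

```python
# Each node is (label, child, child, ...); a leaf is (tag, word).
TREE = (
    "S",
    ("NP", ("Det", "The"), ("N", "cat")),
    ("VP",
     ("V", "plays"),
     ("NP", ("Det", "the"), ("Adj", "grand"), ("N", "piano"))),
)


def is_leaf(node) -> bool:
    """A leaf holds exactly one child, and that child is the word itself."""
    return len(node) == 2 and isinstance(node[1], str)


def words(node) -> list:
    """Collect the words under a node, left to right."""
    if is_leaf(node):
        return [node[1]]
    collected = []
    for child in node[1:]:
        collected.extend(words(child))
    return collected


def constituents(node) -> list:
    """Return (label, phrase) for every multi-word node in the tree."""
    if is_leaf(node):
        return []
    found = [(node[0], " ".join(words(node)))]
    for child in node[1:]:
        found.extend(constituents(child))
    return found


phrases = {phrase for _, phrase in constituents(TREE)}
for label, phrase in constituents(TREE):
    print(label, "->", phrase)
```

In this representation "the grand piano" appears as a constituent because it corresponds to a single node, whereas "plays the" does not, since no node covers exactly those two words.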

    Stemming

    Stemming is a morphological process that involves reducing conjugated words back to their root word.

    Conjugation (adj. conjugated) - Inflecting a verb to show different grammatical meanings, such as tense, aspect, and person. Inflecting verbs typically involves adding suffixes to the end of the verb or changing the word's spelling.

    Root word - Walk (verb)

    Conjugations - walking, walked, walks (walker, a noun derived from walk, is a derivation rather than a conjugation)

    Taking each word back to its original form can help NLP algorithms recognize that although the words may be spelled differently, they have the same essential meaning. It also means that only the root words need to be stored in a database, rather than every possible conjugation of every word.
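
As a rough illustration, the sketch below strips a few common verb suffixes. It is a deliberately naive stemmer written for this example (the suffix list and the `naive_stem` name are assumptions), not an established algorithm such as the Porter stemmer, and it will mishandle many words, for instance those that double a consonant, like "running".

```python
# Suffixes to try, longest first. This short list is an assumption for the demo.
SUFFIXES = ("ing", "ed", "s")


def naive_stem(word: str) -> str:
    """Strip one known suffix, keeping at least three letters of the word."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


stems = {form: naive_stem(form) for form in ["walking", "walked", "walks", "walk"]}
print(stems)
```

All four forms map to the same stem, so a program only needs to store and look up "walk".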

    Text Segmentation

    Text segmentation is the process of separating language into meaningful units, such as morphemes (e.g., un-, luck, -y), words, sentences, and paragraphs. It can also involve identifying intent (i.e., what is the purpose of the language? Does it ask a question, provide a statement, or give an order?).
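
A minimal sketch of sentence and word segmentation using regular expressions follows. The splitting rules are simplifying assumptions (they would break on abbreviations such as "e.g."), and the function names are hypothetical.

```python
import re


def split_sentences(text: str) -> list:
    """Split after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def split_words(sentence: str) -> list:
    """Keep runs of letters and apostrophes; drop punctuation."""
    return re.findall(r"[A-Za-z']+", sentence)


text = "What's the weather like? It looks sunny. Bring a hat!"
sentences = split_sentences(text)
tokens = [split_words(sentence) for sentence in sentences]
print(sentences)
print(tokens)
```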

    Semantic Analysis

    Sometimes sentences can follow all the syntactic rules but don't make semantic sense. This is why it's important to also conduct semantic analyses. These help the algorithms understand the tone, purpose, and intended meaning of language.

    Sentiment Analysis

    Sentiment analysis is an NLP technique that aims to understand whether the language is positive, negative, or neutral. It can also determine the tone of language, such as angry or urgent, as well as the intent of the language (i.e., to get a response, to make a complaint, etc.). In its simplest form, sentiment analysis works by finding vocabulary that exists within preexisting lists.

    Adjectives like disappointed, wrong, incorrect, and upset would be picked up in the pre-processing stage and would let the algorithm know that the piece of language (e.g., a review) was negative.
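
The word-list approach can be sketched as follows. The negative list comes from the adjectives above; the positive list, the scoring rule, and the `classify_sentiment` name are assumptions added for the example, and real systems are considerably more sophisticated.

```python
import re

NEGATIVE_WORDS = {"disappointed", "wrong", "incorrect", "upset"}
# Hypothetical positive list, included only so the example has two sides.
POSITIVE_WORDS = {"great", "happy", "excellent", "love"}


def classify_sentiment(text: str) -> str:
    """Count listed words and compare the positive and negative totals."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE_WORDS for word in words)
    negatives = sum(word in NEGATIVE_WORDS for word in words)
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    return "neutral"


review = "I was disappointed; the order was wrong."
print(classify_sentiment(review))
```

A pure word-count approach like this cannot handle negation ("not disappointed") or sarcasm, which is one reason context matters in semantic analysis.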

    Disambiguation

    Word disambiguation is the process of trying to remove lexical ambiguities. A lexical ambiguity occurs when it is unclear which meaning of a word is intended.

    "I'll meet you at the bank."

    The word bank has more than one meaning, so there is an ambiguity as to which meaning is intended here. By looking at the wider context, it might be possible to remove that ambiguity.

    "I need to deposit some money, so I'll meet you at the bank."

    Now we can see that the word bank is referring to a financial establishment and not a river bank or the verb to bank.

    Removing lexical ambiguities helps to ensure the correct semantic meaning is being understood.
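
One simple way to use the wider context is to check which sense's typical vocabulary overlaps with the rest of the sentence. The sketch below does this for the word bank; the cue-word lists and the `disambiguate_bank` name are hypothetical, and the approach is only a much-simplified, overlap-based illustration.

```python
import re

# Hypothetical cue words for two senses of "bank".
SENSE_CUES = {
    "financial institution": {"money", "deposit", "account", "loan", "cash"},
    "river bank": {"river", "water", "fishing", "boat", "shore"},
}


def disambiguate_bank(sentence: str):
    """Return the sense whose cue words overlap most with the sentence,
    or None when the context gives no clue either way."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    scores = {sense: len(words & cues) for sense, cues in SENSE_CUES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(disambiguate_bank("I'll meet you at the bank."))
print(disambiguate_bank("I need to deposit some money, so I'll meet you at the bank."))
```

The first sentence returns no sense because nothing in it points one way or the other, while the second is resolved by the words deposit and money.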

    Natural Language Processing Examples

    Now that we have a good idea of what NLP is and how it works, let's look at some real-world examples of how NLP affects our day-to-day lives.

    Email filters

    If you open up your email and look at the menu, you'll likely find different folders such as "spam" or "social." Emails you've received have been automatically 'filtered' to these folders based on the vocabulary they contain. This is a type of text classification, a close relative of sentiment analysis.

    Predictive text

    One of the most familiar everyday uses of NLP is predictive text. Today, predictive text uses NLP techniques and 'deep learning' to correct the spelling of a word, guess which word you will use next, and make suggestions to improve your writing.
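
The idea of guessing the next word can be illustrated with a very small model that counts which word tends to follow which. Note that this sketch uses simple bigram counting rather than the deep learning mentioned above, and the three-sentence corpus and function names are invented for the example.

```python
from collections import Counter, defaultdict

# Tiny made-up message history standing in for a user's past texts.
CORPUS = [
    "see you at the bank",
    "see you at the park",
    "meet me at the bank",
]


def train_bigrams(sentences) -> dict:
    """For each word, count the words that directly follow it."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def suggest(following: dict, word: str, k: int = 3) -> list:
    """Return up to k most frequent next words seen after `word`."""
    counts = following.get(word.lower(), Counter())
    return [candidate for candidate, _ in counts.most_common(k)]


model = train_bigrams(CORPUS)
print(suggest(model, "the"))
```

Because the suggestions come from the text the model was trained on, a model trained on your own messages would produce suggestions tailored to you.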

    Activity: Try sending a message using only predictive text. It's possible to create a whole message only using the suggested words proposed by predictive text. Thanks to NLP, these words will be unique and tailored to you and can create some very funny (and revealing) messages!

    Language apps

    Natural language processing has made huge improvements to language translation apps. It can help ensure that the translation makes syntactic and grammatical sense in the new language rather than simply directly translating individual words.

    Fig 3. Language translation as we know it today wouldn't be possible without NLP.

    The Social Impact of Natural Language Processing

    In 2016, the researchers Hovy & Spruit published a paper discussing the social and ethical implications of NLP. In it, they highlight how, up until recently, it hadn't been deemed necessary to discuss the ethical considerations of NLP; this was mainly because conducting NLP doesn't involve human participants. However, researchers are becoming increasingly aware of the social impact the products of NLP can have on people and society as a whole.

    Here are some of the main issues they identified:

    • Exclusion - NLP may learn mainly from dominant cultures, making its products easier to use and more appropriate for people from those cultures than for everyone else.

    • Overgeneralization - NLP may lead to software making widespread assumptions about things like our gender, age, religion, and sexual orientation.

    • Bias - Most NLP tools focus on English and can therefore produce richer data for English speakers than for others.¹

    Natural Language Processing - Key takeaways

    • Natural language processing (NLP) is a branch of artificial intelligence (AI) concerned with programming computers to 'learn' human languages.
    • Natural language processing has roots in linguistics, computer science, and machine learning.
    • NLP uses AI to take in real-world human language and perform processing tasks to turn the language into a form the computer can work with. There are two parts to this process: pre-processing and algorithm development.
    • Pre-processing involves categorizing language into data an algorithm can work with. Common pre-processing techniques include syntactic analysis (e.g., parsing, stemming, and text segmentation), and semantic analysis (e.g., sentiment analysis and disambiguation).
    • We can see examples of NLP in predictive text, email filters, language learning apps, virtual assistants (e.g., Siri), and more.

    References

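    1. D. Hovy & S. L. Spruit (2016). The Social Impact of Natural Language Processing. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers).
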
    Frequently Asked Questions about Natural Language Processing

    What is natural language processing?

    Natural language processing (NLP) is a branch of artificial intelligence (AI) concerned with programming computers to "learn" human languages. The goal of NLP is to create software that understands language as well as we do.

    What is natural language processing used for?

    The main goal of natural language processing is for computers to understand human language as well as we do. It is used in software such as predictive text, virtual assistants, email filters, automated customer service, language translation, and more.

    How many phases are in natural language processing?

    There are two main phases in natural language processing: pre-processing and algorithm development.

    What are the main challenges of natural language processing?

    Challenges include:

    • Spelling mistakes 
    • Strong accents
    • Lexical ambiguities 
    • Unclear intent 
    • Ethical considerations

    What steps are involved in natural language processing?

    There are many different ways to analyze language for natural language processing. Some techniques include syntactical analyses like parsing and stemming or semantic analyses like sentiment analysis. 

    StudySmarter Editorial Team

    Team English Teachers

    • 10 minutes reading time
    • Checked by StudySmarter Editorial Team