NLP with Python

https://www.nltk.org/book/

NLTK

Contents

  • 0 Preface
  • 1 Language Processing and Python
  • 2 Accessing Text Corpora and Lexical Resources
  • 3 Processing Raw Text
  • 4 Writing Structured Programs
  • 5 Categorizing and Tagging Words
  • 6 Learning to Classify Text
  • 7 Extracting Information from Text
  • 8 Analyzing Sentence Structure
  • 9 Building Feature Based Grammars
  • 10 Analyzing the Meaning of Sentences
  • 11 Managing Linguistic Data
  • 12 Afterword: Facing the Language Challenge

2 Accessing Text Corpora and Lexical Resources

from nltk.corpus import gutenberg, webtext, nps_chat, brown, reuters, inaugural

2.1 Corpora

  • Gutenberg Corpus: gutenberg
  • Web and Chat Text: webtext, nps_chat
  • Brown Corpus: brown
  • Reuters Corpus: reuters
  • Inaugural Address Corpus: inaugural
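
A minimal sketch of how the corpus readers listed above are typically queried. It assumes only the NLTK reader methods raw(), words() and sents(); the helper name corpus_stats is hypothetical, and the demo at the bottom assumes the Gutenberg data has already been fetched with nltk.download('gutenberg').

```python
def corpus_stats(reader, fileid):
    """Return (avg word length, avg sentence length, lexical diversity) for one file.

    `reader` is assumed to be an NLTK-style corpus reader exposing raw(),
    words() and sents(). `corpus_stats` is a hypothetical helper, not an NLTK API.
    """
    num_chars = len(reader.raw(fileid))      # raw text length, whitespace included
    words = reader.words(fileid)
    num_words = len(words)
    num_sents = len(reader.sents(fileid))
    num_vocab = len(set(w.lower() for w in words))
    return (
        round(num_chars / num_words),   # characters per token
        round(num_words / num_sents),   # tokens per sentence
        round(num_words / num_vocab),   # average uses of each vocabulary item
    )


if __name__ == "__main__":
    try:
        from nltk.corpus import gutenberg
        for fileid in gutenberg.fileids():
            print(fileid, corpus_stats(gutenberg, fileid))
    except (ImportError, LookupError):
        print("Install NLTK and run nltk.download('gutenberg') first.")
```

The same function should work with the other readers (webtext, brown, reuters, inaugural), since they expose the same three methods.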

2.1.6 Annotated Text Corpora

2.1.7 Corpora in Other Languages

2.4 Lexical Resources

2.4.1 Wordlist Corpora

nltk.corpus.words.words()
nltk.corpus.stopwords.words('english')
nltk.corpus.names.words()
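
One common use of these wordlists is filtering a text: finding tokens missing from the English wordlist, or measuring how much of a text is not stopwords. The sketch below shows both. The helper names unusual_words and content_fraction are illustrative, and the demo assumes the words, stopwords and gutenberg data are installed.

```python
def unusual_words(text_words, vocabulary):
    """Return sorted alphabetic words from the text that are not in the vocabulary."""
    text_vocab = set(w.lower() for w in text_words if w.isalpha())
    known = set(w.lower() for w in vocabulary)
    return sorted(text_vocab - known)


def content_fraction(text_words, stopwords):
    """Return the fraction of tokens that are not stopwords."""
    stop = set(stopwords)
    content = [w for w in text_words if w.lower() not in stop]
    return len(content) / len(text_words)


if __name__ == "__main__":
    try:
        import nltk
        english = nltk.corpus.words.words()
        stop = nltk.corpus.stopwords.words('english')
        sample = nltk.corpus.gutenberg.words('austen-sense.txt')
        print(unusual_words(sample, english)[:10])
        print(content_fraction(sample, stop))
    except (ImportError, LookupError):
        print("Install NLTK and download 'words', 'stopwords' and 'gutenberg' first.")
```

Both helpers accept plain lists, so they can be tried on a handful of tokens without any corpus data.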

2.4.2 A Pronouncing Dictionary
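
The pronouncing dictionary is available as nltk.corpus.cmudict. Its entries() method yields (word, phoneme list) pairs, with stress marked by a trailing digit on vowel phonemes. A small sketch of two typical queries follows; the helper names are hypothetical, and the demo assumes nltk.download('cmudict') has been run.

```python
def stress_pattern(pron):
    """Return the stress digits ('0', '1', '2') of one pronunciation, in order."""
    return [char for phone in pron for char in phone if char.isdigit()]


def words_ending_with(entries, ending):
    """Return sorted unique words whose pronunciation ends with the given phonemes.

    `entries` is an iterable of (word, phoneme_list) pairs, the shape
    returned by cmudict.entries(); `ending` is a non-empty phoneme list.
    """
    n = len(ending)
    return sorted({word for word, pron in entries if pron[-n:] == ending})


if __name__ == "__main__":
    try:
        from nltk.corpus import cmudict
        entries = cmudict.entries()
        print(len(entries))
        print(words_ending_with(entries, ["N", "IH0", "K", "S"])[:10])
    except (ImportError, LookupError):
        print("Install NLTK and run nltk.download('cmudict') first.")
```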

2.5 WordNet

6 Learning to Classify Text

6.6 Maximum Entropy Classifiers

7 Extracting Information from Text

7.2 Chunking
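
A minimal sketch of noun-phrase chunking with nltk.RegexpParser. The one-rule grammar and the helper name np_chunks are illustrative choices, not the only way to chunk. The input is a sentence that is already POS-tagged, so no corpus or tagger data is needed.

```python
import nltk

# Illustrative one-rule grammar: optional determiner, any adjectives, then a noun.
NP_GRAMMAR = "NP: {<DT>?<JJ>*<NN>}"


def np_chunks(tagged_sentence, grammar=NP_GRAMMAR):
    """Return the NP chunks of a POS-tagged sentence as plain strings."""
    parser = nltk.RegexpParser(grammar)
    tree = parser.parse(tagged_sentence)
    return [
        " ".join(word for word, tag in subtree.leaves())
        for subtree in tree.subtrees(filter=lambda t: t.label() == "NP")
    ]


sentence = [
    ("the", "DT"), ("little", "JJ"), ("yellow", "JJ"), ("dog", "NN"),
    ("barked", "VBD"), ("at", "IN"), ("the", "DT"), ("cat", "NN"),
]
print(np_chunks(sentence))  # ['the little yellow dog', 'the cat']
```

Each rule is a regular expression over tags rather than words, so the quality of the chunks depends on the quality of the tagging that precedes it.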