From http://www.ted.com: Have you played with Google Labs' Ngram Viewer? It's an addictive tool that lets you search for words and ideas in a database of 5 million books from across centuries. Erez Lieberman Aiden and Jean-Baptiste Michel show us how it works, and a few of the surprising things we can learn from 500 billion words.
An n-gram is a contiguous subsequence of n items from a given sequence. The items in question can be phonemes, syllables, letters, words, or base pairs, depending on the application.
An n-gram of size 1 is referred to as a "unigram"; size 2 is a "bigram" (or, less commonly, a "digram"); size 3 is a "trigram"; size 4 is a "four-gram" and size 5 or more is simply called an "n-gram". Some language models built from n-grams are "(n − 1)-order Markov models".
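To make the terminology concrete, here is a minimal Python sketch (the function name `ngrams` is my own) that extracts the contiguous n-item subsequences from a sequence of words:

```python
def ngrams(sequence, n):
    """Return all contiguous n-item subsequences of `sequence`."""
    return [tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]

words = "to be or not to be".split()
unigrams = ngrams(words, 1)  # [('to',), ('be',), ('or',), ...]
bigrams = ngrams(words, 2)   # [('to', 'be'), ('be', 'or'), ...]
trigrams = ngrams(words, 3)  # [('to', 'be', 'or'), ('be', 'or', 'not'), ...]
```

The same function works for letters or any other item type, since it only relies on slicing.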
An n-gram model is a type of probabilistic model for predicting the next item in such a sequence. n-gram models are used in various areas of statistical natural language processing and genetic sequence analysis.
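As an illustration of such a predictive model, the following sketch builds a bigram (order-1 Markov) model from word-pair counts and predicts the most likely next word; the helper names `train_bigram_model` and `predict_next` are hypothetical, not from any particular library:

```python
from collections import Counter, defaultdict

def train_bigram_model(words):
    """Count successor frequencies: P(next | current) is proportional
    to count(current, next)."""
    counts = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        counts[current][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent successor of `word`, or None if unseen."""
    if word not in counts:
        return None
    return counts[word].most_common(1)[0][0]

corpus = "to be or not to be that is the question".split()
model = train_bigram_model(corpus)
print(predict_next(model, "to"))  # prints "be"
```

A real language model would smooth these counts to assign nonzero probability to unseen pairs, but the counting scheme is the core idea.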
Link to Google Ngram Viewer : http://books.google.com/ngrams/graph?content=Red%2CBlue%2C&year_start=1800&year_end=2008&corpus=0&smoothing=3