N-grams are essentially sets of co-occurring words within a given window. They have many uses in NLP and text mining, from summarization to feature extraction in supervised machine learning tasks. Here is a basic tutorial on what n-grams are. In this article, we will focus on a Web API for generating n-grams, which is available on Mashape: https://www.mashape.com/rxnlp/text-mining-and-nlp/.
These are the input parameters:
- text: the text for n-gram generation
- case-sensitive: true/false
- n-gram size
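To make the three parameters concrete, here is a minimal local sketch in Python. It is not the API's implementation: the function name `word_ngrams`, its argument names, and the whitespace tokenization are assumptions chosen for illustration only.

```python
def word_ngrams(text, n, case_sensitive=False):
    """Return the word n-grams of `text` as a list of strings.

    Hypothetical helper mirroring the three inputs described above:
    the text, a case-sensitivity flag, and the n-gram size.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not case_sensitive:
        text = text.lower()
    tokens = text.split()  # assumption: plain whitespace tokenization
    # Slide a window of n tokens across the token list.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


bigrams = word_ngrams("I love rainy days", 2)
```

With the case-sensitive flag off, "I love" and "i love" collapse into the same n-gram; with it on, they stay distinct.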
Below is a sample output from the API for the text "I love rainy days. How I wish it was raining ! How I wish it was snowing !". The output lists the n-grams together with their counts, in descending order of count. Note that the API also performs basic sentence splitting on the text, so that n-grams are computed within sentences. Also note that this tool is language-neutral, so you can generate n-grams for multiple languages. To make sure your sentences are split correctly, you would probably need to ensure that there is punctuation between any two consecutive sentences.
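The behaviour described above (split into sentences, generate n-grams per sentence, then report counts in descending order) can be approximated locally. The sketch below is only an illustration under stated assumptions: the API's actual sentence splitter and tokenizer are not documented here, so splitting on `.`, `!` and `?` and tokenizing with `\w+` are guesses, and the names `split_sentences` and `count_ngrams` are hypothetical.

```python
import re
from collections import Counter


def split_sentences(text):
    # Assumption: a sentence ends at one or more of '.', '!' or '?'.
    parts = re.split(r"[.!?]+", text)
    return [p.strip() for p in parts if p.strip()]


def count_ngrams(text, n, case_sensitive=False):
    """Return (n-gram, count) pairs sorted by count, highest first."""
    if not case_sensitive:
        text = text.lower()
    counts = Counter()
    for sentence in split_sentences(text):
        # \w+ matches Unicode word characters, so this is not tied to English.
        tokens = re.findall(r"\w+", sentence)
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts.most_common()


sample = "I love rainy days. How I wish it was raining ! How I wish it was snowing !"
for ngram, count in count_ngrams(sample, 2):
    print(count, ngram)
```

Because n-grams are generated per sentence, no bigram spans a sentence boundary: "days how" never appears, while "how i", "i wish", "wish it" and "it was" each occur twice. This is also why missing punctuation between sentences would produce spurious n-grams.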