yii-word-frequency

A class for accumulating tokens from various text sources, filtering them, and counting their frequency.

Potential Usage

  • Generating Tag Clouds
  • Determining the top keywords in a set of texts (e.g. the most frequently mentioned keywords in a set of posts or comments)
  • Determining the frequency of a specified set of tokens (e.g. a sports blog showing which players from a specific team are mentioned in a post)

Requirements

Yii 1.1.14. The extension was developed with this version; earlier versions are expected to work, but none has been tested.

Installation

Place the contents of the 'yii-word-frequency' extension (extracted from the zip file or obtained via GitHub) into the extensions folder, under extensions/yii-word-frequency.

Add ext.yii-word-frequency.* to the import array in config/main.php:

// preloading 'log' component
'preload'=>array('log'),

// autoloading model and component classes
'import'=>array(
    'application.models.*',
    'application.components.*',
    'ext.yii-word-frequency.*', 
),

Usage

A minimalistic example

$ywf = Yii::createComponent(array('class' => 'YiiWordFrequency'));
$ywf->sourceList = 'This is a test string. This is another test string. Test strings are fun.';
$ywf->accumulateSources();
$frequencyList = $ywf->generateList();
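
To give a rough idea of the kind of result to expect, the following plain-PHP sketch counts the tokens of the same string using only built-in functions. It is not the extension's implementation, and the structure and ordering of the list returned by generateList() may well differ; the helper name countTokens is hypothetical.

```php
<?php
// Hypothetical helper for illustration only; not part of yii-word-frequency.
function countTokens($text)
{
    // Lower-case the text and split it into an array of words.
    $tokens = str_word_count(strtolower($text), 1);
    // Count how often each token occurs (token => count).
    $counts = array_count_values($tokens);
    // Sort by count, highest first, keeping the token keys.
    arsort($counts);
    return $counts;
}

$counts = countTokens('This is a test string. This is another test string. Test strings are fun.');
```

In this sketch "test" and "Test" are counted together because the text is lower-cased first, while "string" and "strings" remain separate tokens.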

An example using Active Records

The frequency list can be obtained from the return value of generateList(), as above, or via the tagFrequencyList property:

$ywf = Yii::createComponent(array('class' => 'YiiWordFrequency'));
$model = new Testdata;
$criteria=new CDbCriteria();
$criteria->addInCondition('id',array(1,2)); 
$criteria->select = "col1, col2, col3";
$ywf->sourceList = array(array($model, $criteria));
$ywf->accumulateSources();
$ywf->generateList();
$frequencyList = $ywf->tagFrequencyList;

An example using multiple sources

$ywf = Yii::createComponent(array('class' => 'YiiWordFrequency'));
$model = new Testdata; // Active Record Model
$criteria=new CDbCriteria(); // Criteria object for determining columns and record sources
$criteria->addInCondition('id',array(1)); 
$criteria->select = "col1";
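// $this->inputFixture presumably holds text sources defined in the calling class
// (e.g. a test case); substitute your own sources here
$ywf->sourceList = array(
    $this->inputFixture[0],
    $this->inputFixture[1],
    array($model, $criteria),
);
$ywf->accumulateSources();
$ywf->generateList();
$frequencyList = $ywf->tagFrequencyList;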

An example using a blacklist and method chaining

$ywf = Yii::createComponent(array('class' => 'YiiWordFrequency'));
$ywf->sourceList = 'This is a test string. This is another test string. Test strings are fun.';
$ywf->blackList = array('this', 'is');
$ywf->accumulateSources()->runBlackListFilter()->generateList();
$frequencyList = $ywf->tagFrequencyList;
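
Conceptually, a blacklist filter drops every token that appears in the blacklist before the counts are produced. The plain-PHP sketch below illustrates that idea only; it is not how runBlackListFilter() is implemented. The helper name removeBlacklisted is hypothetical, and the case-insensitive comparison is an assumption made for this example.

```php
<?php
// Hypothetical helper for illustration only; not part of yii-word-frequency.
function removeBlacklisted(array $tokens, array $blackList)
{
    // Assumption for this sketch: compare case-insensitively.
    $blocked = array_map('strtolower', $blackList);
    $kept = array();
    foreach ($tokens as $token) {
        // Keep the token only if its lower-cased form is not blacklisted.
        if (!in_array(strtolower($token), $blocked, true)) {
            $kept[] = $token;
        }
    }
    return $kept;
}

$tokens = str_word_count('This is a test string. This is another test string.', 1);
$kept = removeBlacklisted($tokens, array('this', 'is'));
```

A whitelist filter would be the mirror image: keep a token only when it does appear in the list.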

An example using a whitelist as well as configuration at object creation

$this->ywf = Yii::createComponent(array(
      'class' => 'YiiWordFrequency',
      'sourceList'=> array($this->inputFixture[0]),
      'whiteListFile' => array('../tests/fixtures/whiteList_test.txt'),
      'whiteListCaseSensitive' => true,
   )
);
$this->ywf->accumulateSources()->runWhiteListFilter()->generateList();
print_r($this->ywf->tagFrequencyList);

Motivation

The initial motivation for this class was to generate data for a tag cloud that could be displayed with the yiitagcloud widget.

See the inline documentation for more information.

Total 2 comments

#16471
apotter at 2014/02/26 11:53am
This extension will not work with PHP >= 5.4.

A fix is coming soon (by the way, it is already available on the GitHub project site).

#16407
apotter at 2014/02/20 09:36am
What would be useful additional features for this class?

Things I've considered:

  • Retrieval of blacklists, whitelists and substitution lists from a database
  • Behaviors for creating an Active Record property containing a token list based on other properties of the record

Other suggestions welcome ...
