Difference between #13 and #12 of Drills : Search by a HAS_MANY relation

unchanged
Title
Drills : Search by a HAS_MANY relation
unchanged
Category
Tutorials
unchanged
Tags
has_many, search, CActiveRecord
changed
Content
Sometimes we get lost trying to search by a HAS_MANY relation using
CActiveRecord or CActiveDataProvider. This article is a series of drills that
try to describe the practical techniques of searching by a HAS_MANY relation.

## Relation
Two entities are sometimes connected with a relation of **1:N**. Or we may say
that 1:N is the only possible relation between 2 entities as long as we are in
the RDB world. **1:1** relation is just a particular kind of 1:N where N is
always assumed to be 1 at the maximum. And **N:N** relation can be considered as
a combination of two 1:N relations.

Yii supports this 1:N relation in CActiveRecord as **BELONGS_TO** and
**HAS_MANY** relations. 1:N relation seen from the side of N is BELONG_TO, and
from the side of 1 it is HAS_MANY.

BELONGS_TO relation is fairly easy. There's not so much to talk about it. But
HAS_MANY can be very tough sometimes.

Now, let's construct an example of 1:N relation.

## Example of HAS_MANY

Think about blog posts and their authors. It's a relation of "An Author has
many Posts" and "A Post belongs to an Author" at the same time.

~~~
[php]
/**
 * Author model
 * @property integer $id
 * @property integer $name Author's name
...
 */
class Author extends CActiveRecord
{
	...
	public function relations()
	{
		return array(
			'posts' => array(self::HAS_MANY, 'Post', 'author_id');
		);
	}
	...
~~~

~~~
[php]
/**
 * Post model
 * @property integer $id
 * @property integer $author_id FK to Author's id
 * @property integer $title Title of Post
...
 */
class Post extends CActiveRecord
{
	...
	public function relations()
	{
		return array(
			'author' => array(self::BELONGS_TO, 'Author', 'author_id');
		);
	}
	...
~~~

We are going to solve the possible use cases of search regarding this example.

## Task #1

**Show all posts that has a word in post title**

I want to start with an easy one. Let's retrieve all the posts that has a
cirtain word in their title.

~~~
[php]
public static function GetPostsByTitle($searchWord)
{
	// query criteria
	$criteria = new CDbCriteria();
	// compare title
	$criteria->compare('title', $searchWord, true);
	// find
	$posts = Post::model()->findAll($criteria);
	// show
	foreach($posts as $post)
	{
		echo "Title = " . $post->title . "\n";
	}
}
~~~

Ah, it was too easy. And no relation is involved in this task.

OK, let's move on to the next.

## Task #2

**Show all posts with their authors that has a certain word in post title**

Now we have to retrieve the authors' name, too.

~~~
[php]
public static function GetPostsWithAuthorByTitle($searchWord)
{
	// query criteria
	$criteria = new CDbCriteria();
	// with Author model
	$criteria->with = array('author');
	// compare title
	$criteria->compare('t.title', $searchWord, true);
	// find all posts
	$posts = Post::model()->findAll($criteria);
	// show all posts
	foreach($posts as $post)
	{
		echo "Title = " . $post->title . "\n";
		echo "Author = " . $post->author->name . "\n";
	}
}
~~~

Well, you may say "You don't have to do it. Just add a line to echo
`$post->author->name` to the code of the 1st task."

Yes, you are right. Post model has `author` relation, so all you have to do to
retrieve its Author is just **$post->author**. How convenient is it! It's
so-called **lazy loading** approach.

But the lazy loading approach has a drawback in this case, because you have to
execute one query for retrieving an entire array of Posts plus every one query
per each Post for retrieving its author. It will end up in 1+N queries to be
executed.

Here we are retrieveing Posts and their Authors at the same time by specifying
**CDbCriteria::with**. It enables you to do the job with a single query that
joins the related tables. This is so-called **eager loading** approach.

The eager loading is preferrable in this particular case, because it's more
effective. But, you have to note, it is not always so.

Please note that you have to disambiguate the column name with table aliases
when you have joined the tables using **with**. We use the fixed alias of **t**
for the main table. And the alias for the related table is usually the same as
the relation name.

So you can write like this when you want to search by Author's name.

~~~
[php]
	...
	// compare Author's name
	$criteria->compare('author.name', $searchName, true);
	...
~~~

Quite simple, isn't it? I do love CActiveRecord of Yii. It's quite convenient.

Huh? Who said "It's just a **BELONGS_TO** relation. Yeah, it's simple. I
know."?

Yes, you are right, definitely. Let's move on to the next task where we will
deal with **HAS_MANY**.

## Task #3

**Show all authors who has at least one post that has a certain word in post
title**

Now we will retrieve the authors, not the posts.

~~~
[php]
public static function GetAuthorsByPostTitle($searchWord)
{
	// query criteria
	$criteria = new CDbCriteria();
	// with Post model
	$criteria->with = array('posts');
	// compare title
	$criteria->compare('posts.title', $searchWord, true);
	// find all authors
	$authors = Author::model()->findAll($criteria);
	// show all authors
	foreach($authors as $author)
	{
		echo "Author = " . $author->name . "\n";
	}
}
~~~

Umm, is it OK? Seems OK, but looks too simple. We don't care. Let's move on.

## Task #4

**Show all authors with his/her all posts who has at least one post that has a
certain word in post title**

Hmmm. Just a small addition to the 3rd task. Something like this?

### Wrong Answer

~~~
[php]
public static function GetAuthorsWithPostsByPostTitle($searchWord)
{
	// query criteria
	$criteria = new CDbCriteria();
	// with Post model
	$criteria->with = array('posts');
	// compare title
	$criteria->compare('posts.title', $searchWord, true);
	// find all authors
	$authors = Author::model()->findAll($criteria);
	// show all authors and his/her posts
	foreach($authors as $author)
	{
		echo "Author = " . $author->name . "\n";
		foreach($author->posts as $post)
		{
			echo "Post = " . $post->title . "\n";
		}
	}
}
~~~

BOOP!! You've got hooked.

Why?

If the task had been "Show all authors **with his/her relevant posts** who
has at least one post that has a certain word in post title", then it would
have been the answer. But the task is "**with his/her all posts**".
Your answer doesn't show the posts which don't have that certain word in their
titles.

Because you are comparing the post titles, the posts without the certain word in
their titles are omitted from the query results.

We want to do the **lazy loading** of the posts in order to retrieve all the
posts, but at the same time we need to join the post table using **with** in
order to compare the post title.

How to solve this dilemma?

### Answer

~~~
[php]
public static function GetAuthorsWithPostsByPostTitle($searchWord)
{
	// query criteria
	$criteria = new CDbCriteria();
	// join Post model (without selecting)
	$criteria->with = array(
		'posts' => array(
			'select' => false,
		),
	);
	// compare title
	$criteria->compare('posts.title', $searchWord, true);
	// find all authors
	$authors = Author::model()->findAll($criteria);
	// show all authors and his/her posts
	foreach($authors as $author)
	{
		echo "Author = " . $author->name . "\n";
		foreach($author->posts as $post)
		{
			echo "Post = " . $post->title . "\n";
		}
	}
}
~~~

You can join the table without fetching its data by specifying **select**
property to **false** in the definition of the relation.

Now you can do the lazy loading of Posts by this trick.

Please take note that the definition of the relation can be dynamically changed
on the fly as you see in the code above. It overrides the definition of the
relation declared in relations() method.

Now, let's move on to the next one. I tell you, it is a difficult one, in fact.

## Task #5

**Show top 5 authors in the order of name with his/her all posts who has at
least one post that has a certain word in post title**

OK, so we need to add "LIMIT" and "ORDER". Going to try with
this one.

### Trial #1

~~~
[php]
public static function GetTop5AuthorsWithPostsByPostTitle($searchWord)
{
	// query criteria
	$criteria = new CDbCriteria();
	// with Post model
	$criteria->with = array('posts');
	// compare title
	$criteria->compare('posts.title', $searchWord, true);
	// order by author name
	$criteria->order = 't.name ASC';
	// limit to 5 authors
	$criteria->limit = 5;
	// find all authors
	$authors = Author::model()->findAll($criteria);
	// show all authors and his/her posts
	foreach($authors as $author)
	{
		echo "Author = " . $author->name . "\n";
		foreach($author->posts as $post)
		{
			echo "Post = " . $post->title . "\n";
		}
	}
}
~~~

But this will end in an error as long as `$searchWord` is not empty. Yii will
say "There's no column like **posts.title**". Huh? We have set 'posts'
to 'with'! Why is it?

The guide says:

> Info|Guide : By default, Yii uses eager loading, i.e., generating a single
SQL statement, except when LIMIT is applied to the primary model.

It means that if LIMIT is applied to the primary model, then lazy loading will
be used. This rule has been applied to our code above, and the query was
executed without joining the author table. So, what to do then?

The guide proceeds to say:

> Info|Guide : We can set the together option in the relation declarations to
be true to force a single SQL statement even when LIMIT is used.

OK. So we will modify the code to:

### Trial #2

~~~
[php]
	...
	// force to join Post
	$criteria->with = array(
		'posts' => array(
			'together' => true,
		),
	);
	...
~~~

What about this? Seems OK. You will not see the error anymore.

But, alas, you will get the strange output like this:

~~~
[search word = foo]
Author = Andy
	Post = Don't use foo
	Post = Use yoo for foo
Author = Ben
	Post = foo is great
	Post = I love foo
Author = Charlie
	Post = What's foo?
[end]
~~~

We want to show 5 authors, but the list ends where the count of posts sums up to
5.

OK. Then we will use **'select' => false** trick again.

### Trial #3

~~~
[php]
	...
	// force to join Post (without selecting)
	$criteria->with = array(
		'posts' => array(
			'together' => true,
			'select' => false,
		),
	);
	...
~~~

But it still doesn't work. It will show the results like this:

~~~
[search word = foo]
Author = Andy
	Post = Don't use foo
	Post = Use yoo for foo
	Post = Don't use bar
	Post = Use yar for bar
Author = Ben
	Post = foo is great
	Post = I love foo
	Post = I also love bar
Author = Charlie
	Post = What's foo?
	Post = What's bar?
[end]
~~~

It is because LIMIT is not applied to the primary table, but to the virtually
constructed table that is the result of joining. It's no use complaining about
this behavior, because that's how the query works in RDB.

But we won't give up. Let's try grouping then.

### Trial #4

~~~
[php]
	...
	// force to join Post (without selecting)
	$criteria->with = array(
		'posts' => array(
			'together' => true,
			'select' => false,
		),
	);
	...
	// group by Author's id
	$criteria->group = 't.id';
	...
~~~

Wow, great! It works!

To sum it up:

### Answer

~~~
[php]
public static function GetTop5AuthorsWithPostsByPostTitle($searchWord)
{
	// query criteria
	$criteria = new CDbCriteria();
	// force to join Post (without selecting)
	$criteria->with = array(
		'posts' => array(
			'together' => true,
			'select' => false,
		),
	);
	// compare title
	$criteria->compare('posts.title', $searchWord, true);
	// group by Author's id
	$criteria->group = 't.id';
	// order by author name
	$criteria->order = 't.name ASC';
	// limit to 5 authors
	$criteria->limit = 5;
	// find all authors
	$authors = Author::model()->findAll($criteria);
	// show all authors and his/her posts
	foreach($authors as $author)
	{
		echo "Author = " . $author->name . "\n";
		foreach($author->posts as $post)
		{
			echo "Post = " . $post->title . "\n";
		}
	}
}
~~~

Now, here is the last task.

## Task #6

**Show top 5 authors in the order of name with his/her relevant posts who has at
least one post that has a certain word in post title**

Probably the filtering in lazy loading should be the only answer. I cant't think
of another solution. 

### Example of Answer

~~~
[php]
public static function GetTop5AuthorsWithPostsByPostTitle($searchWord)
{
	// query criteria
	$criteria = new CDbCriteria();
	// force to join Post (without selecting)
	$criteria->with = array(
		'posts' => array(
			'together' => true,
			'select' => false,
		),
	);
	// compare title
	$criteria->compare('posts.title', $searchWord, true);
	// group by Author's id
	$criteria->group = 't.id';
	// order by author name
	$criteria->order = 't.name ASC';
	// limit to 5 authors
	$criteria->limit = 5;
	// find all authors
	$authors = Author::model()->findAll($criteria);
	// show all authors and his/her posts
	foreach($authors as $author)
	{
		echo "Author = " . $author->name . "\n";
		// lazy loading posts with filtering
		$filteredPosts = $author->posts(
			array(
				'condition' => 'title LIKE :search_word',
				'params' => array(
					':search_word' => '%' . $searchWord . '%',
				),
			)
		);
		foreach($filteredPosts as $post)
		{
			echo "Post = " . $post->title . "\n";
		}
	}
}
~~~

It might not looks very elegant, but I think it's a decent solution without any
hacky tricks.

The key point of the answer is that you can dynamically define the query options
of the relation in lazy loading.

The guide says:

> Info|Guide : Dynamic query options can also be used when using the lazy
loading approach to perform relational query. To do so, we should call a method
whose name is the same as the relation name and pass the dynamic query options
as the method parameter. 

## Conclusion and Notice

Well, we've managed to accomplish all the tasks without getting lost.

It has been a little surprise to me, because I didn't expect it. Originally, my
intention for this article was just to show the dilemma in the search by a
HAS_MANY and the reason for it.

CActiveRecord of Yii has been more powerful than I have imagined.

### CActiveDataProvider

Although we didn't take up CActiveDataProvider here, the basic concepts we
discussed are also applicable to it. You should note that LIMIT is usually there
to confuse you, because CActiveDataProvider is normally used with CPagination.

### The Guide

"The guide" in this article refers to "The Definitive Guide to
Yii", particularly the section of "Relational Active Record".

[The Definitive Guide to Yii : Relational Active
Record](http://www.yiiframework.com/doc/guide/1.1/en/database.arr)

It's not quite easy to understand all the content of it, esspecially when you
are new to Yii. But you **MUST** read it. I can also recommend to those people
who feel at home with Yii to read it over once in a while. It's worth the time.

## UPDATE: Using a dedicated relation for searching

So, this is **a very important** update to this article.

By specifying a dedicated relation for searching, we can join the same table
independently for filtering and fetching data.

For instance, the answer to the task #4 could be written like the following:

### Optimized Answer to Task #4

~~~
[php]
public function relations()
{
	return array(
		'posts' => array(self::HAS_MANY, 'Post', 'author_id'),
		'posts_search' => array(self::HAS_MANY, 'Post', 'author_id'),
	);
}

public static function GetAuthorsWithPostsByPostTitle($searchWord)
{
	// query criteria
	$criteria = new CDbCriteria();
	// join Post model (one for fetching data, the other for filtering)
	$criteria->with = array(
		'posts' => array(  // this is for fetching data
			'together' => false,
		),
		'posts_search' => array(  // this is for filtering
			'select' => false,
			'together' => true,
		),
	);
	// compare title
	$criteria->compare('posts_search.title', $searchWord, true);
	// find all authors with his/her posts
	$authors = Author::model()->findAll($criteria);
	// show all authors and his/her posts
	foreach($authors as $author)
	{
		echo "Author = " . $author->name . "\n";
		foreach($author->posts as $post)  // no queries executed here !!
		{
			echo "Post = " . $post->title . "\n";
		}
	}
}
~~~

The difference between the original answer and the optimized one lies in the
**performance**.

There were **1 + N** queries executed in the original answer, because every
`$author->posts` would trigger a query to fetch the posts.

But there are only **2** queries here. There's no query executed for each
`$author->posts`, because all the posts have been fetched in `findAll`. Yes,
findAll executes 2 queries: one for the relations with 'together', and the other
for those without 'together'. (For filtering the main table in the
2nd query, Yii will use the primary keys fetched in the 1st query in `IN`
condition.)

Likewise, the answers to the task #5 and #6 could be optimized like the
following:

### Optimized Answer to Task #5

~~~
[php]
public static function GetTop5AuthorsWithPostsByPostTitle($searchWord)
{
	// query criteria
	$criteria = new CDbCriteria();
	// join Post model (one for fetching data, the other for filtering)
	$criteria->with = array(
		'posts' => array(  // this is for fetching data
			'together' => false,
		),
		'posts_search' => array(  // this is for filtering
			'select' => false,
			'together' => true,
		),
	);
	// compare title
	$criteria->compare('posts_search.title', $searchWord, true);
	// group by Author's id
	$criteria->group = 't.id';
	// order by author name
	$criteria->order = 't.name ASC';
	// limit to 5 authors
	$criteria->limit = 5;
	// find all authors with his/her posts
	$authors = Author::model()->findAll($criteria);
	// show all authors and his/her posts
	foreach($authors as $author)
	{
		echo "Author = " . $author->name . "\n";
		foreach($author->posts as $post)  // no queries executed here !!
		{
			echo "Post = " . $post->title . "\n";
		}
	}
}
~~~

### Optimized Answer to Task #6

~~~
[php]
public static function GetTop5AuthorsWithPostsByPostTitle($searchWord)
{
	// query criteria
	$criteria = new CDbCriteria();
	// join Post model (one for fetching data, the other for filtering)
	$criteria->with = array(
		'posts' => array(  // this is for fetching data
			'together' => false,
			'condition' => 'posts.title LIKE :search_word',
			'params' => array(
				':search_word' => '%' . $searchWord . '%',
			),
		),
		'posts_search' => array(  // this is for filtering
			'select' => false,
			'together' => true,
		),
	);
	// compare title
	$criteria->compare('posts.title', $searchWord, true);
	// group by Author's id
	$criteria->group = 't.id';
	// order by author name
	$criteria->order = 't.name ASC';
	// limit to 5 authors
	$criteria->limit = 5;
	// find all authors
	$authors = Author::model()->findAll($criteria);
	// show all authors and his/her posts
	foreach($authors as $author)
	{
		echo "Author = " . $author->name . "\n";
		foreach($author->posts as $post)  // no queries executed here !!
		{
			echo "Post = " . $post->title . "\n";
		}
	}
}
~~~