stat relation

I have a user model that HAS_MANY posts.  In the user list, I would like to display the number of posts each user has. All works fine, but the generated SQL looks inefficient and unclean.

<?php


//User model:


	public function relations() {


		return array(


			'post' => array(self::HAS_MANY, 'Post', 'user_id'),


			'num_posts' => array(self::STAT, 'Post', 'user_id'),


		);


	}





//relevant controller snippet


$users=User::model()->with('group', 'num_posts')->findAll($criteria);





//generated SQLs


1. SELECT `user`.`id` AS t0_c0, `user`.`username` AS t0_c1, `user`.`password` AS t0_c2, `user`.`email` AS t0_c3, `user`.`email_visible` AS t0_c4, `user`.`notify_comments` AS t0_c5, `user`.`notify_messages` AS t0_c6, `user`.`about` AS t0_c7, `user`.`aboutParsed` AS t0_c8, `user`.`group_id` AS t0_c9, `user`.`email_confirmed` AS t0_c10, `user`.`created` AS t0_c11, `user`.`modified` AS t0_c12, t1.`id` AS t1_c0, t1.`name` AS t1_c1 FROM `user`


LEFT OUTER JOIN `group` t1 ON `user`.`group_id`=t1.`id`


ORDER BY `user`.`created`


DESC LIMIT 25





2. SELECT `user_id` AS c, COUNT(*) AS s FROM `post` GROUP BY `user_id`


HAVING `post`.`user_id`


IN (105, 103, 102, 1, 101, 78, 56, 67, 66, 89, 86, 65, 52, 93, 79, 41, 58, 51, 76, 88, 46, 57, 80, 72, 82)


Using the IN() functions seems much less efficient then a simple JOIN.  Shouldn't it JOIN it instead?  (using together() gives the same result)

Also, I have not tried or tested this yet, but do you think it would be possible to sort a user list by the number of posts a user has created?

Good observation.

It is hard to say which approach is more efficient. For simple cases, the JOIN approach might be the winner. For complex cases (e.g. getting the comment count for each post of a user), the IN approach could be more efficient. Another reason for the IN approach is the simplicity of implementation.

I can't think of a good way to sort a user list by some aggregation column without explicitly writing the JOIN fragment.

Thanks,

In this case I was only pulling 25 users, what if I was pulling 500 of them, wouldn't a join be much better in that case?  Is there a way to force a JOIN over IN?

Yes, with 500 users, join would be much faster. I just did a quick test. At 100 users, IN takes twice the time needed by JOIN; At 20 users, JOIN is roughly the same as IN. At  For smaller numbers, IN is faster. If the JOIN involves many tables and conditions, this threshold could be higher.

Currently, there is no way to force a JOIN. This is mainly due to technical difficulty since we are seeking a generic solution that can be applied to very complex scenarios. Maybe in future we can do some optimization in some special yet commonly used cases.

I think AR in general is not very suitable to perform aggregational queries. The SQLMap that we are working on is a better choice for this kind of task.

In reality, if the performance of such query becomes an issue (it is not very common that you would pull more than 100 users during a request), we should turn to raw SQL.

I thought SQLmap was an addon for AR…?  I guess i'll just wait to see…

I remember Cake had a very advanced AR methods to do complex queries like this… It could do almost literally any query, without you writing out SQL.  Perhaps that should be looked at…  An example, fyi:

<?php


$results = $this->Post->find('all', array(


	'fields' => array('Post.*', 'User.name', 'COUNT(Comment.id) AS NumComments'),


));


The above would fetch all the fields of the Post table, along with each post's author's name, and a count of each post's number of comments.  I believe this would generate 1 query with 2 JOINS.

Note the dot syntax.  'User.name' means the name attribute of the User model (the table may not necessarily be named User however, the actual name is checked from the model).  'Post.*' means all the attributes of the Post table.

If you wrote "Post.name" it would automatically be translated to:

postTableName.name where postTableName is defined as the name of the table in the Post model

Good to know that. Did you try this statement? Will it automatically add GROUP by clause? On pgsql, this won't work without adding those post.* and user.name fields to the GROUP by clause.

No, in cake I think you have to add the following into the above code to group:

'group' => 'Post.id'

Yes, I tried

<?php $criteria->select=‘COUNT(post.id) AS NumComments’;
in yii, but the problem is that post is not being JOINed, so it doesn’t recognize post.  It might work if I wrote that out the JOIN in $criteria->join.

Basically, I think it would be nice Yii used the dot syntax, so you could do:

<?php $criteria->select = 'Post.*, User.name, COUNT(Comment.id) AS NumComments'

And it would automatically perform 2 joins, and replace the model name with the table name.  So you would actually not necessarily need the with() method to do joins

How is this situation handled: a post has an editor and an author? In this case, which column will be joined?

Can you give me a complete code not