Anti-together ActiveFinder behaviour
Posted 30 June 2009 - 08:43 AM
I was looking to implement a model-level cache layer as an extension (or even a hack) within ActiveRecord. The current implementation of ActiveFinder came as a surprise to me. Let's say we have 2 models: test[id] and super_test[id, test_id].
SuperTest::model()->find() will load all fields of the first record, including test_id. I expected SuperTest::model()->find()->test to run a one-table query using the test_id field. But it didn't: it joined the 2 tables instead. I believe there were strong reasons to do it this way, maybe difficult cases of "->with()->" or something else. Could you please share them, guys?
I believe we could just run a one-table query in most cases. That is faster and easier to combine with caching. With an explanation I could implement switches in the right places, add a cache layer and propose a diff that could help others.
I also wonder whether there is a good way to add database-connection switching to a concrete model, depending on some function or similar. The second thing I really miss in the current ActiveRecord implementation is sharding support. There is probably a way to do it without heavily changing the AR code, but I didn't find it. I'm going to share the result too.
We are going to base an extremely high-load project on Yii, and the list of alternatives is completely empty. You are doing great work with Yii.
Posted 30 June 2009 - 09:54 AM
Sharding is horizontal database scaling. It can only be implemented inside the application because it strictly depends on the data we operate on. An excellent explanation and description of what sharding is can be found here: http://en.netlog.com...blogid=3071854. In fact, we are going to implement a project on Yii with the same load as in the article. That's why sharding is a top-priority feature. Before implementing, I'll attach here a description of what I'm going to add to AR and how I'm going to do it, so we can discuss the details.
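To make the idea concrete, here is a minimal sketch of picking a shard from the data itself. The function name, the DSN list and the choice of the owner id as the sharding key are all assumptions for illustration; nothing like this exists in Yii's AR.

```php
<?php
// Hypothetical helper: map a record's owner id to one of N shards.
// Assumes non-negative integer ids.
function shardIndexFor($ownerId, $shardCount)
{
    if ($shardCount < 1 || $ownerId < 0) {
        throw new InvalidArgumentException('need shardCount >= 1 and ownerId >= 0');
    }
    // Modulo keeps all rows of one owner on the same shard.
    return $ownerId % $shardCount;
}

// Hypothetical DSN list; a real project would read this from its config.
$shardDsns = array(
    'mysql:host=db0;dbname=app',
    'mysql:host=db1;dbname=app',
    'mysql:host=db2;dbname=app',
);

// Pick the connection string for the shard that owns id 3071.
$dsn = $shardDsns[shardIndexFor(3071, count($shardDsns))];
```

Plain modulo is the simplest possible rule; it forces data to be moved whenever the number of shards changes, so a lookup table is a common alternative.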
Posted 30 June 2009 - 10:00 AM
I'm looking forward to discussing sharding support with you. I didn't know the name, but this is something we wanted to do and don't have much experience with. I think your experience will help a lot here. Thanks.
Posted 01 July 2009 - 03:45 AM
In the spirit of N+1 queries for several many-to-many relations, I'd like to file a feature request that has been covered in this article.
However, I don't think this belongs here, so
Posted 01 July 2009 - 07:08 AM
1. The only task of CActiveRecord.getRelated() is to get the related model. It doesn't need CActiveFinder at all. It should use CActiveRecord.findByAttributes queries instead. I will replace it.
2. All the JOIN logic is contained in CActiveFinder, which has a concrete interface. I'm going to write an abstract class describing its interface and implement an alternative way of doing the same things with many small queries. Let's call it CCachingActiveFinder (because it only makes sense with caching).
The abstract class will define a static factory method:
if ( we have model caching turned on ) return CCachingActiveFinder;
else return CActiveFinder;
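The factory idea above could look roughly like this. It is only a sketch: FinderBase, JoinFinder and CachingFinder are hypothetical stand-ins, not Yii classes, and the strategy() method exists purely for illustration.

```php
<?php
// All class names here are hypothetical stand-ins, not Yii classes.
abstract class FinderBase
{
    // Static factory: pick the implementation from a single switch.
    public static function create($modelCachingEnabled)
    {
        return $modelCachingEnabled ? new CachingFinder() : new JoinFinder();
    }

    // Returns a short description of the query strategy (illustration only).
    abstract public function strategy();
}

class JoinFinder extends FinderBase
{
    public function strategy()
    {
        return 'single JOIN query';
    }
}

class CachingFinder extends FinderBase
{
    public function strategy()
    {
        return 'one query per table, cache first';
    }
}
```

Callers would then only ever ask the factory for a finder, so the switch stays in one place.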
Besides doing the same logic with more queries, the new finder will try to put something like "model-foobar-id#34" into the cache together with the model data. Before querying by an array of IDs, it will try to get that data from the cache. Models also need a garbage collector: when a model changes, it clears its own cache entry. I didn't find any conventions for internal cache naming. How should I name those model cache entries?
The most important thing here is to make all of that functionality configurable. Such an implementation requires the following settings to be set by the user:
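A rough sketch of that lookup, under stated assumptions: a plain PHP array stands in for the real cache component, the loader callback stands in for the one-table query, and all function names are made up for this example.

```php
<?php
// Hypothetical key format, following the "model-foobar-id#34" idea.
function modelCacheKey($modelName, $id)
{
    return 'model-' . strtolower($modelName) . '-id#' . $id;
}

// Returns rows keyed by id. $cache is a plain array standing in for a cache
// component; $loadMissing receives the ids that were not cached and returns
// their rows keyed by id (a single query in practice).
function findByIdsCached($modelName, array $ids, array &$cache, $loadMissing)
{
    $rows = array();
    $missing = array();
    foreach ($ids as $id) {
        $key = modelCacheKey($modelName, $id);
        if (array_key_exists($key, $cache)) {
            $rows[$id] = $cache[$key];
        } else {
            $missing[] = $id;
        }
    }
    if ($missing) {
        // Load only the misses and remember them for the next call.
        foreach (call_user_func($loadMissing, $missing) as $id => $row) {
            $cache[modelCacheKey($modelName, $id)] = $row;
            $rows[$id] = $row;
        }
    }
    return $rows;
}

// Model changed: drop its own cache entry.
function invalidateModel($modelName, $id, array &$cache)
{
    unset($cache[modelCacheKey($modelName, $id)]);
}
```

The second call for an overlapping set of ids only hits the database for the ids that are still missing.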
1. Model cache on/off
2. Lifetime of cache
3. Exclude list: which models shouldn't be cached
4. Maybe prefix for cache entries names
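As a sketch only, the four settings above could be kept in one array. The key names, the default values and the helper are all hypothetical; none of them exist in Yii itself.

```php
<?php
// Hypothetical settings array; none of these keys exist in Yii itself.
$modelCacheSettings = array(
    'enabled'  => true,              // 1. model cache on/off
    'lifetime' => 300,               // 2. cache lifetime in seconds
    'exclude'  => array('AuditLog'), // 3. models that must not be cached
    'prefix'   => 'model-',          // 4. prefix for cache entry names
);

// True when the given model should go through the cache layer.
function isModelCached(array $settings, $modelName)
{
    return $settings['enabled']
        && !in_array($modelName, $settings['exclude'], true);
}
```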
I wasn't able to find anything about reading the config from internal classes like AR. If you have some time, could you please give me some links or even describe it a bit?
P.S. Of course, all those caching functions will be separated into a ModelCache class, in addition to the current cache implementation classes ;]
Posted 01 July 2009 - 09:14 AM
The N+1 method is a new approach I first saw in Yii. It splits up huge join queries so that no more than one HAS_MANY or MANY_MANY relation is involved in a single query. It has nothing to do with my feature request.
I can't see the point of adding exclude list, as model caching could be turned off individually within each model.
(Maybe there's even no need to implement such a switch, as parent class can decide whether model can support caching or not.)
I suppose it'd be better to contribute this caching strategy as a standalone component to keep the core balanced.
As far as I know, config entries are not free to modify. You can however use Yii::app()->setParams(array) to redefine the whole parameter array, but not one-by-one.
For caching, I think the table name is acceptable as name prefix. As far as I know, naming is absolutely up to the developer. I currently use CDbCache with SQLite, and the primary key is a field of type CHAR(128).
Feel free to correct me, I'm not very experienced regarding Yii.
Posted 02 July 2009 - 01:34 AM
------------------- Cut here -----------------------
OK. I studied the AR implementation and made a small first patch to eliminate the use of JOIN queries in AR->getRelated() (all but MANY_MANY: they will use JOIN anyway). I decided to try to implement the sharding feature first, since you seem to be more interested in it. Also, the cache layer is supposed to be automatic while sharding is not. So it makes more sense to get the new interface and capabilities in place first and add the background optimization afterwards.
Could you please check the patch to see if I took everything into account?
The patch is for branches/1.0; apply it from the root.
Posted 02 July 2009 - 06:15 AM
I believe call_user_func got deprecated in recent versions, so $class=new $relation->className could be a better idea, if this is supported by php.
Please note, findXXX methods can take CDbCriteria, you don't have to pass them as an array.
I hope I could help.
By the way, I'm more interested in having queries run in two steps, returning the ids first and then the selected fields. See the ticket for more information.
Posted 02 July 2009 - 07:44 AM
P.S. call_user_func is not deprecated. It's ok. We need to handle static call here. So your syntax for replacement is not correct.
Posted 02 July 2009 - 08:01 AM
When determining the foreign primary key, you cannot reach _md, as it is declared as a private class property. Use getPrimaryKey() instead, as you've done in line 24.
We're on the right track to purge all errors. I hope an admin will confirm your patch.
Posted 03 July 2009 - 01:05 AM
However it's true we should change this line to:
$fpk = call_user_func("$class::model")->getMetaData()->tableSchema->primaryKey;
Just to make it less hacky :].
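For reference, a small standalone sketch of the two callable forms for a static call through call_user_func. The SuperTest class here is a hypothetical stand-in with a static model() accessor, not the real AR class.

```php
<?php
// Hypothetical stand-in for an AR class with a static model() accessor.
class SuperTest
{
    private static $instance;

    public static function model()
    {
        if (self::$instance === null) {
            self::$instance = new self();
        }
        return self::$instance;
    }
}

$class = 'SuperTest';

// Array callable: class name plus static method name.
$a = call_user_func(array($class, 'model'));

// "Class::method" string callable, as in the line above (needs PHP 5.2.3+).
$b = call_user_func("$class::model");
```

Both calls return the same shared instance, so either form works when the class name is only known as a string.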
qiang, could you please comment on this version? It seems mature enough, which means I finally understand the AR architecture. I feel more or less ready to go and describe sharding.
Posted 03 July 2009 - 11:06 AM
Posted 03 July 2009 - 12:35 PM
I wasn't able to do Foo::model()->with('...')->associate. It doesn't make sense in fact. Are there any available modifiers for relation request? I didn't find any.
Posted 03 July 2009 - 01:03 PM
If the secondary relation also has a 'with' param, other relations may also be involved. I can't confirm this as I've never done anything this complex.
Seems more like a hierarchical problem. Can't this be made so that it uses Yii internals recursively?