An idea for Joiningless ActiveRecord

Hi,

I want to explain my new idea about ActiveRecord, It maybe like samdark’s idea apart. My idea is about an active record’s design with joiningless. Because some reasons:

  • Problem with big data

  • Cross storage relations

  • MySQL Replication & Sharding

[size="4"]Problem with big data[/size]

First, as of my experience, with big data, the SQL join is a problem.

For example, if I have 2 tables: person and team with a relation : person.team_id = team.id

Normally, whenever we need to get some people including their team, we use :


SELECT * FROM person p INNER JOIN team t ON p.team_id = t.id WHERE {some conditions on person}

That’s fine, but if the person table has about 2 milions of records, and about 500,000 records in team, this will be a problem for MySQL. The joining progress will kill the server.

My solution is use 2 SQL queries, the first get people, the second get team and then join them by PHP code.


SELECT * FROM person WHERE {conditions}

PHP: get all team id from the results

SELECT * FROM team WHERE id IN (...)

[size="4"]Cross storage relations[/size]

At my project, I use Service Oriented design. So, our product is splited into some services, each services has its own data (model) and business. But, in general business some services need to access other services’ models, I use RESTful to exchange the data. That means I can do find, findAll, save (insert, update) active records through RESTful.

Sometimes I need to make a relation between MySQL ActiveRecord and RESTModel, I used component’s getter to solve it, but it will be better if I use relations.

And some other storages which we can apply find, findAll, save actions on them : file, memcache, sqlite … (samdark has told about that).

[size="4"]MySQL Replication & Sharding[/size]

The biggest disadvantage point of MySQL is no multi-master at replication. With a high scale product, we can use 1 master server only, and many slave servers. With high rate of write query, the master server will die.

If you want to join some tables, they have to be at same server, same connection.

We can not use MySQL sharding with current related active record design.

[size="4"]Summary[/size]

The SQL with joining is very pretty at theory, so we can use it in relations declaration.

But when we implement it, should we use a joiningless finding, and the above 4 problems can be solved.

What do you think?

Isn’t it the same thing as current together=false option?

Yes, but now we have MySQL only.

My idea is about another design, so we can have other storages, like File, RESTful, sqlite and they can be used together with relations.

For example:




/**

* This class is used to store data only

*/

abstract class CActiveRecord

{

abstract function getDataAdapter();

function save() {...}

function find() {...}

function findAll() {...}

}


abstract class CDataAdapter

{

abstract function insert();

abstract function update();

abstract function findAllByPk();

abstract function findAllByAttributes();

}



So, whenever we want to add a new storage, for example RESTful:




class CRESTfulActiveRecord extend CActiveRecord

{

function getDataAdapter() { return new CRESTfulDataAdapter($this); }

}


class CRESTfulDataAdapter extend CDataAdapter

{

// implements abstract functions with REST requests

}



The same to sqlite, file, memcache …

I’ve customized my Yii instance to that, just MySQL & RESTful. But I think could you guys please think about put it into Yii 2.0, as an option.

I’ve took a look into together’s document, it’s not exactly what I want to say.

It’s just for HAS_MANY and MANY_MANY queries.

I never had the honor of developing a site that outgrow MySQL so it had to be sharded. I am very well aware that this problem exists but I think most people are fine with a single instance (or master slave replication). Yii shouldn’t try to fix the faulty design of SQL databases (they were never ment to be sharded) but instead make it easier to come up with another database solution that is well suited for such scenarios. But then again this sounds easier than it is. SQL is a quite standardized language while noSQL databases use proprietary ways of querying etc… The same is true for APIs (I tried to build something like you mentioned with my EActiveResource extension). Having a dataAdapter layer could make it easier for the community to develop their own stuff though so that could be an interesting feature. I basically had to copy paste huge parts of CActiveRecord in order to implement my extension.

I agree that if we want to write an EActiveRecord, we will have to copy & paste many codes from CActiveRecord.

What I actually did is write an abstract class, call CRelatableModel, with many functions and relations like CActiveRecord.

And then I forced CActivedRecord to extend from CRelatableModel, my CRESTfulActiveRecord was too,

So, I had not to duplicate codes from CActiveRecord, most of them were moved from CActiveRecord upto CRelatableModel.

I cann’t share this code, sorry. This is my company’s policy. However I’m very glad to talk about its design, and maybe make a copy of it.

If you need to shard - you probably have to ditch the AR because there are more efficient ways to spend your CPU and memory. Just use DAO or modify the implementation of AR like you did. Yii is build for general purpose use and gives you excellent abilities on extending and redefining it’s standard components to fit special needs.

It’s more a shame that PDO is lacking in features and I’m slowly working an adapter over mysqli library to fit the PDO public API and expose the functionality that mysqli has via CDbConnection that comes in handy when you write a project that is not “yet another CMS-based website”

Not only sharding, but also for using array, file, Memcache … as data storages.

And, I agree with you that Yii should be designed for general purpose. With current design, it’s able but too hard to make the other storages with relations.

Samdark has told about the independent of Model and Storage before, I like his idea.