Saas Multi-Tenant Separate Db

Hi guys

I’m developing a commercial SaaS site with a shared DB. However, I am participating problems with a shared DB and consider moving towards a separate DB for each tenant.

Reasons:

  1. Commercial clients never want exactly the same system, simply because their business-processes are unique. Separate DB could make individual tweaks easier.

  2. Commercial programs are not simple. They normally have many things that the user could do wrong. Restoring backups to a shared DB could influence innocent tenants. Trying to correct user mistakes in only certain records could be dangerous for other tenants if you are not careful - and normally you don’t have much time for testing, because the client’s business must go on.

  3. Most tenants want their invoice numbers to be sequential. How do you do that in a shared DB? Auto-increment will not work, because tenants share the invoice table, meaning tenant-A will have invoices 1, 5, 13, 87, That will not work.

  4. Separate DB’s could be much faster and scaling is a dream.

  5. Some clients insists on their data being in a separate DB.

Negatives:

  1. Separate DB’s are more expensive. Really? Is this still a factor to mention? I mean hosting at world-class institutions now cost as low as US$2 per month per database.

Question:

So I want a main DB which contains all the tenants. Then I want to use the tenant_id as the name for the tenant’s own database. The connection string must therefore use the tenant_id (stored in session) to connect with the tenant’s own DB.

Any thoughts? Any hiccups?

I would choose separate DBs also.

However, You may need to implement your own CDbConnection since connectionString property seems to be read-only. This new class needs to construct the proper connectionString to connect to tenant’s DB.

I’m going to push back on your reasons a little bit. I have a little experience in this area but I appreciate the discussion as it stimulates some ideas for my projects as well.

  1. SaaS software is typically very opinionated. Any flexibility in schema has to be reflected in code, right? If this is really a SaaS app you will have identical code for all clients - how do you plan to support flexibility without writing custom code per customer? Because if you are doing that I don’t believe you have a SaaS application anymore.

  2. Yes and no. It truly is more complex to deal with DB backups / restorations in this scenario but since each record is keyed to a client it is easy to only update records matching a clientID.

  3. I would use the ID key as a non-client specific record (or forego one, entirely). Consider this… you have an invoiceID and tenantID field in your invoice table. The primary key for the table can be composite of both keys. You can either get the max() value per tenant or try this link for an idea using a DB trigger:

http://stackoverflow.com/questions/5286685/sqlite-use-composite-primary-key-and-autoincrement-both-columns

  1. Could be. Or could not be, depends on design. Most databases handle millions of records well with proper indexes. The exception may be if you have really ‘wide’ tables (lots and lots of columns).

  2. Again, for a SaaS app (my experience) the client usually has stipulations based on data integrity and security. They don’t really specify how you achieve that, but that may be the case in your area.

There are pros and cons. One option you didn’t mention is unique schema or views per client instead of dedicated DBs. This would be a hybrid approach.

More food for thought:

http://stackoverflow.com/questions/1742756/multitenant-db-why-put-a-tenantid-column-in-every-table

Dear Gerhard Liebenberg

I have no idea about Saas.

I once succeeded in switching between databases depending on the values stored in session.

I hope we can apply that in your scenario.

If user has right to have separate database, when he gets logged in we are storing his id as database name.

SiteController.php




public function actionLogin()

	{	

	    $model=new LoginForm;

	    if(isset($_POST['LoginForm']))

	    {

		$model->attributes=$_POST['LoginForm'];


		if($model->validate() && $model->login())  {


                        $user=User::model()->findByAttributes(array('username'=>Yii::app()->user->id));


                        if($user->hasOwnDatabase)

                             Yii::app()->user->setState('database',Yii::app()->user->id);


			$this->redirect(Yii::app()->user->returnUrl);  }

	    }

		

		$this->render('login',array('model'=>$model));

	}



Create an application behavior and attach it to main application.

applicationBehavior.php in components directory




<?php

class applicationBehavior extends CBehavior

{      

    private $_owner;

    public function events() {


                        return  array(

                                'onBeginRequest'=>'assignDb',        

                        );

                             }

        

    public function assignDb()

    {

        $owner=$this->_owner=$this->getOwner();

                

        if(!$owner->user->getIsGuest && $owner->user->hasState('database'))

        {

            $database=Yii::app()->user->getState('database');                

            $owner->db->setActive(FALSE);   


$owner->setComponent('db',new CDbConnection("mysql:host=localhost;dbname=$database",'root','yourpassword'));

                                                   

            $owner->db->setActive(TRUE);

                                

        }

    }

        

}

?>




Attach this behavior instance as a property to main configuration file. Leave the ‘db’ component intact.




return array(

......................................

.......................................

'behaviors'=>array(

'class'=>'application.components.applicationBehavior',

),

......................................

......................................

'components'=>array(

.........................................

........................................

'db'=>array(

                        'connectionString' => 'mysql:host=localhost;dbname=routine',//routine-default database

                        'emulatePrepare' => true,

                        'username' => 'root',

                        'password' => 'password',

                        'charset' => 'utf8',

                ),

.....................................................

....................................................

),

)



So when user is created, if he has right to have separate database it is stored as a boolean. Then we create a separate database on his name. After successful registration, when he login, he will use his own database on the next immediate request. Otherwise he will work on default database.

I do not know exactly what you expect.

I hope this helps a bit.

Regards.

Hi guys, thanx for the nice feedback. This is what saves people lots of time and struggling.

Roman Solomatin:

Thank you for the tip. I will check it out.

waitforit:

I see what you mean: all of my concerns could be addressed in a single DB as well. And I would actually prefer to continue using a single DB (much easier and much less maintenance work).

But as you said, I guess it boils down to how much "custom code per customer" are you going to allow - regardless of whether the final product is truly SaaS or not (none of my clients would care anyway).

I don’t plan on much customisation. I would rather want to add new program-functions as customers come with new requests. Then all other customers could decide for themselves what functions they want to use.

But, I’m very scared of having to tell a customer - maybe five years from now - “sorry Sir, we can’t add your patients’ scanned x-rays to the database, because that will slow down all our other customers”, or “sorry Sir, we can’t move your data to a faster server that is closer to your home, because it is kept in a database closer to most other customers in the Kalahari”.

I’ve never done this; so maybe my concerns are ridiculous. I’m just trying to avoid future surprises. I hate surprises:)

But you’re making a good point.

Could you please elaborate on "unique schema or views per client instead of dedicated DBs". I know SQLServer uses schemas inside databases, but it looks like a schema and a database is the same thing in MySQL.

seenivasan:

Since you are already doing this stuff, I really value your input. I got more than what I expected.

Many thanx.

Check out this IBM article:

Convert your web application to a multi-tenant SaaS solution

Another article:

Multi-Tenant Data Architecture

Hi CodeButterfly

I guess you win some and you loose some. Multiple databases will make life easier in some situations but in others not.

So each developer needs to evaluate his/her own environment. My clients are very specialized medical training institutions. So even if each tenant has its own database, there will probably not be that many. And as long as you keep all schemata identical, it should not be that hard to manage.

And I do agree with the article in that: multiple databases should not result in unique programming features for different tenants. That will be a nightmare. But what I am willing to do is to write different add-on functions, which all tenants can choose to use or not. If your database design is done correctly, this should not be a problem.

Okay, I think I will try to have best of both worlds:

I first wanted to have a main db with a table containing all the tenants. Then I have a separate db for each tenant with their tenant-id forming part of their db’s name.

Now, I think I will change it to this:

I still have a main db with a table containing all the tenants.

Then I have a single secondary db which stores all the tenants’ data. This secondary db thus contains multiple tenants. IF a tenant needs to be moved to its own db, then I create another secondary db for that tenant.

The main db will store the name of the secondary db to use for each tenant.

So each secondary db can have just one or multiple tenants.

This way, you can keep all tenants in one db, until you need the ability to separate some of them. You also have full control over the number of secondary databases created.

Gerhard,

Did you manage to build a production multi tenant site on this hybrid architecture. I really like your idea. I’m starting to build a SaaS myself and I’m thinking on mixing this method and code, with the DB structure that you proposed. Would you mind sharing your experience and lessons learned? I would really appreciate it.

Read this wiki, it might help you: A Multi-Tenant Strategy Using Yii and MySQL. Also read the linked articles. Whoever recommends a single database per tenant has not had to handle hundreds or thousands of them.

As for unique invoice numbers, that could be a separate DB field from the primary key so it can be handled easily.

Individual tenant DB backups require a some thought but they can be managed through DBMS choice, partitioning, etc. Certainly there is no free lunch so YMMV.

Good luck.

@Yomer

I halted the ‘hybrid architecture’ idea for a while when I found out that in Yii 1, you can not retrieve data from separate databases via SQL, because sql statements use only one connection string. You can thus only do it with AR, because each model can use a different connection string. (Or you have to merge separate sql results.)

This is a problem for me, because I have reports that tunnel through 4 or 5 tables producing thousands of pages. In these reports, AR could hammer the memory.

But I see that Yii 2 has the ability to retrieve multiple records via AR, without creating a huge new model for each record - which should sort out the memory problem. I don’t know if Yii 2 can use single sql statements to multiple DBs.

So I will soon try that ‘hybrid architecture’ again.

@JFReyes

Your mentioning of ‘Individual tenant DB backups’ should be an interesting idea. Being able to “restore a tenant’s backup” without influencing other tenants is definitely one of the major reasons I want to have multiple DBs. But if you are able to restore only one tenant’s data, then having a single DB might be an option.

Thanx for the link to that new wiki that both you guys included.

Gerhard,

I was actually using JFReyes' article to do my multi-tenant architecture. When I read your idea about a hybrid arch., I started to wonder if I could modify JFReyes' solution to allow his last bit of code in the article(The Behavior), to change the DB connection string upon reading a specila &quot;flag&quot; field in the tenant table. 

@JFReyes, @Gerhard

Wouldn't it be possible to change the AppStartup Behavior's beginRequest method from this:

public function beginRequest()

    {

        // switch db credentials for logged in users

        if (!Yii::app()->user->isGuest) {

            $u = TUser::model()->findByPk(Yii::app()->user->id);

            $tu = TTenant::model()->findByPk($u->tenant_id)->dbu;

            $tp = TTenant::model()->findByPk($u->tenant_id)->e_dbpwd;

            Yii::app()->db->setActive(false);

            Yii::app()->db->username = $tu;

            Yii::app()->db->password = $tp;

            Yii::app()->db->setActive(true);

        }

    }

To this:


public function beginRequest()

    {

        // switch db credentials for logged in users

        if (!Yii::app()->user->isGuest) {

            $u = TUser::model()->findByPk(Yii::app()->user->id);

            $tu = TTenant::model()->findByPk($u->tenant_id)->dbu;

            $tp = TTenant::model()->findByPk($u->tenant_id)->e_dbpwd;

            $odb = TTenant::model()->findByPk($u->tenant_id)->own_db; //Boolean or Tinyint if Tenant uses own DB. 

            Yii::app()->db->setActive(false);

            Yii::app()->db->username = $tu;

            Yii::app()->db->password = $tp;

            if ($odb === 1) {

               Yii::app()->sb->connectionString = 'mysql:host=localhost;dbname=db_' . $tu;

            }

            Yii::app()->db->setActive(true);

        }

    }

This way, if the tenant has paid or asks for a separate database, and using db_[tenantuser] as the name of the tenant’s own database. The connection will be made to store the data to another database. Since in my case, each of the tenants’s write actions would only need to access one database at a time. Be it its own or the shared one if he doesn’t have the “own_db” flag active.

Anyway, I’m still configuring the Controllers, this is an idea I’m planning to implement to maybe get hte Hybrid DB theory working. What do you guys think?

Hi Yomer

There should not be any differences in controllers used for single DBs and controllers used for multiple DBs. The only differences should be in the models. To do this, check out this wiki and my comment here.

Regards

It seems complicated without having to change app code as well. In your scenario you would need to duplicate both the tenant and user tables in every DB, albeit having only one tenant in the tenant table and few users in the user table for each particular tenant. How will you keep their primary keys in sync? For example, when a new user is added to the user table how would you ensure that its primary key will match its counterpart in the master (commingled) DB so it can be read upon app startup? I’m not sure how to tackle that.

You’re spot on about the keys, I guess I didn’t think that through.

How about this. In my case, each tenant has a different subdomain/path, so upon connection, the app relates this starting point to the tenant username. Then it asks through the default DB, if it needs to connect to the same default DB or an independent one. When the tenant’s client authenticates, it does to its corresponding DB. And when a new user is added, it’s added only in the tenant’s DB, since user information will only exist in that DB, there’s no need to have a counterpart in the master DB. For Tenants and their clients, this architecture should work out well; the problem would present itself for the app-staff/automated-registration-process when a new tenant needs to be created; the tenant info goes into the master DB and the user account into the master or independent DB, depending on the registered working/connection mode. After tenant registration, depending on the working mode, all data connections should fall through to the correct destination.

The only primary/foreign key relation that would be broken is the tenant_id in the tenant and user tables. All other table’s key relations would remain intact, since tables don’t share data between DBs(except the user table). The solution would be to emulate the tenant_id key relation through the app, or use only the tenant dbu to link the tenant table to the corresponding user table, be it in the master or independent DB.

Does all this sound coherent?

I should maybe leave this hybrid DB architecture out for now and concentrate on building the app with a single shared database, and later try another solution like the one I mentioned above(if it works).

That is what I am doing at the moment. I think I will try this when I split the DB later:

All users are stored in the master DB. So any internet user can create a profile in the master DB.

However, users need to be “invited” by the tenant’s “user-manager” to access the tenant’s DB (which is in a joined DB or separate DB). When a user wants to access a tenant’s DB, he will have to supply the tenant’s code as well when he logs in, which will then give him access to the right tenant DB.

This allows you to build in more security features and a tenant’s “user manager” has full control over who is allowed in the DB and what they are allowed to do.

I’m sorry but I’ve read your post a few times and am still confused. The problem is that the Yii application lifecycle always resets the db connection and urlManager rules to the configuration default when it initializes the app at every request. The tenant client corresponding DB isn’t accesible at this time. You would need to set all of this up during onBeginRequest.

@yomer and @Gerhard:

If I were to handle separate DB’s for each tenant (and will soon have to because of an upcoming project) I would use 2 db connections simultaneously. db1 would contain the tenant and user data while db2 (identified in db1’s tenant) would contain the rest of the application data for each tenant. In onBeginRequest I would only change the db2 connection. I believe in MySQL you can have foreign keys to a different database but if not it gets messy; I haven’t gone that far yet. Models and controllers have to be very careful about which database they’re talking to.

@yomer:

To answer your questions about the wiki article sent via PM…

  1. In the controllers, the actionUpdate method is actually simpler than actionCreate because the tenant info is already there:

UserController.php




public function actionUpdate($id)

{

	$model = $this->loadModel($id); // returns TUser or VUser model

	// Uncomment the following line if AJAX validation is needed

	$this->performAjaxValidation($model);


	if(Yii::app()->user->isUserTenantAdmin() && isset($_POST['VUser']))

		$postVars = $_POST['VUser'];

	elseif (Yii::app()->user->isUserAppStaff() && isset($_POST['TUser']))

		$postVars = $_POST['TUser'];


	if(isset($postVars)) {

		$model->attributes = $postVars;

		if($model->save())

			$this->redirect(array('view','id'=>$model->id));

	}


	$this->render('update',array(

		'model'=>$model,

	));

}



actionView and actionDelete are as per Gii default which BTW is how I always start developing the CRUD functionality.

  1. Other non-tenant/user controllers are similar in this respect:

a) for create, they first check if the logged in user is a tenant or app staff and instantiate the appropriate new VModel (MySQL view based) or TModel (table based) model. If table based they populate the tenant_dbu column according to the _form view’s selected tenant; otherwise they populate the tenant_id column according to the tenant to which the session’s user belongs.

b ) for updates it’s similar to #1 above.

c) as mentioned before, view and delete actions are as per Gii default.

Hope this helps.

I dug up some old code that I plaid with before. It should give you a start.

My config file forces all requests through onBeginRequest via applicationBehavior and it also

connects both db1 and db2 to the same database. applicationBehavior will then change the connection string of db2.




<?php


return array(

	'basePath'=>dirname(__FILE__).DIRECTORY_SEPARATOR.'..',

	'name'=>'InterChest Commercial Applications',


	'import'=>array(

		'application.models.*',

		'application.components.*',

		'ext.yii-mail.YiiMailMessage',

	),


	// Force all requests through applicationBehavior

	'behaviors' => array(

		'class' => 'application.components.applicationBehavior', 

	),


	// application components

	'components'=>array(

		...	

	

		/* Set db1 connection */

		'db1'=>array(

			'connectionString' => 'mysql:host=localhost;dbname=db1',

			'emulatePrepare' => true,

			'username' => 'root',

			'password' => '',

			'charset' => 'utf8',

			'tablePrefix' => 'cap_',

		),


		/* Set db2 connection. Use db1 as default so that exceptions

			occur if db2 is not set correctly in applicationBehavior (if you use db2

			as default here, then everything looks okay and errors are not picked up.) */

        'db2' => array(

            'connectionString' => 'mysql:host=localhost;dbname=db1',

			'emulatePrepare' => true,

			'username' => 'root',

			'password' => '',

			'charset' => 'utf8',

			'tablePrefix' => 'cap_',

			'class' => 'CDbConnection',

        ),

		...

);



Here is the behavior. It simply generates a new user-specific connection string for db2.




class applicationBehavior extends CBehavior 

{       

    public function events(){ 

 		return array(

			'onBeginRequest'=>'assignBranchDb',

		);

    }

 

    public function assignBranchDb() 

    {

		$owner=$this->getOwner(); 

		

		if(!strstr($_SERVER['REQUEST_URI'],"captcha")) /*ignore requests involving the captcha*/

		{

			if(!$owner->user->isGuest && $owner->user->hasState('branchID'))

			{ 

				$dsn = 'mysql:host=' . Yii::app()->user->getState('dbHost') . ';dbname='.Yii::app()->user->getState('dbName');

				$username = Yii::app()->db->username;

				$password = Yii::app()->db->password;


				$owner->db2->setActive(FALSE);

				$component=Yii::createComponent(array(

	                	'class'=>'CDbConnection',

						'tablePrefix' => 'cap_',

	                	'connectionString' => $dsn, 

	                	'emulatePrepare' => true, 

	                	'username' => $username,

	                	'password' => $password,

	                	'charset' => 'utf8', 

	                	'enableParamLogging'=>true, 

	                //	'schemaCachingDuration'=>3600,

	            )); 

	            $owner->setComponent('db2',$component);

				$owner->db2->setActive(TRUE);

			}

		}

	}

}



All models must have this code to add their relevant database names to queries. See the link in my post #13 above.




/* New tableName function */

public function tableName()

{

	preg_match("/dbname=([^;]+)/i", $this->dbConnection->connectionString, $matches);

	return $matches[1].'.{{tablename}}';

}

	

/* Override the parent getDbConnection() */

public function getDbConnection()

{

    return self::getDbtenantConnection();

}



@JFReyes

It does help a lot.

Just to clear your point 2.a. The triggers in every table force the tenant_dbu

when inserting a new record, but when the AppStaff inserts a record, it skips the trigger, because he would have control over which tenant_dbu and tenant_id to assign to that record. Is this right? So every time a record is inserted a tenant_dbu and tenant_id will always be populated either manually by the AppStaff or automatically by the TenandAdmin and Clients’ actions.

About my overly confusing post…

You mention that by design the tenant client DB wouldn’t be accesible. What I was thinking was that if you already have the tenant_dbu and tenant_id in session. On the beginRequest, if I understand correctly, you can change the connectionString on the fly.


[s]$u = TUser::model()->findByPk(Yii::app()->user->id);[/s]

$tu = TTenant::model()->findByPk(Yii::app()->session['tenant_id'])->dbu;

$tp = TTenant::model()->findByPk(Yii::app()->session['tenant_id'])->e_dbpwd;

$odb = TTenant::model()->findByPk(Yii::app()->session['tenant_id'])->own_db; //Boolean or Tinyint if Tenant uses own DB. 

In this first lines, could I remove the requirement of getting the User model if I have a tenant_id already stored in session? Then it would go get the tenant’s dbu and dbpwd from the default master db which has the tenant table. Then…


Yii::app()->db->setActive(false);

Yii::app()->db->username = $tu;

Yii::app()->db->password = $tp;

if ($odb === 1) {

   Yii::app()->sb->connectionString = 'mysql:host=localhost;dbname=db_' . $tu;

}

Yii::app()->db->setActive(true);

If the tenant has his own DB, then the connectionString changes to go to look for DB: db_[dbusername]. The credentials are the same, what changes is where are the credentials for the tenant going to be valid, wither in the master db or the independent db.

When the connection is made to either of those, the session will already have the tenant_id to write to the tables and populate. Can the primary/foreign key relationship between all tables and the tenant table be omitted in favor of using the current user’s stored tenant_id and tenant_dbu to view and modify the tables?

I’m probably oversimplifying things. I’m not implying all this as an authority in Yii, since I’m far from it. If it turns out that this method isn’t possible or a safe, I’m expecting the community to correct me and learn a bit more about Yii.

Thanks for the help.