Mysql Innodb Gurus - Simple Forum Design And Help Needed On Clustered Index

Hell All,

I am designing in a simple forum database using innodb and got few questions . I your free time , could you please help me ?

Below are Posts and Comments DB design


CREATE TABLE `my_posts` (

  `forum_id` mediumint(<img src='http://www.yiiframework.com/forum/public/style_emoticons/default/cool.gif' class='bbc_emoticon' alt='8)' /> unsigned NOT NULL,

  `post_id` int(10) unsigned NOT NULL,          // not an auto increment

   subject   varchar(200) NOT NULL,

  `details` text NOT NULL,

  `access` tinyint(3) unsigned NOT NULL,  //private,friends,public

  `created_by` int(10) unsigned NOT NULL,

  `created_on` int(10) unsigned NOT NULL,

  `updated_on` int(10) unsigned NOT NULL,

  `comment_count` smallint(5) unsigned NOT NULL DEFAULT '0',

  `ip_address` int(10) unsigned NOT NULL,

   sphinx_unique_id  int(10) unsigned NOT NULL,  // not an auto increment

   PRIMARY KEY (`forum_id`,`post_id`)

) ENGINE=InnoDB DEFAULT CHARSET=utf8;


CREATE TABLE `my_posts_comments` (

  `forum_id` mediumint(<img src='http://www.yiiframework.com/forum/public/style_emoticons/default/cool.gif' class='bbc_emoticon' alt='8)' /> unsigned NOT NULL,

  `post_id` int(10) unsigned NOT NULL,          // not an auto increment 

  `comment_id` int(10) unsigned NOT NULL,

  `details` text NOT NULL,

  `created_by` int(10) unsigned NOT NULL,

  `created_on` int(10) unsigned NOT NULL,

  `ip_address` int(10) unsigned NOT NULL,

  PRIMARY KEY (`forum_id`,`post_id`,comment_id)

) ENGINE=InnoDB DEFAULT CHARSET=utf8;

I have queries based on forum_id, post_id, created_by and also sorting on updated_on , comment_count


SELECT * FROM my_posts  WHERE access=public ORDER BY updated_on DESC      //OR ASC

SELECT * FROM my_posts  WHERE access=public ORDER BY comment_count DESC      //OR ASC

SELECT * FROM my_posts WHERE forum_id=?  and access=public ORDER BY updated_on DESC      //OR ASC

SELECT * FROM my_posts WHERE forum_id=?  and access=public  ORDER BY comment_count DESC   //OR ASC

SELECT * FROM my_posts WHERE created_by=?  ORDER BY updated_on DESC     //OR ASC

SELECT * FROM my_posts  WHERE created_by=(Friends_ids) AND access=(public,friends)   ORDER BY updated_on DESC      //OR ASC 

SELECT * FROM my_posts WHERE forum_id=? AND  post_id=?  

SELECT * FROM my_posts_comments WHERE forum_id=? AND post_id=?

  1. Is this the best design for using Cluster Index?

  2. Since the Cluster primary key on forum_id, post_id, I see problem accessing it just conditioning on created_by . What is the best design to query on created_by? Can I just add index for it . I will loose Cluster performance here …

  3. Also All the queries I have sorting on updated_on and on comment_count . These two are not of PK , so the first two simple queries looks like a major problem because the data is physically ordered in the database on forum_id and on post_id level

Thank You for checking it

Regards

Yii Fan

Hi Fan_Of_Yii

  1. It’s more usual for me to see something similar to the following structure (I’ll write it in pseudocode):




table `forum`(

    `id` auto_increment primary key

    `name`

)

table `post`(

    `id` auto_increment primary key

    `forum_id` foreign key

)

table `comment`(

    `id` auto_increment primary key

    `post_id` foreign key 

)

      1. I would (I’d even say SHOULD) add indexes on fields - that are not already primary or unique keys - you’re searching/sorting on.

Hope it’ll help.

Thank You yugene for your help . But I will loosed innodb clustered index capabilities if I design like that .

You can have a look at the innodb clustered indexes here

http://www.xaprb.com/blog/2006/07/04/how-to-exploit-mysql-index-optimizations/

I am thinking some thing that I can get clustered index benefits :)

Thanks again for looking into it

Regards

Yii Fan

Yeah, I know about innodb clustered index, though suggested approach is something I haven’t thought of before and an elegant one. BTW, the linked article is written by the co-author of one of my the most favorite books: High Performance MySQL (O’Reilly), so I enjoyed reading it.

I don’t think there’s a simple final solution for your questions.

It’s possible you’ll get some benefits from approach suggested at SO, adding some other fields to your clustered index (index will become bigger, and if you’ll add one more field to your PK, you’ll then need to handle this PK), but I think this solution is really definite situation dependent: [size=“2”]which exactly queries are made, how often, which speed is required, which hardware you have, how much data you have, etc… [/size][size=“2”]All these moments described at links you send so I think you more then aware of it. [/size]

[size=“2”]So, I think I’d make benchmarks on your queries, and investigate then with EXPLAIN SELECT, setting up different indexes variants also taking in notice which way there’re used in your application. Possibly, you will use some other approaches as well to improve performance (for example, trick about OFFSET issue solving described at SO Victor Nicolett[/size][size=“2”]).[/size]

[size=“2”]Hope my thoughts will help in any way. Would be interesting to hear thoughts from other participants of the forum :)[/size]

P.S. I suppose SELECT * FROM is used for demonstration purposes only, and * will be replaces with actual fields names?

Thanks a lot yugene for the detailed explanation. It was simply great.

Yes , as you mentioned SELECT * FROM is used for demonstration purposes only . But any way for secondary indexes it doesn’t matter ( from my understanding) . Since any way first it has to read the index ( Secondary index always combined with primary key internally ) and then have to go to primary key to get the fields .

You mentioned some thing about Victor Nicolett OFFSET , sorry I don’t know about that trick , Could you please provide me link for that ?

As you suggested I will try different approaches and see . You are right any way for this case I can not keep all the records sequential for all the cases I mentioned .

Yes "High Performance MySQL 3rd edition " change the way you think about indexes :) I read it couple of times , But I still feel some think I am missing

Regards

Yii Fan

Welcome, happy to hear I could help. The discussion was useful for me as well.

[/size]

Just take a look at answer and comment by Victor Nicolett at link to Stack Overflow you sent.

Thank You yugene for pointing out Victor Nicolett comment . I didn’t realize it was there …

Finally I am going ahead with secondary indexes only .

Thanks again

Regards

Yii Fan

Hello Yugene,

I found good one on LIMIT and here is the link

http://stackoverflow.com/questions/4481388/why-does-mysql-higher-limit-offset-slow-the-query-down

Thank You

Regards

Yii Fan

Yeah, elegant one :)