I am currently aggregating posts from different social networks. Currently I have 50-100 million posts from Facebook, Twitter, Youtube, Instagram , Pinterest.
Consider a table posts
posts{ id int(11), user_id int(11), url varchar(256), image varchar(256), source int(11), // Social Network Source created bigint(20), // Publish time of the post visible int(1) // Public or Private Posts}
Using InnoDB
Indexes :
primary key on iduser_id,sourceuser_id,created
Space usage
Type Usage Data 45,876.0 MiBIndex 4,959.0 MiBTotal 50,835.0 MiB
Using Amazon RDS with 7.5 GB RAM.
The queries I am executing are as follows
select id from posts where user_id={user_id} and visible=1 order by created desc LIMIT 20 ;select id from posts where user_id={user_id} and source={your_network} and visible=1 order by created desc LIMIT 20 ;
For users with > 500k posts sometimes my queries take a really long time.
I have run explain and you can see the results below
EXPLAIN SELECT idFROM `posts` WHERE user_id =123529745 and source=1 and visible=1ORDER BY created DESC LIMIT 20id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE posts ref user_id_2,user_id_4,user_id_3,created created 4 const 954174 Using where; Using indexid select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE posts ref user_id_2,user_id_4,user_id_3,created created 4 const 742308 Using where;
Can you please suggest me any optimizations?