![]() |
![]() |
![]() |
||||
Welcome to the GoFuckYourself.com - Adult Webmaster Forum forums. You are currently viewing our boards as a guest which gives you limited access to view most discussions and access our other features. By joining our free community you will have access to post topics, communicate privately with other members (PM), respond to polls, upload content and access many other special features. Registration is fast, simple and absolutely free so please, join our community today! If you have any problems with the registration process or your account login, please contact us. |
![]() ![]() |
|
Discuss what's fucking going on, and which programs are best and worst. One-time "program" announcements from "established" webmasters are allowed. |
|
Thread Tools |
![]() |
#1 |
Confirmed User
Industry Role:
Join Date: Jul 2009
Posts: 217
|
mysql query optimization
i'm working on a simple contextual ad rotator for my sites... and i'm not sure if there is anyway to improve/optimize it on the database side...
well, there are 3 tables... ads, category/ads match and categories. table 1: ads - a_id (PRIMARY) - a_area - a_code - a_imp table 2: adtags - at_tid - at_aid PRIMARY (at_tid, at_aid) table 3: tags - t_id (PRIMARY) - t_name (Unique) so, i get keywords and ad spots from the page, then put them into a query like that: SELECT a.a_id, a.a_area, a.a_code FROM ( SELECT ads.a_id, ads.a_area, ads.a_code) AS a GROUP BY a.a_area i've been trying it last couple days, not a complete disaster but i'm not sure if it can be improved or how it will act with growing tables and growing traffic.... currently, 450 records on the ads table and probably it will be around 2000 at max, and 500-600 records for keywords... 150k-160k pageviews/day... 700-800 new records at slow query log everyday, i assume there are some other factors which may affect that... etc. so, what do you think? am i on the right way? the query or tables need anything? or i do need totally different approach? |
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#2 |
Confirmed User
Industry Role:
Join Date: Feb 2005
Location: Los Angeles
Posts: 1,531
|
Why you splitting up the table and making it so complicated?
Code:
table 1: ads - a_id (PRIMARY) - a_area - a_code - a_imp - at_tid - at_aid - t_id (PRIMARY) - t_name (Unique) - t_category_name (INT(1), *INDEX) // 1='sponsor1',2='big tits',3='milf',4='lisa ann',5='anal',6='default' SELECT * from table a where t_category_name = 5 ORDER BY RAND() LIMIT 1
__________________
www.gimmiegirlproductions.com |
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#3 |
Confirmed User
Industry Role:
Join Date: Feb 2005
Location: Los Angeles
Posts: 1,531
|
But if you must keep the table like it is, the IN statement is the killer.
1) index t_name. Use varchar (64) index *8) 2) Replace tags.t_name IN ('sponsor1','big tits','milf','lisa ann','anal','default') with UNION: SELECT ads.a_id, ads.a_area, ads.a_code FROM ads, adtags, tags WHERE ads.a_id = adtags.at_aid AND adtags.at_tid=tags.t_id AND tags.t_name='sponsor1' AND ads.a_area IN (3,5,6,4) ORDER BY ads.a_imp ASC UNION SELECT ads.a_id, ads.a_area, ads.a_code FROM ads, adtags, tags WHERE ads.a_id = adtags.at_aid AND adtags.at_tid=tags.t_id AND tags.t_name='big tits' AND ads.a_area IN (3,5,6,4) ORDER BY ads.a_imp ASC UNION SELECT ads.a_id, ads.a_area, ads.a_code FROM ads, adtags, tags WHERE ads.a_id = adtags.at_aid AND adtags.at_tid=tags.t_id AND tags.t_name='milf' AND ads.a_area IN (3,5,6,4) ORDER BY ads.a_imp ASC UNION SELECT ads.a_id, ads.a_area, ads.a_code FROM ads, adtags, tags WHERE ads.a_id = adtags.at_aid AND adtags.at_tid=tags.t_id AND tags.t_name='lisa ann' AND ads.a_area IN (3,5,6,4) ORDER BY ads.a_imp ASC UNION SELECT ads.a_id, ads.a_area, ads.a_code FROM ads, adtags, tags WHERE ads.a_id = adtags.at_aid AND adtags.at_tid=tags.t_id AND tags.t_name='anal' AND ads.a_area IN (3,5,6,4) ORDER BY ads.a_imp ASC ...
__________________
www.gimmiegirlproductions.com |
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#4 | |
Too lazy to set a custom title
Industry Role:
Join Date: May 2004
Location: West Coast, Canada.
Posts: 10,217
|
Quote:
Not sure why you're doing a sub query which I suspect will end up being relatively slow. I often find it's faster to do multiple queries as opposed to one "big" one. So for example, SELECT a.at_aid FROM adtags AS a LEFT JOIN tags AS b ON b.t_id=a.at_tid WHERE b.t_name IN ('sponsor1','big tits','milf','lisa ann','anal','default') Then compile the at_aid results in list form (eg: a,b,c,d,e) and then SELECT a_id, a_area, a_code FROM ads WHERE a_id IN (a,b,c,d,e) AND a_code IN (3,5,6,4) ORDER BY a_imp GROUP BY a_area It's one of those things where, if there's a problem, you need to test things out to find the optimal queries. |
|
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#5 |
Confirmed User
Industry Role:
Join Date: Dec 2004
Posts: 363
|
Show us the output of "
EXPLAIN SELECT a.a_id, a.a_area, a.a_code FROM ( SELECT ads.a_id, ads.a_area, ads.a_code FROM ads, adtags, tags WHERE ads.a_id = adtags.at_aid AND adtags.at_tid=tags.t_id AND tags.t_name IN ('sponsor1','big tits','milf','lisa ann','anal','default') AND ads.a_area IN (3,5,6,4) ORDER BY ads.a_imp ASC ) AS a GROUP BY a.a_area;" Actually one of the worst things you are doing here is the "GROUP BY" which will cause a tmp table generation which when run in high concurrency will slam disk i/o and stack queries up resulting in longer processing times. |
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#6 |
Confirmed User
Industry Role:
Join Date: Jul 2009
Posts: 217
|
@camperjohn64
splitting is for the ads which have 2+ keywords... for example, if the ad is tagged to show for both "milf" and "lisa ann", we had to add one more row for each tag with same ad... well, UNION was the one of solutions when i was searching, but it was quite slower than the current one... i'm not sure, maybe i'm missing something... as you said, the IN statement looks like the killer @Tempest i had index on ads.a_area, i'm not sure if it made any improvements... about putting index on ads.a_imp... well, i did not mention that, with every query, ads.a_imp (impressions) getting updated, so, putting index on it probably will not do any good, right? actually, i was trying to keep simple at first as you showed and with a few other ways... did not see any improvements... @dubsix you are right about GROUP BY... it somehow involved in the equation... i thought it could be better with less queries... with group by, i can get one ad for each spot in one query... 150k bad queries VS. 600k less bad queries... with group by (0.0095 sec) id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY <derived2> ALL NULL NULL NULL NULL 69 Using temporary; Using filesort 2 DERIVED tags range PRIMARY,t_name t_name 152 NULL 4 Using where; Using temporary; Using filesort 2 DERIVED adtags ref PRIMARY PRIMARY 4 cfs_tags.t_id 8 Using index 2 DERIVED ads eq_ref PRIMARY PRIMARY 4 adtags.at_aid 1 Using where without group by, which means running a query like below for each spot (in this case we have 4 spots) it tooks 0.0237 sec SELECT ads.a_id, ads.a_area, ads.a_code FROM ads, adtags, tags WHERE ads.a_id = adtags.at_aid AND adtags.at_tid=tags.t_id AND tags.t_name IN ('sponsor1','big tits','milf','lisa ann','anal','default') AND ads.a_area = 3 ORDER BY ads.a_imp ASC LIMIT 1 id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE cfs_tags range PRIMARY,t_name t_name 152 NULL 4 Using where; Using temporary; Using filesort 1 SIMPLE cfs_adtags ref PRIMARY PRIMARY 4 cfsads.cfs_tags.t_id 8 Using index 1 SIMPLE cfs_ads eq_ref PRIMARY PRIMARY 4 cfsads.cfs_adtags.at_aid 1 Using where |
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#7 | |
Too lazy to set a custom title
Industry Role:
Join Date: May 2004
Location: West Coast, Canada.
Posts: 10,217
|
Quote:
Regardless of what method you end up using to get the data, on a high traffic site, I'd be caching the results. It's only ads, and refreshing the cache every 5 seconds is more than enough. Not only are you fetching the ads, but you're incrementing the impressions every time right? So cut down on the fetch by using a cache. How are you doing your benchmarking of the speed? |
|
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#8 |
Confirmed User
Join Date: Oct 2002
Posts: 3,745
|
Subqueries are almost always more expensive than the same query expressed as a join. This is because the optimizer is wreak with subqueries and that is being worked on.
__________________
For historical display only. This information is not current: support@bettercgi.com ICQ 7208627 Strongbox - The next generation in site security Throttlebox - The next generation in bandwidth control Clonebox - Backup and disaster recovery on steroids |
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#9 |
Confirmed User
Industry Role:
Join Date: Jul 2009
Posts: 217
|
caching the results is one solution which could help but this time some other factors make things complicated... well, i still can work on that...
subquery got involved, because i could not manage to work with GROUP BY and ORDER BY in the same query.. regardless of what ORDER BY dictates, GROUP BY returns same results.... i think, one of the bad things i do in here is ordering by impression to provide rotation... ok, i guess it's better to get back simple javascript banner rotator for server's health ![]() |
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#10 |
Confirmed User
Industry Role:
Join Date: Feb 2005
Location: Los Angeles
Posts: 1,531
|
Another thing I was thinking this morning was having a fourth table with all the pre-calculated results:
Code:
table4: text html; int (1) category; Code:
INSERT INTO table 4 (SELECT all possible ads blah blah) Every time you add or delete an affiliate or advertisement, you re-run the calculation for table 4 and re-populate it.
__________________
www.gimmiegirlproductions.com |
![]() |
![]() ![]() ![]() ![]() ![]() |