GoFuckYourself.com - Adult Webmaster Forum

GoFuckYourself.com - Adult Webmaster Forum (https://gfy.com/index.php)
-   Fucking Around & Business Discussion (https://gfy.com/forumdisplay.php?f=26)
-   -   mysql query optimization (https://gfy.com/showthread.php?t=1030807)

miXXXture 07-18-2011 10:56 PM

mysql query optimization
 
i'm working on a simple contextual ad rotator for my sites... and i'm not sure if there is anyway to improve/optimize it on the database side...

well, there are 3 tables... ads, category/ads match and categories.

table 1: ads
- a_id (PRIMARY)
- a_area
- a_code
- a_imp

table 2: adtags
- at_tid
- at_aid
PRIMARY (at_tid, at_aid)

table 3: tags
- t_id (PRIMARY)
- t_name (Unique)

so, i get keywords and ad spots from the page, then put them into a query like that:

SELECT a.a_id, a.a_area, a.a_code FROM (
SELECT ads.a_id, ads.a_area, ads.a_code
FROM ads, adtags, tags
WHERE ads.a_id = adtags.at_aid AND adtags.at_tid=tags.t_id
AND tags.t_name IN ('sponsor1','big tits','milf','lisa ann','anal','default')
AND ads.a_area IN (3,5,6,4)
ORDER BY ads.a_imp ASC
) AS a
GROUP BY a.a_area


i've been trying it last couple days, not a complete disaster but i'm not sure if it can be improved or how it will act with growing tables and growing traffic....

currently, 450 records on the ads table and probably it will be around 2000 at max, and 500-600 records for keywords...
150k-160k pageviews/day... 700-800 new records at slow query log everyday, i assume there are some other factors which may affect that... etc.


so, what do you think? am i on the right way? the query or tables need anything? or i do need totally different approach?

camperjohn64 07-18-2011 11:09 PM

Why you splitting up the table and making it so complicated?

Code:

table 1: ads
- a_id (PRIMARY)
- a_area
- a_code
- a_imp
- at_tid
- at_aid
- t_id (PRIMARY)
- t_name (Unique)
- t_category_name (INT(1), *INDEX)

// 1='sponsor1',2='big tits',3='milf',4='lisa ann',5='anal',6='default'

SELECT * from table a where t_category_name = 5 ORDER BY RAND() LIMIT 1


camperjohn64 07-18-2011 11:21 PM

But if you must keep the table like it is, the IN statement is the killer.

1) index t_name. Use varchar (64) index *8)

2) Replace tags.t_name IN ('sponsor1','big tits','milf','lisa ann','anal','default') with UNION:

SELECT ads.a_id, ads.a_area, ads.a_code
FROM ads, adtags, tags
WHERE ads.a_id = adtags.at_aid AND adtags.at_tid=tags.t_id
AND tags.t_name='sponsor1'
AND ads.a_area IN (3,5,6,4)
ORDER BY ads.a_imp ASC

UNION
SELECT ads.a_id, ads.a_area, ads.a_code
FROM ads, adtags, tags
WHERE ads.a_id = adtags.at_aid AND adtags.at_tid=tags.t_id
AND tags.t_name='big tits'
AND ads.a_area IN (3,5,6,4)
ORDER BY ads.a_imp ASC

UNION
SELECT ads.a_id, ads.a_area, ads.a_code
FROM ads, adtags, tags
WHERE ads.a_id = adtags.at_aid AND adtags.at_tid=tags.t_id
AND tags.t_name='milf'
AND ads.a_area IN (3,5,6,4)
ORDER BY ads.a_imp ASC

UNION
SELECT ads.a_id, ads.a_area, ads.a_code
FROM ads, adtags, tags
WHERE ads.a_id = adtags.at_aid AND adtags.at_tid=tags.t_id
AND tags.t_name='lisa ann'
AND ads.a_area IN (3,5,6,4)
ORDER BY ads.a_imp ASC

UNION
SELECT ads.a_id, ads.a_area, ads.a_code
FROM ads, adtags, tags
WHERE ads.a_id = adtags.at_aid AND adtags.at_tid=tags.t_id
AND tags.t_name='anal'
AND ads.a_area IN (3,5,6,4)
ORDER BY ads.a_imp ASC

...

Tempest 07-18-2011 11:23 PM

Quote:

Originally Posted by miXXXture (Post 18291122)
i'm working on a simple contextual ad rotator for my sites... and i'm not sure if there is anyway to improve/optimize it on the database side...

well, there are 3 tables... ads, category/ads match and categories.

table 1: ads
- a_id (PRIMARY)
- a_area
- a_code
- a_imp

table 2: adtags
- at_tid
- at_aid
PRIMARY (at_tid, at_aid)

table 3: tags
- t_id (PRIMARY)
- t_name (Unique)

so, i get keywords and ad spots from the page, then put them into a query like that:

SELECT a.a_id, a.a_area, a.a_code FROM (
SELECT ads.a_id, ads.a_area, ads.a_code
FROM ads, adtags, tags
WHERE ads.a_id = adtags.at_aid AND adtags.at_tid=tags.t_id
AND tags.t_name IN ('sponsor1','big tits','milf','lisa ann','anal','default')
AND ads.a_area IN (3,5,6,4)
ORDER BY ads.a_imp ASC
) AS a
GROUP BY a.a_area


i've been trying it last couple days, not a complete disaster but i'm not sure if it can be improved or how it will act with growing tables and growing traffic....

currently, 450 records on the ads table and probably it will be around 2000 at max, and 500-600 records for keywords...
150k-160k pageviews/day... 700-800 new records at slow query log everyday, i assume there are some other factors which may affect that... etc.


so, what do you think? am i on the right way? the query or tables need anything? or i do need totally different approach?

Put an index on ads.a_area and ads.a_imp

Not sure why you're doing a sub query which I suspect will end up being relatively slow.

I often find it's faster to do multiple queries as opposed to one "big" one.

So for example,

SELECT a.at_aid FROM adtags AS a LEFT JOIN tags AS b ON b.t_id=a.at_tid WHERE b.t_name IN ('sponsor1','big tits','milf','lisa ann','anal','default')

Then compile the at_aid results in list form (eg: a,b,c,d,e) and then

SELECT a_id, a_area, a_code FROM ads WHERE a_id IN (a,b,c,d,e) AND a_code IN (3,5,6,4) ORDER BY a_imp GROUP BY a_area

It's one of those things where, if there's a problem, you need to test things out to find the optimal queries.

dubsix 07-19-2011 12:30 AM

Show us the output of "
EXPLAIN SELECT a.a_id, a.a_area, a.a_code FROM (
SELECT ads.a_id, ads.a_area, ads.a_code
FROM ads, adtags, tags
WHERE ads.a_id = adtags.at_aid AND adtags.at_tid=tags.t_id
AND tags.t_name IN ('sponsor1','big tits','milf','lisa ann','anal','default')
AND ads.a_area IN (3,5,6,4)
ORDER BY ads.a_imp ASC
) AS a
GROUP BY a.a_area;"


Actually one of the worst things you are doing here is the "GROUP BY" which will cause a tmp table generation which when run in high concurrency will slam disk i/o and stack queries up resulting in longer processing times.

miXXXture 07-19-2011 01:23 AM

@camperjohn64

splitting is for the ads which have 2+ keywords... for example, if the ad is tagged to show for both "milf" and "lisa ann", we had to add one more row for each tag with same ad...

well, UNION was the one of solutions when i was searching, but it was quite slower than the current one... i'm not sure, maybe i'm missing something... as you said, the IN statement looks like the killer

@Tempest

i had index on ads.a_area, i'm not sure if it made any improvements...
about putting index on ads.a_imp... well, i did not mention that, with every query, ads.a_imp (impressions) getting updated, so, putting index on it probably will not do any good, right?

actually, i was trying to keep simple at first as you showed and with a few other ways... did not see any improvements...

@dubsix

you are right about GROUP BY... it somehow involved in the equation... i thought it could be better with less queries... with group by, i can get one ad for each spot in one query... 150k bad queries VS. 600k less bad queries...

with group by (0.0095 sec)

id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 69 Using temporary; Using filesort
2 DERIVED tags range PRIMARY,t_name t_name 152 NULL 4 Using where; Using temporary; Using filesort
2 DERIVED adtags ref PRIMARY PRIMARY 4 cfs_tags.t_id 8 Using index
2 DERIVED ads eq_ref PRIMARY PRIMARY 4 adtags.at_aid 1 Using where


without group by, which means running a query like below for each spot (in this case we have 4 spots)

it tooks 0.0237 sec

SELECT ads.a_id, ads.a_area, ads.a_code
FROM ads, adtags, tags
WHERE ads.a_id = adtags.at_aid AND adtags.at_tid=tags.t_id
AND tags.t_name IN ('sponsor1','big tits','milf','lisa ann','anal','default')
AND ads.a_area = 3
ORDER BY ads.a_imp ASC LIMIT 1

id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE cfs_tags range PRIMARY,t_name t_name 152 NULL 4 Using where; Using temporary; Using filesort
1 SIMPLE cfs_adtags ref PRIMARY PRIMARY 4 cfsads.cfs_tags.t_id 8 Using index
1 SIMPLE cfs_ads eq_ref PRIMARY PRIMARY 4 cfsads.cfs_adtags.at_aid 1 Using where

Tempest 07-19-2011 03:22 AM

Quote:

Originally Posted by miXXXture (Post 18291228)
@camperjohn64

splitting is for the ads which have 2+ keywords... for example, if the ad is tagged to show for both "milf" and "lisa ann", we had to add one more row for each tag with same ad...

well, UNION was the one of solutions when i was searching, but it was quite slower than the current one... i'm not sure, maybe i'm missing something... as you said, the IN statement looks like the killer

@Tempest

i had index on ads.a_area, i'm not sure if it made any improvements...
about putting index on ads.a_imp... well, i did not mention that, with every query, ads.a_imp (impressions) getting updated, so, putting index on it probably will not do any good, right?

actually, i was trying to keep simple at first as you showed and with a few other ways... did not see any improvements...

@dubsix

you are right about GROUP BY... it somehow involved in the equation... i thought it could be better with less queries... with group by, i can get one ad for each spot in one query... 150k bad queries VS. 600k less bad queries...

with group by (0.0095 sec)

id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 69 Using temporary; Using filesort
2 DERIVED tags range PRIMARY,t_name t_name 152 NULL 4 Using where; Using temporary; Using filesort
2 DERIVED adtags ref PRIMARY PRIMARY 4 cfs_tags.t_id 8 Using index
2 DERIVED ads eq_ref PRIMARY PRIMARY 4 adtags.at_aid 1 Using where


without group by, which means running a query like below for each spot (in this case we have 4 spots)

it tooks 0.0237 sec

SELECT ads.a_id, ads.a_area, ads.a_code
FROM ads, adtags, tags
WHERE ads.a_id = adtags.at_aid AND adtags.at_tid=tags.t_id
AND tags.t_name IN ('sponsor1','big tits','milf','lisa ann','anal','default')
AND ads.a_area = 3
ORDER BY ads.a_imp ASC LIMIT 1

id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE cfs_tags range PRIMARY,t_name t_name 152 NULL 4 Using where; Using temporary; Using filesort
1 SIMPLE cfs_adtags ref PRIMARY PRIMARY 4 cfsads.cfs_tags.t_id 8 Using index
1 SIMPLE cfs_ads eq_ref PRIMARY PRIMARY 4 cfsads.cfs_adtags.at_aid 1 Using where


Regardless of what method you end up using to get the data, on a high traffic site, I'd be caching the results. It's only ads, and refreshing the cache every 5 seconds is more than enough. Not only are you fetching the ads, but you're incrementing the impressions every time right? So cut down on the fetch by using a cache.

How are you doing your benchmarking of the speed?

raymor 07-19-2011 06:01 AM

Subqueries are almost always more expensive than the same query expressed as a join. This is because the optimizer is wreak with subqueries and that is being worked on.

miXXXture 07-19-2011 09:39 AM

caching the results is one solution which could help but this time some other factors make things complicated... well, i still can work on that...

subquery got involved, because i could not manage to work with GROUP BY and ORDER BY in the same query.. regardless of what ORDER BY dictates, GROUP BY returns same results....

i think, one of the bad things i do in here is ordering by impression to provide rotation...

ok, i guess it's better to get back simple javascript banner rotator for server's health :)

camperjohn64 07-19-2011 09:52 AM

Another thing I was thinking this morning was having a fourth table with all the pre-calculated results:

Code:

table4:
  text html;
  int (1) category;

Then precalculate that table with the results of all possible joins from your original SELECT:

Code:

INSERT INTO table 4 (SELECT all possible ads blah blah)
Then just grab random ads from table4 based on category. No unions, no joins, no lookups, no nothing.

Every time you add or delete an affiliate or advertisement, you re-run the calculation for table 4 and re-populate it.


All times are GMT -7. The time now is 12:30 AM.

Powered by vBulletin® Version 3.8.8
Copyright ©2000 - 2025, vBulletin Solutions, Inc.
©2000-, AI Media Network Inc123