Old 09-22-2011, 09:36 AM  
Babaganoosh
Join Date: Nov 2001
Location: /home
Posts: 15,841
Quote:
Originally Posted by idogfy
Thanks for your input.

My experience with using a file has been different from yours, so it's great to know you got good results in real life; I'll try it again next time. (Locking aside, queuing all writes by appending has been a killer for me in the past. File access is very often my biggest performance problem.)

So I have some questions:

- I didn't get what you meant by "they could be ordered by productivity".
- Are you not locking the file when you read it during your batch import?
- You say you don't mind losing some clicks, and I'd agree (if you have volume in the first place). So even without locking, are you, or could you be, losing some clicks with your solution?
- How big is the file after 5 minutes, and how long does it take to import it?

Would you mind sharing some code showing your strategy: what and how you write, and how you read and import it?

Thx again.
What I meant by ordering by productivity was that I displayed thumbnails according to which were clicked the most. The most clicked (or most productive) thumbnails always appeared higher in my listings, which boosted overall productivity on my site.

Before reading a file in, I would move the data files over to a temp directory for processing; I never tried to read and write at the same time. I should also mention that each thumb was given a unique ID number, so the data files were unique to the thumb. E.g. 12345.jpg had all of its clicks logged to 12345.txt. I wasn't writing every click to the same file; I don't think that would have gone well at all.
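
The rotate-then-read step described above can be sketched like this (the original code wasn't posted, so the paths and function name here are hypothetical). The idea is to move the live log aside so the logger keeps appending to a fresh file while the importer reads the snapshot:

```php
<?php
// Sketch of the rotate-then-read step (hypothetical names/paths).
// Before each 5-minute import, move the live per-thumb log into a
// temp directory so new clicks go to a fresh file.
function rotate_log(int $thumb_id, string $log_dir, string $tmp_dir): ?string
{
    $live = "$log_dir/$thumb_id.txt";
    $snap = "$tmp_dir/$thumb_id.txt";
    if (!file_exists($live)) {
        return null; // no clicks this period
    }
    rename($live, $snap); // atomic when both dirs are on the same filesystem
    return $snap;
}
```

Because rename() is atomic on the same filesystem, any click logged after the move simply starts a new file; nothing is read and written at the same time.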

But when I append to a file in PHP I never explicitly flock() it. Even the php.net page on the function says it's unnecessary, since fwrites are atomic if the handle was opened in append mode.
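
A minimal sketch of that append-only logging, assuming one file per thumb ID and one IP per line (the function name and directory are mine, not the original code):

```php
<?php
// Minimal sketch of flock()-free click logging (hypothetical names).
// Each thumb logs to its own file, e.g. 12345.jpg -> 12345.txt.
// With mode 'a', small single fwrites are effectively atomic on a
// local filesystem, so concurrent requests don't need an explicit lock.
function log_click(int $thumb_id, string $ip, string $log_dir = '/var/clicks'): void
{
    $fh = fopen("$log_dir/$thumb_id.txt", 'a');
    if ($fh === false) {
        return; // losing the odd click is acceptable here
    }
    fwrite($fh, $ip . "\n"); // one short line per click
    fclose($fh);
}
```

Keeping each write to a single short line is what makes this safe; interleaving only becomes a risk with large or multi-part writes.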

I suppose I could have been losing some click data, but I never cared enough to do any checking. My traffic script and my click-logging script showed very similar numbers for total clicks per day, so I just never worried about it. I do babysit my error logs just for fun, and I never saw many errors related to the logging of clicks, so I assume everything was working OK.

I don't really know how long it took to process the click data from each 5-minute period. I'm sure it was less than a minute or two even at the busiest times of the day, but I can't tell you exactly how long it took.

As far as code goes, it was nothing fancy: fopen, fwrite, fclose. I just logged the IP that clicked the thumb. Same story with importing the data: open a file, read it into an array, insert the data into a table. The only thing I did that caused any kind of overhead was removing duplicate IPs from each array. I logged both total raw and total unique clicks (so whoever submitted a thumbnail couldn't increase its position by clicking it over and over, at least not more than once every 5 minutes).
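
The import side could look something like this. It's a sketch with assumed names and an assumed format of one IP per line, counting raw vs. unique clicks as described:

```php
<?php
// Sketch of the per-file import step (hypothetical function name).
// Assumes the log format is one IP address per line. Dedup via
// array_unique() gives the "unique clicks" count, so repeat clicks
// from one IP only count once per 5-minute window.
function count_clicks(string $file): array
{
    $ips = file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($ips === false) {
        return ['raw' => 0, 'unique' => 0];
    }
    return [
        'raw'    => count($ips),
        'unique' => count(array_unique($ips)),
    ];
}
```

From there it's just an INSERT or UPDATE per thumb ID with the two totals, rather than one write per click.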

Keep in mind that this was on a relatively expensive dedicated server, too: multi-processor, RAID, massive RAM. I wasn't doing this on a virtual account at HostGator. The scripts were never a performance issue, except when I was trying to log each click to MySQL immediately. That sucked. Most of my server load came from Apache serving all of those thumbnails.