Old 02-19-2012, 01:36 PM  
borked
Totally Borked
 
Join Date: Feb 2005
Posts: 6,284
Quote:
Originally Posted by Brujah
Right, but you'd be returning the codes so I thought it could be an improvement if you returned more than just a yes or no. Factors like if pixel size matched, byte size matched, fingerprint matched, or whatever kind of variables are worth taking into consideration for the cms or developer to find useful.
This is kind of what k0nr4d was alluding to higher up. Trust me, a match is pretty much a match. The only variable that comes into play is the length of the query movie. A match can be made on anything > 10 seconds. But match a 10-second clip against a db of hundreds of thousands of 30-minute movies and your false positive rate goes up *ONLY* if the real movie is not in the db. If it's in there, the real movie will *always* come out on top.

When the query clip is unique and not in the db, and it is short (< 1 min), *then* there is a chance of false positives. It must be said that anything longer than a few minutes will not generate a false positive (in my tests), but to be 100% sure, and to add a scoring system to the results, I would need a db of true negatives for the script to learn from. For this, the true negative db needs to be at least 1% of the size of the positive db. I have a negative dataset, but it's from youtube, so it's pretty pathetic for the real world of pron vids. The capability is there though, so yes, I'll look at it. Shit, if need be, I can just use the tubes db and randomly pick out some to use for training. Suffice it to say, the bigger the negative dataset, the higher the confidence ;)
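
For anyone wiring this into a CMS, the length rules above could be sketched roughly like this. It is only an illustration, not the actual script: the function name and the labels are made up, and the only numbers taken from the post are the 10-second minimum and the 1-minute "short clip" line. The post does not say exactly where "a few minutes" starts, so anything from 1 minute up is simply left unflagged here.

```python
# Thresholds taken from the post; everything else is a hypothetical sketch.
MIN_MATCHABLE_SECONDS = 10   # a match can be made on anything > 10 seconds
SHORT_CLIP_SECONDS = 60      # clips < 1 min may give false positives


def query_length_flag(query_seconds: float) -> str:
    """Label a query clip by length (hypothetical helper, not the real API)."""
    if query_seconds <= MIN_MATCHABLE_SECONDS:
        # Not longer than 10 seconds: no match can be made at all.
        return "too_short_to_match"
    if query_seconds < SHORT_CLIP_SECONDS:
        # A false positive is possible, but only if the real movie
        # is not in the db.
        return "false_positive_possible"
    # 1 minute or longer: the post gives no rule, so no flag is raised.
    return "no_flag"


if __name__ == "__main__":
    for seconds in (5, 30, 600):
        print(seconds, query_length_flag(seconds))
```

A caller could use the flag to decide whether a hit on a short clip deserves a manual look before acting on it.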

--edit: and to add to that 'suffice it to say', throw a positive into that sea of negatives and the whole thing crumbles (which is why I used youtube for my negative dataset in testing)
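
As a rough sketch of the "randomly pick out some" idea with the two constraints above (negative set at least 1% of the positive db, and no positives leaking in), something like the following would do. This is an assumption about how it might be done, not the real training code: `sample_negatives`, its parameters, and the idea that you already hold a list of known-positive ids are all hypothetical.

```python
import random


def sample_negatives(candidate_ids, known_positive_ids, positive_db_size,
                     min_percent=1, seed=None):
    """Randomly pick a negative training set (hypothetical helper).

    Returns enough ids to reach min_percent of the positive db size,
    drawn only from candidates that are not known positives.
    """
    # Ceiling of positive_db_size * min_percent / 100, in integer maths.
    needed = (positive_db_size * min_percent + 99) // 100

    # One positive in the negatives breaks the training, so filter first.
    positives = set(known_positive_ids)
    pool = [vid for vid in candidate_ids if vid not in positives]

    if len(pool) < needed:
        raise ValueError(
            f"need {needed} negatives but only {len(pool)} candidates remain"
        )

    rng = random.Random(seed)  # seed only makes the pick repeatable
    return rng.sample(pool, needed)


if __name__ == "__main__":
    picked = sample_negatives(range(100), {0, 1, 2, 3, 4},
                              positive_db_size=1000, seed=42)
    print(len(picked))
```

The filter only helps as far as the known-positive list is complete; any positive that is not on that list would still slip through, which is the risk of sampling from the tubes db instead of an unrelated source.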

All stuff for me to worry about, not you guys


--edit x2
Quote:
Factors like if pixel size matched, byte size matched, fingerprint matched, or whatever kind of variables are worth taking into consideration for the cms or developer to find useful.
To be clear, the only thing being sent is the fingerprint, so only that can match... the fingerprint contains all the info needed to match, along with the video metadata.
__________________

For coding work - hit me up on andy // borkedcoder // com
(consider figuring out the email as test #1)



All models are wrong, but some are useful. George E.P. Box. p202

Last edited by borked; 02-19-2012 at 01:42 PM..