Paysite Webmasters: do you allow members to use mass downloading software (quadsuck)
Paysite webmasters: Do you allow or prevent the use of mass downloading software programs like Quadsucker, etc...
The reason I ask is that these programs are wreaking havoc on my page counters (the enhanced stats area), since the program follows every last link. The program will also download a file to the member's HD first before deciding whether it is a dupe, and only then decline to overwrite it. Of course, the average surfer doesn't care about configuring the program to filter out unnecessary files. Judging by my logs and server task manager, I have members who set their Quadsucker to DL the entire site every morning before going to work, just to save themselves a few mouseclicks for that day's update. My site is around 500 megs with over 40,000 files, and its page counter scripts are very server-CPU intensive. Yes, I have protected the page counter areas so they cut people off after 10 attempts, but their program goes on and on, trying to download every last link, every day.

Now I know bandwidth is cheap nowadays, but server CPUs are not. When the server CPUs are pegged at 100%, it fucks with my recurring scripts every night. I've added a dual-processor Xeon server just to deal with the page counters, and I have done other things as well. Servers are cheap, I have good retention, and I'm not complaining about revenue. My question is: should I just let them do this every day, or should I crack down on them downloading the entire site's 40,000 files for 10-15 new files, and possibly piss them off when they get cut off? It seems like a big waste of server CPU and PHP processes. Thoughts? Thanks, and Happy New Year to all. My apologies for a biz thread on the first day of the new year.
Fuck 'em, block them.
block 'em
Yep, I agree.... block 'em hard.
We block them as well.
Quadsuck.. will have to look into that one.
block em
block them, block them all
Block 'em; if not, you're just losing money.
Are you running Unix with Apache? How would you block them? I'd love to know how and do it too. Thanks.
My TOS disallow them and my tech blocks them.
My sites are designed for human browsing in real-time only. -Dino
We block them as well. They are stealing your content, something that you paid for one way or another.
We block them as well.
Yeah, do exactly what every post here said.
My dedicated server is getting pegged by these guys. I only have about 400 megs of pics on the site, and some idiots pull 2 and 3 gigs... it's really frustrating. My images are all served through a gallery software package (PHP), and the spiders/rippers beat the crap out of the server. Any thoughts?
You cannot block the good downloaders, since they pretend to be web browsers. Only some advertise themselves, so those you can block in .htaccess.
Me, I have mixed feelings about it. They have paid for the content, and some members with crappy connections like to download all night so it will be on their hard disk. I've thought about putting in an .htaccess with a couple hundred blocks, but you'd be lucky to catch 50% that way.
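For anyone who wants to try that .htaccess route anyway, a minimal sketch might look like this. The ripper names are examples only, and as noted, anything that spoofs a browser User-Agent sails right through:

Code:
# Deny a few site rippers that advertise themselves in the User-Agent.
# (mod_setenvif + old-style Order/Allow/Deny directives)
SetEnvIfNoCase User-Agent "quadsucker" ripper
SetEnvIfNoCase User-Agent "webzip"     ripper
SetEnvIfNoCase User-Agent "httrack"    ripper
SetEnvIfNoCase User-Agent "teleport"   ripper
Order Allow,Deny
Allow from all
Deny from env=ripper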
My feelings about it aren't mixed at all; it's simple economics. These guys will crush you if you don't block it.
Does anyone know of a way better than .htaccess blocks? I imagine you could come up with various ways to serve the content up dynamically, but that's quite an undertaking.
If you decide to block site rippers, there's a bad way to do it, an OK way to do it, and a good way to do it.

The bad way is to have a .htaccess listing hundreds of User-Agent strings that may be used by site rippers. This is bad for two main reasons. #1, most site rippers let the user change the User-Agent string; many have a simple checkbox to make it send the same user agent as IE6. Though IE6 IS 3 years old already, a lot of people still use it, of course, so you can't block it - which means you can't block the site rippers that can so easily spoof this user agent, which is most of them. Your list will also never be complete, so many site rippers would be unaffected. #2, each of those user-agent lines is a condition that Apache has to evaluate for every single hit. If the user loads a page with 30 thumbnails, Apache has to go looking through that list for each and every thumbnail, doing thousands of comparisons just to load one page. Performance WILL suffer noticeably if you have any reasonable amount of traffic whatsoever.

A slightly better way is to apply one of the cardinal rules of security: disallow everything, then allow only what is OK. Rather than listing hundreds of user agents (browsers) that aren't allowed, you just list the 5 or 6 that ARE allowed - IE, Mozilla and its variations (including Firefox and Netscape), Windows Media Player, RealPlayer, Opera, and Safari. That takes care of the problem of keeping the list up to date (until you get members using some other browser) and solves the performance issues. It still leaves you wide open to spoofing, though, and because it looks not at the problem - site ripping - but only at the user agent, it's not 100% effective. For example, IE has site ripping built right in! Add your site as a favorite, then select offline viewing, and you can use IE itself as a site ripper. How are you going to block that? You're going to block it by detecting and acting upon the act of site ripping itself, rather than on the name of the software. That's the right way to do it, which brings us to option #3.

The best way, probably, is by using Strongbox. Besides having by far the most sophisticated protection against brute-force attacks and password trading, Strongbox also defends against site rippers by actually detecting and stopping the ripping process - the following of every link. Additionally, it provides an enhanced version of method #2, where a user can only access a page or image on the site by using the same browser he or she logged in with in the first place. These two defenses combined are much, much more effective than naive attempts based on listing the user-agent headers sent by known rippers. For more info on Strongbox, see: http://www.bettercgi.com/strongbox/
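For what it's worth, method #2 might look roughly like this in .htaccess. This is a sketch of the general idea, not Strongbox's code; the agent substrings are illustrative, and "Mozilla" in particular is broad, since nearly everything (including spoofing rippers) sends it:

Code:
# Allow only known browser User-Agents; refuse everything else.
SetEnvIfNoCase User-Agent "MSIE"                 allowed
SetEnvIfNoCase User-Agent "Mozilla"              allowed  # Firefox/Netscape too
SetEnvIfNoCase User-Agent "Opera"                allowed
SetEnvIfNoCase User-Agent "Safari"               allowed
SetEnvIfNoCase User-Agent "Windows-Media-Player" allowed
SetEnvIfNoCase User-Agent "RealPlayer"           allowed
Order Deny,Allow
Deny from all
Allow from env=allowed

The short list keeps Apache's per-hit work down compared to hundreds of blocklist lines, but as the post says, it does nothing against spoofed agents.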
They can't with our sites; people who DL the whole site will get blocked, and I'm not talking 100 megs or so...
I took all my user-agent blocks out of .htaccess just to see, and my CPU level dropped by half. I'm checking out Strongbox right now.
Raymor - I don't have a members' area. It's basically a freesite. I'm mostly interested in stopping the slurping. For various reasons I'm stuck with phpAdsNew and the photo gallery software I use, and the resource drain from rippers grabbing all the links is pretty bad.
Can I use the slurp protection without having logged-in members?
Weird how people try to do everything from .htaccess; I'd just block them on an IP-by-IP basis.
Good old route, nothing beats route!
A very wise person (Lensman) once told me how to block those slurpers.
Sure enough, it works like a charm! We wrote our own slurpguard, and I get an email 100% of the time someone tries to grab our goodies. It also blocks their IP, so I don't have to pay for the clock cycles or bandwidth.
OK, I'm interested in this as well. What is the best software to prevent these kinds of programs from downloading all of the content on your site at the same time? Will Pennywise do it?
Bump and a huge THANK YOU to raymor. This guy rocks. He spent a couple hours working on anti-ripping scripts and strategies for one of my sites, then helped me out with some regex stuff.
If you need any protection for your paysite member areas or anti-ripping/slurping, do your business with Ray. He really knows his shit and is a super good guy. [fusionx seal of approval goes here]. I gotta photoshop something for this :)
Just had my traditional Bohemian/German New Year's dinner!!!
I had rabbit hasenpfeffer with bread dumplings,
and roast pork and sauerkraut... Good stuff :thumbsup Helps when you, like, marry a Czech girl :1orglaugh I love Praha
Thanks to all the webmasters and programmers for the responses.
I am on a Windows box, so Strongbox is not an option, but I do have a comparable product that allows me to throttle and block. I don't use the throttle option, though, as that leads to emails from pissed-off members. If they are clearly using a mass download program, and I have analyzed the logs to identify them, I give them a warning. The second time, I cut them off and change their password.

The benefit of protecting the page counters is that I can enable dictionary-attack detection; to the server, it looks like exactly that when they try to Quadsuck 10,000 page counter scripts, and they get cut off for 6 hours. That is enough for one day; they try again the next and receive an email. Fortunately I only have to deal with this a couple times a week with 3000+ members.

The real problem is, it will only get worse. As raymor correctly pointed out, this "quad" capability is built right into IE6 (offline files), which I had not considered. As technology advances, we have to stay ahead of the cheaters yet still provide for our members. Arrrrggg. More boxes, more RAM, dual CPUs, more bandwidth, and less revenue from the "cheaters". It's a pain in the ass to review logfiles every day, but hey, who else but me is going to do it, ya?

Thanks again for all the viewpoints - some great answers!! take care, i be lurkin' back at ya
Why would anyone in their right mind allow that?
Good tips in this thread - thanks.
Another aspect of limiting abuse is bandwidth throttling - Apache has mods to limit the number of simultaneous sockets and the rate of download (regardless of the user's connection speed).

RE the IP blocking approach: I'm not sure if anyone has pointed it out yet, but blocking an IP may block legitimate visitors using the same provider (AOL, etc.). If you do lock out a specific IP, do it for a 'cooling period' and then reinstate it so legitimate users can still get in. Fortunately with members, as soon as a violation of the TOS is determined, you can kill their access.

RE "only dumb site scrapers will reveal they are NOT a browser": dealing with this type of download via User-Agents is pointless. Generally, the usage pattern will reveal an abusive agent. I lock out anyone who continuously hits my site at a rate faster than two pages per second. -Dino
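A rough PHP sketch of that combination - a per-IP rate check plus a cooling period - might look like the following. This is not Dino's actual setup; the thresholds, the /tmp flat file, and the one-hour ban are all invented for illustration, and a production version would want a database or shared memory with proper locking:

Code:
<?php
// Hypothetical per-IP page-rate limiter with a cooling-off period.
$ip      = $_SERVER['REMOTE_ADDR'];
$max     = 2;      // pages per second before lockout
$cooloff = 3600;   // cooling period in seconds

$file = '/tmp/rate-' . md5($ip);
$now  = time();
$data = @unserialize(@file_get_contents($file));
if (!is_array($data)) {
    $data = array('second' => $now, 'hits' => 0, 'banned_until' => 0);
}

// Still inside the cooling period? Refuse the request.
if ($now < $data['banned_until']) {
    header('HTTP/1.1 403 Forbidden');
    exit('Temporarily blocked - slow down.');
}

// Count hits within the current one-second window.
if ($data['second'] != $now) {
    $data['second'] = $now;
    $data['hits']   = 0;
}
$data['hits']++;

// Faster than $max pages/sec: start the cooling period.
if ($data['hits'] > $max) {
    $data['banned_until'] = $now + $cooloff;
}

file_put_contents($file, serialize($data));
?>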
If you have a site where people have to log in to access content and you use a script to serve the images, it would be easy to use sessions to keep track of how many pictures are being downloaded.
It wouldn't be hard to block people who download, say, 5 pictures/second for a full minute straight, and then stop giving them pictures.
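As a sketch of that session idea (the variable names and the 300-image threshold are made up, and it assumes the images are already served through a PHP script, like the gallery packages mentioned earlier in the thread):

Code:
<?php
// Count images served per login session; 5 pics/sec sustained for a
// minute = 300 images in a 60-second window.
session_start();

$now = time();
if (!isset($_SESSION['img_window']) || $now - $_SESSION['img_window'] >= 60) {
    $_SESSION['img_window'] = $now;   // start a fresh one-minute window
    $_SESSION['img_count']  = 0;
}
$_SESSION['img_count']++;

if ($_SESSION['img_count'] > 300) {
    header('HTTP/1.1 403 Forbidden');
    exit;                             // stop giving them pictures
}

// Otherwise serve the image as usual (basename() blocks path tricks).
header('Content-Type: image/jpeg');
readfile('/path/to/images/' . basename($_GET['img']));
?>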
solution
Following up: I have since figured out a great way to block the quadsuckers. I embedded three versions of a hidden link on all member pages. Members cannot see these links to click on, but their quadsuck software can and will. Once the hidden link file is downloaded, a script is activated that disables their login and sends me an email. Easy as pie - no more quads at my site.
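For anyone wanting to roll their own version of that trick, it might look roughly like this. Hypothetical code, not the poster's actual script: the paths, email address, session field, and flat-file blocklist are placeholders, and a real site would flag the account in its member database:

Code:
<!-- On each member page: a link a human never sees, but a ripper follows. -->
<a href="/trap.php" style="display:none"></a>

<?php
// trap.php - disables the login and emails the webmaster.
session_start();
$member = isset($_SESSION['member']) ? $_SESSION['member'] : 'unknown';
$ip     = $_SERVER['REMOTE_ADDR'];

// Illustrative flat-file blocklist checked by the login script.
file_put_contents('/path/to/disabled-logins.txt', "$member $ip\n", FILE_APPEND);

// Tell the webmaster a ripper tripped the wire.
mail('you@example.com', 'Ripper trapped',
     "Member $member at $ip followed a hidden trap link.");

header('HTTP/1.1 403 Forbidden');
exit;
?>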