How smart is Google(bot) when it comes to text
Let's take this text for example. It's from a news site.
England Prostitutes in the red light district of Ipswich, England, are being warned by police to stay off the streets, after the bodies of two more women were discovered today. If I copied that and put it on my blog, Googlebot will find out and kick my ass for it, right? What if I changed it like this: England Prostitutes in the red light district of Ipswich, England, are being warned by the police to stay off the streets. This happened after the bodies of two more women were discovered today. Is this enough to create a unique text in Google's eyes? |
No, it's still 90% identical, if that's what you're asking?
|
Smarter than you!
|
Have no idea where you're going with this
|
I think if you copied that from somewhere else, popped it on your blog, and changed those two things, Googlebot would know and give you a lower SERP position than the original piece of text.
I think this phrase, "England Prostitutes in the red light district of Ipswich, England, are being warned by", is long enough for it to spot the cut and paste. |
BTW Franck, I think you're cool
|
It's debatable, but below 60% similarity seems to be where most people say text becomes 'unique' in Google's eyes.
|
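For what it's worth, you can roughly measure that "percent identical" figure yourself. Here's a minimal sketch using Python's standard library: a character-level similarity ratio, which is just one crude way to put a number on it, not anything Google is known to use:

```python
from difflib import SequenceMatcher

original = ("Prostitutes in the red light district of Ipswich, England, "
            "are being warned by police to stay off the streets, after the "
            "bodies of two more women were discovered today.")
rewrite = ("Prostitutes in the red light district of Ipswich, England, "
           "are being warned by the police to stay off the streets. This "
           "happened after the bodies of two more women were discovered today.")

# SequenceMatcher.ratio() returns a similarity score between 0.0 and 1.0.
similarity = SequenceMatcher(None, original, rewrite).ratio()
print(f"{similarity:.0%} similar")  # comfortably above a 60% 'uniqueness' threshold
```

By this measure, adding "the" and splitting the sentence in two leaves the rewrite far closer to the original than any 60% threshold would allow.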
i'd say about half needs to be unique
|
Quote:
I reckon Googlebot is smarter than your signature, for one. As for your example: yes, the particular example you gave would pass, as you split a compound sentence into two. Googlebot works off summarising content. As it's a bot, it doesn't know what the fuck the content is, just how it's formed. As such, it can only summarise sentences, and if the summary comes out the same, then the sentence is considered the same. As it's now two sentences, it passes. E.g. a summary at 1% of the initial phrase is identical: the sentence is a perfectly formed compound sentence which cannot be said any more simply in one sentence. Your new example can be simply summarised as "England Prostitutes in the red light district of Ipswich, England, are being warned by the police to stay off the streets." since the fact that follows is irrelevant. Maybe GB even strips all adjectives from a sentence before analysing it; in that case your first new sentence would have problems... If this shit is important to you, you should really buy a Mac (summarising is built in)... or write your own content... |
Even summarising my post with a lax setting wouldn't pass:
summarised as: As for your example, then yes, the particular example you gave would pass, as you split a compound sentence into two.... As such, it can only summarise sentences, and if the summary comes out the same then the sentence is considered the same.... E.g. a summary at 1% of the initial phrase is identical: the sentence is a perfectly formed compound sentence which cannot be said more simply in one sentence. Your new example can be simply summarised as "England Prostitutes in the red light district of Ipswich, England, are being warned by the police to stay off the streets."... Maybe Googlebot even strips all adjectives from a sentence before analysing it; in that case your first new sentence would have problems. |
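Whether or not Googlebot literally summarises, the "reduce each sentence to a canonical form, then compare forms" idea can be sketched. Everything below is a guess at the mechanism: the stopword list and the canonicalisation scheme are invented for illustration, not Google's actual processing:

```python
import re

# Invented stopword list for illustration only.
STOPWORDS = {"the", "a", "an", "of", "by", "to", "in", "is", "are", "this"}

def canonical(sentence: str) -> tuple:
    """Reduce a sentence to a crude canonical form: lowercase words minus stopwords."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return tuple(w for w in words if w not in STOPWORDS)

def forms(text: str) -> set:
    """Split into sentences and canonicalise each one."""
    return {canonical(s) for s in re.split(r"[.!?]+", text) if s.strip()}

old = ("England Prostitutes in the red light district of Ipswich, England, "
      "are being warned by police to stay off the streets, after the bodies "
      "of two more women were discovered today.")
new = ("England Prostitutes in the red light district of Ipswich, England, "
       "are being warned by the police to stay off the streets. This happened "
       "after the bodies of two more women were discovered today.")

# shared is empty: splitting the compound sentence changed every sentence-level
# form, so under this toy scheme the rewrite "passes" as different.
shared = forms(old) & forms(new)
```

Note that under this scheme, merely inserting "the" before "police" changes nothing, since stopwords are stripped before comparison; it's the sentence split that alters the forms.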
like others have said, you need to change way more than just adding 2 extra words...
|
It's amazing to think that Google can compare EVERY sentence versus every OTHER sentence it has indexed, and then decide who the "legitimate" owners are.
Most of what is "common knowledge" of what Google "does" is likely just FUD: fear, uncertainty, and doubt. Look at Google ITSELF, for example. They run http://news.google.com right? Isn't that just RSS feeds? Isn't it duplicate content? Look at Autoblogger sites as another example. Clearly, customers rank highly in SERPs, clearly Bliggo betas have kicked ass and taken names, and clearly, the majority of content from THOSE sites is pulled from feeds. It just doesn't add up, in my opinion. Your results may vary, but mine have been very good using what others complain may be "duplicate content." Hope that helps. |
Summarise, then check and flag for further analysis later if need be, yes, but not an all-sentence-by-sentence cross-check. |
Great information here.
Thanks :) |
If they scare you into thinking they're doing something, you probably aren't going to do it. This allows them to get the "effect" without having to code the "cause." I think they DO check for duplicates, on some level, probably on the same IP or block of IPs. It wouldn't, for example, be advantageous to create a bunch of clones of the same site on your server. This, also, is speculation, based on nothing more than what Google has said, true or not. |
There is a percentage, but I'm not sure what it is; it's probably on a sliding scale. If Google hurt sites that had news feeds, that would just be in poor taste. However, most bloggers/writers, when asked about copying their work, require a 15% change in the text.
|
Take a look at the small print of the site you are taking the text from and you'll see that it's a copyright issue to alter the text in any way, shape, or form. Keep altering the text of a BBC/AP/[insert wire service here] report and you'll find your host pulls your site pretty quickly rather than face a suit. Don't worry about altering the text: as far as Google is concerned, your SERP position will be based on RELEVANCY rather than originality. Copy the text word for word, but don't quote the entire article. Instead, copy part of it and link the rest in a new window. |
Great thread, threadstarter!
|
The only way for a bot to detect duplicate content is to create some sort of hash(es) for a page; everything else would be way too complicated. So pages done by a doorway generator, or blogs fed only by one sponsor feed, will certainly suffer a DC penalty, but in general DC seems to be real hype and everybody is blaming DC for worse rankings. Links to the wrong sites, massive direct reciprocal linking, too much crosslinking of your own sites, and huge increases in backlinks in a short time hurt a lot more than any DC ever will! |
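A hash-based check like the one described could look roughly like this: hash every overlapping run of words ("shingles") and treat the set of hashes as the page's fingerprint. This is a toy sketch of the general idea; the shingle size is an arbitrary choice and none of this is Google's actual implementation:

```python
import hashlib

def shingle_hashes(text: str, k: int = 5) -> set:
    """Hash every run of k consecutive words; the set is the page's fingerprint."""
    words = text.lower().split()
    return {
        hashlib.md5(" ".join(words[i:i + k]).encode()).hexdigest()
        for i in range(max(1, len(words) - k + 1))
    }

def resemblance(a: str, b: str) -> float:
    """Jaccard overlap of two fingerprint sets: 1.0 = identical, 0.0 = no shared runs."""
    ha, hb = shingle_hashes(a), shingle_hashes(b)
    return len(ha & hb) / len(ha | hb)
```

Two near-duplicate pages share most of their shingle hashes and score near 1.0; unrelated pages share none and score near 0.0, so a crawler never has to compare raw text sentence by sentence.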
I may add: always check for NPH proxies targeting you; they can shoot you out of Google very fast... my worst concern about DC.
|
This conversation is terrible. The original question is nonsense. You’re worried about two sentences.
ROBO |
It's interesting stuff for sure. |
Powered by vBulletin® Version 3.8.8
Copyright ©2000 - 2025, vBulletin Solutions, Inc.
©2000-, AI Media Network Inc