In the past month or so I’ve unsubscribed from at least 5 so-called “SEO blogs” because they told people to break the law. Scrapers will traditionally use live content, content that is fresh and new. However, these SEO blogs told people to scrape from dead sites instead of live ones.
The process basically goes like this:
- Find dead site
- Go on the Internet Archive
- Steal content
Content on a dead site is not automatically free of copyright. When a webmaster abandons a site or marks it as dead, they still hold the copyright on it, so you can’t copy it unless they explicitly state “This content is in the public domain” or they use some sort of license that allows copying.
In addition, scraping dead content can still cause you issues even if it is uncopyrighted. Think about the other scrapers who’ve already taken it: duplicate content. And chances are, if the search engines find another scraper’s version before yours, they will treat that copy as the “original”, which will not harm them but could harm you.
Let’s not forget the fact that it is completely unethical. Ethics on the Internet has been going downhill for some time (it’s become much more noticeable since the social networking boom). I tend to compare people without ethics to monkeys who throw their own poo. They are both stupid and lame. Set yourself some good rules and stick with them for the rest of your life.
Personally I think any form of content theft is wrong, even if it is as simple as using a press release (which can also be copyrighted, by the way). How often do you see the really popular bloggers do that? Do you think using someone else’s text and claiming it as your own makes you look like a better blogger? It doesn’t.
Write your own content!
And if you are really out of ideas, why not try guest blogging?

July 6th, 2007 at 5:56 pm
An excellent article and a point I had not considered. Bravo!
I’ll have to write a follow up about this.
But here’s the critical thing to remember: Life plus seventy.
If the author has not been dead for seventy years, it is still copyright protected. It doesn’t matter if they destroy it, remove it from the Web, renounce it or are ashamed of it, they still hold copyright protection and, unless they place it into the public domain or CC license it, the default is do not copy.
It is worth saying that people who have abandoned their sites are much less likely to search for their content and try to stop infringement. However, that isn’t to say that they didn’t just pack up and move either…
July 6th, 2007 at 6:10 pm
Would be creepy if some dead person’s site got ripped off and they somehow sued
July 6th, 2007 at 9:27 pm
Well, you can will your copyrights and then have a loved one sue…
It’s happened.
July 6th, 2007 at 11:30 pm
Out of curiosity are you the one using commentful from blogflux? If so, how well does it work?
July 7th, 2007 at 12:56 am
Yes, I am using it. It’s working quite well actually. I needed to keep track of my comments and, sadly, not all blogs offer the neat “subscribe via email” feature your site and mine provide.
I’ve liked it so far, especially with the FF plugin, and it seems to work with more sites than Co.Comment. I’m early in the experiment, and it is indeed an experiment, but I am impressed.
If you want I’ll keep you posted…
July 7th, 2007 at 1:01 am
Yeah, sure, just drop a comment or e-mail or whatever with your experiments and such. Thanks ahead of time.
July 9th, 2007 at 3:10 am
I totally agree with you. I’m sick of people who copy the entire content of my posts and think that adding a linkback makes everything okay. Naturally that linkback is useless: Uncle Google just applies its duplicate content filter.
By the way, it’s much better to generate
July 9th, 2007 at 9:11 am
A few months back I used to let sploggers go if they stole really small posts and linked back (as long as the links were followed), but I switched to a zero-tolerance policy and wrote myself a nice script that lets me keep my sanity while filing DMCA after DMCA. Just insert their URL, name, the original post, and boom, done.
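For anyone curious what a script like that might look like, here is a minimal sketch in Python. It is not the commenter's actual script and it is not legal advice. The notice wording, the `build_notice` function, and the `sender` parameter are all assumptions made for illustration, and so is the guess that "name" means the person or host the notice is addressed to.

```python
from datetime import date
from string import Template

# Hypothetical notice wording, written for illustration only.
# Check the real requirements before sending anything.
NOTICE = Template("""\
Date: $today

Dear $name,

I am the copyright owner of the work originally published at:
  $original_url

The following page reproduces that work without my permission:
  $infringing_url

I have a good faith belief that this use is not authorized by the
copyright owner, its agent, or the law. The information in this notice
is accurate and, under penalty of perjury, I am the owner of the
copyright involved.

Please remove or disable access to the infringing material.

Signed,
$sender
""")


def build_notice(infringing_url, name, original_url, sender, today=None):
    """Fill the template with the inputs mentioned in the comment.

    `sender` is an extra, hypothetical parameter for the signature line.
    `today` can be passed in to get a reproducible date.
    """
    today = today or date.today()
    return NOTICE.substitute(
        today=today.isoformat(),
        name=name,
        original_url=original_url,
        infringing_url=infringing_url,
        sender=sender,
    )


if __name__ == "__main__":
    # Placeholder values, not real sites or people.
    print(build_notice(
        infringing_url="http://splog.example/stolen-post",
        name="Abuse Team",
        original_url="http://myblog.example/original-post",
        sender="Jane Blogger",
    ))
```

Keeping the wording in one template means each new notice only needs the three values filled in, which is presumably where the sanity saving comes from.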