Content Scraping: What Do You Do?

by Eileen Smith Jan 4, 2010
Once your internet footprint reaches a certain size, chances are people will start scraping your content. Matador contributor Eileen Smith shares a few thoughts on what happened to her.

I WAS pre-coffee tweeting one morning when I saw a tweet on wine tasting in South America, a story I had submitted a few days earlier.

Oh good, I thought, my story is published.

As a freelancer, especially one who writes for the web, it’s sometimes hard to know when something of yours is going live, even with Google Alerts, and you have to keep your finger on the pulse (or watch your blog traffic) to see what’s up.

Five minutes later, stovetop espresso in hand, I clicked through the link I’d sent my followers. The whole story was scraped. The story that I had pitched, had accepted, researched, and written specifically for publication had been lifted wholesale and placed elsewhere. For free.

Scraping is stealing someone’s content and posting it as your own. In the past I had seen bits and pieces of what looked like my stuff, and even photos I’d taken posted elsewhere. I would write a little, hey, you-know-what email, and usually get some satisfaction, a link at least.

But this? This had my editor messaging me asking if I’d double-submitted, a major no-no in this incipient industry. It also had me wondering just what had gone wrong. It happened that the site which had scraped my article belonged to someone who had recently asked me to do a guest blog post.

I hesitated for a minute, wondering if I’d somehow given permission for him to steal the content. Classic blame-the-victim mentality.

In the end, my editor contacted the offending party, who removed the content. I retweeted the real URL, and I sat and fumed, downing more coffee, waiting for an apology that never came. I contacted some people with thicker skins and more years on the job than me and came away with some different perspectives. Then I posted my frustration on my blog, where I knew the scraper, my editors, and every other visitor (maybe even some of you) would read it.

With content scraping, the question is not so much if it will happen to you, but when. Do something out of the ordinary, achieve a small amount of notoriety, or write something clever, then sit back and relax. Anyone, anywhere can lift your work and pass it off as their own, without so much as a credit, link, or thank you.

So what’s a creative, prolific person to do?

You could not publish anything, anywhere, keeping it all for yourself under lock and key. Ick. You can watermark photos or use Flickr’s “all rights reserved” stamp (though this amounts to nothing more than a “pretty please don’t steal my photos, thanks”).

Writing is trickier. The written word is easily cut and pasted, or retyped from print onto a blog. South African infertility blogger Tertia Albertyn found several entries from a published book she’d written (So Close: Infertile and Addicted to Hope) posted on another blogger’s website.

Julie Schwietert, managing editor at Matador and one of the people who held my hand through my scraping experience, told me about a Cuban photographer friend of hers whose photo she’d seen in a gallery in New York.

He doesn’t follow up on these cases, he says, because the energy required exceeds the benefits he would reap. It’s not that he necessarily throws photo licenses to the wind, just that he knows that, realistically, he would make himself sick trying to track all of these infringements down.

David Miller, Matador’s senior editor, has another take on artists’ rights, which he explained to me over Spanish tortilla one evening in Santiago. He believes Creative Commons licenses are the way to go.

CC defines itself as “a nonprofit corporation dedicated to making it easier for people to share and build upon the work of others, consistent with the rules of copyright.” CC has gained popularity via Flickr, where users can specify how their works may be used: with credit, for financial gain or not, and so on. Artists using CC have the benefit of increasing their internet footprint, with the possibility of remuneration coming via special projects. A good example is Trey Ratcliff, the most popular travel photographer on the web.

6 Thoughts on Content Scraping

1. Expect it. If you’ve got it out there, expect it to turn up somewhere else.

2. Prevent it. If it’s important to you to prevent it, take steps to do so. Hide it, watermark it, post it as an un-copyable PDF.

3. Find it. Go out and troll likely thieves, search uncommon character or word strings or check your Flickr referrals and see where people are coming from. Often, someone has linked to your photo from Flickr, and not rehosted it, which makes the theft easy to track.

4. Defend it. If you’re irked, set your editors, your blog readers (like Tertia’s), and any other bloodhounds you have working on your behalf to storm the castle. Ask politely for the content to be removed. Grow steadily more insistent if they refuse or ignore you.

5. Accept it. Take a page from Julie’s photographer friend’s book, and realize that it’s more important to hone your craft than it is to chase down wannabes.

6. Do an end-run around it. By marking your work Creative Commons, you increase exposure. Consider that disseminating your work (even freely) does not cheapen your ability to express yourself, and if you develop your craft to the point where you have your own voice and vision, no one will believe that anything you create belongs to someone else.

Personally, I’m working on moving towards step 6, but I must report with sadness that I’m still in the capitalist grabby mindset that what’s mine is mine, and it’s not yours to show, publish, make money from or claim as yours unless I give you permission. Let’s see how far that gets me.

Community Connection

Matadorians, where do you find yourselves? Has your content been scraped? Did you follow up? Are you ready to go Creative Commons all the way?
