BackType: Scraping comments without your consent and trying to monetize them.
It is one of those ideas where you say ‘yeah, right, why did nobody else think of this?’
TC reports on BackType, a Twitter for comments:
They’re a blog-comment focused startup – founders Christopher Golda and Michael Montano are for the first time aggregating all comments from millions of blogs into a single, searchable, parsable stream. Think Twitter for all comments on the web.
A quick check shows that yes, indeed, they scraped a lot of websites, which I don’t remember agreeing to.
I left the following comment on the article (instead of rearranging the same sentences here, I am copying and pasting it):
Especially as you are funded, you are in it for the money. Where is my share, given that you are using my content without proper licensing rights?
Basically, you are scraping my content and displaying it without my consent. The fact that I comment on a blog like this or consciously add content through a site like Disqus does not make my content fair game for anybody else to scrape.
Do I see this as useful? Yes. Is it a valuable resource once filled? Yes. Could you have built the database from all those nutty early beta testers who handed over their blogs? Absolutely.
Am I amused to find my content in there, displayed in a fashion I did not agree to, without being asked first? Absolutely not. Even less so as there is no way to opt out, where you should have asked in the first place.
Even more so since you yourselves take protecting ‘your’ content seriously. From the T&C:
“All trademarks not owned by BackType that appear on this Site are the property of their respective owners, who may or may not be affiliated with, connected to, or sponsored by BackType. BackType-originated content included on the Site, such as text, graphics, logos, data compilations and the compilation of all content on the Site, is the property of BackType and its licensors and protected by US and international copyright laws. Except as set out in these Terms, no reproduction of any BackType-originated content is permitted without written permission from BackType. User-posted content is copyrighted, and any use or reproduction of user-posted content must comply with the terms of the respective license(s) and must include a label indicating such license.”
Btw, this makes me rethink my idea of moving my work to a more ‘scrapeable’ platform like WordPress.
The thing that probably rubs me the wrong way the most is the clear statement on what they want done with the content you leave on their site while, at the same time, they do the opposite with ‘your’ content. Even more so ‘its licensors’.
Now, I know that I am no guru in data mining, but I know that most people have no clue about it. Even worse: they have no imagination about what is possible. There is a built-in business model here: brands and others can have a company like this scan the comments left on your site for information they would like to know, and compile it. Which per se is not a bad thing; I am all for business.
But it is done behind your back, using your resources and, in the end, the content you provide on other sites to make money.
Second, it exposes your behaviour on the web. Are you sure you want others to see that? Do search for your name for a moment and see how far back it goes. Basic rule: everything which can be traced and data mined will be. (It is not as if it suddenly hit me as ‘oh my god, I never knew that’. Quite the opposite.)
Now, tell me, where can you see that you can delete this comment? Don’t think it is relevant? Have you ever asked somebody to change or remove a comment for any reason? Good luck with changing it on this site and the followers to come. Oh, and its licensors.
Btw: if you think you have nothing to “hide”, you just have no imagination.