At the heart of the issue is the fact that content creators (magazines, newspapers, blogs, and so on) own the copyright in the content they produce. When an aggregator scrapes that content and republishes it without permission, it is, in most cases, infringing copyright. In some circumstances, material is licensed for use on other websites (for example, a newspaper that syndicates its articles for a fee). In that case, the licensee may republish the material under the terms of the license, but the license does not extend to anyone else, and the publisher can still object to unlicensed copies.
The best way to understand how scraping relates to copyright law is to understand how copyright works. Copyright protects original works of authorship fixed in a tangible medium. It allows the author to control how, when, and where their work is reproduced. For example, an author could give you limited permission to reproduce their work by saying "You can print copies of my novel for your library, but you can't sell them." However, if you copy the work without the author's permission, and no exception such as fair use applies, that is a copyright violation.
When someone copies another person's work without permission, they violate those rights. The copyright owner has the right to decide what happens to the work and can respond by filing a lawsuit against the scraper, by seeking compensation through a licensing agreement, or both.
In short, republishing other people's content without permission is generally unlawful. If you find material on another website that you want to use, contact the owner and ask for permission before you post it.
Attribution isn't enough; you'll still be infringing on someone else's intellectual property. Most media organizations will not monitor every blog that reprints a story or two, but if you do this on a regular basis and on a commercial site, such as one with advertisements, you should expect to receive a takedown request (in the US, a DMCA notice).
In fact, according to the Electronic Frontier Foundation, nearly all news stories are copyrighted by their authors or publishers. Asking permission from each author or publisher would be impractical for most people or businesses, but that difficulty is not a defense. Keeping attribution and acknowledgments intact may make a complaint less likely; it does not make reprinting lawful.
If you reprint stories frequently or in large quantities, even with the copyright information left intact, you could also end up in trouble with your Internet service provider (ISP) or web host. Many providers have policies that can lead to disconnection if they detect certain types of activity or receive repeated complaints. For example, if you use excessive bandwidth or generate large volumes of traffic, they may decide to cut you off to protect other customers from being affected.
It's best to check your ISP's terms before reprinting stories, because providers have different policies on third-party content. These policies are normally set out in writing in the terms of service or acceptable use policy. For example, your ISP might state that repeated copyright infringement can lead to suspension or termination of service.
Millions of Internet users use their connections to download and exchange material in ways that violate intellectual property rights. This has been going on for over two decades and shows no signs of abating. The vast majority of these users never face any consequences. However, the consequences can be serious for individual users whom rights holders identify as infringing their copyright.
The first thing you should know is that whether downloading copyrighted material is itself unlawful depends on where you live. Some countries have tolerated downloading for private use, while many others treat an unauthorized download as infringement. Sharing (uploading) copyright-protected material without permission is prohibited almost everywhere.
That being said, there are three main ways people download copyright-protected material: through P2P programs, private websites, and file-sharing search engines. We'll discuss each method in more detail below.
Using P2P Programs
Peer-to-peer (P2P) programs allow users to connect to other users' computers directly instead of fetching everything from a central server. A popular example of a P2P protocol is BitTorrent. A large file is divided into pieces, and users upload the pieces they already hold while downloading the ones they are missing. When someone wants to download a file, their client locates other users (peers) who have some or all of it and connects to them. After the clients complete a handshake, they exchange pieces, and different pieces are downloaded from several peers simultaneously.
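The piece-based exchange described above can be illustrated with a minimal Python sketch. It is not a BitTorrent client: there is no networking, and the names `split_into_pieces`, `piece_hashes`, `assemble`, `peer_a`, and `peer_b` are hypothetical, as is the tiny piece size. It only shows the general idea that a file is split into fixed-size pieces, each piece is identified by a hash (BitTorrent v1 uses SHA-1), and a downloader accepts a piece from any peer only if its hash matches.

```python
import hashlib

# Tiny piece size for illustration only; real torrents use far larger pieces.
PIECE_SIZE = 16


def split_into_pieces(data: bytes, piece_size: int = PIECE_SIZE) -> list:
    """Split data into consecutive fixed-size pieces (the last may be shorter)."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


def piece_hashes(pieces: list) -> list:
    """Compute the SHA-1 digest of every piece."""
    return [hashlib.sha1(piece).digest() for piece in pieces]


def assemble(expected_hashes: list, peers: list) -> bytes:
    """Rebuild the file from simulated peers.

    Each peer is a dict mapping piece index -> piece bytes. A piece is
    accepted only if its SHA-1 digest matches the expected one.
    """
    result = []
    for index, expected in enumerate(expected_hashes):
        for peer in peers:
            piece = peer.get(index)
            if piece is not None and hashlib.sha1(piece).digest() == expected:
                result.append(piece)
                break
        else:
            raise ValueError(f"no peer supplied a valid copy of piece {index}")
    return b"".join(result)


original = b"An example payload that is split into several small pieces."
pieces = split_into_pieces(original)
hashes = piece_hashes(pieces)

# Two simulated peers, each holding only some of the pieces.
peer_a = {i: p for i, p in enumerate(pieces) if i % 2 == 0}
peer_b = {i: p for i, p in enumerate(pieces) if i % 2 == 1}
peer_b[0] = b"corrupted piece!"  # fails the hash check and is ignored

rebuilt = assemble(hashes, [peer_b, peer_a])
print(rebuilt == original)
```

Because every piece is verified independently, the downloader can combine pieces from many sources without having to trust any single one of them.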
Many websites address scraping directly in their terms of use, with a clause along these lines: the content may only be used or reproduced for personal, non-commercial purposes, and framing, scraping, data mining, extraction, or collection of the site's content in any form or by any means is strictly prohibited.
Although data scraping has not been deemed unlawful in itself, its purpose and method may be scrutinized. In the Facebook case, filed in 2020, the company sued two firms, BrandTotal and Unimania, that had taken advantage of users' access to its services through a pair of browser extensions named "UpVoice" and "Ads Feed", which were designed to collect data.
The central issue before the court was not copyright infringement. Facebook's claims rested mainly on breach of its terms of service and on unauthorized access to its systems.
The case therefore illustrates that a scraper can face legal action even where the site operator makes no copyright claim: terms of use that prohibit scraping can form the basis of a claim on their own.