The difference is that Google didn't agree not to scrape your data. You, as per their TOS, agreed not to scrape theirs as a condition of using their service.
To see the terms that Google thinks you have agreed to, click 'Terms' at the bottom of www.google.com.
If that doesn't hold up in court, in future on your first visit to Google it will simply display some text and require that you click 'I agree' to continue.
Either way, it seems reasonable to me that you should agree to their terms in order to use their service.
So if instead I scrape their site (like they are scraping others) I don't have any opportunity to agree to their terms? Much like their scrapers on other sites?
I'm honestly wondering about the double standard. There is a rational way to discuss morality/ethics and the resulting laws for most technical matters, one that often mirrors real-world (read: offline/analog) scenarios. It's unfortunate that the legal system has instead been appropriated by lawyers.
>> It's unfortunate that the legal system has instead been appropriated by lawyers.
omg, really?
It's unfortunate that the internet has instead been appropriated by hackers.
It's unfortunate that the stock market has instead been appropriated by traders.
It's unfortunate that the asylum has instead been appropriated by inmates.
To some extent, yes. When people spend enough time in their given field to know the ins and outs, the less scrupulous ones tend to bend the rules more and more. While not _strictly_ against the rules, this often ends up going against the spirit at the base of the industry.
Very few traders went to jail after 2008. What they did was seemingly legal (or at least not illegal). Should they have gone to jail? Most bright, talented lawyers are likely working (again, within the law) to get megacorps or rich people off for things poorer people would not get away with. In our field, the topic of this post is one of those issues. What information is free and what information is not? Am I allowed to do online the things I'm allowed to do offline?
I'm not proposing a solution, but any system populated by humans will be abused by some, and fought for by some idealists, all within that system's rules.
Let's take murder:
I stab someone: murder.
I use a broom to push a flower pot off a balcony hitting someone in the head, killing them: murder.
I swat a butterfly in Beijing, causing a chain of events that ends with a container crushing a dock worker in Rotterdam. Murder? If this extreme example comes down to intent, it's thought crime; otherwise I'm playing within the rules of the system, and I just killed someone scot-free.
There apparently were no laws prohibiting the upsale of bad mortgages, and banks had the resources to move the market towards more and worse mortgages. That was also within the system's rules, but I personally think it's far beyond the intended use of that market, and well outside the spirit of the laws.
There's a huge difference between judicial justice and what most would agree is "justice". That's where my first comment came in. The same is true of most systems.
Yeah, really. Law is not an end in itself; it's meant to serve a purpose. When people whose jobs are instrumental to the goal start deciding what the goal is, bad things happen. The same goes for MBAs and businesses.
There's no double standard. In the case of crawling and scraping their site, the terms are available in the robots.txt file. And Google abides by the robots.txt terms of other websites.
I'm not sure why you dislike this 'appropriated by lawyers' outcome: For web crawling look at robots.txt, for other uses look at the Terms link on the homepage. If you don't agree to the terms then stop accessing the website. Seems straightforward and fair to me.
Yeah, you're right in response to my comment. It was a bad example. But while google.com (for example) has a robots.txt, you could argue that it's neither exactly fair nor inviting of disruption: for example, it whitelists Twitter and Facebook for images (and so blacklists everyone else). While I won't cry too much foul, I get the feeling that Google entered the stage when the internet was quite a bit more wild west (for good and bad), and then the internet changed, partly because of them and partly because of other actors. For at least some markets I believe it's almost impossible to get a footing now as a new actor, as they're only open to (what are basically) cartels. Email is another one, as you can be locked out of communicating with gmail.com or outlook.com with basically no recourse if you run your own email server.
The TOS that Google follows is published in the robots.txt file. If you don't want Google to scrape your site, then that's all you need. There's no double standard.
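For what it's worth, here is a minimal sketch of how a well-behaved crawler might check those rules, using Python's standard `urllib.robotparser`. The robots.txt contents, the bot name `SomeOtherBot`, and the example.com URLs are all made up for illustration; real sites will have different rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary site (not copied from any real one):
# it shuts out Googlebot entirely and keeps everyone else out of /private/.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow: /private/
"""


def make_parser(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text, without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = make_parser(ROBOTS_TXT)

# Ask whether a given user agent may fetch a given URL under these rules.
googlebot_allowed = parser.can_fetch("Googlebot", "https://example.com/index.html")
other_allowed = parser.can_fetch("SomeOtherBot", "https://example.com/index.html")
other_private = parser.can_fetch("SomeOtherBot", "https://example.com/private/data.html")

print(googlebot_allowed, other_allowed, other_private)
```

Note that this only tells a crawler what the site owner asks for. Whether a crawler honors it is up to the crawler.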
I'm sure that's true for your average WordPress publisher, but the big guys will either slap you with a lawsuit or take other measures to make you stop crawling their site.
Scraping and crawling are the same thing btw. I absolutely love how the English language has several words for the same thing. Your language is very expressive.
Google is a scraper. Your data will end up in their index. You are perfectly OK with Google "stealing" your data.
A new player crawling your site is an offence to you. How dare someone other than Google or Bing put pressure on my site? How dare they steal my data?
TOS is a joke.
I wonder, what did the founding fathers of the internet intend for it? Was it not to make data publicly available?
> If that doesn't hold up in court, in future on your first visit to Google it will simply display some text and require that you click 'I agree' to continue.
This statement is demonstrably false, as shown by all the places in the world where this type of TOS-nonsense actually does not hold up in court.
And in the USA, it's (as usual) even slightly more absurd: The only reason it does hold up in court is because Google can afford justice.
Are they A/B testing this, or is acceptance IP-based? I reinstalled recently and I didn't see it. Firefox in private browsing mode also lets me use it without forcing me to agree to anything.
Lucky you. I get their stupid modal overlay more often than I'm happy with. On top of that it now usually defaults to Dutch and Dutch results even when I don't want this. Highly annoying.