I’ve got a few Facebook applications that I built while experimenting and that aren’t meant for actual use (read: they do nothing). Over the past few days I’ve noticed their canvas URLs seeing traffic, roughly one hit every 24 hours. Previously they saw no traffic at all. At first I thought this was just Facebook running some new process to check for malicious apps, which sounds like a good idea. Then I did some digging and found something surprising:
The first thing I found was that the request originated from the hostname out-sw251.tfbnw.net, which is obviously owned by Facebook. That’s not terribly interesting and supports my theory above.
Then I found these two curious bits in the request:
X-FB-USER-REMOTE-ADDR: 66.249.67.211
USER-AGENT: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
That IP address resolves to crawl-66-249-67-211.googlebot.com. That user agent is very telling and needs no introduction.
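A user agent string is trivial to fake, so the reverse DNS name is the more convincing piece of evidence. Here is a minimal Python sketch of the usual reverse-then-forward DNS check for a claimed Googlebot address. The function names and the list of accepted domain suffixes are my own choices for illustration, not anything taken from the request itself, and the DNS part needs network access to run.

```python
import socket

# Domain suffixes assumed here to identify Google's crawler hosts.
GOOGLE_CRAWLER_DOMAINS = (".googlebot.com", ".google.com")


def is_google_crawler_hostname(hostname: str) -> bool:
    """Return True if the hostname sits under one of the assumed Google domains."""
    host = hostname.rstrip(".").lower()
    return host.endswith(GOOGLE_CRAWLER_DOMAINS)


def verify_googlebot_ip(ip: str) -> bool:
    """Reverse-resolve the IP, check the domain, then forward-resolve to confirm."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)
    except OSError:
        return False  # no PTR record or lookup failure
    if not is_google_crawler_hostname(hostname):
        return False
    try:
        _, _, addresses = socket.gethostbyname_ex(hostname)
    except OSError:
        return False
    # The forward lookup must map back to the original address.
    return ip in addresses
```

Calling verify_googlebot_ip("66.249.67.211") would run the full check against live DNS; the hostname test on its own works offline.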
The request is otherwise pretty unremarkable, other than lacking the query string that a normal person would generate when hitting that canvas URL. However, fb_sig_request_method is set to GET, which suggests to me that the request is actually being made with POST despite what it claims. There’s no fb_sig_user or anything else that would suggest an actual user, which makes sense because fb_sig_logged_out_facebook is set to 1.
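If you want to spot these hits in your own canvas app, the traits described above can be combined into one check. This is only a rough sketch: the function name is hypothetical, and it assumes your framework hands you the request headers and the fb_sig_* parameters as plain dictionaries of strings.

```python
def looks_like_proxied_googlebot(headers: dict, params: dict) -> bool:
    """Heuristic match for the request pattern described in this post.

    `headers` and `params` are assumed to be plain str -> str mappings.
    """
    # Header names are case-insensitive, so normalize before looking them up.
    normalized = {name.lower(): value for name, value in headers.items()}
    user_agent = normalized.get("user-agent", "")
    forwarded_ip = normalized.get("x-fb-user-remote-addr")
    return (
        "googlebot" in user_agent.lower()
        and forwarded_ip is not None
        and params.get("fb_sig_logged_out_facebook") == "1"
        and "fb_sig_user" not in params
    )
```

A match here only says the request looks like the ones I saw. The user agent can be spoofed, so treat it as a hint for logging rather than proof.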
It appears that as of March 20, 2011, Google has started crawling Facebook apps. I have no idea what its intent or abilities are, or what its relationship with Facebook is. I can tell you that I’ve monitored these apps since at least April 2010 and this only started a few days ago.
2 replies on “Googlebot on Facebook?”
Thanks for this information, I follow Google’s Orwellian privacy intrusions and have a full swing love/hate relationship with the company. By the way I also really appreciate your @font-face posts.
Keep up the good work,
josh
[…] it is known that Google crawls Facebook's so-called "apps" and also indexes the comments embedded in websites. For some time now, more precisely since […]