leifs wrote:In the stats from my web host I see (for the first time) that Google is systematically collecting tiles from my Panotour panos.
Are they decoding the XML files? Can they possibly put the tiles back together?
Do I have to protect the panos from Google?
Anybody?
leifs
HansKeesom wrote:They collect them because they found a link to them. That's web crawling, which most search engines do. Nothing out of the ordinary.
If you want to stop it, you can place a robots.txt file in the top directory of your web server.
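As a rough sketch of that suggestion, the snippet below holds a possible robots.txt and checks it with Python's standard urllib.robotparser. The assumptions here are not from the thread: that the tiles sit under /froystadtua_sphere_64/ at the site root, that the crawler identifies itself as Googlebot-Image, and that example.com stands in for the real site. It only affects crawlers that choose to honor robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep the image crawler out of the tile
# directory, leave everything else open to all crawlers.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /froystadtua_sphere_64/

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# example.com is a placeholder host; the tile path is an assumed location.
tile = "http://example.com/froystadtua_sphere_64/0/0/1_1.jpg"
page = "http://example.com/index.html"

print("image bot may fetch tile:", parser.can_fetch("Googlebot-Image", tile))
print("image bot may fetch page:", parser.can_fetch("Googlebot-Image", page))
print("web bot may fetch tile:  ", parser.can_fetch("Googlebot", tile))
```

The text inside ROBOTS_TXT is what would go into the robots.txt file itself; the rest is only a local check that the rules behave as intended.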
leifs wrote:There are no direct links to the tiles. They are JPGs, OK, but their names are generated from the XML files by the krpano viewer.
A robots.txt is for denying access to a directory. I don't mind the bots indexing my thumbnails etc., but I have not seen them systematically grab the tiles before.
For now, while I think it over, I've denied the two IP addresses (Google image bots) access to my site using .htaccess. Google and the others can index my site as before.
As seen below, there are quite a few robots visiting the site. This is for December until now.
leifs
HansKeesom wrote:So isn't the conclusion then that they are ignoring your robots.txt and getting the names from the XML files which are referred to in the HTML files?
leifs wrote:They are grabbing the XML files. But there are no explicit filenames to grab there; the filenames are generated from the XML. This is how the names of the thousands of JPGs are represented in virtualtour.XML:
<level tiledimagewidth="898" tiledimageheight="898">
<front url="froystadtua_sphere_64/0/0/%v_%u.jpg"/>
<right url="froystadtua_sphere_64/1/0/%v_%u.jpg"/>
<back url="froystadtua_sphere_64/2/0/%v_%u.jpg"/>
<left url="froystadtua_sphere_64/3/0/%v_%u.jpg"/>
<up url="froystadtua_sphere_64/4/0/%v_%u.jpg"/>
<down url="froystadtua_sphere_64/5/0/%v_%u.jpg"/>
</level>
%v and %u are counters which increase from zero to a maximum integer that is found in, or calculated from, somewhere else in the XML files.
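To illustrate how little is needed to turn those templates into real filenames, here is a minimal sketch that parses the level above and expands %v and %u. It is not krpano's actual code. The tile size of 512 pixels and the starting index of 1 are assumptions (neither is given in the snippet above), so both are parameters; pass base_index=0 if the counters start at zero.

```python
import math
import xml.etree.ElementTree as ET

# The <level> element quoted above, copied verbatim.
LEVEL_XML = """
<level tiledimagewidth="898" tiledimageheight="898">
<front url="froystadtua_sphere_64/0/0/%v_%u.jpg"/>
<right url="froystadtua_sphere_64/1/0/%v_%u.jpg"/>
<back url="froystadtua_sphere_64/2/0/%v_%u.jpg"/>
<left url="froystadtua_sphere_64/3/0/%v_%u.jpg"/>
<up url="froystadtua_sphere_64/4/0/%v_%u.jpg"/>
<down url="froystadtua_sphere_64/5/0/%v_%u.jpg"/>
</level>
"""


def expand_tile_urls(level_xml, tile_size=512, base_index=1):
    """Expand the %v/%u templates of one level into explicit tile paths.

    tile_size and base_index are assumed values, not taken from the XML.
    """
    level = ET.fromstring(level_xml.strip())
    # Number of tiles needed to cover the level image in each direction.
    cols = math.ceil(int(level.get("tiledimagewidth")) / tile_size)
    rows = math.ceil(int(level.get("tiledimageheight")) / tile_size)
    urls = []
    for face in level:  # front, right, back, left, up, down
        template = face.get("url")
        for v in range(base_index, base_index + rows):
            for u in range(base_index, base_index + cols):
                urls.append(template.replace("%v", str(v)).replace("%u", str(u)))
    return urls


urls = expand_tile_urls(LEVEL_XML)
print(len(urls), "tile paths, first one:", urls[0])
```

Any client that fetches the XML and knows (or guesses) the naming scheme can build the same list without ever seeing a direct link to a tile.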
To me it looks like the Google image bot has reverse engineered the way krpano makes tiles and has the ambition to download all the tiles, maybe to put them together later and so get the original image. They have cooperated with the NSA on other issues, so this is probably a piece of cake when you have that kind of resources.
leifs
HansKeesom wrote:The code you give is the code as read by your web server from disk. The web server translates it into explicit code and writes that to the client.
leifs wrote:Sure? When I look at the source of the HTML I get from the server, there is no filename.jpg in it. Where is it written on the client, so that the Google image bot can extract JPG filenames from it?
leifs
HansKeesom wrote:The robot works like a client: it interprets the (java)code itself.
What I described earlier was server-based interpretation; this is client-based.
John360 wrote:I do not think it's happening now, though.
John