
Author Topic: Problem accessing fc webpages  (Read 17177 times)


Offline Dockwagon

  • BeBot Rookie
  • *
  • Posts: 10
  • Karma: +0/-0
Problem accessing fc webpages
« on: July 31, 2007, 08:27:50 pm »
Been struggling a bit with whois_update, and I am not sure what to do anymore

On the server running BeBot:

user@machine:~/bot$ wget http://www.anarchy-online.com/org/stats/d/1/name/7745538/basicstats.xml
--20:01:16--  http://www.anarchy-online.com/org/stats/d/1/name/7745538/basicstats.xml
           => `basicstats.xml'
Resolving www.anarchy-online.com... 216.74.158.92
Connecting to www.anarchy-online.com|216.74.158.92|:80... failed: Connection refused.
user@machine:~/bot$ wget www.vg.no
--20:11:37--  http://www.vg.no/
           => `index.html'
Resolving www.vg.no... 193.69.165.21
Connecting to www.vg.no|193.69.165.21|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 181,306 (177K) [text/html]

100%[====================>] 181,306      121.77K/s

20:11:40 (121.43 KB/s) - `index.html' saved [181306/181306]

On another server that I have access to, but can't use:

anothermachine:~$ wget http://www.anarchy-online.com/org/stats/d/1/name/7745538/basicstats.xml
--20:01:50--  http://www.anarchy-online.com/org/stats/d/1/name/7745538/basicstats.xml
           => `basicstats.xml'
Connecting to www.anarchy-online.com:0... connected!
HTTP request sent, awaiting response... 200 OK
Length: 78,092 [text/xml]

    0K .......... .......... .......... .......... .......... 65% @  84.18 KB/s
   50K .......... .......... ......                          100% @ 175.08 KB/s

20:01:51 (102.50 KB/s) - `basicstats.xml' saved [78092/78092]

I've been trying to figure out how to use a proxy server in PHP and BeBot, without much success. Any suggestions?
FC suggests that I clear the cookies in my web browser, which isn't very helpful.
« Last Edit: July 31, 2007, 08:46:52 pm by Dockwagon »

Offline Alreadythere

  • BeBot Maintainer
  • BeBot Hero
  • ******
  • Posts: 1288
  • Karma: +0/-0
Re: Problem accessing fc webpages
« Reply #1 on: July 31, 2007, 08:41:46 pm »
It seems like someone at FC is blocking the IPs of bot users, most likely due to the potentially high number of HTTP queries that whois lookups can produce each day.

Same for me, I can't access www.anarchy-online.com at all, regardless of the way I try to connect - firefox, wget, lynx, telnet, whatever. Always ends in a connection refused right away, even though those clients aren't blocked by any firewall on my side.

Offline Dockwagon

  • BeBot Rookie
  • *
  • Posts: 10
  • Karma: +0/-0
Re: Problem accessing fc webpages
« Reply #2 on: July 31, 2007, 08:45:50 pm »
Hm, that is not good.

Whole org takes a hit when they do stunts like this.

But is there any way to get in touch with FC and set up some servers that mirror the user database?

Then they would only have 1-10 bots crawling their site, and the bandwidth/CPU used would be paid for by the users willing to mirror. I can mirror, no problem.

Offline Dockwagon

  • BeBot Rookie
  • *
  • Posts: 10
  • Karma: +0/-0
Re: Problem accessing fc webpages
« Reply #3 on: July 31, 2007, 08:51:24 pm »
Hm, does PHP have something like what Python calls cPickle?

Crawling the DB and saving the data needed as a gzipped cPickle dump would be easier on bandwidth.

« Last Edit: July 31, 2007, 08:56:57 pm by Dockwagon »

Offline Alreadythere

  • BeBot Maintainer
  • BeBot Hero
  • ******
  • Posts: 1288
  • Karma: +0/-0
Re: Problem accessing fc webpages
« Reply #4 on: July 31, 2007, 08:57:35 pm »
No clue how cPickle relates to the problem - I haven't worked with python yet, but serialization doesn't seem to be our problem, and that seems to be the only thing cPickle offers.

Any program mirroring the FC site doesn't have to be written in PHP though; any language works. The only reasons I use PHP for the whois update script are the easy way it accesses DBs, the fact that I've got some experience with PHP, and that BeBot is written in PHP.

Offline Dockwagon

  • BeBot Rookie
  • *
  • Posts: 10
  • Karma: +0/-0
Re: Problem accessing fc webpages
« Reply #5 on: July 31, 2007, 09:13:12 pm »
Can you help me make fsockopen use a proxy then? My PHP-fu is weak :/
I've tried the example from http://www.ziguras.com/php/using-fsockopen-to-connect-to-remote-servers but can't get it to work properly. Any help would be appreciated :-)

I tested some anonymous proxies from http://www.proxy4free.com/page1.html, and wgetting www.anarchy-online.com worked with: export http_proxy=http://url:port
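For reference, here is a minimal sketch of what talking to a plain HTTP forward proxy over a raw socket usually looks like. It is written in Python rather than PHP and is not taken from the linked page. It assumes the proxy accepts unauthenticated plain HTTP requests. The function names `build_proxy_request` and `fetch_via_proxy` are made up for illustration. The key point is that the socket is opened to the proxy, and the request line carries the full URL instead of just the path.

```python
import socket
from urllib.parse import urlsplit


def build_proxy_request(url):
    """Build a plain HTTP/1.0 GET request meant to be sent to a forward proxy."""
    parts = urlsplit(url)
    if parts.scheme != "http" or not parts.hostname:
        raise ValueError("this sketch only handles absolute http:// URLs")
    # When talking to a proxy, the request line holds the absolute URL,
    # not only the path as it would when talking to the origin server.
    lines = [
        f"GET {url} HTTP/1.0",
        f"Host: {parts.netloc}",
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_via_proxy(url, proxy_host, proxy_port, timeout=10.0):
    """Connect to the proxy (not the target host) and return the raw response bytes."""
    with socket.create_connection((proxy_host, proxy_port), timeout=timeout) as sock:
        sock.sendall(build_proxy_request(url))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # proxy closed the connection: response is complete
                break
            chunks.append(data)
    return b"".join(chunks)
```

The returned bytes still contain the status line and headers, so the caller has to split those off before parsing the XML. A proxy that requires a login would need an extra `Proxy-Authorization` header, which this sketch leaves out.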

cPickle is just a way to store/read objects. Objects can be data structures, functions etc. It's a quick and easy way to do it in Python. I just thought that a dump of the data we need for whois, extracted from the XML, could save bandwidth if we spread it via mirrors.
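As a rough illustration of the gzipped dump idea, a sketch in Python 3 follows, where the standard `pickle` module covers what cPickle did in Python 2. The record fields shown (`name`, `level`, `org`) are placeholders and not the actual whois table layout.

```python
import gzip
import pickle


def dump_records(records):
    """Serialize a list of records and gzip the result into one bytes blob."""
    return gzip.compress(pickle.dumps(records, protocol=pickle.HIGHEST_PROTOCOL))


def load_records(blob):
    """Reverse of dump_records.

    Only unpickle data from a source you trust: loading a pickle can run
    arbitrary code, so this is risky for files fetched from public mirrors.
    """
    return pickle.loads(gzip.decompress(blob))


# Placeholder records; the real field set would come from the whois XML.
sample = [
    {"name": "Examplechar", "level": 200, "org": "Example Org"},
    {"name": "Otherchar", "level": 15, "org": None},
]
blob = dump_records(sample)
restored = load_records(blob)
```

Because of the trust issue noted in the comment, a plain text or JSON dump is the safer choice when the files are meant to be shared between strangers.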

Offline Alreadythere

  • BeBot Maintainer
  • BeBot Hero
  • ******
  • Posts: 1288
  • Karma: +0/-0
Re: Problem accessing fc webpages
« Reply #6 on: July 31, 2007, 10:46:44 pm »
Never tried to get fsockopen to work through proxies. Besides, that would just shift the problem, not solve it.

Offline Alreadythere

  • BeBot Maintainer
  • BeBot Hero
  • ******
  • Posts: 1288
  • Karma: +0/-0
Re: Problem accessing fc webpages
« Reply #7 on: July 31, 2007, 11:04:36 pm »
Just tested the proxy example on the page you posted, and it worked with my private squid proxy.

Offline Dockwagon

  • BeBot Rookie
  • *
  • Posts: 10
  • Karma: +0/-0
Re: Problem accessing fc webpages
« Reply #8 on: July 31, 2007, 11:24:13 pm »
Proves that me and php == incompetence.

Did you include it in BeBot's Sources/Bot.php?

It seems straightforward, but somehow I get it wrong. If you've included it in Bot.php, is it possible to get a copy? :-)

Making a list of lots of different anonymous proxies with a random selection is something I think I can do myself.
If I get it working, I could open the table that stores character info to the world with read-only access.
« Last Edit: July 31, 2007, 11:26:40 pm by Dockwagon »

Offline Alreadythere

  • BeBot Maintainer
  • BeBot Hero
  • ******
  • Posts: 1288
  • Karma: +0/-0
Re: Problem accessing fc webpages
« Reply #9 on: July 31, 2007, 11:44:12 pm »
I tried it with my proxy (just the code snippet from the linked page). With the random proxies from your list that I tested it timed out, and I never got any result. Either those expect some login or they are seriously lagged.

If we do some central updating solution, I think a text listing, or even just a diff of the entries that changed since the day before, would be best. Then just zip or gzip it to reduce total bandwidth usage.
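The daily diff idea could look roughly like the following Python sketch. It assumes each snapshot is a mapping from character name to a record. JSON as the text format and all function names are assumptions made for illustration. Nothing like this exists in BeBot as far as this thread shows.

```python
import gzip
import json


def diff_snapshots(old, new):
    """Return entries that are new or changed, plus names that disappeared."""
    changed = {name: rec for name, rec in new.items() if old.get(name) != rec}
    removed = sorted(name for name in old if name not in new)
    return {"changed": changed, "removed": removed}


def apply_diff(old, diff):
    """Rebuild the newer snapshot from the older one and a diff."""
    merged = dict(old)
    merged.update(diff["changed"])
    for name in diff["removed"]:
        merged.pop(name, None)
    return merged


def pack_diff(diff):
    """Serialize the diff as JSON text and gzip it for distribution."""
    return gzip.compress(json.dumps(diff, sort_keys=True).encode("utf-8"))


def unpack_diff(blob):
    """Reverse of pack_diff."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


# Placeholder data: one changed entry, one removed, one added, one untouched.
yesterday = {
    "Alpha": {"level": 100, "org": "Org A"},
    "Bravo": {"level": 50, "org": None},
    "Charlie": {"level": 10, "org": "Org B"},
}
today = {
    "Alpha": {"level": 101, "org": "Org A"},
    "Charlie": {"level": 10, "org": "Org B"},
    "Delta": {"level": 1, "org": None},
}
daily = diff_snapshots(yesterday, today)
rebuilt = apply_diff(yesterday, unpack_diff(pack_diff(daily)))
```

A mirror would only need to download the packed diff each day and apply it to its local copy, with a full listing now and then as a fallback for anyone who missed a day.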

Offline Dockwagon

  • BeBot Rookie
  • *
  • Posts: 10
  • Karma: +0/-0
Re: Problem accessing fc webpages
« Reply #10 on: August 01, 2007, 12:24:09 am »
You need to use the proxies that are listed as anonymous.

If it were possible to get in touch with FC and work out a solution: tuzzy downloaded all the XML files a year or so ago, and she reports the file was ~590ish MB.
Looking at the whois table in BeBot 0.4, not much information is stored, and I can understand FC wanting to stop what could be considered a DoS attack. As I understand it, whois_cache, which is now included in BeBot, tries to cache every member of any org that someone has done a !whois on. This would grow quite a bit :-)

I haven't done much DB replication in MySQL (aka nil), but if it were possible to let MySQL handle the replication, could that be an idea? Or maybe the config/security issues are too great? A diff seems like a very simple solution too :-)

Or make a centralized !whois DB, a bit like what Vhab has done with items.

I think the key to a permanent solution would be to get in touch with FC and work out an acceptable solution with them.

Offline Wolfbiter

  • Contributor
  • *******
  • Posts: 149
  • Karma: +0/-0
    • KAZE
Re: Problem accessing fc webpages
« Reply #11 on: August 01, 2007, 04:10:05 am »
My cache was up to ~120k players at one point, and of those 25k were orgless. That made the update script take a long time to finish (and query the XML files ~26k times during that time). One way to limit the amount would be to clear the cache of orgless people now and then, since they account for most of the XML queries in a large cache.

Another thing to do when you have many bots running on the same machine would be to set $noupdate=true on every lookup, so it never fetches a new entry when the timestamp is old, only for people who don't exist in the cache yet. That way a large cache won't cause problems with a slow update script (not sure how long mine took towards the end, but I think it was around 3-4h at ~80k players) combined with the bots fetching a fresh lookup every time someone does a whois because the cache isn't up to date (which could also be solved by setting a much higher expire time).
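To make the lookup behaviour described above concrete, here is a small Python sketch of a cache lookup with a noupdate switch. It illustrates the idea only and is not BeBot's actual PHP code. The cache layout, the `fetch` callback, and the parameter names other than noupdate are assumptions.

```python
import time


def lookup(cache, name, fetch, max_age=86400, noupdate=False, now=None):
    """Return whois data for name, going to the remote site only when needed.

    cache    dict mapping name -> {"data": ..., "updated": unix timestamp}
    fetch    callable doing the real remote query (hypothetical)
    noupdate if True, a stale entry is returned as-is instead of refetched
    """
    now = time.time() if now is None else now
    entry = cache.get(name)
    if entry is not None:
        fresh = now - entry["updated"] <= max_age
        if fresh or noupdate:
            return entry["data"]
    # Either unknown, or stale with updates allowed: query once and store.
    data = fetch(name)
    cache[name] = {"data": data, "updated": now}
    return data
```

With noupdate set, the bots only ever query the remote site for characters they have never seen, and refreshing old entries is left entirely to the separate update script.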

I was never banned by FC, and it was just two weeks ago or so that I cleared my cache. So either they changed something since then, or your bots make an unrealistic number of connections to Funcom, and fixing that sounds like a better idea than trying to work around it.

A central whois database exists already... it's called www.anarchy-online.com. Any off-box/off-LAN database will be slower, which is why the local cache exists.
Too many toons.

Offline Ebag333

  • Contributor
  • *******
  • Posts: 134
  • Karma: +0/-0
Re: Problem accessing fc webpages
« Reply #12 on: August 01, 2007, 08:21:53 am »
Quote from: Wolfbiter
I was never banned by FC, and it was just two weeks ago or so that I cleared my cache. So either they changed something since then, or your bots make an unrealistic number of connections to Funcom, and fixing that sounds like a better idea than trying to work around it.

In my case I'm pretty sure it was due to a bug in the flexible security plugin causing it to spam Anarchy-Online looking for whois info on alts of toons that were invalid.

Hitting FC's servers every 2ish seconds (plus regular whois lookups) would do it for me real quick, especially considering I use multiple bots. :)

A couple of workarounds would be to use proxies, or to try to parse the HTML from Auno's or Vhab's sites. Vhab's HTML is already clean enough that it should be fairly easy to parse into an XML-ish format.

Offline Alreadythere

  • BeBot Maintainer
  • BeBot Hero
  • ******
  • Posts: 1288
  • Karma: +0/-0
Re: Problem accessing fc webpages
« Reply #13 on: August 01, 2007, 08:31:39 am »
Quote from: Wolfbiter
My cache was up to ~120k players at one point, and of those 25k were orgless. That made the update script take a long time to finish (and query the XML files ~26k times during that time). One way to limit the amount would be to clear the cache of orgless people now and then, since they account for most of the XML queries in a large cache.

I've got 42k unorged people in my cache, which created that many queries. I guess FC didn't like it anymore. They blocked me pretty suddenly on Friday last week.

Another way to lower the load is to do updates only for orged people. There's no need to delete the unorged ones; the most current version of the script simply doesn't try to update them anymore by default.
For me that will result in something like 1500 queries, and with a higher delay those won't create that much load on the FC site anymore.
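The combination of skipping unorged entries and spacing out the queries can be sketched as below. This is Python for illustration and not the actual PHP update script. The record layout with an `org` field, the `fetch` callback, and the default delay value are all assumptions.

```python
import time


def select_for_update(cache, include_orgless=False):
    """Pick the names to refresh; unorged characters are skipped by default."""
    return sorted(
        name for name, rec in cache.items()
        if include_orgless or rec.get("org")
    )


def run_update(cache, fetch, delay=2.0, include_orgless=False, sleep=time.sleep):
    """Refresh the selected entries, pausing between consecutive queries.

    fetch is a hypothetical callable that performs the remote lookup.
    sleep is injectable so the pacing can be tested without waiting.
    Returns the number of queries made.
    """
    names = select_for_update(cache, include_orgless)
    for index, name in enumerate(names):
        if index:  # no pause before the very first query
            sleep(delay)
        cache[name] = fetch(name)
    return len(names)
```

At a delay of a few seconds, the roughly 1500 queries mentioned above would be spread over an hour or more instead of arriving in a burst.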

Offline Vhab

  • Contributor
  • *******
  • Posts: 180
  • Karma: +0/-0
    • VhaBot Forum
Re: Problem accessing fc webpages
« Reply #14 on: August 01, 2007, 09:40:46 am »
All this blocking sounds a bit strange though.
I've run nightly mirrors for quite some time and I haven't been blocked yet.
Auno seems to be fine as well.
Though maybe they looked at it case by case, as it runs from the box helpbot is on.

 
