Introducing the DU Search Engine



I'm not planning on releasing my code yet (though I may in the future), but I can set it up so people can pull the raw data from my site if there is interest. That would actually be much faster than everyone trying to pull it from NQ's servers. I'll get started on that so people can have access.

Being able to pull the raw data from your site would be great. The code behind it is of no real importance; the data is all I'm really interested in. How I get it and where make no difference to me, so long as it's not three weeks old and is accurate. With the player and organization data alone, a lot of my planned program features would be much easier and faster to develop, since it would mean not having to write a program to parse HTML source for the data (which I have been working on and which is proving quite difficult).
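For context, this is roughly the kind of thing I have been attempting, just as a sketch; the URL and the CSS selector here are placeholder guesses, not the portal's real markup:

    # Rough sketch of scraping an org page for member names.
    # The URL pattern and the ".member-name" selector are placeholder guesses;
    # the real community portal markup is different and messier.
    import requests
    from bs4 import BeautifulSoup

    def fetch_org_members(org_url):
        html = requests.get(org_url, timeout=30).text
        soup = BeautifulSoup(html, "html.parser")
        return [tag.get_text(strip=True) for tag in soup.select(".member-name")]

    print(fetch_org_members("https://community.dualthegame.com/organization/example-org"))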


Yeah, trying to deal with the HTML is a pain :P These sites were not set up to be easy to read. I'll have it set up so anyone can grab the raw data in a couple of days; it'll come in .csv files, so you can start planning for it.
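To give people a rough idea of what reading them could look like (the file name and column names below are placeholders until the export is finalized):

    # Rough sketch of reading one of the planned .csv exports.
    # "players.csv" and the column names are placeholders; the real export may differ.
    import csv

    with open("players.csv", newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            print(row["player_name"], row["org_count"])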


That would be amazing. I assume it's the exported table(s)? And can we expect this to be updated along with the normal data, or will it be updated less often?


Yep, basically. And it would be updated at the same rate as the rest of the website. The process goes: NQ Servers - Library - Data Tables - Analysis - Website; you'd just get it at the Data Tables stage.


Can you explain, in simple terms, how you pull the data from NQ? I have a hard time understanding how you collect info from their servers.


I employed several methods for different types of data. For some data it's pretty easy and a basic HTML parser works. Other things are much more difficult to get, and the program has to mimic how the official website communicates with itself and then record the responses.
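Roughly, the second pattern looks like this; the endpoint path, parameter, and header below are invented for illustration and have to be replaced with whatever the browser's network inspector actually shows:

    # Sketch of mimicking the portal's own background requests.
    # The endpoint, parameter, and header are invented for illustration only.
    import requests

    session = requests.Session()
    resp = session.get(
        "https://community.dualthegame.com/api/organizations",  # hypothetical endpoint
        params={"page": 1},
        headers={"X-Requested-With": "XMLHttpRequest"},
    )
    print(resp.json())  # record the response the same way the site's own scripts consume it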


I was fascinated by the pattern the connections create. Is there any significance to the length of the connection between a person and an organization? For me at least, the lengths correspond well to how closely I feel attached to each organization.

 

Thanks very much for doing this.


I employed several methods for different types of data. For some data it's pretty easy and a basic HTML parser works. Other things are much more difficult to get, and the program has to mimic how the official website communicates with itself and then record the responses.

Aha... so for the difficult stuff you basically talk to the servers, i.e. the community portal, and when they respond your program collects and interprets the data, allowing you to create instructions to handle it.

 

Something like that? (I'm not a coder)


I was fascinated by the pattern the connections create. Is there any significance to the length of the connection between a person and an organization? For me at least, the lengths correspond well to how closely I feel attached to each organization.

 

Thanks very much for doing this.

 

There is no correlation. It is all based on how the program organizes the players and the orgs. The player's distance from the orgs has no real meaning.


I updated the site, as well as rolled out the second new feature: the forum friend map. Also, the raw data is accessible on the developer resources page. 

 

Like Begogian said, there is no direct connection between distance and any sort of attributes. Nodes are placed based on their connections to orgs and the location of those orgs. 
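For anyone curious, the general idea is a force-directed layout over the membership graph. A minimal sketch is below; networkx is used purely for illustration (not necessarily what the site runs), and the memberships are made-up sample data:

    # Illustration of a force-directed layout over a player/org membership graph.
    # networkx is for illustration only; the sample memberships are made up.
    import networkx as nx

    G = nx.Graph()
    G.add_edges_from([("PlayerA", "OrgX"), ("PlayerB", "OrgX"), ("PlayerB", "OrgY")])

    # spring_layout pulls connected nodes together and pushes others apart, so
    # positions (and distances) fall out of connectivity, not attachment strength.
    positions = nx.spring_layout(G, seed=42)
    print(positions)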


Hey, great work!

 

Just giving you an idea that I hope can help. Instead of exporting to .csv or programming lots of functions, web services, etc., you could set up a public, read-only MySQL server. It would be very easy for you to "program", taking only a few hours of coding (or a few minutes, depending on how you store your data now; if it is already in one MySQL DB you don't need to program anything at all, just configure MySQL):

a) Create a new DB, or duplicate your tables inside your existing DB (if your server security is good, duplicate the tables; if you can't fully trust it, use a new DB).

b) Create an SQL user with read-only access to only those tables, allow multiple external logins, and deny everything else (especially the permission to change its own password).

c) Use a one-line SQL update to query your main DB and refresh the cloned public tables, and set a server cron job to run it periodically (every 2 days, 15 minutes, 1 minute... it doesn't matter; it's a simple SQL statement that uses very few server resources).

d) Publish this MySQL socket to the community (give out the DB's IP and the user/password).

See, very simple and secure. Since the data will sit inside a MySQL server, it will be much easier for everyone else to create their own modules (in any language): clients connect, send single SQL queries, retrieve only the data they need, and receive it clean and easy to handle.

This method demands very few server resources, and is certainly way less demanding than a script that has to:
1- query the data from the DB,
2- put everything into arrays (held in server memory),
3- call a module to write it out as .csv,
4- hit the hard drive to store the .csv file, and
5- serve many clients connecting via HTTP (triggering a whole web-server pipeline) to download the .csv file.

If you think about this with 100 times more data than now, refreshed every 15 minutes... wow, you will need a new server!

Now, using the method I suggest, clients log in to the MySQL server, send a single-line SQL query for only the portion of data they need, and close the connection. That's it. It's a single command that MySQL processes very quickly, the data packets transferred are small, and the connection is fast and secure; no other server or software is involved. A rough sketch of a client query is below.
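This is only a sketch of the consuming side, assuming the read-only mirror described above; the host, credentials, and table/column names are placeholders, not anything that exists yet:

    # Sketch of a client pulling data from the proposed read-only MySQL mirror.
    # Host, credentials, and table/column names below are placeholders.
    import pymysql

    conn = pymysql.connect(host="db.example.com", user="public_ro",
                           password="readonly", database="du_public")
    try:
        with conn.cursor() as cur:
            # A single short query for just the slice of data the client needs.
            cur.execute("SELECT name, member_count FROM organizations")
            for name, member_count in cur.fetchall():
                print(name, member_count)
    finally:
        conn.close()

Any language with a MySQL driver could do the same with a one-line query.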


So thanks for reading all this, and I hope it can help.
Bye


Awe-inspiring work, handy and interesting data in a simple visual format. Many thanks for your efforts.

(What, 'friend map'? Did not even know there was such a forum mechanic... Ignorance is bliss?)

 

EDIT: Seeing as the colour scheme is red-green, I would imagine colourblind fellows have a really hard time reading the community map.


This is fantastic and really interesting. The visualization works really well, and you can quickly make some clear judgments about the state of the community.

 

1. There is a "Neutral Zone" at the center; people here are affiliated with Cinterfall, the Alpha Academy, and Silver Light, but also with lots of other orgs, which means they have diverse loyalties.

2. There are a whole bunch of isolated communities out on the fringes that are very insular, the most insular being Frogswarm, which makes sense for a French-language org.

3. There is a huge "belt of loners" around the Neutral Zone that consists of orgs with just one person.

4. The TU is a huge monster that looks like it's going to be a dominant force in the game.

 

It will be interesting to see how this changes; I suspect we will see a lot of convergence once the game starts, as orgs either succeed or fail their people.

 

Please, Void, as a next step can you keep a snapshot of each version you create, so we can observe how things change over time?

 

Also, could we get an option to exclude one-person orgs? I think they distort the data because people are creating orgs as a joke.
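Even without a built-in option, something like this over the raw membership data would do it; the file name and column names are guesses at what the export might contain:

    # Sketch of dropping one-person orgs from a hypothetical memberships.csv
    # ("org" and "player" columns are guesses at the raw export's layout).
    import csv
    from collections import Counter

    with open("memberships.csv", newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))

    sizes = Counter(row["org"] for row in rows)
    filtered = [row for row in rows if sizes[row["org"]] > 1]
    print(f"kept {len(filtered)} of {len(rows)} membership rows")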


Awesome work!

 

I didn't feel super comfortable with the data-collection method being closed-source, though, so I'm just going to leave this here...

 

https://github.com/malignantz/DualScraper

 

 

I think, if anything, this illustrates the need for more privacy controls with our profiles. All of this data is pretty publicly available once you start scratching at the surface.
 

 

Edit: I didn't write this tool, nor is it on my github repo. I'm just trying to illustrate the privacy issue by pointing out how easy it is for people to collect this data.

