tesseract and the bus system

being new to the bus system at cornell/ithaca had me thinking about a quick app (iOS or web) that could tell me what bus to take (from a nearby stop) to get to where i wanted to go. i’m sure this problem has been solved before, and i’m sure pretty thoroughly. i like re-inventing the wheel – it helps me learn.

i’m going to need…

a couple of things:

  • the gps location of the user/phone
  • the latlong locations of all the bus stops
  • the bus routes (what stops they go to – in order – and the time)

it would give me a quick introduction to

  1. gps location on web apps
  2. neo4j (i figure bus routes are a graph…)
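since i figure bus routes are a graph, here’s a rough sketch of that idea before anything touches neo4j. it’s plain javascript, and the route shape (`name` plus an ordered `stops` array) and the function name `buildRouteGraph` are made up for illustration; the real tcat data may look nothing like this.

```javascript
// hypothetical input: each route lists its stops in travel order
// e.g. { name: '10', stops: ['stop a', 'stop b', 'stop c'] }
function buildRouteGraph(routes) {
    // stop name -> array of { to, route } edges leaving that stop
    var graph = Object.create(null);
    routes.forEach(function (route) {
        route.stops.forEach(function (stop, i) {
            if (!graph[stop]) {
                graph[stop] = [];
            }
            var next = route.stops[i + 1];
            if (next !== undefined) {
                // one directed edge per consecutive pair of stops
                graph[stop].push({ to: next, route: route.name });
            }
        });
    });
    return graph;
}
```

each stop becomes a node and each "next stop on this route" becomes a directed edge, which is roughly the shape a graph database would want. times are left out here; they could hang off the edges later.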

mini-hacks

i hate using the term ‘hack’, in pretty much all contexts. hacking the gibson, can’t hack it, hack-a-thon, computer hacker, hacky sack. despite this, i realized i need to focus on smaller, more achievable goals for my hobbies, especially since my time will be limited once i start working. so i’m trying to start something i’m begrudgingly calling ‘mini-hacks’. this will be my first.

tesseract

i have the name. now i need a clear and simple goal – find the nearest bus stop by loading up a web page. we can build from there.

scraping a site

i went to the tcat bus system website and opened their google maps powered page. after some failed chrome exploring (you can’t copy a large amount of data out of the response body), i ended up at my go-to browser, Firefox (complete with firebug). i grabbed all the stops and dumped them to individual .js files. with some quick console exploring (chrome dev tools shine at this), i found the array and element structure and put the snippet below together. i now have a CSV with stop names and lat/long locations. finding which one is closest should be something quick… :)

and the code

    // label for the current region, written as the last CSV column
    var s;

    // write one stop as a CSV row: name,lat,lng,region
    function printStop(element) {
        document.write(element.name + ',');
        document.write(element.latlng.lat + ',');
        document.write(element.latlng.lng + ',');
        document.write(s + '<br>');
    }

    s = 'cornell';
    cornell.kmlOverlay.markers.forEach(printStop);
    s = 'county';
    county.kmlOverlay.markers.forEach(printStop);
    s = 'downtown';
    downtown.kmlOverlay.markers.forEach(printStop);

i created high level objects (cornell, county, downtown) from each of the response bodies.
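for the "which stop is closest" part, here’s a minimal sketch using the haversine (great-circle) distance. it assumes each CSV row has been parsed into an object shaped like `{ name, lat, lng }`; that shape and the function names are my own assumptions, not something the tcat page provides.

```javascript
// mean earth radius in kilometers
var EARTH_RADIUS_KM = 6371;

function toRadians(degrees) {
    return degrees * Math.PI / 180;
}

// great-circle distance in km between two lat/long points (haversine formula)
function haversineKm(lat1, lng1, lat2, lng2) {
    var dLat = toRadians(lat2 - lat1);
    var dLng = toRadians(lng2 - lng1);
    var a = Math.sin(dLat / 2) * Math.sin(dLat / 2) +
            Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) *
            Math.sin(dLng / 2) * Math.sin(dLng / 2);
    return 2 * EARTH_RADIUS_KM * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
}

// returns the stop closest to (lat, lng), or null if the list is empty
// stops: array of hypothetical { name, lat, lng } objects
function nearestStop(stops, lat, lng) {
    var best = null;
    var bestKm = Infinity;
    stops.forEach(function (stop) {
        var km = haversineKm(lat, lng, stop.lat, stop.lng);
        if (km < bestKm) {
            best = stop;
            bestKm = km;
        }
    });
    return best;
}
```

in a browser, the user’s position would come from `navigator.geolocation.getCurrentPosition`, whose callback receives `position.coords.latitude` and `position.coords.longitude`. a linear scan over every stop is fine at this size; nothing smarter is needed for a few hundred rows.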

at the very least, i hope this post shows me what i was doing one day, struggling with a new bus system…

gexf4j 0.2.0 ALPHA Released - Supports full GEXF 1.1 Spec

Downloads

Version 0.2 – ALPHA

NOTE: The alpha version does NOT have a full test suite.

The underlying API has changed dramatically. Gexf4j now supports the entire GEXF file format, including:

  • Data Graphs
  • Dynamics
  • Hierarchy
  • Phylogeny
  • Visualization

Also introduced in 0.2 is a chaining API, allowing for a more descriptive interaction with the API. Here’s an example:

Node gephi = gexf.getGraph().createNode("0");
gephi
    .setLabel("Gephi")
    .setStartDate(toDate("2009-03-01"))
    .getAttributeValues()
        .addValue(attUrl, "http://gephi.org")
        .addValue(attIndegree, "1");

Roadmap Changes

  • 0.2 – Write GEXF Files
  • 0.3 – Read GEXF Files
  • 0.4 – Data Validation & Integrity
  • 0.5 – Helper Functionality for Dynamic Timelines
  • 1.0 – Finalize API

Sample Code

The following code creates the same graph shown at: http://gexf.net/format/data.html

Gexf gexf = new GexfImpl();

gexf.getMetadata()
    .setLastModified(toDate("2009-03-20"))
    .setCreator("Gephi.org")
    .setDescription("A Web network");

AttributeList attrList = new AttributeListImpl(AttributeClass.NODE);
gexf.getGraph().getAttributeLists().add(attrList);

Attribute attUrl = attrList.createAttribute("0", AttributeType.STRING, "url");
Attribute attIndegree = attrList.createAttribute("1", AttributeType.FLOAT, "indegree");
Attribute attFrog = attrList.createAttribute("2", AttributeType.BOOLEAN, "frog")
    .setDefaultValue("true");

Node gephi = gexf.getGraph().createNode("0");
gephi
    .setLabel("Gephi")
    .getAttributeValues()
        .addValue(attUrl, "http://gephi.org")
        .addValue(attIndegree, "1");

Node webatlas = gexf.getGraph().createNode("1");
webatlas
    .setLabel("Webatlas")
    .getAttributeValues()
        .addValue(attUrl, "http://webatlas.fr")
        .addValue(attIndegree, "2");

Node rtgi = gexf.getGraph().createNode("2");
rtgi
    .setLabel("RTGI")
    .getAttributeValues()
        .addValue(attUrl, "http://rtgi.fr")
        .addValue(attIndegree, "1");

Node blab = gexf.getGraph().createNode("3");
blab
    .setLabel("BarabasiLab")
    .getAttributeValues()
        .addValue(attUrl, "http://barabasilab.com")
        .addValue(attIndegree, "1")
        .addValue(attFrog, "false");

gephi.connectTo("0", webatlas);
gephi.connectTo("1", rtgi);
webatlas.connectTo("2", gephi);
rtgi.connectTo("3", webatlas);
gephi.connectTo("4", blab);

gexf4j-core 0.1 released

so there are a couple of updates for gexf4j-core! i’m trying to remain active and motivated, so here goes! any advice, fixes, criticism would go a long way in the motivation department ;) the latest version of gexf4j-core is on github!

finalized roadmap

the roadmap has been finalized. after working on the writing functionality, i’ve realized i haven’t worked on LOADING a GEXF file into memory, so the next step is to be able to READ GEXF files. after that come data validation & integrity, and finally helper functions/functionality. currently gexf4j-core doesn’t validate the data you put inside of it – dates do not have to match up, data types can be slightly off, etc. these validation checks will be added in the future. helper functions will also be added, making it easier to work with dynamic graphs through the use of timelines. wish me luck!

added a license

i’ve decided on a license for the project: the Apache 2.0 license. it seems to fit my needs the best. everyone can use the code, in open and closed source projects!

beginning of documentation (wiki + javadoc)

as i work toward 0.2, i’ll be adding javadoc comments to the source code, as well as adding more information to the github wiki pages. any help is greatly appreciated!

other projects

i’m going to try to work on some other things, as well as this project. it might be too ambitious, but i need a hobby ;)

  • gexf4js – a javascript library for viewing GEXF files using the Protovis library
  • gephi contributions – contributing to the gephi project, my first open source project! :)
  • cayuga, grace and artemis – a data visualizer (protovis + mongodb), a single-node javascript map reducer, and artemis, my online tracking platform

gexf4js - Powered by Protovis


The source can be found at: http://github.com/jmcampanini/gexf4js

Usage

Insert an IFRAME into your website, appending the GEXF File to the querystring of the gexf4js.html file.
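As a rough illustration, the IFRAME markup could be generated with a small helper like the one below. The `?url=` parameter comes from the file list in this post; the helper name `gexf4jsIframe`, the width and height arguments, and the use of `encodeURIComponent` are assumptions of this sketch (whether gexf4js.html decodes the value has not been checked).

```javascript
// Builds IFRAME markup pointing gexf4js.html at a GEXF file.
// viewerPath: path to gexf4js.html (hypothetical location, adjust to your site)
// gexfUrl:    URL of the GEXF file, passed in the ?url= querystring parameter
function gexf4jsIframe(viewerPath, gexfUrl, width, height) {
    // Encode the GEXF URL so characters like '&' or '?' do not break the querystring
    var src = viewerPath + '?url=' + encodeURIComponent(gexfUrl);
    return '<iframe src="' + src + '" width="' + width +
           '" height="' + height + '"></iframe>';
}
```

For a GEXF file that sits next to the viewer, a plain file name such as `demo.gexf` passes through the encoding unchanged.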

Installing

I suggest you keep the gexf4js library in the same directory as the GEXF file.

  1. Copy all the files to a gexf4js folder
  2. Copy the GEXF file to the folder
  3. Make any modifications you wish to the gexf4js.html file (like linking jQuery)
  4. Place the IFRAME tag on your page.

Files

  • gexf4js.html – use in an IFRAME. Pass the URL of the GEXF file in the querystring (?url=).
  • jquery-1.4.2.min.js – jQuery Library
  • protovis-r3.2.js – Protovis Library
  • demo.gexf – OPTIONAL – the GEXF File to load

Known Bugs

  • Doesn’t work in Chrome.
  • Graph occasionally freaks out.
  • Firefox occasionally places all the nodes on top of each other.

Road Map

  • Works on ALL Browsers
  • Graph Properties – Size, Interaction
  • Node Properties – Labels, Colors, Mouseovers
  • Protovis Visualization – Matrix Diagram Implementation
  • Protovis Visualization – Arc Diagram Implementation

gexf4j now has data support

gexf4j now supports data graphs. here is a sample of the code necessary to create the sample data graph at gexf.net

you can find the code on github at: http://github.com/jmcampanini/gexf4j-core

it’s a pretty raw implementation, sticking closely to the object relationships. it’s lacking in ease-of-use and error-checking, which i’ll be adding as this progresses.

the goal is to provide an interface for gexf that can be used as a front-end for:

  • in-java graph creation
  • neo4j graph exporting
  • sql/nosql graphing
  • file structure graphing
  • site-mapping
  • facebook graphing (eventually)

any and all comments are very welcome. this is the first real project i’ve coded in public, so any advice regarding the code or the use of github is greatly appreciated!

gexf4j-core goes public

after trying to find a quick way to work with gephi and dynamic graphs, i decided to start gexf4j, a java library to help me deal with creating gexf files.

to begin with, it’s nothing impressive. it basically hides the XML details from you, and provides auto-generated IDs for nodes and edges. the next steps are to add basic support for data and dynamic graphs.

the gexf4j-core project has the basic graph building blocks and writer. gexf-util will have utilities to bring in data from different data sources, ideally neo4j.

gexf4j-core on github: http://github.com/jmcampanini/gexf4j-core

here is some sample gexf4j-core code:

// create a graph and 3 nodes
Graph g = new Graph();
Node n1 = new Node();
Node n2 = new Node();
Node n3 = new Node();

// create 2 edges and add them manually
Edge e1 = new Edge(n1, n2);
Edge e2 = new Edge(n2, n3);
n1.getEdges().add(e1);
n2.getEdges().add(e2);

// connect n3 to n1
n3.connectTo(n1);

// add all 3 nodes to the graph
g.addNode(n1);
g.addNode(n2);
g.addNode(n3);

// create a graphWriter and file output stream
GraphWriter gw = new GraphWriter();
File f = new File("textout.gexf");
FileOutputStream fos = new FileOutputStream(f);

// write the file and close the stream - no XML worries!
gw.write(g, fos);
fos.close();

website sitemap using gexf4j and gephi

created this pretty picture with gexf4j-core. steps taken:

  1. write web crawler in java using apache httpclient
  2. re-write web crawler using jetty-client after discovering 302 bug with apache-httpclient
  3. save the html of ~5,000 pages to mongodb
  4. write gexf4j
  5. dump out the website data using directory structure rather than link structure
  6. play with it in gephi and create this image

the code takes a URL and builds out a graph, linking files and directories to their parents, giving this pretty site map.
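the directory-structure idea can be sketched in a few lines. this is not the java code that produced the image, just a small javascript illustration; the function name `sitemapEdges` and the `[parent, child]` edge format are made up for the example.

```javascript
// turns a list of page URLs into [parent, child] edges,
// linking every file and directory to its parent directory
function sitemapEdges(urls) {
    var seen = Object.create(null); // edges already emitted
    var edges = [];
    urls.forEach(function (u) {
        // '/docs/a.html' -> ['docs', 'a.html']
        var parts = new URL(u).pathname.split('/').filter(function (p) {
            return p !== '';
        });
        var parent = '/';
        parts.forEach(function (part) {
            var child = (parent === '/' ? '' : parent) + '/' + part;
            var key = parent + ' -> ' + child;
            if (!seen[key]) {
                seen[key] = true;
                edges.push([parent, child]);
            }
            // walk one level deeper for the next path segment
            parent = child;
        });
    });
    return edges;
}
```

each distinct path becomes a node, so pages sharing a directory hang off the same parent. that is what gives the picture its tree-like shape, regardless of how the pages link to each other.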

gexf4j-core on github: http://github.com/jmcampanini/gexf4j-core