Having put in the toil getting my head around the d3.js library I wanted to revisit a problem I had rendering NodeXL generated data in a scalable vector graphic based webpage. In my post The art of discovery: Looking at how UK Web Focus, OUseful.info and MASHe interconnect using Google Spreadsheets and NodeXL I said:
It would be great if NodeXL had a way of publishing graphs whilst maintaining some of this interactivity, a bit like the way I can embed basic Twitter networks using the Hirst-Hawksey Protovis (Friendviz) Google Gadget. After a tip from Tony I had a look at the D3.js library which has superseded Protovis. I had a go at changing the data source in this example by adding a custom column to the Edges sheet in NodeXL with ="{source: """&[@[Vertex 1]]&""", target: """&[@[Vertex 2]]&""", type: ""licensing""}," which generated something, but more tweaking is required
In a recent post by Tony on Visualising New York Times Article API Tag Graphs Using d3.js he highlighted that there is a NetworkX D3 helper library (networkx-d3) for the NetworkX visualisation package for Python. Not being a Python developer (yet) I thought it would be interesting to adapt the same philosophy for NodeXL. So after a bit more tweaking I get this rendering of a selection of interconnected posts:
Do it yourself
Step 1: Preparing the data in NodeXL
In your NodeXL spreadsheet:
- On the Vertices sheet insert a new column in ‘Other Columns’ and call it ‘index’. Insert the following formula into the first cell beneath the column heading (on row 3):
=ROW([@Vertex])-3
This should fill the column with sequential numbers starting at 0. In row 1 of the index column enter
=COLUMN()
and take a note of the number it calculates.
- Insert another column in ‘Other Columns’, call it something like ‘d3 data’ and insert the formula
="{""id"":"&[@index]&", ""name"": """&[@Label]&""", ""url"":"""&[@[Custom Menu Item Action]]&""", ""nodeSize"":"&ROUND([@Size],2)&"},"
where:
  - [@index] – is the index column you just created
  - [@Label] – is the node label you’ve applied
  - [@[Custom Menu Item Action]] – is designed to be filled with urls for menu actions. If you don’t have a custom menu action you should replace "&[@[Custom Menu Item Action]]&" (including the 1st set of quotes) with # eg &""", ""url"":""#"", ""nod
  - [@Size] – is the visual properties size
- Move to the Edges sheet, insert another column in ‘Other Columns’, call it ‘d3 data’ and insert the formula
="{""source"": "&VLOOKUP([@[Vertex 1]],Vertices!A:AC,29,FALSE)&", ""target"": "&VLOOKUP([@[Vertex 2]],Vertices!A:AC,29,FALSE)&"},"
Important: you need to replace AC and 29 with your own ‘index’ column letter and number, so if your index column number is 30 replace AC,29 with AD,30. Also note there are two instances of this range in the formula.
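To make it clearer what the two ‘d3 data’ formulas produce, here is a rough JavaScript sketch that mirrors them. The sample rows, the property names (vertex, label, url, size) and the automatic ‘#’ fallback for a missing url are my own assumptions for illustration; in the spreadsheet you swap in the # by hand as described above.

```javascript
// Hypothetical sample rows standing in for the NodeXL Vertices and Edges sheets.
const vertices = [
  { vertex: "postA", label: "Post A", url: "http://example.com/a", size: 2.456 },
  { vertex: "postB", label: "Post B", url: "", size: 1 },
];
const edges = [{ vertex1: "postA", vertex2: "postB" }];

// Equivalent of =ROW([@Vertex])-3: a zero-based index for each vertex row.
const indexOf = new Map(vertices.map((v, i) => [v.vertex, i]));

// Mirrors the Vertices 'd3 data' formula, including the trailing comma.
// Assumption: an empty url falls back to '#'.
function vertexFragment(v, index) {
  const size = Math.round(v.size * 100) / 100; // like ROUND([@Size],2)
  return `{"id":${index}, "name": "${v.label}", "url":"${v.url || "#"}", "nodeSize":${size}},`;
}

// Mirrors the Edges 'd3 data' formula; the Map lookup plays the role of VLOOKUP.
function edgeFragment(e) {
  return `{"source": ${indexOf.get(e.vertex1)}, "target": ${indexOf.get(e.vertex2)}},`;
}

const vertexLines = vertices.map(vertexFragment);
const edgeLines = edges.map(edgeFragment);
console.log(vertexLines.join("\n"));
console.log(edgeLines.join("\n"));
```

The point to notice is that edges refer to vertices by their zero-based position, not by name, which is why the index column has to exist before the Edges formula will work.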
Step 2: Edit the data file
So far what we’ve done is prepare the spreadsheet to dump some data. The next part is to insert this into a data file for a template html page to render. For this I’m going to show you how to do it using the code repository GitHub and the Bl.ocks.org viewer, but if you prefer you can download the project files and edit them offline before uploading them somewhere of your own.
- Visit this page and click on the ‘fork’ button (if you’re not signed up, registration is free)
- Once signed in and back on the code page click on the ‘edit’ button
- In the force.json window paste the values from the d3 data column on the Vertices sheet, overwriting the ‘paste your vertices here’ text. Remove the trailing comma from the last row of the pasted data
- Next paste the d3 data from your edges sheet where it says ‘paste your edges data here’ and again remove the last comma from the pasted data (here’s an example of what your file should look like)
- Scroll down and click ‘Save Gist’
- At this point you can download your project files to upload them somewhere else or if you want to see if it works go to http://bl.ocks.org/{adding your gist number} (for example http://bl.ocks.org/1300700 is a live version of http://gist.github.com/1300700)
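If you want to check your pasted data before saving, something like the sketch below would do it. It strips the final comma from each pasted block, wraps the two blocks up and confirms every edge points at a real node. I'm assuming here that force.json has the shape {"nodes": [...], "links": [...]}, which is the usual convention in d3 force layout examples; check the template file for the actual key names. The function name assembleForceJson is made up for this example.

```javascript
// Assumption: force.json looks like {"nodes": [...], "links": [...]}.
function assembleForceJson(vertexText, edgeText) {
  // Drop the final trailing comma, as in the manual steps.
  const stripLastComma = (text) => text.trim().replace(/,\s*$/, "");
  const json =
    `{"nodes":[${stripLastComma(vertexText)}],` +
    `"links":[${stripLastComma(edgeText)}]}`;
  const graph = JSON.parse(json); // throws if the pasted text is not valid JSON

  // Every edge must point at an existing zero-based node index.
  graph.links.forEach((link, i) => {
    for (const end of ["source", "target"]) {
      const idx = link[end];
      if (!Number.isInteger(idx) || idx < 0 || idx >= graph.nodes.length) {
        throw new Error(`Edge ${i} has an invalid ${end}: ${idx}`);
      }
    }
  });
  return graph;
}

// Hypothetical pasted values from the two 'd3 data' columns.
const pastedVertices = `{"id":0, "name": "Post A", "url":"#", "nodeSize":2.46},
{"id":1, "name": "Post B", "url":"#", "nodeSize":1},`;
const pastedEdges = `{"source": 0, "target": 1},`;

const checkedGraph = assembleForceJson(pastedVertices, pastedEdges);
console.log(checkedGraph.nodes.length, checkedGraph.links.length);
```

A leftover trailing comma or an edge pointing at a missing index are the two slips most likely to leave you with a blank page, so both are caught before the data reaches the browser.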
And that should be you. Because the d3.js library is rendering the data live in your browser there’s a limit to the number of nodes/edges you can render (I reckon 150 nodes, 2000 edges is safe; more than that might be a problem). You can now go off, be merry and prosper, but if you want, stick around to find out how this process could be streamlined and to see some d3.js tricks I picked up.
Building blocks for a macro
You’re still here, yeah! The process of creating the force.json data file is a bit cumbersome but could be streamlined using a macro. I’m not that familiar with Visual Basic so won’t be doing this myself just yet, but here is the pseudocode and code snippets I’ve found if I were to do it.
- Prompt user for destination filename/location
- Read Edges Vertex 1 and 2 and Vertices Vertex, Label, Size, and Custom Menu Action columns into edges and nodes arrays
- For all nodes write data to file
- For all edges write source and target ids
- Package file(s) for distribution
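This isn't the Visual Basic macro itself, but the same pseudocode can be sketched in JavaScript (Node) to show the moving parts. The row property names, the function names buildGraph and writeForceJson, and the "nodes"/"links" keys are all assumptions for illustration. The filename is passed in rather than prompted for, and the packaging step is left out.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Turn arrays of sheet rows into a graph object (assumed "nodes"/"links" keys).
function buildGraph(vertexRows, edgeRows) {
  // Zero-based position of each vertex, used as its id.
  const indexByVertex = new Map(vertexRows.map((row, i) => [row.vertex, i]));

  const nodes = vertexRows.map((row, i) => ({
    id: i,
    name: row.label,
    url: row.url || "#", // assumption: '#' when there is no menu action
    nodeSize: Math.round(row.size * 100) / 100,
  }));

  const links = edgeRows.map((row) => {
    const source = indexByVertex.get(row.vertex1);
    const target = indexByVertex.get(row.vertex2);
    if (source === undefined || target === undefined) {
      throw new Error(`Unknown vertex in edge ${row.vertex1} -> ${row.vertex2}`);
    }
    return { source, target };
  });

  return { nodes, links };
}

// Write the graph to the chosen destination file.
function writeForceJson(filename, vertexRows, edgeRows) {
  const graph = buildGraph(vertexRows, edgeRows);
  fs.writeFileSync(filename, JSON.stringify(graph, null, 2));
  return graph;
}

// Hypothetical usage with two vertices and one edge.
const outFile = path.join(os.tmpdir(), "force.json");
writeForceJson(
  outFile,
  [
    { vertex: "postA", label: "Post A", url: "", size: 2.456 },
    { vertex: "postB", label: "Post B", url: "http://example.com/b", size: 1 },
  ],
  [{ vertex1: "postA", vertex2: "postB" }]
);
```

Building objects and serialising them in one go avoids the trailing comma clean-up that the copy and paste route needs.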
d3.js tricks
The index.html file was based on the mobile patent suits example with a couple of additions:
- node size – [line 90] pulls the nodeSize attribute from the json data for a node and in this case multiplies it by 3 to get a radius (attr("r"))
- marker position – [line 73] because we have a variable node size the marker end position needs to be dynamically adjusted. This is done by pulling the nodeSize again from the data and adjusting the attr("refX") (it’s the radius + the marker height + a bit more for line width)
- marker duplication/attachment to path – [lines 71 and 85] the thing I’m getting my head around is that d3 is basically an interface for pure SVG, so it’s not enough to just know javascript, you need to know how svg markup works. If I had nodes all of the same size I could create a single SVG marker and append it to every path as a marker-end by using its marker id url(#markerName). As the refX varies I need to make a unique marker for each edge, then attach it using its id. This is what lines 71 and 85 do: create an id then attach that marker to a path. There is also a whole markup language for marker and path shapes. Here’s where I started learning about markers.
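The arithmetic behind those three tricks can be sketched without d3 at all. The multiplier of 3 comes from the description above; the marker height, the line width allowance and the marker-N id scheme are placeholder assumptions, not the values used in index.html.

```javascript
const RADIUS_SCALE = 3;       // from the post: nodeSize multiplied by 3 gives the radius
const MARKER_HEIGHT = 6;      // assumption: placeholder marker height
const STROKE_ALLOWANCE = 1.5; // assumption: the "bit more" for line width

// Radius of a node circle, as used for attr("r").
function nodeRadius(nodeSize) {
  return nodeSize * RADIUS_SCALE;
}

// refX for an edge's marker: radius + marker height + a bit more for line width.
function markerRefX(targetNodeSize) {
  return nodeRadius(targetNodeSize) + MARKER_HEIGHT + STROKE_ALLOWANCE;
}

// Hypothetical id scheme: one unique marker per edge.
function markerId(edgeIndex) {
  return `marker-${edgeIndex}`;
}

// One marker spec per edge, read from the raw data where source and target
// are still zero-based indices into the nodes array.
function markerSpecs(graph) {
  return graph.links.map((link, i) => ({
    id: markerId(i),
    refX: markerRefX(graph.nodes[link.target].nodeSize),
    markerEnd: `url(#${markerId(i)})`, // value for the path's marker-end attribute
  }));
}

const sampleGraph = {
  nodes: [{ nodeSize: 1 }, { nodeSize: 2 }],
  links: [{ source: 0, target: 1 }],
};
const specs = markerSpecs(sampleGraph);
console.log(specs);
```

In the page itself each spec would become an SVG marker element carrying that id and refX, and the matching path would get the markerEnd value as its marker-end attribute.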
And if you are still reading this, thank you for sticking with it ;). What do you think: is displaying force diagrams from NodeXL with d3.js practical? The limit to the number of nodes and edges is very restrictive. Perhaps it would be better if NodeXL just generated some static SVG markup for users to embed on websites; after all, all the major browsers now support this format.