Visualization with GraphViz

Lately, I've been tinkering with GraphViz for visualization. Working my way from simple flowcharts and diagrams to relatively involved graphs generated programmatically, I've spent several hours with the documentation. Except for a few features that are missing as of now, I like it. Here is a short introduction to GraphViz.

If you want to try out some of these examples, you can install GraphViz or use an online editor such as magjac or dreampuf.

GraphViz generates images in various formats, such as SVG or PNG, from .dot or .gv files written in the DOT language.

Let us jump into the examples.

Example 1:

Let's do a simple binary tree.

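// A directed graph named G.
digraph G {
  // Default attributes applied to every node in this graph.
  node [shape=box];

  // Edges, written as source -> destination.
  root  -> left;
  root  -> right;
  left  -> left_left;
  left  -> left_right;
  right -> right_left;
  right -> right_right;
}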

Every GraphViz diagram specification contains either a digraph (directed edges) or a graph (undirected edges).

Here I've specified a digraph named G. First, I specify the shape of each node in the digraph; box and rectangle refer to the same shape. Once this is set, every node in the graph has this shape unless specified otherwise.

Following this are the edge specifications, each giving the source and destination of an edge. That's all.
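
The difference between the two graph types is easy to see when the DOT text is produced from code. The following is a minimal Python sketch, not part of the original example; the helper name to_dot is hypothetical. It relies only on the DOT convention that a digraph joins nodes with -> while an undirected graph joins them with --.

```python
def to_dot(edges, directed=True, name="G"):
    """Render (source, target) pairs as DOT text.

    Node names are assumed to be simple identifiers that need no quoting.
    """
    keyword = "digraph" if directed else "graph"
    connector = "->" if directed else "--"
    lines = [f"{keyword} {name} {{"]
    for source, target in edges:
        lines.append(f"  {source} {connector} {target};")
    lines.append("}")
    return "\n".join(lines)


edges = [("root", "left"), ("root", "right")]
print(to_dot(edges, directed=True))   # directed version, uses ->
print(to_dot(edges, directed=False))  # undirected version, uses --
```

Either printed result can be saved to a file and rendered in the same way as the hand-written examples.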

Save the digraph from Example 1 in a file named example01.dot and run:

dot -Tpng example01.dot > example01.png


[Image: the binary tree rendered from example01.dot]
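
The same rendering step can be driven from a script. The following is a minimal Python sketch that assumes the dot executable is on the PATH; the function names render_command and render are hypothetical. It uses dot's -o option to name the output file instead of shell redirection.

```python
import shutil
import subprocess


def render_command(dot_path, out_path, fmt="png"):
    """Build the argument list for: dot -T<fmt> <dot_path> -o <out_path>."""
    return ["dot", f"-T{fmt}", dot_path, "-o", out_path]


def render(dot_path, out_path, fmt="png"):
    """Run dot if it is installed; return False when it cannot be found."""
    if shutil.which("dot") is None:
        return False  # GraphViz is not installed or not on the PATH
    # check=True raises CalledProcessError if dot reports a failure.
    subprocess.run(render_command(dot_path, out_path, fmt), check=True)
    return True
```

Calling render("example01.dot", "example01.png") should then have the same effect as the dot command for Example 1.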

Example 2:

Let us move on to something slightly more complex. This tree is not binary anymore. Let's add some labels, colours and fonts.

digraph G {
  // Graph-wide setting: draw edges as splines (curves).
  graph [splines=true];
  // Defaults for every node; individual nodes may override them.
  node [shape=rectangle,fontname="Arial, sans-serif"];

  // Nodes: \n in a label starts a new line; shape=ellipse overrides the default.
  root [label="Root"];
  left [label="Left\nChild",color="#0000ff"];
  right [label="Right\nChild",color="#ff0000",shape=ellipse];
  left_left [label="Left\nGrand Child",color="#0000ff"];
  right_left [label="Right\nGrand Child",color="#ff0000",shape=ellipse];
  left_right [label="Left\nGrand Child",color="#0000ff"];
  right_right [label="Right\nGrand Child",color="#ff0000",shape=ellipse];
  right_right_right [label="Right\nGreat Grand Child",color="#ff0000",shape=ellipse];

  // Edges: minlen sets the minimum length of an edge, measured in ranks.
  root        -> left [minlen=3];
  root        -> right;
  left        -> left_left [color="#0000ff",minlen=2];
  left        -> left_right [color="#0000ff"];
  right       -> right_left [color="#ff0000"];
  right       -> right_right [color="#ff0000"];
  right       -> right_right_right [color="#ff0000"];
  root        -> right_right;
  root        -> right_left;
  root        -> right_right_right;
  right_right -> right_right_right [color="#ff0000"];
}

[Image: A more complex tree]

Here are the new elements in this graph:

  1. Shape specification that overrides inherited shape.
  2. Labels for each node.
  3. Color for nodes and edges.
  4. Minimum length (minlen) specification for edges.
  5. Font specification.
  6. Splines enabled (splines=true), which produces the beautiful curved edges.
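
When a graph has many nodes, attribute lists like these are tedious to type by hand. As a minimal sketch, assuming attribute values are plain strings or numbers, the hypothetical Python helpers below (attrs, node and edge are illustrative names, not part of GraphViz) format them as DOT statements.

```python
def attrs(**kwargs):
    """Format keyword arguments as a DOT attribute list such as [color="#ff0000"]."""
    if not kwargs:
        return ""
    parts = []
    for key, value in kwargs.items():
        # Escape double quotes so the value stays inside one quoted string.
        escaped = str(value).replace('"', '\\"')
        parts.append(f'{key}="{escaped}"')
    return " [" + ",".join(parts) + "]"


def node(name, **kwargs):
    """Return a DOT node statement with optional attributes."""
    return f"  {name}{attrs(**kwargs)};"


def edge(source, target, **kwargs):
    """Return a directed DOT edge statement with optional attributes."""
    return f"  {source} -> {target}{attrs(**kwargs)};"


# "\\n" keeps a literal backslash-n in the output, which DOT reads as a line break.
print(node("left", label="Left\\nChild", color="#0000ff"))
print(edge("root", "left", minlen=3))
```

Every value is quoted here, including numbers, which keeps the helper simple; DOT accepts quoted strings as attribute values.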


Now, why should we write the graph as text and generate the visualization? Why not draw it in a WYSIWYG editor?

Example 3:

When the graph is not static but is based on a model that keeps changing, it can be generated programmatically instead of repeatedly editing the graph representation by hand. To illustrate the point, let us consider the problem statement of Advent of Code 2020, Day 7, Part A.

TL;DR: Certain colored bags can be placed within other colored bags, and we want to know how many different colored bags could eventually hold a shiny gold bag.

I've modified my solution to this problem slightly so that it outputs the graph in dot file format.
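
The actual script is not reproduced in this article. As a rough illustration only, a minimal Python sketch of such a converter might look like the following. The function names parse_rules and to_dot, the choice of drawing edges from the containing bag to the contained bag, and the use of the count as an edge label are all assumptions, not the author's implementation.

```python
import re

# Matches items such as "2 muted yellow bags" or "1 bright white bag".
ITEM = re.compile(r"(\d+) (.+?) bags?")


def parse_rules(lines):
    """Map each outer bag colour to a list of (count, inner colour) pairs."""
    rules = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        outer, _, contents = line.partition(" bags contain ")
        # "no other bags." contains no digits, so it yields an empty list.
        rules[outer] = [(int(count), inner) for count, inner in ITEM.findall(contents)]
    return rules


def to_dot(rules):
    """Emit a digraph with one edge per 'outer contains inner' rule."""
    lines = ["digraph bags {"]
    for outer, contents in rules.items():
        lines.append(f'  "{outer}";')  # quoted because colour names contain spaces
        for count, inner in contents:
            lines.append(f'  "{outer}" -> "{inner}" [label="{count}"];')
    lines.append("}")
    return "\n".join(lines)


# A few rules in the puzzle's format, used here only to demonstrate the output.
RULES = [
    "bright white bags contain 1 shiny gold bag.",
    "muted yellow bags contain 2 shiny gold bags, 9 faded blue bags.",
    "faded blue bags contain no other bags.",
]
print(to_dot(parse_rules(RULES)))
```

To turn this into a command-line filter, parse_rules could be given sys.stdin instead of the RULES list.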

Here is the sample input specified in the problem statement:

light red bags contain 1 bright white bag, 2 muted yellow bags.
dark orange bags contain 3 bright white bags, 4 muted yellow bags.
bright white bags contain 1 shiny gold bag.
muted yellow bags contain 2 shiny gold bags, 9 faded blue bags.
shiny gold bags contain 1 dark olive bag, 2 vibrant plum bags.
dark olive bags contain 3 faded blue bags, 4 dotted black bags.
vibrant plum bags contain 5 faded blue bags, 6 dotted black bags.
faded blue bags contain no other bags.
dotted black bags contain no other bags.
        

Run

python bags.py < sample1.input > bags1.dot
dot -Tpng bags1.dot > bags1.png        

and open bags1.png to see a graph like this.

[Image: the bag graph rendered from bags1.dot]

The original problem statement asks for the number of distinct colored bags that will eventually contain a shiny gold bag. From this graph, it is clear that there are 4 such colored bags.
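
The count can also be checked in code by walking the graph backwards from shiny gold. The following is a minimal Python sketch using a breadth-first search over "is contained by" links; the function name containers_of is hypothetical and this is not the author's solution.

```python
import re
from collections import defaultdict, deque

SAMPLE = """\
light red bags contain 1 bright white bag, 2 muted yellow bags.
dark orange bags contain 3 bright white bags, 4 muted yellow bags.
bright white bags contain 1 shiny gold bag.
muted yellow bags contain 2 shiny gold bags, 9 faded blue bags.
shiny gold bags contain 1 dark olive bag, 2 vibrant plum bags.
dark olive bags contain 3 faded blue bags, 4 dotted black bags.
vibrant plum bags contain 5 faded blue bags, 6 dotted black bags.
faded blue bags contain no other bags.
dotted black bags contain no other bags.
"""


def containers_of(rules_text, target="shiny gold"):
    """Return every bag colour that can directly or indirectly hold the target."""
    # parents[inner] is the set of colours that directly contain inner.
    parents = defaultdict(set)
    for line in rules_text.strip().splitlines():
        outer, _, contents = line.partition(" bags contain ")
        for inner in re.findall(r"\d+ (.+?) bags?", contents):
            parents[inner].add(outer)

    seen = set()
    queue = deque([target])
    while queue:
        colour = queue.popleft()
        for parent in parents[colour]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


print(len(containers_of(SAMPLE)))  # prints 4
```

The four colours found are bright white, muted yellow, dark orange and light red, which matches what the graph shows.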

Another sample input is provided in the GitHub repo below. Try generating more graphs, either programmatically or by manual editing, and have fun!

Source Code
