The power of a tiny language
Both as a teacher and as a student I enjoy using dot to generate graphs. Due to it's single purpose, it can be quite an elegant language. I use it to quickly create graphs that help me explain (and understand) concepts, processes and software projects. You can find it's documentation here but I think it's easier to understand with an example.
Recently I had to create a "Mind Map" summarizing the contents of a Data Structure and Algorithms class. I'm no expert in mind map and I'm a big fan of delegating the drawing job, since I'm not quite good at it. So I decided to use dot.
Check out this simple example of dot. Here's the code:
graph g {
Analysis--{
AlgorithmEvaluation,
GrowthOfFunctions,
AsymptoticAnalysis
}
}
and here's the result:
As you can see, one needs only to connect the nodes with "--" to create the lines and one can include multiple nodes within braces to avoid the need to specify each edge.
That's great. Let's expand it:
graph g {
Analysis--{
AlgorithmEvaluation,
GrowthOfFunctions,
AsymptoticAnalysis
}
AlgorithmEvaluation--Approaches
Approaches--{
EmpiricalApproach,
TheoreticalApproach
}
TheoreticalApproach--{ComputationalModel, ResourceConsumption}
ComputationalModel--RAM_Model
GrowthOfFunctions--GrowthOFFunctionsSimplifications
GrowthOfFunctions--GrowhHierarchy
RAM_Model--{
RAMModelModel,
RAMModelAssumptions
}
RAMModelAssumptions--Operations
ResourceConsumption--CountingConsumption
ResourceConsumption--InputDependentCounting[label="depends of"]
InputDependentCounting--CountingCases
AsymptoticAnalysis--{BigO,BigΩ}
{BigO,BigΩ}--BigΘ
BigO--Littleo
BigΩ--Littleω
}
Here is the resulting graph:
There are few new concepts here. You can see one can use UTF-8 and connect multiple nodes within braces to another node, thus creating edges between all nodes in the set to the target node. This is awesome, but I needed to improve my mind map to include concepts and make it more descriptive. I wanted my nodes to have fields that could connect to each other.
Record-based nodes
Within dot there is a tiny language that is literally described in three lines of documentation, and yet, it is impressively useful. Record-based nodes are one of the shapes available in dot. When you use them you can define the node labels using a language with this specification :
rlabel = field ( '|' field )*
where field = fieldId or '{' rlabel '}'
and fieldId = [ '<' string '>'] [ string ]
That's it! What it does is the following: fields are separated with pipes. Curly braces create a set that will be shown in the alternate direction, that is, if the previous direction was horizontal, within braces it will be vertical. Here's an example from the documentation:
You can see that each node now have several fields and that it is possible to connect edges between fields of different nodes.
This is achieve with the following code:
digraph structs {
node [shape=record];
struct1 [label="<f0> left|<f1> mid&#92; dle|<f2> right"];
struct2 [label="<f0> one|<f1> two"];
struct3 [label="hello&#92;nworld |{ b |{c|<here> d|e}| f}| g | h"];
struct1:f1 -> struct2:f0;
struct1:f2 -> struct3:here;
}
Now let's use record based nodes to improve my mind map. I think it is really interesting that such a tiny language that is specified in 3 lines can allow us to change that previous simple graph into this much more descriptive one.
I'm quite happy with that mind map. Let's take a look at the code. You'll notice that nodes are customized by using square brackets after the node declaration. I like to separate the relations part of the code from the description of the nodes, so I repeat the each node after the relations section ends.
Within the square brackets we define a label using our tiny language.
Notice that I usually start the label with an open curly braces. That is to make fields vertical. Whenever I need another line, I simply use a pipe. If I need to split a line into columns, I open another curly braces. Finally, if I need to reference a field from a node I use the notation nodeA--nodeB:fieldname. Within fieldB we include the name within angle brackets.
Dot may not be a language widely used and it won't provide you with all customization options a GUI tool may offer, and there is a more generalist option in dot, where you can use HTML like tags, but I really enjoy the elegant simplicity of theses record-based nodes.
graph g {
node[shape=record];
size="8.3,11.7!";
margin=0;
// config
overlap="prism"
overlap_shrink=true
overlap_scaling=1
// Structure
Analysis--{
AlgorithmEvaluation,
GrowthOfFunctions,
AsymptoticAnalysis
}
AlgorithmEvaluation--Approaches
Approaches--{
EmpiricalApproach,
TheoreticalApproach
}
TheoreticalApproach--{ComputationalModel, ResourceConsumption}
ComputationalModel--RAM_Model
GrowthOfFunctions--GrowthOFFunctionsSimplifications
GrowthOfFunctions--GrowhHierarchy
RAM_Model--{
RAMModelModel,
RAMModelAssumptions
}
RAMModelAssumptions:f0--Operations
ResourceConsumption--CountingConsumption
ResourceConsumption--InputDependentCounting[label="depends of"]
InputDependentCounting:f0--CountingCases
AsymptoticAnalysis--{BigO,BigΩ}
{BigO,BigΩ}--BigΘ
BigO--Littleo
BigΩ--Littleω
// Descriptions
Analysis [
label="{
Analysis of Algorithms|
Prediction of the resources\n
required by an algorithm}"
style=filled,
background="#eeeeee"
]
AlgorithmEvaluation[
label="{
Analysis|
Correctness|
Ease of understanding|
<f0> Resource consumption
}"
]
GrowthOfFunctions [
label="Growth of Functions"
]
AsymptoticAnalysis [
label="{
Asymptotic Analysis|
describe the behaviour of\n
a function when the argument\n
tends to infinity
}"
]
EmpiricalApproach[
label="{
Empirical|
- Precise measurement\n
- Implementation required\n
- Specific technology
}"
]
TheoreticalApproach[
label="{
Theoretical|
- Estimate measurement\n
- Implementation independent\n
- Universal (within model)
}"
]
ComputationalModel[
label="{
Computational Model|
A simplified representation\nof the implementation\n technology to be used
}"
]
RAM_Model[
label="{RAM Model|Random-Access\nMachine Model}"
]
RAMModelModel[
label="{
Model|
IO\n
CPU\n
Storage
}"
]
RAMModelAssumptions [
label="{
Assumptions|
{
time |
{ Syncrhronous (Single CPU) |
<f0> Constant Time Operations\n (1 simple operation=1 time unit)|
Loops and functions ∉ Simple Op|
No memory hierarchy
}
} |
{
space |
Constant Space Varibles
}
}"
]
Operations [
label="{
Simpl Operations|
Numerical\nControl\nData Movement
}"
]
ResourceConsumption [
label="{
Resource Consumption| {
Time Complexity| Space Complexity
}|
Create a function that describes\n
the resource consumption of the algorithm
}"
]
CountingConsumption [
label="{
Time & space units\n
counting tips:|
{count|{
memory reads/assignments|
operations inside for loop header|
function calls & return|
loop variables
}}|
{don't count|{
argument variables
}}
}"
]
GrowthOFFunctionsSimplifications [
label="{
Simplifications|
{why?|{
Rough approximation|
RAM model assumptions|
Interested in growth
}}|
{how?|{
Use constants|
Ignore constants|
Ignore lower terms
}
}
}"
]
GrowhHierarchy [
label=" {
Typical growth\n
functions
|
constant\n
logN\n
N\n
NlogN\n
N²\n
2^N\n
N!
}"
]
InputDependentCounting [
label="
{
Input Size|
One expression
}|{
Input Content|<f0>
Several expressions
}
"
]
CountingCases [
label="
{
Worst case|
Maximum number of\n instructions
}
|
{Average Case}
|
{
Best Case|
Minimum number of\n instructions
}"
]
BigO [
label="{
Big O |
Asymptotically\nnot slower than|
T(N) ∈ O(g(N)) if:\n
c*g(N)≥T(N) ∀ N ≥ n'
}"
]
Littleo [
label="{
Little o|
Asymptotically\n faster than|
c*g(N) \> T(N) ∀ N ≥ n'
}"
]
BigΩ [
label="{
Big Ω|
Asymptotically\nnot faster than|
T(N) ∈ Ω(g(N)) if:\n
c*g(N) ≤ T(N) ∀ N ≥ n'
}"
]
Littleω[
label="{
Little ω |
Asymptotically\nslower than |
c*g(N) \< T(N) ∀ N ≥ n'
}"
]
BigΘ [
label="{
Θ notation|
Both Big O and Big Ω|
c*g(N)≥T(N) ∀ N ≥ n'\n
c*g(N) ≤ T(N) ∀ N ≥ n'
}"
]
}