
I'm a technical writer and good diagrams are extremely valuable. I'm very skeptical of "diagrams as code" because it seems like the goal is to minimize effort, not to produce a useful diagram.

Good diagrams, including architecture diagrams, require careful consideration: What does the reader need to get from this diagram? What should I include and what should I leave out? How do I organize items so that readers can understand the diagram quickly?



Maybe diagrams made in, say, Visio, draw.io, or Inkscape are fine when you're the sole author of them, and you create typographic-quality documentation, a business presentation, or otherwise need eye candy.

In my practice across a few companies, diagrams made in code-oriented tools like Graphviz, PlantUML, or Mermaid were vastly more useful for documenting real, evolving engineering projects.

* You can store them along with other documentation texts. Their source is always obvious.

* Version control! You can diff them and make sensible pull requests when updating some small aspect of a larger diagram, along with text, and possibly as a part of a code change. You can review, discuss, and approve the changes using tools familiar to every engineer.

* The limited amount of visual tweaking the language allows (Graphviz, quite some; Mermaid, minimal) prevents bouts of bikeshedding and keeps the diagrams focused on their essential parts.

* Since the language is open, and no special tools are needed (except a text editor and a renderer on a web page), the diagrams are now everyone's concern and purview, as opposed to the person who happens to be good at graphics.

* The formats are not locked up; you don't need an authoring license, a subscription, a particular OS to run the editor, etc. You own your diagrams.

* Sometimes you can generate these diagrams from other data you have, pretty easily, and with consistent results.

These considerations appear to outweigh whatever visual imperfections these tools have.
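
As a rough illustration of the "generate these diagrams from other data" point, here is a minimal Python sketch that turns a dependency mapping into Mermaid flowchart text. The `services` inventory and the `to_mermaid` helper are hypothetical, invented for this example; the only assumption about Mermaid is its basic `graph TD` / `A --> B` syntax.

```python
def to_mermaid(dependencies):
    """Render a {service: [services it calls]} mapping as a Mermaid flowchart."""
    lines = ["graph TD"]
    # Sort so the same input always yields the same text (and a clean diff).
    for source in sorted(dependencies):
        for target in sorted(dependencies[source]):
            lines.append(f"    {source} --> {target}")
    return "\n".join(lines)


# Hypothetical service inventory, e.g. loaded from a config file.
services = {
    "web": ["api"],
    "api": ["db", "cache"],
}
print(to_mermaid(services))
```

Because the output is sorted, regenerating the diagram after a small change to the data produces a small textual diff, which is what makes it reviewable in a pull request.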


I very much agree and the only thing that I'm missing is tools that generate a somewhat stable output. What I want is tools that take their last output into consideration and generate new diagrams that are both correct and as similar as possible to the last output they produced. They should do what a human does out of necessity when updating a diagram manually - make it represent the current state correctly with minimal effort.

Of course that means the diagrams will not be optimal in any case. For example, there could be better arrangements of boxes or fewer crossings of lines. I still think stable diagrams are important for our limited brains, and one important reason why people shy away from automatic diagram generation is the mental chaos a complete rearrangement of elements brings with it.

One example of software that got this right is TaskJuggler; the obvious counterexample is Graphviz.


> What I want is tools that take their last output into consideration and generate new diagrams that are both correct and as similar as possible to the last output they produced.

Something I've been struggling with as well. Now it sounds like a task for ChatGPT and dot, need to try that today.


If you're not too fussy about results it might not be "that hard" to load a previous layout into dot, and optionally mark the nodes and edges that are allowed to be moved (or preferred not to be moved), and run the algorithms on that. It "just" needs someone to work on it, or pay for developers to do it. We wrote a paper about this in the mid-90s and Gordon Woodhull even got a successor of that to work, see dynagraph.org
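
This is not the dot extension described above, but a loosely related idea can be sketched with what Graphviz already offers: the neato and fdp engines accept a `pos` attribute on nodes, and a trailing `!` pins the node at that position. The Python sketch below writes DOT text in which nodes with a remembered position are pinned and new nodes are left free. The `pinned_dot` helper and the `previous_positions` mapping are hypothetical, and the coordinates are assumed to already be in the units the engine expects on input.

```python
def pinned_dot(edges, previous_positions):
    """Emit DOT text in which nodes with a known position are pinned.

    `edges` is a list of (a, b) pairs; `previous_positions` maps a node
    name to the (x, y) it had in an earlier layout (hypothetical input).
    """
    nodes = sorted({name for edge in edges for name in edge})
    lines = ["graph G {"]
    for name in nodes:
        if name in previous_positions:
            x, y = previous_positions[name]
            # The trailing "!" asks neato/fdp to keep the node at this position.
            lines.append(f'  "{name}" [pos="{x},{y}!"];')
        else:
            # New nodes carry no position, so the engine is free to place them.
            lines.append(f'  "{name}";')
    for a, b in edges:
        lines.append(f'  "{a}" -- "{b}";')
    lines.append("}")
    return "\n".join(lines)


print(pinned_dot([("a", "b"), ("b", "c")], {"a": (0, 0), "b": (1, 0)}))
```

The result would be rendered with neato or fdp rather than dot, so it trades dot's layered layouts for stability.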


The graph drawing community extensively studied algorithms that update and "preserve the user's mental map" of various types of layouts. It's all heuristics anyway; sometimes they produce locally optimal layouts, but it's not as if extant layout systems are producing mathematically optimal layouts in general.


I find the code-based drawing tools to be a little finicky for my taste, and the compilation pipelines turn into a real pain if you want to, say, embed a .png in a markdown file and just have it display in your Git web interface of choice (obviously GitHub's Mermaid support helps if you're in that environment).

My preferred workflow is to use draw.io's PNG support. They embed the draw.io data into the PNG's chunks, so a regular image viewer opens it fine but you can directly edit the file in draw.io. If you have Git set up to diff images, that also "just works" when reviewing the history. Plus, there's an excellent draw.io plugin for VS Code that embeds the editor right into VS Code and will transparently open and save "foo.drawio.png".
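
For anyone curious what "embedded in the PNG's chunks" looks like, here is a small Python sketch that lists the chunks of any PNG and the keywords of its textual chunks. It relies only on the PNG container layout (signature, then length/type/data/CRC records); which chunk type and keyword draw.io actually uses is not assumed here, and the helper names are made up for this example.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
TEXT_CHUNK_TYPES = {"tEXt", "zTXt", "iTXt"}


def png_chunks(data):
    """Yield (chunk_type, chunk_data) pairs from the bytes of a PNG file."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    offset = len(PNG_SIGNATURE)
    while offset + 8 <= len(data):
        # Each chunk is: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, kind = struct.unpack(">I4s", data[offset:offset + 8])
        yield kind.decode("ascii"), data[offset + 8:offset + 8 + length]
        offset += 12 + length


def text_keywords(data):
    """Return the keywords of all textual chunks, in file order."""
    return [
        chunk.split(b"\x00", 1)[0].decode("latin-1")
        for kind, chunk in png_chunks(data)
        if kind in TEXT_CHUNK_TYPES
    ]
```

Calling `text_keywords` on the bytes of a "foo.drawio.png" file should show whether a textual chunk is carrying the diagram source. The sketch does not verify CRCs or decompress anything.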


I just keep my draw.io files, generated pngs, and markdown docs all with the code. Version control, etc.


With draw.io you can also embed the diagram source in the exported PNGs.


Minimizing effort is IMHO almost always a worthy goal. All else being equal, less friction in making or maintaining a diagram means diagrams are more likely to be made and maintained, or that more thought can go into a fixed number of diagrams. People who don't care about diagrams being useful when making a diagram is easy won't suddenly start caring when it's hard.

Also, not all diagrams are there to communicate a carefully considered point. One other value of "diagrams as code" is the ease with which you can turn the output of some program into a diagram, providing another useful way to think about the problem you're solving. For example, when debugging or verifying correctness of code I've been writing, I find it occasionally useful to add a bunch of print statements sculpted so that I can paste the output log into PlantUML and immediately get a nice, visual representation of the thing I care about. Such diagrams are throwaway, generated on the fly, but can be of great help.


That is a very valid point. A bit of a counter argument at the bottom:

There is a model and auto diagrams which represent the current state. Good for analysis (automatic or not etc). Can go very detailed. Model as Code, in a database, xmi files, auto analyzed from code. Whatever.

However, there is a difference in the architecture depending on the moment and the perspective. Before coding, you actually communicate the architectural solution. Here presentation matters a lot. A pure model with auto-diagramming sucks. Same when you go to a manager. They do not give a shit about the correctness. If they do not find themselves within 2 seconds in your diagram, the diagram is lost.

So unfortunately, both angles have very important uses.


That sounds useful but I can't quite picture what I'd do. Do you have an example?


Random example I did the other day: I wrote a small Dependency Injection library for a project in Common Lisp. Because it handled wiring up both big and small systems, and was designed to be composable, it wasn't easy to see the entire graph of all the components from code. It wasn't impossible - I designed it to be declarative and locally readable - it was just tedious.

My solution for this was to write a helper function that, when called, would walk the entire graph of those components, and print it in PlantUML form. More specifically, the code walked the pointer graph and made a mirror of it, in the form of lists of vertices and edges - then, it would loop over vertices to print a "class Foo as Bar <<Baz>>" kind of line for each of them, and then loop over edges, printing lines like "Foo <-- Quux" or "Foo +.. Quux". The result, also bracketed by @startuml and @enduml lines, could be copy-pasted or streamed directly into a file, which I then fed straight to PlantUML, resulting in a very nice DAG of actual relationships between all concrete instances of components in the running program.
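
The original helper was written in Common Lisp; the following is only a rough Python rendering of the same idea, not the actual code. The `Component` class and its `name`, `alias`, `kind`, and `deps` fields are hypothetical stand-ins for whatever the real library tracks, and the arrow direction (dependency on the left) is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """Hypothetical stand-in for a wired-up component instance."""
    name: str
    alias: str
    kind: str
    deps: list = field(default_factory=list)


def to_plantuml(root):
    """Walk the component graph from `root` and return PlantUML text."""
    vertices, edges, seen = [], [], set()
    stack = [root]
    while stack:
        component = stack.pop()
        if id(component) in seen:
            continue  # shared dependencies are listed only once
        seen.add(id(component))
        vertices.append(component)
        for dep in component.deps:
            edges.append((component, dep))
            stack.append(dep)

    lines = ["@startuml"]
    for v in vertices:
        lines.append(f'class "{v.name}" as {v.alias} <<{v.kind}>>')
    for user, dep in edges:
        lines.append(f"{dep.alias} <-- {user.alias}")
    lines.append("@enduml")
    return "\n".join(lines)


db = Component("Database", "db", "Storage")
repo = Component("UserRepo", "repo", "Service", [db])
app = Component("App", "app", "System", [repo, db])
print(to_plantuml(app))
```

The mirror-then-print split keeps the traversal separate from the output format, so the same vertex and edge lists could feed Graphviz or Mermaid instead.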


> add a bunch of print statements sculpted so that I can paste the output log into PlantUML

This sounds quite amazing but I really don't understand how one could actually do it.


Simple. You just make the program print out all the data that will allow you to construct a visualization you want. How to best do it depends on your particular problem and preference.

So, on one side of the map, you have your program, which you can modify to make it print stuff. On the opposite side, you have PlantUML (or GraphViz, or ggplot for plotting charts, etc.). How you connect them to produce the visualization you need is up to you.

For simpler problems, you can do it the way I described in the part you quoted - print diagnostics about an object in a format that's valid PlantUML syntax. For example, as you walk a DAG of objects, do:

  print('"%s" <-- "%s" : "%s"' % (self.name, other.name, "Some relation"))

You can then literally copy and paste that part of stdout into a file, perhaps decorate it with @startuml and @enduml (or just script it away too), and you'll get your diagram.

For more complex problems, you'll want to print what you can where you can, and have an auxiliary script to sort and group that output properly, and then write out a PlantUML diagram.
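
A minimal sketch of such an auxiliary script, in Python. It assumes the program marks its diagram-relevant lines with a made-up `REL <from> <to>` prefix, mixed in with other log output; the script filters those lines, collapses duplicates into a count, sorts them, and wraps the result in @startuml/@enduml.

```python
from collections import Counter


def log_to_plantuml(log_text, prefix="REL "):
    """Turn 'REL <from> <to>' log lines into a PlantUML diagram."""
    counts = Counter()
    for line in log_text.splitlines():
        if line.startswith(prefix):
            parts = line[len(prefix):].split()
            if len(parts) == 2:  # skip malformed lines
                counts[tuple(parts)] += 1

    lines = ["@startuml"]
    # Sorting keeps the output stable from run to run.
    for (source, target), n in sorted(counts.items()):
        lines.append(f'"{source}" --> "{target}" : seen {n}x')
    lines.append("@enduml")
    return "\n".join(lines)


log = "starting\nREL a b\nnoise\nREL a b\nREL b c\n"
print(log_to_plantuml(log))
```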

In the middle sits a technique I found particularly useful. PlantUML itself is a programming language (and probably a Turing-complete one). It has variables. It has functions. It has preprocessor macros. You can prepare yourself a small DSL in PlantUML, so that you can meet your program in the middle - whether it's because you want to keep your output more readable, or because it would be too invasive to modify a program to print things the way PlantUML wants them.

Don't know what else to say here. The idea itself is not something particularly brilliant, or difficult.


Doing a diagram with Miro is less friction than in code imo


Architecture diagrams have 3 different lifespans.

The shortest one is for a single communication. Think "created in a meeting, used in a meeting". These need to have zero friction to make, their maintenance life is measured in hours, and quality doesn't really matter. You'll do best to use a whiteboard or paper. Take a photo if, within 24 hours, you regret having thought it was of no use, and convert it to one of the other formats.

The second one is for a planning stage. You might argue different design decisions and their merits. You need to be able to point to a few different solutions to keep the dialog going. These likely need to be pre-made (can't make them IN the meeting as that's too time consuming) but they should be easy to adjust during a series of meetings. Here is where you may want to use some simple drafting tool or Visio or similar. This type of diagram also works well for a presentation where you want to show a group of people a design/architecture - but you can then throw away or archive the presentation. The key is that these diagrams aren't live documents. They have a lifespan limited by some specific event such as a meeting, a presentation. You may be tempted to keep them, but it will be a historic artifact. A recording of the presentation is more useful than the diagram itself. The key thing to remember is that they are useless for describing an evolving system as no one has ever successfully maintained/updated a set of Visio drawings over time to accurately reflect the system they described.

The third one is when you want to achieve that impossible goal of having a living document. The only way to have a living document for software is as code/text, since the only source of truth is version control. Either it needs to be constantly modified, or (ideally) it has to be constantly auto-generated. The quality of a generated document is going to be nowhere near a hand-drawn Visio thing, but on the other hand it will describe the system, unlike the Visio document which was last updated six years ago.


Most of the architecture diagrams that I review and sometimes write are of the second kind. Once you have planned and set for execution, the architecture rarely changes. If it did, then there was something fundamentally wrong with the first architecture and as a result this diagram will be modified in a review or a rewrite or whatever.

A live architecture document derived from code, even on a batch basis if not in real time, would be a really cool endeavour. If you can somehow look at a repo and generate a diagram describing all the different subsystems and their interactions, boy, imagine that!

A simple traceability might be just through imports starting with the main program. Start with the main functions and trace them through. Maybe using a tool like Sourcegraph.
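
A minimal sketch of the import-tracing idea, assuming a Python codebase and using only the standard `ast` module (not Sourcegraph). The `sources` mapping of module name to source text is a hypothetical stand-in for reading files from a repo; the output is Graphviz DOT text.

```python
import ast


def imported_modules(source):
    """Return the sorted top-level module names imported by Python source text."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            found.add(node.module.split(".")[0])
    return sorted(found)


def import_graph(sources, entry):
    """Follow imports from `entry` through `sources` and return DOT text."""
    edges, seen, queue = [], set(), [entry]
    while queue:
        module = queue.pop(0)
        if module in seen or module not in sources:
            continue
        seen.add(module)
        for dep in imported_modules(sources[module]):
            if dep in sources:  # ignore stdlib and third-party modules
                edges.append((module, dep))
                queue.append(dep)
    lines = ["digraph imports {"]
    lines += [f'  "{a}" -> "{b}";' for a, b in edges]
    lines.append("}")
    return "\n".join(lines)


# Hypothetical three-module project.
sources = {
    "main": "import api\nimport os\n",
    "api": "from storage import save\n",
    "storage": "import json\n",
}
print(import_graph(sources, "main"))
```

This only sees static imports and treats each module as one box, so it is closer to a starting point for the "subsystems and their interactions" diagram than to the diagram itself.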


Between the high level architecture diagrams which don't change much and the code itself, it feels like we are missing a strategy or tool for visualising mid-level code structure which is in between these two ends of the spectrum.

At work we have a bunch of Python services which do REST API and CRUD for the most part. There we have a simple plain Python dependency injection setup which puts together the different components/classes of the service and makes sure that each is provided with references to its dependencies. This is defined in one Python file. I experimented with parsing this file and graphing the relations between the components, and the result was pretty good (given my low effort). It did a good job of exposing the relations between most of the important classes and where things had gotten messy.
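
The actual wiring file is not shown, so the following is only a guess at what such an experiment could look like. It assumes the wiring is a sequence of module-level assignments of the form `name = SomeClass(dep, other=dep2)`, and uses the standard `ast` module to extract (component, dependency) pairs. The `WIRING` text and all class names in it are invented for the example.

```python
import ast


def wiring_edges(source):
    """Return (component, dependency) pairs from `name = Class(dep, ...)` lines."""
    edges, known = [], set()
    for stmt in ast.parse(source).body:
        if not (isinstance(stmt, ast.Assign) and isinstance(stmt.value, ast.Call)):
            continue
        target = stmt.targets[0]
        if not isinstance(target, ast.Name):
            continue
        # Positional and keyword arguments both count as injected dependencies.
        args = list(stmt.value.args) + [kw.value for kw in stmt.value.keywords]
        for arg in args:
            if isinstance(arg, ast.Name) and arg.id in known:
                edges.append((target.id, arg.id))
        known.add(target.id)
    return edges


# Hypothetical wiring file contents.
WIRING = """
db = Database("example-dsn")
cache = Cache()
users = UserRepo(db, cache=cache)
api = Api(users)
"""

for component, dependency in wiring_edges(WIRING):
    print(f'"{component}" -> "{dependency}";')
```

The printed lines can be wrapped in `digraph { ... }` and fed to Graphviz. The file is parsed, never executed, so the classes do not need to be importable.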


Super useful, thanks!


It is extremely hard to keep documentation up to date without a full-time technical writer (GitHub had none). Rendering a diagram usually means unique steps to get there, and once the maintainer is gone, so are the updates (a great example was how their load balancer diagram was 5 years out of date when I got there). I went about talking loudly about the importance of a diagram, which resulted in folks saying “they are not visual, they can only understand the code in their head”; senior and staff engineers said this. I immediately realized I had to solve this. I made a few actions that would render mermaidjs files (which were later rendered automagically without requiring actions), much like a DIAGRAM.md file. Rave reviews from everyone who came across it: “wow, now I understand how this platform works”, “it’s so easy to make an update to the diagram during architecture planning”.

Unfortunately my team was dismissed while I was in the middle of this project, so I never saw how it ended. To answer your questions, I advocated breaking the architecture into chunks. You can get the big picture by following each diagram, or include other diagrams in one large one, but the point is to focus each output on a specific readership (usually as part of the documentation that defines the components).


This is true, but for me it's in tension with my desires to a) minimize duplication, b) keep code in sync with docs, and c) keep cost of change low.

In service of that, I think the article's correct when it says, "we should be using code to generate architecture diagrams". Anything else produces expressive duplication between the system and the diagrams, which either increases the cost of change or guarantees documentation drift over time.


Diagrams are valuable and useful. But some of us simply can't create them using "graphical" tools. I literally can't even make a drawing for something as simple as a dog house with any on-screen tool (mouse, select, point, click, drag etc) - I've tried so many times. I just can't do it.

But I can write relationships, connections and the like using text, and then have a tool draw them for me. When I do that, I "see" something in my mind which is an overview of the architecture, the structure, whatever, just not in something which looks like an actual drawing would. And I can use that to write the "code".

I know it's hard for those who can do such things using everything from xfig to CAD tools, to imagine that it's difficult, or basically impossible, for some others to do the same. And it's not about training, my mind just doesn't work that way. At work we've been using graphical tools for architecture etc. in various forms on and off during the decades I've worked there, and I've used them during those same decades - it doesn't work. I never manage to do that. I waste months. Instead I have sometimes quickly written code to take descriptive input and used that to create output - e.g. FrameMaker output, or some *TeX output, or Graphviz, and saved tons of time (and for a much better result). And the "code" can be version controlled and diffed and the like, exactly as some other comments say, and what the article mentioned.

I'll take a deeper look at what the article says about this.


The problem is the physical place the boxes are arranged matters. It is easy to draw a box and all the relations, it is hard to make the drawing useful.

I once did an automatic creation (dot) of my system - my monitor turned entirely black. I was finally able to see things when I zoomed in to the level that there were 10 boxes on my screen - but the boxes were not related to each other in any meaningful way as the system doesn't have any way to capture box a and c are part of the same subsystem, while box b is part of a different one.
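
If the subsystem membership can be recovered from somewhere (a naming convention, a directory layout, a hand-kept table), Graphviz can at least be told about it: subgraphs whose names start with `cluster` are drawn as labelled boxes with their nodes kept together. A minimal Python sketch follows, with a hypothetical `subsystem_of` mapping supplying the information the generator was missing.

```python
from collections import defaultdict


def clustered_dot(subsystem_of, edges):
    """Emit DOT text with one cluster per subsystem.

    `subsystem_of` maps node name -> subsystem name (hypothetical input);
    `edges` is a list of (a, b) pairs.
    """
    groups = defaultdict(list)
    for node, subsystem in subsystem_of.items():
        groups[subsystem].append(node)

    lines = ["digraph system {"]
    for index, subsystem in enumerate(sorted(groups)):
        # Graphviz draws a box around subgraphs whose name starts with "cluster".
        lines.append(f"  subgraph cluster_{index} {{")
        lines.append(f'    label="{subsystem}";')
        for node in sorted(groups[subsystem]):
            lines.append(f'    "{node}";')
        lines.append("  }")
    for a, b in edges:
        lines.append(f'  "{a}" -> "{b}";')
    lines.append("}")
    return "\n".join(lines)


print(clustered_dot({"a": "billing", "b": "auth", "c": "billing"},
                    [("a", "b"), ("a", "c")]))
```

This does not make a huge graph readable by itself, but it gives the layout engine the grouping a human would have drawn.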


I consider diagrams as being useful at different levels. There's the high level, conceptual style for either non-technical or CxO level discussion.

There's the larger, more detailed with lots of annotations for architects, engineers.

Surprisingly, the higher level the diagram, actually the longer lived it is. This is because it details intent, and not implementation.

The worst, IMHO, is when things are super-detailed, as you either need to ensure it's kept updated, or it will be outdated as soon as it's complete.

P.S. all this discussion reminds me of a short story from Borges: https://en.wikipedia.org/wiki/On_Exactitude_in_Science


Correct, that is the way.

I stopped doing code blueprints. They are either documentation only or you should look at the code. There is however an area (think systems in a network) where typical IT Management / Cloud Management / etc. fails. Here documenting a blueprint model helps with analysis (e.g. checking against reality).


I'd love to have a middle ground; I'd like to provide a diagram generator with a layout template and layouting hints, then have a tool generate a diagram conforming to that layout. I feel like that would be the best of both worlds. I could emphasize boundaries and structures visually, but I wouldn't have to keep everything updated by hand (as long as things don't drift too far).


I’m a big fan of diagrams as code, but usually only when I’m the one making them.

Mainly because I usually start out with an idea of what I want then try to recreate it. Usually that means I’ve had to read the entirety of the tool’s documentation end to end, and often I’m still bummed because the tool can’t recreate what I see in my head. But whatever, I’ll compromise.

But a lot of people go the other way. They make the diagram out of what they know how to use, and so generally their diagrams-as-code diagrams lack the detail you’d find if they had created it with a WYSIWYG tool.

But I hate WYSIWYG tools. Editing and versioning them is challenging.


I like WYSIWYG tools for a static system. However, since I'm documenting a dynamic system that is changing all the time, the tool needs to reflect the actual state of the system as it is now, not how it was 2 weeks ago.

20+ years ago I worked with a UML code generation system, so our diagrams reflected reality: you couldn't change the code layout except by using the WYSIWYG tool. As such we always had correct, up-to-date diagrams. It was great: every Monday the first thing I did was print the current diagram out and paste it to my wall, then the rest of the week I'd refer to that all the time, but by Friday there were noticeable differences. Unfortunately such tools never caught on - in part because they have other limitations - but the diagrams were something I still miss.


Another way to say this is that diagrams and code have different audiences. Even if it is the same person, it is in a different context.


I think that diagrams as code become useful when all the stakeholders (engineers, coders, devops etc) are involved in building it together. It'll eliminate spaghetti code and ultimately hold information on important key decisions made by different builders. It'll be a type of dev tool that helps churn out diagrams (code visualization, code maps) where different levels of builders understand it and it holds all the information on the whys of the software.


> I'm very skeptical of "diagrams as code" because it seems like the goal is to minimize effort, not to produce a useful diagram.

It is, and this is a valid goal. The ideal you're chasing doesn't happen in most projects: minimising effort means those projects are much more likely to have an up-to-date diagram of some description.


I suppose it's not the same as "architecture" but since we set up our data pipelines in Dagster and the diagrams became something that the code generates, it is MUCH easier to onboard people and explain how everything relates. We had manually maintained ERDs before but they were never as good or up to date.


Imo good software engineering === minimizing effort.

More specifically:

* Minimizing effort in both the short and the long term (biased towards the long term because it's... longer).

* Minimizing effort as long as product quality is the same/better.

If you make things easy to do well, all other desired qualities will follow automatically.


> good software engineering === minimizing effort

Hello fellow Javascript developer! :-)


As I see it, the goal is to be able to version them, to keep them up to date and in sync alongside the code they relate to.

I'm certain you can make as crap a diagram in a WYSIWYG tool as you can in code, and that you can be at least as careful and organised in code.


I agree. Same as “the code is the doc” is a lazy way of saying “I don’t want to write doc”


Well, in that case, "the docs are written separately from the code" is just a lazy way of saying "the docs do not reflect reality or are out of date".

Waterfall development doesn't work. It doesn't really work for specs and it doesn't really work for documentation either.


Not sure what waterfall dev has to do with that, and "the docs do not reflect reality or are out of date" is always true (even if within the code).



