Learning to Love Document Generation
Hi. I'm Joe, and this is now my blog. For my sketchblog that was previously on write.as, see my new sketchblog.
F:\freelance\repos\Dawnline\Dawnline\Build>raptor.py "http://pubassets.voidspiral.com/dawnline/The Dawnline.html"
DOCRAPTOR: http://pubassets.voidspiral.com/dawnline/The Dawnline.html > The Dawnline.pdf
generating...
I'm compiling The Dawnline right now.
Which is pretty freaking cool, to be honest.
Let's take a step back to 2016, when I was still working on Oubliette Second Edition. At that time, and for years before it, I was using Adobe Indesign for laying out my tabletop role-playing games. Like many others, I thought this was pretty much the only way to get industry-standard layout results. Obviously I'd used it a lot in undergrad, and in fact I'd even used it back in high school, when I'd discovered how much better (even at the time) it was than the dreaded QuarkXPress. Anyway, at the end of my undergrad, I purchased the CS6 design collection, seconds before Adobe announced the whole Creative Cloud thing, with its new pricing structure. It's a spectacular understatement to say that I was glad I got out while I could.
Back in 2016, though, I was still “happily” using Indesign CS6 (and Photoshop and Illustrator and Acrobat, but this isn't about those guys) in my day-to-day job as a game designer. The past projects had taught me that the only way to not waste weeks re-laying-out documents after single-letter typo fixes was to find a way to automate the process.
This started me down the long road of automation. I thought I knew what that was when I started playing around with using Markdown to generate ICML for Indesign using Pandoc, but I'd barely even scratched the surface. That process wasn't much more automated than working with original manuscripts in Word, but what it did do was introduce me to the possibilities of standardized content with no formatting involved in the manuscript.
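For readers who haven't seen that step, here is a minimal sketch of driving Pandoc from Python to turn a Markdown manuscript into ICML. It is an illustration, not the script used on Oubliette: the function names and file names are hypothetical, and it assumes `pandoc` is installed and on the PATH.

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_icml_command(markdown_path: Path, icml_path: Path) -> list[str]:
    """Build the Pandoc command line that converts Markdown to ICML."""
    return [
        "pandoc",
        "--from", "markdown",
        "--to", "icml",
        "--standalone",  # emit a complete ICML document rather than a fragment
        "--output", str(icml_path),
        str(markdown_path),
    ]


def convert_to_icml(markdown_path: Path, icml_path: Path) -> None:
    """Run Pandoc; raises CalledProcessError if the conversion fails."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    subprocess.run(pandoc_icml_command(markdown_path, icml_path), check=True)
```

Calling `convert_to_icml(Path("chapter-01.md"), Path("chapter-01.icml"))` would produce a file that can be placed into an Indesign document. Keeping the command in its own function makes it easy to inspect or log before anything runs.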
For those who don't know, in web design, you generally keep your content and the style of that content as separate as you possibly can, for many reasons. That's why we have HTML and CSS, and why they're not the same thing.
So with my Markdown ► ICML ► Indesign pipeline I'd at least standardized my manuscripts. I couldn't accidentally change a style or something anymore. But this setup wasn't perfect: my ICML still needed to be processed with Python and a tree-parsing library to fix various bullshit, and it was nowhere near able to correctly handle tables. If you know ttrpgs, you see the problem: they're full of tables.
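The post doesn't say what the ICML fixes actually were, so the following is only a sketch of the general shape of such a cleanup pass, using the standard library's `xml.etree.ElementTree`. It assumes the ICML marks paragraphs with `ParagraphStyleRange` elements carrying an `AppliedParagraphStyle` attribute; the style names in `STYLE_MAP` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from generated style names to the names a layout
# template expects. Real names depend on the converter and the template.
STYLE_MAP = {
    "ParagraphStyle/Paragraph": "ParagraphStyle/Body",
    "ParagraphStyle/Header1": "ParagraphStyle/Chapter Title",
}


def remap_paragraph_styles(icml_text: str, style_map: dict[str, str]) -> str:
    """Return the ICML with paragraph style names swapped per style_map."""
    root = ET.fromstring(icml_text)
    # Walk every paragraph range in the tree, however deeply it is nested.
    for node in root.iter("ParagraphStyleRange"):
        current = node.get("AppliedParagraphStyle")
        if current in style_map:
            node.set("AppliedParagraphStyle", style_map[current])
    return ET.tostring(root, encoding="unicode")
```

One caveat: `ET.fromstring` does not keep the XML declaration or any processing instructions that appear before the root element, so a real script would need to write those back out before handing the file to Indesign.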
I finished out Oubliette and moved on to a new project, and with it, a new setup. On Heroines of the First Age I figured out a way to chop out the table content in the manuscript (written in Markdown of course) using more Python. That in turn was broken down into plaintext and placed in Illustrator table documents that live-updated based on the content. Then those tables were placed into the Indesign document, which was also driven by the edited Markdown.
If that sounds complicated, it was. Just imagine inventing it. Trial and error was involved, to say the least.
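To give a flavor of the table-chopping step, here is a minimal sketch that pulls pipe tables out of a Markdown manuscript and flattens each one to tab-separated text. The pipe-table syntax, the tab-separated output, and every name here are assumptions for illustration; the post doesn't specify the table format or the plaintext layout the Illustrator documents consumed. Escaped pipes inside cells are not handled.

```python
import re

# A pipe-table delimiter row such as "|---|:---:|" or "--- | ---".
DELIMITER_ROW = re.compile(r"^\s*\|?\s*:?-+:?\s*(\|\s*:?-+:?\s*)*\|?\s*$")


def split_row(line: str) -> list[str]:
    """Split one pipe-table row into trimmed cell strings."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def extract_tables(markdown: str) -> list[list[list[str]]]:
    """Return every pipe table as a list of rows, each row a list of cells."""
    lines = markdown.splitlines()
    tables = []
    i = 0
    while i < len(lines) - 1:
        # A table starts with a header row followed by a delimiter row.
        if "|" in lines[i] and DELIMITER_ROW.match(lines[i + 1]):
            rows = [split_row(lines[i])]
            i += 2  # skip the header and the delimiter row
            while i < len(lines) and "|" in lines[i]:
                rows.append(split_row(lines[i]))
                i += 1
            tables.append(rows)
        else:
            i += 1
    return tables


def to_tsv(table: list[list[str]]) -> str:
    """Flatten one table to tab-separated plaintext, one row per line."""
    return "\n".join("\t".join(row) for row in table)
```

Each table's text could then be written to its own file for a downstream tool to pick up, which keeps the manuscript as the single source of truth for table content.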
That process was ok, but it was only a little while before I found a new tool called PrinceXML. It consumes HTML and spits out PDFs. AMAZING. I'd just stumbled upon the holy grail of document generation: Write books like a web page. What could be better? Sure, if I want to design something more like a magazine ever, I'll probably have to go back to Indesign (maybe... who knows, I've unlocked the Gates of Babylon of doc gen here, I'm almighty), but I could generate the layout from the content, instead of doing it separately. I don't know if I can properly articulate how important that was for me.
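The "HTML in, PDF out" step is short enough to sketch. This assumes the `prince` command-line tool is installed locally and uses its `-s` (stylesheet) and `-o` (output) options; the function names and file names are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import Optional


def prince_command(html: str, pdf: Path, stylesheet: Optional[Path] = None) -> list[str]:
    """Build the Prince command line for rendering one HTML file to PDF."""
    command = ["prince"]
    if stylesheet is not None:
        # Apply an extra print stylesheet on top of whatever the HTML links to.
        command += ["-s", str(stylesheet)]
    command += [html, "-o", str(pdf)]
    return command


def render_pdf(html: str, pdf: Path, stylesheet: Optional[Path] = None) -> None:
    """Run Prince; raises CalledProcessError if rendering fails."""
    subprocess.run(prince_command(html, pdf, stylesheet), check=True)
```

With that in place, `render_pdf("book.html", Path("book.pdf"), Path("print.css"))` is the whole layout step: page size, margins, and typography all live in the CSS rather than in a layout application.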
I work mostly alone. When I have help, it's usually one person handing me a bit of content, editing something and making suggestions, or giving me artwork. It's not hyperbole to say that I've basically got to do all the jobs in the shop. So not having to focus so much of my time on layout was dramatically disruptive to my workflow.
Now, of course, I still have to do some graphic design and layout, from image placement to page size and typography, but the point is that I don't have to manually do every single freaking thing anymore. Huge time saver.
And I thought that I'd be golden. I thought that this would be the end of all my troubles. But as it turns out, while my headaches are greatly reduced, they're not gone. Because I'm not rolling in cash, I've got to use a hosted service to compile my final PDF documents, instead of using nearly-instant PrinceXML on my local machine. I still use the local PrinceXML to preview my documents, which means it's possible for errors to creep in between when I do my previews and when I do the final build.
Those errors can eat up time, just like adjusting the layout after fixing typos in Indesign. And it makes me sad that this new process isn't a gold-plated silver bullet. But here's what I've gained:
- Control over my process. My pipeline is all scripted, and I control those scripts. If I need to add something, I can do it myself.
- Control over my data. I'm not vendor locked to any app, except maybe to the PrinceXML parser, which is distributed enough that I'm not that worried. They don't own (or even retain) my data or the format that it's in.
- Speed. It's much, much faster to iterate on a document, which means I can afford to do more testing and more revisions.
- CSS tools. I can do almost anything in a book that CSS can do, which is actually quite a bit more than Indesign CS6 could do.
- Data durability. I keep my manuscripts in git repositories now, so I can not only work on them on multiple computers, but I can also share them and keep them safe from a total loss at the office.
- One click builds. I can just hit build.bat and get a completed document in a couple of minutes. This is excellent for testing and distro.
- No more hyper-fragile documents. When working on long-form documents, Indesign would often turn into a molasses-simulator, and I spent many an hour touching it ever-so-carefully, like trying to pet a skittish wild bird with a remote-controlled robotic arm.
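As an illustration of the scripted, one-click build from the list above, here is a minimal sketch of a build driver that a file like build.bat could call. It runs named steps in order and stops at the first failure. The step names, script names, and file names are all hypothetical; the real pipeline's steps aren't spelled out in this post.

```python
import subprocess
import sys
import time

# Hypothetical step list; the script and file names are placeholders.
STEPS = [
    ("extract tables", [sys.executable, "extract_tables.py", "manuscript.md"]),
    ("render pdf", ["prince", "book.html", "-o", "book.pdf"]),
]


def run_steps(steps: list[tuple[str, list[str]]]) -> list[tuple[str, float]]:
    """Run each named command in order and return (name, seconds) pairs."""
    timings = []
    for name, command in steps:
        started = time.perf_counter()
        # check=True raises CalledProcessError on a non-zero exit status,
        # so later steps never run against stale or missing output.
        subprocess.run(command, check=True)
        timings.append((name, time.perf_counter() - started))
    return timings
```

Calling `run_steps(STEPS)` from the script's entry point gives a single command to build everything, and the returned timings make it easy to see which step is eating the couple of minutes.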
Here are the things that aren't so great:
- Ease of Collaboration. While it's pretty easy to get people set up on the repo, it's not Google Docs. If I'm going to collaborate with someone in Gdocs, I've got to teach them good Markdown habits.
- Layout errors. Instead of troubleshooting in a GUI, I've got to do it “blind.” If there's a problem with the placement of something, I've got to handle that in the code rather than simply dragging it to where I want. That's mostly fine with me, coming from a web-design background.
- Tech support is me. If I get problems in the build, it's up to me alone to figure them out and fix them. This is mostly only a problem when I'm transitioning between phases of a project and the build process needs to update to reflect new goals, but it can be just as frustrating as Indesign sometimes. At least with these errors, I can fix them instead of just accepting that the program will never change its behavior or get an update to add a new feature.
But I mean, if that's all, I'd say it was more than worth it.