Automating standards publishing


In this presentation from the 2021 Typefi Standards Symposium, Guy van der Kolk, Product Manager at Typefi, gives a demonstration showing how Typefi’s automated publishing platform produces multiple output formats from a single NISO STS XML file.

Guy also shares examples of how typical content elements such as tables, figures, math, table of contents, hyperlinks, and alternative text are handled by Typefi’s sophisticated Adobe InDesign-driven composition engine.

“Typefi has several workflows that help us produce PDF, HTML, and EPUB from an XML file.”


00:00 Automated Standards Publishing
00:22 Typefi: XML to PDF workflow demo
03:18 Typefi: Adobe InDesign composition engine demo
06:52 Typefi: XML to EPUB workflow demo
08:02 Typefi: XML to HTML workflow demo
09:45 Typefi: XML to DOCX workflow demo

Automated Standards Publishing (00:00)

GUY VAN DER KOLK: To get this started, I’d like to talk specifically about two general areas. One of them is obviously the reuse of content: the ability to effectively reuse content from a single source of truth, in this case a NISO STS or ISOSTS XML input file.

[Guy shares his screen to display the Typefi user interface and the demo begins.]

Typefi: XML to PDF workflow demo (00:22)

The other area is the idea of options. Continuing on from accessibility in the previous session, the idea that you can give your end users options.

In this case, what you can see in the Typefi user interface is that we’ve got several workflows that are going to help us produce PDF, HTML, and EPUB from an XML file. So let’s go a little bit deeper into the details of that.

Let’s first take a look at the workflow—the one designated with a little rocket ship—with XML to PDF, because PDF is still the number one format for the distribution of content at this time.

And when we’re looking at that workflow, you can see in Typefi that a workflow is built up of several modular actions, so that it can be put together in different ways to ultimately produce a different kind of output or use different kinds of transforms.

What we can see here is that the first action—called Import ISOSTS—takes a NISO or ISOSTS XML file and transforms it to Content XML, or .cxml as you can see here, which is an intermediary format that Typefi has developed over the years to work well with InDesign and other formats, which we’ll get into a little bit more later.

We’ve got some conditions here that allow us to control for various outputs, what we do and do not want to have. We can control content that we only want to see in the PDF and not in the EPUB and vice versa, or things that we only want to see in the HTML. We can control that using conditions.
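The idea of conditions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `condition` attribute, the `<content>` wrapper, and the `filter_for_output` helper are hypothetical names invented for this sketch and are not Typefi's actual Content XML schema or API.

```python
import xml.etree.ElementTree as ET

# Hypothetical intermediate markup: the "condition" attribute is an assumption
# made for illustration, not Typefi's actual Content XML schema.
SOURCE = """
<content>
  <p>Shown in every output.</p>
  <p condition="pdf">Shown only in the PDF.</p>
  <p condition="epub html">Shown in the EPUB and the HTML.</p>
</content>
"""


def filter_for_output(xml_text: str, output: str) -> list[str]:
    """Return the text of the paragraphs that apply to one output format."""
    root = ET.fromstring(xml_text)
    kept = []
    for para in root.iter("p"):
        targets = para.get("condition", "").split()
        # A paragraph without a condition applies to every output.
        if not targets or output in targets:
            kept.append(para.text)
    return kept


if __name__ == "__main__":
    for fmt in ("pdf", "epub", "html"):
        print(fmt, filter_for_output(SOURCE, fmt))
```

The same source yields a different selection of paragraphs for each output, which is the behaviour described above.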

And then we start with a Create InDesign Document action using an InDesign template, which will take that Content XML and lay out the pages, and then we’re exporting to PDF. Ultimately, we’re modifying that PDF to improve its accessibility. So that’s the thought, that’s what’s happening here.
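A workflow built from modular actions, as described above, might be modelled roughly like this. It is a minimal sketch, not Typefi's implementation: every function name here (`import_sts`, `compose_pages`, `export_pdf`, `export_html`, `run_workflow`) is hypothetical, and each action is a stand-in that only records what it would have done.

```python
from typing import Callable

# Each action takes the job state (a dict) and returns an updated copy.
# All action names below are hypothetical stand-ins, not Typefi's API.
Action = Callable[[dict], dict]


def import_sts(job: dict) -> dict:
    # Stand-in for transforming the STS XML into an intermediate format.
    return {**job, "intermediate": f"intermediate({job['input']})"}


def compose_pages(job: dict) -> dict:
    # Stand-in for laying out the intermediate content on pages.
    return {**job, "layout": f"layout({job['intermediate']})"}


def export_pdf(job: dict) -> dict:
    return {**job, "outputs": job.get("outputs", []) + ["pdf"]}


def export_html(job: dict) -> dict:
    return {**job, "outputs": job.get("outputs", []) + ["html"]}


def run_workflow(actions: list[Action], input_file: str) -> dict:
    """Run the actions in order, passing the job state from one to the next."""
    job = {"input": input_file}
    for action in actions:
        job = action(job)
    return job


# The same first action is reused; only the later steps differ per output.
XML_TO_PDF = [import_sts, compose_pages, export_pdf]
XML_TO_HTML = [import_sts, export_html]

if __name__ == "__main__":
    print(run_workflow(XML_TO_PDF, "standard.xml"))
    print(run_workflow(XML_TO_HTML, "standard.xml"))
```

Because each action only depends on the job state it receives, the same import step can sit at the front of several workflows while the remaining steps vary by output format.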

What I’m going to do is, I’m actually going to click on the Run button and I’m going to choose an input file. In this case that is going to be a NISO STS file. And we click the Run button and Typefi starts working through the steps in the action. The first step is transforming the NISO STS XML into Content XML. Subsequently, that Content XML is going to be fed into InDesign, which is still at this time in the top 90% of applications used worldwide for the creation of content.

And while that happens, soon we’ll see InDesign come to the front of our screen and we will actually see the composition start.

[On screen: InDesign starts up, a new document opens and the Typefi composition engine is laying out the content on pages.]

Typefi: Adobe InDesign composition engine demo (03:18)

So now the Typefi composition engine is taking that Content XML and is going to be placing it on the page and processing it. Right now it’s starting with the cover and as more content gets put in, we will see it continuing to generate pages. This will take a couple of minutes.

Now that we’re here, we might as well take a short moment to talk about the fact that Typefi has chosen to standardise around InDesign when it comes to the composition of pages.

Like I said previously, InDesign is currently among the top 90% of content production applications out there in the world and it has a huge following. It’s also quite easy to find InDesign experts. There’s literally thousands of experts out there that use InDesign on a regular basis. So that means that the learning curve for working with Typefi is also greatly reduced.

Another great benefit of InDesign and why we chose to use it is its scriptability.

[On screen: Math equations are being laid out on the page.]

On this page, we can see that there’s actually math, and this math is live MathML.

So we’re taking the MathML that’s part of the NISO STS XML, we’re placing it on the page and an InDesign JavaScript is taking that MathML and feeding it through an InDesign tool called MathTools that creates the math on the page.

That’s one of the big benefits of using InDesign.

So slowly as you’re seeing, you saw some graphics, you’re now seeing tables being laid out on the page, and soon we’re going to reach the end of this composition.

Then the next step in the process is going to take place, which is taking this InDesign file and exporting a PDF from it, and then taking that PDF and making some changes to it so that it is more accessible than it was before. That’s the step that we’re currently at and, in a moment, we can go to the Jobs view.

[On screen: Guy switches his browser back to the Typefi user interface and is looking at a list of jobs run, in the Jobs view.]

I’m going to switch to the browser and navigate to the Jobs view in Typefi, where we can see the history of jobs that were run successfully. This one is going to be finished soon.

As soon as it’s finished, we can go ahead and look into the Job folder where we’re going to see that all of the input files and output files that were produced as part of this job are available to the customer and can be downloaded and worked with.

As you can see here, all of the files are available, including the InDesign file and a PDF file that was generated from this.

[On screen: Guy clicks on a PDF file which opens in his browser and he scrolls through a few pages highlighting content features.]

I’m going to quickly open up that PDF file—in this case it’s going to open up in the browser—and here you have the PDF.

It has clickable hyperlinks, it has a table of contents that immediately takes you to a page, there’s alternative text in here because the long description tag in the XML is automatically processed in the Typefi workflow and is added to images, and we can see our MathML that is fully text-based, ready for other formats, like HTML, which are getting better and better at rendering MathML natively.
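As a rough illustration of how a long description in the XML could become alternative text on an image, here is a small Python sketch. It assumes a simplified figure fragment with a `<long-desc>` child inside `<graphic>`; the sample file name, the description text, and the `image_with_alt` helper are made up for this example and do not represent how the Typefi workflow is implemented.

```python
import xml.etree.ElementTree as ET

# Clark-notation name of the xlink:href attribute.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Simplified, made-up figure fragment used only for this illustration.
FIG = """
<fig xmlns:xlink="http://www.w3.org/1999/xlink">
  <graphic xlink:href="fig1.png">
    <long-desc>Flow chart of the test procedure.</long-desc>
  </graphic>
</fig>
"""


def image_with_alt(fig_xml: str) -> dict:
    """Pair the image reference with the long description as alt text."""
    fig = ET.fromstring(fig_xml)
    graphic = fig.find("graphic")
    long_desc = graphic.findtext("long-desc", default="")
    return {"src": graphic.get(XLINK_HREF), "alt": long_desc.strip()}


if __name__ == "__main__":
    print(image_with_alt(FIG))
```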

[On screen: Guy switches back into the Typefi user interface.]

Typefi: XML to EPUB workflow demo (06:52)

Now, because we only have 10 minutes or so, I want to go back to the workflows and I quickly want to kick off another workflow, in this case the XML to EPUB workflow which, when we look at the details of that workflow, is similar to our PDF except that we’ve got a different transform that’s being applied, and we also have a different InDesign template.

What I’m going to do is click the Run button here, and again we are going to use the same NISO XML file that was used to generate the PDF, and that’s going to produce the EPUB file.

This is again first transformed to Content XML and then in a moment it’s going to be handed off to InDesign to be processed. Now, we’ve already seen that with the PDF, so I’m not going to spend a lot of time in watching this run through, I just want to see that being kicked off.

Typefi: XML to HTML workflow demo (08:02)

Now let’s switch to one of the other workflows and that is the generation of HTML, again from that same file.

What I’m going to do is, I’m going to take this workflow that you can see is much shorter than the other ones. This one converts from ISOSTS to Content XML, and then generates an HTML file from that. So let’s click Run, choose our input file and let’s get some HTML generation going.

Because this is a transform from XML to essentially another kind of XML, because that’s what HTML is, this is going to be pretty quick.
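To give a feel for why this kind of markup-to-markup transform is quick, here is a minimal Python sketch that turns a small STS-style section into plain HTML with heading levels. It is an assumption-laden illustration: the sample fragment is invented, only `<sec>`, `<title>`, and `<p>` are handled, and the `sec_to_html` and `convert` helpers are hypothetical rather than part of any Typefi workflow.

```python
import xml.etree.ElementTree as ET

# Invented sample fragment; only <sec>, <title> and <p> are handled here.
STS_FRAGMENT = """
<sec>
  <title>Scope</title>
  <p>This document specifies requirements.</p>
  <sec>
    <title>General</title>
    <p>Nested clause text.</p>
  </sec>
</sec>
"""


def sec_to_html(sec: ET.Element, level: int = 1) -> ET.Element:
    """Convert a <sec> element into an HTML <section> with a heading."""
    section = ET.Element("section")
    for child in sec:
        if child.tag == "title":
            # Nesting depth decides the heading level, capped at h6.
            heading = ET.SubElement(section, f"h{min(level, 6)}")
            heading.text = child.text
        elif child.tag == "p":
            para = ET.SubElement(section, "p")
            para.text = child.text
        elif child.tag == "sec":
            section.append(sec_to_html(child, level + 1))
    return section


def convert(xml_text: str) -> str:
    """Parse the source XML and serialise the resulting HTML as a string."""
    return ET.tostring(sec_to_html(ET.fromstring(xml_text)), encoding="unicode")


if __name__ == "__main__":
    print(convert(STS_FRAGMENT))
```

The output carries structure only (sections, headings, paragraphs) and no styling, so its appearance is left entirely to whatever CSS is applied afterwards.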

Now that the job is finished, let’s go into the job file and download our HTML file, and let’s actually open that up in the browser.

[On screen: Guy switches to another tab in the browser and is scrolling down through the HTML text.]

This is as plain HTML as it can be and the purpose of this is not to look pretty, because we don’t know yet what you may want.

HTML is a clear separation. It is like XML—it doesn’t have any visual representation. That is what the CSS takes care of.

So this is just HTML, but the beauty of this is that the links are there, some basic structure in the form of heading levels is there, and the MathML is being inserted into the HTML.

The beautiful thing about this is it can be moulded to whatever kind of HTML you want. The main takeaway for this is we can produce HTML out of the same input file, easily and quickly.

Typefi: XML to DOCX workflow demo (09:45)

[On screen: Guy switches back to Typefi user interface and is in the Files section.]

The last thing that I want to end with is actually a different kind of transform altogether.

We can take the NISO XML and actually produce a .docx file from it.

What you can see here is a workflow that is slightly different. It still starts with Import ISOSTS, because that’s the first stepping stone, but then there are some other actions in here that export to DOCX, and also modify the DOCX file to make it look the way that ISO wanted when they commissioned this work.

So I’m quickly going to Run this file, again taking the same input XML file that was preloaded into this workflow, and it has converted and transformed into a DOCX file.

So I’m going to download that file right now and open it up in Microsoft Word.

[On screen: Microsoft Word starts up and a DOCX file produced from the XML file opens and Guy scrolls through the document as he talks.]

It’s going to open up in protected view because that’s what Microsoft Office does with everything that is downloaded from a browser. But I’m just going to enable editing because I trust the source.

I want to show you that this is a Word file that was produced from an XML file, and that has things like clickable hyperlinks, tables of contents, and lists that are actual automatic lists (so you can press Enter and it will actually produce a new, in this case, bulleted list item, but it works with the numbers as well), and it has headers and footers, as you can see.

And I think one of the coolest parts of this entire work, is the MathML. The power of that MathML—these are fully editable in Microsoft’s native Equation Editor. So we’ve taken the MathML and transformed them in Microsoft Open XML Editor, and they can now be edited directly in Word.

[Demo ends.]

Thank you for your time.

Guy van der Kolk

Product Manager | Typefi

Guy first got hooked on publishing while attending an international school in Ivory Coast, where he used Pagemaker, Photoshop and an Apple Quicktake 100 camera to help create the yearbook. After many hours of hard work, while holding the final printed product, he knew this was an industry he wanted to be a part of.

Having spent the first 17 years of his life in West Africa, Guy is fluent in three languages and has a multicultural background that has served him well in his career. As an IT consultant and trainer for an Apple Premium reseller and then as a Senior Solutions Consultant for Typefi, he has trained thousands of people to get the most out of their software. In 2020, Guy moved into the role of Typefi Product Manager, working with the product and engineering teams to continue to improve on Typefi’s world-leading publishing software.