Continuing Adventures in Standards XML Publishing
In this case study presentation from the 2021 Typefi Standards Symposium, Patrick Gibbons, Senior Solutions Manager for IEEE Standards Association—Content, Production and Management, provides an overview of how IEEE SA met the challenges of implementing an XML publishing system, and outlines some of the continuing challenges they face.
Continuing Adventures in Standards XML Publishing: Introduction (00:00)
PATRICK GIBBONS: Hello, I’m Patrick Gibbons, Senior Solutions Manager for the IEEE Standards Association—Content, Production and Management department.
We’ve been working with Typefi since 2015 to generate the PDFs we publish from our XML.
Now it may not be obvious from looking at these gorgeous PDFs, but a lot of blood, sweat, and tears, along with indispensable help from Inera, went into getting these results. And though our solution is a stable, well-oiled machine, providing users with tools that will help us put their input into our system and spit it back out for them continues to be a constant challenge.
I’m going to give you an overview of our process and how we’re handling those challenges.
About IEEE SA Publishing (00:46)
First, a little background: the IEEE Standards Association publishes about 120 standards per year. They range anywhere from five to 5,000 pages, typically about 100 on average, and the PDFs are posted on IEEE’s online platform: IEEE Xplore.
IEEE also houses our in-house XML in a MarkLogic database.
Now, we do things a little differently than the IEEE Publications editors.
Our staff is intimately involved with the standards development process. We guide the authors from the moment they submit their project, we help them know how to best cite different references, what they can do, what they can’t do. So we don’t just edit the documents. We’re very instrumental in helping them understand what they can and can’t do. And this is reflected in some of the publishing choices and processes that we implement.
Around 2010 IEEE mandated that the IEEE Standards Association provide XML at the same time as our PDFs.
We first explored an XML-first publishing solution, but it proved to be unworkable for our users. They really had to have something where they could work offline, and they were uncomfortable with not being able to collaboratively work online. It also ended up being really expensive: every time we asked for something else the bill just got bigger and bigger, so we had to scrap the whole thing.
We regrouped and we landed on our solution with Inera and Typefi.
Pre-XML Workflow (02:26)
So, before we started working with Inera and Typefi, we had our old workflow, which you’re probably familiar with if you’re an old timer like me.
We would take Microsoft Word documents created in our template, the staff would do a style and format edit, and then we would publish a PDF straight from Word, and then the source file and/or PDF were converted into XML by a vendor, an outside vendor.
Now we still do this for those who give us files other than from Word. Some of our Computer Society authors still use FrameMaker, and we also have a handful of people who give us PDFs from LaTeX. They do all the work on their end and we just tell them what we need, but most people do give us Word files now.
And back then, when everything was just made from Word, we just gave them back the source file and then they could use that to do revisions and there was no problem.
XML Workflow (03:28)
So when we started working with Inera and Typefi, the workflow didn’t really change for the users.
They still used our template and gave us the Word document. And on our end, we would still apply style and format edits, but instead of making a PDF from Word, we now utilise eXtyles’ plug-in to generate XML from our Word files, to create our PDFs in Typefi and to supply the IEEE MarkLogic database.
However, the processed Word file had been stripped of our template macros. So we were unable to provide our users with the final version of their standard back to them.
“Turnaround” Document Solution (04:07)
Luckily, Inera was able to assist us and actually they created a, what we call a turnaround macro, which put the eXtyles-processed Word file back into the IEEE SA template.
Now this was no small task: it involves updating the metadata variables, format changes, adding a table of contents, restyling paragraphs, and putting in cross-references. Now this was, and is, a big deal. There was a lot of celebration. I made a whole presentation on this and everything was right with the world.
Or was it? No.
New Challenge 1: MS Equation Editor (04:44)
The first big challenge came in 2018 when Microsoft stopped supporting their old equation editor.
Now, our users were no longer able to edit math set in either the old Microsoft Equation Editor 3.0 or in MathType, unless they happened to have a MathType licence, which most of our users don’t.
For those of you who don’t use eXtyles or MathType, MathType is a plug-in for Word that we need to use with eXtyles. What it does is take the equations, the math objects, and convert them into, in our case, LaTeX, which we needed in our XML. But other people convert them into other things, like MathML, to put into XML.
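As a rough illustration of what "LaTeX in our XML" can look like, here is a minimal Python sketch that wraps a LaTeX string in an XML formula element. The talk does not say which schema IEEE SA uses, so the element names `disp-formula` and `tex-math` (borrowed from JATS/NISO STS), the `formula_element` helper, and the sample equation are all assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def formula_element(formula_id, latex):
    """Wrap a LaTeX string in a display-formula element.

    Element names follow JATS/NISO STS conventions; this is an assumed
    schema, not necessarily the one IEEE SA uses.
    """
    formula = ET.Element("disp-formula", {"id": formula_id})
    tex = ET.SubElement(formula, "tex-math")
    tex.text = latex
    return formula


# Hypothetical equation standing in for a converted Word math object.
elem = formula_element("eq1", r"P = \frac{V^2}{R}")
xml_text = ET.tostring(elem, encoding="unicode")
print(xml_text)
```

A MathML-based workflow would carry the same equation as nested MathML elements instead of a single LaTeX text node.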
So, not long after they took away the old Microsoft Equation Editor, they put a new equation editor into Word, for which we have a plug-in from GrindEQ, which will convert both MathType and old Microsoft Equation Editor math into the new equation editor, but you can only do it in a .docx or a .docm format. The old .doc format, the macro-enabled document, will not allow you to edit the math.
So, our solution was to give everybody a .docm format so they could edit these math objects. However, the next challenge arose very recently, just within the past couple of months.
New Challenge 2: Macro Security (06:11)
We’re no longer able to send our .docm files through Gmail. Gmail out of hand says, “It’s a .docm, it’s infected. We’re not going to let you send it.” You can’t even zip it up. You can’t do anything with it via email.
So this is a real problem for us. There’s no one size fits all solution for us:
- A .docm is the only macro-enabled format that allows Microsoft math to be edited, but it can’t be sent via an email
- A .doc is macro-enabled, but it doesn’t allow the math to be edited, though it can be sent via email.
- A .docx allows you to edit the math, and it can be sent via email, but it’s not macro-enabled. So people can’t use our macros for styling.
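The trade-off in the list above can be written down as a small capability table. The following Python sketch encodes only the three properties stated in the talk; the `WordFormat` class and the `formats_meeting` helper are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordFormat:
    extension: str
    macro_enabled: bool   # can carry the template macros
    math_editable: bool   # Microsoft math can be edited
    emailable: bool       # can be sent via email (per the Gmail behaviour described)


# Capabilities exactly as described in the talk.
FORMATS = [
    WordFormat(".docm", macro_enabled=True, math_editable=True, emailable=False),
    WordFormat(".doc", macro_enabled=True, math_editable=False, emailable=True),
    WordFormat(".docx", macro_enabled=False, math_editable=True, emailable=True),
]


def formats_meeting(*, macros=False, math=False, email=False):
    """Return the extensions that satisfy every requested requirement."""
    return [
        f.extension
        for f in FORMATS
        if (not macros or f.macro_enabled)
        and (not math or f.math_editable)
        and (not email or f.emailable)
    ]


print(formats_meeting(macros=True, math=True, email=True))  # []
print(formats_meeting(math=True, email=True))               # ['.docx']
```

Asking for all three properties returns an empty list, which is the "no one size fits all" problem; dropping any one requirement leaves exactly one format.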
So, what are we going to do?
Template Solution (07:13)
We decided we’re going to start sending a .doc file to our users and anyone who wants to be able to insert math into that can download our .docm template from a dedicated link on one of our servers.
Even in that case, we found users who were unable to open the document once they’d downloaded it. Different people’s servers, or their home or corporate systems, won’t allow them to use a .docm, sometimes not even a .doc. We saw this already over the past couple of years with a lot of people in the military, and a lot of corporations were no longer allowing it to be downloaded.
So as of this recording, it’s looking like we may have to scrap the whole macro-enabled template and go with a .docx. They can edit the math, they can send it through email, and it seems to be, really, the only way Microsoft wants people to go, going forward.
So, that’s where we’re at.
Hopefully at my next presentation we’ll be able to have a better template solution for our users. Maybe there’ll even be an XML-first editing system. You know, I can keep dreaming!
At one of the recent presentations I saw Debbie and Tommie from Mulberry Technologies were saying that they thought it might be around the corner. That remains to be seen for some of the needs of our users, but that would be, that’s what everyone’s looking for. But until then, this is what we’ve got.
Can you tell us a little bit more about GrindEQ?
It’s a plug-in, and it’s very good at converting both MathType and the old Microsoft Equation Editor math objects into the new Microsoft Equation Editor. It works really well, it’s not that expensive and yeah, as long as you don’t need to send it in a .docm or .doc you’d be in good shape.
Is IEEE using its XML just for PDF, or for other things as well?
Our primary use is that we’re using it to create our PDFs. It’s maintained in a MarkLogic database, IEEE Xplore. Hopefully someday we’ll be able to represent it in HTML on Xplore. There’s also been some apps that have been created from some of the XML, and working with an outside vendor on another solution may be another way to present our standards.
Authors sometimes don’t like the formatting changing from what gets authored to what gets published. Are you seeing that?
There’s always someone who’s going to not like the way it looks, and so many of our users were used to seeing the direct PDF of the Word file. That’s certainly not what they’re getting anymore from us through Typefi.
I’ve been, I’d say about 90% successful, just telling people: “Don’t worry about it. It’s fine.” And then you get people who kind of give us a bit of a struggle sometimes, and we have to make concessions.
At the 2019 Typefi User Conference, you said that you were still trying to figure out how much you wanted freelancers and editors to format documents themselves, and how much you would do once you got the file. Where are you up to with that?
The way we work, because of our balloting process and we publish the drafts in progress and the approved drafts, we don’t really get involved with the document until it’s been approved. So there’s not a lot of author interface at that point.
We have freelancers who will do some of the styling that the editors will also do. Typically, we’ll still have the freelancers working in Word, as if it’s going to be published as a PDF, and then we’ll take it from there and do our eXtyles and Typefi process.
Sometimes it turns out that something just won’t work, and then we have that copy to go back to, but by and large, the simpler documents, the ones we don’t have to send out for freelancing, they just run through smooth sailing.
All the editors in-house, we have our own way of doing things. I personally will just grab a document and go right through the processing. I don’t even worry too much about keeping a backup because I’m pretty confident that as long as I know the document is in pretty good shape, we’ll be able to get it through.
IEEE is something of a publishing powerhouse. You have a monograph program and a journals program. How much do you share processes, tools, and thinking between these programs?
Not much, they pretty much take what we give them. We send them an XML package and our PDF to be posted on Xplore. I can’t even really speak to how it is that they publish their journals and conference proceedings.
You mentioned that you’re looking forward to one day maybe having a collaborative XML editing environment. What challenges might you see taking your current committees into an environment like that?
It’d be a matter of being able to host it somewhere and everyone being able to have access.
And the growing pains of them again, not seeing it, because right now they’re still working in Word and they don’t see a lot of this data and how it looks when you apply tags and things. So, there’d be some buy-in to get from the users.
Do you see a bigger challenge for those using FrameMaker or LaTeX to go into online authoring?
I would say yes. There are certainly groups who use Frame and really like it, and I think it would be hard to get them to move away from it. But that remains to be seen.
Right now, I didn’t even really think about it that way. I was really thinking more along the lines of just dealing with the Word authors, because currently we treat Frame and LaTeX as out of scope, but yeah, maybe that could change.
Since implementing your new publishing workflow, has your publication time sped up, slowed down, has efficiency improved?
I know that our publishing time has improved since we’ve done this. I’m sure that it’s made an impact.
Senior Solutions Manager | IEEE Standards Association
The Institute of Electrical and Electronics Engineers Standards Association (IEEE SA) is a leading consensus building organisation that nurtures, develops and advances global technologies through IEEE, the world’s largest technical professional organisation. With collaborative thought leaders in more than 160 countries, IEEE SA promotes innovation, enables the creation and expansion of international markets, and helps protect health and public safety.
Patrick Gibbons has an MLS with a concentration in Information Storage and Retrieval from Rutgers University, and has over 25 years’ experience in electronic publishing with Elsevier, ProQuest, and IEEE. As Senior Solutions Manager for the IEEE Standards Association, he is the technical point person for IEEE SA’s standards XML and PDF publishing program.