Case study: Innodata


In this presentation from the 2024 Pacific Virtual User Group, team members from Typefi client and partner Innodata discuss how their organisation has been using Typefi automated publishing software to help its clients publish better content in less time.

Arjeh Keizer, AVP at Innodata, starts off with a short introduction and background about Innodata. Innodata is a global data engineering company that works with some of the world’s leading enterprises to drive operational excellence through AI-driven solutions and platforms.

Typefi’s publishing automation software is built into Innodata’s Intelligent Automated Publishing Platform (IAPP), a modular publishing system that leverages Typefi to produce fully composed documents automatically.

Nalaka Fernando, Technology Program Manager at Innodata, continues with a closer look at the IAPP system and the numerous modules that can be implemented. He then hands over to colleagues Sameera and Deepak who go into more detail about some of Innodata’s STEM and legal use cases.

Sameera and Deepak discuss how Typefi worked with Innodata to tackle some serious publishing challenges, including lengthy complex tables that run over hundreds of pages and some tricky indexing requirements. Innodata was able to overcome all the publishing difficulties it faced thanks to Typefi and some custom JavaScript.

Before Typefi, all the typesetting work at Innodata was done manually and a typesetter was able to do 30 to 40 pages per day. Now with Typefi, that has increased drastically to 300-400 pages per day. As a result, Innodata’s turnaround times have decreased from a few weeks to just a few days.


00:00 Company background
03:47 Publishing platform
05:48 Solution overview
08:03 STEM & Legal use cases
17:54 Case studies

Company background (00:00)

ARJEH: Hopefully everybody can see our screen now.

AUDIENCE: Yep. All good. Yep.

ARJEH: Perfect. I’m also audible?


ARJEH: Okay, great. Then let’s get started.

So thanks for the introduction and for having us speak today, Chandi and Lukas. Thanks for coordinating us being available for today to present some case studies.

Hello everyone. Even though I partially introduced myself already in one of the breakout sessions today, my colleague Nalaka and I will talk you through some use cases of Typefi implementations we did for our customers.

But when I say customers, what does that actually mean? Let me give you a short introduction of what Innodata is, who we are. This will take just five minutes. It’s more important that Nalaka talks because he will dive into the use cases and a short demo which will take around 15 to 20 minutes.

Now, should you have any questions along the way, we encourage you to please note them because at the end of our talk we will reserve some time for questions and answers.

So I’m Arjeh. I’m based out of the Middle East. I’m at Innodata for 10 years now and have a long history of working in the publishing and information providing business. And among others I support our publishing and information providing customers with implementing and optimising their operational publishing workflows and processes.

Now my colleague Nalaka Fernando, based out of Colombo, Sri Lanka, is responsible for the successful implementation of our technological solutions with these clients and he is also working on Typefi technology in our platform. And Nalaka with the help of Deepak and Sameera is going to be demoing our use cases that we prepared for today’s session.

So very briefly, some words on Innodata. We are a global data engineering company that helps leading enterprises in different market verticals to drive operational excellence by the use of our AI driven solutions, platforms, and subject matter experts. Our head office is based in New Jersey. We’re traded on the NASDAQ Stock Exchange and last year we celebrated our 30th birthday.

Our company has offices and facilities all over the globe from the US and Canada through Europe, the Middle East and Asia. And we’re just over, I think today 4,000 subject matter experts and technology subject specialists.

Our client base is quite diverse as we work with, for instance, the top five technology companies in the world, supporting them with AI services to help them further develop their foundational large language models.

I think Caleb already talked a little bit about Gen AI. This is also of course for us an important direction where we see that our customers are really developing capabilities but also are looking for implementation possibilities.

And especially our corporate product and services providing customers, our information providers and publisher customers, are part of our customer base who are also utilising our services.

Now we do these services for clients across a vast number of domains. We have customers in the legal space, scientific, educational, medical and mobility sectors and we provide them with services in dozens of different languages.

Publishing platform (03:47)

Now Innodata’s added value for our customers is to provide economic benefits, of course, workflow efficiencies and time-to-market improvements. And in the publishing workflows, we offer our clients our proprietary cloud-based, AI-driven, intelligent automated publishing platform. It’s a mouthful, so that’s why we abbreviated it to our platform called IAPP.

Now the principles of IAPP are actually quite simple. It’s a management and production solution for end-to-end XML-based workflows which allow for authoring, editing, typesetting, proofing, publishing and distribution of structured online, digital and print publications all from a single source of truth. I think Caleb also spoke a little bit about that.

Now our cloud platform contains modules that handle content creation, content editing, content management and performance tracking and monitoring. And one of its driving components is the Typefi intelligent publishing automation software.

Now typically an optimised end-to-end publishing workflow for us looks like something we portray on this page. We’re a bit short on time so we will not go into all these workflow steps now, but we provide our customers more or less from the creation of content publications all the way to the downstream distribution of that.

But Nalaka will address some of these steps in the workflow as he goes through the demo and the use cases that he’s going to present to you all today. That was my part of the introduction about Innodata and how we support our customers.

Nalaka, may I ask you to take over from here?

Solution overview (05:48)

NALAKA: Thanks Arjeh. Yeah, so I think you all got an overview of this end-to-end publishing platform. So I’ll just give a little detail and then we’ll jump into a demonstration and then we’ll discuss a few case studies on how Typefi helped us to implement this platform.

So if you look at it like this publishing platform, so we call it IAPP. So this has four main modules where the publication planning can happen. So where the publication management team can do the publication planning and allocation to the authors.

So the authors will work on the content, they write the content or either they edit the content. And also we do have an option where we can absorb the Word files as well if the authors have written the content outside of the platform.

And the advantage of this platform is, this platform is modularized. So we have connected the content harmonisation module. So if authors write the content outside of the platform, when they drop the Word file, there’s an AI based module which identifies the content structures and transforms it into structured content and then it’ll be available for the further editing in the publishing platform in XML format.

So once the editing is completed, then it goes to the typesetting flow, so that is backed by Typefi. So we are using several features of Typefi in order to automate the publishing process.

This process is automated, so at the click of a button, it goes to Typefi and then it creates all the deliverables like PDF and Word outputs, EPUB, and the XML also. So that is through Typefi we were able to generate these automatically.

And once it’s generated, it goes to the authors for their review. So this is the lifecycle, this is the cycle of this publishing platform. You do the planning, allocation, and then authors write the content or edit the content, and then it goes for the publishing platform where Typefi supports it to generate the PDF or the other outputs and go for the review.

STEM & Legal use cases (08:03)

So we’ll jump into a demonstration of this platform quickly and then we will touch base on how Typefi helped us to overcome certain challenges we had in the typesetting. Thanks Arjeh. So I want Sameera to share your screen and then go to the demonstration.

SAMEERA: Yeah, thank you Nalaka. So I’m sharing my screen. Yeah, I hope all of you can see.


SAMEERA: Yep. Okay. Alright. So what you can see in my screen is the IAPP platform.

So I log into the platform as a sort of publication management level user and the first thing we can see is sort of a dashboard to track the different stages. And then there’s a publication planning module where we can preplan our content lifecycle.

So the scenario we are going to explore here today is like if the authors are writing the content outside or maybe they can write the content directly within the platform, how they can generate the different kinds of layouts according to the project requirement through Typefi.

So the first scenario, like Nalaka mentioned, if the authors are writing the content outside the platform, there’s a module we can drop the Word content into the platform where it’ll go through this Generative AI to automatically structure the content. So the segment that I’m going to demonstrate is, once this content gets transformed, how we can do some content editing and then generate the PDF output.

So this is one of the technical manuals from one of the customers. So this interface is an XML-based editing interface, but it’s presented to the author as a Word-like interface because most of the authors are not very familiar with XML-level editing.

So this is the real XML content but it gives sort of generic features as you’ve seen in Microsoft Word, like the inline styles, listing, tables, footnote, likewise many options.

So in the left hand side it’s showing the document structure and the right hand segment you have this XML level attribute information because still we are working on the XML.

So I’ll do a few edits and then let’s generate the output from Typefi and let’s see how it’s going to look like. I’ll do a few content edits, content, and I’ll delete some text. So these are getting track changes.

I’ll apply some inline styles, making the content bold, and I’ll add one more list item, new list item. Then we’ll also include a simple kind of a table, and some other thing we can input a small image as well. So I’ll go to the image library and I’ll put a simple image into the document.

So I did a few quick edits into the document. Now what I’m going to do is I’m going to generate the PDF with the help of Typefi. So for that we do have a print option and we can generate the PDF.

So it’ll take a few seconds based on the content length of the file depending on the number of pages we have.

So in the background what is happening is we are converting this XML into CXML to make it compatible with Typefi, and then it is ingested into the Typefi platform through the APIs. And once that PDF is generated through Typefi, it is automatically loaded into this interface.
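Editor's illustration: the conversion step Sameera describes can be pictured as a mapping from the platform's own XML vocabulary to the vocabulary the composition engine expects. The sketch below is a minimal Python example of that idea only. The element names on both sides of `TAG_MAP` are invented for illustration and are not the real IAPP or CXML schemas, and the API ingestion step is left out because the talk does not describe it.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from source element names to target element names.
# These names are assumptions for illustration, not the real CXML schema.
TAG_MAP = {"manual": "content", "chapter": "section", "para": "p"}


def convert(source_xml: str) -> str:
    """Rename every mapped element and return the converted XML as text."""
    root = ET.fromstring(source_xml)
    for element in root.iter():
        # Elements without a mapping keep their original name.
        element.tag = TAG_MAP.get(element.tag, element.tag)
    return ET.tostring(root, encoding="unicode")


source = "<manual><chapter><para>Check the valve.</para></chapter></manual>"
converted = convert(source)
print(converted)
```

A production conversion would also handle attributes, tables, images and track changes; a plain rename is only the simplest case.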

I think it got through. So this is kind of a simple and clean layout, which we created based on the customer’s expectations for the technical manuals. You can see the updates added to the content are reflected over there, including the image. So all of these are coming through the template, as we defined it.

So I’ll show you another two examples that are from the different layouts, how we can customise these layouts from Typefi and get a different sort of output into this interface.

And also we do have integrated Word generation as well if authors need to download a copy of this document as a Word file. So this is also coming through Typefi, we can download the same content in the Word format. So I’ll shift to another document.

NALAKA: So that actually, the advantage that we got from Typefi is to generate the Word files as well because we have a separate template for that. And then we’re generating Word outputs, which we’ll talk about in a while. Our Typefi expert from Innodata side, Deepak, will sort of touch base on a couple of case studies on how we are using Typefi features.

SAMEERA: So this document is sort of medical domain content and as you can see the layout is different. It’s a two column layout with a lot of colours and the background information. So this template we are generating from Typefi based on the customer requirement. And the next one I’m going to show from the sort of legal content.

NALAKA: So these formats are autogenerated using the Typefi features and also doing scripting from our side as well. But it’s supported heavily by Typefi features to get these automated.

SAMEERA: So this is legal content, if you look at there are some specific things like marginal numbers and the footnotes. So these are generated from Typefi. And different kind of box layouts to preserve some of the primary content.

And then if I go a little bit down I can show you a few table structures as well. Okay. There’s a two column table with one structure and if I further scroll down there’s another table example.

So in the legal content, I think one friction, like Nalaka initially raised as well, is that we have documents that are a couple of hundred pages long. Sometimes the tables continue for more than 200 or even 300 pages, but still we have managed to generate those kinds of tables from Typefi. So that’s a plus point.

So all these, to summarise again, we are generating through the templates with the help of Typefi in a real-time interface from the platform.

NALAKA: So what we do is, I mean as I mentioned earlier, so once these are generated automatically, then it goes to the authors for their review.

So the advantage of this platform is, authors can do their author updates, comments, reviews, directly into the platform with comments and with track changes so that it’ll be reflected in the content and then it can generate the PDF, the authors, editors can generate the PDF and they can see it real time with these features.

I think Sameera, that’s all from your side?

SAMEERA: Yeah, that’s what I have had to showcase today.

NALAKA: Great, good. So let’s move into the slide deck again Sameera, and we’ll touch base on a couple of case studies. What are the features of Typefi we used and then how those features helped us. You can go to the next slide.


Case studies (17:54)

NALAKA: Yeah, so I would like Deepak to talk a little about these, and then I will add to it. Deepak is our Typefi specialist; he was trained by the Typefi team, and we have a team of people who were trained by the Typefi team. Deepak, do you want to explain a little bit about these case studies?

DEEPAK: Yeah, so hi, I’m Deepak. So I’m working as a Typefi template expert here.

So the few case studies you can see on the screen, so Typefi supports, Typefi provides a different workflow and actions for different kinds of layout creation.

For instance, you can see in the left screenshot where one of our customer requirements was that they have one template but within the one template, based on some XML attributes, they want to change the layout on the fly. So as Typefi supports JavaScript at different levels of events, we were able to achieve this layout with the one template.

You can see on the screen there are 14 kinds of layout requirements of the customer. And based on this—1111, 1211—we have to read the XML and manipulate the layout on the fly. And with Typefi and the support of the JavaScript from different levels like document start and document end, page end, spill, we were able to achieve that.

That was a great advantage of Typefi and the support, which the way Typefi supports the JavaScript, was very good for us and rather than creating different types of templates, 14 templates, we were able to achieve in a single template all 14 types of layouts.
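Editor's illustration: the single-template approach Deepak describes amounts to a lookup from a code carried in the XML to a set of layout settings, applied while the document is being composed. The Python sketch below shows only that lookup. The attribute name `layout`, the settings in `LAYOUTS`, and the meaning given to the codes `1111` and `1211` are assumptions; the real project used JavaScript attached to Typefi events, which is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout table keyed by the code found in the XML.
# The talk mentions codes such as 1111 and 1211; what each code
# means is assumed here purely for illustration.
LAYOUTS = {
    "1111": {"columns": 1},
    "1211": {"columns": 2},
}
DEFAULT_LAYOUT = {"columns": 1}


def layout_for(section: ET.Element) -> dict:
    """Pick layout settings from the section's (assumed) layout attribute."""
    return LAYOUTS.get(section.get("layout"), DEFAULT_LAYOUT)


doc = ET.fromstring(
    '<book><section layout="1111"/><section layout="1211"/><section/></book>'
)
chosen = [layout_for(section)["columns"] for section in doc.iter("section")]
print(chosen)
```

Keeping the variants in one table is what lets 14 layouts share a single template: adding a layout means adding a row, not building another template.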

And apart from that, as there are some limitations within InDesign where it doesn’t support the complex tables, it doesn’t support some background behind the tables, and there’s some footnotes issues also there but with the help of Typefi and the scripting support we were able to overcome.

Earlier, even though we had Typefi, because of the InDesign limitations we had to manage those manually. But with the support of this JavaScript at the backend, we were able to achieve those layouts. Can you move to the next slide?


DEEPAK: And because Typefi has a plug-in within InDesign. So with InDesign and Typefi, we can manage any complex layout, any complex scenario.

As you can see in the screen, we have floats within the body, we have floats through. So Typefi gives us more flexibility to define the pagination rules within InDesign with the help of Typefi plug-ins. So we can decide, okay, which table has to be present where, if we need to.

So the way they are supporting the tables with the cell style, which is not quite easy with the traditional ways or manual. So the way they are supporting, you can see in the screenshot where in one table we have different shades of the colour. So with Typefi we can read the XML and we can manipulate the layout or we can define the style sheet in the template.

Similarly, you can see different types of pagination rules we can manage, we can define in the template where my float should appear, how it should look, and what kind of colour.

Even in one project, a requirement was that in each part, the colour had to be changed. So we defined all the colours in the template and through Typefi and the JavaScript on the fly, we changed the colour by reading XML. Can you move to the next?

So you can see on the next slide also we have some more layout differences and with the Typefi InDesign template and Typefi pagination rule, we were able to achieve that. Can you move next?

And so this is one more feature we have in Typefi: beyond the template, the plug-in and the script, it allows us to read the Content XML within the template, which is a very good feature.

In one of our projects, the problem was this: while Typefi supports index options, the requirement in this project, which you can see on the screen, was that they had the entire index as XML but wanted the page numbers added at the end. So through scripting, Typefi allowed us to read the Content XML. And with the IDs you can see in the left screenshot, we were able to get all the page numbers onto the proper index entries, as in the right screenshot.

So in the normal traditional way, it’s all manual: somebody has to check those entries, see where each entry falls on the page, and update it. Here they have given the entire XML with the IDs, so with the help of JavaScript within Typefi, we read the CXML content, get the exact page location, update the CXML and generate the output.
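Editor's illustration: the index step described here is a join between index entries that carry an ID and the page on which each ID landed after composition. The Python sketch below shows that join under stated assumptions. The element name `entry`, the attributes `target` and `page`, and the `page_of_id` dictionary are all hypothetical; in the project itself the page locations were read with JavaScript inside Typefi, which is not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical index XML: each entry points at the ID of its target.
index_xml = """<index>
  <entry target="id-101">Arbitration</entry>
  <entry target="id-205">Burden of proof</entry>
  <entry target="id-999">Costs</entry>
</index>"""

# Hypothetical composition result: the page on which each ID landed.
page_of_id = {"id-101": 12, "id-205": 348}


def add_page_numbers(xml_text: str, pages: dict):
    """Write a page attribute onto each entry whose target ID was placed."""
    root = ET.fromstring(xml_text)
    unresolved = []
    for entry in root.iter("entry"):
        target = entry.get("target")
        if target in pages:
            entry.set("page", str(pages[target]))
        else:
            # Collect missing IDs so they can be reported, not silently dropped.
            unresolved.append(target)
    return root, unresolved


root, unresolved = add_page_numbers(index_xml, page_of_id)
resolved = {entry.text: entry.get("page") for entry in root.iter("entry")}
print(resolved)
print(unresolved)
```

Reporting unresolved IDs matters on very long volumes, where a single missing target would otherwise only be found by the manual checking the automation is meant to replace.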

And one more thing I want to add here. Earlier before Typefi, because we have so much legal content where the book volumes are very high, sometimes it goes a thousand, more than a thousand pages, 5,000 pages. In the earlier process, the team had to do those things manually.

It was a very tedious task. They were taking a lot of time. And with that Typefi support, we can just push the XML within Typefi and define the templates. And now in a few hours we were able to generate thousands of pages from the backend. And during that time, a user can do other work and so we save a lot of time in the composition especially.

Yeah. Thank you. That is all from my side.

NALAKA: And also a few things to touch on. Earlier, when we didn’t have the Typefi solution, we were doing this typesetting work manually, and a compositor or typesetter was able to do 30 to 40 pages per day. But with Typefi we were able to increase that drastically, to around 300 to 400 pages per day.

So now a typesetter can do it with all these automations, so it’s like automated publishing. So with the support from Typefi, we were able to achieve that. That is something I want to highlight here.

And also that actually helped us to achieve the turnaround times because earlier the turnaround times spread across weeks, but we were able to deliver these turnaround times for customers really quickly. I mean within days we were able to achieve that.
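Editor's illustration: the throughput figures quoted in the talk can be turned into a rough turnaround estimate. The Python sketch below applies them to a 5,000-page volume, the size Deepak mentions for large legal works. It assumes a single typesetter working at a constant rate, which is a simplification and not something the speakers state.

```python
def days_needed(pages: int, pages_per_day: int) -> float:
    """Working days for one typesetter at a constant daily rate."""
    return pages / pages_per_day


volume_pages = 5000  # size of a large legal volume, as mentioned in the talk

# Best and worst case for manual work at 30 to 40 pages per day.
manual = (days_needed(volume_pages, 40), days_needed(volume_pages, 30))

# Best and worst case with automation at 300 to 400 pages per day.
automated = (days_needed(volume_pages, 400), days_needed(volume_pages, 300))

# Both ends of the range improve by the same factor.
speedup = 300 / 30

print([round(d, 1) for d in manual])
print([round(d, 1) for d in automated])
print(speedup)
```

Under these assumptions the same volume drops from roughly 125 to 167 working days to roughly 12.5 to 16.7, a tenfold difference, which is in line with turnaround moving from weeks to days once work is shared across a team.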