The future of documents


The Future of Documents is shaped by a fundamental change in how information is shared and agreements are secured. This future sees a shift from traditional ‘e-paper’, formatted and optimised for reading by humans, towards semantically tagged information in an open digital format.

Strategically, NISO STS has prepared standards creation very well for this Future of Documents. In this presentation from the 2021 Typefi Standards Symposium, Jan Benedictus, Fonto CEO, reflects on Fonto’s experience with standards development organisations and what this means for authors, editors and reviewers of standards, as well as consumers of them.

“Solving the root cause is really to go to a structured and semantically tagged source format… that has potential to become the leading paradigm in documents and documentation in general.”


00:00 Introduction
00:34 About Fonto: Structured content authoring
03:33 The future of documents
04:16 What’s driving the shift to structured content?
07:28 Making documents both human- and machine-readable
09:04 Trends in authoring practices
13:41 The future is structured and data-driven

Introduction (00:00)

JAN BENEDICTUS: Thank you very much. And thank you very much also, of course, to Typefi for organising this. I realise for some of you it’s early morning, for others it’s late in the evening or even midnight, so thanks for having me here.

My name is Jan Benedictus. Indeed, I’m the founder of Fonto. And I don’t want to talk too much about Fonto the tool, but a bit more on the strategic thoughts that are behind Fonto, that we summarise in the future of documents.

About Fonto: Structured content authoring (00:34)

So, a few words about Fonto. Our mission is to make it easy for essentially anyone to create, edit and review structured content.

And I think one of the main key points of that is that we want people to be able to manage and author and edit structured content directly, so without any conversions coming in between.

Essentially, that means that we are trying to help organisations that have made mostly the strategic choice to shift from an unstructured format-oriented process to a structured content process. And we try to do so by engaging their authors.

I saw already some comments here and earlier that, of course, engaging authors into a new way of working is one of the hardest things to do, yet we are trying to do this.

We do so by providing Fonto, which is a structured authoring solution, or a structured authoring platform. It’s web-based, so it’s prepared for online collaboration, and it can be configured for a schema.

That means that Fonto is not only used in standardisation organisations, but also in other industries like publishing, technical documentation, aviation and life sciences, but also very much in standardisation.

Currently, Fonto is being piloted and rolled out at ISO, IEC, CEN-CENELEC, a number of national organisations, and then a number of, I would say, specialised SDOs.

Why we do this is that, by making it possible for authors without any specific training to still work in the structured content format directly, we try to leverage the value of a standard like NISO STS.

You could say that as long as you cannot author in a standard or in the schema, its added value is mainly in content exchange, maybe publishing.

But we want to alleviate that, or leverage that, and say, okay, by authoring in the structured source content directly, you’re kind of creating a platform, an opportunity, to do a lot of innovations in terms of documentation and documentation publishing.

And it was nice to see that already, even in the previous talk, there was a lot of talk about publishing to multiple formats.

But we see perhaps even more disruption, or maybe fundamental disruption, in the way that organisations and people are thinking of documents and documentation.

The future of documents (03:33)

Some of that work and some of those insights are very nicely summarised in a report that was published by Forrester only last December.

For anyone working in documentation, I would recommend this read. It's essentially saying that, after almost a generation, say 25 or 30 years, of what-you-see-is-what-you-get authoring, that vision and the way we think about documents will fundamentally change.

And one of the key takeaways that they say is that documents will be data-driven and purpose-built for specific audiences.

What’s driving the shift to structured content? (04:16)

If you think a little bit about the driver behind that paradigm shift that they are predicting, then of course, it's kind of logical to think a little bit about how structured content has evolved and how it started.

And it's probably safe to say that publishing and tech doc, and perhaps also standards to a certain extent, were a bit at the forefront of adopting structured content.

But if you look, for instance, at publishing or tech doc, that whole DITA standard or DITA schema based world, you see that those industries were probably somewhat further advanced when it comes to replacing unstructured with structured document and content management workflows.

A big driver, of course, for publishers, was that in the late nineties, they were confronted with the urgent need to start publishing in digital formats, driven by a fairly sudden market demand even, in a very efficient way. So they were quite early in adopting structured content and content authoring.

But, and that’s also a bit sketched in that Forrester article, there’s a trend that was perhaps reserved for that niche that is now becoming more mainstream, and that is that business documents in general are increasingly consumed or processed in an automated way.

We have seen, actually from our own practice, that some of our customers have projects that want to publish not only entire standards, in PDF and also in HTML, but also parts of the standards, bits of information from within the standards, in a specific way so that they can be deeply embedded in knowledge systems, for instance, by their customers.

So this growing mismatch between automated consumption or processing of documents in general and the static file formats that are currently in use is an increasing problem.

Maybe to give another example, if you think of something like invoices or purchase orders, you see that those documents typically are automatically processed by the receiver.

But even on a larger scale, if you think of pharma, you’ll see a fairly, maybe even somewhat surprising practice, that on a fairly large scale, pharma companies are putting a lot of data into documents. They spend a lot of effort to get the data correctly in the documents to send them off to the authorities.

In the next step, you see the authorities spending a lot of time and a lot of effort to try to get the data and information out of those documents again. So the document format becomes a blocker or a bottleneck.

Making documents both human- and machine-readable (07:28)

Of course, there are, I would say, solutions or partial solutions to alleviate that problem.

The first whole area of course, is text recognition, data extraction, natural language processing, artificial intelligence, the whole set of technologies intended to interpret the information that is stored in a document that’s actually optimised for human reading.

The second approach that we see is a bit more from the sender. You'll see that the documents are enveloped or joined with a lot of metadata. So the essentially closed documents are then enriched with a whole wrapper of metadata. That wrapper essentially summarises the key information that's actually present in that document.

Forrester describes this, and we also recognise from our practice, that it is, in a way, fighting the symptoms rather than solving the root cause, because solving the root cause is really to go to a structured and semantically tagged source format.

That’s probably not something that we would have to preach too much here in this audience. But we see really that that has potential to become the leading paradigm in documents and documentation in general.

Trends in authoring practices (09:04)

Of course, and it’s also mentioned here quite often and quite a bit already, authors are hard to change, in a way. They are very, very strongly bound to their current way of working.

I think that the phrase ‘everybody loves Word’ is quite often heard. On the other hand, I think you could say that a lot of people hate Word as well. Of course, there’s a lot of ways to try to tweak Word to do what you do.

One way or another, the opinion is that a lot of people like their current word processing tool.

However, we think that that is not a status quo. This may be today's state, but we do see (and actually, I saw an interesting comment there on age and the adoption of Google Drive) significant trends in authoring in general that may open the door for a fundamentally different way of thinking about documents.

So the first one is that there's a real blending of authoring and collaboration. The classical paradigm might be, or have been, that an author is solitary, working in a word processor to finish the document and then send it off into a workflow for approval and commenting, and back. But it's essentially a solitary thing.

That is fundamentally changing. If you look at Office 365, if you look at all kinds of online collaboration tools, if you look at what we're doing with Fonto, and also if you look at something like Google Docs, you see that collaboration and authoring are blending.

And in a way it feels like at some point you’re deciding, okay, this version we’re now going to send off, but the collaboration doesn’t per se stop.

Another trend, I would say, is a shift from end-to-end, large multipage documents to content assets. And this has long been hard to accept for authors, that they are merely contributing to a body of knowledge by submitting content assets, fragments, components, topics, however you would call them. But still the trend is ongoing.

If we look at how younger people, for instance, here in our company are working, they are submitting, committing, information to a bigger body. It’s not so much that somebody owns a document.

Of course, the whole area of content orchestration becomes more important when this trend is ongoing.

Then, of course, there is assisted and automated writing. We see that authoring and writing, even though it is still an art perhaps and still has a lot of room for creativity, is also becoming more and more assisted.

So we all appreciate things like spell check and grammar check, but that extends, of course, to other checks, such as checks on all kinds of references. Maybe you already see the generation of pieces of content, the type-ahead kind of features. And the trend is that these kinds of features are going to help the author not only to write, but also perhaps to semantically tag their content.

And then last, but certainly not least, there is that blurring line between content and data. If you adopt a schema-controlled XML format as your source format, then you're probably going to move away from formatting for the way it should look towards tagging for what it means.

So when you go away from tagging data points, tagging phrases, paragraphs, or even sections for how they should look, how they should be outputted, and you go towards the semantics, the meaning of it, then you see that actually content becomes data in a way. And data embedded in content also gives it value.
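The difference between tagging for appearance and tagging for meaning can be sketched in a few lines of Python. This is only an illustration: the element and attribute names (`requirement`, `quantity`, `name`, `value`, `unit`) are hypothetical and are not taken from NISO STS or any other real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, invented for this illustration only.
# Presentational tagging says how the text should look.
PRESENTATIONAL = "<p>The bolt shall be tightened to <b>25 Nm</b>.</p>"

# Semantic tagging says what the text means.
SEMANTIC = (
    '<requirement id="req-1">The bolt shall be tightened to '
    '<quantity name="torque" value="25" unit="Nm">25 Nm</quantity>.'
    "</requirement>"
)


def extract_quantities(xml_text):
    """Return (name, value, unit) for every <quantity> element found."""
    root = ET.fromstring(xml_text)
    return [
        (q.get("name"), float(q.get("value")), q.get("unit"))
        for q in root.iter("quantity")
    ]
```

Calling `extract_quantities(SEMANTIC)` returns `[("torque", 25.0, "Nm")]`, while `extract_quantities(PRESENTATIONAL)` returns an empty list: the bold tag tells a machine nothing about what the value is.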

And we see that trend going on. Of course, none of these trends on its own is changing the whole perception of documents. But altogether we feel that there's a lot of momentum growing, and that the future of documents is going to be fairly fundamentally different from what it is today.

The future is structured and data-driven (13:41)

So, going back to that report on the future of documents, it says that documents will be data-driven and will be purpose-built for their intended audiences.

We strongly believe in that, and we think that that means that documents or content or documentation, however we should call this, will be componentised, they will be structured, and they will be semantically tagged, which then makes them universally usable in various ways and output channels.

Of course, and I don’t have to say that here, but of course, standards development organisations are professional document producing factories. So, it’s not surprising that if we look at the strategy of ISO that they just published earlier this year, that we see alignment with the trends as Forrester describes them, as we see them. So, ISO standards will be delivered more quickly in a form that the market’s asked for.

And from Fonto, we are just very grateful and happy to be a small part of that future of documents for standards.

Thank you.


How do document revisions work with Fonto?

Fonto is not a standalone solution; it is embedded in a content management system or document management system, so revisions will be stored in the document management system. In Fonto, we have a way to visualise all those versions, to compare them one by one, or to compare them in groups.

The other thing that we are implementing is something that we call ‘review’, which is kind of like allowing you to add comments to a document, but technically these are not being stored in the document itself. Each individual comment is a separate entity.

That means that they can be bundled when you process them. It means that the document creation process, the authoring, can go on in parallel with review or commenting. And it also means that you could do something like automatically generate a comment resolution table from those comments.
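As a rough sketch of why separate comment entities make this kind of bundling straightforward, the Python below builds table rows from a list of standalone comments. The `Comment` fields and the `resolution_table` helper are assumptions made for illustration; they do not describe Fonto's actual data model or API.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    """A review comment stored outside the document (hypothetical model)."""
    comment_id: str
    target_id: str        # id of the document fragment the comment refers to
    author: str
    text: str
    resolution: str = ""  # filled in when the comment is processed


def resolution_table(comments):
    """Return table rows sorted by target fragment, then by comment id."""
    ordered = sorted(comments, key=lambda c: (c.target_id, c.comment_id))
    return [
        (c.target_id, c.comment_id, c.author, c.text, c.resolution or "open")
        for c in ordered
    ]
```

Because each comment only points at a fragment id, the document itself is never touched when comments are added, grouped or resolved.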

Can Fonto take an existing document from Word and help refine it for further editing?

Yes, it can, but you'll have to have the document converted to NISO STS before you open it in Fonto. A NISO STS document can be authored in Fonto directly.

What is the largest size of document that you’ve tested it with? What is the user experience like with a doc that’s hundreds of pages in length? Does it dynamically load/unload off-view parts of the doc?

Document size has been a point of attention since our very first meetings with ISO, IEC, and the other standards development organisations. We're also trying to follow that paradigm of component-based authoring.

So what we do technically is chunk up the documents into smaller units, and that gives us the ability to go to an unlimited document size. We do something called ‘just in time loading’, so only those document fragments, the part of the document that is in view of an author, will actually be loaded into the editor.
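The general idea of loading fragments on demand can be sketched as follows. This is a simplified illustration under stated assumptions, not Fonto's implementation: the `LazyDocument` class, the `fetch` callback and the fragment ids are all hypothetical.

```python
class LazyDocument:
    """Holds fragment ids up front and fetches content only when viewed."""

    def __init__(self, fragment_ids, fetch):
        self._ids = list(fragment_ids)
        self._fetch = fetch   # hypothetical callback: fragment id -> content
        self._loaded = {}     # cache of fragments fetched so far

    def view(self, start, count):
        """Return the fragments in the visible window, fetching on demand."""
        visible = self._ids[start:start + count]
        for fragment_id in visible:
            if fragment_id not in self._loaded:
                self._loaded[fragment_id] = self._fetch(fragment_id)
        return [self._loaded[fragment_id] for fragment_id in visible]

    @property
    def loaded_ids(self):
        """Ids of the fragments that have actually been fetched."""
        return set(self._loaded)
```

With a document of a hundred fragments, viewing the first three triggers only three fetches; the cost of opening the document depends on what is in view, not on the total size.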

This also means that collaboration is easier because it’s only those components, those fragments of the entire document, that are now locked by this user, so that other users can work in parallel.
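Locking at the level of fragments rather than whole documents might look roughly like this. The `FragmentLocks` class and its method names are hypothetical and only illustrate the principle; a real system would also need to handle things like timeouts and concurrent requests.

```python
class FragmentLocks:
    """Tracks which user currently holds which fragment (illustrative only)."""

    def __init__(self):
        self._owner = {}  # fragment id -> user holding the lock

    def acquire(self, fragment_id, user):
        """Lock the fragment for user; return False if someone else holds it."""
        holder = self._owner.setdefault(fragment_id, user)
        return holder == user

    def release(self, fragment_id, user):
        """Release the lock, but only if this user is the one holding it."""
        if self._owner.get(fragment_id) == user:
            del self._owner[fragment_id]
```

Two users editing different fragments never block each other; only a second user asking for the same fragment is turned away until the first one releases it.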

Jan Benedictus

Founder & CEO | Fonto

Fonto makes structured content authoring easy, enabling subject matter experts to create, edit and review mission-critical documents quickly and efficiently.

Jan Benedictus started working in the field of online and digital publishing in the late 1990s. He started Fonto in 2014, with the mission of making structured content authoring available for everyone. He is a regular speaker on the subject of The Future of Documents, sharing Fonto’s experiences and insight from working with structured content across industries and sectors.