On 2018-05-14 20:50, Richard Gaskin via use-livecode wrote:
They are indeed for very different purposes, and we've been using PDF
for so long that it's become the hammer that makes everything look
like a nail, applied to so much while it's only truly best for a much
smaller subset.

Of course the subtle detail here is the use of 'best' - 'best' relative to what requirements?

PDF is a very general format - it models the notion of printed matter, which we still grow up with (although, admittedly, less and less as time goes by). This suggests the problem is not with PDF, nor with PDF being used - it is with the ties humans have to 'printed matter'.

In the course of my work I often go through periods of research, which
inevitably has me reading a lot of academic research papers and
corporate white papers.  Nearly all of them are published as PDF, many
exclusively in that format.

Two things to point out here:

Academic research papers are generally written using the format which the journal publishers require - typically LaTeX/TeX for anything beyond the normal written word and embedded figures.

Corporate white papers are usually written using Word Processors, in a format / layout defined by the company they are coming from. In many cases they will also go through some sort of 'design' phase afterwards, particularly if they are to be published widely - and often that will be using some page layout tool (such as InDesign).

In both these cases, the author/designer is designing at a fixed width (the joy of the rise of WYSIWYG in the '80s / '90s perhaps?)

The circumstances in which I'm immersed in such focus vary, and the
devices I have with me vary as well.  With reflowing content it
doesn't matter which device I happen to be using at the time, the work
continues unabated.

But when I encounter a PDF while using a screen less than 8.5" wide, the
need to constantly zoom in and out and scroll back and forth so slows
progress that it kills the joy of research, bringing the work to a
halt until I can get to a device that happens to emulate size
characteristics of paper, even though I'll never print anything I'm
reading.

Curious if I'm alone with the time I spend on smaller screens led me
to research that as well.  And it turns out I'm far from alone; it's
where people are spending most of their computing time these days.
And since this trend is driven largely by people younger than me it
seems unlikely to slow down, at least until the next displacing form
factor comes along (but then we'll be doing something entirely
different still).

Right, so the problem has nothing to do with PDF; it is to do with the fact that humans work better designing things at a fixed width, and the general tools which people learn to use, and continue to use, support this frame of mind.

If a document is any more than 'just text' (as in something which can be rendered using a single font independent of page width) then requiring documents to work at any layout width means the author has to abstract the intended layout and then instruct a tool how to preserve it.

Certainly for many individual cases of 'document type' you can mechanize and assist; however, then the authors need to be aware of precisely what document type they are producing, and learn how to instruct a tool to encode content for that document type.

I'd like to be optimistic here, but I honestly don't think this is a problem with tooling. Semantic representation of content has been around for as long as I have (probably longer), and I was playing with systems which offered it when I was in my teens; yet in my entire life since then, I still see the majority of documents produced using word processors or similar 'unconstrained' tools.

The problem, I think, is that humans don't like to be constrained when writing - any tool which appears to constrain what they can do in what they think (at the time) is an unreasonable way tends to be considered 'bad'. However, to achieve the goal of representing content in a contextual manner (relative to some abstract pattern which can be processed in the ways necessary to free us of fixed width layout, in this case) constraints are absolutely necessary.

Admittedly, the rise of the web, and particularly HTML/CSS, means we have an ever-increasing body of practitioners who do have to think about the patterns of content, rather than just the content, but the knowledge they have and are able to apply has been hard won and learned by them (just like any other domain-specific endeavour).

Different tools for different jobs indeed.  Not everything is a nail,
but through the combination of technological inertia and an
acceptance among the majority of people who are not inventors of
making do with whatever tool is handed to them, we keep using hammers
to drive screws.

Ideally all content would be represented at a semantic level requisite to its context.

e.g. Why use anything other than ASCII text, if your text can be entirely represented using ASCII?

... in exactly the same way as the author intended.

This is the only part of what you wrote I disagree with, if we were to
try it on as a general rule.

Writing is the flow of ideas from one mind to another, encoded in
streams of text.

Line breaks are often a meaningful part of that communication, and on
occasion page breaks as well.

But for most writing, aside from perhaps code and poetry, column width
is rarely a semantic consideration at all.  Even printed books come in
different sizes.

By general do you mean either:

  - for a 'high' percentage of cases

  - for all cases

I'm guessing you meant the former - I was talking about the latter.

The point is that there is no general rule - I can guarantee that for every constraint which you add to a system for representation of content, there will be numerous existing examples (entire families, in fact) which cannot fit into it. Similarly, what you will find is that if a system is required to be used, then people will find a way to 'work around' the constraints - leaving you back where you started - i.e. your system will work exceptionally well for things written precisely to work with it, but poorly for the rest, and over time the poor cases will start to become a noticeable percentage of content.

As people who write software, we have the ability to create abstract representations of content, but the problem is mapping the concrete form to the abstract - particularly when we live in a world where concrete forms abound in their billions, and entire workflows are centered around them. Any system which can't deal with the concrete or interoperate with it is unlikely to ever gain a huge amount of traction.

From that point of view, I do think ePub is a bit of a 'red herring' here - it isn't really anything 'more' than a container format, with a reasonable way to encode indices/document structure. Internally it uses the web technologies, which are good for reflowing text, certainly, but you still need to generate the HTML/CSS etc. and it is the mapping from 'what I want to say' to 'how do I encode it in a way which works in all the ways other people want it to' which is the hard part.

I'm sure things like ePub will help a bit - at least it is trying to instigate some bounds on communication of such things - however, I do strongly suspect it will become a largely irrelevant technical detail at some point.

After all, what the world perhaps needs (rather than another file format) is a way to mechanically take the existing forms of how we communicate and turn them into a form which is more amenable to modern usage patterns (i.e. a system which turns a PDF into a re-flowable document).
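
To make the 'concrete to abstract' mapping a little more tangible, here is a rough sketch in Python of the simplest possible case: taking hard-wrapped, fixed-width text (the kind you might get out of a PDF text extraction) and re-flowing it to an arbitrary width. The function names and the 'blank line means paragraph break' heuristic are purely my own assumptions for illustration; this is not how any real PDF tool works, and it only uses the standard library's textwrap module.

```python
import textwrap


def unwrap_paragraphs(fixed_width_text):
    """Guess the paragraphs in hard-wrapped text.

    Assumption (hypothetical heuristic): a blank line ends a paragraph,
    and every other line break is a layout artifact, not content.
    """
    paragraphs = []
    current = []
    for line in fixed_width_text.splitlines():
        stripped = line.strip()
        if not stripped:
            # Blank line: close the paragraph collected so far, if any.
            if current:
                paragraphs.append(" ".join(current))
                current = []
            continue
        current.append(stripped)
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


def reflow(fixed_width_text, width):
    """Re-wrap the guessed paragraphs to the requested column width."""
    return "\n\n".join(
        textwrap.fill(paragraph, width=width)
        for paragraph in unwrap_paragraphs(fixed_width_text)
    )


sample = (
    "The quick brown\n"
    "fox jumps over\n"
    "the lazy dog.\n"
    "\n"
    "Second paragraph\n"
    "here."
)

# Same content, laid out for a narrow and then a wide display.
print(reflow(sample, 20))
print()
print(reflow(sample, 80))
```

Even this toy shows where the difficulty lies: the heuristic destroys exactly those line breaks which carry meaning (code, poetry, tables), because the fixed-width form never recorded which breaks were content and which were layout.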

Warmest Regards,

Mark.

--
Mark Waddingham ~ m...@livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps
