Thank you very much for the detailed info! I've been very busy with other work, but I will respond here once I have done some more testing and have uploaded a sample PDF and sample code on my personal Google Drive account.

It sounds like the proprietary web service should be using only the annotation information to establish the starting order for what it calls "fields" in its configuration UI. I'll do some testing to see whether that is the case. If so, then I at least know which area to focus on.

Thank you for the info on the field tree. I was aware of that, but I now see that it is an even stronger reason for leaving the field tree (effectively the "data model" component of form fields) alone if at all possible and working exclusively with the annotations that represent the "widgets" (which determine the "view or views" of the form fields).

And thank you for taking your time to explain how I went down the wrong path by focusing on the field tree and not the ("Widget") annotations!

Have a great weekend!!!

- Dwayne

On 1/31/2024 11:27 AM, sahy...@fileaffairs.de wrote:

Added note: as Tabs is defined at the Page level, it's clear that it
addresses the annotations and not the form fields, since a form field
can have multiple representations on the same page and/or on multiple
pages. The visual part of a form field is defined by its annotation(s).
The form field itself, i.e. without the annotation(s), doesn't have a
physical representation.

On Wednesday, 31.01.2024 at 18:22 +0100, sahy...@fileaffairs.de wrote:

Dear Dwayne,

for a generic solution reordering the fields won't help.

A field can be nested inside a field tree, but let's say one of the
nested fields is at the top of the page and the other is at the bottom
of the page. Now another nested structure might have fields in between.
You cannot move fields out of a nested structure to match the physical
order, as this has consequences for naming etc.

E.g.

Visual Order on the page

        [Policy.PolicyNumber]    [Date]
        [Policy.PolicyName]


From that you'd need to prompt for Policy.PolicyNumber, Date,
Policy.PolicyName, in that order.



AcroForm Order

        Policy
                PolicyNumber
                PolicyName
        Date

You cannot move PolicyName below Date as it's nested inside Policy. If
you move PolicyName out, the structure would be

        Policy
                PolicyNumber
        Date
        PolicyName

But that a) changes the fully qualified name of the field
(Policy.PolicyName becomes PolicyName) and b), as children can inherit
from their parents, the moved field might miss properties only defined
in Policy.
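
The naming consequence can be illustrated with a small sketch. It does
not use PDFBox; FieldNode is a hypothetical stand-in for a node of the
AcroForm field tree, and its partialName plays the role of the field's
partial name (in PDFBox the corresponding values would come from
PDField.getPartialName() and PDField.getFullyQualifiedName()). It only
shows how the fully qualified name is derived from the position in the
tree, assuming a node without kids is a terminal field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a node in the AcroForm field tree (not a PDFBox class). */
final class FieldNode {
    final String partialName;                       // the node's own (partial) name
    final List<FieldNode> kids = new ArrayList<>();  // child fields, in tree order

    FieldNode(String partialName, FieldNode... kids) {
        this.partialName = partialName;
        this.kids.addAll(Arrays.asList(kids));
    }

    /** Fully qualified names of all terminal (leaf) fields, in tree order. */
    static List<String> terminalNames(List<FieldNode> roots) {
        List<String> out = new ArrayList<>();
        for (FieldNode root : roots) {
            collect(root, "", out);
        }
        return out;
    }

    private static void collect(FieldNode node, String prefix, List<String> out) {
        // The fully qualified name joins the partial names from the root down with '.'.
        String fqn = prefix.isEmpty() ? node.partialName : prefix + "." + node.partialName;
        if (node.kids.isEmpty()) {
            out.add(fqn);
        } else {
            for (FieldNode kid : node.kids) {
                collect(kid, fqn, out);
            }
        }
    }

    public static void main(String[] args) {
        List<FieldNode> original = List.of(
                new FieldNode("Policy", new FieldNode("PolicyNumber"), new FieldNode("PolicyName")),
                new FieldNode("Date"));
        // PolicyName moved out of Policy so that it can follow Date.
        List<FieldNode> moved = List.of(
                new FieldNode("Policy", new FieldNode("PolicyNumber")),
                new FieldNode("Date"),
                new FieldNode("PolicyName"));
        System.out.println(terminalNames(original)); // [Policy.PolicyNumber, Policy.PolicyName, Date]
        System.out.println(terminalNames(moved));    // [Policy.PolicyNumber, Date, PolicyName]
    }
}
```

After the move the field is no longer called Policy.PolicyName, so
software that matches on that name would no longer find it.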


The UI application needs to follow the definition of the visual order
as specified in the PDF. This should depend on the location of the
(Widget) annotations on the page, as this defines the physical location
and is what you are looking for.

There is also an (optional) Tabs key inside the Page dictionary which
can define the order the application should follow when tabbing through
the (visual appearances of the) fields.

from the spec:

"R (row order), C (column order), and S (structure
order). Beginning with PDF 2.0, additional values also include A
(annotations array order) and W (widget order). Annotations array
order refers to the order of the annotation enumerated in the Annots
entry of the Page dictionary (see "Table 31 — Entries in a page
object"). Widget order means using the same array ordering but
making two passes, the first only picking the widget annotations and
the second picking all other annotations."
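
As a rough illustration of the two array-based values in the quote, the
sketch below orders a list of annotations the way "A" and "W" are
described. It is plain Java, not PDFBox: Annot is a hypothetical
stand-in for an entry of the page's Annots array (in PDFBox the list
would come from PDPage.getAnnotations() and the subtype from
PDAnnotation.getSubtype()). The R, C and S values are not covered here.

```java
import java.util.ArrayList;
import java.util.List;

final class TabOrderSketch {
    /** Hypothetical stand-in for one entry of a page's Annots array. */
    record Annot(String subtype, String label) {}

    /** Tabs = A: annotations array order, i.e. the Annots array as it is. */
    static List<Annot> annotationsArrayOrder(List<Annot> annots) {
        return List.copyOf(annots);
    }

    /** Tabs = W: same array order, but two passes: widgets first, then all others. */
    static List<Annot> widgetOrder(List<Annot> annots) {
        List<Annot> out = new ArrayList<>();
        for (Annot a : annots) {
            if ("Widget".equals(a.subtype())) {
                out.add(a);
            }
        }
        for (Annot a : annots) {
            if (!"Widget".equals(a.subtype())) {
                out.add(a);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Annot> annots = List.of(
                new Annot("Link", "link1"),
                new Annot("Widget", "Date"),
                new Annot("Text", "note1"),
                new Annot("Widget", "Policy.PolicyName"));
        // Widgets keep their relative array order and come before the other annotations.
        widgetOrder(annots).forEach(a -> System.out.println(a.label()));
    }
}
```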


Now, if the proprietary software doesn't follow these rules, what about
parsing the PDF and generating the "prompt" list instead of building it
manually? It can be generated by looking at the physical location of
the Widget annotations associated with a particular form field, so
you'd be able to generate the field list in the order the fields appear
in the PDF and feed that into your configuration for the form.
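
A minimal sketch of such a generator, under stated assumptions: it is
plain Java and does not touch PDFBox. WidgetBox is a hypothetical
stand-in holding the owning field's name, the page index and the left
and top edge of the widget rectangle; with PDFBox those values could be
collected from PDField.getWidgets(), PDAnnotationWidget.getRectangle()
(getLowerLeftX() / getUpperRightY()) and the page the widget belongs
to. The row tolerance is an assumed parameter for widgets that are not
aligned to the exact same height, and a field with several widgets is
listed once, at its first widget.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class PromptOrderSketch {
    /** Hypothetical stand-in for one widget: field name, page index, left and top edge. */
    record WidgetBox(String fieldName, int page, double left, double top) {}

    /** Field names ordered page by page, row by row (top to bottom), left to right. */
    static List<String> promptOrder(List<WidgetBox> widgets, double rowTolerance) {
        List<WidgetBox> sorted = new ArrayList<>(widgets);
        // PDF user space has its origin at the bottom-left: a larger y is higher on the page.
        sorted.sort(Comparator.comparingInt(WidgetBox::page)
                .thenComparingDouble(w -> -w.top()));

        Set<String> names = new LinkedHashSet<>();
        List<WidgetBox> row = new ArrayList<>();
        for (WidgetBox w : sorted) {
            if (!row.isEmpty()) {
                WidgetBox first = row.get(0);
                boolean sameRow = first.page() == w.page()
                        && first.top() - w.top() <= rowTolerance;
                if (!sameRow) {
                    flushRow(row, names);
                }
            }
            row.add(w);
        }
        flushRow(row, names);
        return new ArrayList<>(names);
    }

    /** Emits the collected row from left to right and starts a new one. */
    private static void flushRow(List<WidgetBox> row, Set<String> names) {
        row.sort(Comparator.comparingDouble(WidgetBox::left));
        for (WidgetBox w : row) {
            names.add(w.fieldName()); // a set, so a repeated field keeps its first position
        }
        row.clear();
    }

    public static void main(String[] args) {
        // Widgets given in AcroForm order; the coordinates are made up for the example.
        List<WidgetBox> widgets = List.of(
                new WidgetBox("Policy.PolicyNumber", 0, 50, 700),
                new WidgetBox("Policy.PolicyName", 0, 50, 650),
                new WidgetBox("Date", 0, 300, 698));
        System.out.println(promptOrder(widgets, 5.0));
        // [Policy.PolicyNumber, Date, Policy.PolicyName]
    }
}
```

The field tree stays untouched; only the generated list changes, so the
fully qualified names and inherited properties are preserved.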

BR
Maruan


On Tuesday, 30.01.2024 at 18:40 -0600, Dwayne Parks wrote:

I am almost certain that the expected order is basically top-left to
bottom-right, yes. Currently there is no calculation being used that I
know of.

Flattening: The issue isn't in the actual flattening itself. I need to
explain more about the way the PDFs are used.

The proprietary software is running as a web service where we upload
multiple "forms" as PDFs to form a library. At the simplest level, the
fields on a form are one of two types.

Field Type 1 has an internal field name that the software matches to
internal data, which it uses to set the field's value. Say, if the
field name is "Policy.PolicyNumber", then it sets the field's contents
to its internal data for the policy number, and that is what it uses
when it flattens the PDF.

Field Type 2 has a user-defined field name, and the software (during
the process of generating the output PDF, before flattening the fields)
prompts the user for each user-defined field's contents, which will be
used during the flattening.

There is a configuration page for each form that allows some control
over the prompting of data from the user (validation constraints,
descriptive names for prompts, etc.) and a basic way to change the
order in which the fields are processed (drag and drop a field up or
down in the order, one field at a time), but if the form is edited in
any way, this order "resets" to one based on the PDF's contents.

Some forms have hundreds of fields on them, and so we have
semi-technical people trying to "build out" multiple forms and getting
very frustrated when they need to make a small change in an edition of
a form and, when they re-upload the PDF file, the order is suddenly
reset to an unexpected one (I believe the same order in which
fields/widgets appear in the PDFBox debugger's "internal structure"
tree view under Root/Catalog -> AcroForm -> Fields).

Why this order is important (for the Type 2 fields only) is that we
want the user to be prompted for each user-defined field in order from
top-left to bottom-right, row by row. When the order is off, this is no
longer possible.

No errors are thrown, as the proprietary software will happily prompt
the user for the user-defined fields, but... it is adding hours to the
form updating time and starting to drive our semi-technical people
crazy.

One other approach is to figure out how to force the order of the
fields in Acrobat (which can be changed by dragging the fields up/down
to position them in the list of field names) to be "honored" when it
writes out the PDF contents to a file. It doesn't appear to do so. And
it also sometimes creates fields with widgets as "Kids" and sometimes
fields with the widget data combined with the field data when new
fields are created via copy/paste... all of this I had hoped to handle
with a "cleanup" utility that would take the user-edited PDFs as a
source and create cleaned-up PDFs as separate output files.

I hope that makes more sense on the why. Thanks for listening!!!

- Dwayne

On 1/30/2024 3:33 PM, sahy...@fileaffairs.de wrote:


What is the expected order? Is it by location, top left to bottom
right? Calculation order ...

Never heard that order matters for flattening. Is the proprietary
software throwing any errors which would be a hint?


BR
Maruan

On Tuesday, 30.01.2024 at 15:27 -0600, Dwayne Parks wrote:

Hello list!

I'm dealing with a proprietary software product that accepts PDFs with
fields in them to "flatten" into a final output PDF. The difficulty is
that it expects the fields (or their associated widgets) to be in a
certain order. I don't know the exact details of this, but it takes
much trial and error for our folks here manually deleting and
recreating fields, trying them and seeing if they are accepted.

So, to greatly streamline the process of getting the field/widget
content in the PDF files into a correct order, I would like to write a
utility that takes a configuration file containing a list of field
names and reorders the content in the PDF to match the order they are
in in the configuration file.

My naive initial idea is to:

   - Write a utility that outputs the current list of fields (in the
     PDF in the order that they are there) into a config file
   - Allow a user to reorder the lines of field names as desired
   - Write a utility that takes the config file and the PDF and
     rebuilds the field list/tree in the order that the config file
     specifies, then writes out the updated PDF contents to a new
     PDF file

Alternately, I believe that there is an order for forms/widgets that is
specified in Adobe Acrobat (tab order?) that I might be able to try to
recreate. I'm not sure if that will work, but it would allow
non-technical users to define the needed order without intervention
from technical staff.

I realize that there might be issues with combined field/widget fields
if it comes to needing to order the widgets instead, but I want to
start with the above and go from there.

So, I have a few questions to start with that someone might be able to
help me out with!

- Are there any examples of doing this sort of order modification?
- Is it possible to reorder field contents at the PDDocument /
  PDAcroForm / PDField level?
- Is it possible to reorder widget annotations at the PDAnnotation /
  PDAnnotationWidget level?
- Do I need to drop down to the COS* object level to do this?

Thanks in advance for any pointers, info or suggestions!

- Dwayne

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: users-h...@pdfbox.apache.org


