Y-coordinates in PDFs

Post by jPV »

Is there a reason why pdf.GetRects and pdf.GetPageLinks (and probably other functions too) return coordinates with the origin at the bottom?

For example with this PDF: sample-link_1.pdf

And this code:

@REQUIRE "polybios", {Version=1, Revision=1}
pdf.OpenDocument(1, "sample-link_1.pdf")
pdf.LoadPage(1, 1, True)
t = pdf.GetRects(1, 1, 0, 3)
DebugPrint("Top:", Int(t[0].top), "Bottom:", Int(t[0].bottom))
t = pdf.GetPageLinks(1, 1)
DebugPrint("Top:", Int(t[0].top), "Bottom:", Int(t[0].bottom))
You get this output:
Top: 736 Bottom: 727
Top: 479 Bottom: 464

So the top values are bigger than the bottom values.

One funny detail, though: I've found one PDF that gives the link values (but not the rect values) in a different, more logical order:
Top: 796 Bottom: 782
Top: 723 Bottom: 737

All the other PDFs I've tried so far behave as described above. So I have to check in my code which of top or bottom is bigger. Any idea why it's like this?

Re: Y-coordinates in PDFs

Post by airsoftsoftwair »

Yes, that's normal. PDF uses an origin in the bottom-left corner instead of the top-left corner (as described here). I don't know, though, what's the deal with that PDF you mentioned which seems to have the coordinates in canonical order?! Can you provide that one so I can take a look?
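
If top-left coordinates are needed, a bottom-left y value can be flipped by subtracting it from the page height. The following is only a rough sketch: p_FlipY and pageHeight are hypothetical names, the value 792 assumes an unrotated US Letter page measured in units of 1/72 inch, and how to query the real page height is not covered in this thread.

```
@REQUIRE "polybios", {Version=1, Revision=1}

; Hypothetical helper: converts a y value measured from the bottom edge
; of the page into a y value measured from the top edge.
Function p_FlipY(y, pageHeight)
    Return(pageHeight - y)
EndFunction

; Assumption: US Letter page, 11 inches at 72 units per inch.
; Replace this with the actual height of the loaded page.
pageHeight = 792

pdf.OpenDocument(1, "sample-link_1.pdf")
pdf.LoadPage(1, 1, True)
t = pdf.GetPageLinks(1, 1)

; After flipping, the upper edge has the smaller y value
DebugPrint("Top:", Int(p_FlipY(t[0].top, pageHeight)), "Bottom:", Int(p_FlipY(t[0].bottom, pageHeight)))
```

With a page height of 792, a bottom-left y of 736 becomes 56 and a y of 727 becomes 65, so the top edge ends up with the smaller number, as expected in a top-left system.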

Re: Y-coordinates in PDFs

Post by PEB »

My guess is that the PDF with strange coordinates has a non-standard page size. If I create a PDF with the standard 8.5 x 11 inch dimensions, the coordinates work as expected (with 0, 0 in the lower left-hand corner). But if I create a PDF in a non-standard size, the coordinate system is way off (as if the document were rotated 90 degrees to the left).

Re: Y-coordinates in PDFs

Post by airsoftsoftwair »

After examining the PDF I can confirm that it really stores the link coordinates in the unusual order. According to the Adobe PDF specification, though, this is completely legit:
Rectangles are used to describe locations on a page and bounding boxes for a variety of objects. A rectangle shall be written as an array of four numbers giving the coordinates of a pair of diagonally opposite corners.

Although rectangles are conventionally specified by their lower-left and upper-right corners, it is acceptable to specify any two diagonally opposite corners. Applications that process PDF should be prepared to normalize such rectangles in situations where specific corners are required. Typically, the array takes the form [llx lly urx ury] specifying the lower-left x, lower-left y, upper-right x, and upper-right y coordinates of the rectangle, in that order. The other two corners of the rectangle are then assumed to have coordinates (llx, ury) and (urx, lly).
So this means your code really has to be able to deal with this.
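
One way to deal with it is to sort the two y values before using them. This is a minimal sketch rather than anything built into the plugin: p_SortY is a hypothetical helper, and it relies only on the top and bottom fields seen in the output earlier in this thread.

```
@REQUIRE "polybios", {Version=1, Revision=1}

; Hypothetical helper: returns the larger y value first and the smaller
; one second, regardless of the order in which the PDF stored them.
Function p_SortY(r)
    If r.top >= r.bottom
        Return(r.top, r.bottom)
    Else
        Return(r.bottom, r.top)
    EndIf
EndFunction

pdf.OpenDocument(1, "sample-link_1.pdf")
pdf.LoadPage(1, 1, True)
t = pdf.GetPageLinks(1, 1)

; upper is the edge closer to the top of the page (bottom-left origin)
upper, lower = p_SortY(t[0])
DebugPrint("Upper:", Int(upper), "Lower:", Int(lower))
```

Since the specification allows any two diagonally opposite corners, the x values could presumably be swapped in the same way, so the same check may be worth applying to the horizontal fields too.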