Related: Ange Albertini, the creator of the .PDF/.ZIP/ELF reference diagrams (github/corkami) has started posting overview videos on his YT channel (@corkami-albertini) including creating .PDF+.PNG+.ZIP chimera files.
The .PDF basics vid was the first in the series: https://www.youtube.com/watch?v=q6KgFezu8tw
I don't have a Kindle to test, but I wonder if this works on a Kindle
This is really cool and fun!
I don't know much about the security issues others have raised, but if you're good enough to make this thing then I deserve to be pwned by you.
Chapeau!
https://www.nutrient.io/blog/how-to-program-a-calculator-pdf... See here for how we did a calculator in a PDF
Love the demo video and post but for some reason this doesn't seem to work for me. Running Chrome on Android 14
You glorious bastard, what a cool project! This is already a contender for most hacker project of the year :-)
(below is not serious)
I would advise people against using this in production though because it's still missing some critical features. For example:
1. The Javascript stops working when printed to physical paper. The resulting paper just has a static image and the controls no longer work.
2. It doesn't work properly in Evince. It just shows an error "The document contains only empty pages"
"The Javascript stops working when printed to physical paper. The resulting paper just has a static image and the controls no longer work."
-- this comment made me laugh/choke on my coffee and I have no regrets.
You must have never browsed IT support tickets. Oh the horrors...
Internally laughing and crying at the same time. "Oh the horrors..." is exactly right.
I feel stupid for not getting the joke. It would have been nice if you explained it in the ... postscript.
(Yes this is a joke)
Just don't try to do this in any less powerful display languages, or you'll really be in a PCL.
> 1. The Javascript stops working when printed to physical paper.
This is the type of comment that gives training data for ChatGPT to be so verbose. Ha!
i recently discovered that the Canadian government depends on this for some fillable forms, because it shows a message at the top that says "JavaScript is disabled" and all the boxes show errors. i couldn't get it to work on Linux and had to dust off a Windows machine (and it still didn't work in firefox, it needed acrobat reader).
I have faced this exact problem with Canadian govt forms. Evince doesn't support them. They are so specific about only Adobe Acrobat to fill out the forms. I can open them in Firefox but can't update them properly. The only option is to use my barely hanging on 10-yr old Windows machine.
Let's hope that eventually they move on to a simpler web form.
Okular supports javascript in PDFs and works with many fillable forms.
Wait, did Acrobat actually end support for Linux? Or you just didn't want that particular machine to catch... capitalism?
There is no recent version of Acrobat Reader for Linux, and old (was it 5.x beta?) versions rarely work on modern distros.
> The Javascript stops working when printed to physical paper. The resulting paper just has a static image and the controls no longer work.
I believe you need to rescan it into PDF to get it to work again.
It might be possible to set up some kind of pdf quine using e.g. a QR code
> The Javascript stops working when printed to physical paper.
It works for me. Maybe you need to upgrade your paper? What version are you using?
hahaha I almost wish you hadn't included the parenthetical. I've had some clients who would definitely email me that point #1.
No. They would fax it to you.
> The Javascript stops working when printed to physical paper. The resulting paper just has a static image and the controls no longer work.
Science fiction tells us this is only temporary. Print away, those papers will turn into magic in just a few decades!
Just wait until we get this on e-paper.
>"1. The Javascript stops working when printed to physical paper. The resulting paper just has a static image and the controls no longer work."
Just wait until e-paper replaces the real one ;)
This is amazing and terrifying (I am a security engineer and parsing complex document formats is a never-ending treasure trove of vulnerabilities).
The "code execution" in PDF parsing is what enabled this legendary zero-click, zero-day exploit of iOS devices: https://googleprojectzero.blogspot.com/2021/12/a-deep-dive-i...
That exploit is indeed legendary but the code execution involved is not JavaScript. Indeed the iOS PDF renderer does not have JavaScript enabled.
The amount of attack surface in various format parsers is pretty stunning and terrifying indeed
AI agents run in isolated VMs, but PDFs have been out here running in the open for 30 years!
But can your PDF run an AI agent?
> But can your PDF run an AI agent?
Oh it's so much worse than that. Your font can run an AI agent.
Llama.ttf: A font which is also an LLM -- https://news.ycombinator.com/item?id=40766791
In my opinion the question isn’t so much “if” but rather “when”.
When will AI research and hardware capabilities reach a point that it’s practical to embed something like that into a regular document?
We’ve already seen proof of concept LLMs embedded into OpenType fonts.
I guess the other question is then “what capabilities would these AI agents have?” You’d hope just permission to present within that document. But that depends entirely on what unpatched vulnerabilities are lurking (such as the Microsoft ANSI RCE also featured on the HN front page)
For Chrome's PDF renderer, the runtime is V8, so we're literally one (hilarious) line of code away from this glorious future existing today:
https://pdfium.googlesource.com/pdfium/+/refs/heads/main/fpd...
> // Use interpreted JS only to avoid RWX pages in our address space. Also, --jitless implies --no-expose-wasm, which reduce exposure since no PDF should contain web assembly.
> return "--jitless";
Looking forward to a day when you may not have a powerful enough GPU to open a PDF
The first widespread AI Malware will be a historic moment in this century. It will adapt like a real biological virus to its host and we have no cure for this.
This isn't even the beginning of what's possible in PDFs.
Atari Breakout for PDF: https://cdn.jsdelivr.net/gh/osnr/horrifying-pdf-experiments@...
This is horrifying, PDFs should not be able to execute code.
Seriously, I hate it.
I understand why it happened -- it made sense to allow PDF's to be used for form-filling, and once you can fill in forms it obviously makes sense to validate inputs, and to handle arbitrary validation complexity you need a scripting language, and obviously then you want to be able to automatically fill in fields based on other fields, or even produce a QR code so it can be printed and scanned... And they didn't want to create a new extension like ".ipdf" for interactive PDF.
But still. I hate it.
One should reject all PDFs except PDF/A-compliant ones.
Maybe if one enjoys endless conversations with unhappy customers. Easier to simply isolate the PDF rendering/parsing and move on.
HTMLs too :)
Not just web browsers, Acrobat (and probably other PDF readers) have supported executing Javascript in PDFs for decades.
Doesn't work in Preview unfortunately.
This is even in the ISO standard now
Which makes sense, why would browsers randomly add JS to PDF if it wasn’t already part of the standard?
What a nightmare that JS is a part of the PDF standard. I suppose that it's optional.
I was joking in 2007, when I was working at Siemens, to my boss, that an Excel cell can contain God and the Multiverse when I put an ActiveX inside that was basically a program I made which would draw a 3D animation based on parameters contained on other cells. Let's say the boss was impressed though for me was just basic OLE.
I see from time to time that younger generations reinvent/rediscover the wheel and I chuckle.
I, for one, was surprised that Chrome's PDF renderer would allow persistent JS code like this to run - not just limited code in response to user actions, but a real game loop.
But there's a spec for all this and everything! https://www.t10.org/ftp/js_api_reference.pdf (2007) - be warned, the light of Ecma TC39 standardization does not extend to this place.
Chromium's implementation of setInterval for instance (which, in this world, takes a string to evaluate): https://pdfium.googlesource.com/pdfium/+/refs/heads/main/fxj... -> https://pdfium.googlesource.com/pdfium/+/refs/heads/main/fxj...
From a security perspective, they're able to build on top of V8 isolate primitives and Chrome's sandboxing systems - but from the logs, security improvements in PDFium are being continuously developed as recently as the past few weeks! I feel like I've stumbled upon a parallel universe, in the best possible way.
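For anyone wondering what "a PDF that runs JS" looks like on disk, here is a minimal sketch in Python that assembles a one-page PDF whose open action is a JavaScript action. The `build_pdf` helper, the `tick` function, and the 50 ms interval are my own illustrative assumptions, not the OP's code. `app.setInterval` with a string expression is the call described in the API reference linked above; whether a given viewer keeps `tick` in scope for later timer callbacks is something I have not verified.

```python
def build_pdf(script: str) -> bytes:
    """Assemble a one-page PDF whose open action runs `script`.

    Assumes `script` contains only Latin-1 characters.
    """
    # Escape characters that are special inside a PDF literal string.
    escaped = script.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    objects = [
        "<< /Type /Catalog /Pages 2 0 R /OpenAction 4 0 R >>",
        "<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        "<< /Type /Page /Parent 2 0 R /MediaBox [0 0 200 200] /Resources << >> >>",
        f"<< /S /JavaScript /JS ({escaped}) >>",
    ]
    out = bytearray(b"%PDF-1.7\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        # Record the byte offset of each object for the cross-reference table.
        offsets.append(len(out))
        out += f"{number} 0 obj\n{body}\nendobj\n".encode("latin-1")
    xref_start = len(out)
    out += f"xref\n0 {len(objects) + 1}\n".encode("ascii")
    # Each xref entry is exactly 20 bytes, including the two-byte line ending.
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += f"{offset:010d} 00000 n \n".encode("ascii")
    out += (
        f"trailer\n<< /Size {len(objects) + 1} /Root 1 0 R >>\n"
        f"startxref\n{xref_start}\n%%EOF\n"
    ).encode("ascii")
    return bytes(out)


# Hypothetical game loop: the timer callback is passed as a string to evaluate.
GAME_LOOP_JS = 'var frame = 0; function tick() { frame++; } app.setInterval("tick()", 50);'

if __name__ == "__main__":
    with open("loop.pdf", "wb") as handle:
        handle.write(build_pdf(GAME_LOOP_JS))
```

This only shows the container. A real game still needs form fields or annotations for input and output, and the page here is deliberately empty.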
They also support iframes! The absolute madness of PDFs is a world wonder. But I'm really still not sure we could do without them.
Gzipped PostScript documents were fairly popular during the 90's and are functionally identical to PDFs for 99% of use cases. (PDF is essentially PostScript, but with more features.)
For Gzipped PostScript, code execution is its raison d'être. But it is at least possible to build a PDF viewer without code execution.
Well, it's both a simpler language more geared toward presentation and one that includes more modern features designed for on-screen viewing.
Take that RAG parser
Ok, I kinda knew it was possible (I guess everybody did), but this should be a very illustrative example. And unfortunately it doesn't seem like PDFs are gonna go away (though, really, why the hell isn't there any alternative?!) So it raises the question: is there any way to handle this garbage safely? I.e. in a way that it couldn't run JS? I'm pretty sure JS is not really necessary to read 99.999% of the PDFs out there.
You can build mupdf with -javascript on Gentoo (I also bwrap it to hell, personally).
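If you just want to triage files before opening them, a rough sketch in Python is to scan the raw bytes for the name tokens that introduce scripts or automatic actions. The token list and the helper names below are my own assumptions. This is only a heuristic: it will miss names hidden inside compressed object streams or written with `#xx` hex escapes, so a clean result does not prove a file is script-free.

```python
import re

# Name tokens that typically introduce scripting or automatic actions in a PDF.
# This list is an illustrative assumption, not an exhaustive one.
SUSPICIOUS_NAMES = (b"/JavaScript", b"/JS", b"/OpenAction", b"/AA")


def find_script_markers(data: bytes) -> dict:
    """Count occurrences of each suspicious name token in raw PDF bytes."""
    counts = {}
    for name in SUSPICIOUS_NAMES:
        # Require a delimiter (or end of data) after the name so that
        # /JS does not match a longer name such as /JSFoo.
        pattern = re.escape(name) + rb"(?=[\s/<>\[\]()]|$)"
        counts[name.decode("ascii")] = len(re.findall(pattern, data))
    return counts


def looks_scripted(data: bytes) -> bool:
    """Return True if the bytes contain an uncompressed JavaScript marker."""
    counts = find_script_markers(data)
    return counts["/JavaScript"] > 0 or counts["/JS"] > 0
```

Usage would be something like `looks_scripted(open("file.pdf", "rb").read())`. Sandboxing the viewer, as mentioned above, is still the more robust option.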
Not only web but nearly all OS PDF renderers support JS. It used to be a major source of malware a long time ago.
PDFs are still used to deliver malware. Adobe gets picked on less often now since everyone has PDF readers in the browser, but that just makes Chrome the new target of choice (not that alternative viewers don't get attention too: https://thehackernews.com/2024/05/foxit-pdf-reader-flaw-expl...). What I see most often in malicious PDF files recently, though, are just links to websites that contain malware, since they work no matter what your viewer is.
"used to be"
Wow... It's only January. I'm so excited to see what you release in February and beyond!
Interesting!
Something neat I found, you're able to 'clip' the blocks into each other by spinning them right before the block settles.
"It was a bit tricky to find a union of features that work in both engines [..]"
I am curious what the constraints are to make this work and in which environments it does? Does it work in PDF viewers outside the browser? Is there documentation what is available in which environment? What is enabled by default, can be switched on or off?
I barely looked at Adobe Reader so not sure about that one, it definitely does not work with this PDF though, likely because it's not compliant in several ways. Besides that I wouldn't be surprised if it supports all the required JS APIs and more, just possibly behind some permission prompts.
It might work in Foxit as I believe it supports some scripting. Most of the other native PDF renderers are more static, as far as I know. In either case, I was most interested in the browser-native engines, as I always thought of them as more "static"/limited.
As for documentation on specific features: to be honest, I just looked at the implementations of PDF.js and PDFium. Both only support a subset of the "standard" API, likely for security reasons. But PDF.js for example allows changing a field's background color (colored pixels!), and PDFium allows modifying their position/bounding box (I tried a high res color display by moving a row vertically as if it's a scanline, but things become quite laggy).
I came to the same conclusions. Unless I misunderstood, PDFium is based on Foxit, so that should work. And since both pdf.js and PDFium implement only a thin part of the Adobe JS SDK, there's a good chance it works there too.
I guess it should read intersection instead of union.
Oops, yeah :)
This is awesome.
Took a bit of prompting but was able to get a semi-working (only in Chrome) Flappy Bird out of Claude in ~10 minutes. Seems like the collision detection needs some work :)
https://github.com/baileywjohnson/flapdfy-bird/blob/main/fla...
amazing, i didn't know PDF supported javascript.
i've tried making "interactive" PDFs before but using POST and server side rendering rather than client, e.g. a PDF typewriter i made a little while back on http://news.coffee
Playable where?
It doesn't work in the Adobe Chrome PDF viewer, or in Preview.
Sadly, Adobe Acrobat Viewer cannot load it, but if you go to Chrome and choose Open..., that should use Chrome's PDF viewer to display it in the browser (depending on your settings, maybe), which worked for me.
playable for me in firefox and chrome
Works in Edge's PDF viewer, after exiting the initial mode via the <- in the upper left corner. (If you know how to avoid this being the default, let me know.)
works for me in chrome
A few questions if you're willing:
1. What led you to want to do this project?
2. Have you worked with PDFs before? Do you work with PDFs as part of your day job?
3. Have you implemented Tetris before or is this your first time?
4. How long did it take you?
OP, I still don't really understand how you got it to work in Chrome?
This is Evil Genius level work. Congratulations!
Did you do the actual coding in Acrobat or is there a less painful way to write embedded JS in a PDF?
PDFs, Regexes and Typescript Compiler make great runtimes!
This is great. Will probably give the fun police in r/k12sysadmin a heart attack.
This is a good reminder of why not to download random PDFs. One of the mechanisms of the Pegasus spyware was emulating a computer inside a PDF.
https://en.wikipedia.org/wiki/Pegasus_(spyware)#Vulnerabilit...
The vulnerability was in image parsing, and the exploit was distributed by sending an iMessage to the target. So don't open any images, and don't read iMessages. They are also known to use browser exploits, so don't visit random websites.
That was sarcasm, in case it's not clear over the internet. Telling people to avoid "suspicious" pdfs/websites is common but ultimately not very useful advice.
The real takeaway is: don't become a target of a nation state intelligence agency. If you own a phone, they can take over it, and there's nothing you can do.
The Pegasus Project has shown that pretty much anyone could be targeted. It's enough to know someone in a publicly owned company or publicly say something negative about corruption or just be in the wrong place at the wrong time.
Nothing you do will guarantee that the state won't come after you.
A tetris PDF could be in a 1 pixel iframe right on this page and you'd never know it. So it doesn't require any user action to download one.
That's why you run NoScript alongside uBO
I'm pretty sure noscript will break 90% of the webpages I visit. I just rawdog the internet. If Chrome gets 0day'd then a lot of us are going down - at least I'll have company.
I'm probably lucky that Sumatra is showing them as static documents.
I hope to see this evolved into doom by the end of the year. And it better not be just monochrome
This is hilarious
Genius you mean ?
Well, it's quite cool, but if PDF supports javascript, putting a javascript game in a PDF is something obviously possible. I don't know if it qualifies as genius. If the game was made from PostScript commands somehow, that would be genius.
Anyway, I love this content on Hacker news, as opposed to people explaining how they want Apple to take their freedom away, because freedom is dangerous.
> as opposed to people explaining how they want Apple to take their freedom away, because freedom is dangerous
May I be the first to reply that I am glad that this works in neither Safari nor Preview.app :)
Obviously a talented individual. Nice to see them wasting time making something ridiculous
I don't know how serious you are, but for others: projects like this are virtually never a waste of time. There's opportunity cost of course, but that's very difficult to measure. I'm sure OP learned a ton about PDFs in the process, and there is no shortage of need for PDF creation. More broadly they also deepened their knowledge of JavaScript and other things.
I believe there is a bug with the T block, I think I managed to overlap some blocks
Would this work on a simple (non-android) eink reader, like a kindle?
I was considering doing exactly that ahah. We should connect to share our hacks and pains. One cool project would be to run wasm4 games because, yes, pdfium and pdf.js can run webassembly.
Could you use checkboxes for display? I'm not sure if you can style them, but I think you can access them in JS, and that should result in having basic "pixels" which you can use to draw anything.
I made a game of life in PDF using this technique, but pdf.js is less willing than Chromium to respect the standard in letting the PDF designer define the ON and OFF states.
One other way would be to use normal text fields and leverage custom fonts. I think there is enormous potential with fonts in the realm of PDF hacking. I think there is also a story of a past vuln in pdf.js because fonts were evaluated outside the sandbox.
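The checkbox-as-pixel idea boils down to a mapping from grid cells to form field names, plus a diff so that only the cells that changed get touched each frame. A minimal sketch of that bookkeeping in Python, where the `px_x_y` naming scheme and both helper names are hypothetical, and how the new state is actually written to a field depends on the viewer:

```python
def field_name(x: int, y: int) -> str:
    """Hypothetical naming scheme: one form field per grid cell."""
    return f"px_{x}_{y}"


def frame_diff(prev, curr):
    """Return {field_name: new_state} for cells that differ between two frames.

    Both frames are lists of rows, each row a list of 0/1 cell states.
    """
    changes = {}
    for y, (old_row, new_row) in enumerate(zip(prev, curr)):
        for x, (old, new) in enumerate(zip(old_row, new_row)):
            if old != new:
                changes[field_name(x, y)] = new
    return changes


previous = [[0, 0], [0, 0]]
current = [[0, 1], [0, 0]]
updates = frame_diff(previous, current)  # only the top-right cell changed
```

Updating only the changed fields matters because, as noted elsewhere in the thread, touching many fields per frame gets laggy.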
That sounds like something CodeBullet might have done!?
Awesome.
I don't do security stuff anymore but I feel chills when I see (great) things like this,
Warning: Error during font loading: Font "HeBo" is not available.
Wow, I had no idea PDFs could be this dynamic. Doesn't work in Mac OS preview or quicklook but works great in chrome.
The Canadian passport application PDF has Javascript that updates a QR code in the top-right corner of the first page whenever you change or fill in a field.
https://www.canada.ca/content/dam/ircc/migration/ircc/englis...
Seems like a pretty genius way of avoiding transcription errors. When I dropped my passport application off yesterday the passport officer marked up a few things on the PDF and then scanned it in, so I assume that they use the QR code to automatically fill in the data as I entered it and then make any updates necessary from after-the-fact modifications manually.
Only seemed to work correctly in Acrobat Reader, but I haven't tried others (like Foxit) or anything.
Yes, elsewhere in this thread people were complaining about how Canadian government PDFs only work in Acrobat Reader on Windows and what a PITA that is.
This is awesome! I think you should add the explanation of how it works in the PDF itself as well
So does that mean we can transpile PDFs to webassembly now?
Kinda happy that Evince doesn't start executing JS when opening a PDF.
That's both awesome and terrifying security-wise.
Fortunately this does not work in Safari where the rendering is done natively.
Does not even seem to be a valid PDF according to Preview.app
Preview implements a subset of the full capabilities of PDF, and in particular it does not implement the javascript interpreter.
and this is why I can't read HN at work anymore........
I have increasing confidence that when AIs finally destroy the Internet the delivery vehicle will be the file format that was created, as the Internet itself was, as a form of digital paper.
Neat! Sadly doesn't work in Evince.
This is really awesome, great job!
I just wish I could print this
Doesn't work in pdf.js
this is a horrible idea.
which is why i am commenting to check it out later.
since postscript is also a language that it literally runs to render, would it also be possible to use postscript to make interactive elements?
I did the same but with snake: https://roberts.pm/resume.pdf (Game at bottom -- though only works in Firefox and adobe. Now I need to add chrome support, thanks op. lmao)
Edit: here's the code for my snake game too, btw = https://github.com/robertsdotpm/resume/blob/main/snake.js
So cool
didn't work in safari's embedded reader. no text either, just a blank page. or did i not wait long enough?
Brilliant!
So it's possible to port a C compiler to PDF. The compiler is already done: https://github.com/Mati365/ts-c-compiler. We can run DOS in PDF basically..
That's how it inevitably goes with Turing completeness :)
The real achievement here arguably isn't running code (that's provided by the PDF spec and implementations), but managing to hook it up to user input/output in an ergonomic-enough way to play Tetris.
The mention of Turing Completeness got me curious, so I looked something up. Behold, a C compiler written in Lambda Calculus: https://github.com/woodrush/lambda-8cc
Amazing, thank you!
The PDF [1] containing the Lambda calculus term manages to hang/glitch/crash both Firefox's and macOS Preview's PDF renderer, which in itself is quite the achievement in portability.
Update: Nevermind, Firefox handles it perfectly, it just (probably wisely) disables seamless scrolling and I have to use the "next/previous" page buttons manually. macOS got there after a minute or two of loading with no UI indications.
Adobe Acrobat DOOM Pro™
What about running Adobe Acrobat in Adobe Acrobat?
Can we run Windows 3.1 in protected mode from a PDF?
Imho, it's possible. Generally speaking, it depends if PDF can render any sort of canvas.
Can we compile qemu to a PDF?
It's PDFs all the way down.
But will it also compile when printed out on paper?
Back in school pdfs would circulate that had a bunch of flash games on them. I have no idea how or who made them, but they let us play dolphin olympics on lab computers with no internet connection.
Excel for games and PowerPoint for stick animations. You'd spend hours in CAD class just creating PowerPoint animations and not doing any CAD.
I regret this decision now and wish that I had paid some attention. 3D printers are cool and I have no idea how to design objects for it.
>> I do wish I did pay some attention to CAD now. I want a 3D printer and have no idea how to design objects for it.
Get Solvespace: https://solvespace.com/index.pl
Do the tutorials. If/when you outgrow it, the concepts will carry over to FreeCAD which otherwise has a steeper learning curve but has more capabilities.
An aside, but I found FreeCAD to be a real pain. The dependency tracking across sketches is really quite horrid. If I have sketch2 linked to sketch1, and I delete a line in sketch1, it will arbitrarily reassign all the sketch2->sketch1 dependencies. Maybe they fixed that since I've used it, but I've transferred over to Onshape for all my hobby stuff...
EDIT: looks like they finally addressed the topological naming problem, I guess I better give it a second chance!
I'm not sure, but I think it may have been that Adobe Viewer (or whatever it was) could run Flash?
Maybe, but PDF can contain Flash Applets, too.
However, modern versions of Acrobat Reader do not support that anymore. https://helpx.adobe.com/acrobat/kb/flash-format-support-in-p...:
“Flash Player end-of-life (EOL) impacts playback and authoring of rich media having Flash content (.flv and .swf) in PDFs:
• Playback of Flash media (.flv and .swf) content in existing PDFs will not be supported.”
I printed it but it doesn't work :(
Oops. I realized now, unknown PDFs are not safe.