Comments Page - The smallest Hello World program

The smallest Hello World program (blog.lohr.dev), submitted by michidk 8 days ago

bd01 5 days ago
This is pretty bad. Let's start with the very first instruction:
```
  mov rax, 1
```
An actual "mov rax, 1" would assemble to 48 B8 01 00 00 00 00 00 00 00, a whopping TEN bytes.
nasm will optimize this to the equivalent "mov eax, 1", that's 5 bytes, but still:
```
  xor eax, eax ; 2 bytes
  inc eax      ; 2 bytes
```
would be smaller. Second line:
```
  mov rdi, 1
```
You already have the value 1 in eax, so a "mov edi, eax" (two bytes) would suffice. Etc. etc.
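To make the byte counts in this comment and its replies concrete, here is a small Python sketch that spells out the machine code for a few ways of getting 1 into rax and measures each one. The bytes are hand-written from the x86-64 opcode map, not emitted by nasm, and the `ENCODINGS` table and `sizes` helper are names made up for this illustration.

```python
import struct

# Hand-assembled x86-64 encodings for "put 1 in rax".
# Assumption: these bytes are written out by hand, not produced by an assembler.
ENCODINGS = {
    "mov rax, 1 (imm64)":    b"\x48\xB8" + struct.pack("<Q", 1),  # REX.W + B8 + 8-byte immediate
    "mov eax, 1":            b"\xB8" + struct.pack("<I", 1),      # B8 + 4-byte immediate
    "xor eax, eax; inc eax": b"\x31\xC0" + b"\xFF\xC0",           # 2 + 2 bytes in 64-bit mode
    "push 1; pop rax":       b"\x6A\x01" + b"\x58",               # 2 + 1 bytes
}

def sizes():
    """Return {description: length in bytes} for each encoding."""
    return {name: len(code) for name, code in ENCODINGS.items()}

if __name__ == "__main__":
    for name, n in sizes().items():
        print(f"{n:2d} bytes  {name}")
```

The 32-bit forms are enough here because, in 64-bit mode, a write to eax zeroes the upper half of rax.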
- xpasky 5 days ago
      push 1
      pop rax
  is even shorter (credit: https://old.reddit.com/r/programming/comments/q6mnz1/what_is...)
  musicale 5 days ago
  I feel like I shouldn't love x86 encoding, but there is something charming about this. Probably echoing its 8-bit predecessors. It seems like it's designed for tiny memory environments (embedded, bootstrapping, etc.) where you don't mind taking a hit for memory access.
- michidk 4 days ago
  I was able to shave off one additional byte with this:
      ...
      xor rax, rax          ; = 0
      inc rax               ; = 1 - syscall: sys_write
      mov rdi, rax          ; copy 1 - file descriptor: stdout
      lea rsi, [rel msg]    ; pointer to message
      mov rdx, 14           ; message length
      syscall
      ...

      $ nasm -f bin -o elf elf.asm; wc -c elf; ./elf
      166 elf
      Hello, World!
  So I guess NASM already optimizes this quite well
  However, using the stack-based instructions as xpasky hinted at:
      ...
      push 1                ; syscall: sys_write
      pop rax
      pop rdi               ; copy 1 - file descriptor: stdout
      lea rsi, [rel msg]    ; pointer to message
      push 14               ; message length
      pop rdx
      syscall
      ...
  I get down to 159 bytes! I updated the article to reflect that
  bd01 4 days ago
  That second snippet is pretty funny:
      push 1
      pop rax
      pop rdi
  You can't push a value once and pop it twice, that's not how a stack works! You're popping something else off the stack. So why does this even work?
  Linux passes your program arguments on the stack, with argc on top. So when you don't pass any arguments, argc just HAPPENS to be 1. Which you then pop into rdi. Gross!
  michidk 3 days ago
  Of course - you are completely right, an oversight in wanting to correct my mistake as quickly as possible.
  With that fixed, is there any reason not to use push here?
  bd01 3 days ago
  Yes, because:
      push 1       ; 6A 01 (2 bytes)
      pop rdi      ; 5F (1 byte)
  is longer than a simple:
  mov edi, eax ; 89 C7 (2 bytes)
  michidk 2 days ago
  I think your statement might only apply to 32 bit (one of the constraints mentioned early in the blog post was 64 bit).
  But even if it was 32 bit, then we wouldn't have to copy a 1, since the syscall number for sys_write would be 4 instead of 1.
  I get the same total size with both variants in 64 bit mode.
      push 1
      pop rax
      mov rdi, rax
  Assembling to 48 89 C7 (3 bytes)
  seems to be same in size as
      push 1
      pop rax
      push 1
      pop rdi
  Assembling to 6A 01 5F (3 bytes)
  bd01 2 days ago
  That's because you're using `mov rdi, rax` again. You keep changing `edi, eax` to `rdi, rax`. Why?
  The default operand size in 64-bit mode is, for most instructions, still 32 bits. So `mov edi, eax` encodes the same in 32- and 64-bit mode.
  For `mov rdi, rax` you need an extra REX prefix byte [1], that's the 48 you're seeing above, but you don't need it here.
  [1] https://wiki.osdev.org/X86-64_Instruction_Encoding#REX_prefi...
  michidk 2 days ago
  okay, I didn't know that, thanks for the background. I wonder why the assembler would not optimize this though.
  I noticed that I could then also shave off one more byte by using lea esi, [rel msg] instead of lea rsi, [rel msg].
  michidk 3 days ago
  should be
      ...
      push 1       ; syscall: sys_write
      pop rax
      push 1
      pop rdi
  of course
- rep_lodsb 5 days ago
  Linux initializes all general purpose registers to zero. It's not documented AFAIK, but should be reliable - it has to init them to some value anyway to avoid leaking kernel state. So you can get away with:
      mov al,1         ;write
      mov edi,eax      ;handle=stdout
      mov esi,msg      ;assumes load address below 4G
      mov dl,msg.len
      syscall
      mov al,60        ;assuming syscall succeeded, EAX was bytes written
      xor edi,edi
      syscall
  The load address stays constant unless there's some magic GNU extension header to enable ASLR. If we could get the code loaded below 64K, we could save another byte by using SI instead of ESI; however this doesn't work by default, you'd have to run 'echo 0 > /proc/sys/vm/mmap_min_addr' as root first.
  bd01 5 days ago
  Initial register state is documented to be undefined except for rbp, rsp and rdx [1].
  Can you say for certain that no other Linux version ever used GPRs to pass something else?
  [1] System V ABI, page 29 (last line) and 30, https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf
  rep_lodsb 5 days ago
  For certain? No, but I wouldn't expect it. Not sure what that function pointer in rdx is intended for, but Linux doesn't use it.
  (Note for pedants: rsp is technically a "general purpose register", but of course it is initialized to point to the userspace stack instead of zero.)
  retrac 5 days ago
  Assuming it is initially zero
  inc eax
  is a byte shorter than mov al, 1
  rep_lodsb 5 days ago
  Yes, but only in 32 bit mode. Not that it matters, except for the hypothetical future processor or Linux kernel that is no longer compatible with that :)
- michidk 5 days ago
  Thanks, that makes total sense. I was so focused on the ELF part that I didn't even consider optimizing the initial assembly further. Will fix it and edit the article.
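Two encoding points from this thread, the REX prefix on `mov rdi, rax` and `inc eax` being a single byte only in 32-bit mode, can be sketched in a few lines of Python. This is a hand-rolled illustration limited to the eight legacy registers; `mov_reg_reg` and `inc_eax` are hypothetical helper names, not part of any assembler.

```python
# Register numbers shared by the 32-bit and 64-bit names (eax/rax = 0, edi/rdi = 7, ...).
REG = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}

def mov_reg_reg(dst, src, wide=False):
    """Encode `mov dst, src` with opcode 89 /r for the low eight registers.

    wide=False -> 32-bit operands (e.g. edi, eax).
    wide=True  -> 64-bit operands (e.g. rdi, rax), which need a REX.W prefix (0x48).
    """
    modrm = 0xC0 | (REG[src] << 3) | REG[dst]  # mod=11 (register direct), reg=src, rm=dst
    prefix = b"\x48" if wide else b""
    return prefix + bytes([0x89, modrm])

def inc_eax(mode_bits):
    """Encode `inc eax` for 32-bit or 64-bit mode."""
    if mode_bits == 32:
        return b"\x40"      # one-byte form 40+r
    return b"\xFF\xC0"      # in 64-bit mode 40..4F are REX prefixes, so FF /0 is used

if __name__ == "__main__":
    print(mov_reg_reg("di", "ax").hex(" "))             # mov edi, eax
    print(mov_reg_reg("di", "ax", wide=True).hex(" "))  # mov rdi, rax
    print(inc_eax(32).hex(" "), "|", inc_eax(64).hex(" "))
```

In 64-bit mode the two-byte `inc eax` is the same length as `mov al, 1` (B0 01), which is why it only wins in 32-bit mode.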
Tepix 5 days ago
Here's a tiny DOS COM file that does it in 18 bytes:
```
    ;; 18 bytes
    DB 'HELLO_WOIY<$'  ; executes as machine code, returning SP to original position without overwriting return address
    
    mov  dx, si    ; mov dx,0100h MS-DOS (all versions), FreeDOS 1.0, many other DOSes
    xchg ax, bp    ; mov ah,9     MS-DOS 4.0 and later, and FreeDOS 1.0
    int  21h
    ret
```
(credits: https://stackoverflow.com/questions/72635031/assembly-hello-...)
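As a rough illustration of how the string above can double as machine code, here is a Python sketch that decodes its bytes as 16-bit x86 and tracks the net change to SP. It only covers the handful of opcodes that occur in this string. The `decode` helper is made up for this example, and the `TAIL` bytes are one plausible hand encoding of the four instructions (an assembler may pick 8B D6 instead of 89 F2 for `mov dx, si`).

```python
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]

def decode(code):
    """Decode a tiny subset of 16-bit x86: inc/dec/push/pop r16 and cmp al, imm8.

    Returns (list of mnemonics, net change of SP in bytes).
    """
    mnemonics, sp_delta, i = [], 0, 0
    while i < len(code):
        op = code[i]
        reg = REG16[op & 7]
        if 0x40 <= op <= 0x47:
            mnemonics.append(f"inc {reg}")
            if reg == "sp":
                sp_delta += 1
        elif 0x48 <= op <= 0x4F:
            mnemonics.append(f"dec {reg}")
            if reg == "sp":
                sp_delta -= 1
        elif 0x50 <= op <= 0x57:
            mnemonics.append(f"push {reg}")
            sp_delta -= 2
        elif 0x58 <= op <= 0x5F:
            mnemonics.append(f"pop {reg}")
            sp_delta += 2
        elif op == 0x3C:
            i += 1  # consume the 8-bit immediate
            mnemonics.append(f"cmp al, {code[i]:#04x}")
        else:
            raise ValueError(f"opcode {op:#04x} is not covered by this sketch")
        i += 1
    return mnemonics, sp_delta

MESSAGE = b"HELLO_WOIY<$"
# Assumed encoding of: mov dx, si / xchg ax, bp / int 21h / ret
TAIL = bytes([0x89, 0xF2, 0x95, 0xCD, 0x21, 0xC3])

if __name__ == "__main__":
    ops, delta = decode(MESSAGE)
    print("; ".join(ops))
    print("net SP change:", delta, "| total size:", len(MESSAGE) + len(TAIL))
```

The two `dec sp` instructions are balanced by the pop/push/pop sequence, so SP ends where it started before the real code runs.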
- rep_lodsb 5 days ago
  Well, it prints something that is the same length as the correct message at least.
- musicale 5 days ago
  COM files for CP/M and DOS really are a no-nonsense executable format.
  I'm a bit disappointed that Linux (or BSD, macOS, etc.) doesn't support them (or similar) out of the box, though Windows will sort of run them via ntvdm.
smokel 5 days ago
My favorite language for implementing short Hello World programs in is HQ9+ [1].
Joking aside, this page [2] used to be a great tutorial on writing small ELF binaries, but I'm not sure whether it will still work in 64-bit land. It proved very helpful for writing a 4K intro back in 1999.
[1] https://esolangs.org/wiki/HQ9%2B
[2] https://www.muppetlabs.com/~breadbox/software/tiny/teensy.ht...
mrfinn 5 days ago
These challenges are funny - they remind me of the old days. Back in the DOS/Windows days, we used to have the .com format, which was perfect for tiny programs. One could even write a program of less than 10 bytes that could actually do something!
We've come a long way since then, and it's like, at some point, nobody cared about optimizing executable size anymore.
- secondcoming 5 days ago
  People in the embedded space care. Symbian OS was compiled for small size; only certain parts, such as the JPEG decoder, were allowed to use -O3.
- theandrewbailey 5 days ago
  Some people care about executable size, (mostly) everyone else ships Electron apps.
  ramon156 5 days ago
  Tell it to my boss, he wants his app last week.
- hinkley 5 days ago
  I learned to write COM programs at some point but quickly unlearned it. There were some spots where you can use them and not .bat files, but outside of that it’s a lot.
- smokel 5 days ago
      debug
      -a 100
      178A:0100 int 19
      178A:0102
      -r cx
      CX 0000
      :2
      -n reboot.com
      -w
      Writing 00002 bytes
      -q
  dim13 4 days ago
  Some more:
  Quick'n'dirty:
      .model small
      .code
      org 100h
      start:
          int 19h          ; Bootstrap loader
      end start
  More "correct":
      .model small
      .code
      org 100h
      start:
          db 0EAh          ; Jump to Power On Self Test - Cold Boot
          dw 0,0FFFFh
      end start
  Even more "correct":
      .model small
      .code
      org 100h
      start:
          mov ah,0Dh
          int 21h          ; DOS Services ah=function 0Dh
                           ; flush disk buffers to disk
          sti              ; Enable interrupts
          hlt              ; Halt processor
          mov al,0FEh
          out 64h,al       ; port 64h, kybd cntrlr functn
                           ; al = 0FEh, pulse CPU reset
      end start
  mrfinn 4 days ago
  Great example, a two-byte reboot utility. From the times when we could turn off the computer with a push of a button without fearing a global catastrophe...
- xpasky 5 days ago
  JMP FFFF:0000
  mrfinn 5 days ago
  INT 13h... uff chills
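For reference, the byte images behind the two shortest reboot variants in this thread can be written down directly. This is a minimal Python sketch under the assumption that a COM file is just the raw bytes loaded at offset 100h; the names `INT_19H` and `JMP_FAR_RESET` are made up for illustration.

```python
import struct

# int 19h: opcode CD followed by the interrupt number.
INT_19H = bytes([0xCD, 0x19])

# jmp FFFF:0000 as a far jump: opcode EA, then the 16-bit offset, then the 16-bit segment
# (little-endian), matching `db 0EAh` / `dw 0,0FFFFh`.
JMP_FAR_RESET = struct.pack("<BHH", 0xEA, 0x0000, 0xFFFF)

if __name__ == "__main__":
    for name, image in (("int 19h", INT_19H), ("jmp FFFF:0000", JMP_FAR_RESET)):
        print(f"{name:14s} {len(image)} bytes  {image.hex(' ')}")
```

The two bytes of `INT_19H` correspond to the "Writing 00002 bytes" line in the debug session.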
5- 5 days ago
here's an 80 byte x86_64 linux 'hello world' (okay, not 'Hello world!'). convert to binary with xxd -r -p:
```
  7f454c46488d3537000000ffc7b20eeb03003e00
  b001eb1a01000000050000001800000000000000
  1800000005000000b03c0f05ebfa380001006865
  6c6c0000010068656c6c00006f20776f726c640a
```
i'm sure this can be improved -- but i could never get any x86_64 linux elf to under 80 bytes. see if you can fit the exclamation point still.
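As a quick sanity check on the dump above, this Python sketch turns the hex into bytes (the same job as `xxd -r -p`) and unpacks a few ELF64 header fields at their standard offsets. It only reads fields and does not try to run or validate the binary; `parse_elf64_header` is a hypothetical helper written for this example.

```python
import struct

HEX_DUMP = """
7f454c46488d3537000000ffc7b20eeb03003e00
b001eb1a01000000050000001800000000000000
1800000005000000b03c0f05ebfa380001006865
6c6c0000010068656c6c00006f20776f726c640a
"""

def parse_elf64_header(data):
    """Unpack a few little-endian ELF64 header fields from raw bytes."""
    e_type, e_machine = struct.unpack_from("<HH", data, 16)
    e_entry, e_phoff = struct.unpack_from("<QQ", data, 24)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    return {
        "e_type": e_type,
        "e_machine": e_machine,
        "e_entry": e_entry,
        "e_phoff": e_phoff,
        "e_phentsize": e_phentsize,
        "e_phnum": e_phnum,
    }

# Strip all whitespace before decoding so this works on any Python 3 version.
data = bytes.fromhex("".join(HEX_DUMP.split()))
header = parse_elf64_header(data)

if __name__ == "__main__":
    print(len(data), "bytes, magic", data[:4])
    for field, value in header.items():
        print(f"{field:12s} {value:#x}")
```

Since `e_phoff` is 24 and a regular ELF64 header is 64 bytes long, the single program header sits inside the ELF header itself, which is how code and data end up packed into header fields.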
- michidk 4 days ago
  Yeah I thought sth like this is possible, but (correct me if I'm wrong) this (ab)uses the ELF header and puts data in there, which goes against my requirement
  > It should be a ‘proper‘ executable binary according to the spec
  5- 4 days ago
  yes, this one conforms to 'whatever linux agrees to exec(2)', which apparently is a lot that is out of spec.
whynotmaybe 5 days ago
Could a script be a program?
Because it would be much smaller as a bat file that contains:
echo Hello World!
- theandrewbailey 5 days ago
  For the purposes of this challenge, no.
  > Let’s first establish some rules for our ‘Hello World’ program:
  > It should be able to execute directly without passing to any other programs first (so no decompression)
  > It should be a ‘proper‘ executable binary according to the spec
- hinkley 5 days ago
  Include the shebang but it’s still crazy how big minimal programs are.
- __m 4 days ago
  php would be just
  Hello World
gr33kdude 5 days ago
Linking a similar, very popular past example of this: Teensy: https://www.muppetlabs.com/~breadbox/software/tiny/teensy.ht...
- BergAndCo 4 days ago
  Thank you, I knew I had read somewhere someone put the program in the ELF header itself and got it down to 45 bytes, this is that exact post.
musicale 5 days ago
I realize TFA is trying for object code, but for source code, QuickBASIC (and its successors) isn't bad:
```
    ? "hello, world!"
```
PILOT eliminates the quotes:
```
    T:hello, world!
```
Of course a typical REPL (Python, JavaScript, Lisp, etc.) will print out something similar (but often quoted) if you just type the quoted string.
And I'm sure there is already some language (call it HELLO) which simply prints "hello, world!" for an empty program.
- charlieyu1 5 days ago
  There are probably some golfing languages out there where an empty program outputs Hello World.
  musicale 5 days ago
  I'm certain there is, but I don't have a reference for it yet other than my imaginary HELLO (Highly Efficient Limited Line Output) language.
fjfaase 4 days ago
Would it be fair to name the program 'Hello World!' and then use argv[0], which is on the stack, to print out 'Hello World!'?
- michidk 4 days ago
  Hehe a nice idea. But then it would not always print Hello World and you cannot execute it directly.
xpasky 5 days ago
Now, can we make it even smaller applying https://nathanotterness.com/2021/10/tiny_elf_modernized.html ? We shouldn't need the full ELF header...
- xpasky 5 days ago
  Oh it has been done: https://nathanotterness.com/2021/10/hello_105.asm
oneshtein 5 days ago
The smallest "hello, world" programs in Rust I did for Arduino are 294 bytes for blink and 388 bytes for hw.
Multicomp 5 days ago
129 bytes...supposedly that's 2 punch cards according to Dr gpt. Small program!