=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
title: A Polyglot Hello World Has Appeared!
date: 2023-12-30 00:00:00
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

You ever have an interesting problem drop in your lap that you just can't let go of? I was goofing off on the Internet last night and was completely nerd sniped [0] by the phrase "polyglot hello world."

In a nutshell, a polyglot hello world is a "Hello World" program that is syntactically correct in as many programming languages as possible. For example, something that can be executed as both C and Python.

This concept wormed itself into the deep recesses of my brain and, quite literally, broke me for the next 6 hours. After being exposed to the concept, I deliberately closed the article I was reading because I wanted to see what I could come up with on my own, and after a ton of trial and error, this is what I came up with:

```
#include <stdio.h>
#define print(a) int main(){puts(a);return 0;}
#if 0
###
#=
"""echo" "Hello World!";"exit";puts "Hello World!";
__END__
print("Hello World!");
###
console.log 'Hello World!'
process.exit()
# =#
print("Hello World!\n");
exit();
"""#"""
#endif
print("Hello World!");
```

Looks like a fucking mess, right? Well, what you're reading is a block of code that can be successfully executed (or compiled) in C, C++, and Objective-C (which isn't very impressive), but _also_ Ruby, Python, Julia, CoffeeScript, _and_ Bash.

Impressed yet?

## C/C++/Objective-C

To craft this beautiful monstrosity, I had to employ a number of different tricks. The most important one is that a few of these languages (C, C++, and Objective-C) use the `#` character in a meaningful way, while the others treat it as a comment character. That lets us take advantage of C's preprocessor and craft a macro that looks and is called exactly like Python's `print()`.

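If you'd rather poke at that idea without firing up a compiler, here's a rough Python sketch of what the preprocessor does to this one file. To be clear, `toy_preprocess` is a made-up helper rather than a real tool, it only understands the three tricks used here, and it assumes the first line of the source reads `#include <stdio.h>`.

```python
import re

# The polyglot source, assuming its first line is `#include <stdio.h>`.
POLYGLOT = r'''#include <stdio.h>
#define print(a) int main(){puts(a);return 0;}
#if 0
###
#=
"""echo" "Hello World!";"exit";puts "Hello World!";
__END__
print("Hello World!");
###
console.log 'Hello World!'
process.exit()
# =#
print("Hello World!\n");
exit();
"""#"""
#endif
print("Hello World!");
'''


def toy_preprocess(source):
    """Hypothetical, heavily simplified stand-in for the C preprocessor.

    Handles only what this one file needs: an `#if 0` ... `#endif` block
    and the single `print(a)` macro. A real preprocessor does far more.
    """
    kept = []
    skipping = False
    for line in source.splitlines():
        if line.startswith("#if 0"):
            skipping = True        # everything until #endif is dead code
        elif line.startswith("#endif"):
            skipping = False       # back to live code
        elif skipping or line.startswith("#define"):
            continue               # drop dead code and the macro definition
        elif line.startswith("#include"):
            kept.append(line)      # left as-is; the real thing inlines the header
        else:
            # Expand print(x) the way the #define says to.
            kept.append(
                re.sub(r"print\((.*?)\)", r"int main(){puts(\1);return 0;}", line)
            )
    return "\n".join(kept)


print(toy_preprocess(POLYGLOT))
```

Running it leaves just two lines: the `#include`, and `int main(){puts("Hello World!");return 0;};` (the stray trailing `;` is left over from the original `print(...)` line).
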
To make things more readable, here is what this source code looks like when syntax highlighted as a C program:

```c++
#include <stdio.h>
#define print(a) int main(){puts(a);return 0;}
#if 0
###
#=
"""echo" "Hello World!";"exit";puts "Hello World!";
__END__
print("Hello World!");
###
console.log 'Hello World!'
process.exit()
# =#
print("Hello World!\n");
exit();
"""#"""
#endif
print("Hello World!");
```

As you can see, it boils down to just a few lines of _actual_ code and some old-school compiler tricks:

```c
#include <stdio.h>
#define print(a) int main(){puts(a);return 0;}
```

What the above block of code does is simply define a macro called `print()` that, when seen by the preprocessor, gets rewritten as the following (with `a` swapped out for whatever argument was passed in):

```
int main() {
    puts(a);
    return 0;
}
```

Additionally, C has an unofficial "multi-line comment" system using the preprocessor directives, allowing you to effectively wrap a block of code in an `if FALSE` conditional, which you can see here:

```c
#if 0
###
#=
"""echo" "Hello World!";"exit";puts "Hello World!";
__END__
print("Hello World!");
###
console.log 'Hello World!'
process.exit()
# =#
print("Hello World!\n");
exit();
"""#"""
#endif
```

So what the C-based compilers ultimately see when they are executed is code that looks like this:

```c
#include <stdio.h>
int main() {
    puts("Hello World!");
    return 0;
}
```

Pretty cool, right?

## Bash

Alright, let's move on to the Bash script. With our syntax highlighter flipped to "Shell," you can see that the first five lines are just comments:

```sh
#include <stdio.h>
#define print(a) int main(){puts(a);return 0;}
#if 0
###
#=
"""echo" "Hello World!";"exit";puts "Hello World!";
__END__
print("Hello World!");
###
console.log 'Hello World!'
process.exit()
# =#
print("Hello World!\n");
exit();
"""#"""
#endif
print("Hello World!");
```

The actual first line of code the shell script runs is the only one that matters, namely:

```sh
"""echo" "Hello World!";"exit";puts "Hello World!";
```

So what's happening here? Well, Bash scripts are _effectively_ just a collection of commands you'd run in the terminal, which are themselves just typed strings, and semicolons are equivalent to hitting the `return` key. To Bash, `"""echo"` is nothing more than an empty string (`""`) glued onto `"echo"`, which leaves the plain word `echo`. So let's remove the extraneous quotes and get a better idea of what's getting executed:

```sh
echo "Hello World!"
exit
puts "Hello World!";
```

Easier to read? I thought so too. But what's with that `puts` at the bottom? That doesn't get executed, because the `exit` command stops the script immediately after echoing "Hello World!" So, in reality, our script ends up being just the following code:

```sh
echo "Hello World!"
exit
```

## Ruby

Still curious about that `puts`, though? Well, that's one of the only pieces of code that Ruby cares about. If we switch our syntax highlighter up again, you can see what our Ruby interpreter sees:

```ruby
#include <stdio.h>
#define print(a) int main(){puts(a);return 0;}
#if 0
###
#=
"""echo" "Hello World!";"exit";puts "Hello World!";
__END__
print("Hello World!");
###
console.log 'Hello World!'
process.exit()
# =#
print("Hello World!\n");
exit();
"""#"""
#endif
print("Hello World!");
```

Again, there are a few special quirks we're taking advantage of. First, the `#` lines are comments in Ruby, so those get ignored. Secondly, the `__END__` directive tells the Ruby interpreter to stop reading code: everything after it is treated as data, which makes it effectively one big comment block (some syntax highlighters do this well, while others don't). So, with that in mind, the Ruby script looks more like this:

```ruby
"""echo" "Hello World!";"exit";puts "Hello World!";
```

Just like in the other languages, semicolons act as statement separators, so a different way to look at it is this:

```ruby
"""echo" "Hello World!"
"exit"
puts "Hello World!"
```

Clear as mud, right? Keep in mind that you can drop bare string literals into any Ruby script like this, and if they don't get assigned to a variable or passed to anything, then they're effectively just ignored.

So what is _actually_ happening is this:

```ruby
puts "Hello World!"
```

Et voila! Now we've got Ruby!

## Python

Next on our list is Python:

```python
#include <stdio.h>
#define print(a) int main(){puts(a);return 0;}
#if 0
###
#=
"""echo" "Hello World!";"exit";puts "Hello World!";
__END__
print("Hello World!");
###
console.log 'Hello World!'
process.exit()
# =#
print("Hello World!\n");
exit();
"""#"""
#endif
print("Hello World!");
```

Just like the others, Python uses the `#` character to denote single-line comments, but what makes Python stand out is the `"""` sequence. It opens a multi-line string, and a string that isn't assigned to anything works as a multi-line comment. Combined with Bash and Ruby's relaxed handling of back-to-back strings, we can take advantage of it by throwing in a few "empty" or "unnecessary" strings:

```
"""echo" "Hello World!";"exit";puts "Hello World!";
```

Python interprets the above as the start of a block comment thanks to the three quote characters at the beginning, and that comment runs all the way down to the `"""` on the `"""#"""` line. So thanks to that hacky reality and our single-line comments, what's really getting seen by the Python interpreter is this:

```python
print("Hello World!");
```

## Julia

So once I knocked Python, Ruby, Bash, and the C-based languages off of my list, I started to find ways I could stretch a bit further. I've never written a line of Julia in my life. To be honest, I couldn't even tell you what Julia is normally _used_ for, but it landed on my radar as a candidate because of two criteria I was able to land on:

1. The language must support `#`-based single-line comments.
2. The language must have a _unique_ way to denote multi-line comments _without interfering in the other languages_.

Julia was the first language I found that matched both of those criteria perfectly:

```julia
#include <stdio.h>
#define print(a) int main(){puts(a);return 0;}
#if 0
###
#=
"""echo" "Hello World!";"exit";puts "Hello World!";
__END__
print("Hello World!");
###
console.log 'Hello World!'
process.exit()
# =#
print("Hello World!\n");
exit();
"""#"""
#endif
print("Hello World!");
```

For Julia, the multi-line comments look like this:

```julia
#=
THIS IS A MULTILINE COMMENT
=#
```

That equals-sign at the close of the multi-line comments _almost_ got in the way, but I discovered that adding a `# ` in front of it allowed it to be ignored by the other interpreters while still being properly understood by Julia.

Gross.

Also, like many of the other languages, Julia has a `print()` function that we can use that _won't_ interfere with the other interpreters. But because I was shooting for consistent output ("Hello World" followed by a newline), I couldn't just have it fall back to the Python/C `print()` call at the bottom, since Julia's `print()` doesn't add a newline on its own. It needed to _print a newline_! Which means I both _had to_ and _got to_ use Julia's multi-line comments in order to both print our "Hello World" and then `exit()` early. This required the Julia code to happen _after_ all the other code, but _before_ the Python block comments closed, ultimately leaving us with the following Julia code:

```julia
print("Hello World!\n");
exit();
```

## CoffeeScript

And finally, we've hit CoffeeScript:

```coffeescript
#include <stdio.h>
#define print(a) int main(){puts(a);return 0;}
#if 0
###
#=
"""echo" "Hello World!";"exit";puts "Hello World!";
__END__
print("Hello World!");
###
console.log 'Hello World!'
process.exit()
# =#
print("Hello World!\n");
exit();
"""#"""
#endif
print("Hello World!");
```

Another language I've never actually written, CoffeeScript is very JavaScript-esque, except that it supports (you guessed it) `#`-based comments instead of C-style `//` and `/*` comments. It also matched my previous criteria, and CoffeeScript's multi-line comment syntax is a pretty simple (and easily ignorable by other languages) `###` tag to start and end.

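Those two comment rules are simple enough to model in a few lines of Python. This is only a sketch: `toy_coffee_view` is a hypothetical helper that knows nothing about real CoffeeScript lexing, and it assumes the first line of the source reads `#include <stdio.h>`. All it does is throw away `###` blocks and any line starting with `#`, then show what's left.

```python
# The polyglot source, assuming its first line is `#include <stdio.h>`.
POLYGLOT = r'''#include <stdio.h>
#define print(a) int main(){puts(a);return 0;}
#if 0
###
#=
"""echo" "Hello World!";"exit";puts "Hello World!";
__END__
print("Hello World!");
###
console.log 'Hello World!'
process.exit()
# =#
print("Hello World!\n");
exit();
"""#"""
#endif
print("Hello World!");
'''


def toy_coffee_view(source):
    """Hypothetical, simplified model of CoffeeScript's comment handling.

    Only two rules are modeled: a line that is exactly `###` opens or
    closes a block comment, and any other line starting with `#` is a
    single-line comment. The real lexer is more involved than this.
    """
    visible = []
    in_block = False
    for line in source.splitlines():
        if line.strip() == "###":
            in_block = not in_block   # toggle the block comment
        elif in_block or line.startswith("#"):
            continue                  # inside a block, or a one-line comment
        else:
            visible.append(line)
    return visible


for line in toy_coffee_view(POLYGLOT):
    print(line)
```

Six lines survive. The first two are `console.log 'Hello World!'` and `process.exit()`, and the remaining four are leftovers that belong to Julia, Python, and C.
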
The other advantage CoffeeScript has (that is shared amongst most of these languages) is that it didn't seem to check the syntax of any of the code _after exiting_, which means that invalid functions can show up after `process.exit()` is called, and the CoffeeScript interpreter couldn't care less. This let us shoehorn our CoffeeScript "Hello World" in just before the Julia code, like so:

```coffeescript
console.log 'Hello World!'
process.exit()
```

## Fin.

Not gonna lie, this monstrosity of code is both the most beautiful thing I've ever written, and the most disturbing. Seriously, once I got the challenge into my head I couldn't get it out. I _dreamt_ about it last night, when I was actually able to sleep.

I'd love to find a way to expand on this with other languages, but I'm not sure I have the emotional fortitude for it at this point.

I'm going to go take a nap.

---

[0]: https://xkcd.com/356/

---

>> This is post 013 of #100DaysToOffload

EOF