
I stumbled on a post where someone asked how a single variable could hold three different values, apparently simultaneously. There were some clever answers in there, but one really caught my eye for the pure havoc it could wreak if someone actually used it in a codebase. 😈
Here's the code Jeff suggested in his answer, which I pasted into JSFiddle:
Before you head over to the thread to check out his answer (and assuming you haven't already figured out what's going on), check out my ports to a few other languages first. I wanted to see if they'd allow it too. Then I'll explain how he did it. 😉
C# / VB.NET / F#
The .NET languages are cool with it...
Ruby
And so is Ruby...
aᅠ = 1;
a = 2;
ᅠa = 3;
puts "a: #{aᅠ}"
puts "a: #{a}"
puts "a: #{ᅠa}"
puts (aᅠ == 1 && a == 2 && ᅠa == 3);
irb(main):001:0> require("./test.rb")
a: 1
a: 2
a: 3
true
=> true
GoLang
Golang too...
package main

import (
	"fmt"
)

func main() {
	var aᅠ, a, ᅠa int = 1, 2, 3
	fmt.Printf("%v\n", aᅠ)
	fmt.Printf("%v\n", a)
	fmt.Printf("%v\n", ᅠa)
	fmt.Println(aᅠ == 1 && a == 2 && ᅠa == 3)
}
// output:
// 1
// 2
// 3
// true
Try it out here: https://play.golang.org/p/a9JQPV5KUoq
Python
Our first hiccup. Python 3 supports it...
aᅠ = 1;
a = 2;
ᅠa = 3;
print(aᅠ == 1 and a == 2 and ᅠa == 3); # True
... but Python 2 tosses you a syntax error. Non-ASCII character? Hmm.
SyntaxError: Non-ASCII character '\xef' in file main.py on line 3, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details
Paste it here to try it out: https://pyfiddle.io
Erlang
Erlang's not happy at all.
-module(test).
-export([run/0]).
run() ->
aᅠ = 1,
a = 2,
ᅠa = 3,
io:format(aᅠ == 1 andalso a == 2 andalso ᅠa == 3).
Illegal character, Will Robinson! 🤖
test.erl:6: illegal character
test.erl:8: illegal character
test.erl:10: illegal character
test.erl:10: illegal character
test.erl:10: syntax error before: '=='
test.erl:3: function run/0 undefined
error
What gives?
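Okay, now you can head over to his answer if you'd like. He took advantage of an invisible Unicode character. It's easy to mistake for the zero-width space (U+200B), but the bytes irb echoes back further down (EF BE A0) are the UTF-8 encoding of U+FFA0, the halfwidth Hangul filler. Unicode classifies it as a letter, so many languages accept it in identifiers, yet it renders as nothing more than a blank. That's right, a character that's invisible to the naked eye. I'm sure it has its uses, but wow, could that be abused. In fact, the Wikipedia article on the zero-width space brings up that it's banned for use in domain names, and it should be pretty obvious why.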
How it's displayed depends on a few things, like the editor you happen to be using. In an editor like Atom or VSCode, it's invisible. When I use it from an iTerm session on macOS, it varies. Here's what happens when I paste it into a Ruby session (where it's very obvious), and into an Erlang shell and a Bash shell (where at least it shows as a blank space).
grantw:test gwinney$ irb
irb(main):001:0> \U+FFEF\U+FFBE\U+FFA0
grantw:test gwinney$ erl
Erlang/OTP 20 [erts-9.3] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]
Eshell V9.3 (abort with ^G)
1> ᅠ
* 1: illegal character
grantw:test gwinney$ ᅠ
-bash: ᅠ: command not found
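For anyone curious exactly which character that is, the bytes irb echoed back (EF BE A0) can be decoded and looked up. This is just a small sketch using Python 3's standard `unicodedata` module; the `describe` helper is my own name for illustration, not anything from the original answer.

```python
import unicodedata

# UTF-8 bytes that irb echoed back for the pasted character (EF BE A0).
pasted = bytes([0xEF, 0xBE, 0xA0]).decode("utf-8")


def describe(ch):
    """Return code point, Unicode name, and general category for one character."""
    name = unicodedata.name(ch, "<unnamed>")
    return f"U+{ord(ch):04X} {name} ({unicodedata.category(ch)})"


print(describe(pasted))      # U+FFA0 HALFWIDTH HANGUL FILLER (Lo)
print(describe("\u200b"))    # U+200B ZERO WIDTH SPACE (Cf)
print(ascii("a" + pasted))   # 'a\uffa0' (escapes make the trailing character visible)
```

The category is the interesting part: `Lo` means "other letter", which is the kind of character many languages allow in identifiers, while the zero-width space is `Cf`, a format character.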
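It also depends on the language, and whether non-Latin characters are allowed in a variable name. The default source encoding in Python 3 is UTF-8, and Python 3 also allows non-ASCII letters in identifiers, so it works; Python 2 defaults to ASCII, so it threw a syntax error. (Declaring an encoding wouldn't rescue it, either, since Python 2 identifiers are ASCII-only.) The .NET languages are all cool with it out of the box, I guess.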
Erlang rejected the character, but there might be a way to allow it by setting the default encoding. They have quite a bit of documentation on Unicode support, but heck if I can figure it out.
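If you'd rather catch this trick than play it, a small scanner can flag invisible characters before they land in a codebase. What follows is only a rough sketch: the `SUSPECTS` set and the `find_invisible` function are names I made up, and the list of code points is illustrative, nowhere near exhaustive.

```python
import unicodedata

# Illustrative, NOT exhaustive: a few code points that render as nothing
# or as blank space. This list is an assumption made for the sketch.
SUSPECTS = {
    "\u115f",  # HANGUL CHOSEONG FILLER
    "\u1160",  # HANGUL JUNGSEONG FILLER
    "\u3164",  # HANGUL FILLER
    "\uffa0",  # HALFWIDTH HANGUL FILLER
    "\u200b",  # ZERO WIDTH SPACE
    "\u200c",  # ZERO WIDTH NON-JOINER
    "\u200d",  # ZERO WIDTH JOINER
    "\u2060",  # WORD JOINER
    "\ufeff",  # ZERO WIDTH NO-BREAK SPACE (byte order mark)
}


def find_invisible(source):
    """Return (line, column, description) for each suspect character, 1-based."""
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in SUSPECTS:
                name = unicodedata.name(ch, "<unnamed>")
                hits.append((line_no, col, f"U+{ord(ch):04X} {name}"))
    return hits


# The three "a" variables from the examples, written with escapes so the
# filler character is visible in this listing.
FILLER = "\uffa0"
sample = f"a{FILLER} = 1\na = 2\n{FILLER}a = 3\n"

for line_no, col, what in find_invisible(sample):
    print(f"line {line_no}, col {col}: {what}")
# line 1, col 2: U+FFA0 HALFWIDTH HANGUL FILLER
# line 3, col 1: U+FFA0 HALFWIDTH HANGUL FILLER
```

Something like this could run as a pre-commit check; a real one would probably want a broader character list than the handful hard-coded here.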
This was all just in fun. I had heard of invisible characters like the zero-width space before, but hadn't thought about the implications of using one this way. If you're looking for a way to make your team hate you, have fun. 😜