Kyli Rouge
.JS file
Compile to generic JavaScript, runnable by anything that can interpret it
- Whitespace is reduced as much as possible (replace /[\s^\r^\n]+/gim with a single space)
- Whitespace in class, function, and variable names is replaced with underscores, rather than opting for camel-case (Hello World becomes Hello_World)
  - Underscores already present in class, function, and variable names are replaced with two underscores (Hello_World becomes Hello__World)
- Punctuation is replaced with ; (replace /[!,\.:\?…‽]+/gim with ;)
- Keywords are replaced with JavaScript keywords and functions
- Binary prefix operators which have a partner infix operator (think add 12 and 2) are converted to a single infix operator (think 12+2)
- Class declarations are stripped of their inheritance, and their ending ; is replaced with a {. The class inheritance is placed after the class by overriding its prototype. Dear PrincessCelestia: HelloWorld; becomes function HelloWorld(){}HelloWorld.prototype=new PrincessCelestia();
- Literals are translated to their JavaScript equivalents
  - Fancy quotes are replaced with plain ones (“Hello, World!” becomes "Hello, World!")
  - Booleans are replaced with true and false
  - "'"'" escapes are replaced with \"
    - /(["”“]['‘’]){2}["”“]/gim becomes \"
    - /["”“]['‘’]["”“]['‘’](?=\s*[!,\.:\?…‽])/gim becomes \""
    - /\s['‘’]["”“]['‘’]["”“]/gim becomes "\"
- The writer's name is added at the top as a documentation comment of the format /** * This report, entitled “Hello World”, was written by * Kyli Rouge on the 21st of January, 2014 */
- If the compiler is told to do so, spacing and formatting are added to make the output readable
- The original source is added as a comment at the top of the program
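The name and punctuation rules above can be sketched in a few lines of JavaScript. This is only an illustration: the helper names mangleName and replacePunctuation are hypothetical, and the sketch assumes punctuation replacement is applied to code outside string literals only. It does not cover other characters that appear in names, such as apostrophes.

```javascript
// Hypothetical helper: turn a FiM++ name into a JavaScript-safe identifier.
function mangleName(name) {
  return name
    .trim()
    .replace(/_/g, "__")   // double any underscores that were already there
    .replace(/\s+/g, "_"); // then turn each run of whitespace into one underscore
}

// Hypothetical helper: collapse each run of FiM++ punctuation into one semicolon.
// Assumption: the caller only passes text that is outside string literals.
function replacePunctuation(code) {
  return code.replace(/[!,.:?…‽]+/g, ";");
}

console.log(mangleName("Hello World")); // Hello_World
console.log(mangleName("Hello_World")); // Hello__World
console.log(replacePunctuation("Today I learned how to say hello world!"));
// Today I learned how to say hello world;
```

Doubling underscores before converting whitespace keeps the two cases distinguishable, so Hello World and Hello_World never collide.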
JavaScript representations of Phrases
- To do
Example
Hello World.FPP (190B)
Dear Princess Celestia: Hello World!
Today I learned how to say hello world!
I said “Hello, World!”!
That's all about how to say hello world.
Your faithful student, Kyli Rouge.
Would compile into one of the following three files, depending on the compiler options specified:
Default
Hello World.JS (416B)
Princess_Celestia = function(){};

/**
 * This report, entitled “Hello World”, was written by
 * Kyli Rouge on the 21st of January, 2014
 */
function Hello_World() {
  this.today = function() {
    this.how_to_say_hello_world();
  };
  this.how_to_say_hello_world = function() {
    console.log("Hello, World!");
  };
}
Hello_World.prototype = new Princess_Celestia();
new Hello_World().today();
Long
Compiler told to keep source
Hello World.long.JS (641B)
/* (Original FiM++ Source)
Dear Princess Celestia: Hello World!
Today I learned how to say hello world!
I said “Hello, World!”!
That's all about how to say hello world.
Your faithful student, Kyli Rouge.
*/

Princess_Celestia = function(){};

/**
 * This report, entitled “Hello World”, was written by
 * Kyli Rouge on the 21st of January, 2014
 */
function Hello_World() {
  this.today = function() {
    this.how_to_say_hello_world();
  };
  this.how_to_say_hello_world = function() {
    console.log("Hello, World!");
  };
}
Hello_World.prototype = new Princess_Celestia();
new Hello_World().today();
Min
Compiler told to minify
Hello World.min.JS (247B)
Princess_Celestia=function(){};function Hello_World(){this.today=function(){this.how_to_say_hello_world()};this.how_to_say_hello_world=function(){console.log("Hello, World!")}}Hello_World.prototype=new Princess_Celestia();new Hello_World().today()
99 Jugs of Cider
Long
99 Jugs of Cider.long.JS (2941B, ~2.87KiB)
/*
Remember when I wrote about Applejack? (I don't!)

Dear Princess Cadence and Shining Armor: Cider Jugs.

Today I learned Applejack's Drinking Song.
Did you know that Applejack likes the number 99? (Applejack likes a lot of things...)
I remembered how to sing the drinking song using Applejack.
That's all about Applejack's Drinking Song!

I learned how to sing the drinking song using the number ciders.
As long as ciders were more than 1:
I sang ciders" jugs of cider on the wall, "ciders" jugs of cider,".
There was one less ciders.
When ciders had more than 1,
I sang "Take one down and pass it around, "ciders" jugs of cider on the wall."!
Otherwise,
I sang "Take one down and pass it around, 1 jug of cider on the wall."!
That's what I would do,
That's what I did.
I sang "1 jug of cider on the wall, 1 jug of cider.
Take it down and pass it around, no more jugs of cider on the wall.
No more jugs of cider on the wall, no more jugs of cider.
Go to the celler, get some more, 99 jugs of cider on the wall.".
That's all about how to sing the Drinking Song!

Your faithful student, Twilight Sparkle.

P.S. Twilight's drunken state truely frightened me, so I couldn't disregard her order to send you this letter. Who would have thought her first reaction to hard cider would be this... explosive? I need your advice, your help, everything, on how to deal with her drunk... self. -Spike
*/
function Applejack() {} /*I don't!*/
function Princess_Cadence() {}
function Shining_Armor() {}
//Dear Princess Cadence and Shining Armor: Cider Jugs.
Cider_Jugs = function () {
  this.today = function() {
    this.Applejacks_Drinking_Song();
  };
  this.Applejacks_Drinking_Song = function() {
    var Applejack = 99; /*Applejack likes a lot of things...*/
    this.how_to_sing_the_drinking_song(Applejack);
  };
  this.how_to_sing_the_drinking_song = function(ciders) {
    while(ciders > 1) {
      console.log(ciders + " jugs of cider on the wall, " + ciders + " jugs of cider,");
      --ciders;
      if(ciders > 1) {
        console.log("Take one down and pass it around, " + ciders + " jugs of cider on the wall.");
      } else {
        console.log("Take one down and pass it around, 1 jug of cider on the wall.");
      }
    }
    console.log("1 jug of cider on the wall, 1 jug of cider.\r\nTake it down and pass it around, no more jugs of cider on the wall.\r\nNo more jugs of cider on the wall, no more jugs of cider.\r\nGo to the celler, get some more, 99 jugs of cider on the wall.");
  };
};
Cider_Jugs.prototype = new Princess_Cadence();
new Cider_Jugs().today();
// Twilight's drunken state truely frightened me, so I couldn't disregard her order to send you this letter. Who would have thought her first reaction to hard cider would be this... explosive? I need your advice, your help, everything, on how to deal with her drunk... self. -Spike
Interpretation steps
- Replace all actual newlines (U+000A, U+000D, or U+000D U+000A) in string literals with \r\n
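A minimal sketch of this step, assuming the compiler has already isolated the body of a string literal. The function name escapeNewlines is hypothetical.

```javascript
// Hypothetical helper: replace every real line break inside a string literal
// with the four-character escape sequence \r\n.
function escapeNewlines(literalBody) {
  // The CR LF pair is listed first so that it becomes one escape, not two.
  return literalBody.replace(/\r\n|\r|\n/g, "\\r\\n");
}

const lyric = "1 jug of cider.\nTake it down and pass it around.";
console.log(escapeNewlines(lyric));
// 1 jug of cider.\r\nTake it down and pass it around.
```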
Feedback
Pros
- Easy to decompile
- Easy to manually compile
- Easy to run (in any browser)
Cons
- Slower than compiled to assembly
- Larger output file than assembly
.FR file
A .FR file (abbreviation for "Friendship Report") is proprietary FiM++ bytecode and must be read and executed by a virtual machine.
- Whitespace and Comments are entirely removed (except for programmer name)
- Keywords are represented by Unicode characters starting at U+0001, which represent their function, not the actual used keyword (any two synonyms are compiled to the same character).
- Binary prefix operators which have a partner infix operator (think add 12 and 2) are converted to a single infix operator (think 12+2)
- Variables, class names, and method names are compiled into hex digits
- surrounded by the Unicode character
- Literals are kept as-is, with any source quotes removed
- Booleans are preceded by
- Numbers are preceded by
- Characters are preceded by
- Strings are surrounded by
- Replace any instance of in character literals or in String literals with U+0000. This is a weak point, as it leaves two characters being represented the same way.
- Punctuation is compiled into 
Unicode representations of Phrases
- (U+0000): CLASS
- (U+0001): END_CLASS
- (U+0002): IMPORT
- (U+0003): IMPLEMENTS
- (U+0004): METHOD
- (U+0005): MANE_METHOD
- (U+0006): END_METHOD
- (U+0007): RETURN_TYPE
- (U+0008): PARAMETERS
- (U+0009): RETURN
- (U+000A): RECALL
- (U+000B): VARIABLE
- (U+000C): BOOL
- (U+000D): BOOL_ARRAY
- (U+000E): CHARACTER
- (U+000F): CHARACTER_ARRAY
- (U+0010): CHARACTER_ARRAY_ARRAY
- (U+0011): NUMBER
- (U+0012): NUMBER_ARRAY
- (U+0013): ASSIGN
- (U+0014): ASSIGN_CONSTANT
- (U+0015): REASSIGN
- (U+0016): IF
- (U+0017): IF_PARTNER_SUF
- (U+0018): END_IF
- (U+0019): ELSE
- (U+001A): END_ELSE
- (U+001B): SWITCH
- (U+001C): CASE
- (U+001D): CASE_PARTNER_POST
- (U+001E): DEFAULT
- (U+001F): WHILE
- ( ): END_WHILE
- (!): DO_WHILE
- ("): END_DO_WHILE
- (#): PRINT
- ($): PROMPT
- (%): READ
- (&): ADD_IN
- ('): ADD_PRE
- ((): ADD_PRE_PARTNER_IN
- ()): DIVIDE_IN
- (*): DIVIDE_PRE
- (+): DIVIDE_PRE_PARTNER_IN
- (,): MULTIPLY_IN
- (-): MULTIPLY_PRE
- (.): MULTIPLY_PRE_PARTNER_IN
- (/): SUBTRACT_IN
- (0): SUBTRACT_PRE
- (1): SUBTRACT_PRE_PARTNER_IN
- (2): DECREMENT
- (3): INCREMENT
- (4): AND
- (5): OR
- (6): XOR
- (7): XOR_PARTNER_IN
- (8): NOT
- (9): EQUAL
- (:): NOT_EQUAL
- (;): GREATER_THAN
- (<): GREATER_THAN_OR_EQUAL
- (=): LESS_THAN
- (>): LESS_THAN_OR_EQUAL
- (?): NOTHING
- (@): TRUE
- (A): FALSE
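As a rough sketch of how synonyms collapse to one character, the lookup below maps source phrases to generic tokens and tokens to their characters. Only a handful of entries are shown. The code points are copied from the list above, the # for PRINT is taken from the worked Hello World example in this section, and the phrase list and all names (PHRASE_TOKENS, TOKEN_CHARS, encodePhrase) are illustrative assumptions rather than part of the proposal.

```javascript
// Token -> character, copied from the list above (subset only).
const TOKEN_CHARS = {
  CLASS: "\u0000",
  END_CLASS: "\u0001",
  MANE_METHOD: "\u0005",
  END_METHOD: "\u0006",
  PRINT: "#", // inferred from the worked example, where # stands in for PRINT
};

// Phrase -> token. Assumption: these phrases are treated as synonyms.
const PHRASE_TOKENS = {
  "I said": "PRINT",
  "I sang": "PRINT",
  "I wrote": "PRINT",
};

// Hypothetical helper: look up the single character a phrase compiles to.
function encodePhrase(phrase) {
  const token = PHRASE_TOKENS[phrase];
  if (token === undefined) {
    throw new Error("unknown phrase: " + phrase);
  }
  return TOKEN_CHARS[token];
}

console.log(encodePhrase("I said") === encodePhrase("I sang")); // true
```

Because the character records the function rather than the wording, a decompiler cannot tell which synonym the author originally wrote.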
Example
Hello World.FPP (190B)
Dear Princess Celestia: Hello World!
Today I learned how to say hello world!
I said “Hello, World!”!
That's all about how to say hello world.
Your faithful student, Kyli Rouge.
Would compile into:
Hello World.FR (93B, 51% compression); linked separately, as Wikia won't allow special characters on the site
Interpretation steps
Hello World.FPP (190B)
- Read in original code:
  Dear Princess Celestia: Hello World!
  Today I learned how to say hello world!
  I said “Hello, World!”!
  That's all about how to say hello world.
  Your faithful student, Kyli Rouge.
- Remove comments (except the special programmer name comment) and unnecessary whitespace:
  Dear Princess Celestia: Hello World! Today I learned how to say hello world! I said “Hello, World!”! That's all about how to say hello world. Your faithful student, Kyli Rouge.
- Replace phrases and punctuation with generics:
  CLASS Princess Celestia; Hello World; MANE_METHOD how to say hello world; PRINT “Hello, World!”; END_METHOD how to say hello world; END_CLASS; Kyli Rouge;
- Surround literals with special Unicode characters, removing quotes:
  CLASS Princess Celestia; Hello World; MANE_METHOD how to say hello world; PRINT Hello, World!; END_METHOD how to say hello world; END_CLASS; Kyli Rouge;
- Replace class, method, and variable names with numbers:
  CLASS 0; 1; MANE_METHOD 2; PRINT Hello, World!; END_METHOD 2; END_CLASS; Kyli Rouge;
- Remove remaining whitespace:
  CLASS0;1;MANE_METHOD2;PRINTHello, World!;END_METHOD2;END_CLASS;Kyli Rouge;
- Replace generic phrases and punctuation with special Unicode characters:
-
U+0000
0

1

U+0005
2

#
Hello, World!

U+0006
2

U+0001

Kyli Rouge

-
Hello World.FR (93B)
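The renaming step, where names become numbers, could be sketched as a table that hands out hex IDs in order of first appearance. The factory name createNameTable and its idFor method are hypothetical, and the sketch assumes that classes, methods, and variables share a single counter, as the worked example suggests.

```javascript
// Hypothetical name table: each distinct name gets the next hex ID,
// and a name seen before gets its earlier ID back.
function createNameTable() {
  const ids = new Map();
  return {
    idFor(name) {
      if (!ids.has(name)) {
        ids.set(name, ids.size.toString(16)); // 0, 1, ..., 9, a, b, ...
      }
      return ids.get(name);
    },
  };
}

const names = createNameTable();
console.log(names.idFor("Princess Celestia"));      // 0
console.log(names.idFor("Hello World"));            // 1
console.log(names.idFor("how to say hello world")); // 2
console.log(names.idFor("how to say hello world")); // 2 (reused)
```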
Feedback
Pros
- Easy to decompile
- Easy to manually compile
- Easier to read than my pseudo-assembly :)
Cons
- Proprietary; requires a virtual machine
- Some redundancies with surrounding characters
- Maybe it's too simple to manually decompile, and could not hold the levels of "security" proprietary bytecode needs.
- It's unnaturally difficult for a program to read, and I personally think it's not bytecode but a crunched compression. Just look at Java bytecode! It's a completely different thing from the source!
- Some statements are redundant. Why do you have both infix and prefix operators when you could base your code entirely on the former?
- Good point. I'll work on it :3
UNiTY (Mattia Borgo)
I was thinking of a bytecode that is easier to execute, much like that of a VM.
Format: .fb and .fba files
.fb files (FiM++ bytecode) are written in a separate language of their own, which sits beneath FiM++ and is run in a VM.
.fba files (FiM++ bytecode archive) are libraries of classes.
Format Specification
.fb
The first byte of a .fb file is always 0xfb (û in Latin-1).
After it comes the author's name, in a Pascal-like manner (length of the string as a byte, then the string).
Then the compiled bytecode.
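A small sketch of reading that header. The function name parseFbHeader and the returned field names are hypothetical, and the sketch assumes the author's name uses one byte per character, as it does in the Hello World example in this section.

```javascript
// Hypothetical reader for the .fb header: magic byte, name length, name.
function parseFbHeader(bytes) {
  if (bytes[0] !== 0xfb) {
    throw new Error("not a .fb file: first byte is not 0xfb");
  }
  const nameLength = bytes[1];
  const nameBytes = bytes.slice(2, 2 + nameLength);
  if (nameBytes.length !== nameLength) {
    throw new Error("truncated author name");
  }
  // Assumption: one byte per character.
  const author = String.fromCharCode(...nameBytes);
  return { author, codeOffset: 2 + nameLength };
}

const header = parseFbHeader([
  0xfb, 0x0a, 0x4b, 0x79, 0x6c, 0x69, 0x20, 0x52, 0x6f, 0x75, 0x67, 0x65, 0x02,
]);
console.log(header.author);     // Kyli Rouge
console.log(header.codeOffset); // 12
```

The single length byte caps the author's name at 255 bytes.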
Instructions
Code | Instruction | Arguments | Meaning |
---|---|---|---|
0x01 | NEWC | depends* | Push a new class slot. |
0x02 | MANC | - | Reserve the mane class slot. |
0x10 | NEWM | 1 byte, 1 byte, depends*** | Declare a new method in the selected class slot. |
0x11 | MANM | - | Declare a mane method in the selected class slot. |
0x12 | CALL | 1 word, 1 word, depends** | Call a method of a class and discard result. |
0x13 | RETM | depends** | Return a value. |
0x20 | NEWV | - | Push a new variable. |
0x21 | ASSV | 1 dword, depends** | Assign a value to the selected variable. |
0x23 | ASSC | 1 dword, depends** | Assign a constant value to the selected variable. |
0x40 | WHEN | depends** | If statement. |
0x42 | ELSE | - | Else statement. |
0x50 | SWTC | depends** | Switch statement. |
0x51 | CASE | depends** | Case statement. |
0x61 | WHLE | depends** | While statement. |
0x62 | WHDO | depends** | Do while statement. |
0x63 | WEND | - | End of (do) while statement. |
0x80 | PRNT | depends** | Print statement. |
0x81 | INPT | 1 dword, depends** | Input statement. |
0xa0 | INCR | 1 dword | Increment statement. |
0xa1 | DECR | 1 dword | Decrement statement. |
0xf0 | PORT | 1 byte | Runtime-load a specific builtin module |
v0.01 -> Interfaces aren't needed, since they can be applied at compile time.
*
Class declaration has a different type of instruction: in the list above its argument is marked with a * (single asterisk).
To make things clear, I'll start with an example.
Dear Princess Luna and Shining Armor and Cadence: An Update:
Dear directly translates into NEWC, which needs a word to define its superclass, then a byte specifying the number of interfaces, followed by their respective words.
We'll assume Shining Armor and Cadence have interface numbers 1 and 2, respectively; Princess Luna will be class slot 1.
Coding the example by hand gives
NEWC 1,2:1,2
This translates into the following bytes:
0x01 0x00 0x01 0x02 0x00 0x01 0x00 0x02
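The byte layout above can be reproduced with a short encoder. The names word and encodeNewc are hypothetical, and the high-byte-first order for words is an assumption read off the example bytes.

```javascript
// Split a 16-bit word into two bytes, high byte first (assumed from the example).
function word(value) {
  return [(value >> 8) & 0xff, value & 0xff];
}

// Hypothetical encoder for NEWC: opcode, superclass word,
// interface count byte, then one word per interface.
function encodeNewc(superclassSlot, interfaceIds) {
  const bytes = [0x01, ...word(superclassSlot), interfaceIds.length];
  for (const id of interfaceIds) {
    bytes.push(...word(id));
  }
  return bytes;
}

const encoded = encodeNewc(1, [1, 2]);
console.log(encoded.map((b) => "0x" + b.toString(16).padStart(2, "0")).join(" "));
// 0x01 0x00 0x01 0x02 0x00 0x01 0x00 0x02
```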
**
Many instructions have a ** (double asterisk) next to their arguments.
It essentially means that the argument is a runtime-calculated value.
It follows the syntax described below (Values and Operators).
Example:
Did you know that Spike’s age is the number 10?
This translates to:
NEWV
ASSV 0,n:10
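For the number itself, here is a sketch of the eight-byte payload that a value such as n:10 would carry, given that numbers are IEEE 754 64-bit floats. The function name encodeNumber is hypothetical, and big-endian byte order is an assumption; the type and operator byte that precedes the payload is left out.

```javascript
// Hypothetical helper: the 8-byte IEEE 754 double for a number value.
function encodeNumber(value) {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, value, false); // false = big-endian (assumed byte order)
  return Array.from(new Uint8Array(view.buffer));
}

console.log(
  encodeNumber(10).map((b) => b.toString(16).padStart(2, "0")).join(" ")
);
// 40 24 00 00 00 00 00 00
```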
***
Method declaration has a different type of instruction: in the list above its argument is marked with a *** (triple asterisk).
To make things clear, I'll start with an example.
I learned how to take the sum of a set of numbers with a number using the numbers X.
I learned directly translates into NEWM, which needs a byte defining the return type of the method and a byte specifying the number of input arguments plus 1.
All arguments' types are specified.
Coding the example by hand gives
NEWM n,0:n*
This translates into the following bytes:
0x10 0x4f 0x00 0x00 0x5f
Values and Operators
Operator | Low nibble | Meaning |
---|---|---|
% | 0x0 | Explicit end of value or placeholder |
+ | 0x1 | Add |
* | 0x2 | Multiply |
- | 0x3 | Subtract |
/ | 0x4 | Divide |
& | 0x5 | And |
^ | 0x6 | Exclusive or |
\| | 0x7 | Or |
! | 0x8 | Not |
> | 0x9 | Greater than |
>= | 0xa | Greater than or equal |
< | 0xb | Less than |
=< | 0xc | Less than or equal |
= | 0xd | Equal |
!= | 0xe | Not equal |
$ | 0xf | Explicit start of value |
Type | High nibble | Meaning |
---|---|---|
_ | 0x0 | No type (nothing) |
b: | 0x1 | Boolean (simple) |
c: | 0x2 | Character (simple) |
b*: | 0x3 | Boolean (array) |
n: | 0x4 | Number (simple) |
n*: | 0x5 | Number (array) |
c*: | 0x6 | Character (array) |
c**: | 0x7 | Character (2-array) |
v: | 0x8 | Variable |
f: | 0x9-0xf | Method call with masked type |
- Numbers: 64-bit floating point values according to the IEEE 754 specs.
- Booleans: either 0x00 or 0xff, FALSE and TRUE, respectively.
- Characters: 1 Unicode character.
- Variables: 1 double word pointer.
- Methods: 1 word pointer to class, 1 word pointer to method, arguments.
- Arrays: stored in a Pascal-like manner (length first, then the elements).
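The two tables combine into a single byte, with the type in the high nibble and the operator in the low nibble. The sketch below packs that byte and encodes a character array. The names TYPE, OP, valueByte, and encodeCharArray are hypothetical; the two-byte, high-byte-first array length and the one-byte-per-character encoding are assumptions inferred from the Hello World bytes in this section.

```javascript
// Nibble values copied from the tables above (subset only).
const TYPE = { number: 0x4, numberArray: 0x5, charArray: 0x6 };
const OP = { end: 0x0, start: 0xf };

// Pack a type (high nibble) and an operator (low nibble) into one byte.
function valueByte(typeNibble, opNibble) {
  return ((typeNibble & 0xf) << 4) | (opNibble & 0xf);
}

// Hypothetical encoder for a character array literal.
// Assumptions: length is a 16-bit word, high byte first; one byte per character.
function encodeCharArray(text) {
  const chars = Array.from(text, (c) => c.charCodeAt(0));
  return [
    valueByte(TYPE.charArray, OP.end),
    (chars.length >> 8) & 0xff,
    chars.length & 0xff,
    ...chars,
  ];
}

console.log(valueByte(TYPE.number, OP.start).toString(16));      // 4f
console.log(valueByte(TYPE.numberArray, OP.start).toString(16)); // 5f
console.log(encodeCharArray("Hello, World!").slice(0, 3));       // [ 96, 0, 13 ]
```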
Examples
Hello World program:
Dear Princess Celestia: Hello World!
Today I learned how to say hello world!
I said “Hello, World!”!
That's all about how to say hello world.
Your faithful student, Kyli Rouge.
Compiled language:
> author Kyli Rouge
MANC
MANM
PRNT c*:"Hello, World!"
ENDM
ENDC
Compiled bytecode:
0xfb 0x0a 0x4b 0x79 0x6c 0x69 0x20 0x52 0x6f 0x75 0x67 0x65 0x02 0x11 0x80 0x60 0x00 0x0d 0x48 0x65 0x6c 0x6c 0x6f 0x2c 0x20 0x57 0x6f 0x72 0x6c 0x64 0x21 0x14 0x04
or, in one dump:
fb0a4b796c6920526f75676502118060000d48656c6c6f2c20576f726c64211404
The compiled bytecode's length is 33B, a size reduction of about 83% relative to the 190B source.
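The byte count can be checked directly from the hex dump. This is a small sketch in which the function name hexToBytes is hypothetical and the 190-byte source size is the figure quoted for Hello World.FPP.

```javascript
// Hypothetical helper: turn a contiguous hex dump into an array of byte values.
function hexToBytes(dump) {
  const bytes = [];
  for (let i = 0; i < dump.length; i += 2) {
    bytes.push(parseInt(dump.slice(i, i + 2), 16));
  }
  return bytes;
}

const dump = "fb0a4b796c6920526f75676502118060000d48656c6c6f2c20576f726c64211404";
const bytes = hexToBytes(dump);
const sourceSize = 190; // size of Hello World.FPP in bytes
const reduction = 1 - bytes.length / sourceSize;

console.log(bytes.length);                      // 33
console.log(Math.round(reduction * 100) + "%"); // 83%
```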
Feedback
Pros
- Seems like it'd easily compile to BASIC or Assembler, making it astoundingly fast.
Cons
- Numbers should be IEEE 754 64-bit floating-point values, and this seems to only support 16-bit values.
  - Were those FP values? Oh man. Anyway, I never said numbers would take 16 bit values.
    - I know, but I don't see any 64-bit support, here...
      - Now there is! :3
        - No, there's not. Everything's still in 16-bit format, except Booleans, which you just made 32-bits...
          - Well... numbers take eight bytes like a C double. I thought it was obvious, since you can't allocate 64 bits on a single memory cell. By the way, instruction format, as far as I can tell, is 8 bit format.
- Booleans take up 16 bits, rather than the ideal 1
  - Done.
    - If by "Done", you mean "Increased to 32 bits rather than decreased to 1 byte", then yes :I
      - Where you get a 32 bit count is unknown to me, at most these are 16 bits (0xffff) or 24 with the operator. Anyway, I read your advice backwards, and I thought the original implementation was 16 bit long booleans rather than my original byte (and if with 16 bits you meant the operator AND the value, well I couldn't find another way).
- You use > as both a comment starter and operator
  - Actually, that was only a way of saying who the author is, as there are no comments in a compiled source.
Alxg833 (Alex Gould)
I wouldn't really trust myself with full bytecode editing yet, so I'm just going to work on translating files into Java and C++. A hypothetical compiler would have settings to create one or both.
Format: .fppj and .fppc files
A .fppj file would just be a standard Java file, with a signature at the beginning to mark it for translation back into FiM++ (instead of just displaying it in the FiM++ editor as a Java program). Descriptions of which specific commands the programmer used (said vs. sang, etc.) would be stored as Unicode sequences in comments at the end of each line. (The user would never actually see the Java code, and would therefore not notice the extraneous comments, which would be hidden in the editor, possibly by using another Unicode character to mark each comment as compiler-made.) This would be fairly easy to do, as the structures of the two languages are somewhat similar. It would also allow faster compilation into .class files and simpler interaction with Java's class library. The downside, of course, is that you'd be working through Java on a VM, which could be fairly slow to rewrite and compile.
A .fppc file would be similar (albeit in C++), however the code would have to be deconstructed a bit more in order to work in the same way as a .fppj file. (I was thinking of using UNiTY's instruction scheme, which should aid the translation to C++.) After that, compilation would be faster than in Java. However, C++ is a more exacting language, and I see the probability for translation errors as very high for .fppc files, much higher than their Java counterparts.
Feedback
Pros
.fppj
- Works on any OS
- Less translation between languages
- Easy to decompile
- Easy to read in a text-editor
.fppc
Cons
.fppj
.fppc
- Only works on one OS at a time
- Ambiguous; are you trying to make a C++ source file out of it or an assembly file?
  - I was thinking a C++ source file. It would look coded kind of strangely, but I think it would be doable. As per the instruction scheme, I was just thinking of simplifying the code out into syntax like NEWC, MANC, etc, before translating to C++.
    - What about pointers? There is no point in using C++ if there are no pointers, and FiM++ currently does not support pointers.
User
I propose to use AVM2 as a virtual machine for FiM.
Feedback
Pros
- We don't need to write a virtual machine.
- We don't need to invent a new file format.
- It is hard to decompile the code back into the human-readable form.
- The code will run everywhere because Tamarin is cross-platform.
Cons
- We will have to write a FiM++ to ABC bytecode compiler. It is not going to be easy because it will have to be a real compiler, not a simple translator. You cannot replace tokens with 1-symbol identifiers to get an ABC bytecode.
- Not invented here.