diff --git a/README.md b/README.md
index dcd5886..8a61f95 100644
--- a/README.md
+++ b/README.md
@@ -1,42 +1,64 @@
 # Bits Runner Builder
+Welcome to Bits Runner Builder! Compiler for the Bits Runner Code (BRC) language 🤘
+
+## Quick links
+- [BRC Language Reference](Reference.md)
+- [Detailed Syntax](Syntax.md)
 
 ## Overview
-Bits Runner Builder (brb) is a compiler for Bits Runner Code (brc) language, which has been designed for the Bits Runner Builder operating system. It aims to be a low-level language, which can be a replacement for C while providing a revised syntax and a couple of quality of life improvement. It's a simple system programming language, so no class hierarchies, templates, or other unnecessary fluff.
+Bits Runner Builder is a compiler for the Bits Runner Code (brc) language, which has been designed for the [Bits Runner](https://github.com/rafalgrodzinski/bits-runner) operating system. It aims to be an opinionated, low-level language, a sort of improved C with a revised syntax and a couple of quality-of-life improvements. It's a simple systems programming language, so no class hierarchies, templates, or other unnecessary fluff.
 
-It has been been built on top of LLVM.
+It has been built with LLVM, so it should be fairly performant. Keep in mind that it is still a work in progress, so not everything is finished and there are probably still plenty of bugs and gremlins hiding around 🙈
 
-## Show me the code!
+## Main features
+BRC allows for low-level system programming, so its main features include seamless support for embedded assembly, pointer manipulation, and explicit data handling. For this reason, data types have explicit byte sizes, there is no runtime, and memory is managed manually.
 
-### Comments
-Like in C, comments can specified using either `\\` which will run until the end of the line or through `/* */` block. However, unlike C, the `/* bla bla /* bla */ */` comments can be also embeded inside each other.
-
-### Literals
-**Number literals** can be specified as decimal, hexadecimal, and binary numbers. Digits can be separated by an '_' but it cannot be the first or the last character (otherwise it will get interpreted as and identifier).
+The language aims to be simple, easy to reason about, and predictable. Because of this, there are class-like features, but no inheritance. Composition is much better anyway and doesn't lead to incomprehensible codebases (did I mention that it's opinionated?).
 
+## Examples
 ```
-// Valid examples:
-1024
-1_024
-1.245
+// Basic hello world
+//
+@extern putchar fun: character u32 -> u32
 
-1_000.
+main fun -> u32
+    text data <- "Hello, world!\n"
+
+    rep i u32 <- 0, text[i] != 0:
+        putchar(text[i])
+        i <- i + 1
+    ;
 
-0xffa
-0xffaa_42bb
-
-0b1101
-0b1010_0101
-
-// Invalid examples:
-_100
-1000_.100
-
-0x_fa
-
-0b10_
-_0b1101
+    ret 0
+;
 ```
 
-### Control flow
+## But why?
+The idea was to build a whole computing environment from scratch that can be its own thing. Many projects of this kind try to be a sort of recoding of C/Unix, but that is not the point in this case. This project doesn't aim at compatibility, so it may happily break things in order to make them simpler, more modern, or just different.
 
-### Functions
+It's mostly a learning opportunity and a bit of fun, but maybe you can find some bits of interesting knowledge for your own project.
+
+## Quick Start
+Make sure that you have cmake, llvm, and lld installed on your system.
+```
+cmake -B build
+cmake --build build --config Release
+# or
+cmake --build build --config Debug
+```
+You'll then be able to find the executable under `build/brb`.
+
+There are also "Build (Debug)" and "Clean" tasks specified for VSCode. There is also a launch configuration, which you can start by pressing F5; it will then build and start debugging using the command `brb -v samples/test.brc`. You'll need to have the "LLDB DAP" extension installed in VSCode.
+
+## Samples
+Hello World
+[samples/hello.brc](samples/hello.brc)
+
+Fibonacci Numbers
+[samples/fib.brc]()
+
+#### How to build the samples
+```
+brb samples/hello.brc
+cc -o hello hello.o
+```
\ No newline at end of file
diff --git a/Reference.md b/Reference.md
new file mode 100644
index 0000000..c57cd0b
--- /dev/null
+++ b/Reference.md
@@ -0,0 +1,240 @@
+# BRC Language Reference
+
+## Overview
+Bits Runner Code (BRC) borrows a lot of concepts and syntax from C, but in a slightly modified way. The idea is to use familiar concepts in a simplified way, dropping unnecessary fluff to make the code simpler and shorter while avoiding any ambiguity.
+
+Semicolons are not placed at the end of statements; instead they delimit blocks, such as the body of a function or a loop. There are no curly brackets. Semicolons are not necessary if you, for example, declare an external method or have an `if` expression on a single line. Round brackets `()` are also not necessary in most cases, for example when defining a function or evaluating a condition, but they are required for function calls. New lines also play an important role and may be required or invalid, depending on the context.
+
+A single equal sign `=` denotes comparison, and the left arrow `<-` is used as the assignment symbol instead.
+
+Source code is grouped into named modules, and each module can be composed of a number of files. There is no separate header file; instead, the prefix `@pub` is attached to externally visible symbols.
+
+## Language Elements
+- Comments (`//`, `/* */`)
+- Literals (`123`, `0xa2`, `0b0101`, `3.14`, `"Hello"`, `'!'`, `true`, `false`)
+- Operators (`+`, `-`, `*`, `/`, `%`, `<-`, `<`, `<=`, `>`, `>=`, `=`, `!=`)
+- Variables (`u8`, `u32`, `s8`, `s32`, `r32`, `data`, `blob`)
+- Functions (`fun`)
+- Raw Functions (`raw`)
+- Conditional Expressions (`if`, `else`)
+- Loops (`rep`)
+
+## Comments
+Like in C, comments can be specified using either `//`, which runs until the end of the line, or a `/* */` block. However, unlike in C, `/* bla bla /* bla */ */` comments can also be embedded inside each other.
+```
+// this is a main function
+main fun -> u32
+    /*
+    num1 <- 2 + 5
+    /* num1 <- 4 * num1 */
+    // num1 <- 5 * num1
+    */
+    ret 0
+;
+
+```
+
+## Literals
+**Number literals** can be specified as decimal, hexadecimal, and binary numbers. Digits can be separated by an '_' but it cannot be the first or the last character (otherwise it will get interpreted as an identifier).
+
+```
+// Valid examples:
+1024
+1_024
+1.245
+
+1_000.
+
+0xffa
+0xffaa_42bb
+
+0b1101
+0b1010_0101
+
+// Invalid examples:
+_100
+1000_.100
+
+0x_fa
+
+0b10_
+_0b1101
+```
+
+**Text literals** can be specified either as an implicitly zero-terminated string or as a single character. Strings are converted into arrays. Characters can also be escaped with a backslash '\', just like in C.
+```
+// Examples
+"Hello world"
+"Hello world\0" // in this case, the final zero is not appended
+'H'
+'!'
+'\n'
+
+// Escape sequences
+'\b' // backspace
+'\n' // new line
+'\t' // tab
+'\\' // backslash
+'\'' // single quotation mark
+'\"' // double quotation mark
+'\0' // 0 (as in integer 0)
+```
+
+**Boolean literals** can also be specified using the `true` or `false` keywords. There is no implicit conversion from integer to boolean or vice versa.
+
+## Operators
+All the standard operators, such as `+`, `-`, `*`, `/`, `%` are available.
The biggest difference is that assignment uses the left arrow `<-`, so a comparison can be written with a single equal sign `=`.
+```
++   // addition
+-   // subtraction
+*   // multiplication
+/   // division
+%   // division remainder
+=   // is equal
+!=  // is not equal
+<   // is less than
+<=  // is less or equal
+>   // is greater than
+>=  // is greater or equal
+<-  // assignment
+( ) // precedence
+```
+
+## Variables
+Variables are specified by first providing the name and then the type. There is also an optional initializer.
+```
+bytesInKilobyte u32 <- 1_024
+text data <- "Hello world!"
+pi r32 <- 3.14
+```
+
+**Simple variables**
+There are standard float and integer types available, but unlike in C, you have to be explicit about their size and signedness. You can only perform `=` and `!=` operations on booleans. There is no `void` type or an equivalent.
+```
+u8 // unsigned integer, 8 bits
+u32 // unsigned integer, 32 bits
+s8 // signed integer, 8 bits
+s32 // signed integer, 32 bits
+r32 // floating point (real), 32 bits
+bool // true or false
+```
+
+**Data variables**, or arrays, as they are known in C. They are a static-length sequence of elements of the same type. The length has to be specified either explicitly or through an initializer.
+```
+text data <- "Hello world!"
+fibonaciNumbers <- [1, 1, 2, 3, 5, 8] // Anything past the first 4 numbers will be ignored
+```
+
+**Blob variables**, otherwise known as structures. Composite types that we can define ourselves. The usage is fairly similar to C. A semicolon and a new line are required in the definition.
+```
+user blob
+    age u32
+    name data
+    isActive bool
+;
+
+bob user
+bob.age <- 18
+bob.name <- "Bob"
+bob.isActive <- true
+```
+
+## Functions
+Functions in BRC work just like in C. You can specify an optional list of arguments and a return type. Calls require round brackets. The colon should be omitted if there are no arguments. The arrow has to be on the same line as the return type.
+```
+// Valid examples
+main fun -> u32
+    ret 0
+;
+
+addNums fun: num1 s32, num2 s32 -> s32
+    ret num1 + num2
+;
+
+addNums fun:
+    num1 s32,
+    num2 s32
+    -> s32
+
+    ret num1 + num2
+;
+
+addNums(5, 4)
+
+logInUser fun: user User
+    // do some internet stuff 📡
+;
+
+logInUser(bob)
+
+explodeEverything fun
+    // Do a boom! 💥
+;
+
+explodeEverything()
+
+// Invalid examples
+addNums num1 s32, num2 s32 -> s32
+[..]
+
+addNums: num1 s32
+    ,num2 s32 -> s32
+[..]
+
+addNums: num1 s32, num2 s32 ->
+    s32
+[..]
+```
+
+## Raw Functions
+A unique feature of BRC is the seamless use of inline assembly. Raw functions can be used just like normal functions, although there are a couple of limitations and they require so-called constraints to be specified. These are the same as in gcc or clang, but they are specified as a single string instead of being split into inputs, outputs, and clobbers. Some more information can be found in [A Practical Guide to GCC Inline Assembly](https://blog.alex.balgavy.eu/a-practical-guide-to-gcc-inline-assembly/). Intel syntax is used for the assembly.
+```
+rawAdd raw<"=r,r,r">: num1 u32, num2 u32 -> u32
+    add $1, $2
+    mov $0, $1
+;
+
+// later on
+result u32 <- rawAdd(5, 4)
+```
+
+## Conditional Expressions
+If-else statements can be written on a single line or on multiple lines, and they are expressions, which allows them to return values.
+```
+isValid bool <- if count = 0: doForEmpty() else doForCount(count)
+
+if number > 0:
+    doStuff(number)
+else
+    fatalError()
+;
+
+if featureFlag:
+    // Special case ⚰️
+;
+
+if hasCompleted: exit()
+
+if processedElementsCount < 10: print("Success") else
+    print("Failure")
+    processFailure(processedElementsCount)
+;
+```
+
+## Loops
+C-style for, while, and do-while are all combined into a single `rep` loop. The format is `rep init, pre-condition, post-condition`. `init` lets you set up a counter, the pre-condition is evaluated before each iteration, and the post-condition after it.
Each part is optional, but if you include a post-condition, a pre-condition must also be included. The body can be specified on the same line as the loop, in which case the final semicolon should not be included.
+```
+// infinite loop
+rep: doStuff()
+
+// do things ten times
+rep i u32 <- 0, i < 10:
+    doStuff(i)
+    i <- i + 1
+;
+
+// do things at least once
+rep i u32 <- 0, true, i < someValue:
+    doStuff(i)
+;
+```
diff --git a/Syntax.md b/Syntax.md
index bcb2e2b..967cad0 100644
--- a/Syntax.md
+++ b/Syntax.md
@@ -1,6 +1,6 @@
 # Detailed Syntax
 
-This documents specifies what is the allowed syntax for statements and expressions!
+This document specifies the allowed syntax for statements and expressions 🤓
 
 ### Symbols used
 `?` 0 or 1 instances
diff --git a/lines_count.sh b/lines_count.sh
new file mode 100755
index 0000000..821501f
--- /dev/null
+++ b/lines_count.sh
@@ -0,0 +1,2 @@
+#!/bin/bash
+find . \( -name "*.h" -o -name "*.cpp" \) -print0 | xargs -0 wc -l
diff --git a/src/Parser/Parser.cpp b/src/Parser/Parser.cpp
index 2720985..bd27bed 100644
--- a/src/Parser/Parser.cpp
+++ b/src/Parser/Parser.cpp
@@ -253,8 +253,8 @@ shared_ptr Parser::matchStatementFunction() {
             Parsee::groupParsee(
                 ParseeGroup(
                     {
-                        Parsee::tokenParsee(TokenKind::RIGHT_ARROW, true, false, false),
                         Parsee::tokenParsee(TokenKind::NEW_LINE, false, false, false),
+                        Parsee::tokenParsee(TokenKind::RIGHT_ARROW, true, false, false),
                         Parsee::valueTypeParsee(true, true, true)
                     }
                 ), false, true, false