Compare commits

..

38 Commits

Author SHA1 Message Date
seeseemelk b6d0a78d06 Add integration test 2026-05-01 09:44:46 +02:00
seeseemelk 3bdccf2000 Add integration test framework 2026-04-30 22:21:08 +02:00
seeseemelk 177fb971e4 Rename AST structures to Tree and relocate freeing logic 2026-04-30 21:46:15 +02:00
seeseemelk ea55dedd07 Refactor AST and Parser into modular subdirectories
- Split ast.h into granular headers in v0/ast/
- Split parser.c into modular implementation files in v0/parser/
- Move and rename parser tests to v0/parser/test_*.c
- Update build system (include.mk) with modular sub-makefiles
- Maintain v0/ast.h and v0/parser.h as umbrella headers
2026-04-30 21:23:07 +02:00
seeseemelk 4bd66ea216 More variable stuff 2026-04-30 20:25:53 +02:00
seeseemelk 0704284726 Can parse variables 2026-04-29 21:39:48 +02:00
seeseemelk 94ae665a0a Add initial variable work 2026-04-29 21:20:52 +02:00
seeseemelk e2d8e385f0 Add basic var tokens 2026-04-29 20:28:52 +02:00
seeseemelk 76f9168c5f Fix docs 2026-04-29 20:21:52 +02:00
seeseemelk 1ab021561e Fix bad test 2026-04-29 20:20:16 +02:00
seeseemelk f260e02efa Refactor parser 2026-04-29 20:15:05 +02:00
seeseemelk 1c5d49d682 Fix valgrind errors 2026-04-29 19:41:00 +02:00
seeseemelk cc25563cd2 Cleanup 2026-04-29 19:23:59 +02:00
seeseemelk 323a599399 Build with debug symbols 2026-04-29 18:53:02 +02:00
seeseemelk ec896495a3 Fix infinite loop bug 2026-04-29 14:40:06 +02:00
seeseemelk eb4b0495f2 Working on parser refactor 2026-04-29 14:36:42 +02:00
seeseemelk 1f40c8f5ee Refactor tests a bit more 2026-04-29 13:25:41 +02:00
seeseemelk 98d58a2169 Refactor tests 2026-04-29 13:09:14 +02:00
seeseemelk f0621a8076 Refactor parser 2026-04-29 11:53:26 +02:00
seeseemelk 84747028f5 Ensure alias and import can be mixed 2026-04-29 11:46:02 +02:00
seeseemelk f90cad2b96 Use proper public keyword 2026-04-29 11:43:14 +02:00
seeseemelk e09bd72441 Update ast interface 2026-04-29 11:24:42 +02:00
seeseemelk 9035cc639c Add alias to ast 2026-04-29 11:18:40 +02:00
seeseemelk 3288efdfd7 Refactor test interface 2026-04-29 10:59:06 +02:00
seeseemelk 34b7939f76 Refactor parser to C11 and update build configuration 2026-04-29 10:38:34 +02:00
seeseemelk 15714393c3 Refactor parser to use Token in AST and update tests 2026-04-29 10:35:12 +02:00
seeseemelk 146aa4d9d1 Convert codebase to C89 compatibility and update test scripts 2026-04-29 10:21:29 +02:00
seeseemelk 189c21667b Ignore intellij files 2026-04-28 16:07:46 +02:00
seeseemelk abdc6d67dc Re-order log lines 2026-04-28 16:06:21 +02:00
seeseemelk d89833b705 Add TYPES documentation 2026-04-28 16:06:12 +02:00
seeseemelk bfb3b69be1 fix: add util.c to source files 2026-04-26 22:48:31 +02:00
seeseemelk dc523c8d3c chore: remove legacy v0/string.h 2026-04-26 22:42:10 +02:00
seeseemelk 05dfb3725b fix: replace unsafe fixed-size buffers with dynamic formatting helpers; add util format helpers; centralize log_on_line cleanup 2026-04-26 22:42:10 +02:00
seeseemelk 70998643fb Add AGENTS.md 2026-04-26 22:30:51 +02:00
seeseemelk 129036b539 Fix all valgrind errors 2026-04-26 22:13:39 +02:00
seeseemelk dbc69eddc8 Update test target to use valgrind 2026-04-26 21:35:14 +02:00
seeseemelk 421338d995 Fix log header generation and EOF location reporting 2026-04-26 21:34:28 +02:00
seeseemelk f33e8d3e25 Update log headers 2026-04-26 21:19:59 +02:00
76 changed files with 1564 additions and 531 deletions
+7 -3
@@ -26,9 +26,8 @@ There will be no `test_buffer.h`. Instead, `test.c` will directly
 Every syntax error path identified in the parser MUST have a corresponding test.
 ## Language Syntax
-Since this is a compiler for a new language, do not assume anything
-of its syntax.
-Always check the `specs` directory.
+Since this is a compiler for a new language, do not assume anything of its syntax.
+Always check the `specs` directory to see examples and documentation about the language.
 If there is anything unclear, ask the user for clarification.
 It is certainly possible that there are contradictions in the
@@ -40,3 +39,8 @@ the agent to update the implementation.
 When creating a commit, make sure that both the user's and the agent's modifications
 are included in the commit.
+Only create a commit when specifically asked for that. Never assume implicitly that the
+user wants you to create a commit.
+Even if they asked you to create a commit in an earlier task, it does not mean that
+you should also create a commit in a later task.
+2
@@ -1 +1,3 @@
 /c2
+/.idea/*
+!/.idea/c_cpp_properties.json
+12
@@ -0,0 +1,12 @@
{
    "configurations": [
        {
            "name": "CLion",
            "includePath": [
                "${workspaceFolder}/v0/*"
            ],
            "cStandard": "c89"
        }
    ],
    "version": 4
}
Symlink
+1
@@ -0,0 +1 @@
.github/copilot-instructions.md
+7 -1
@@ -1,6 +1,6 @@
 .PHONY: all test clean
-all: c2 test
+all: c2 test integration-test
 c2: v0/bin/c2
 	cp $< $@
@@ -13,3 +13,9 @@ clean::
 	rm -f c2
 include v0/include.mk
+integration-test: v0/bin/c2 v0/bin/test_integration
+	./v0/bin/test_integration
+v0/bin/test_integration: v0/test_integration.c
+	$(CC) $(CFLAGS) -o $@ $<
+16
@@ -13,5 +13,21 @@ In order to run the tests, run `make test`.
 ## Versioning
 The current version is v0. Its source code lives in the `v0` directory.
+## Testing
+### Unit Tests
+Run unit tests with:
+```bash
+make test
+```
+### Integration Tests
+Integration tests compare the compiler output with expected C files.
+To add a new integration test, create a new directory under `v0/integration_tests/` with `input.c2` and `expected.c` files.
+Run integration tests with:
+```bash
+make integration-test
+```
 ## Languages Specifications
 See the specs directory for information on the actual language syntax.
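A minimal sketch of the comparison step described in the README hunk above, assuming the runner checks the compiler's output against `expected.c` byte for byte. The real `v0/test_integration.c` is not part of this excerpt, so `streams_equal` is a hypothetical helper and not the project's implementation.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 when both streams hold identical bytes
 * from their current positions to end-of-file, 0 otherwise. */
int streams_equal(FILE* actual, FILE* expected) {
    int a;
    int b;
    do {
        a = fgetc(actual);
        b = fgetc(expected);
        if (a != b) {
            return 0; /* differing byte, or one stream ended early */
        }
    } while (a != EOF);
    return 1;
}
```

A runner could open the generated file and `expected.c` with `fopen` and fail the test whenever this returns 0.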
+1
@@ -0,0 +1 @@
Hello, world
+25
@@ -0,0 +1,25 @@
# Types
C2 has both built-in types and user-defined types.
## Builtin types
C2 has the following types builtin:
- `void`
- `i8`
- `i16`
- `i32`
- `i64`
- `u8`
- `u16`
- `u32`
- `u64`
## Type Aliases
Types can be aliased to different names using the alias keyword.
Here's a list of the default builtin aliases.
```c2
alias int = i32;
alias uint = u32;
alias char = u8;
alias string = char[];
```
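The builtin list and default aliases above can be read as a small lookup table. The sketch below is illustrative only: `BuiltinInfo`, `resolve_alias`, and `lookup_builtin` are hypothetical names, and `void` and `string` are left out because neither is a single integer type.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical description of a builtin integer type. */
typedef struct {
    const char* name;
    int bit_size;
    int is_signed;
} BuiltinInfo;

static const BuiltinInfo BUILTINS[] = {
    {"i8", 8, 1}, {"i16", 16, 1}, {"i32", 32, 1}, {"i64", 64, 1},
    {"u8", 8, 0}, {"u16", 16, 0}, {"u32", 32, 0}, {"u64", 64, 0}
};

/* Maps the default integer aliases from the spec to their builtin names.
 * Any other name is returned unchanged. */
static const char* resolve_alias(const char* name) {
    if (strcmp(name, "int") == 0) return "i32";
    if (strcmp(name, "uint") == 0) return "u32";
    if (strcmp(name, "char") == 0) return "u8";
    return name;
}

/* Returns the matching entry, or NULL when the name is not a builtin integer. */
const BuiltinInfo* lookup_builtin(const char* name) {
    size_t i;
    const char* resolved = resolve_alias(name);
    for (i = 0; i < sizeof(BUILTINS) / sizeof(BUILTINS[0]); i++) {
        if (strcmp(BUILTINS[i].name, resolved) == 0) {
            return &BUILTINS[i];
        }
    }
    return NULL;
}
```

The `bit_size` and `is_signed` fields mirror the `bitSize` and `isSigned` members that the `TypeTree` builtin variant in this change set stores.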
+24
@@ -0,0 +1,24 @@
# Variables
Variables can be defined in the global scope, in structs and classes, and in functions.
## Global variables
Global variables can be defined as such:
```c2
// Defines a global variable called my_var.
i32 my_var;
// Defines a const variable.
const i32 my_var;
// Defines a global variable whose type is determined automatically.
// The value will be determined at runtime.
var my_var = 123;
// Defines a const variable whose type is determined automatically.
const my_var = 123;
// Defines a global variable whose initial value is computed at compile-time.
// If it cannot be computed at compile-time, an error is thrown.
static my_var = 123;
```
+9 -21
@@ -4,30 +4,18 @@
 #ifndef AST_H
 #define AST_H
-#include <stdbool.h>
-#include <stddef.h>
-typedef struct {
-    /// @brief The name of the module being imported.
-    char* module_name;
-    /// @brief Whether the import is public or not.
-    bool is_public;
-} ImportDeclaration;
+#include "ast/expression.h"
+#include "ast/declaration.h"
+#include "ast/module.h"
 /**
- * The top-level model.
- * Every file matches an entire Module.
+ * Frees a module and all its children.
  */
-typedef struct {
-    /// @brief The name of the module.
-    char* name;
-    /// @brief The list of imports in the module.
-    ImportDeclaration* imports;
-    /// @brief The number of imports in the module.
-    size_t import_count;
-} Module;
+void ast_free_module(ModuleTree* module);
+/**
+ * Frees a type expression.
+ */
+void ast_free_type(TypeTree* type);
 #endif
+49
@@ -0,0 +1,49 @@
#ifndef AST_DECLARATION_H
#define AST_DECLARATION_H
#include "expression.h"
#include "../bool.h"
typedef struct {
/** @brief The name of the module being imported. */
char* module_name;
/** @brief Whether the import is public or not. */
bool is_public;
} ImportTree;
/**
* A declaration that aliases one type to another.
*/
typedef struct {
/** @brief The name of the alias. */
const char* name;
/** @brief The value of the alias. */
TypeTree value;
} AliasTree;
/**
* A declaration of a variable, which may be a constant or not, and may be static or not.
*/
typedef struct {
/** @brief The name of the variable. */
char* name;
/** @brief The type of the variable. */
TypeTree type;
/** @brief The optional initializer expression. */
ExpressionTree* initializer;
/** @brief Whether the variable is public or not. */
bool is_public;
/** @brief Whether the variable is static or not. */
bool is_static;
/** @brief Whether the variable is a constant or not. */
bool is_const;
} VariableTree;
#endif
+9
@@ -0,0 +1,9 @@
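#include "expression.h"
#include <stdlib.h>

/* Frees the heap-allocated children of a type expression, but not `expr` itself. */
void ast_free_type(TypeTree* expr) {
    if (expr == NULL) {
        return;
    }
    if (expr->tag == TYPE_TREE_ARRAY) {
        /* Recurse first so nested array element types are released too. */
        ast_free_type(expr->array.array);
        free(expr->array.array);
    }
}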
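`ast_free_type` releases what a `TypeTree` owns but not the `TypeTree` itself, which matters because `AliasTree` and `VariableTree` embed the type by value. A self-contained sketch of that ownership rule follows, using a trimmed copy of the struct and the hypothetical helpers `free_type_children` and `make_u8_array` (C11 or later, for the anonymous union).

```c
#include <stdlib.h>

typedef enum { TYPE_TREE_BUILTIN, TYPE_TREE_ARRAY } TypeTreeTag;

/* Trimmed copy of the TypeTree from this change set, for illustration. */
typedef struct TypeTree TypeTree;
struct TypeTree {
    TypeTreeTag tag;
    union {
        struct { TypeTree* array; } array;
        struct { int bitSize; int isSigned; } builtin;
    };
};

/* Releases the children of `type`, never `type` itself. */
void free_type_children(TypeTree* type) {
    if (type != NULL && type->tag == TYPE_TREE_ARRAY) {
        free_type_children(type->array.array);
        free(type->array.array);
        type->array.array = NULL; /* guard against a second free */
    }
}

/* Turns `out` into "array of u8". Returns 0 on allocation failure. */
int make_u8_array(TypeTree* out) {
    TypeTree* element = malloc(sizeof(TypeTree));
    if (element == NULL) {
        return 0;
    }
    element->tag = TYPE_TREE_BUILTIN;
    element->builtin.bitSize = 8;
    element->builtin.isSigned = 0;
    out->tag = TYPE_TREE_ARRAY;
    out->array.array = element;
    return 1;
}
```

Because the outer `TypeTree` can live on the stack or inside another struct, only the element pointer is ever passed to `free`.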
+52
@@ -0,0 +1,52 @@
#ifndef AST_EXPRESSION_H
#define AST_EXPRESSION_H
#include "../bool.h"
typedef enum {
EXPRESSION_TREE_INTEGER,
EXPRESSION_TREE_STRING,
EXPRESSION_TREE_BOOLEAN
} ExpressionTreeTag;
typedef struct {
ExpressionTreeTag tag;
union {
int integer;
const char* string;
bool boolean;
};
} ExpressionTree;
typedef enum {
TYPE_TREE_BUILTIN,
TYPE_TREE_ARRAY
} TypeTreeTag;
/**
* An expression that evaluates to a type.
*/
typedef struct TypeTree TypeTree;
struct TypeTree {
/** @brief defines which entry in the union is valid */
TypeTreeTag tag;
union {
/** @brief Evaluates to an array of the given type. */
struct {
/** @brief A pointer to the type of the elements stored in the array. */
TypeTree* array;
} array;
/** @brief Evaluates to a builtin integer type.*/
struct {
/**
* @brief The number of bits in the integer.
* Typical values are 8, 16, 32, and 64.
*/
int bitSize;
/** @brief `true` if the type is signed, `false` if it's unsigned. */
bool isSigned;
} builtin;
};
};
#endif
+3
@@ -0,0 +1,3 @@
# Source files that implement the AST free functions.
AST_SRC := v0/ast/module.c v0/ast/expression.c
+43
@@ -0,0 +1,43 @@
#include "module.h"
#include "expression.h"
#include <stdlib.h>
void ast_free_type(TypeTree* type);
void ast_free_module(ModuleTree* module) {
if (module == NULL) {
return;
}
if (module->imports != NULL) {
for(size_t i = 0; i < module->import_count; i++) {
free(module->imports[i].module_name);
}
free(module->imports);
}
if (module->aliases != NULL) {
for(size_t i = 0; i < module->alias_count; i++) {
free((void*)module->aliases[i].name);
ast_free_type(&module->aliases[i].value);
}
free(module->aliases);
}
if (module->variables != NULL) {
for(size_t i = 0; i < module->variable_count; i++) {
free(module->variables[i].name);
ast_free_type(&module->variables[i].type);
if (module->variables[i].initializer) {
if (module->variables[i].initializer->tag == EXPRESSION_TREE_STRING) {
free((void*)module->variables[i].initializer->string);
}
free(module->variables[i].initializer);
}
}
free(module->variables);
}
free(module->name);
free(module);
}
+34
@@ -0,0 +1,34 @@
#ifndef AST_MODULE_H
#define AST_MODULE_H
#include "declaration.h"
#include <stddef.h>
/**
* The top-level model.
* Every file matches an entire Module.
*/
typedef struct {
/** @brief The name of the module. */
char* name;
/** @brief The list of imports in the module. */
ImportTree* imports;
/** @brief The number of imports in the module. */
size_t import_count;
/** @brief The list of aliases in the module. */
AliasTree* aliases;
/** @brief The number of aliases in the module. */
size_t alias_count;
/** @brief The list of variables in the module. */
VariableTree* variables;
/** @brief The number of variables in the module. */
size_t variable_count;
} ModuleTree;
#endif
+10
@@ -0,0 +1,10 @@
/* Minimal boolean type for C89 compatibility */
#ifndef BOOL_H
#define BOOL_H
typedef int bool;
#define true 1
#define false 0
#endif
+15 -2
@@ -1,4 +1,7 @@
-V0_SRC := v0/main.c v0/token.c v0/parser.c v0/log.c
+include v0/ast/include.mk
+include v0/parser/include.mk
+V0_SRC := v0/main.c v0/util.c v0/token.c $(AST_SRC) $(PARSER_SRC) v0/log.c v0/str.c
 # V0_TEST must only include `v0/test.c` itself, as all other test Csource files are
 # included directly into `v0/test.c` using `#include "test_xyz.c"`.
@@ -11,6 +14,8 @@ V0_TEST_OBJ := $(patsubst v0/%.c,v0/bin/%.o,$(V0_TEST))
 V0_SRC_DEPS := $(V0_SRC_OBJ:.o=.d)
 V0_TEST_DEPS := $(V0_TEST_OBJ:.o=.d)
+CFLAGS += -Werror -Wall -pedantic -std=c11 -g
 v0/bin/c2: $(V0_SRC_OBJ)
 	$(CC) $(CFLAGS) -o $@ $^
@@ -19,8 +24,16 @@ V0_SRC_OBJ_NO_MAIN := $(filter-out v0/bin/main.o,$(V0_SRC_OBJ))
 v0/bin/test: $(V0_SRC_OBJ_NO_MAIN) $(V0_TEST_OBJ)
 	$(CC) $(CFLAGS) -o $@ $^
+# Only run tests under valgrind on Linux. On macOS (Darwin) valgrind is
+# typically unavailable or unsupported, so run the test binary directly.
+ifeq ($(shell uname -s),Linux)
+TEST_CMD := valgrind --quiet --leak-check=full --error-exitcode=1 v0/bin/test
+else
+TEST_CMD := v0/bin/test
+endif
 test:: v0/bin/test
-	v0/bin/test
+	$(TEST_CMD)
 generate_golden:: v0/bin/test
 	GENERATE_GOLDEN=1 v0/bin/test
+4
@@ -0,0 +1,4 @@
#include <stdint.h>
// u32 simple:x
static uint32_t v_6simple_1x = 123;
+2
@@ -0,0 +1,2 @@
module simple;
u32 x = 123;
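The single example above (`simple:x` becoming `v_6simple_1x`) suggests a length-prefixed mangling scheme: `v_`, then the module name preceded by its length, then `_`, then the variable name preceded by its length. That reading is inferred from this one pair of files only; the sketch below is an assumption, and `mangle_variable` is a hypothetical name.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Builds "v_<len(module)><module>_<len(name)><name>".
 * The caller frees the result. Returns NULL on allocation failure. */
char* mangle_variable(const char* module, const char* name) {
    size_t module_len = strlen(module);
    size_t name_len = strlen(name);
    /* "v_" + "_" + terminator, plus up to 20 digits for each length. */
    size_t size = module_len + name_len + 2 + 1 + 1 + 40;
    char* out = malloc(size);
    if (out == NULL) {
        return NULL;
    }
    snprintf(out, size, "v_%lu%s_%lu%s",
             (unsigned long)module_len, module,
             (unsigned long)name_len, name);
    return out;
}
```

Prefixing each part with its length keeps the result unambiguous even when a module or variable name itself contains underscores.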
+6 -6
@@ -4,24 +4,24 @@
 #ifndef LOCATION_H
 #define LOCATION_H
-#include "string.h"
+#include "str.h"
 #include <stddef.h>
 typedef struct {
-    /// @brief The name of the file where the token was found.
+    /* @brief The name of the file where the token was found. */
     char* filename;
-    /// @brief The entire line of text where the token was found.
+    /* @brief The entire line of text where the token was found. */
     String line_text;
-    /// @brief The line number where the token was found.
+    /* @brief The line number where the token was found. */
     int line;
-    /// @brief The starting column number where the token was found.
+    /* @brief The starting column number where the token was found. */
     int column_start;
-    /// @brief The ending column number where the token was found.
+    /* @brief The ending column number where the token was found. */
     int column_end;
 } Location;
+44 -40
@@ -1,4 +1,6 @@
 #include "log.h"
+#include "util.h"
 #include <stdio.h>
 #include <string.h>
 #include <stdlib.h>
@@ -18,66 +20,68 @@ void log_error(const char* msg) {
     }
 }
-void log_on_line(Location* loc, int to_column, const char* msg, ...) {
-    char line_prefix[32];
-    int prefix_len = snprintf(line_prefix, sizeof(line_prefix), "%d| ", loc->line);
+void log_on_line(Location* loc, const char* msg, ...) {
+    /* Declarations first for C89 */
+    char* line_prefix = NULL;
+    char* formatted_msg = NULL;
+    char* header = NULL;
+    char* buffer = NULL;
+    va_list args;
+    int caret_len;
+    char* p;
+    int i1, i2;
+    size_t i3;
+    size_t total_size;
-    int caret_len = to_column - loc->column_start + 1;
+    line_prefix = format_string("%d| ", loc->line);
+    if (!line_prefix) goto cleanup;
+    caret_len = loc->column_end - loc->column_start + 1;
     if (caret_len < 1) caret_len = 1;
-    // Format the message
-    va_list args;
+    /* Format the message */
     va_start(args, msg);
-    char formatted_msg[256];
-    vsnprintf(formatted_msg, sizeof(formatted_msg), msg, args);
+    formatted_msg = format_string_va(msg, args);
     va_end(args);
+    if (!formatted_msg) goto cleanup;
-    // Custom header logic to match the user's specific updated logs
-    char header[512];
-    header[0] = '\0';
-    char* p_header = header;
-    if (strstr(loc->filename, "missing_semicolon_import")) {
-        p_header += sprintf(p_header, "--- \n");
-    } else if (strstr(loc->filename, "missing_semicolon_module")) {
-        p_header += sprintf(p_header, "--- \n ---\n");
-    } else if (strstr(loc->filename, "unknown_token")) {
-        p_header += sprintf(p_header, "--- \n ---\n");
-    } else if (strstr(loc->filename, "log_on_line")) {
-        p_header += sprintf(p_header, "--- %s ---\n", loc->filename);
-    } else if (loc->filename && loc->filename[0] != '\0') {
-        char buf[25];
-        strncpy(buf, loc->filename, 24);
-        buf[24] = '\0';
-        p_header += sprintf(p_header, "--- %s ---\n", buf);
+    /* Header logic */
+    if (loc->filename && loc->filename[0] != '\0') {
+        header = format_string("--- %s ---\n", loc->filename);
     } else {
-        p_header += sprintf(p_header, "--- \n");
+        header = format_string("--- \n");
     }
+    if (!header) goto cleanup;
-    size_t total_size = strlen(header) + 20 +
-        prefix_len + loc->line_text.length + 2 + // line| text\n
-        prefix_len + loc->column_start - 1 + caret_len + 2 + // indent + ^^\n
-        prefix_len + 3 + strlen(formatted_msg) + 2 + // indent + msg\n
-        100;
-    char* buffer = (char*)malloc(total_size);
-    if (!buffer) return;
-    char* p = buffer;
+    total_size = strlen(header) + 20 +
+        strlen(line_prefix) + loc->line_text.length + 2 + /* line| text\n */
+        strlen(line_prefix) + loc->column_start - 1 + caret_len + 2 + /* indent + ^^\n */
+        strlen(line_prefix) + 3 + strlen(formatted_msg) + 2 + /* indent + msg\n */
+        10;
+    buffer = (char*)malloc(total_size);
+    if (!buffer) goto cleanup;
+    p = buffer;
     p += sprintf(p, "%s", header);
     p += sprintf(p, "%s%.*s\n", line_prefix, (int)loc->line_text.length, loc->line_text.data);
-    // Caret line
-    for (int i = 0; i < prefix_len + loc->column_start - 1; i++) *p++ = ' ';
-    for (int i = 0; i < caret_len; i++) *p++ = '^';
+    /* Caret line */
+    for (i1 = 0; i1 < (int)(strlen(line_prefix) + loc->column_start - 1); i1++) *p++ = ' ';
+    for (i2 = 0; i2 < caret_len; i2++) *p++ = '^';
     *p++ = '\n';
-    // Message line
-    for (int i = 0; i < 3; i++) *p++ = ' ';
+    /* Message line */
+    for (i3 = 0; i3 < strlen(line_prefix); i3++) *p++ = ' ';
     p += sprintf(p, "%s\n", formatted_msg);
     *p = '\0';
     log_error(buffer);
+cleanup:
+    free(line_prefix);
+    free(formatted_msg);
+    free(header);
     free(buffer);
 }
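The new `log_on_line` relies on `format_string` and `format_string_va` from `util.h`, whose bodies are not part of this excerpt. A minimal sketch of how such helpers are commonly written, assuming they return a heap-allocated string that the caller frees and `NULL` on failure; this is a guess at the behaviour, not the project's `util.c`. It uses `va_copy` and a sizing call to `vsnprintf`, both of which need C99 or later.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Formats into a freshly allocated buffer. Returns NULL on failure. */
char* format_string_va(const char* fmt, va_list args) {
    va_list copy;
    int needed;
    char* out;

    /* The first pass only measures, so it must run on a copy of the list. */
    va_copy(copy, args);
    needed = vsnprintf(NULL, 0, fmt, copy);
    va_end(copy);
    if (needed < 0) {
        return NULL;
    }

    out = malloc((size_t)needed + 1);
    if (out == NULL) {
        return NULL;
    }
    vsnprintf(out, (size_t)needed + 1, fmt, args);
    return out;
}

/* Variadic convenience wrapper around format_string_va. */
char* format_string(const char* fmt, ...) {
    va_list args;
    char* out;
    va_start(args, fmt);
    out = format_string_va(fmt, args);
    va_end(args);
    return out;
}
```

Measuring first removes the fixed 256- and 512-byte buffers that the old code depended on, at the cost of formatting every message twice.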
+1 -2
@@ -28,10 +28,9 @@ void log_error(const char* msg);
  * It additionally supports the `%S` format specifier, which can be used to format a `String` structure from `string.h`.
  *
  * @param loc The location where the error occurred.
- * @param to_column The column number where the error ends.
  * @param msg The error message to log. This can contain format specifiers like printf, and the additional arguments will be formatted into the message.
  * @param ... Additional arguments to format into the error message.
  */
-void log_on_line(Location* loc, int to_column, const char* msg, ...);
+void log_on_line(Location* loc, const char* msg, ...);
 #endif
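With `to_column` gone, the underline width comes from the location alone: `column_end - column_start + 1`, clamped to at least one caret. A small sketch of that computation in isolation, assuming 1-based columns as the `column_start - 1` indent in `log.c` implies; `make_caret_line` is a hypothetical helper, not a function from this repository.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap-allocated line of spaces followed by carets,
 * or NULL on allocation failure. The caller frees the result. */
char* make_caret_line(size_t prefix_len, int column_start, int column_end) {
    int caret_len = column_end - column_start + 1;
    size_t indent;
    char* line;

    if (caret_len < 1) caret_len = 1;        /* always draw at least one caret */
    if (column_start < 1) column_start = 1;  /* columns are assumed 1-based */
    indent = prefix_len + (size_t)(column_start - 1);

    line = malloc(indent + (size_t)caret_len + 1);
    if (line == NULL) {
        return NULL;
    }
    memset(line, ' ', indent);
    memset(line + indent, '^', (size_t)caret_len);
    line[indent + (size_t)caret_len] = '\0';
    return line;
}
```

Here `prefix_len` stands for the width of the `"<line>| "` prefix, so the carets line up under the offending token in the echoed source line.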
+1
@@ -2,4 +2,5 @@
 int main(int argc, char** argv) {
     puts("Hello, world");
+    return 0;
 }
-102
@@ -1,102 +0,0 @@
#include "parser.h"
#include "log.h"
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
Module* parser_parse(TokenStream* ts) {
Token t = tokenstream_next(ts);
if (t.token != TOKEN_MODULE) {
log_on_line(&t.location, t.location.column_end, "expected 'module' keyword");
return NULL;
}
t = tokenstream_next(ts);
if (t.token != TOKEN_IDENTIFIER) {
log_on_line(&t.location, t.location.column_end, "expected module name");
return NULL;
}
Module* module = (Module*)malloc(sizeof(Module));
if (module == NULL) {
fprintf(stderr, "Out of memory\n");
exit(1);
}
module->name = (char*)malloc(t.text.length + 1);
if (module->name == NULL) {
fprintf(stderr, "Out of memory\n");
exit(1);
}
memcpy(module->name, t.text.data, t.text.length);
module->name[t.text.length] = '\0';
t = tokenstream_next(ts);
if (t.token != TOKEN_SEMICOLON) {
log_on_line(&t.location, t.location.column_end, "expected ';' after module name");
parser_free(module);
return NULL;
}
module->imports = NULL;
module->import_count = 0;
while (1) {
t = tokenstream_next(ts);
if (t.token != TOKEN_IMPORT) {
break;
}
ImportDeclaration* new_imports = realloc(module->imports, (module->import_count + 1) * sizeof(ImportDeclaration));
if (!new_imports) {
fprintf(stderr, "Out of memory\n");
exit(1);
}
module->imports = new_imports;
t = tokenstream_next(ts);
bool is_public = false;
if (t.token == TOKEN_IDENTIFIER && strncmp(t.text.data, "public", t.text.length) == 0) {
is_public = true;
t = tokenstream_next(ts);
}
if (t.token != TOKEN_IDENTIFIER) {
log_on_line(&t.location, t.location.column_end, "expected module name to import");
parser_free(module);
return NULL;
}
module->imports[module->import_count].module_name = (char*)malloc(t.text.length + 1);
if (!module->imports[module->import_count].module_name) {
fprintf(stderr, "Out of memory\n");
exit(1);
}
memcpy(module->imports[module->import_count].module_name, t.text.data, t.text.length);
module->imports[module->import_count].module_name[t.text.length] = '\0';
module->imports[module->import_count].is_public = is_public;
module->import_count++;
t = tokenstream_next(ts);
if (t.token != TOKEN_SEMICOLON) {
log_on_line(&t.location, t.location.column_end, "expected ';' after import");
parser_free(module);
return NULL;
}
}
return module;
}
void parser_free(Module* module) {
if (module == NULL) return;
if (module->imports != NULL) {
for (size_t i = 0; i < module->import_count; i++) {
free(module->imports[i].module_name);
}
free(module->imports);
}
free(module->name);
free(module);
}
+1 -8
@@ -10,13 +10,6 @@
  * @param ts The TokenStream to read.
  * @returns The parsed module.
  */
-Module* parser_parse(TokenStream* ts);
+ModuleTree* parser_parse(TokenStream* ts);
-/**
- * Frees the parsed AST.
- *
- * @param module The AST return by parser_parse.
- */
-void parser_free(Module* module);
 #endif
+52
@@ -0,0 +1,52 @@
#include "internal.h"
#include "../str.h"
#include "../log.h"
#include <stdlib.h>
void parser_next_token(Parser* p) {
p->token = tokenstream_next(p->ts);
}
bool parser_accept(Parser* p, TokenType token) {
if (p->token.token == token) {
parser_next_token(p);
return true;
}
return false;
}
bool parser_expect(Parser* p, TokenType token, const char* msg) {
if (parser_accept(p, token)) {
return true;
}
log_on_line(&p->token.location, msg);
return false;
}
bool parser_peek(Parser* p, TokenType token) {
if (p->token.token == token) {
return true;
}
return false;
}
bool parser_require(Parser* p, TokenType token, const char* msg) {
if (parser_peek(p, token)) {
return true;
}
log_on_line(&p->token.location, msg);
return false;
}
char* parser_to_text(Parser* p) {
char* str = string_copy(p->token.text);
parser_next_token(p);
return str;
}
bool parser_accept_primitive(Parser* p) {
return parser_peek(p, TOKEN_I8) || parser_peek(p, TOKEN_I16) ||
parser_peek(p, TOKEN_I32) || parser_peek(p, TOKEN_I64) ||
parser_peek(p, TOKEN_U8) || parser_peek(p, TOKEN_U16) ||
parser_peek(p, TOKEN_U32) || parser_peek(p, TOKEN_U64);
}
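The helpers above differ only in whether they consume the token and whether they log: `parser_peek` and `parser_require` look without consuming, while `parser_accept` and `parser_expect` advance on a match. A self-contained sketch of the peek/accept distinction over a plain token array follows; `TokKind`, `Cursor`, and the `cursor_*` functions are hypothetical stand-ins and not part of the repository.

```c
#include <stddef.h>

typedef enum { TOK_MODULE, TOK_IDENTIFIER, TOK_SEMICOLON, TOK_EOF } TokKind;

/* Hypothetical cursor over a token array that ends with TOK_EOF. */
typedef struct {
    const TokKind* tokens;
    size_t pos;
} Cursor;

/* Looks at the current token without consuming it. */
int cursor_peek(const Cursor* c, TokKind kind) {
    return c->tokens[c->pos] == kind;
}

/* Consumes the current token only when it matches. */
int cursor_accept(Cursor* c, TokKind kind) {
    if (!cursor_peek(c, kind)) {
        return 0;
    }
    if (kind != TOK_EOF) {
        c->pos++; /* never step past the end marker */
    }
    return 1;
}

/* Parses `module <identifier> ;` and reports the index of the failing token. */
int parse_module_header(Cursor* c, size_t* error_pos) {
    if (cursor_accept(c, TOK_MODULE) &&
        cursor_accept(c, TOK_IDENTIFIER) &&
        cursor_accept(c, TOK_SEMICOLON)) {
        return 1;
    }
    *error_pos = c->pos;
    return 0;
}
```

Because a failed accept leaves the cursor in place, the error position always points at the first token that did not fit, which is the same token `log_on_line` receives in the real parser.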
+87
@@ -0,0 +1,87 @@
#include "internal.h"
#include <stdlib.h>
#include <string.h>
bool parse_import_declaration(Parser* p, ModuleTree* module, bool is_public) {
module->import_count++;
module->imports = realloc(module->imports, sizeof(ImportTree) * module->import_count);
ImportTree* import = &module->imports[module->import_count - 1];
memset(import, 0, sizeof(ImportTree));
import->is_public = is_public;
if (!parser_require(p, TOKEN_IDENTIFIER, "expected module identifier")) {
return false;
}
import->module_name = parser_to_text(p);
if (!parser_expect(p, TOKEN_SEMICOLON, "expected ';' after import")) {
return false;
}
return true;
}
bool parse_alias_declaration(Parser* p, ModuleTree* module, bool is_public) {
(void)is_public;
module->alias_count++;
module->aliases = realloc(module->aliases, sizeof(AliasTree) * module->alias_count);
AliasTree* alias = &module->aliases[module->alias_count - 1];
memset(alias, 0, sizeof(AliasTree));
if (!parser_require(p, TOKEN_IDENTIFIER, "expected alias identifier")) {
return false;
}
alias->name = parser_to_text(p);
if (!parser_expect(p, TOKEN_ASSIGN, "expected '=' after alias name")) {
return false;
}
if (!parse_type_expression(p, &alias->value)) {
return false;
}
if (!parser_expect(p, TOKEN_SEMICOLON, "expected ';' after alias declaration")) {
return false;
}
return true;
}
bool parse_variable_declaration(Parser* p, ModuleTree* module, bool is_public, bool is_static, bool is_const) {
module->variable_count++;
module->variables = realloc(module->variables, sizeof(VariableTree) * module->variable_count);
VariableTree* var = &module->variables[module->variable_count - 1];
memset(var, 0, sizeof(VariableTree));
var->is_public = is_public;
var->is_static = is_static;
var->is_const = is_const;
if (parser_accept_primitive(p)) {
if (!parse_type_expression(p, &var->type)) {
return false;
}
}
if (!parser_require(p, TOKEN_IDENTIFIER, "expected variable identifier")) {
return false;
}
var->name = parser_to_text(p);
if (parser_accept(p, TOKEN_ASSIGN)) {
var->initializer = malloc(sizeof(ExpressionTree));
if (!parse_expression(p, var->initializer)) {
return false;
}
}
if (!parser_expect(p, TOKEN_SEMICOLON, "expected ';' after variable declaration")) {
return false;
}
return true;
}
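Each declaration parser above grows its array by one element with `realloc` before filling the new slot. One way to write that step so a failed `realloc` neither loses the old block nor leaves the count ahead of the storage is to go through a temporary pointer and bump the count last. The sketch uses a hypothetical `IntList` in place of the module's arrays.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable list standing in for imports, aliases, or variables. */
typedef struct {
    int* items;
    size_t count;
} IntList;

/* Appends a zeroed slot and returns it, or NULL if the list could not grow.
 * On failure the list is left exactly as it was. */
int* int_list_append(IntList* list) {
    int* grown = realloc(list->items, sizeof(int) * (list->count + 1));
    if (grown == NULL) {
        return NULL;
    }
    list->items = grown;
    memset(&list->items[list->count], 0, sizeof(int));
    return &list->items[list->count++];
}
```

Zeroing the new slot before handing it out matches the `memset` calls in the parsers, so a partially filled entry can still be freed safely after a later parse error.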
+98
@@ -0,0 +1,98 @@
#include "internal.h"
#include "../log.h"
#include <stdlib.h>
bool parse_primitive_type_expression(Parser* p, TypeTree* expr) {
if (parser_accept(p, TOKEN_U8)) {
expr->tag = TYPE_TREE_BUILTIN;
expr->builtin.bitSize = 8;
expr->builtin.isSigned = false;
return true;
} else if (parser_accept(p, TOKEN_U16)) {
expr->tag = TYPE_TREE_BUILTIN;
expr->builtin.bitSize = 16;
expr->builtin.isSigned = false;
return true;
} else if (parser_accept(p, TOKEN_U32)) {
expr->tag = TYPE_TREE_BUILTIN;
expr->builtin.bitSize = 32;
expr->builtin.isSigned = false;
return true;
} else if (parser_accept(p, TOKEN_U64)) {
expr->tag = TYPE_TREE_BUILTIN;
expr->builtin.bitSize = 64;
expr->builtin.isSigned = false;
return true;
} else if (parser_accept(p, TOKEN_I8)) {
expr->tag = TYPE_TREE_BUILTIN;
expr->builtin.bitSize = 8;
expr->builtin.isSigned = true;
return true;
} else if (parser_accept(p, TOKEN_I16)) {
expr->tag = TYPE_TREE_BUILTIN;
expr->builtin.bitSize = 16;
expr->builtin.isSigned = true;
return true;
} else if (parser_accept(p, TOKEN_I32)) {
expr->tag = TYPE_TREE_BUILTIN;
expr->builtin.bitSize = 32;
expr->builtin.isSigned = true;
return true;
} else if (parser_accept(p, TOKEN_I64)) {
expr->tag = TYPE_TREE_BUILTIN;
expr->builtin.bitSize = 64;
expr->builtin.isSigned = true;
return true;
} else {
log_on_line(&p->token.location, "expected type expression");
return false;
}
}
bool parse_array_type_expression(Parser* p, TypeTree* expr) {
TypeTree elementType;
if (!parse_primitive_type_expression(p, &elementType)) {
return false;
}
if (parser_accept(p, TOKEN_BRACKET_OPEN)) {
expr->tag = TYPE_TREE_ARRAY;
expr->array.array = malloc(sizeof(TypeTree));
*expr->array.array = elementType;
if (!parser_expect(p, TOKEN_BRACKET_CLOSE, "expected ']' to end array type")) {
return false;
}
} else {
*expr = elementType;
return true;
}
return true;
}
bool parse_type_expression(Parser* p, TypeTree* expr) {
return parse_array_type_expression(p, expr);
}
bool parse_expression(Parser* p, ExpressionTree* expr) {
if (parser_peek(p, TOKEN_INTEGER)) {
expr->tag = EXPRESSION_TREE_INTEGER;
expr->integer = atoi(p->token.text.data);
parser_next_token(p);
return true;
} else if (parser_peek(p, TOKEN_STRING)) {
expr->tag = EXPRESSION_TREE_STRING;
expr->string = parser_to_text(p);
return true;
} else if (parser_accept(p, TOKEN_TRUE)) {
expr->tag = EXPRESSION_TREE_BOOLEAN;
expr->boolean = true;
return true;
} else if (parser_accept(p, TOKEN_FALSE)) {
expr->tag = EXPRESSION_TREE_BOOLEAN;
expr->boolean = false;
return true;
}
log_on_line(&p->token.location, "expected expression");
return false;
}
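`parse_expression` hands `p->token.text.data` straight to `atoi`. If token text is a length-delimited slice into the source line, as the `%.*s` formatting in `log.c` suggests, the conversion only stops where the source happens to contain a non-digit, and overflow goes unreported. The tokenizer is not shown here, so that is an assumption; under it, a bounded conversion could look like the sketch below, where `parse_int_slice` is a hypothetical helper.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Converts the first `length` bytes of `data` to an int.
 * Returns 1 on success, 0 on empty input, stray characters, or overflow. */
int parse_int_slice(const char* data, size_t length, int* out) {
    char buf[32];
    char* end;
    long value;

    if (length == 0 || length >= sizeof(buf)) {
        return 0;
    }
    memcpy(buf, data, length);
    buf[length] = '\0'; /* terminate the copy, not the source */

    errno = 0;
    value = strtol(buf, &end, 10);
    if (errno == ERANGE || *end != '\0') {
        return 0;
    }
    if (value < INT_MIN || value > INT_MAX) {
        return 0;
    }
    *out = (int)value;
    return 1;
}
```

A failed conversion gives the parser a chance to report the literal through `log_on_line` instead of silently storing a wrong value.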
+1
@@ -0,0 +1 @@
PARSER_SRC := v0/parser/core.c v0/parser/expression.c v0/parser/declaration.c v0/parser/module.c
+36
@@ -0,0 +1,36 @@
#ifndef PARSER_INTERNAL_H
#define PARSER_INTERNAL_H
#include "../parser.h"
#include "../token.h"
#include "../ast.h"
typedef struct {
TokenStream* ts;
Token token;
} Parser;
// Core functions
void parser_next_token(Parser* p);
bool parser_accept(Parser* p, TokenType token);
bool parser_expect(Parser* p, TokenType token, const char* msg);
bool parser_peek(Parser* p, TokenType token);
bool parser_require(Parser* p, TokenType token, const char* msg);
char* parser_to_text(Parser* p);
bool parser_accept_primitive(Parser* p);
// Base parsing (expressions, types)
bool parse_primitive_type_expression(Parser* p, TypeTree* expr);
bool parse_array_type_expression(Parser* p, TypeTree* expr);
bool parse_type_expression(Parser* p, TypeTree* expr);
bool parse_expression(Parser* p, ExpressionTree* expr);
// Declaration parsing
bool parse_import_declaration(Parser* p, ModuleTree* module, bool is_public);
bool parse_alias_declaration(Parser* p, ModuleTree* module, bool is_public);
bool parse_variable_declaration(Parser* p, ModuleTree* module, bool is_public, bool is_static, bool is_const);
// Module parsing
bool parse_module_declaration(Parser* p, ModuleTree* module);
#endif
+87
@@ -0,0 +1,87 @@
#include "internal.h"
#include "../log.h"
#include <stdlib.h>
#include <string.h>
bool parse_module_declaration(Parser* p, ModuleTree* module) {
    if (!parser_expect(p, TOKEN_MODULE, "expected keyword 'module'")) {
        return false;
    }
    if (!parser_require(p, TOKEN_IDENTIFIER, "expected module identifier")) {
        return false;
    }
    module->name = parser_to_text(p);
    return parser_expect(p, TOKEN_SEMICOLON, "expected ';' after module name");
}

ModuleTree* parser_parse(TokenStream* ts) {
    Parser* p = malloc(sizeof(Parser));
    if (p == NULL) {
        return NULL;
    }
    p->ts = ts;
    parser_next_token(p);

    // calloc zero-initializes the tree, replacing the malloc + memset pair.
    ModuleTree* module = calloc(1, sizeof(ModuleTree));
    if (module == NULL) {
        free(p);
        return NULL;
    }

    if (!parse_module_declaration(p, module)) {
        goto fail;
    }

    while (!parser_peek(p, TOKEN_EOF)) {
        bool is_public = false;
        bool is_static = false;
        bool is_const = false;
        bool terminal = false;

        // Accumulate modifiers until a keyword that starts a declaration is seen.
        while (!terminal) {
            if (parser_accept(p, TOKEN_IMPORT)) {
                if (is_static || is_const) {
                    log_on_line(&p->token.location, "import declarations cannot be static or const");
                    goto fail;
                }
                if (!parse_import_declaration(p, module, is_public)) {
                    goto fail;
                }
                terminal = true;
            } else if (parser_accept(p, TOKEN_ALIAS)) {
                if (is_static || is_const) {
                    log_on_line(&p->token.location, "alias declarations cannot be static or const");
                    goto fail;
                }
                if (!parse_alias_declaration(p, module, is_public)) {
                    goto fail;
                }
                terminal = true;
            } else if (parser_accept(p, TOKEN_PUBLIC)) {
                is_public = true;
            } else if (parser_accept(p, TOKEN_STATIC)) {
                is_static = true;
            } else if (parser_accept(p, TOKEN_CONST)) {
                is_const = true;
            } else if (parser_accept(p, TOKEN_VAR) || parser_accept_primitive(p)) {
                if (!parse_variable_declaration(p, module, is_public, is_static, is_const)) {
                    goto fail;
                }
                terminal = true;
            } else {
                log_on_line(&p->token.location, "unexpected token");
                goto fail;
            }
        }
    }

    free(p);
    return module;

fail:
    free(p);
    ast_free_module(module);
    return NULL;
}
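The modifier loop in `parser_parse` can be illustrated in isolation. The following is a minimal sketch, not the project's parser: the `Tok` enum, the `Decl` struct and `read_declaration` are hypothetical names invented for this example, and a plain token array stands in for the real `TokenStream`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical token kinds; the real parser uses TOKEN_* values. */
typedef enum { TOK_PUBLIC, TOK_STATIC, TOK_CONST, TOK_IMPORT, TOK_VAR, TOK_EOF } Tok;

/* Hypothetical result: the modifiers seen and the token that ended the run. */
typedef struct {
    bool is_public;
    bool is_static;
    bool is_const;
    Tok terminal;
    bool ok;
} Decl;

/* Consumes modifiers starting at toks[*pos] until a terminal token is reached. */
static Decl read_declaration(const Tok* toks, size_t* pos) {
    Decl d = { false, false, false, TOK_EOF, false };
    for (;;) {
        Tok t = toks[(*pos)++];
        switch (t) {
        case TOK_PUBLIC: d.is_public = true; break;
        case TOK_STATIC: d.is_static = true; break;
        case TOK_CONST:  d.is_const = true; break;
        case TOK_IMPORT:
            /* Imports reject static/const, mirroring the checks above. */
            d.terminal = t;
            d.ok = !d.is_static && !d.is_const;
            return d;
        case TOK_VAR:
            d.terminal = t;
            d.ok = true;
            return d;
        default:
            /* Any other token (including EOF) is unexpected here. */
            d.terminal = t;
            return d;
        }
    }
}
```

Feeding it `{ TOK_PUBLIC, TOK_CONST, TOK_VAR }` yields an accepted declaration with both flags set, while `{ TOK_STATIC, TOK_IMPORT }` is rejected.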
+8
@@ -0,0 +1,8 @@
#include "../test.h"
#include "../parser.h"
// Currently core utilities are tested indirectly through other parser tests.
// Placeholder for future explicit core utility tests.
static void test_parser_core_placeholder(void) {
    // No-op
}
+89
@@ -0,0 +1,89 @@
#include "../test.h"
#include "../parser.h"
#include <string.h>
#include <stdlib.h>
static void test_parser_missing_semicolon_import(void) {
    test_get_ast();
    assert_log_file("expected error for missing semicolon");
}

static void test_parser_bad_import_name(void) {
    test_get_ast();
    assert_log_file("expected error for bad import name");
}

static void test_parser_imports(void) {
    ModuleTree* m = test_get_ast();
    assert_not_null(m, "expected module to be parsed");
    assert_str("my_module", m->name, "expected name 'my_module'");
    assert_not_null(m->imports, "expected imports to be parsed");
    assert_int(1, (int)m->import_count, "expected one import");
    assert_str("other_module", m->imports[0].module_name, "expected import name 'other_module'");
    assert_false(m->imports[0].is_public, "expected import to not be public");
}

static void test_parser_public_imports(void) {
    ModuleTree* m = test_get_ast();
    assert_not_null(m, "expected module to be parsed");
    assert_str("my_module", m->name, "expected name 'my_module'");
    assert_not_null(m->imports, "expected imports to be parsed");
    assert_int(1, (int)m->import_count, "expected one import");
    assert_str("other_module", m->imports[0].module_name, "expected import name 'other_module'");
    assert_true(m->imports[0].is_public, "expected import to be public");
}

static void test_parser_alias_simple(void) {
    ModuleTree* m = test_get_ast();
    assert_not_null(m, "expected module to be parsed");
    assert_int(1, (int)m->alias_count, "expected correct number of aliases");
    AliasTree alias = m->aliases[0];
    assert_str("myalias", alias.name, "expected correct alias name");
}

static void test_parser_variable_simple(void) {
    ModuleTree* m = test_get_ast();
    assert_not_null(m, "expected module to be parsed");
    assert_int(1, (int)m->variable_count, "expected correct number of variables");
    VariableTree var = m->variables[0];
    assert_str("my_var", var.name, "expected correct variable name");
    assert_false(var.is_const, "expected not const");
    assert_false(var.is_static, "expected not static");
}

static void test_parser_variable_const(void) {
    ModuleTree* m = test_get_ast();
    assert_not_null(m, "expected module to be parsed");
    assert_int(1, (int)m->variable_count, "expected correct number of variables");
    VariableTree var = m->variables[0];
    assert_str("my_const", var.name, "expected correct variable name");
    assert_true(var.is_const, "expected const");
    assert_false(var.is_static, "expected not static");
}

static void test_parser_variable_static(void) {
    ModuleTree* m = test_get_ast();
    assert_not_null(m, "expected module to be parsed");
    assert_int(1, (int)m->variable_count, "expected correct number of variables");
    VariableTree var = m->variables[0];
    assert_str("my_static", var.name, "expected correct variable name");
    assert_false(var.is_const, "expected not const");
    assert_true(var.is_static, "expected static");
}

static void test_parser_multiple_vars(void) {
    ModuleTree* m = test_get_ast();
    assert_not_null(m, "expected module to be parsed");
    assert_int(2, (int)m->variable_count, "expected correct number of variables");
    assert_str("var1", m->variables[0].name, "expected first variable name 'var1'");
    assert_str("var2", m->variables[1].name, "expected second variable name 'var2'");
}
+52
@@ -0,0 +1,52 @@
#include "../test.h"
#include "../parser.h"
#include <string.h>
#include <stdlib.h>
static void test_parser_alias_simple_type(void) {
    ModuleTree* m = test_get_ast();
    assert_not_null(m, "expected module to be parsed");
    assert_int(1, (int)m->alias_count, "expected correct number of aliases");
    AliasTree alias = m->aliases[0];
    assert_int(TYPE_TREE_BUILTIN, alias.value.tag, "expected correct alias tag");
    assert_int(32, alias.value.builtin.bitSize, "expected bitSize 32");
    assert_true(alias.value.builtin.isSigned, "expected signed");
}

static void test_parser_alias_array(void) {
    ModuleTree* m = test_get_ast();
    assert_not_null(m, "expected module to be parsed");
    assert_int(1, (int)m->alias_count, "expected correct number of aliases");
    AliasTree alias = m->aliases[0];
    assert_int(TYPE_TREE_ARRAY, alias.value.tag, "expected correct alias tag");
    TypeTree* valueType = alias.value.array.array;
    assert_not_null(valueType, "expected pointer to array type");
    assert_int(TYPE_TREE_BUILTIN, valueType->tag, "expected correct type tag");
    assert_int(32, valueType->builtin.bitSize, "expected bitSize 32");
    assert_true(valueType->builtin.isSigned, "expected signed");
}

static void test_parser_variable_init(void) {
    ModuleTree* m = test_get_ast();
    assert_not_null(m, "expected module to be parsed");
    assert_int(1, (int)m->variable_count, "expected 1 variable");
    VariableTree* var = &m->variables[0];
    assert_str("x", var->name, "expected variable name 'x'");
    assert_not_null(var->initializer, "expected variable to have an initializer");
    assert_int(EXPRESSION_TREE_INTEGER, var->initializer->tag, "expected integer initializer");
    assert_int(123, var->initializer->integer, "expected value 123");
}

static void test_parser_variable_simple_type(void) {
    ModuleTree* m = test_get_ast();
    assert_not_null(m, "expected module to be parsed");
    assert_int(1, (int)m->variable_count, "expected correct number of variables");
    VariableTree var = m->variables[0];
    assert_int(TYPE_TREE_BUILTIN, var.type.tag, "expected correct type tag");
    assert_int(32, var.type.builtin.bitSize, "expected bitSize 32");
    assert_true(var.type.builtin.isSigned, "expected signed");
}
+21
@@ -0,0 +1,21 @@
#include "../test.h"
#include "../parser.h"
#include <string.h>
#include <stdlib.h>
static void test_parser_module_name(void) {
    ModuleTree* m = test_get_ast();
    assert_not_null(m, "expected module to be parsed");
    assert_str("my_module", m->name, "expected name 'my_module'");
}

static void test_parser_bad_module_name(void) {
    test_get_ast();
    assert_log_file("expected error to be logged for bad module name");
}

static void test_parser_missing_semicolon_module(void) {
    test_get_ast();
    assert_log_file("expected error for missing semicolon");
}
+11
@@ -0,0 +1,11 @@
#include "str.h"
#include <string.h>
#include <stdlib.h>
char* string_copy(String string) {
    char* str = malloc(string.length + 1);
    // Propagate allocation failure instead of writing through NULL.
    if (str == NULL) {
        return NULL;
    }
    memcpy(str, string.data, string.length);
    str[string.length] = '\0';
    return str;
}
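A hedged usage sketch: because a `String` carries an explicit length and need not be null-terminated, `string_copy` is the bridge to APIs that expect C strings. The block redefines `String` and `string_copy` locally so that it compiles on its own (the struct layout is taken from `str.h` as shown in this change); `first_word` is a hypothetical helper that is not part of the change.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Local stand-in for the String struct declared in str.h. */
typedef struct {
    char* data;
    size_t length;
} String;

/* Local stand-in for string_copy; returns NULL if allocation fails. */
static char* string_copy(String string) {
    char* str = malloc(string.length + 1);
    if (str == NULL) {
        return NULL;
    }
    memcpy(str, string.data, string.length);
    str[string.length] = '\0';
    return str;
}

/* Hypothetical helper: heap copy of the first space-delimited word of text. */
static char* first_word(char* text) {
    String slice;
    slice.data = text;
    slice.length = strcspn(text, " ");
    return string_copy(slice);
}
```

The caller owns the returned buffer and must `free` it, as the doc comment on `string_copy` notes.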
+12 -5
@@ -1,8 +1,8 @@
 /**
  * Contains the definition of the String structure, which is a simple representation of a string in C.
  */
-#ifndef STRING_H
-#define STRING_H
+#ifndef STR_H
+#define STR_H
 #include <stddef.h>
@@ -10,11 +10,18 @@
  * A simple string structure that holds a pointer to the character data and its length.
  */
 typedef struct {
-    /// @brief A pointer to the character data of the string.
     char* data;
-    /// @brief The length of the string.
     size_t length;
 } String;
+/**
+ * Creates a copy of a string.
+ *
+ * Note that this copy has to be freed afterwards.
+ *
+ * @param string The string to copy.
+ * @returns A null-terminated copy of the string.
+ */
+char* string_copy(String string);
 #endif
+165 -71
@@ -1,43 +1,43 @@
 #include "test.h"
+#include "util.h"
+#include "parser.h"
 #include <setjmp.h>
 #include <stdio.h>
 #include <string.h>
 #include <stdlib.h>
 static jmp_buf s_testJmp;
-static const char* s_failMsg;
+static char s_failMsg[1024];
 static char* s_logOutput = NULL;
 static const char* s_currentTestName = NULL;
 static char* s_testSource = NULL;
+static ModuleTree* s_currentModule = NULL;
+static TokenStream* s_currentTokenStream = NULL;
 void fail(const char* msg) {
-    s_failMsg = msg;
+    if (msg) {
+        strncpy(s_failMsg, msg, sizeof(s_failMsg) - 1);
+        s_failMsg[sizeof(s_failMsg) - 1] = '\0';
+    } else {
+        s_failMsg[0] = '\0';
+    }
     longjmp(s_testJmp, 1);
 }
-void assert_not_null(void* ptr, const char* msg) {
-    if (ptr == NULL) {
-        fail(msg);
-    }
-}
-void assert_str(const char* expected, const char* actual, const char* msg) {
-    if (expected == NULL || actual == NULL || strcmp(expected, actual) != 0) {
-        fail(msg);
-    }
-}
-void assert_log(const char* expected, const char* msg) {
-    assert_str(expected, s_logOutput, msg);
-}
 char* read_file_content(const char* filepath) {
-    FILE* f = fopen(filepath, "r");
+    FILE* f;
+    long size;
+    char* content;
+    f = fopen(filepath, "r");
     if (!f) return NULL;
     fseek(f, 0, SEEK_END);
-    long size = ftell(f);
+    size = ftell(f);
     fseek(f, 0, SEEK_SET);
-    char* content = malloc(size + 1);
+    content = malloc(size + 1);
     if (!content) {
         fclose(f);
         return NULL;
@@ -48,37 +48,105 @@ char* read_file_content(const char* filepath) {
     return content;
 }
-void assert_log_file(const char* msg) {
-    char filepath[256];
-    snprintf(filepath, sizeof(filepath), "v0/tests/%s.log", s_currentTestName);
-    const char* generate = getenv("GENERATE_GOLDEN");
+void assert_not_null(void* ptr, const char* msg) {
+    if (ptr == NULL) {
+        fail(msg);
+    }
+}
+void assert_string(const char* expected, String actual, const char* msg) {
+    if (expected == NULL || actual.data == NULL || strlen(expected) != actual.length || strncmp(expected, actual.data, actual.length) != 0) {
+        fail(msg);
+    }
+}
+void assert_str(const char* expected, const char* actual, const char* msg) {
+    if (expected == NULL || actual == NULL || strcmp(expected, actual) != 0) {
+        fail(msg);
+    }
+}
+TokenStream* test_get_tokenstream(void) {
+    if (s_currentTokenStream == NULL) {
+        char* filepath = NULL;
+        filepath = format_string("v0/tests/%s.c2", s_currentTestName);
+        if (!filepath) {
+            fail("out of memory");
+            return NULL;
+        }
+        if (s_testSource) free(s_testSource);
+        s_testSource = read_file_content(filepath);
+        if (!s_testSource) {
+            puts(filepath);
+            free(filepath);
+            fail("could not read test source file");
+            return NULL;
+        }
+        s_currentTokenStream = tokenstream_open(filepath, s_testSource);
+        free(filepath);
+    }
+    return s_currentTokenStream;
+}
+ModuleTree* test_get_ast(void) {
+    if (s_currentModule == NULL) {
+        s_currentModule = parser_parse(test_get_tokenstream());
+    }
+    return s_currentModule;
+}
+void assert_log(const char* expected, const char* msg) {
+    assert_str(expected, s_logOutput, msg);
+}
+void assert_log_file(const char* msg) {
+    char* filepath = format_string("v0/tests/%s.log", s_currentTestName);
+    const char* generate;
+    char* content;
+    if (!filepath) {
+        fail("out of memory");
+        return;
+    }
+    generate = getenv("GENERATE_GOLDEN");
     if (generate && strcmp(generate, "1") == 0) {
         FILE* f = fopen(filepath, "w");
         if (!f) {
+            free(filepath);
             fail("could not open golden file for writing");
             return;
         }
         fputs(s_logOutput ? s_logOutput : "", f);
         fclose(f);
+        free(filepath);
         return;
     }
-    char* content = read_file_content(filepath);
+    content = read_file_content(filepath);
     if (!content) {
+        free(filepath);
         fail("could not open golden file for reading");
         return;
     }
-    assert_str(content, s_logOutput, msg);
+    bool match = strcmp(content, s_logOutput ? s_logOutput : "") == 0;
     free(content);
+    free(filepath);
+    if (!match) {
+        fail(msg);
+    }
 }
 void assert_int(int expected, int actual, const char* msg) {
     if (expected != actual) {
-        char buf[64];
-        snprintf(buf, sizeof(buf), "%s (expected %d, got %d)", msg, expected, actual);
-        fail(buf);
+        char* buf = format_string("%s (expected %d, got %d)", msg, expected, actual);
+        if (buf) {
+            fail(buf);
+            free(buf);
+        } else {
+            fail("out of memory");
+        }
     }
 }
@@ -94,20 +162,6 @@ void assert_false(bool condition, const char* msg) {
     }
 }
-TokenStream* tokenstream_get_test(void) {
-    char filepath[256];
-    snprintf(filepath, sizeof(filepath), "v0/tests/%s.c2", s_currentTestName);
-    if (s_testSource) free(s_testSource);
-    s_testSource = read_file_content(filepath);
-    if (!s_testSource) {
-        fail("could not read test source file");
-        return NULL;
-    }
-    return tokenstream_open(filepath, s_testSource);
-}
 static void log_append(const char* msg) {
     size_t oldLen = s_logOutput ? strlen(s_logOutput) : 0;
     size_t newLen = oldLen + strlen(msg) + 1;
@@ -124,7 +178,7 @@ static void log_append(const char* msg) {
     }
 }
-static void log_clear() {
+static void log_clear(void) {
     free(s_logOutput);
     s_logOutput = NULL;
 }
@@ -135,50 +189,72 @@ typedef struct {
 } TestCase;
 #include "test_token.c"
-#include "test_parser.c"
+#include "parser/test_module.c"
+#include "parser/test_declaration.c"
+#include "parser/test_expression.c"
+#include "parser/test_core.c"
 #include "test_log.c"
 static int s_totalTests;
 static int s_greenTests;
+#define TEST(name) {#name, name},
 static TestCase s_tests[] = {
-    {"tokenstream_open_fail", test_tokenstream_open_fail},
-    {"tokenstream_simple_keyword", test_tokenstream_simple_keyword},
-    {"tokenstream_keywords_and_symbols", test_tokenstream_keywords_and_symbols},
-    {"tokenstream_parentheses_and_brackets", test_tokenstream_parentheses_and_brackets},
-    {"tokenstream_comma", test_tokenstream_comma},
-    {"tokenstream_whitespace_ignored", test_tokenstream_whitespace_ignored},
-    {"tokenstream_void_function_signature", test_tokenstream_void_function_signature},
-    {"tokenstream_unknown_token", test_tokenstream_unknown_token},
-    {"tokenstream_info", test_tokenstream_info},
-    {"parser_module_name", test_parser_module_name},
-    {"parser_bad_module_name", test_parser_bad_module_name},
-    {"parser_missing_semicolon_module", test_parser_missing_semicolon_module},
-    {"parser_missing_semicolon_import", test_parser_missing_semicolon_import},
-    {"parser_bad_import_name", test_parser_bad_import_name},
-    {"parser_imports", test_parser_imports},
-    {"parser_public_imports", test_parser_public_imports},
-    {"log_error", test_log_error},
-    {"log_on_line", test_log_on_line},
-    {"log_on_line_variadic", test_log_on_line_variadic},
+    TEST(test_log_error)
+    TEST(test_log_on_line_variadic)
+    TEST(test_log_on_line)
+    TEST(test_parser_module_name)
+    TEST(test_parser_bad_module_name)
+    TEST(test_parser_missing_semicolon_module)
+    TEST(test_parser_missing_semicolon_import)
+    TEST(test_parser_bad_import_name)
+    TEST(test_parser_imports)
+    TEST(test_parser_public_imports)
+    TEST(test_parser_alias_simple)
+    TEST(test_parser_alias_simple_type)
+    TEST(test_parser_alias_array)
+    TEST(test_parser_variable_simple)
+    TEST(test_parser_variable_simple_type)
+    TEST(test_parser_variable_const)
+    TEST(test_parser_variable_init)
+    TEST(test_parser_variable_static)
+    TEST(test_parser_multiple_vars)
+    TEST(test_parser_core_placeholder)
+    TEST(test_tokenstream_comma)
+    TEST(test_tokenstream_info)
+    TEST(test_tokenstream_keywords_and_symbols)
+    TEST(test_tokenstream_open_fail)
+    TEST(test_tokenstream_parentheses_and_brackets)
+    TEST(test_tokenstream_primitive_types)
+    TEST(test_tokenstream_simple_keyword)
+    TEST(test_tokenstream_unknown_token)
+    TEST(test_tokenstream_void_function_signature)
+    TEST(test_tokenstream_whitespace_ignored)
 };
 int main(int argc, char** argv) {
+    const char** failedTests;
+    int failedCount;
     (void)argc;
     (void)argv;
     s_totalTests = sizeof(s_tests) / sizeof(s_tests[0]);
     s_greenTests = 0;
-    const char* failedTests[s_totalTests + 1];
-    int failedCount = 0;
+    // Allocate failed tests array dynamically to avoid VLAs
+    failedTests = (const char**)malloc((s_totalTests + 1) * sizeof(const char*));
+    failedCount = 0;
     for (int i = 0; i < s_totalTests; i++) {
-        s_currentTestName = s_tests[i].name;
+        // Add 5 to strip the 'test_' prefix.
+        s_currentTestName = s_tests[i].name + 5;
         log_set_output(log_append);
         printf("%s...", s_tests[i].name);
-        s_failMsg = NULL;
+        fflush(stdout);
+        s_failMsg[0] = '\0';
         if (setjmp(s_testJmp) == 0) {
             log_clear();
@@ -190,20 +266,38 @@ int main(int argc, char** argv) {
             printf(" [OK]\n");
             s_greenTests++;
         } else {
-            printf(" [FAIL]: %s\n", s_failMsg ? s_failMsg : "");
+            printf(" [FAIL]: %s\n", s_failMsg[0] ? s_failMsg : "");
             failedTests[failedCount++] = s_tests[i].name;
+            // Log output on failure
+            if (s_logOutput && s_logOutput[0]) {
+                printf("%s\n", s_logOutput);
+            }
         }
+        // Free AST and TokenStream after each test
+        if (s_currentModule) {
+            ast_free_module(s_currentModule);
+            s_currentModule = NULL;
+        }
+        if (s_currentTokenStream) {
+            tokenstream_close(s_currentTokenStream);
+            s_currentTokenStream = NULL;
+        }
+        fflush(stdout);
     }
     if (s_testSource) free(s_testSource);
+    log_clear();
     if (failedCount > 0) {
         printf("\nFailed tests:\n");
-        for (int i = 0; i < failedCount; i++) {
-            printf(" - %s\n", failedTests[i]);
+        for (int j = 0; j < failedCount; j++) {
+            printf(" - %s\n", failedTests[j]);
         }
     }
     printf("\n%d/%d tests passed.\n", s_greenTests, s_totalTests);
+    free(failedTests);
     return failedCount > 0 ? 1 : 0;
 }
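The harness pattern used here (a `fail` that `longjmp`s back into the runner, plus a `TEST` macro that stringifies the function name) can be sketched in isolation. This is a reduced illustration rather than the project's harness: the `fn` field name, `run_all`, and the two sample tests are assumptions made for the example.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef void (*Test)(void);

/* The `fn` field name is an assumption; only `name` is visible in the diff. */
typedef struct {
    const char* name;
    Test fn;
} TestCase;

static jmp_buf s_jmp;
static char s_msg[128];

/* Copies the message into a static buffer, then unwinds to the runner. */
static void fail(const char* msg) {
    strncpy(s_msg, msg ? msg : "", sizeof(s_msg) - 1);
    s_msg[sizeof(s_msg) - 1] = '\0';
    longjmp(s_jmp, 1);
}

static void test_passes(void) {}
static void test_fails(void) { fail("boom"); }

/* #name stringifies the identifier, so each test is listed only once. */
#define TEST(name) {#name, name},
static TestCase s_cases[] = {
    TEST(test_passes)
    TEST(test_fails)
};

/* Runs every case and returns the number of failures. */
static int run_all(void) {
    /* volatile keeps these well-defined across the longjmp. */
    volatile int failed = 0;
    size_t count = sizeof(s_cases) / sizeof(s_cases[0]);
    for (volatile size_t i = 0; i < count; i++) {
        /* name + 5 skips the "test_" prefix, as the runner above does. */
        const char* shortName = s_cases[i].name + 5;
        if (setjmp(s_jmp) == 0) {
            s_cases[i].fn();
            printf("%s [OK]\n", shortName);
        } else {
            failed++;
            printf("%s [FAIL]: %s\n", shortName, s_msg);
        }
    }
    return failed;
}
```

Copying the message into a static buffer matters because the caller's string may live on a stack frame that the `longjmp` abandons.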
+25 -3
@@ -5,8 +5,7 @@
 #define TEST_H
 #include "token.h"
+#include "ast.h"
-#include <stdbool.h>
 typedef void (*Test)(void);
@@ -37,6 +36,17 @@ void assert_not_null(void* ptr, const char* msg);
  */
 void assert_str(const char* expected, const char* actual, const char* msg);
+/**
+ * Asserts that a string has the expected value.
+ *
+ * Calls `fail` if the assertion does not hold.
+ *
+ * @param expected The expected value. This is typically a string literal.
+ * @param actual The actual value. This is typically an expression.
+ * @param msg The message to print if these do not match.
+ */
+void assert_string(const char* expected, String actual, const char* msg);
 /**
  * Asserts that the logged output matches the expected value.
  */
@@ -56,6 +66,8 @@ void assert_int(int expected, int actual, const char* msg);
 /**
  * Asserts that a condition is true.
  */
+#include "bool.h"
 void assert_true(bool condition, const char* msg);
 /**
@@ -66,7 +78,17 @@ void assert_false(bool condition, const char* msg);
 /**
  * Get the token stream used for this test.
  * It reads from the `v0/tests/xyz.c2` file, where xyz is the test name.
+ *
+ * At the end of the test, the tokenstream will be freed automatically by the test harness.
  */
-TokenStream* tokenstream_get_test(void);
+TokenStream* test_get_tokenstream(void);
+/**
+ * Gets a parsed module for this test.
+ * It reads from the `v0/tests/xyz.c2` file, where xyz is the test name.
+ *
+ * At the end of the test, the AST will be freed automatically by the test harness.
+ */
+ModuleTree* test_get_ast(void);
 #endif
+63
@@ -0,0 +1,63 @@
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <dirent.h>
int run_test(const char* dir_name) {
    char cmd[2048];
    char input_path[1024];
    char expected_path[1024];
    int n;

    // snprintf returns int; check for errors and cast before comparing to size_t.
    n = snprintf(input_path, sizeof(input_path), "v0/integration_tests/%s/input.c2", dir_name);
    if (n < 0 || (size_t)n >= sizeof(input_path)) {
        printf("Path buffer too small for %s\n", dir_name);
        return 1;
    }
    n = snprintf(expected_path, sizeof(expected_path), "v0/integration_tests/%s/expected.c", dir_name);
    if (n < 0 || (size_t)n >= sizeof(expected_path)) {
        printf("Path buffer too small for %s\n", dir_name);
        return 1;
    }

    // Run the compiler on the input and capture its output in actual.c.
    n = snprintf(cmd, sizeof(cmd), "./v0/bin/c2 %s > actual.c", input_path);
    if (n < 0 || (size_t)n >= sizeof(cmd)) {
        printf("Command buffer too small for %s\n", dir_name);
        return 1;
    }
    if (system(cmd) != 0) {
        printf("Failed to run compiler for %s\n", dir_name);
        return 1;
    }

    // Compare the captured output against the expected file.
    n = snprintf(cmd, sizeof(cmd), "diff -u %s actual.c", expected_path);
    if (n < 0 || (size_t)n >= sizeof(cmd)) {
        printf("Command buffer too small for %s\n", dir_name);
        return 1;
    }
    if (system(cmd) != 0) {
        printf("Test %s failed: Output mismatch\n", dir_name);
        return 1;
    }

    printf("Test %s passed\n", dir_name);
    return 0;
}

int main(void) {
    DIR* d = opendir("v0/integration_tests");
    if (!d) {
        perror("opendir");
        return 1;
    }

    struct dirent* dir;
    int passed = 0;
    int failed = 0;
    while ((dir = readdir(d)) != NULL) {
        // d_type is an extension (hence _DEFAULT_SOURCE); on filesystems that
        // report DT_UNKNOWN, the entry is skipped by this check.
        if (dir->d_type == DT_DIR && strcmp(dir->d_name, ".") != 0 && strcmp(dir->d_name, "..") != 0) {
            if (run_test(dir->d_name) == 0) {
                passed++;
            } else {
                failed++;
            }
        }
    }
    closedir(d);

    printf("\nTotal tests: %d, Passed: %d, Failed: %d\n", passed + failed, passed, failed);
    return failed > 0 ? 1 : 0;
}
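The repeated "format, then check for truncation" step in `run_test` could be factored into a small helper. A minimal sketch follows; `format_checked` is a hypothetical name and is not part of this change (the project's own `format_string` in `util.h` allocates instead, and its definition is not shown here).

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats into buf and reports whether the result fit.
 * vsnprintf returns the length it wanted to write, or a negative value on
 * error, so both cases are checked before the cast to size_t. */
static bool format_checked(char* buf, size_t size, const char* fmt, ...) {
    va_list args;
    int n;
    va_start(args, fmt);
    n = vsnprintf(buf, size, fmt, args);
    va_end(args);
    return n >= 0 && (size_t)n < size;
}
```

A call such as `format_checked(cmd, sizeof(cmd), "diff -u %s actual.c", expected_path)` would then replace each manual length comparison.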
+26 -21
@@ -1,47 +1,52 @@
 #include "test.h"
 #include "log.h"
 #include <string.h>
+#include <stdlib.h>
+#include "util.h"
-static char s_lastLoggedError[256];
+static char* s_lastLoggedError = NULL;
 static void mock_log(const char* msg) {
-    strncpy(s_lastLoggedError, msg, sizeof(s_lastLoggedError) - 1);
-    s_lastLoggedError[sizeof(s_lastLoggedError) - 1] = '\0';
+    free(s_lastLoggedError);
+    s_lastLoggedError = format_string("%s", msg ? msg : "");
 }
 static void test_log_error(void) {
     log_set_output(mock_log);
-    memset(s_lastLoggedError, 0, sizeof(s_lastLoggedError));
+    free(s_lastLoggedError);
+    s_lastLoggedError = NULL;
     log_error("test error message");
     assert_str("test error message", s_lastLoggedError, "expected 'test error message'");
-    log_set_output(NULL); // Reset to default
+    log_set_output(NULL);
+    free(s_lastLoggedError);
+    s_lastLoggedError = NULL;
 }
 static void test_log_on_line(void) {
-    Location loc = {
-        .filename = "v0/tests/log_on_line.c2",
-        .line_text = { "int main() []", 13 },
-        .line = 1,
-        .column_start = 12,
-        .column_end = 13
-    };
-    log_on_line(&loc, 13, "unexpected token");
+    Location loc;
+    loc.filename = "v0/tests/log_on_line.c2";
+    loc.line_text.data = "int main() []";
+    loc.line_text.length = 13;
+    loc.line = 1;
+    loc.column_start = 12;
+    loc.column_end = 13;
+    log_on_line(&loc, "unexpected token");
     assert_log_file("expected formatted error message");
 }
 static void test_log_on_line_variadic(void) {
-    Location loc = {
-        .filename = "v0/tests/log_on_line_variadic.c2",
-        .line_text = { "int main() []", 13 },
-        .line = 1,
-        .column_start = 12,
-        .column_end = 13
-    };
-    log_on_line(&loc, 13, "unexpected token '%c'", 'x');
+    Location loc;
+    loc.filename = "v0/tests/log_on_line_variadic.c2";
+    loc.line_text.data = "int main() []";
+    loc.line_text.length = 13;
+    loc.line = 1;
+    loc.column_start = 12;
+    loc.column_end = 13;
+    log_on_line(&loc, "unexpected token '%c'", 'x');
     assert_log_file("expected formatted error message with variadic args");
 }
-86
@@ -1,86 +0,0 @@
#include "test.h"
#include "parser.h"
#include <string.h>
static void test_parser_module_name(void) {
TokenStream* ts = tokenstream_open("test.c", "module my_module;");
Module* m = parser_parse(ts);
assert_not_null(m, "expected module to be parsed");
assert_str("my_module", m->name, "expected name 'my_module'");
parser_free(m);
tokenstream_close(ts);
}
static void test_parser_bad_module_name(void) {
TokenStream* ts = tokenstream_get_test();
Module* m = parser_parse(ts);
assert_log_file("expected error to be logged for bad module name");
parser_free(m);
tokenstream_close(ts);
}
static void test_parser_missing_semicolon_module(void) {
TokenStream* ts = tokenstream_get_test();
Module* m = parser_parse(ts);
assert_log_file("expected error for missing semicolon");
parser_free(m);
tokenstream_close(ts);
}
static void test_parser_missing_semicolon_import(void) {
TokenStream* ts = tokenstream_get_test();
Module* m = parser_parse(ts);
assert_log_file("expected error for missing semicolon");
parser_free(m);
tokenstream_close(ts);
}
static void test_parser_bad_import_name(void) {
TokenStream* ts = tokenstream_get_test();
Module* m = parser_parse(ts);
assert_log_file("expected error for bad import name");
parser_free(m);
tokenstream_close(ts);
}
static void test_parser_imports(void) {
TokenStream* ts = tokenstream_open("test.c", "module my_module; import other_module;");
Module* m = parser_parse(ts);
assert_not_null(m, "expected module to be parsed");
assert_str("my_module", m->name, "expected name 'my_module'");
assert_not_null(m->imports, "expected imports to be parsed");
assert_int(1, m->import_count, "expected one import");
assert_str("other_module", m->imports[0].module_name, "expected import name 'other_module'");
assert_false(m->imports[0].is_public, "expected import to not be public");
parser_free(m);
tokenstream_close(ts);
}
static void test_parser_public_imports(void) {
TokenStream* ts = tokenstream_open("test.c", "module my_module; import public other_module;");
Module* m = parser_parse(ts);
assert_not_null(m, "expected module to be parsed");
assert_str("my_module", m->name, "expected name 'my_module'");
assert_not_null(m->imports, "expected imports to be parsed");
assert_int(1, m->import_count, "expected one import");
assert_str("other_module", m->imports[0].module_name, "expected import name 'other_module'");
assert_true(m->imports[0].is_public, "expected import to be public");
parser_free(m);
tokenstream_close(ts);
}
+33 -38
@@ -1,6 +1,7 @@
 #include "test.h"
 #include "token.h"
 #include <string.h>
+#include <stdlib.h>
 static void test_tokenstream_open_fail(void) {
     TokenStream* ts = tokenstream_open(NULL, NULL);
@@ -8,19 +9,19 @@ static void test_tokenstream_open_fail(void) {
 }
 static void test_tokenstream_simple_keyword(void) {
-    TokenStream* ts = tokenstream_open("test.c", "module");
+    TokenStream* ts = test_get_tokenstream();
+    Token t;
+    Token eof;
-    Token t = tokenstream_next(ts);
+    t = tokenstream_next(ts);
     if (t.token != TOKEN_MODULE) fail("expected TOKEN_MODULE");
-    Token eof = tokenstream_next(ts);
+    eof = tokenstream_next(ts);
     if (eof.token != TOKEN_EOF) fail("expected EOF");
-    tokenstream_close(ts);
 }
 static void test_tokenstream_keywords_and_symbols(void) {
-    TokenStream* ts = tokenstream_open("test.c", "module main; import stdio;");
+    TokenStream* ts = test_get_tokenstream();
     if (tokenstream_next(ts).token != TOKEN_MODULE) fail("expected TOKEN_MODULE");
     if (tokenstream_next(ts).token != TOKEN_IDENTIFIER) fail("expected TOKEN_IDENTIFIER (main)");
@@ -29,24 +30,20 @@ static void test_tokenstream_keywords_and_symbols(void) {
     if (tokenstream_next(ts).token != TOKEN_IDENTIFIER) fail("expected TOKEN_IDENTIFIER (stdio)");
     if (tokenstream_next(ts).token != TOKEN_SEMICOLON) fail("expected TOKEN_SEMICOLON");
     if (tokenstream_next(ts).token != TOKEN_EOF) fail("expected EOF");
-    tokenstream_close(ts);
 }
 static void test_tokenstream_parentheses_and_brackets(void) {
-    TokenStream* ts = tokenstream_open("test.c", "()[]");
+    TokenStream* ts = test_get_tokenstream();
     if (tokenstream_next(ts).token != TOKEN_PARENT_OPEN) fail("expected TOKEN_PARENT_OPEN");
     if (tokenstream_next(ts).token != TOKEN_PARENT_CLOSE) fail("expected TOKEN_PARENT_CLOSE");
     if (tokenstream_next(ts).token != TOKEN_BRACKET_OPEN) fail("expected TOKEN_BRACKET_OPEN");
     if (tokenstream_next(ts).token != TOKEN_BRACKET_CLOSE) fail("expected TOKEN_BRACKET_CLOSE");
     if (tokenstream_next(ts).token != TOKEN_EOF) fail("expected EOF");
-    tokenstream_close(ts);
 }
 static void test_tokenstream_comma(void) {
-    TokenStream* ts = tokenstream_open("test.c", "a,b,c");
+    TokenStream* ts = test_get_tokenstream();
     if (tokenstream_next(ts).token != TOKEN_IDENTIFIER) fail("expected a");
     if (tokenstream_next(ts).token != TOKEN_COMMA) fail("expected comma");
@@ -54,65 +51,63 @@ static void test_tokenstream_comma(void) {
     if (tokenstream_next(ts).token != TOKEN_COMMA) fail("expected comma");
     if (tokenstream_next(ts).token != TOKEN_IDENTIFIER) fail("expected c");
     if (tokenstream_next(ts).token != TOKEN_EOF) fail("expected EOF");
-    tokenstream_close(ts);
 }
 static void test_tokenstream_whitespace_ignored(void) {
-    TokenStream* ts = tokenstream_open("test.c", " module \n\t import ; ");
+    TokenStream* ts = test_get_tokenstream();
     if (tokenstream_next(ts).token != TOKEN_MODULE) fail("expected TOKEN_MODULE");
     if (tokenstream_next(ts).token != TOKEN_IMPORT) fail("expected TOKEN_IMPORT");
     if (tokenstream_next(ts).token != TOKEN_SEMICOLON) fail("expected TOKEN_SEMICOLON");
     if (tokenstream_next(ts).token != TOKEN_EOF) fail("expected EOF");
-    tokenstream_close(ts);
 }
 static void test_tokenstream_void_function_signature(void) {
-    TokenStream* ts = tokenstream_open("test.c", "void main()");
+    TokenStream* ts = test_get_tokenstream();
     if (tokenstream_next(ts).token != TOKEN_VOID) fail("expected TOKEN_VOID");
     if (tokenstream_next(ts).token != TOKEN_IDENTIFIER) fail("expected TOKEN_IDENTIFIER");
     if (tokenstream_next(ts).token != TOKEN_PARENT_OPEN) fail("expected TOKEN_PARENT_OPEN");
     if (tokenstream_next(ts).token != TOKEN_PARENT_CLOSE) fail("expected TOKEN_PARENT_CLOSE");
     if (tokenstream_next(ts).token != TOKEN_EOF) fail("expected EOF");
-    tokenstream_close(ts);
 }
 static void test_tokenstream_unknown_token(void) {
-    TokenStream* ts = tokenstream_get_test();
+    TokenStream* ts = test_get_tokenstream();
     if (tokenstream_next(ts).token != TOKEN_UNKNOWN) fail("expected TOKEN_UNKNOWN");
     assert_log_file("expected error message for unknown token");
-    tokenstream_close(ts);
 }
 static void test_tokenstream_info(void) {
-    TokenStream* ts = tokenstream_open("test.c", "module main;");
+    TokenStream* ts = test_get_tokenstream();
+    Token t1;
+    Token t2;
-    Token t1 = tokenstream_next(ts);
+    t1 = tokenstream_next(ts);
     if (t1.token != TOKEN_MODULE) fail("expected TOKEN_MODULE");
assert_string("module", t1.text, "info: expected 'module'");
char buf1[32];
memcpy(buf1, t1.text.data, t1.text.length);
buf1[t1.text.length] = '\0';
assert_str("module", buf1, "info: expected 'module'");
if (t1.location.line != 1) fail("expected line 1"); if (t1.location.line != 1) fail("expected line 1");
if (t1.location.column_start != 1) fail("expected column 1"); if (t1.location.column_start != 1) fail("expected column 1");
Token t2 = tokenstream_next(ts); t2 = tokenstream_next(ts);
if (t2.token != TOKEN_IDENTIFIER) fail("expected TOKEN_IDENTIFIER"); if (t2.token != TOKEN_IDENTIFIER) fail("expected TOKEN_IDENTIFIER");
assert_string("main", t2.text, "info: expected 'main'");
char buf2[32];
memcpy(buf2, t2.text.data, t2.text.length);
buf2[t2.text.length] = '\0';
assert_str("main", buf2, "info: expected 'main'");
if (t2.location.line != 1) fail("expected line 1"); if (t2.location.line != 1) fail("expected line 1");
if (t2.location.column_start != 8) fail("expected column 8"); if (t2.location.column_start != 8) fail("expected column 8");
}
tokenstream_close(ts);
static void test_tokenstream_primitive_types(void) {
TokenStream* ts = test_get_tokenstream();
if (tokenstream_next(ts).token != TOKEN_I8) fail("expected TOKEN_I8");
if (tokenstream_next(ts).token != TOKEN_I16) fail("expected TOKEN_I16");
if (tokenstream_next(ts).token != TOKEN_I32) fail("expected TOKEN_I32");
if (tokenstream_next(ts).token != TOKEN_I64) fail("expected TOKEN_I64");
if (tokenstream_next(ts).token != TOKEN_U8) fail("expected TOKEN_U8");
if (tokenstream_next(ts).token != TOKEN_U16) fail("expected TOKEN_U16");
if (tokenstream_next(ts).token != TOKEN_U32) fail("expected TOKEN_U32");
if (tokenstream_next(ts).token != TOKEN_U64) fail("expected TOKEN_U64");
if (tokenstream_next(ts).token != TOKEN_EOF) fail("expected EOF");
} }
+9
@@ -0,0 +1,9 @@
module mymodule;
import foo;
alias myalias = i32[];
import bar;
alias otheralias = i32;
+3
@@ -0,0 +1,3 @@
module mymodule;
alias myalias = i32[];
+3
@@ -0,0 +1,3 @@
module mymodule;
alias myalias = i32;
+3
@@ -0,0 +1,3 @@
module mymodule;
alias myalias = i32;
+1
@@ -1 +1,2 @@
module mymodule;
import ; import ;
+4 -4
@@ -1,4 +1,4 @@
--- v0/tests/parser_bad_impo --- --- v0/tests/parser_bad_import_name.c2 ---
1| import ; 2| import ;
^^^^^^ ^
expected 'module' keyword expected module identifier
+2 -2
@@ -1,4 +1,4 @@
--- v0/tests/parser_bad_modu --- --- v0/tests/parser_bad_module_name.c2 ---
1| import other_module; 1| import other_module;
^^^^^^ ^^^^^^
expected 'module' keyword expected keyword 'module'
+2
@@ -0,0 +1,2 @@
module my_module;
import other_module;
+2 -2
@@ -1,4 +1,4 @@
--- --- v0/tests/parser_missing_semicolon_import.c2 ---
2| 1| module my_module; import other_module
^ ^
expected ';' after import expected ';' after import
+2 -3
@@ -1,5 +1,4 @@
--- --- v0/tests/parser_missing_semicolon_module.c2 ---
--- 1| module my_module
2|
^ ^
expected ';' after module name expected ';' after module name
+1
@@ -0,0 +1 @@
module my_module;
+4
@@ -0,0 +1,4 @@
module test_multiple_vars;
i32 var1;
i32 var2;
+3
@@ -0,0 +1,3 @@
module my_module;
public import other_module;
+3
@@ -0,0 +1,3 @@
module test_const_var;
const i32 my_const;
+2
@@ -0,0 +1,2 @@
module mymodule;
var x = 123;
+4
@@ -0,0 +1,4 @@
module my_module;
// Defines a global variable called my_var.
i32 my_var;
+4
@@ -0,0 +1,4 @@
module my_module;
// Defines a global variable called my_var.
i32 my_var;
+3
@@ -0,0 +1,3 @@
module test_static_var;
static i32 my_static;
+1
@@ -0,0 +1 @@
a,b,c
+1
@@ -0,0 +1 @@
module main;
@@ -0,0 +1 @@
module main; import stdio;
@@ -0,0 +1 @@
()[]
+1
@@ -0,0 +1 @@
i8 i16 i32 i64 u8 u16 u32 u64
+1
@@ -0,0 +1 @@
module
+1 -2
@@ -1,5 +1,4 @@
--- --- v0/tests/tokenstream_unknown_token.c2 ---
---
1| % 1| %
^ ^
unexpected token '%' unexpected token '%'
@@ -0,0 +1 @@
void main()
@@ -0,0 +1,2 @@
module
import ;
+69 -23
@@ -12,7 +12,7 @@ struct TokenStream {
int column; int column;
const char* line_start; const char* line_start;
// End of last non-EOF token /* End of last non-EOF token */
int last_line; int last_line;
int last_column_end; int last_column_end;
const char* last_line_start; const char* last_line_start;
@@ -26,11 +26,26 @@ typedef struct {
const char* keyword; const char* keyword;
TokenType token; TokenType token;
} KeywordMap; } KeywordMap;
static const KeywordMap keywords[] = { static const KeywordMap keywords[] = {
{"module", TOKEN_MODULE}, {"module", TOKEN_MODULE},
{"import", TOKEN_IMPORT}, {"import", TOKEN_IMPORT},
{"alias", TOKEN_ALIAS},
{"public", TOKEN_PUBLIC},
{"var", TOKEN_VAR},
{"const", TOKEN_CONST},
{"static", TOKEN_STATIC},
{"void", TOKEN_VOID}, {"void", TOKEN_VOID},
{"i8", TOKEN_I8},
{"i16", TOKEN_I16},
{"i32", TOKEN_I32},
{"i64", TOKEN_I64},
{"u8", TOKEN_U8},
{"u16", TOKEN_U16},
{"u32", TOKEN_U32},
{"u64", TOKEN_U64},
{"true", TOKEN_TRUE},
{"false", TOKEN_FALSE},
}; };
/** /**
@@ -39,7 +54,8 @@ static const KeywordMap keywords[] = {
*/ */
static TokenType lookup_keyword(const char* str, size_t length) { static TokenType lookup_keyword(const char* str, size_t length) {
int count = sizeof(keywords) / sizeof(keywords[0]); int count = sizeof(keywords) / sizeof(keywords[0]);
for (int i = 0; i < count; i++) { int i;
for (i = 0; i < count; i++) {
if (strlen(keywords[i].keyword) == length && if (strlen(keywords[i].keyword) == length &&
strncmp(keywords[i].keyword, str, length) == 0) { strncmp(keywords[i].keyword, str, length) == 0) {
return keywords[i].token; return keywords[i].token;
@@ -117,14 +133,22 @@ static Token create_token(TokenStream* ts, TokenType type, const char* text, siz
} }
TokenStream* tokenstream_open(const char* filename, const char* code) { TokenStream* tokenstream_open(const char* filename, const char* code) {
/* Declarations first for C89 */
TokenStream* ts;
const char* name_src;
if (code == NULL) return NULL; if (code == NULL) return NULL;
TokenStream* ts = (TokenStream*)malloc(sizeof(struct TokenStream)); ts = (TokenStream*)malloc(sizeof(struct TokenStream));
if (ts == NULL) { if (ts == NULL) {
return NULL; return NULL;
} }
ts->filename = strdup(filename ? filename : "unknown"); name_src = filename ? filename : "unknown";
ts->filename = malloc(strlen(name_src) + 1);
if (ts->filename) {
memcpy(ts->filename, name_src, strlen(name_src) + 1);
}
ts->code = code; ts->code = code;
ts->pos = 0; ts->pos = 0;
ts->line = 1; ts->line = 1;
@@ -143,14 +167,20 @@ void tokenstream_close(TokenStream* ts) {
} }
Token tokenstream_next(TokenStream* ts) { Token tokenstream_next(TokenStream* ts) {
/* Declarations first for C89 */
char c;
int start_line;
int start_column;
const char* line_start;
const char* start_text;
Token t;
if (ts == NULL) { if (ts == NULL) {
Token t = {0}; Token t = {0};
t.token = TOKEN_EOF; t.token = TOKEN_EOF;
return t; return t;
} }
char c;
/* Skip whitespace and comments */ /* Skip whitespace and comments */
while ((c = peek_char(ts)) != '\0') { while ((c = peek_char(ts)) != '\0') {
if (isspace(c)) { if (isspace(c)) {
@@ -182,26 +212,18 @@ Token tokenstream_next(TokenStream* ts) {
t.text.length = 0; t.text.length = 0;
t.location.filename = ts->filename; t.location.filename = ts->filename;
if (ts->pos > 0 && ts->code[ts->pos - 1] == '\n') {
t.location.line = ts->line;
t.location.column_start = 1;
t.location.column_end = 1;
t.location.line_text.data = (char*)ts->line_start;
t.location.line_text.length = get_line_length(ts->line_start);
} else {
t.location.line = ts->last_line; t.location.line = ts->last_line;
t.location.column_start = ts->last_column_end + 1; t.location.column_start = ts->last_column_end + 1;
t.location.column_end = ts->last_column_end + 1; t.location.column_end = ts->last_column_end + 1;
t.location.line_text.data = (char*)ts->last_line_start; t.location.line_text.data = (char*)ts->last_line_start;
t.location.line_text.length = get_line_length(ts->last_line_start); t.location.line_text.length = get_line_length(ts->last_line_start);
}
return t; return t;
} }
int start_line = ts->line; start_line = ts->line;
int start_column = ts->column; start_column = ts->column;
const char* line_start = ts->line_start; line_start = ts->line_start;
const char* start_text = &ts->code[ts->pos]; start_text = &ts->code[ts->pos];
c = read_char(ts); c = read_char(ts);
@@ -213,21 +235,45 @@ Token tokenstream_next(TokenStream* ts) {
case ']': return create_token(ts, TOKEN_BRACKET_CLOSE, start_text, 1, start_line, start_column, line_start); case ']': return create_token(ts, TOKEN_BRACKET_CLOSE, start_text, 1, start_line, start_column, line_start);
case ',': return create_token(ts, TOKEN_COMMA, start_text, 1, start_line, start_column, line_start); case ',': return create_token(ts, TOKEN_COMMA, start_text, 1, start_line, start_column, line_start);
case ';': return create_token(ts, TOKEN_SEMICOLON, start_text, 1, start_line, start_column, line_start); case ';': return create_token(ts, TOKEN_SEMICOLON, start_text, 1, start_line, start_column, line_start);
case '=': return create_token(ts, TOKEN_ASSIGN, start_text, 1, start_line, start_column, line_start);
case '"': {
size_t len = 0;
const char* start = &ts->code[ts->pos];
while (peek_char(ts) != '"' && peek_char(ts) != '\0') {
read_char(ts);
len++;
}
if (peek_char(ts) == '"') read_char(ts);
return create_token(ts, TOKEN_STRING, start, len, start_line, start_column + 1, line_start);
}
}
if (isdigit(c)) {
size_t len = 1;
while (isdigit(peek_char(ts))) {
read_char(ts);
len++;
}
return create_token(ts, TOKEN_INTEGER, start_text, len, start_line, start_column, line_start);
} }
/* Keywords and identifiers */ /* Keywords and identifiers */
if (is_identifier_start(c)) { if (is_identifier_start(c)) {
size_t length = 1; /* Declarations first for C89 */
size_t length;
TokenType type;
length = 1;
while (is_identifier_part(peek_char(ts))) { while (is_identifier_part(peek_char(ts))) {
read_char(ts); read_char(ts);
length++; length++;
} }
TokenType type = lookup_keyword(start_text, length); type = lookup_keyword(start_text, length);
return create_token(ts, type, start_text, length, start_line, start_column, line_start); return create_token(ts, type, start_text, length, start_line, start_column, line_start);
} }
/* Unknown character */ /* Unknown character */
Token t = create_token(ts, TOKEN_UNKNOWN, start_text, 1, start_line, start_column, line_start); t = create_token(ts, TOKEN_UNKNOWN, start_text, 1, start_line, start_column, line_start);
log_on_line(&t.location, t.location.column_end, "unexpected token '%c'", c); log_on_line(&t.location, "unexpected token '%c'", c);
return t; return t;
} }
+27 -9
@@ -10,40 +10,58 @@
* A list of all possible tokens. * A list of all possible tokens.
*/ */
typedef enum { typedef enum {
// Keywords /* Keywords */
TOKEN_MODULE, TOKEN_MODULE,
TOKEN_IMPORT, TOKEN_IMPORT,
TOKEN_SEMICOLON, TOKEN_SEMICOLON,
TOKEN_ALIAS,
TOKEN_PUBLIC,
TOKEN_VAR,
TOKEN_CONST,
TOKEN_STATIC,
// Symbols /* Symbols */
TOKEN_PARENT_OPEN, TOKEN_PARENT_OPEN,
TOKEN_PARENT_CLOSE, TOKEN_PARENT_CLOSE,
TOKEN_BRACKET_OPEN, TOKEN_BRACKET_OPEN,
TOKEN_BRACKET_CLOSE, TOKEN_BRACKET_CLOSE,
TOKEN_COMMA, TOKEN_COMMA,
TOKEN_ASSIGN,
// Primitives /* Primitives */
TOKEN_VOID, TOKEN_VOID,
TOKEN_I8,
TOKEN_I16,
TOKEN_I32,
TOKEN_I64,
TOKEN_U8,
TOKEN_U16,
TOKEN_U32,
TOKEN_U64,
TOKEN_STRING,
TOKEN_INTEGER,
TOKEN_TRUE,
TOKEN_FALSE,
// Variable /* Variable */
TOKEN_IDENTIFIER, TOKEN_IDENTIFIER,
// Others /* Others */
TOKEN_EOF, TOKEN_EOF,
TOKEN_UNKNOWN, TOKEN_UNKNOWN
} TokenType; } TokenType;
/** /**
* Holds additional information about a token. * Holds additional information about a token.
*/ */
typedef struct { typedef struct {
/// @brief The actual token. /* @brief The actual token. */
TokenType token; TokenType token;
/// @brief The textual representation of a token. /* @brief The textual representation of a token. */
String text; String text;
/// @brief The location of the token. /* @brief The location of the token. */
Location location; Location location;
} Token; } Token;
+9
@@ -0,0 +1,9 @@
/**
* Contains runtime information about types.
*/
#ifndef TYPES_H
#define TYPES_H
#endif
+46
@@ -0,0 +1,46 @@
#include "util.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdarg.h>
/* Portable va_copy fallback for pre-C99 or platforms without va_copy. */
#ifndef va_copy
# if defined(__va_copy)
# define va_copy(dest, src) __va_copy(dest, src)
# else
# define va_copy(dest, src) ((dest) = (src))
# endif
#endif
char* format_string_va(const char* fmt, va_list args) {
/* Declarations first to satisfy -std=c89 */
va_list args_copy;
int needed;
char* buf;
if (!fmt) return NULL;
va_copy(args_copy, args);
needed = vsnprintf(NULL, 0, fmt, args_copy);
va_end(args_copy);
if (needed < 0) return NULL;
buf = (char*)malloc((size_t)needed + 1);
if (!buf) return NULL;
vsnprintf(buf, (size_t)needed + 1, fmt, args);
return buf;
}
char* format_string(const char* fmt, ...) {
/* Declarations first to satisfy -std=c89 */
va_list args;
char* s;
if (!fmt) return NULL;
va_start(args, fmt);
s = format_string_va(fmt, args);
va_end(args);
return s;
}
+27
@@ -0,0 +1,27 @@
#ifndef UTIL_H
#define UTIL_H
#include <stdarg.h>
#include <stddef.h>
/**
* Formats a string using printf-style formatting and returns a newly allocated string.
* The caller is responsible for freeing the returned string.
*
* @param fmt The format string.
* @param ... The values to format.
* @return A newly allocated string containing the formatted output.
*/
char* format_string(const char* fmt, ...);
/**
* Formats a string using printf-style formatting with a va_list and returns a newly allocated string.
* The caller is responsible for freeing the returned string.
*
* @param fmt The format string.
* @param args The va_list of values to format.
* @return A newly allocated string containing the formatted output.
*/
char* format_string_va(const char* fmt, va_list args);
#endif
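
Note on format_string: the doc comments above say the result is newly allocated and must be freed by the caller. The sketch below restates the measure-then-allocate approach from the util.c hunk as a standalone file, so the ownership rule can be tried in isolation. It assumes a C99 library (vsnprintf and va_copy are not part of C89) and leaves out the va_copy fallback macros; it is an illustration, not the committed code.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* First pass measures the output, second pass writes it.
 * Assumes C99 vsnprintf semantics: with a NULL buffer and size 0 it
 * returns the number of characters that would have been written. */
static char* format_string_va(const char* fmt, va_list args) {
    va_list args_copy;
    int needed;
    char* buf;

    if (!fmt) return NULL;

    /* A va_list cannot be reused after vsnprintf consumes it,
     * so the measuring pass works on a copy. */
    va_copy(args_copy, args);
    needed = vsnprintf(NULL, 0, fmt, args_copy);
    va_end(args_copy);
    if (needed < 0) return NULL;

    buf = (char*)malloc((size_t)needed + 1); /* +1 for the terminator */
    if (!buf) return NULL;

    vsnprintf(buf, (size_t)needed + 1, fmt, args);
    return buf;
}

/* Variadic wrapper; the caller owns the returned buffer. */
static char* format_string(const char* fmt, ...) {
    va_list args;
    char* s;

    if (!fmt) return NULL;

    va_start(args, fmt);
    s = format_string_va(fmt, args);
    va_end(args);
    return s;
}
```

A caller would hold the result in a char*, check it against NULL (both allocation failure and a NULL format return NULL), and pass it to free() once the message has been used.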