Source code of the non-bootstrapped version of the compiler?


Hi, devs! I think it would be useful for the community to have an IDE for Stanza. I've been trying to implement an Eclipse-based IDE using Xtext, but I stalled after implementing the top-level forms.

The reason is that I don't have much experience with macros, or with Stanza itself. I do know something about parser generators, ANTLR in particular. So it's a bit complicated for me to reverse-engineer the grammar from the code and rebuild it. I can certainly try, but there is a lot of work on the UI and everything else besides parsing the language.

I hope you can share the old version of the compiler written in C. It would make it easier to understand the structure of the language and write better code.



Many thanks for trying to provide Eclipse support for Stanza! It's an often-requested feature, but none of us on the core team are familiar with Eclipse.

The original bootstrap compiler was written in Scheme and the language has deviated a lot since then, so I don't think you'll find that helpful.

The most important hurdle is that Stanza's grammar is handled in two distinct phases. The first phase (in core/reader.stanza) converts a list of characters into an s-expression. It uses a custom hand-coded lexer, so you'll have to dig through the source if you want 100% compatibility.

The actual parsing of the s-expression is handled much more systematically. The definitions in compiler/stz-core-macros.stanza define a fairly readable grammar.
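
To make the two-phase split concrete, here is a toy sketch in Python (not Stanza, and not Stanza's actual reader). Phase one turns characters into nested lists, and phase two matches a grammar rule against those lists rather than against raw text. The names `read_sexp` and `parse_defn` and the parenthesized `(defn name (args) body)` shape are assumptions made for illustration; the real lexer also handles indentation, literals, and other syntax that this omits.

```python
def read_sexp(text):
    """Phase 1: characters -> nested lists of atom strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_form():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read_form())
            if pos >= len(tokens):
                raise SyntaxError("missing ')'")
            pos += 1  # consume the closing ")"
            return items
        if tok == ")":
            raise SyntaxError("unexpected ')'")
        return tok

    forms = []
    while pos < len(tokens):
        forms.append(read_form())
    return forms


def parse_defn(form):
    """Phase 2: match one grammar rule against an s-expression."""
    if (
        isinstance(form, list)
        and len(form) >= 3
        and form[0] == "defn"
        and isinstance(form[1], str)
        and isinstance(form[2], list)
    ):
        return {"kind": "defn", "name": form[1], "args": form[2], "body": form[3:]}
    return None  # not a defn; a fuller grammar would try other rules here


forms = read_sexp("(defn add (x y) (plus x y)) (println hello)")
print([parse_defn(f) for f in forms])
```

For an IDE, the practical consequence is that the second phase never sees characters, so a grammar tool only has to reproduce the first phase faithfully to reuse the structure of the second.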





Here's a little program that reads in a file and prints out the lexed s-expressions within it. It also expands them using Stanza's core macros, so you can see the actual AST that is compiled by the Stanza compiler.

You can run it using:

./harness somefile.txt

The macro system requires just a little bit of cleanup and then we'll release and document it fully. 

Hope this helps!


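defpackage mypackage :
   import core
   import reader
   import stz/core-macros

defn main () :
   ; Phase 1: lex the file named on the command line into s-expressions.
   val forms = read-file(command-line-arguments()[1])
   println("=== S-EXPRESSIONS ===")
   do(println, forms)
   ; Phase 2: expand the forms using the core macros and print the result.
   println("=== CORE S-EXPRESSIONS ===")
   with-syntax(core) :
      match-syntax(forms) :
         (?forms:#exp ...) :
            do(println, forms)

main()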


Seems like this API may have changed. I get:

Construct match-syntax with no base syntax package cannot use production exp

I was able to use the parse-syntax function like so:

val parsed = parse-syntax[core / #exp](List(forms))
for p in parsed do :

Is there a way to get all symbols instead of just those in the current file?


Sure! Do you have a list of filenames that you want to analyze? You can also use a small function to collect all the xyz.stanza files in the current directory.


I mean, how do I add the builtins and imported terms to the results? 

On Thu, Jun 30, 2022, 12:40 AM Patrick S. Li <patrickli.2001@...> wrote:

> Sure! Do you have a list of filenames that you want to analyze? You can also use a small function to collect all the xyz.stanza files in the current directory.


Oh I see,

For the builtins, e.g. the standard macros and language constructs, you might need to just hardcode a list.

For the imported terms, a strategy that might work is:

- Scan through the defpackage and import statements within the file. That will tell you the names of the packages that are imported into this file.
- Next, you need to somehow figure out the corresponding file that holds a package. This can be done by either scanning the directory, or looking for a stanza.proj file and seeing whether it is listed there.
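
The directory-scanning half of that strategy could look roughly like the sketch below. It is written in Python as it might appear in external tooling, and it is a line-based heuristic rather than a real Stanza parser: it assumes each file has one top-level `defpackage` whose `import` lines are indented beneath it. The function names are made up for this example, and the stanza.proj lookup is not covered.

```python
import re
from pathlib import Path

DEFPACKAGE = re.compile(r"^defpackage\s+([^\s:]+)")
IMPORT = re.compile(r"^\s+import\s+([^\s:]+)")


def scan_package(source):
    """Return (package name, imported package names) for one file's text."""
    name, imports, in_header = None, [], False
    for line in source.splitlines():
        if not in_header:
            match = DEFPACKAGE.match(line)
            if match:
                name, in_header = match.group(1), True
            continue
        if line.strip() and not line[0].isspace():
            break  # first unindented line ends the defpackage block
        match = IMPORT.match(line)
        if match:
            imports.append(match.group(1))
    return name, imports


def index_packages(root):
    """Map package name -> path for every .stanza file under `root`."""
    index = {}
    for path in sorted(Path(root).rglob("*.stanza")):
        name, _ = scan_package(path.read_text(encoding="utf-8"))
        if name is not None:
            index.setdefault(name, path)  # keep the first file found
    return index


def resolve_imports(path, index):
    """Map each package imported by `path` to its file, or None if unknown."""
    _, imports = scan_package(Path(path).read_text(encoding="utf-8"))
    return {pkg: index.get(pkg) for pkg in imports}
```

Packages that resolve to None (core, for example, when only the project directory is indexed) are the ones that would need the hardcoded list or a second search root.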


What about the definitions-database command? I've generated a file, but it's in a binary format. Can I read that file into a Stanza program? What about from a different language? If not, perhaps there could be a --csv or --text option so that other tools can use this feature.

Thank you for your help!


Ah that's right! The thought completely slipped my mind.

That's definitely the easiest way.

You can deserialize the file using this code:

val input = FileInputStream("defsdb.dat")
val db = stz/defs-db-serializer/read-definitions-database(input)

The actual structs for the definitions database are defined in the package stz/defs-db-ir.


So, I am making some headway towards a makeshift language server.

The actual server is a Node/shell script (server.mjs) using google/zx. It calls the Stanza compiler to rebuild the defs database, and it also calls a Stanza script (deserialize.stanza) that deserializes the database so the language server can use it to provide the details. I also convert the package paths for core and clib into their actual Stanza file paths. I'd like to come up with a more thorough approach, but it would be great if the structure had a source-map field that is gathered when the pkg file is made.

The main extension will start the zx script and request details through a console. The zx server script watches for file changes to rebuild the defsdb and reads the console commands, performing searches against the serialized definitions data to gather the required info. Using this workflow, I should be able to support these LSP actions:

  • document-symbols
  • workspace-symbols
  • references
  • hover
  • completions
  • diagnostic
  • definition
  • implementations
  • signature
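
As a rough sketch of the lookup side of that workflow, the snippet below answers three of the listed requests from an in-memory list of definitions. It is in Python purely for illustration (the server described above is a zx script), and the `Definition` record with its `name`, `kind`, `file`, and `line` fields is a hypothetical shape, not the layout defined in stz/defs-db-ir. The console command format is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:
    # Hypothetical record; the real fields live in stz/defs-db-ir.
    name: str
    kind: str  # e.g. "function", "type", "variable"
    file: str
    line: int


class DefinitionIndex:
    def __init__(self, definitions):
        self.definitions = list(definitions)
        self.by_name = {}
        for d in self.definitions:
            self.by_name.setdefault(d.name, []).append(d)

    def document_symbols(self, file):
        """All definitions in one file, in source order."""
        found = [d for d in self.definitions if d.file == file]
        return sorted(found, key=lambda d: d.line)

    def workspace_symbols(self, query):
        """Case-insensitive substring search across every file."""
        q = query.lower()
        return [d for d in self.definitions if q in d.name.lower()]

    def definition(self, name):
        """Every known definition site for an exact name."""
        return list(self.by_name.get(name, []))


def handle_command(index, command):
    """Dispatch one console line such as `workspace-symbols read`."""
    action, _, arg = command.strip().partition(" ")
    handlers = {
        "document-symbols": index.document_symbols,
        "workspace-symbols": index.workspace_symbols,
        "definition": index.definition,
    }
    if action not in handlers:
        raise ValueError(f"unsupported command: {action}")
    return handlers[action](arg.strip())
```

With this shape, rebuilding after a file change amounts to constructing a fresh `DefinitionIndex` from the newly deserialized data and swapping it in.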

I will need to make some decisions on how to best handle settings (and error messages) for the stanza command as that will need to be available for this extension to work.

Let me know what you think!