i came across the Huxley-Gödel Machine (HGM) while building clonebob.com. my key takeaways are:
- it is a self-evolving codebase
- it improves against a benchmark (e.g. SWE-bench, Polyglot) over multiple iterations, each producing genetic variants of itself
my questions:
- in the paper, it is applied only to SWE-bench, although it is said to be extensible to other domains. what if the "domain" in question is arbitrary? do we have to construct our own benchmark manually?
- how does it hold up against OpenEvolve and Backpropamine? (my belief is that Backpropamine shows actual plasticity, and not just in code, i.e. it is fundamentally different from HGM and DGM, the Darwin Gödel Machine.)
- which of these paradigms is the most promising?