I generally agree with OpenBLAS---great in (nearly) all ways---except doesn't it have a bunch of Fortran? (29% according to GitHub) Perhaps it's simple enough to break out the Fortran so we're only concerned with independent C code, or maybe it's easy enough to translate Fortran, or maybe a very clean FFI that we're just conceding to be OK with for the purposes of a starter example. Tom Anderson <tom@cs.washington.edu> writes:
Thank Amit for the recap. To amplify a couple things:
Gilbert mentioned potentially recruiting some students to work on the project, which sounds great. To do that we need a message and what works for DARPA is not always what works for students. One way to put this is what's the most interesting part of this topic? I'd guess its lifting - inferring intent from code, in a way that's useful. Transpilation is one use case but not the only one. Can we extract a spec from code? I think that would be super useful.
Indeed, a good paper might be "just" be extracting a spec that could be used for, e.g., testing or verification of a port to a new language/platform/etc. I can see uses for this beyond automatically translating for Rust, including manually translating to Rust (or something else), evaluating different implementations of the same protocol (this is actually a real real problem with 6lowpan implementations that are just not compatible), etc.
Second, my experience with using ML is that you *really* need to break the problem down into solvable sub-pieces. One can think of this as curricular learning - how do you get better at something? Start with something simple and add complexity as you get better at it. If the challenge is, let's try to infer intent, let's start with some code where that's relatively easy, and add complexity as we go. One reason I suggest openblas is (1) it uses convenience pointers (2) its functional (3) it has a spec for each function in comments.
So it should be relatively simple (lol) to set as an initial subproblem - hoist openblas into a spec, then produce a verus program as output - rust code plus verus spec, that we use verus to prove correct so we don't need test generation. of course if you think the interesting part is elsewhere then maybe that's not a good initial step and if so lets discuss. I think driver transpilation is super useful, but for everyone I talk to about this, I get back that they wouldn't even know where to start.
On Thu, Nov 7, 2024 at 10:53 AM Amit Levy <amit@amitlevy.com> wrote:
Hi all (Mike, Tom, Baris, Gilbert, Haoran),
After submitting the DARPA proposal, Tom suggested we start thinking and doing what we would either want to have done by the time funds arrive or what we want to have done for an NSF large in 10 months or so.
Haoran has already started looking at this problem, so would be great to concentrate on some useful starting examples, and keep posted on any progress.
I suggested some libraries I'm particularly interested in:
- TPM2, the reference implementation for the trusted platform module specification.
- OpenThread or LwIP -- both network stacks used as "userspace" libraries
- LVGL GUI library
- BoringSSL/aws-lc -- we have work verifying the safety of the assembly against a Rust function parameter, the aws-lc people _want_ to have a version that is entirely Rust and it's likely relatively straightforward to port (e.g., there mostly aren't any complex data structures or anything.
Three of the four have pretty robust test suites.
Tom helpfully pointed out that some of these are likely too high a bar to start with, and that we really want to start with things that are fairly simple and almost certainly do not have bugs in them.
-Amit _______________________________________________ Harvest mailing list -- harvest@lists.betterbytes.org To unsubscribe send an email to harvest-leave@lists.betterbytes.org