- Harvest - lists.tockos.org

(No) talk by Galen Hunt
by Michael Ernst 19 Nov '24

I invited Galen Hunt to give a talk at UW. He said he doesn't have one prepared and isn't sure what he would say. He asked me to ping him again next year, and I will do so. -Mike

Harvest Kick-off
by Amit Levy 14 Nov '24

Hi all (Mike, Tom, Baris, Gilbert, Haoran),

After submitting the DARPA proposal, Tom suggested we start thinking about, and working on, what we would want to have done either by the time funds arrive or for an NSF large in 10 months or so. Haoran has already started looking at this problem, so it would be great to concentrate on some useful starting examples and keep everyone posted on any progress.

I suggested some libraries I'm particularly interested in:

- TPM2, the reference implementation for the trusted platform module specification.
- OpenThread or LwIP -- both network stacks used as "userspace" libraries
- LVGL GUI library
- BoringSSL/aws-lc -- we have work verifying the safety of the assembly against a Rust function parameter; the aws-lc people _want_ to have a version that is entirely Rust, and it's likely relatively straightforward to port (e.g., there mostly aren't any complex data structures or anything).

Three of the four have pretty robust test suites.

Tom helpfully pointed out that some of these are likely too high a bar to start with, and that we really want to start with things that are fairly simple and almost certainly do not have bugs in them.

-Amit

HP Progress Report 20241105
by Haoran Peng 12 Nov '24

This Week's Progress:

* Tried running c2rust on ESA libmCS
  * A math library that does not contain complicated data structures
  * This is the first milestone given by TRACTOR
  * Running configure generates a makefile, then running `bear -- make` generates a `compile_commands.json`, which c2rust uses to decide which files to transpile under what configurations
  * c2rust runs the C preprocessor before performing transpilation, which means that:
    * Macros that are used to accommodate different machine platforms are expanded, so the transpiled code would only work for the platform on which the preprocessor was run and for the configuration that was designated when running configure
    * Function-like macros are expanded, which makes the transpiled Rust code less readable
  * Two possible research sub-topics related to macro expansion:
    * Configure-make-macro is a commonly used C project management paradigm: running configure and make and expanding macros generates the code that targets a specific environment and prunes the rest. Preprocessed code may contain platform-specific primitives, or may even replace large chunks of implementation containing multiple functions. For project-level C to Rust translation, it is important to keep this portability. What would be an idiomatic project management paradigm for Rust, and how do we migrate to that from a C project?
    * Without access to generics, C programmers often make use of macros to implement what is equivalent to a generic function. At preprocessing, such macros are copy-pasted into the source code. This makes the transpiled Rust code hard to read and loses information about this meaningful structure. Ideally, they should be transpiled into Rust's generic functions. Here are some possible solutions:
      * Identify and promote these macros to C functions before preprocessing
        * Pros: transparent to the preprocessor, fits in C grammar so it does not need extra IR
        * Cons: Programmers may have chosen a macro over a function because a "generic" function in C requires void * and extra params, and a lifted function is likely to need the same. Later transpilation steps may need extra effort to turn these void * functions into generic functions.
      * Identify these macros and tag them as generics
        * Pros: later transpilation steps can easily pick up this tag to minimize information loss and produce higher quality generic functions
        * Cons: requires extra IR above C to support this tag, requires extra preprocessor and C parser support
      * Expand them as-is and try to merge them later
        * Pros: needs no effort on the preprocessor and C parser end
        * Cons: loses a lot of information; identifying sections that were the same macro before may be a harder task on the transpiler end (maybe leaving a heuristic-styled attribute tag can simplify this?)
* Wrote a working set size statistics script for Jenga
  * Rohan proposed replacing "% of memory as fast tier" with "% of working set size as fast tier" for our re-submission
  * The script takes a trace, chops it into equal-length intervals, counts how many pages have been accessed at least once (or the pages that account for the most accesses) in each interval, and takes the max/average over all intervals as the working set size
  * The script takes several (possibly controversial) parameters that may need further discussion
    * Granularity: should we use 30k or 2M as an interval length?
      * A shorter interval indicates that we assume the machine migrates pages fast enough to follow changes in the working set
    * Should we use max or average?
    * Should we cover all pages that have been accessed in an interval, or, to avoid the case where a lot of pages are only accessed a few times, cover only the most-accessed pages, going down the ranking until it reaches e.g. 90% of all accesses in the interval?
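
For discussion, the working set size statistic described above could be sketched roughly as follows. This is not the actual Jenga script: the function name `working_set_size`, its parameters, and the trace format (a plain list of page numbers, one entry per access, with the interval length measured in accesses) are all assumptions made for illustration.

```python
from collections import Counter


def working_set_size(trace, interval_len, coverage=1.0, reduce="max"):
    """Estimate working set size in pages (hypothetical sketch).

    trace        -- list of page numbers, one per memory access (assumed format)
    interval_len -- number of accesses per interval (assumed unit)
    coverage     -- fraction of an interval's accesses the counted pages must
                    cover; 1.0 counts every page touched at least once
    reduce       -- "max" or "average" over all intervals
    """
    sizes = []
    for start in range(0, len(trace), interval_len):
        window = trace[start:start + interval_len]
        counts = Counter(window)
        if coverage >= 1.0:
            # Every page accessed at least once in this interval.
            sizes.append(len(counts))
            continue
        # Rank pages by access count and take the hottest ones until they
        # account for the requested share of accesses in this interval.
        target = coverage * len(window)
        covered = 0
        pages = 0
        for _page, n in counts.most_common():
            if covered >= target:
                break
            covered += n
            pages += 1
        sizes.append(pages)
    if not sizes:
        return 0
    if reduce == "max":
        return max(sizes)
    return sum(sizes) / len(sizes)
```

Calling it with `coverage=0.9` corresponds to the "90% of all accesses" variant, and switching `reduce` between `"max"` and `"average"` gives the two aggregation options under discussion. Note that the final interval may be shorter than the others if the trace length is not a multiple of `interval_len`.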
