2023-10-01 16:41:22 +02:00
|
|
|
|
# lib-laddertypes
|
|
|
|
|
|
|
|
|
|
Rust Implementation of Ladder-Types (parsing, unification, rewriting, etc)
|
|
|
|
|
<hr/>
|
|
|
|
|
|
|
|
|
|
## Ladder Types
|
|
|
|
|
|
2024-05-01 17:07:47 +02:00
|
|
|
|
### Motivation
|
|
|
|
|
|
2023-10-03 01:39:50 +02:00
|
|
|
|
In order to implement complex datastructures and algorithms, usually
|
2023-10-03 04:34:47 +02:00
|
|
|
|
many layers of abstraction are built ontop of each other.
|
|
|
|
|
Consequently higher-level data types are encoded into lower-level data
|
2023-10-03 01:39:50 +02:00
|
|
|
|
types, forming a chain of embeddings from concept to `rock bottom' of
|
2023-10-03 04:34:47 +02:00
|
|
|
|
byte streams. While a high-level type makes claims about the
|
|
|
|
|
semantics of objects of that type, high-level types are ambiguous in
|
|
|
|
|
regard to their concrete syntactical representation or memory
|
|
|
|
|
layout. However for compositions to be type-safe, compatibility of
|
|
|
|
|
concrete represenations must be ensured.
|
|
|
|
|
|
|
|
|
|
For example in the unix shell, many different tools & utilities
|
|
|
|
|
coexist. Depending on the application domain, each of them will
|
|
|
|
|
potentially make use of different representational forms for the same
|
|
|
|
|
abstract concepts. E.g. for the concept 'natural number', many
|
|
|
|
|
representations do exist, e.g. with variation over radices,
|
|
|
|
|
endianness, digit encoding etc.
|
2023-10-03 01:39:50 +02:00
|
|
|
|
|
|
|
|
|
Intuitively, *ladder types* provide a way to distinguish between
|
|
|
|
|
multiple *concrete representations* of the same *abstract / conceptual
|
|
|
|
|
type*, by capturing the *represented-as* of layered data formats in
|
|
|
|
|
the structure of type-terms. Formally, we introduce a new type
|
|
|
|
|
constructor, called the *ladder type*, written `T1 ~ T2`, where `T1`
|
|
|
|
|
and `T2` are types. The type-term `T1 ~ T2` then expresses the
|
|
|
|
|
abstract type of `T1` being represented in terms of the concrete type
|
|
|
|
|
`T2`, which can be read by "`T1` represented as `T2`".
|
2023-10-01 16:41:22 +02:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
#### Example
|
|
|
|
|
The following type describes a colon-separated sequence of timepoints,
|
|
|
|
|
each represented as unix-timestamp written as decimal number in
|
|
|
|
|
big-endian, encoded as UTF-8 string.
|
|
|
|
|
|
|
|
|
|
```
|
2023-10-03 04:34:47 +02:00
|
|
|
|
<Seq TimePoint
|
2023-10-01 16:41:22 +02:00
|
|
|
|
~<TimeSince UnixEpoch>
|
|
|
|
|
~<Duration Seconds>
|
|
|
|
|
~ℕ
|
|
|
|
|
~<PosInt 10 BigEndian>
|
|
|
|
|
~<Seq <Digit 10>~Char>>
|
|
|
|
|
~<SepSeq Char ':'>
|
|
|
|
|
~<Seq Char>
|
|
|
|
|
~UTF-8
|
|
|
|
|
~<Seq Byte>
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
An object that fits the format described by this type could look like
|
|
|
|
|
this:
|
|
|
|
|
|
|
|
|
|
```
|
|
|
|
|
1696093021:1696093039:1528324679:1539892301:1638141920:1688010253
|
|
|
|
|
```
|
|
|
|
|
|
2024-05-01 17:07:47 +02:00
|
|
|
|
### Syntax
|
|
|
|
|
|
|
|
|
|
In their core form, type-terms can be one of the following:
|
|
|
|
|
- (**Atomic Type**) | `SomeTypeName`
|
|
|
|
|
- (**Literal Integer**) | `0` | `1` | `2` | ...
|
|
|
|
|
- (**Literal Character**) | `'a'` | `'b'` | `'c'` | ...
|
|
|
|
|
- (**Literal String**) | `"abc"`
|
|
|
|
|
- (**Parameter Application**) | `<T1 T2>` given `T1` and `T2` are type-terms
|
|
|
|
|
- (**Ladder**) | `T1 ~ T2` given `T1` and `T2` are type-terms
|
|
|
|
|
|
|
|
|
|
Ontop of that, the following syntax-sugar is defined:
|
|
|
|
|
|
|
|
|
|
#### Complex Types
|
|
|
|
|
- `[ T ]` <===> `<Seq T>`
|
|
|
|
|
- `{ a:A b:B }` <===> `<Struct <"a" A> <"b" B>>`
|
|
|
|
|
- `a:A | b:B` <===> `<Enum <"a" A> <"b" B>>`
|
|
|
|
|
|
|
|
|
|
#### Function Types
|
|
|
|
|
- `A -> B` <===> `<Fn A B>`
|
|
|
|
|
|
|
|
|
|
#### Reference Types
|
|
|
|
|
- `*A` <===> `<Ptr A>`
|
|
|
|
|
- `&A` <===> `<ConstRef A>`
|
|
|
|
|
- `&!A` <===> `<MutRef A>`
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
### Equivalences
|
|
|
|
|
|
|
|
|
|
#### Currying
|
|
|
|
|
`<<A B> C>` <===> `<A B C>`
|
|
|
|
|
|
|
|
|
|
#### Ladder-Normal-Form
|
|
|
|
|
exhaustively apply `<A B~C>` ===> `<A B>~<A C>`
|
|
|
|
|
|
|
|
|
|
e.g. `[<Digit 10>]~[Char]~[Ascii]` is in **LNF**
|
|
|
|
|
|
|
|
|
|
#### Parameter-Normal-Form
|
|
|
|
|
exhaustively apply `<A B>~<A C>` ===> `<A B~C>`
|
|
|
|
|
|
|
|
|
|
e.g. `[<Digit 10>~Char~Ascii]` is in **PNF**
|
|
|
|
|
|
|
|
|
|
|
2023-10-02 15:09:59 +02:00
|
|
|
|
## How to use this crate
|
|
|
|
|
|
|
|
|
|
```rust
|
|
|
|
|
use laddertypes::*;
|
|
|
|
|
|
2023-10-03 01:39:50 +02:00
|
|
|
|
fn main() {
|
|
|
|
|
let mut dict = TypeDict::new();
|
2023-10-02 15:09:59 +02:00
|
|
|
|
|
2023-10-03 01:39:50 +02:00
|
|
|
|
let t1 = dict.parse("<A B~X C>").expect("couldnt parse typeterm");
|
|
|
|
|
let t2 = dict.parse("<<A B~X> C>").expect("couldnt parse typeterm");
|
2023-10-02 15:09:59 +02:00
|
|
|
|
|
2023-10-03 01:39:50 +02:00
|
|
|
|
assert_eq!( t1.clone().curry(), t2 );
|
|
|
|
|
assert_eq!( t1, t2.clone().decurry() );
|
|
|
|
|
}
|
2023-10-02 15:09:59 +02:00
|
|
|
|
```
|
2023-10-01 16:41:22 +02:00
|
|
|
|
|
2024-05-01 17:07:47 +02:00
|
|
|
|
## Roadmap
|
|
|
|
|
|
|
|
|
|
- [x] (Un-)Parsing
|
|
|
|
|
- [x] (De-)Currying
|
|
|
|
|
- [x] Unification
|
|
|
|
|
- [x] Ladder-Normal-Form
|
|
|
|
|
- [x] Parameter-Normal-Form
|
|
|
|
|
- [ ] (De)-Sugaring
|
|
|
|
|
- [ ] Seq
|
|
|
|
|
- [ ] Enum
|
|
|
|
|
- [ ] Struct
|
|
|
|
|
- [ ] References
|
|
|
|
|
- [ ] Function
|
2023-10-03 01:39:50 +02:00
|
|
|
|
|
|
|
|
|
## License
|
2023-10-01 16:41:22 +02:00
|
|
|
|
[GPLv3](COPYING)
|
|
|
|
|
|