work on abstract & intro

This commit is contained in:
Michael Sippel 2024-09-17 18:36:38 +02:00
parent 127debf945
commit 633843d9c7
Signed by: senvas
GPG key ID: F96CF119C34B64A6

View file

@ -53,11 +53,35 @@
\begin{abstract}
This paper presents a minimal core calculus extending the \(\lambda\)-calculus by a polymorphic type-system similar to SystemF, but in addition it introduces a new type-constructor called the \emph{ladder-type}.
Using ladder-types, multi-layered embeddings of higher-level data-types into lower-level data-types can be described by a type-level structure.
By facilitating automatic transformations between semantically compatible datatypes, ladder-typing opens up a new paradigm of abstraction.
We formally define the syntax \& semantics of this calculus and prove its \emph{type soundness}.
Further we show how the Boehm-Berarducci encoding can be used to implement algebraic datatypes on the basis of the introduced core calculus.
This work introduces \emph{representational polymorphism} as a form of
coercive subtyping, which differentiates between various concrete
representations w.r.t. an abstract concept using an explicit type
structure. This structure captures the 'represented-as' relation of
multi-level data embeddings. Motivated by the need to manage data
transformations —whether for hardware interfacing, IPC/RPC
communication, or performance optimizations— our type system aims to
reduce bugs and enhance code clarity by enabling implicit coercions
between semantically equivalent representations, while retaining
coherence and soundness.
We extend the polymorphic λ-calculus (System F) with a new type
constructor, termed \emph{ladder-type} to encode a subtype relation based on
syntactical embedding, which inversely also helps to disambiguate
multiple interpretations of the same presentation type. To facilitate
implicit coercions between semantically equivalent yet presentationally
different types, we introduce morphisms —functions that transform one
representation to another of the same abstract type. Morphisms can be
inductively combined to form complex morphism-paths allowing
transitivity, list-mappings and lifting of presentation subtypes
without loss of coherence. This enables programmers to treat various
representational forms interchangeably as abstract objects, with
required transformations managed by the compiler during a separate
translation step based on expression typing, while still being able to
locally control specific representations.
We formalize our extended calculus in Coq and demonstrate that our
translation function preserves typing, with progress and preservation
theorems remaining valid.
\end{abstract}
\maketitle
@ -65,6 +89,62 @@ Further we show how the Boehm-Berarducci encoding can be used to implement algeb
%\newpage
\section{Introduction}
The issue of transformation between varying data encodings is all-pervasive in software development.
Altough programming languages in principle try provide a 'realm of consistent abstraction',
many parts of the code that interface with the outside world are still riddled with data transformation routines.
While certain representational forms might be fixed already at the boundaries of an application,
internally, some other representations might be desired for reasons of simplicity, efficiency or
for the sake of interfacing with hardware, other libraries / processes, the user.
Further, differing complexity-profiles of certain representations might even have the potential to complement
each other and coexist in a single application.
Often however, such specialized implementations become heavily dependent on concrete data formats
and require technical knowledge of the low-level data structures.
Making use of multiple such representations, additionally requires careful transformation of data.
\paragraph{Interfacing}
\todo{serilization / marshalling}
\paragraph{How to get fast code ?}
-> algorithmic optimization
-> see which methods are used most frequently
-> switch structures to reduce complexity there
-> identify special cases
-> remove indirection
-> optimize implementation detail:
-- bottleneck? memory!
---- plenty of Main memory,
but FAST memory is scarce
--> improve cache efficiency: make compact representations
* Balance De-/Encode Overhead
vs. Load-Overhead
* Synchronize locality in access order and memory layout.
-> elements accessed in succession shall be close in memory.
(e.g. AoS vs SoA)
..while..
- accounting for machine specific cache-sizes
- accounting for SIMD instructions
- interfacing with kernels running on accelerator devices
\paragraph{Elegant Abstractions}
\todo{difference to traditional coercions (static cast)}
\todo{relation with inheritance based subtyping: bottom-up vs top-down inheritance vs ladder-types}
\todo{relation to traits/type-classes}
\todo{relation to coercive subtyping}
\paragraph{Related Work}
\todo{type specific languages}
In order to facilitate programming at "high-level", we introduce a type-system that is able to disambiguate
this multiplicity of representations and facilitate implicit coercions between them.
We claim this to aid in (1) forgetting details about representational details during program composition
and (2) keeping the system flexible enough to introduce representational optimizations at a later stage without
compromising semantic correctness.
\section{Core Calculus}
\subsection{Syntax}