From 633843d9c7faf9014150a10f55174fe136fd1abf Mon Sep 17 00:00:00 2001 From: Michael Sippel Date: Tue, 17 Sep 2024 18:36:38 +0200 Subject: [PATCH] work on abstract & intro --- paper/main.tex | 90 +++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 85 insertions(+), 5 deletions(-) diff --git a/paper/main.tex b/paper/main.tex index 62d84a0..d2850f3 100644 --- a/paper/main.tex +++ b/paper/main.tex @@ -53,11 +53,35 @@ \begin{abstract} -This paper presents a minimal core calculus extending the \(\lambda\)-calculus by a polymorphic type-system similar to SystemF, but in addition it introduces a new type-constructor called the \emph{ladder-type}. -Using ladder-types, multi-layered embeddings of higher-level data-types into lower-level data-types can be described by a type-level structure. -By facilitating automatic transformations between semantically compatible datatypes, ladder-typing opens up a new paradigm of abstraction. -We formally define the syntax \& semantics of this calculus and prove its \emph{type soundness}. -Further we show how the Boehm-Berarducci encoding can be used to implement algebraic datatypes on the basis of the introduced core calculus. + This work introduces \emph{representational polymorphism} as a form of + coercive subtyping, which differentiates between various concrete + representations w.r.t. an abstract concept using an explicit type + structure. This structure captures the 'represented-as' relation of + multi-level data embeddings. Motivated by the need to manage data + transformations —whether for hardware interfacing, IPC/RPC + communication, or performance optimizations— our type system aims to + reduce bugs and enhance code clarity by enabling implicit coercions + between semantically equivalent representations, while retaining + coherence and soundness. + + We extend the polymorphic λ-calculus (System F) with a new type + constructor, termed \emph{ladder-type} to encode a subtype relation based on + syntactical embedding, which inversely also helps to disambiguate + multiple interpretations of the same presentation type. To facilitate + implicit coercions between semantically equivalent yet presentationally + different types, we introduce morphisms —functions that transform one + representation to another of the same abstract type. Morphisms can be + inductively combined to form complex morphism-paths allowing + transitivity, list-mappings and lifting of presentation subtypes + without loss of coherence. This enables programmers to treat various + representational forms interchangeably as abstract objects, with + required transformations managed by the compiler during a separate + translation step based on expression typing, while still being able to + locally control specific representations. + + We formalize our extended calculus in Coq and demonstrate that our + translation function preserves typing, with progress and preservation + theorems remaining valid. \end{abstract} \maketitle @@ -65,6 +89,62 @@ Further we show how the Boehm-Berarducci encoding can be used to implement algeb %\newpage +\section{Introduction} +The issue of transformation between varying data encodings is all-pervasive in software development. +Altough programming languages in principle try provide a 'realm of consistent abstraction', +many parts of the code that interface with the outside world are still riddled with data transformation routines. + +While certain representational forms might be fixed already at the boundaries of an application, +internally, some other representations might be desired for reasons of simplicity, efficiency or +for the sake of interfacing with hardware, other libraries / processes, the user. +Further, differing complexity-profiles of certain representations might even have the potential to complement +each other and coexist in a single application. +Often however, such specialized implementations become heavily dependent on concrete data formats +and require technical knowledge of the low-level data structures. +Making use of multiple such representations, additionally requires careful transformation of data. + +\paragraph{Interfacing} +\todo{serilization / marshalling} + +\paragraph{How to get fast code ?} + -> algorithmic optimization + -> see which methods are used most frequently + -> switch structures to reduce complexity there + -> identify special cases + -> remove indirection + + -> optimize implementation detail: + -- bottleneck? memory! + ---- plenty of Main memory, + but FAST memory is scarce + --> improve cache efficiency: make compact representations + * Balance De-/Encode Overhead + vs. Load-Overhead + + * Synchronize locality in access order and memory layout. + -> elements accessed in succession shall be close in memory. + (e.g. AoS vs SoA) + + ..while.. + - accounting for machine specific cache-sizes + - accounting for SIMD instructions + - interfacing with kernels running on accelerator devices + +\paragraph{Elegant Abstractions} +\todo{difference to traditional coercions (static cast)} +\todo{relation with inheritance based subtyping: bottom-up vs top-down inheritance vs ladder-types} +\todo{relation to traits/type-classes} +\todo{relation to coercive subtyping} + +\paragraph{Related Work} +\todo{type specific languages} + +In order to facilitate programming at "high-level", we introduce a type-system that is able to disambiguate +this multiplicity of representations and facilitate implicit coercions between them. +We claim this to aid in (1) forgetting details about representational details during program composition +and (2) keeping the system flexible enough to introduce representational optimizations at a later stage without +compromising semantic correctness. + \section{Core Calculus} \subsection{Syntax}