Computoid
Perl is Actually Portable
by Gavin Hayes
After stumbling upon Gautham's APE Python port and seeing how far the Cosmopolitan Libc has come, I was inspired to see what it would take to port my scripting language of choice, Perl, to the Cosmopolitan Libc and turn it into a self-contained binary. My motivation came from wanting to prove that if Python can do it, Perl can do it too, wanting a more robust Windows Perl port for running my personal media server (MHFS), and the cool factor of hacking on Perl and the Cosmopolitan Libc.
Table of Contents
- Introduction
- 1.0 Building Perl with the Cosmopolitan Libc
- 2.0 Building Actually Portable Perl (APPerl)
- 3.0 The APPerl Perl distribution
- 4.0 Wrap Up
Introduction
The Cosmopolitan Libc allows building Actually Portable Executables: statically linked binaries that will run on six operating systems + BIOS. Perl is written in C. What could we do with an Actually Portable Executable version of Perl? A sysadmin managing a fleet of Red Hat and Windows boxes could start using the latest (and identical) version of Perl across all machines without compiling several different builds of Perl. It could be packaged into a Software Development Kit for easily getting Perl and accompanying scripts onto developer machines. A Perl application developer could increase reach and reduce support burden by also offering a binary release.
Making this vision a reality required some tinkering and development. There were a few gaps in the Libc functionality, as the Cosmopolitan Libc is still a relatively new C library, having been in development for a little over two years. To build Actually Portable Executables, the Cosmopolitan Libc breaks some of the rules and conventions held by other libcs. Unlike most libcs, it is not bound to a compiler. I will discuss addressing such issues below.
Building Perl with the Cosmopolitan Libc
After cloning the Perl source, the next step was generating a proper build config with Configure. Configure is not generated by GNU Autotools, but by Perl's own configuration generator generator, metaconfig, originally written by Larry Wall himself. Modelling after an early version of the Python port, I made a superConfigure script to invoke Configure with the flags needed to build with Cosmopolitan. Unfortunately, just because flags were passed in didn't mean Configure had to use them. In particular, libs were applied to the build config, but weren't used in most places for Configure's own testing. Building for the Cosmopolitan Libc requires passing a large incantation of flags to gcc as it isn't bound to the C compiler. Edits aren't supposed to be made directly to Configure, but a patch file or metaconfig changes could be done later. Modifying Configure to always pass libs wasn't difficult; it just had to be done in a few places:
$cc -o try $ccflags $ldflags try.c $libs
Unsurprisingly, after the first Configure, the build failed. The compiler errored out from missing headers. Initially this was fixed in two ways: preventing Configure from finding non-Cosmopolitan headers, and adding dummy (empty) header files to both its and gcc's search paths. All the needed libc symbols were already being passed to gcc, as the ccflags directly -included the cosmopolitan.h amalgamation header. However, Configure and the Perl build depend on the libc header files being available. Using the amalgamation header ultimately proved to be problematic, but the dummy headers seemed like an easy fix at the time. Configure checked for availability of certain libc symbols by creating test programs that declared them and then testing if the programs compiled. To accommodate the symbols already being included by the amalgamation, the symbol test was patched to not declare the symbol, just use it:
- $extern_C void *$1$tdc; void *(*(p()))$tdc { return &$1; }
+ void *(*(p()))$tdc { return (void*)&$1; }
Perl itself also sometimes declared libc symbols, which led to conflicting types errors. Temporarily, it was fine to comment out the declarations, as cosmopolitan.h already contained them.
//Uid_t getuid (void);
//Uid_t geteuid (void);
//Gid_t getgid (void);
//Gid_t getegid (void);
After the last change, miniperl built! Perl programmers often prefer writing Perl to writing sh, so Perl is built with Perl! To avoid a chicken-and-egg situation that would make it difficult to port Perl to new platforms, the first thing built is miniperl: Perl without C or XS extension modules, which can be used to assist with building the rest of Perl.
To reduce dependence on the superConfigure script and make building Perl for Cosmopolitan less different from building Perl for other platforms, a hints file was created, hints/cosmo.sh. Once a platform is detected or set in Configure, it loads in the corresponding settings to ease building Perl for a platform.
Passing make minitest
Basic functionality appeared to work under miniperl, but make had failed and it suggested running make minitest, a subset of tests that can be run on miniperl. For a first run, make minitest wasn't too bad: Failed 95 tests out of 339, 71.98% okay.
However, it was alarming that a crash in miniperl kept popping up:
libc/zipos/find.c:34: assert(ZIP_CFILE_MAGIC(zipos->map + c) == kZipCfileHdrMagic) failed
Fortunately, it was an easy fix: the Makefile, in addition to building ELF executables like usual, needed to use objcopy to create APEs (Actually Portable Executables). zipos, Cosmopolitan's internal ZIP filesystem, fails to load on non-APE binaries.
objcopy -S -O binary _miniperl miniperl.com
make minitest ran a lot better after switching to the APE and a couple more Configure tweaks: Failed 13 tests out of 357, 96.36% okay.
Removing the stack protector from the Perl build resolved the additional crashes. gcc's stack protector isn't compatible with the Cosmopolitan Libc. Adding -fno-stack-protector to the ccflags stopped Configure from enabling it.
Failed 12 tests out of 359, 96.66% okay.
Failure to load the Errno module caused several of the minitest failures. The Errno module is a pure Perl module; however, it is generated by parsing C headers and output from the C preprocessor. Because Cosmopolitan wants to be as thin a wrapper as possible, the "constants" depend on the currently running operating system and thus must be determined at runtime. The constants for the unix-like platforms are in consts.sh. Therefore, the "constants" couldn't simply be hard-coded to integer values in Errno.pm. The solution was to make a C extension (ErrnoRuntime) that Errno can use to load the current values; however, that didn't solve the problem of Errno not working for miniperl, because miniperl doesn't include the C extensions, as it is used to build them. To work around this in order to test the rest of the modules and build the rest of Perl, I set up Errno.pm to have the current (build-time) constants for use under miniperl only.
In the midst of the Errno patches, I switched away from using the amalgamation header, cosmopolitan.h, to use Cosmopolitan's libc/isystem directory, as that removed the need for dummy headers and stopped everything from being included all the time, fixing the incompatible declarations errors.
Improving and adding to the Cosmopolitan Libc fixed the remaining make minitest failures. Perl's massive test suite did a pretty good job of alerting me to issues before I encountered them in normal use. Working on something as foundational as a libc is exciting: you get to see how the sausage gets made, and you have the opportunity to make some high-impact contributions to open source, as your code may be used in thousands of programs.
- utimesat: stop zipos case from interfering with futimens operation
- Stop sys_lseek from truncating return value to 32 bits
- Add reallocating environ if it's set NULL / Fix NULL pointer dereference when PutEnvImpl is called after clearenv
- Fix stdio fmt of "%.0e" and "%.0g"
- setlocale: Add fake support for locale=""
A variety of headers in libc/isystem were also improved to include the required Cosmopolitan headers to match Perl and Linux's expectations for the headers.
Building Full Perl
After make minitest passed, Perl could be linked together, but without several core standard extensions. As seen with the issues with Errno, Cosmopolitan's "constants" did not work in all the places the constants were used due to the "constants" not being compile-time integer values.
Several extensions relied on ExtUtils::Constant to generate the glue code necessary to retrieve the constants and make them available to Perl. ExtUtils::Constant generated static const arrays containing the necessary constants. To instead generate code compatible with Cosmopolitan's "constants", const was dropped from the array type and the "constants" were removed from the array initializers. The code as originally generated is shown below; const and the constant values in the initializers are the items that were removed:
static const struct iv_s values_for_iv[] =
{
#ifdef AF_INET
{ "AF_INET", 7, AF_INET },
#endif
#ifdef AF_INET6
{ "AF_INET6", 8, AF_INET6 }
#endif
};
Additional code was generated to fill in the array at runtime:
unsigned i = 0;
#ifdef AF_INET
values_for_iv[i++].value = AF_INET;
#endif
#ifdef AF_INET6
values_for_iv[i++].value = AF_INET6;
#endif
The newly generated code is fragile as the constant setup is now split between multiple places and the order of the constant assignment code must be maintained to match with the order as initialized in the array. The generated code should never be edited manually, so it should be good enough for now.
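To make the split between the initializer and the runtime assignments concrete, here is a minimal self-contained sketch of the pattern. It is not the code ExtUtils::Constant generates: DEMO_AF_INET and DEMO_AF_INET6 are hypothetical stand-ins for Cosmopolitan's runtime "constants", and fill_values_for_iv and lookup_iv are illustrative names. The assert on the counter is one possible guard against the ordering fragility described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for runtime "constants": ordinary variables whose
   values are only known once the program is running. */
int DEMO_AF_INET;
int DEMO_AF_INET6;

struct iv_s {
    const char *name;
    size_t namelen;
    long value;
};

/* Not const, and no values in the initializer: only names and lengths. */
static struct iv_s values_for_iv[] = {
    { "AF_INET", 7, 0 },
    { "AF_INET6", 8, 0 },
};

#define IV_COUNT (sizeof(values_for_iv) / sizeof(values_for_iv[0]))

/* Fill in the values at startup. The assignments must stay in the same
   order as the initializer above. */
void fill_values_for_iv(void)
{
    unsigned i = 0;
    values_for_iv[i++].value = DEMO_AF_INET;
    values_for_iv[i++].value = DEMO_AF_INET6;
    assert(i == IV_COUNT); /* catches a missing or extra assignment */
}

/* Look up a constant by name; returns 1 and stores the value on success,
   returns 0 if the name is unknown. */
int lookup_iv(const char *name, long *out)
{
    size_t len = strlen(name);
    for (size_t i = 0; i < IV_COUNT; i++) {
        if (values_for_iv[i].namelen == len &&
            memcmp(values_for_iv[i].name, name, len) == 0) {
            *out = values_for_iv[i].value;
            return 1;
        }
    }
    return 0;
}
```

A caller would run fill_values_for_iv once after the platform values are known and before any lookup. The counter check only detects a count mismatch, not two assignments swapped with each other, so it reduces the fragility rather than removing it.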
Fcntl and IO were fixed by the ExtUtils::Constant changes.
POSIX, in addition to the ExtUtils::Constant changes, required temporarily disabling some functions unimplemented in Cosmopolitan and casting a function to the right signature. Later, the incompatibility was fixed in Cosmopolitan and these hacks were removed.
Socket, in addition to the ExtUtils::Constant changes, required switching away from using "constants" in places that required the values to be known at compile time. inet_ntop's output buffer was changed to use a size large enough for all platforms instead of a platform-dependent constant. switch statements that included cases with the "constants" had to be switched to if statements.
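The switch restriction comes from C itself: case labels must be integer constant expressions, and a value read from a variable at runtime is not one. The sketch below illustrates the rewrite under that assumption. DEMO_AF_INET, DEMO_AF_INET6, and address_length are hypothetical names for illustration and are not taken from the Socket extension.

```c
#include <assert.h>

/* Hypothetical stand-ins for runtime "constants": plain variables, so they
   are not integer constant expressions and cannot be used as case labels. */
int DEMO_AF_INET = 2;
int DEMO_AF_INET6 = 10;

/* With compile-time constants this could have been written as:
       switch (af) { case AF_INET: return 4; case AF_INET6: return 16; }
   With runtime values the same dispatch becomes an if chain. */
int address_length(int af)
{
    /* The chain assumes the two families have distinct values. */
    assert(DEMO_AF_INET != DEMO_AF_INET6);

    if (af == DEMO_AF_INET) {
        return 4;  /* bytes in an IPv4 address */
    } else if (af == DEMO_AF_INET6) {
        return 16; /* bytes in an IPv6 address */
    }
    return -1; /* unknown address family */
}
```

The if chain evaluates the comparisons in order at runtime, so it behaves the same whichever values the running platform assigns.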
Now all the extensions built, and the same process used for make minitest (test, fix, retest) was done for make test. Adding to and improving Cosmopolitan fixed many of the test failures. Much of this was random POSIX functionality such as:
- inet_pton IPv6 address parsing
- sigpending, across the various syscall interfaces and Windows polyfill
- getgroups and setgroups
Building Actually Portable Perl (APPerl)
With a fairly functional Perl, it could now be turned into APPerl. Perl's cross-compiling facilities and Cosmopolitan's ZIP filesystem made this process not too tricky.
Files stored in the internal ZIP file can be accessed just like normal files via /zip, a magic path that maps to the ZIP compressed files. Much of the Libc has special handling for it, though it's read-only and cannot be directly execed from.
Similar to prefix with autoconf, Configure's prefix sets the install directory. Setting prefix to /zip configured Perl to read modules from the ZIP filesystem. Actually installing to /zip on disk, however, would be inconvenient. Running make install with the DESTDIR variable alleviated that concern, as files would instead be installed to DESTDIR/prefix, but Perl would still be configured to prefix.
After the make install, APPerl was made just by adding the contents of DESTDIR/zip to the APE Perl binary, as it is also a ZIP file:
zip -r "perl.com" "lib" "bin"
After removing the unneeded perl executable in bin, bin only contained Perl scripts, not binaries. For example, perldoc embedded in the binary can be launched with ./perl.com /zip/bin/perldoc
Getting perldoc working on Windows
One of Perl's strengths is that it has a large amount of high-quality official and unofficial documentation available. To not starve APPerl users of it, it was critical to make sure it works right at the terminal like usual.
Looking at Perldoc.pm, it was discovered that perldoc works by generating a pod (Plain Old Documentation) document, writing it to disk, and using shell-invoking system to launch a pager and open the document.
Fortunately, one of the pagers perldoc attempts to use, more, actually exists on Windows too, so an alternative viewing method was not needed. However, shell-invoking system was broken on Windows, as unix builds of Perl use sh for shell-invoking system usage. Perl was switched to use cmd.exe when running on Windows. IsWindows is a runtime check on the currently running operating system from the Libc. Cosmopolitan even has some automatic conversion from unix to Windows filepaths in the argv to command line translation.
#ifdef __COSMOPOLITAN__
# include "libc/dce.h"
#endif
#ifdef __COSMOPOLITAN__
if(IsWindows()) {
    PerlProc_execl("/C/Windows/System32/cmd.exe", "/C/Windows/System32/cmd.exe", "/c", cmd, (char *)NULL);
}
else
#endif
{
    PerlProc_execl(PL_sh_path, "sh", "-c", cmd, (char *)NULL);
}
A better solution for cross-platform system is available in the Cosmopolitan Libc: using an embedded command interpreter function (cocmd) instead of execing a separate binary for the shell. The main advantage of using cocmd instead of the current platform's shell is portability. Traditionally, using system portably across operating systems, especially between Windows and unix, was near impossible due to varying command interpreters with incompatible syntax. Using the same command interpreter everywhere gets rid of that compatibility headache. Migrating APPerl's system to use cocmd is on the near-term to-do list. Since the full build of Perl actually depends on shell-invoking usage of its system, the migration cannot occur until a few more sh differences are resolved.
argv[0] script execution
Most programs ignore argv[0] as their program name isn't relevant to execution. However, it can be used to dispatch just like any of the other arguments. Trying to come up with an easier way to launch perldoc, I remembered a technique used by BusyBox and PAR::Packer: use argv[0] to determine what program to run. APPerl tests the basename of argv[0] (with extension stripped) against the scripts in /zip/bin. If a matching script exists, APPerl runs it. This feature allows accessing perldoc with a symlink (ln -s perl.com perldoc) and can be used for other scripts, making single-binary Perl applications possible. Patch to perl.c to enable argv[0] script execution.
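The name-matching step described above can be sketched in plain C. This is an illustration of the technique and not the actual patch to perl.c: script_path_for_argv0 is a hypothetical helper, and treating both '/' and '\\' as path separators is an assumption made so Windows-style paths are handled as well.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the candidate script path for argv[0]: take the basename, strip one
   trailing extension, and prefix the script directory. Returns 0 on success
   and -1 if the result does not fit in out. */
int script_path_for_argv0(const char *argv0, const char *bindir,
                          char *out, size_t outsize)
{
    assert(argv0 != NULL && bindir != NULL && out != NULL);

    /* basename: skip everything up to the last '/' or '\\' */
    const char *base = argv0;
    for (const char *p = argv0; *p != '\0'; p++) {
        if (*p == '/' || *p == '\\') {
            base = p + 1;
        }
    }

    /* strip the extension, but keep names that start with a dot intact */
    size_t len = strlen(base);
    const char *dot = strrchr(base, '.');
    if (dot != NULL && dot != base) {
        len = (size_t)(dot - base);
    }

    int n = snprintf(out, outsize, "%s/%.*s", bindir, (int)len, base);
    return (n < 0 || (size_t)n >= outsize) ? -1 : 0;
}
```

A caller would then check whether the resulting path exists and, if it does not, fall back to normal Perl argument handling. For example, an argv[0] of perl.com maps to /zip/bin/perl, which is absent once the perl executable has been removed from bin, so the interpreter would start as usual.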
The APPerl Perl distribution
Rather than releasing a lone binary as a demo and calling it done, it seemed apt to turn APPerl into a Perl distribution. By Perl distribution, I am referring to a binary distribution of Perl such as Strawberry Perl, ActiveState Perl, or how Perl is released via various package managers.
With some work, the cosmo platform may be able to get merged into the official Perl source, but the modifications and additions needed to turn Perl into APPerl are ill-suited for the perl5 source. Following the model of Strawberry Perl, I created a package for building APPerl, Perl::Dist::APPerl.
Perl::Dist::APPerl is made up of apperlm (APPerl Manager) and accompanying documentation. apperlm is used for building APPerl and includes creating and switching build configs.
The full build of APPerl (all standard modules, extensions, and docs) came out to under 24 MiB. To provide a lighter-weight version, the small config was developed and came out to under 5 MiB. Debian's list of files in perl-base was helpful for understanding what would be a decent subset for the small config. Unfortunately, during the process of creating APPerl, Cosmopolitan Libc dropped support for Windows Vista and 7, as they were too difficult to support without a true 64-bit address space. vista configs were created to still build APPerl for those OSes. The vista configs use a community fork of Cosmopolitan to still provide compatibility.
The full potential of APPerl lies in how it's used. Because APPerl builds into a single portable cross-platform binary, it could be ideal to include in a development SDK. apperlm is tailored to building custom builds of APPerl to package apps. Thus, it can allow you to create single-binary releases of your Perl application. Perl is no longer a second-class citizen to languages that build to binaries, so it's more viable for users unable or unwilling to install Perl. Building Perl from source is not a requirement of an APPerl app: nobuild configs are supported, allowing you to base your app on an existing version of APPerl such as one of the official releases. APPerl can also ease the transition to newer Perl versions, as deploying or switching out one binary is a lot easier than installing a whole tree of files.
Dogfooding
To verify APPerl was ready for external use, it is now used for binary releases of two of my projects: psx_mc_cli and MHFS.
psx_mc_cli is a set of PlayStation MemoryCard file utilities. As expected, it wasn't a very tricky port, as it has only one non-core dependency; however, attempting to make the smallest possible version with its perldoc documentation required some trial and error. To ease creating minimal applications, the "-" prefix was implemented in apperlm to remove items from a config set instead of replacing it, a counterpart to the "+" prefix for appending to a config set. An install script was created to create symlinks, or fall back to renamed copies of the binary, to ease access to the embedded tools. apperl-project.json
MHFS tested out a larger subset of APPerl as it is an HTTP media server. The APPerl port mainly required figuring out how to include the dependencies in APPerl. Perl itself was able to build and install the HTML-Template, URI, and Class-Inspector distributions once they were copied to the cpan directory. File::ShareDir::Install was not included, as MHFS's static files could just be added to zip_extra_files. The App-MHFS distribution and File::ShareDir module were installed via copying from zip_extra_files, as their Makefile.PLs had extension module dependencies (IO and Time::HiRes), making them unbuildable by miniperl. After building, MHFS initially hung when handling some HTTP requests; however, the cause wasn't APPerl, but a regression introduced while attempting to clean up the socket code. Upon reverting the offending change, it worked as expected on Linux. Unfortunately, Windows compatibility isn't possible yet, as Cosmopolitan doesn't support fcntl F_SETFL on Windows (needed to set O_NONBLOCK on the sockets), but I am confident it will soon be added, as Cosmopolitan's Windows polyfills (for fork, etc.) have been spectacular. The recommended dependency MHFS::XS was not added yet, as libFLAC must be ported to Cosmopolitan first. apperl-project.json
Wrap Up
APPerl binaries, usage information, and development links can be found on the APPerl webpage. APPerl and all things Cosmopolitan Libc are discussed in the redbean discord server.
Thank you to the Cosmopolitan Libc contributors for making it possible and encouraging me along the process, especially Justine Tunney and Gautham Venkatasubramanian.