Perl is Actually Portable

by Gavin Hayes

After stumbling upon Gautham's APE Python port and seeing how far the Cosmopolitan Libc has come, I was inspired to see what it would take to port my scripting language of choice, Perl, to the Cosmopolitan Libc and turn it into a self-contained binary. My motivation came from wanting to prove that if Python can do it, Perl can do it too, wanting a more robust Windows Perl port for running my personal media server (MHFS), and the cool factor of hacking on Perl and the Cosmopolitan Libc.

Table of Contents

  Introduction
  1.0 Building Perl with the Cosmopolitan Libc
    1.1 Passing make minitest
    1.2 Building Full Perl
  2.0 Building Actually Portable Perl (APPerl)
    2.1 Getting perldoc working on Windows
    2.2 argv[0] script execution
  3.0 The APPerl Perl distribution
    3.1 Dogfooding
  4.0 Wrap Up

Introduction

The Cosmopolitan Libc allows building Actually Portable Executables: statically linked binaries that will run on six operating systems + BIOS. Perl is written in C. What could we do with an Actually Portable Executable version of Perl? A sysadmin managing a fleet of Red Hat and Windows boxes could start using the latest (and identical) version of Perl across all machines without compiling several different builds of Perl. It could be packaged into a Software Development Kit for easily getting Perl and accompanying scripts onto developer machines. A Perl application developer could increase reach and reduce support burden by also offering a binary release.

Making this vision a reality required some tinkering and development. There were a few gaps in the Libc functionality, as the Cosmopolitan Libc is still a relatively new C library, having been in development for a little over two years. To build Actually Portable Executables, the Cosmopolitan Libc breaks some of the rules and conventions held by other libcs. Unlike most libcs, the Cosmopolitan Libc is not bound to a compiler. I discuss how such issues were addressed below.

Building Perl with the Cosmopolitan Libc

After cloning the Perl source, the next step was generating a proper build config with Configure. Configure is not generated by GNU Autotools, but by Perl's own configuration-generator generator, metaconfig, originally written by Larry Wall himself. Following the model of an early version of the Python port, I made a superConfigure script to invoke Configure with the flags needed to build with Cosmopolitan. Unfortunately, passing flags in did not mean Configure would use them. In particular, libs were applied to the build config, but weren't used in most of Configure's own test compiles. Building for the Cosmopolitan Libc requires passing a long incantation of flags to gcc because the libc isn't bound to the C compiler. Edits aren't supposed to be made directly to Configure, but a patch file or metaconfig changes could be done later. Modifying Configure to always pass libs wasn't difficult; it just had to be done in a few places, so that its compile commands look like this:
$cc -o try $ccflags $ldflags try.c $libs

Unsurprisingly, after the first Configure, the build failed: the compiler errored out on missing headers. Initially this was fixed in two ways: preventing Configure from finding non-Cosmopolitan headers, and adding dummy (empty) header files to its and gcc's search paths. All the needed libc symbols were already visible to gcc because the ccflags force-included the cosmopolitan.h amalgamation header with -include. However, Configure and the Perl build depend on the libc header files being available. Using the amalgamation header ultimately proved to be problematic, but the dummy headers seemed like an easy fix at the time. Configure checked for the availability of certain libc symbols by creating test programs that declared them and then testing whether the programs compiled. To accommodate the symbols already being declared by the amalgamation, the symbol test was patched to not declare the symbol and just use it (old line marked with -, new line with +):

- $extern_C void *$1$tdc; void *(*(p()))$tdc { return &$1; }
+ void *(*(p()))$tdc { return (void*)&$1; }
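
To make the difference concrete, here is a minimal sketch of what the patched style of probe amounts to in plain C, assuming the symbol under test is getuid and that its declaration comes from a header that is already included (unistd.h stands in for the amalgamation header here). The names generic_fn and probe_getuid are made up for illustration; Configure generates its real probe from the shell template above.

```c
#include <unistd.h>   /* stand-in for the amalgamation header: declares getuid */

/* Generic function-pointer type, used only to carry the address around. */
typedef void (*generic_fn)(void);

/*
 * Patched-style probe: the symbol is NOT re-declared here. Only its address
 * is taken, so this compiles and links only if an included header declares
 * getuid and the libc provides it. Re-declaring it with a made-up type (as
 * the old probe did) would clash with the declaration from the header.
 */
generic_fn probe_getuid(void)
{
    return (generic_fn)&getuid;
}
```

If the probe compiles and links, the symbol is considered available; the returned address itself is never used for anything.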

Perl itself also sometimes declared libc symbols, which led to conflicting-types errors. Temporarily, it was fine to comment out the declarations as cosmopolitan.h already contained them.

//Uid_t getuid (void);
//Uid_t geteuid (void);
//Gid_t getgid (void);
//Gid_t getegid (void);

After the last change, miniperl built! Perl programmers often prefer writing Perl to writing sh, so Perl is built with Perl! To avoid a chicken-and-egg situation that would make it difficult to port Perl to new platforms, the first thing built is miniperl: Perl without C or XS extension modules, which can be used to assist with building the rest of Perl.

To reduce dependence on the superConfigure script and make building Perl for Cosmopolitan less different from building Perl for other platforms, a hints file was created, hints/cosmo.sh. Once a platform is detected or set in Configure, Configure loads the corresponding hints file, which supplies settings that ease building Perl for that platform.

Passing make minitest

Basic functionality appeared to work under miniperl, but make had failed and suggested running make minitest, a subset of tests that can be run on miniperl. For a first run, make minitest wasn't too bad: Failed 95 tests out of 339, 71.98% okay. However, it was alarming that a crash in miniperl kept popping up:

libc/zipos/find.c:34: assert(ZIP_CFILE_MAGIC(zipos->map + c) == kZipCfileHdrMagic) failed

Fortunately, it was an easy fix: in addition to building ELF executables as usual, the Makefile needed to use objcopy to create APEs (Actually Portable Executables), because zipos, Cosmopolitan's internal ZIP filesystem, fails to load on non-APE binaries. The command is objcopy -S -O binary _miniperl miniperl.com. make minitest ran a lot better after switching to the APE and a couple more Configure tweaks: Failed 13 tests out of 357, 96.36% okay.
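
For readers unfamiliar with the check that failed, the sketch below shows the kind of test the assertion message suggests: comparing four bytes against the signature that starts every ZIP central directory file header. This is not Cosmopolitan's code. The function names are hypothetical, and the only fact relied on is the standard ZIP signature "PK\1\2" (0x02014b50 when read as a little-endian 32-bit value), which I assume is what kZipCfileHdrMagic stands for.

```c
#include <stddef.h>
#include <stdint.h>

/* Standard ZIP signature for a central directory file header: "PK\1\2". */
#define DEMO_ZIP_CFILE_HDR_MAGIC 0x02014b50u

/* Read a 32-bit little-endian value without relying on host byte order. */
static uint32_t demo_read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Return 1 if a central directory file header starts at buf[off], else 0. */
int demo_has_cfile_magic(const unsigned char *buf, size_t len, size_t off)
{
    if (off > len || len - off < 4)
        return 0;   /* not enough bytes left to hold a signature */
    return demo_read_le32(buf + off) == DEMO_ZIP_CFILE_HDR_MAGIC;
}
```

A plain ELF file has no ZIP central directory at the expected location, which is consistent with the assertion firing before the build switched to the APE output.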

Removing the stack protector from the Perl build resolved the additional crashes. gcc's stack protector isn't compatible with the Cosmopolitan Libc. Adding -fno-stack-protector to the ccflags stopped Configure from enabling it. Failed 12 tests out of 359, 96.66% okay.

Failure to load the Errno module caused several of the minitest failures. The Errno module is a pure Perl module; however, it is generated by parsing C headers and output from the C preprocessor. Because Cosmopolitan wants to be as thin a wrapper as possible, the "constants" depend on the currently running operating system and thus must be determined at runtime (the values for the unix-like platforms are listed in consts.sh). Therefore, the "constants" couldn't simply be hard-coded to integer values in Errno.pm. The solution was to make a C extension (ErrnoRuntime) that Errno can use to load the current values. However, that didn't solve the problem of Errno not working for miniperl, because miniperl doesn't include the C extensions as it is used to build them. To work around this in order to test the rest of the modules and build the rest of Perl, I set up Errno.pm to have the current (build-time) constants for use under miniperl only.
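
The sketch below illustrates the general idea of resolving such values at runtime rather than baking them in. It is not the ErrnoRuntime extension. Every name is hypothetical, the "constants" are modeled as ordinary variables that startup code fills in (an assumption about the mechanism), and the numbers are arbitrary placeholders, not real errno values for any platform.

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins for runtime "constants": plain variables, set during startup. */
int demo_ENOENT;
int demo_EAGAIN;

/* Pretend startup code that picks values for the detected OS (placeholders). */
void demo_init_errnos(int is_windows)
{
    demo_ENOENT = is_windows ? 1002 : 102;
    demo_EAGAIN = is_windows ? 1035 : 111;
}

/* Name -> address table. Addresses are known at link time; values are not. */
static const struct { const char *name; const int *value; } demo_errnos[] = {
    { "ENOENT", &demo_ENOENT },
    { "EAGAIN", &demo_EAGAIN },
};

/* Look up the current value of a named errno; returns -1 if unknown. */
int demo_errno_value(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof demo_errnos / sizeof demo_errnos[0]; i++)
        if (strcmp(demo_errnos[i].name, name) == 0)
            return *demo_errnos[i].value;
    return -1;
}
```

A pure Perl module has no way to perform this kind of lookup on its own, which is why a C extension is needed for the full build and a build-time snapshot has to do for miniperl.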

In the midst of the Errno patches, I switched away from the amalgamation header, cosmopolitan.h, to Cosmopolitan's libc/isystem directory. That removed the need for dummy headers and stopped everything from being included all the time, fixing the incompatible-declaration errors.

Improving and adding to the Cosmopolitan Libc fixed the remaining make minitest failures. Perl's massive test suite did a pretty good job of surfacing issues before they were encountered in normal use. Working on something as fundamental as a libc is exciting: you get to see how the sausage gets made, and you have the opportunity to make high-impact contributions to open source, as your code may be used in thousands of programs.

A variety of headers in libc/isystem were also improved to include the required Cosmopolitan headers to match Perl and Linux's expectations for the headers.

Building Full Perl

After make minitest passed, Perl could be linked together, but without several core standard extensions. As seen with the issues with Errno, Cosmopolitan's "constants" did not work in all the places the constants were used due to the "constants" not being compile-time integer values.

Several extensions relied on ExtUtils::Constant to generate the glue code necessary to retrieve the constants and make them available to Perl. ExtUtils::Constant generated static const arrays containing the necessary constants. To instead generate code compatible with Cosmopolitan's "constants", const was dropped from the array type and the "constants" were removed from the array. In the originally generated code shown here, the removed items are the const qualifier and the constant in the third field of each entry:

static const struct iv_s values_for_iv[] =
      {
#ifdef AF_INET
        { "AF_INET", 7, AF_INET },
#endif
#ifdef AF_INET6
        { "AF_INET6", 8, AF_INET6 }
#endif
      };

Additional code was generated to fill-in the array at runtime:

unsigned i = 0;
#ifdef AF_INET
values_for_iv[i++].value = AF_INET;
#endif
#ifdef AF_INET6
values_for_iv[i++].value = AF_INET6;
#endif

The newly generated code is fragile as the constant setup is now split between multiple places and the order of the constant assignment code must be maintained to match with the order as initialized in the array. The generated code should never be edited manually, so it should be good enough for now.
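
Put together, the two fragments above behave roughly like the self-contained sketch below. It assumes the runtime "constants" can be modeled as ordinary variables assigned at startup. The struct layout, the demo_ names, and the accessor are hypothetical and are not the code ExtUtils::Constant actually emits.

```c
#include <stddef.h>

/* Stand-ins for runtime "constants": plain variables, set during startup. */
int demo_AF_INET;
int demo_AF_INET6;

/* Hypothetical entry layout: name, name length, integer value. */
struct demo_iv_s { const char *name; size_t namelen; long value; };

/* No longer const: the value fields are zero until filled in at runtime. */
static struct demo_iv_s demo_values_for_iv[] = {
    { "AF_INET",  7, 0 },
    { "AF_INET6", 8, 0 },
};

/* Fill the table. Assignment order must match the initializer order above. */
void demo_fill_values(void)
{
    unsigned i = 0;
    demo_values_for_iv[i++].value = demo_AF_INET;
    demo_values_for_iv[i++].value = demo_AF_INET6;
}

/* Accessor so callers can read an entry's value by index. */
long demo_value_at(size_t idx)
{
    return demo_values_for_iv[idx].value;
}
```

The ordering dependency is visible here: swapping the two assignments in demo_fill_values would silently attach the wrong value to each name.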

Fcntl and IO were fixed by the ExtUtils::Constant changes.

POSIX, in addition to the ExtUtils::Constant changes, required temporarily disabling some functions that Cosmopolitan had not yet implemented and casting a function to the right signature. Later, the incompatibility was fixed in Cosmopolitan and these hacks were removed.

Socket, in addition to the ExtUtils::Constant changes, required switching away from using "constants" in places that required the values to be known at compile time. inet_ntop's output buffer was changed to use a size large enough for all platforms instead of a platform-dependent constant. switch statements that included cases with the "constants" had to be converted to if statements.
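
The switch-to-if conversion follows from a C language rule: case labels must be integer constant expressions, so a value that is only known at runtime cannot appear in one. A minimal sketch, with hypothetical names and with the "constants" modeled as ordinary variables set at startup (an assumption, not Cosmopolitan's or Socket's actual code):

```c
#include <stddef.h>

/* Stand-ins for runtime "constants": plain variables, set during startup. */
int sk_AF_INET;
int sk_AF_INET6;

/*
 * Return the size in bytes of a raw address for the given family.
 * "case sk_AF_INET:" would not compile because sk_AF_INET is a variable,
 * not a constant expression; an if/else chain has no such restriction.
 */
size_t sk_addr_len(int family)
{
    if (family == sk_AF_INET)
        return 4;    /* IPv4 address */
    else if (family == sk_AF_INET6)
        return 16;   /* IPv6 address */
    return 0;        /* unknown family */
}
```

The comparisons are evaluated against whatever the variables hold at call time, so the same binary gives correct answers on platforms that number the families differently.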

Now all the extensions built, and the same process used for make minitest (test, fix, retest) was repeated for make test. Adding to and improving Cosmopolitan fixed many of the test failures. Much of this was assorted POSIX functionality.

Building Actually Portable Perl (APPerl)

With a fairly functional Perl, it could now be turned into APPerl. Perl's cross-compiling facilities and Cosmopolitan's ZIP filesystem made this process not too tricky.

Files stored in the internal ZIP file can be accessed just like normal files via /zip, a magic path that maps to the ZIP-compressed files. Much of the Libc has special handling for it, though it is read-only and files in it cannot be exec'd directly.
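
In practice this means ordinary file APIs are enough. The sketch below is a plain stdio function with nothing Cosmopolitan-specific in it; the function name is made up, and the claim that a /zip path resolves to embedded content applies only inside a Cosmopolitan-built APE, per the description above.

```c
#include <stdio.h>

/*
 * Count the bytes in a file using only standard stdio calls.
 * Returns -1 if the file cannot be opened.
 */
long demo_count_file_bytes(const char *path)
{
    FILE *fp = fopen(path, "rb");
    long n = 0;

    if (fp == NULL)
        return -1;
    while (fgetc(fp) != EOF)
        n++;
    fclose(fp);
    return n;
}
```

Inside an APE, a call such as demo_count_file_bytes("/zip/bin/perldoc") would be expected to read the embedded file. In a binary built against another libc, /zip is just an ordinary path that most likely does not exist, and the function returns -1.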

Similar to prefix with autoconf, Configure's prefix sets the install directory. Setting prefix to /zip configured Perl to read modules from the ZIP filesystem. Actually installing to /zip on disk, however, would be inconvenient. Running make install with the DESTDIR variable alleviated that concern: files are installed to DESTDIR/prefix instead, while Perl remains configured to look in prefix.

After the make install, APPerl was made just by adding the contents of DESTDIR/zip to the APE Perl binary, as it is also a ZIP file: zip -r "perl.com" "lib" "bin". After removing the unneeded perl executable in bin, bin only contained Perl scripts, not binaries. For example, the perldoc embedded in the binary can be launched with ./perl.com /zip/bin/perldoc

Getting perldoc working on Windows

One of Perl's strengths is that it has a large amount of high-quality official and unofficial documentation available. To not starve APPerl users of it, it was critical to make sure perldoc works at the terminal as usual.

Looking at Perldoc.pm, I found that perldoc works by generating a pod (Plain Old Documentation) document, writing it to disk, and using shell-invoking system to launch a pager and open the document.

Fortunately, one of the pagers perldoc attempts to use, more, actually exists on Windows too, so an alternative viewing method was not needed. However, shell-invoking system was broken on Windows because unix builds of Perl use sh for it. Perl was switched to use cmd.exe when running on Windows. IsWindows, provided by the Libc, is a runtime check on the currently running operating system. Cosmopolitan even has some automatic conversion from unix to Windows file paths in the argv to command line translation.

#ifdef __COSMOPOLITAN__
#    include "libc/dce.h"
#endif
#ifdef __COSMOPOLITAN__
if(IsWindows())
{
    PerlProc_execl("/C/Windows/System32/cmd.exe", "/C/Windows/System32/cmd.exe", "/c", cmd, (char *)NULL);
}
else
#endif
{
    PerlProc_execl(PL_sh_path, "sh", "-c", cmd, (char *)NULL);
}

A better solution for cross-platform system is available in the Cosmopolitan Libc: using an embedded command interpreter function (cocmd) instead of execing a separate binary for the shell. The main advantage of using cocmd instead of the current platform's shell is portability. Traditionally, using system portably across operating systems, especially between Windows and unix, was nearly impossible due to varying command interpreters with incompatible syntax. Using the same command interpreter everywhere gets rid of that compatibility headache. Migrating APPerl's system to use cocmd is on the near-term to-do list. Since the full build of Perl actually depends on shell-invoking usage of its system, the migration cannot occur until a few more sh differences are resolved.

argv[0] script execution

Most programs ignore argv[0] as their program name isn't relevant to execution. However, it can be used to dispatch just like any of the other arguments. Trying to come up with an easier way to launch perldoc, I remembered a technique used by BusyBox and PAR::Packer: use argv[0] to determine what program to run. APPerl tests the basename of argv[0] (with the extension stripped) against the scripts in /zip/bin. If a matching script exists, APPerl runs it. This feature allows accessing perldoc with a symlink (ln -s perl.com perldoc) and can be used for other scripts, making single-binary Perl applications possible. The feature is implemented as a patch to perl.c.
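
The name-mapping step can be sketched as follows. This is not the actual patch to perl.c. The function name is hypothetical, and the sketch only builds the candidate path: checking that the script exists in /zip/bin and then running it is left to the caller.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build "/zip/bin/<name>" from argv[0], where <name> is the basename with
 * any extension removed. Returns 0 on success, -1 if the name is empty or
 * the result does not fit in the output buffer.
 */
int demo_script_path_from_argv0(const char *argv0, char *out, size_t outlen)
{
    const char *base = argv0;
    const char *p;
    const char *dot;
    size_t namelen;
    int n;

    /* Skip past the last '/' or '\\' so only the file name remains. */
    for (p = argv0; *p != '\0'; p++)
        if (*p == '/' || *p == '\\')
            base = p + 1;

    /* Strip the extension, if any: "perldoc.com" becomes "perldoc". */
    dot = strrchr(base, '.');
    namelen = (dot != NULL && dot != base) ? (size_t)(dot - base) : strlen(base);
    if (namelen == 0)
        return -1;

    n = snprintf(out, outlen, "/zip/bin/%.*s", (int)namelen, base);
    if (n < 0 || (size_t)n >= outlen)
        return -1;
    return 0;
}
```

With a symlink named perldoc pointing at perl.com, argv[0] is "./perldoc" (or similar), and the function yields "/zip/bin/perldoc", the same path that was passed explicitly earlier in ./perl.com /zip/bin/perldoc.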

The APPerl Perl distribution

Rather than releasing a lone binary as a demo and calling it done, it seemed apt to turn APPerl into a Perl distribution. By Perl distribution, I am referring to a binary distribution of Perl such as Strawberry Perl, ActiveState Perl, or how Perl is released via various package managers.

With some work, the cosmo platform may be able to get merged into the official Perl source, but the modifications and additions needed to turn Perl into APPerl are ill-suited for the perl5 source. Following the model of Strawberry Perl, I created a package for building APPerl, Perl::Dist::APPerl.

Perl::Dist::APPerl is made up of apperlm (APPerl Manager) and accompanying documentation. apperlm is used for building APPerl and includes creating and switching build configs.

The full build of APPerl (all standard modules, extensions, and docs) came out to under 24 MiB. To provide a lighter-weight version, the small config was developed and came out to under 5 MiB. Debian's list of files in perl-base was helpful for understanding what would be a decent subset for the small config. Unfortunately, during the process of creating APPerl, Cosmopolitan Libc dropped support for Windows Vista and 7 because they do not have a true 64-bit address space, which made them too difficult to support. vista configs were created so that APPerl can still be built for those OSes. The vista configs use a community fork of Cosmopolitan to retain compatibility.

The full potential of APPerl lies in how it's used. Because APPerl builds into a single portable cross-platform binary, it could be ideal to include in a development SDK. apperlm is tailored to building custom builds of APPerl to package apps. Thus, it can allow you to create single-binary releases of your Perl application. Perl is no longer a second-class citizen to languages that build to binaries, so it's more viable for users unable or unwilling to install Perl. Building Perl from source is not a requirement of an APPerl app: nobuild configs are supported, allowing you to base your app on an existing version of APPerl such as one of the official releases. APPerl can also ease the transition to newer Perl versions, as deploying or switching out one binary is a lot easier than installing a whole tree of files.

Dogfooding

To verify that APPerl is ready for external use, I now use it for binary releases of two of my projects: psx_mc_cli and MHFS.

psx_mc_cli is a set of PlayStation MemoryCard file utilities. As expected, it wasn't a very tricky port, as it has only one non-core dependency; however, attempting to make the smallest possible version that still includes its perldoc documentation required some trial and error. To ease creating minimal applications, the "-" prefix was implemented in apperlm to remove items from a config set instead of replacing it, a counterpart to the "+" prefix for appending to a config set. An install script was written to create symlinks, or fall back to renamed copies of the binary, to ease access to the embedded tools. The project configuration is in apperl-project.json.

MHFS tested out a larger subset of APPerl as it is an HTTP media server. The APPerl port mainly required figuring out how to include the dependencies in APPerl. Perl itself was able to build and install the HTML-Template, URI, and Class-Inspector distributions once they were copied to the cpan directory. File::ShareDir::Install was not included as MHFS's static files could just be added to zip_extra_files. The App-MHFS distribution and File::ShareDir module were installed by copying from zip_extra_files, as their Makefile.PLs had extension module dependencies (IO and Time::HiRes), making them unbuildable by miniperl. After building, MHFS initially hung when handling some HTTP requests; however, the cause wasn't APPerl, but a regression introduced while attempting to clean up the socket code. Upon reverting the offending change, it worked as expected on Linux. Unfortunately, Windows compatibility isn't possible yet as Cosmopolitan doesn't support fcntl F_SETFL on Windows (needed to set O_NONBLOCK on the sockets), but I am confident it will soon be added as Cosmopolitan's Windows polyfills (for fork, etc.) have been spectacular. The recommended dependency MHFS::XS was not added yet as libFLAC must be ported to Cosmopolitan first. The project configuration is in apperl-project.json.

Wrap Up

APPerl binaries, usage information, and development links can be found on the APPerl webpage. APPerl and all things Cosmopolitan Libc are discussed in the redbean Discord server.

Thank you to the Cosmopolitan Libc contributors for making it possible and encouraging me along the process, especially Justine Tunney and Gautham Venkatasubramanian.