What is pestle_import?
Here's the basic problem pestle needs to solve. When a user calls the pestle_import function
namespace My\Programs\FullNamespace;
pestle_import('FullNamespace\To\someFunction');
pestle needs to ensure that calls to someFunction in the local namespace will behave as though someFunction were called via its full name.
Put another way, this program
namespace My\Programs\FullNamespace;
pestle_import('FullNamespace\To\someFunction');
someFunction();
needs to behave like this program
namespace My\Programs\FullNamespace;
require_once 'modules/fullnamespace/to/module.php';
\FullNamespace\To\someFunction();
Put a third way, pestle_import needs to
- Define a local function named someFunction
- Ensure any un-namespaced symbols in someFunction behave as though they're being called from the original namespace of someFunction
It's this second part that's the tricky one. If someFunction looked like this
function someFunction() {
    echo "Hello Function";
}
we could just define a new function named someFunction in the My\Programs\FullNamespace namespace and be done. However, if the function looks like this
function someFunction() {
    return someOtherFunction();
}
we can't just include it into My\Programs\FullNamespace, because then PHP would attempt to call My\Programs\FullNamespace\someOtherFunction, which doesn't exist. We need to make sure someOtherFunction still calls the function from its original namespace.
In other words, we're dealing with a class of problem that's traditionally solved by a linker.
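To make the resolution problem concrete, here's a minimal sketch. The function bodies and the someFunctionCopy name are our own assumptions for illustration; none of this is pestle code. It shows that an unqualified call inside a function resolves against the namespace the function was defined in, so a naive copy pasted into the importing namespace loses track of its helper.

```php
<?php
namespace FullNamespace\To;

function someOtherFunction()
{
    return 'original helper';
}

function someFunction()
{
    // Resolves to FullNamespace\To\someOtherFunction.
    return someOtherFunction();
}

namespace My\Programs\FullNamespace;

// Hypothetical naive "import": the same body pasted into the importing namespace.
function someFunctionCopy()
{
    // An unqualified call here is looked up as
    // My\Programs\FullNamespace\someOtherFunction, then as a global
    // someOtherFunction. FullNamespace\To is never consulted.
    $local  = function_exists(__NAMESPACE__ . '\\someOtherFunction');
    $global = function_exists('someOtherFunction');
    return ($local || $global) ? someOtherFunction() : 'unresolved';
}

echo \FullNamespace\To\someFunction(), PHP_EOL; // original helper
echo someFunctionCopy(), PHP_EOL;               // unresolved
```

The function_exists guard is only there so the sketch prints a message instead of dying with an undefined function error.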
Incremental Strategies
This is a non-trivial problem to solve robustly, especially from PHP userland. So far pestle has taken non-optimal, incremental approaches to the problem. This allows us to build up a body of code that runs via the pestle_import pattern. Building up that body of code means we can experiment with syntax and features before committing to a robust solution.
pestle currently uses a reflection-based strategy to implement local importing of functions. At a high level, pestle keeps a registry of the functions imported in each namespace, indexed by their local/short name. For each of these functions pestle will generate a local PHP include file that defines an un-namespaced function. This un-namespaced function will reference the registry and invoke the full PHP function via reflection.
Pestle Import Lifecycle: 10,000 foot view
From a very high level, the pestle_import function has three distinct stages.
First, it must load the original function under its original namespace.
Second, it must register the short function name as being called from a particular namespace and pointing at the original function.
Third, it must generate, cache, and include an "executor function" which will be loaded into the current namespace and do the work of calling the original function.
That's pretty high level though -- let's take a look at what all that actually means.
Pestle Import
We're going to follow a pestle_import call that looks like this
namespace Foo\Baz\Bar;
pestle_import('Pulsestorm\Pestle\Library\inputOrIndex');
With this call, the end-user-programmer is saying
Hey pestle, I want to use the inputOrIndex function from the Pulsestorm\Pestle\Library module/namespace in my program.
Here's the entry point of a pestle_import call
#File: astorm/pestle/modules/pulsestorm/pestle/importer/module.php
function pestle_import($thing_to_import, $as=false)
{
    /**
     * @var string $ns_called_from ex. \Namespace\Called\From
     */
    $ns_called_from = getNamespaceCalledFrom();
    $thing_to_import = trim($thing_to_import, '\\');
    includeModule($thing_to_import);
    includeCode($thing_to_import, $ns_called_from);
    return true;
}
The first line,
$ns_called_from = getNamespaceCalledFrom();
fetches the namespace that pestle_import was called from. In the above example this will be Foo\Baz\Bar. The getNamespaceCalledFrom function uses information from PHP's built-in debug_backtrace function to determine where pestle_import was called from.
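The body of getNamespaceCalledFrom isn't shown here, so the following is only a sketch of one way such a lookup could work, not pestle's implementation. It assumes we take the calling file from debug_backtrace and then read that file's namespace declaration with a regular expression. Both function names are hypothetical.

```php
<?php
// Hypothetical helper: pull the first namespace declaration out of PHP source.
function namespaceFromSourceSketch(string $source): string
{
    if (preg_match('/^\s*namespace\s+([A-Za-z_][\w\\\\]*)\s*[;{]/m', $source, $m)) {
        return $m[1];
    }
    return ''; // no declaration found: treat as the global namespace
}

// Hypothetical helper: the namespace declared in the file that called us.
function namespaceCalledFromSketch(): string
{
    // Frame 0 describes the call to this function, including the calling file.
    $frame = debug_backtrace(DEBUG_BACKTRACE_IGNORE_ARGS, 1)[0] ?? [];
    $file  = $frame['file'] ?? '';
    return is_file($file)
        ? namespaceFromSourceSketch(file_get_contents($file))
        : '';
}

echo namespaceFromSourceSketch("<?php\nnamespace Foo\\Baz\\Bar;\n"), PHP_EOL; // Foo\Baz\Bar
```

A regular expression is the simplest thing that works for files with a single namespace declaration. A file that declares several namespaces would need the line number from the backtrace frame as well.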
Next, the includeModule code will require in the source module for Pulsestorm\Pestle\Library\inputOrIndex.
#File: astorm/pestle/modules/pulsestorm/pestle/importer/module.php
function includeModule($function_name)
{
    $function_name = strToLower($function_name);
    $parts = explode('\\', $function_name);
    $short_name = array_pop($parts);
    $namespace = implode('/',$parts);
    $file = $namespace . '/module.php';
    return require_once(getPathFromFunctionName($function_name));
}
This is where we can find the code that converts a PHP namespace into a pestle module file path. The getPathFromFunctionName function will search through pestle's default modules, as well as any configured modules, until it finds a module.php file.
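The body of getPathFromFunctionName isn't shown above, so here's a hedged sketch of the namespace-to-path idea. It follows the same lowercase/explode/implode steps as includeModule. The function names and the list of base directories are assumptions made for illustration.

```php
<?php
// Hypothetical helper: 'Pulsestorm\Pestle\Library\inputOrIndex'
// becomes 'pulsestorm/pestle/library/module.php'.
function relativeModulePathSketch(string $functionName): string
{
    $parts = explode('\\', strtolower(trim($functionName, '\\')));
    array_pop($parts); // drop the short function name
    return implode('/', $parts) . '/module.php';
}

// Hypothetical helper: return the first base directory that has the module file.
function findModuleSketch(string $functionName, array $baseDirs): ?string
{
    $relative = relativeModulePathSketch($functionName);
    foreach ($baseDirs as $dir) {
        $candidate = rtrim($dir, '/') . '/' . $relative;
        if (is_file($candidate)) {
            return $candidate;
        }
    }
    return null; // no module.php found in any base directory
}

echo relativeModulePathSketch('Pulsestorm\Pestle\Library\inputOrIndex'), PHP_EOL;
```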
Once includeModule is done, pestle will call the includeCode function, which is a wrapper function for the current pestle import strategy.
function includeCode($thing_to_import, $ns_called_from)
{
    includeCodeReflectionStrategy($thing_to_import, $ns_called_from);
    // ... older strategies are often left, commented out ...
}
Reflection Strategy: Registering the Function
The includeCodeReflectionStrategy function has two jobs -- the first is to register the function/symbol we want to import, and the second is to ensure an executor function is loaded.
#File: modules/pulsestorm/pestle/importer/module.php
function includeCodeReflectionStrategy($thing_to_import, $ns_called_from)
{
    $parts = explode('\\', $thing_to_import);
    $short_name = array_pop($parts);
    functionRegister($short_name, $ns_called_from, $thing_to_import);
    generateOrIncludeExecutorFunction($short_name, $thing_to_import);
}
Registering a function means adding the function we want to import to a global registry that keeps track of the following two things
- Which namespace the function's being imported into (that is, the namespace pestle_import was called from)
- The actual, fully namespaced, PHP function that this imported function points at
The functionRegister function uses a classless getter/setter registry pattern.
function functionRegister($short_name,$ns_called_from=false,$namespaced_function=false)
{
    static $functions=[];
    if(!$namespaced_function) {
        return functionRegisterGet($functions, $short_name,$ns_called_from);
        // return $functions[$short_name][$ns_called_from];
    }
    return functionRegisterSet($functions,$short_name,$ns_called_from,$namespaced_function);
}
The function registry uses the static variable $functions to store its values. If users call functionRegister with the $namespaced_function argument, functionRegister will set a value. If this argument is not used, functionRegister will get a value from its registry.
The functionRegister call in the includeCodeReflectionStrategy function
functionRegister($short_name, $ns_called_from, $thing_to_import);
is setting a value.
function functionRegisterSet(&$functions, $short_name, $ns_called_from, $namespaced_function)
{
    $functions[$short_name][$ns_called_from] = $namespaced_function;
}
Again, this registry needs to keep track of every namespace a short function name is imported into, and which actual PHP function name it points at. So something like this --
namespace Foo\Baz\Bar;
pestle_import('Pulsestorm\Pestle\Library\inputOrIndex');
Will result in the following registered values
$functions['inputOrIndex']['Foo\Baz\Bar'] = 'Pulsestorm\Pestle\Library\inputOrIndex';
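Here's a compact, self-contained sketch of the same getter/setter idea, to show why the registry is keyed by both the short name and the calling namespace. The registrySketch function and the Some\Other\Module target are hypothetical; pestle's own functionRegisterGet isn't shown above, so the "get" branch here is an assumption.

```php
<?php
// Hypothetical registry: called with a target it sets, without one it gets.
function registrySketch(string $shortName, string $nsCalledFrom, ?string $target = null): ?string
{
    static $functions = [];
    if ($target === null) {
        // Getter: null when nothing was registered for this name/namespace pair.
        return $functions[$shortName][$nsCalledFrom] ?? null;
    }
    $functions[$shortName][$nsCalledFrom] = $target;
    return $target;
}

// The same short name can point at different functions in different namespaces.
registrySketch('inputOrIndex', 'Foo\Baz\Bar', 'Pulsestorm\Pestle\Library\inputOrIndex');
registrySketch('inputOrIndex', 'Other\Program', 'Some\Other\Module\inputOrIndex');

echo registrySketch('inputOrIndex', 'Foo\Baz\Bar'), PHP_EOL;
echo registrySketch('inputOrIndex', 'Other\Program'), PHP_EOL;
```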
Reflection Strategy: Generating the Executor
After registering the function, we need to include an executor function in the local namespace.
generateOrIncludeExecutorFunction($short_name, $thing_to_import);
"Executor Function" is our own term. When an end-user-programmer says something like this
namespace Foo\Baz\Bar;
pestle_import('Pulsestorm\Pestle\Library\inputOrIndex');
We need to ensure there's a function named inputOrIndex in the local namespace. That is, the user needs to be able to call that function
namespace Foo\Baz\Bar;
pestle_import('Pulsestorm\Pestle\Library\inputOrIndex');
//...
inputOrIndex();
pestle_import is called from this local namespace (Foo\Baz\Bar), but a PHP file that we include does not inherit that namespace. If we include a file with a function definition that has no namespace declaration, PHP defines the function in the global namespace, and an unqualified call to inputOrIndex() from Foo\Baz\Bar falls back to that global function as long as Foo\Baz\Bar defines no function of the same name. So part of generateOrIncludeExecutorFunction's job is to generate an include file (stored in a local, file based cache), and then include it.
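A quick way to see how a namespace-less definition behaves is the sketch below. It uses eval as a stand-in for including a generated file, purely so the example is self-contained; pestle includes a cached file instead, and executorDemo is a made-up name. Code that is evaluated or included without a namespace declaration defines its functions in the global namespace, and PHP falls back to global functions when an unqualified call finds no match in the current namespace.

```php
<?php
namespace Foo\Baz\Bar;

// eval() stands in for include-ing a generated file with no namespace line.
eval('function executorDemo() { return __FUNCTION__; }');

// The function was defined globally, not inside Foo\Baz\Bar ...
var_dump(function_exists('Foo\Baz\Bar\executorDemo')); // bool(false)
var_dump(function_exists('executorDemo'));             // bool(true)

// ... but the short name still works here thanks to the global fallback.
echo executorDemo(), PHP_EOL;                          // executorDemo
```

Because the one global function serves every importing namespace, the executor has to ask at call time which namespace it was called from.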
The other part of generateOrIncludeExecutorFunction's job is making sure that the code it generates will result in the original PHP function being called. You can see all the executor functions pestle's generated by looking in your /tmp/pestle_cache folder -- here's one example.
#File: /private/tmp/pestle_cache/[cache-key]/reflection-strategy/[2nd-cache-key].php
<?php
use function Pulsestorm\Pestle\Importer\functionRegister;
use function Pulsestorm\Pestle\Importer\getNamespaceCalledFromForGenerated;
function inputOrIndex(){
    $function = functionRegister(__FUNCTION__, getNamespaceCalledFromForGenerated());
    $args = func_get_args();
    return (new \ReflectionFunction($function))->invokeArgs($args);
}
##exported for Pulsestorm\Pestle\Library\inputOrIndex
This code was generated when an end user programmer said
pestle_import('Pulsestorm\Pestle\Library\inputOrIndex');
The executor function (function inputOrIndex) will
- Fetch the fully namespaced PHP function from the function registry for the function inputOrIndex, for the namespace that the executor function was called from (fetched with getNamespaceCalledFromForGenerated)
- Using PHP's Reflection API, call that function.
When a pestle user calls a function like this
pestle_import('Pulsestorm\Pestle\Library\inputOrIndex');
//...
inputOrIndex(...)
The function they're actually calling is the executor function.
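The look-up-then-invoke shape of an executor can be tried out in isolation. The sketch below is not pestle's generated code: the Demo\Library and Demo\Program namespaces, greet, and lookupSketch are all hypothetical, and the registry is reduced to a hard-coded array. It only shows that a wrapper with no declared parameters can forward its arguments to a fully namespaced function through ReflectionFunction.

```php
<?php
namespace Demo\Library;              // hypothetical module namespace

function greet(string $name): string
{
    return 'Hello ' . $name;
}

namespace Demo\Program;              // hypothetical importing namespace

// Simplified registry lookup: short name => importing namespace => target.
function lookupSketch(string $shortName, string $nsCalledFrom): string
{
    $functions = ['greet' => ['Demo\Program' => 'Demo\Library\greet']];
    return $functions[$shortName][$nsCalledFrom];
}

// Executor with the same shape as the generated file: look up, then invoke.
function greet()
{
    $function = lookupSketch('greet', __NAMESPACE__);
    return (new \ReflectionFunction($function))->invokeArgs(func_get_args());
}

echo greet('pestle'), PHP_EOL;       // Hello pestle
```

Using func_get_args with invokeArgs lets a single executor template work for any function signature, at the cost of a reflection call on every invocation.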
One More Time
That's a lot to take in. Let's do it one more time, but jumping back to a 10,000 foot view with a few more details filled in.
Consider a pestle program that calls pestle_import from the Foo\Baz\Bar namespace.
namespace Foo\Baz\Bar;
pestle_import('Pulsestorm\Pestle\Library\inputOrIndex');
The first thing that happens? In includeModule, we make sure the PHP file with inputOrIndex is required.
require_once('modules/pulsestorm/pestle/library/module.php');
Next, in includeCode, we'll register the function, ensuring that the static $functions array in functionRegister has the following keys set
$functions['inputOrIndex']['Foo\Baz\Bar'] = 'Pulsestorm\Pestle\Library\inputOrIndex';
Finally, includeCode will also (the first time this function is pestle_imported) generate and cache an "executor function" that looks something like this
#File: /tmp/pestle_cache/[pestle-key]/reflection-strategy/[function-key].php
<?php
use function Pulsestorm\Pestle\Importer\functionRegister;
use function Pulsestorm\Pestle\Importer\getNamespaceCalledFromForGenerated;
function inputOrIndex(){
    $function = functionRegister(__FUNCTION__, getNamespaceCalledFromForGenerated());
    $args = func_get_args();
    return (new \ReflectionFunction($function))->invokeArgs($args);
}
##exported for Pulsestorm\Pestle\Library\inputOrIndex
and then load this executor.
require_once('/tmp/pestle_cache/[pestle-key]/reflection-strategy/[function-key].php');
Future calls will skip the generation and load this file directly from the cache.
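The generate-once, include-afterwards flow can be sketched as follows. This is an illustration under stated assumptions rather than pestle's code: the cache directory name, the md5-based file name, and the trivial generated function body are all invented for the example.

```php
<?php
// Hypothetical generator: source for a stand-in executor (no registry lookup here).
function executorSourceSketch(string $shortName): string
{
    return "<?php\nfunction {$shortName}() {\n    return '{$shortName} executor';\n}\n";
}

// Write the executor file only if it's missing from the cache, then load it.
function generateOrIncludeSketch(string $shortName, string $cacheDir): string
{
    $file = $cacheDir . '/' . md5($shortName) . '.php'; // assumed cache key scheme
    if (!is_file($file)) {
        if (!is_dir($cacheDir)) {
            mkdir($cacheDir, 0777, true);
        }
        file_put_contents($file, executorSourceSketch($shortName));
    }
    require_once $file; // a no-op if this file was already loaded
    return $file;
}

$cacheDir = sys_get_temp_dir() . '/import_cache_sketch';
$file = generateOrIncludeSketch('cachedExecutorDemo', $cacheDir);
echo cachedExecutorDemo(), PHP_EOL;
```

On a second run the file already exists, so only the require_once happens.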
The end result? Back up here
namespace Foo\Baz\Bar;
pestle_import('Pulsestorm\Pestle\Library\inputOrIndex');
after pestle_import finishes, there will be a newly defined inputOrIndex function that code in Foo\Baz\Bar can call by its short name and which, when called, will actually call the requested function, Pulsestorm\Pestle\Library\inputOrIndex.
Downsides and Future Plans
The main downside of the current strategy is that it still requires us to require in all of the original namespace in order for the executor function to do its work. We're just hiding this detail from the end-user-programmer.
However, with this system in place we should be able to move on to implementing features like
- Importing other symbols (Classes, constants, and ???)
- Importing multiple symbols at once
and building up a library of code that uses these features. This library of code will/could serve as the ultimate test case when/if we attempt to change pestle_import so it's only loading the code each symbol needs to run and the cached executor files don't use reflection.