Description
Out of curiosity, and because I've never heard of or seen anyone do it before, I decided to dump the shared HEK cache of perl -e"print 'hello world'", AKA HV* PL_strtab or my_perl->Istrtab. I was curious which strings or HV key names are permanently burned into libperl/perl_construct() and can't be stopped, or kept from being allocated at initialization, by any user. The list shocked me. Some things, like UNIVERSAL:: and the core XSUBs, were obvious. All of my %ENV was a surprise. Isn't that supposed to be a getter Magic HV*, not a plain-old-data HV*?
Steps to Reproduce
Adjust for your OS before running: XS paths, shell quotes, etc. All .pm files and PP code were removed to make this demo as close as possible to a C breakpoint on a bare interp startup.
perl -e"DynaLoader::boot_DynaLoader('DynaLoader');&{DynaLoader::dl_install_xsub('Hash::Util::bootstrap',DynaLoader::dl_find_symbol(DynaLoader::dl_load_file('C:\pb64\lib\auto\Hash\Util\Util.dll',0),'boot_Hash__Util'),'C:\pb64\lib\auto\Hash\Util\Util.dll')}(); my %keykill; foreach(keys %{*Hash::Util::}) {$keykill{$_} = !!1}; my $a = Hash::Util::bucket_array(undef); my @b; foreach(@{$a}){my @tarr; if(ref $_){@tarr = grep{!$keykill{$_}} @{$_}; push @b, @tarr;}} print join(\"\n\",sort @b);" > t.txt
Expected behavior
While the semi-recent SV_CONST() / PL_sv_consts[] API is a GREAT IDEA, the strings/method names/typeglob names that were picked in the past are very narrow-minded and serve a very tiny fraction of perl users and perl process startups. Those all-UC method names are on another planet, since they don't even cover a bare, empty interpreter process that has run no PP code yet. I'll copy and paste the list here for convenience.
Expected behavior is that any HV keys unconditionally created inside libperl at process startup must be global, read-write "IMMORTAL" shared HEKs, backed by C global storage, which means backed by the libperl.so/libperl.dll/perl.exe/perl.elf binary, not by malloc() memory, which is pure duplication of data that already exists in libperl.so.
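To make the "backed by C global storage" part concrete, here is a minimal sketch of what an immortal, read-write shared HEK could look like. Everything named demo_* or DEMO_* is hypothetical and only mimics the shape of shared_HE_HEK_PVBUF; the structs are not the real hv.h definitions, the fixed 16-byte key buffer is a simplification, and the hash function is a toy stand-in for whatever PERL_HASH() resolves to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for perl's HE/HEK layout; NOT the real hv.h structs. */
struct demo_hek {
    uint32_t hash;     /* 0 in the binary, filled in once the seed is known */
    int32_t  len;
    char     key[16];  /* fixed size so the struct can be statically initialized */
};

struct demo_he {
    struct demo_he  *next;
    struct demo_hek *hek;
    size_t           refcnt;
};

/* One contiguous HE + HEK + key buffer, in the spirit of shared_HE_HEK_PVBUF. */
struct demo_shared_hek {
    struct demo_he  he;
    struct demo_hek hek;
};

/* Sentinel refcount meaning "never free this entry" (an assumption of this sketch). */
#define DEMO_IMMORTAL_REFCNT (SIZE_MAX / 2)

/* Define one immortal entry in global read-write storage, with no malloc(). */
#define DEMO_IMMORTAL(var, str)                       \
    static struct demo_shared_hek var = {             \
        { NULL, &var.hek, DEMO_IMMORTAL_REFCNT },     \
        { 0, (int32_t)(sizeof(str) - 1), str }        \
    }

DEMO_IMMORTAL(demo_hek_universal, "UNIVERSAL::");
DEMO_IMMORTAL(demo_hek_env, "ENV");

static struct demo_shared_hek *const demo_immortals[] = {
    &demo_hek_universal, &demo_hek_env,
};

/* Toy seeded hash (FNV-1a style); perl's real hash functions are different. */
static uint32_t demo_hash(uint32_t seed, const char *s, size_t len)
{
    uint32_t h = 2166136261u ^ seed;
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)s[i];
        h *= 16777619u;
    }
    return h;
}

/* Run once at interpreter startup, after the hash seed has been read. */
static void demo_init_immortals(uint32_t seed)
{
    for (size_t i = 0; i < sizeof demo_immortals / sizeof demo_immortals[0]; i++) {
        struct demo_hek *hek = demo_immortals[i]->he.hek;
        assert(hek->len >= 0 && (size_t)hek->len <= sizeof hek->key);
        hek->hash = demo_hash(seed, hek->key, (size_t)hek->len);
    }
}

/* Free paths would check this and skip the entry. */
static int demo_is_immortal(const struct demo_shared_hek *p)
{
    return p->he.refcnt == DEMO_IMMORTAL_REFCNT;
}
```

The key bytes and lengths come straight out of the binary image; only the hash field is written at startup, which is why the storage has to be RW rather than RO in this variant.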
Fix ideas
The SV_CONST() / PL_sv_consts[] API was and is a great idea, but it totally missed the most important key names. It should be refactored and expanded IMO. The current contiguous shared_HE_HEK_PVBUF struct needs some small tweaks to become "IMMORTAL" class data, backed by C global var storage, not per-interp malloc. Note that because of PERL_HASH_SEED, it is impossible to make a RO/C-static/C-global shared_HE_HEK_PVBUF struct, since the stored hash value depends on a seed that is only known at runtime. If someone has ideas or knows a secret for implementing RO shared_HE_HEK_PVBUF structs, come forward.
One idea I had was to precalculate the hash numbers for RO HEKs in miniperl. They would be constants per interp binary, probably with the on-disk RO hash mixed with the timestamp of when the interp binary was compiled. There is low randomness here, but every Linux package manager family would have different U32s in its libperl.so, even if 5.X.Y is the same.
Another idea is one CPU XOR (^) op against the dynamic hash seed at runtime, applied to all shared HEKs alike, both RO .so-backed and RW malloc-backed. HV* PL_strtab is a per-proc/per-interp/per-my_perl struct anyway.
The easiest choice is to just have RW immortal global HEKs. If the perl port knows how to do it, those 3.5KB-7KB of strings can be made read-only at the OS/hardware VM level right after PERL_HASH_SEED is read from the shell on perl proc startup.
On MSVC 2022 x64, SBOX32 almost completely inlines if MSVC knows the C literal at LTO compile time. Therefore it can also be done partially in miniperl at interp CC time.
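A rough sketch of how the precalculated-hash idea and the XOR idea could fit together. All demo_* / DEMO_* names are made up for illustration, the multiply-by-31 hash is a toy rather than SBOX32 or any hash perl actually uses, and DEMO_BUILD_SALT stands in for a value that a real build would have to generate (for example in miniperl). The stored hash is a compile-time constant, so the table can live in read-only storage, and the runtime seed is applied as a single XOR to every hash, precalculated or not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-build constant; a real build would generate this value. */
#define DEMO_BUILD_SALT 0x5EEDC0DEu

/* One step of a toy hash, written as a macro so it also folds at compile time. */
#define DEMO_STEP(h, c) \
    ((uint32_t)(((uint32_t)(h) * 31u) + (uint32_t)(unsigned char)(c)))
#define DEMO_H3(a, b, c) \
    DEMO_STEP(DEMO_STEP(DEMO_STEP(DEMO_BUILD_SALT, a), b), c)

/* Runtime version of the same toy hash, for keys not known at build time. */
static uint32_t demo_static_hash(const char *s, size_t len)
{
    uint32_t h = DEMO_BUILD_SALT;
    for (size_t i = 0; i < len; i++)
        h = DEMO_STEP(h, s[i]);
    return h;
}

/* Read-only entry: the hash is a constant baked into the binary. */
struct demo_ro_hek {
    uint32_t    static_hash;
    size_t      len;
    const char *key;
};

static const struct demo_ro_hek demo_ro_heks[] = {
    { DEMO_H3('E', 'N', 'V'), 3, "ENV" },
    { DEMO_H3('I', 'S', 'A'), 3, "ISA" },
};

/* The single XOR against the per-process seed, applied to every hash alike. */
static uint32_t demo_effective_hash(uint32_t static_hash, uint32_t runtime_mask)
{
    return static_hash ^ runtime_mask;
}

/* Look a key up among the read-only entries using the effective hash. */
static const struct demo_ro_hek *demo_find_ro(const char *s, size_t len,
                                              uint32_t runtime_mask)
{
    assert(s != NULL);
    uint32_t want = demo_effective_hash(demo_static_hash(s, len), runtime_mask);
    for (size_t i = 0; i < sizeof demo_ro_heks / sizeof demo_ro_heks[0]; i++) {
        const struct demo_ro_hek *e = &demo_ro_heks[i];
        if (demo_effective_hash(e->static_hash, runtime_mask) == want
            && e->len == len && memcmp(e->key, s, len) == 0)
            return e;
    }
    return NULL;
}
```

One caveat with this sketch: XORing every hash with the same mask changes the bucket order per process, but two keys whose static hashes are equal still collide under every mask, so how much seed protection survives would need a separate look.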
#define _SBOX32_CASE(len,hash,state,key) \
/* FALLTHROUGH */ \
case len: hash ^= state[ 1 + ( 256 * ( len - 1 ) ) + key[ len - 1 ] ];
Note that the interp currently FORBIDS ever re-reading $ENV{PERL_HASH_SEED} after perl_construct() or perl_init_sys3() runs. All ithreads and all embedders will use the same seed for the rest of the proc lifetime.
Or the alternative to immortal HEKs is MUCH MORE use of sub AUTOLOAD{} from XS, so the HE/HEK/GV_H/GV_B/GP/CV_H/CV_B structs are never allocated until the interp runs (yy_lex/op_null/gv_fetchFOOpvn()/etc) into the first user's explicit PP/XS method/sub call to these special identifiers. I'm not very eager about this idea; it's possible, just not my favorite.
There really is no alternative to the SV_CONST() / PL_sv_consts[] API, and discussing the "memory bloat" and "memory usage" of immortal HEKs is not applicable in this case, since the memory is already "wasted" before the 1st ASCII char of PP code is ever parsed.
I doubt there would be consensus to remove these 4 packages from core (to a core .pm): version::*, Win32CORE::*, Tie::Hash::NamedCapture::*, and builtins::*. That would leave %ENV as the last HV HEK memory hog.
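For comparison, the lazy alternative boils down to something like the sketch below: the names sit in the binary's read-only data, and the heap-side structures are only created on the first lookup. The demo_* names are hypothetical, and a single heap_copy pointer stands in for the whole HE/HEK/GV/GP/CV family; this is not how gv_fetch*() or AUTOLOAD actually work internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical registry entry; heap_copy stands in for the HE/HEK/GV/CV structs. */
struct demo_sub {
    const char *name;       /* lives in the binary's read-only data */
    void      (*body)(void);
    char       *heap_copy;  /* NULL until the first lookup */
};

static void demo_noop(void) {}

static struct demo_sub demo_subs[] = {
    { "TIEHASH", demo_noop, NULL },
    { "FETCH",   demo_noop, NULL },
};

/* Count how many entries have been materialized on the heap so far. */
static size_t demo_allocated(void)
{
    size_t n = 0;
    for (size_t i = 0; i < sizeof demo_subs / sizeof demo_subs[0]; i++)
        if (demo_subs[i].heap_copy != NULL)
            n++;
    return n;
}

/* Resolve a name, allocating its heap-side data only on first use. */
static struct demo_sub *demo_resolve(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof demo_subs / sizeof demo_subs[0]; i++) {
        struct demo_sub *s = &demo_subs[i];
        if (strcmp(s->name, name) != 0)
            continue;
        if (s->heap_copy == NULL) {
            s->heap_copy = malloc(strlen(s->name) + 1);
            if (s->heap_copy == NULL)
                return NULL;
            strcpy(s->heap_copy, s->name);
        }
        return s;
    }
    return NULL;
}
```

A process that never touches these identifiers pays nothing on the heap, at the cost of an extra check on every first lookup.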
Other questionable packages.
STATIC void
S_init_predump_symbols(pTHX)
{
    .............................
    /* Historically, PVIOs were blessed into IO::Handle, unless
       FileHandle was loaded, in which case they were blessed into
       that. Action at a distance.
       However, if we simply bless into IO::Handle, we break code
       that assumes that PVIOs will have (among others) a seek
       method. IO::File inherits from IO::Handle and IO::Seekable,
       and provides the needed methods. But if we simply bless into
       it, then we break code that assumed that by loading
       IO::Handle, *it* would work.
       So a compromise is to set up the correct @IO::File::ISA,
       so that code that does C<use IO::Handle>; will still work.
    */
    Perl_populate_isa(aTHX_ STR_WITH_LEN("IO::File::ISA"),
                      STR_WITH_LEN("IO::Handle::"),
                      STR_WITH_LEN("IO::Seekable::"),
                      STR_WITH_LEN("Exporter::"),
                      NULL);
SV_CONST() literally has no reason to even bother lazy-loading or optionally allocating SV heads at runtime for its TIE*() method names. The TIE*() HEKs are unconditional already. The current PL_sv_consts[] SV* array should just be merged into the PL_sv_immortals[] SV head array.
Perl configuration
Win32 Perl 5.41.7. Edit the demo code above and run it on your system. Note that because of Win32CORE::, WinPerl probably has more mandatory created HEKs than PosixPerl. The problem remains that the awesome idea of the SV_CONST() API doesn't implement the fundamentals.