IO::File ignores PERL_UNICODE and PerlIO layer defaults [rt.cpan.org #68366] #17458

Open
toddr opened this issue Apr 19, 2018 · 8 comments
Labels: dist-IO (issues in the dual-life blead-first IO distribution), type-Unicode, Unicode and System Calls (bad interactions of syscalls and UTF-8)

Comments

@toddr
Member

toddr commented Apr 19, 2018

Migrated from rt.cpan.org#68366 (status was 'new')

Requestors:

From [email protected] on 2011-05-22 02:59:48:

Hi,
  PerlIO defaults set with -C, PERL_UNICODE, or the open pragma
(as shown in ${^OPEN}) are completely ignored by IO::File. This
seriously breaks anyone's UTF-8 work that uses IO::File, leading to much
frustration.


Lyle
@toddr toddr transferred this issue from Dual-Life/IO Jan 20, 2020
@toddr toddr added Needs Triage dist-IO issues in the dual-life blead-first IO distribution labels Jan 20, 2020
@Leont
Contributor

Leont commented Jan 22, 2020

It ignoring the open pragma is expected, as pragmas are lexical.

It ignoring -C and PERL_UNICODE is unexpected and probably a bug.
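
To illustrate the lexical scoping of the open pragma, here is a minimal sketch that compares the layers on a handle from the built-in open() with those on a handle from IO::File, both created from a file that has `use open` in effect. The scratch file and the variable names are assumptions made for the demo, and the exact layer names printed depend on the platform and Perl build.

```perl
use strict;
use warnings;
use IO::File;
use File::Temp qw(tempfile);

my ($tmp_fh, $path) = tempfile(UNLINK => 1);   # scratch file for the demo
close $tmp_fh;

use open ':encoding(UTF-8)';   # lexical: covers the rest of this file only

# Built-in open() compiled in this scope picks up the pragma's layers.
open(my $plain, '<', $path) or die "open failed: $!";
my @plain_layers = PerlIO::get_layers($plain);

# IO::File's internal open() is compiled inside IO/File.pm, outside this scope.
my $io = IO::File->new($path, 'r') or die "IO::File open failed: $!";
my @io_layers = PerlIO::get_layers($io);

print "open():   @plain_layers\n";
print "IO::File: @io_layers\n";
```

If the pragma really is lexical, only the first list should include an encoding layer.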

@toddr
Member Author

toddr commented Jan 22, 2020

@cosmicnet see above.

@Leont
Contributor

Leont commented Jan 22, 2020

> It ignoring -C and PERL_UNICODE is unexpected and probably a bug.

It would appear that both of these only affect the topmost scope, it sets the correct encodings for that lexical scope.

So that would mean this isn't a bug in the IO dist, but in core.

@cosmicnet

This was 9 years ago, so it is hard for me to remember exactly. But I do recall trying a lot of different things with IO layers that just didn't work. When I dug deeper, IO::File was ignoring many or all of those settings. When I changed to regular open(), everything worked as expected.

Unfortunately I don't have scope to put together a test case for this at the moment.

@tonycoz
Contributor

tonycoz commented Jan 22, 2020

> It would appear that both of these only affect the topmost scope, it sets the correct encodings for that lexical scope.
>
> So that would mean this isn't a bug in the IO dist, but in core.

It does say:

> The C<io> options mean that any subsequent open() (or similar I/O
> operations) in the current file scope

but that's pretty vague for a command-line option.

Changing its behaviour to be global would probably break existing code.

@tonycoz
Contributor

tonycoz commented Jan 22, 2020

One way it could be made to work (ignoring the -C issue) is to check the calling scope's $^H{"open<"} and $^H{"open>"} via (caller)[10].

These members of %^H don't seem to be documented, but autodie already uses them.
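
A minimal sketch of that lookup, assuming the hint keys are `open<` and `open>` as described above. The helper name `open_layers_of_caller` is hypothetical and is not part of IO::File or core. It only reads the hints; applying them to a handle (for example with binmode) would be a separate step.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of IO::File): report the default layers
# that the "open" pragma recorded for the scope we were called from.
sub open_layers_of_caller {
    my $hints = (caller(0))[10] || {};   # compile-time %^H of the call site
    return ($hints->{'open<'}, $hints->{'open>'});
}

my (@outside, @inside);

@outside = open_layers_of_caller();      # called before any "use open"

{
    use open ':encoding(UTF-8)';         # lexical: applies to this block only
    @inside = open_layers_of_caller();
}

printf "outside: in=%s out=%s\n", map { $_ // 'undef' } @outside;
printf "inside:  in=%s out=%s\n", map { $_ // 'undef' } @inside;
```

A module that wanted to honour the caller's pragma could call such a helper at the top of its open method, at the cost of relying on undocumented keys.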

@Leont
Contributor

Leont commented Jan 23, 2020

> but that's pretty vague for a command-line option.

Yeah, especially because that's not at all what I would expect it to do.

> Changing its behaviour to be global would probably break existing code.

True. TBH the whole concept of -CD is broken anyway if you ask me.

@cosmicnet

cosmicnet commented Feb 17, 2020

I just came across this in some code I was working on at the time:

my $outfile = new IO::File;
$outfile->open(">$filepath") || die( "Cannot open file '" . decode( 'utf8', $filepath ) . "' for output" );
#$PerlIO::encoding::fallback = Encode::FB_DEFAULT; # This didn't work for some reason
binmode( $outfile, ':encoding(utf8)' );
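
An alternative to the binmode call, sketched under the assumption that passing the layer in the mode string is acceptable for the code in question: IO::File's open uses three-argument open when the mode contains a `:`, so the encoding can be requested explicitly. The scratch file and sample text are made up for the demo.

```perl
use strict;
use warnings;
use IO::File;
use File::Temp qw(tempfile);

my ($tmp_fh, $path) = tempfile(UNLINK => 1);   # scratch file for the demo
close $tmp_fh;

# A mode containing ':' makes IO::File use three-argument open with layers.
my $out = IO::File->new($path, '>:encoding(UTF-8)')
    or die "Cannot open '$path' for output: $!";
my @out_layers = PerlIO::get_layers($out);
$out->print("caf\x{e9}\n");                    # U+00E9 is encoded on write
$out->close or die "close failed: $!";

# Read the file back as raw bytes to see what was actually written.
open(my $raw, '<:raw', $path) or die "Cannot reopen '$path': $!";
my $bytes = do { local $/; <$raw> };
close $raw;

print "layers: @out_layers\n";
printf "bytes:  %vX\n", $bytes;
```

This does not depend on -C, PERL_UNICODE, or the open pragma, which is why it sidesteps the scoping question discussed in this issue.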

tonycoz added a commit to tonycoz/perl5 that referenced this issue Mar 4, 2020
Further clarified it has no effect on modules.

Clarifies a documentation nit discussed in Perl#17458
tonycoz added a commit to tonycoz/perl5 that referenced this issue Mar 9, 2020
Further clarified it has no effect on modules.

"main program scope" is even clearer, credit to github comments.

Clarifies a documentation nit discussed in Perl#17458
atoomic pushed a commit that referenced this issue Mar 12, 2020
Further clarified it has no effect on modules.

"main program scope" is even clearer, credit to github comments.

Clarifies a documentation nit discussed in #17458
@khwilliamson khwilliamson added type-Unicode Unicode and System Calls Bad interactions of syscalls and UTF-8 labels Apr 17, 2022
5 participants