Handle PEP 263 Source Encoding Lines #18

Open
corranwebster opened this issue Nov 9, 2020 · 2 comments
Labels
enhancement New feature or request

Comments

@corranwebster

PEP 263 defines a way for Python source files to declare their encoding using a special comment, which must appear on the first or second line of the file.

For example, a file that begins with

# -*- coding: latin-1 -*-

# (C) Copyright 2005-2020 Enthought, Inc., Austin, TX
# All rights reserved.
#
# This software is provided without warranty under the terms of the BSD
# license included in LICENSE.txt and may be redistributed only under
# the conditions described in the aforementioned license. The license
# is also available online at http://www.enthought.com/licenses/BSD.txt
#
# Thanks for using Enthought open source!

gives an error "H103 Wrong copyright header found".

A quick search of the codebase shows that we do have files with such lines. Many of them are autogenerated Sphinx conf.py files, but there are some legitimate matches. All of them declare UTF-8, so the declarations can be removed now that UTF-8 is the default source encoding for Python 3 and the codebase is Python 3 only.

From PEP 263, the regex to match the line is: ^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)
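
As a rough illustration of how that regex could be applied, here is a minimal sketch that looks for a declaration in the first two lines of a file. The function name `find_encoding_line` is hypothetical and is not part of any existing checker; the rule that a declaration on line 2 only counts when line 1 is blank or a comment is my reading of how Python treats these lines, simplified here.

```python
import re

# Regex quoted from PEP 263; group 1 captures the declared encoding name.
CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def find_encoding_line(lines):
    """Return (index, encoding) for a PEP 263 declaration, or None.

    Hypothetical helper: only the first two lines are inspected.
    """
    for index, line in enumerate(lines[:2]):
        match = CODING_RE.match(line)
        if match:
            return index, match.group(1)
        # Assumption: a declaration on line 2 is only honoured when
        # line 1 is blank or a comment (e.g. a shebang line).
        if line.strip() and not line.lstrip().startswith("#"):
            break
    return None


header = [
    "# -*- coding: latin-1 -*-",
    "",
    "# (C) Copyright 2005-2020 Enthought, Inc., Austin, TX",
]
print(find_encoding_line(header))
```

A header check could use the returned index to skip the declaration before comparing the remaining lines against the expected copyright text, though whether that is worth supporting is a separate question.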

@corranwebster corranwebster added the enhancement New feature or request label Nov 9, 2020
@corranwebster
Author

I'm adding this issue mainly so we can reject it: I think the correct solution is that we always use UTF-8 and until such time as we need to have a source file with a different encoding, I think we can ignore this possibility.

@mdickinson
Member

I think the correct solution is that we always use UTF-8 and until such time as we need to have a source file with a different encoding, I think we can ignore this possibility.

Agreed.
