Updating debian copyright file with cme
In my opinion, creating and maintaining the Debian copyright file is the most boring task required to create a Debian package. Unfortunately, this file is also one of the most important files of a package: it specifies the legal aspects of using the software.
The Debian copyright file is scrutinised by the ftp-masters gatekeepers when a new package is accepted into the Debian project: this file must accurately describe the copyright and licenses of all files of a source package, preferably using a specific syntax. (Kudos to the ftp-masters team: reading copyright files must be even more boring than writing them.)
The content of the copyright file must accurately reflect the license of all files. This license is often specified in the comments of a source file. The licensecheck command can scan source files and report the copyright and licenses declared there. But it does not summarise this information: a copyright line is generated for each file of a package.
Hence a lot of work is still required to get a proper debian/copyright file.
The command cme update dpkg-copyright aims to make this task much easier. When run, the debian/copyright file is created or updated:
- copyright years are coalesced when possible (e.g. 2001,2002,2003-2005 is changed to 2001-2005)
- file entries with the same copyright (same owner and years) and the same license are grouped; a group of files may be represented with a wildcard (*)
- license text is filled in with the actual text for the most popular licenses
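The year coalescing mentioned in the list above can be illustrated with a small Python sketch. This is not the code used by cme; the function name coalesce_years is hypothetical and the input format (comma-separated years and ranges) is an assumption based on the example given above.

```python
def coalesce_years(spec: str) -> str:
    """Merge a comma-separated list of years and ranges into its shortest form."""
    years = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A range such as "2003-2005" contributes every year it covers.
            start, end = (int(y) for y in part.split("-", 1))
            years.update(range(start, end + 1))
        else:
            years.add(int(part))

    runs = []  # list of [first, last] runs of consecutive years
    for year in sorted(years):
        if runs and year == runs[-1][1] + 1:
            runs[-1][1] = year  # extend the current run
        else:
            runs.append([year, year])  # start a new run

    return ", ".join(str(a) if a == b else f"{a}-{b}" for a, b in runs)


print(coalesce_years("2001,2002,2003-2005"))  # 2001-2005
```

Years that are not contiguous stay separate, so a gap in the copyright history is preserved.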
The command cme update dpkg-copyright relies on licensecheck to mine source files and extract copyright and license information. It often does a good job, but sometimes the result needs to be improved:
- some file types are unexpected
- some files do not contain information
- some files are not parsed correctly and the legal information contains garbage
Let's see how to improve the results.
For what it's worth, the examples shown below come from the moarvm package.
First, install the packages cme and libconfig-model-dpkg-perl, at least at version 2.074.
Then, run scan-copyrights in your source package directory. The command will probably issue a lot of messages like:
skipped file ./debian/README.source
skipped file ./lib/MAST/Ops.nqp
skipped file ./build/config.h.in
skipped file ./3rdparty/libatomic_ops/config.guess
skipped file ./3rdparty/dyncall/dyncall/dyncall_call_mips_n64_gas.s
skipped file ./3rdparty/dyncall/dyncall/dyncall_call_x64_generic_masm.asm
skipped file ./3rdparty/dyncall/test/call_suite/mk-cases.lua
These warnings are shown by licensecheck when a file type is not parsed. This may be expected for files like config.guess or README.source, but is definitely a problem for lua or assembly files (s or asm suffixes). In the first case, we want to suppress the warning. In the latter case, we want to force licensecheck to parse the files.
This can be done with the debian/copyright-scan-patterns.yml file. This file contains a list of suffixes (or patterns) to scan or to skip. This list of patterns is added to the licensecheck default list. For instance, the moarvm package contains something like:
---
check:
  suffixes:
    - asm
    - lua
    - nqp
    - s
    - template
ignore:
  pattern:
    - /debian/
    - Makefile
    - MANIFEST
    - /config(.guess|ure|.h.in)
  suffixes:
    - generic
    - rst
    - jpg
    - txt
    - install
    - M
    - m4
This file forces licensecheck to parse asm, lua, nqp and other files. The files matching a pattern in the ignore section are silently skipped. You should edit a similar file until the list of skipped files shown by scan-copyrights is reduced to a reasonable size (or empty, depending on your definition of "reasonable").
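To get a feel for which files such a configuration selects, here is a rough Python sketch that sorts paths into "scan", "ignore" or "default". It only illustrates the idea and is not how Dpkg::Copyright::Scanner or licensecheck are implemented. The helper name classify, the use of regular expression search on the whole path, and the choice to let the ignore section win over the check section are all assumptions.

```python
import re
from pathlib import PurePosixPath

# Values copied from the moarvm example above (patterns kept verbatim,
# including the unescaped dots).
CHECK_SUFFIXES = {"asm", "lua", "nqp", "s", "template"}
IGNORE_SUFFIXES = {"generic", "rst", "jpg", "txt", "install", "M", "m4"}
IGNORE_PATTERNS = [r"/debian/", r"Makefile", r"MANIFEST", r"/config(.guess|ure|.h.in)"]


def classify(path: str) -> str:
    """Return 'ignore', 'scan' or 'default' for a path (hypothetical helper)."""
    suffix = PurePosixPath(path).suffix.lstrip(".")
    # Assumption: a match in the ignore section takes precedence.
    if suffix in IGNORE_SUFFIXES or any(re.search(p, path) for p in IGNORE_PATTERNS):
        return "ignore"
    if suffix in CHECK_SUFFIXES:
        return "scan"
    return "default"  # left to the built-in list of file types


for path in (
    "./debian/README.source",
    "./lib/MAST/Ops.nqp",
    "./build/config.h.in",
    "./3rdparty/dyncall/test/call_suite/mk-cases.lua",
):
    print(path, "->", classify(path))
```

Running a sketch like this over the "skipped file" messages is a quick way to check that a pattern does not hide more files than intended.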
You can edit this YAML file with your favourite editor or with the GUI provided by cme edit dpkg.
For more information, please see Dpkg::Copyright::Scanner man page, section "Selecting or ignoring files to scan".
Once the skipped files are sorted out, you can re-run the scan-copyrights command. The output may show a list of problematic files:
The following paths are missing information:
- 3rdparty/README.md: missing copyright and license
- 3rdparty/libuv/checksparse.sh: missing copyright
- 3rdparty/libuv/docs/src/conf.py: missing copyright and license
- 3rdparty/libuv/gyp_uv.py: missing copyright and license
- 3rdparty/libuv/src/unix/spinlock.h: missing copyright
- Configure.pl: missing copyright and license
- README.markdown: missing copyright and license
- docs/6model-parametric-extensions.markdown: missing copyright and license
- docs/README.md: missing copyright and license
- lib/MAST/Nodes.nqp: missing copyright and license
- lib/README.md: missing copyright and license
- ports/macports/README.md: missing copyright and license
- tools/ucd2c.pl: missing copyright and license
- tools/update_ops.p6: missing copyright and license
You may want to add a line in debian/fill.copyright.blanks.yml
Information may be missing because the source file does not contain it or because licensecheck failed to parse the file.
For each file, you'll have to read the file and use your best judgement to either ignore the file or provide missing information.
The missing information can be specified in debian/fill.copyright.blanks.yml. Each entry is a pattern, usually a directory name or a complete path, followed by missing information (or a special instruction to skip the file). For instance:
---
3rdparty/dynasm/:
  license: Expat
3rdparty/dyncall/:
  copyright: 2007-2015, Daniel Adler <[email protected]>
  license: ISC
3rdparty/libtommath/:
  copyright: Tom St Denis, [email protected]
  license: dwtfyw-license
3rdparty/libtommath/bn_mp_div.c:
  skip: '1'
docs/moar.pod:
  license: Artistic-2.0
src/:
  comment: Almost no file in src has legal information. This entry provides default
    legal info for all files in there
  copyright: the MoarVM contributors. See the CREDITS file
  license: Artistic-2.0
src/gc/debug.h:
  skip: '1'
Note that these entries are handled as default values: they will always be superseded by information found in files (which may happen when the package is upgraded).
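The "default value" behaviour can be pictured with the following Python sketch. It is only an illustration of the rule stated above, not the actual implementation. The function name fill_blanks is hypothetical, only the copyright and license keys are handled, and the way a pattern is matched against a path is left out.

```python
from typing import Optional


def fill_blanks(scanned: dict, blank_entry: dict) -> Optional[dict]:
    """Combine data scanned from a file with a matching blank-filling entry.

    Returns None when the entry asks to skip the file.
    """
    if blank_entry.get("skip"):
        return None
    # Start from the defaults provided in fill.copyright.blanks.yml ...
    merged = {k: v for k, v in blank_entry.items() if k in ("copyright", "license")}
    # ... then let any value found in the source file supersede them.
    merged.update({k: v for k, v in scanned.items() if v})
    return merged


src_default = {
    "copyright": "the MoarVM contributors. See the CREDITS file",
    "license": "Artistic-2.0",
}

# File without any legal information: both defaults are used.
print(fill_blanks({}, src_default))
# File declaring its own license: only the copyright default is used.
print(fill_blanks({"license": "Expat"}, src_default))
# Entry with skip: the file is left out.
print(fill_blanks({"license": "ISC"}, {"skip": "1"}))
```

This is why a blank-filling entry cannot be used to correct wrong data found in a file: the scanned value always wins.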
As before, you can edit this file with your favourite editor or with cme edit dpkg.
For more information, please see section "Filling the blanks" of Dpkg::Copyright::Scanner man page.
Once you're satisfied with the information extracted from the source files, it's time to actually merge this information with the content of the existing debian/copyright file (if any).
First, make sure that your current copyright file is archived in your VCS (be it git, svn or whatever).
Then run cme update dpkg-copyright. Using the content of the source files, this command:
- updates copyright and license information in existing entries
- removes entries of removed files or directories
- adds license text as needed (for known licenses)
Once this command is run, you must check the result and complete any missing information (e.g. license text of unknown licenses).
Despite the precautions taken above, some entries may still have wrong information. Either:
- missing license text or comment
- wrong copyright or license information
You may correct the first kind of error directly in the resulting copyright file. Any re-run of cme update dpkg-copyright will keep this information.
Correcting wrong copyright or license information is more problematic: cme considers that information found in files is more exact than old data found in debian/copyright. Thus, running cme update dpkg-copyright will clobber your manual updates.
You can instruct cme to alter or set specific copyright entries in the debian/fix.scanned.copyright file. Each line of this file will be handled by Config::Model::Loader to modify copyright information.
For instance, if the extracted copyright contains:
Files: *
Copyright: 2014-2015, Adam Kennedy <[email protected]> "foobar
License: Artistic or GPL-1+
You may add this line to the debian/fix.scanned.copyright file:
! Files:'*' Copyright=~s/\s*".*//
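The substitution part of that line, s/\s*".*//, removes the stray double quote and everything after it from the Copyright value. The Python sketch below applies an equivalent regular expression to the value shown above, only to show the effect of the pattern; the real work is done by Config::Model::Loader, not by Python.

```python
import re

# Copyright value as extracted in the example above.
copyright_value = '2014-2015, Adam Kennedy <[email protected]> "foobar'

# Same pattern as in the fix line: optional whitespace, a double quote,
# then everything up to the end of the value.
fixed = re.sub(r'\s*".*', "", copyright_value, count=1)

print(fixed)  # 2014-2015, Adam Kennedy <[email protected]>
```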
As before, you can edit this file with your favourite editor. cme edit dpkg will soon support this file as well.
For more information, please see section "Tweak results" of Config::Model::Dpkg::Copyright man page.
In case of issues, please file a bug against libconfig-model-dpkg-perl.
When the copyright extracted from a file contains garbage, this is caused by lines like this in the source file:
function foo(c) { ...
The bug is in licensecheck but is very hard to fix since '(c)' has a legal value and is often used to specify copyright.
You should add an entry in fill.copyright.blanks.yml to ignore this file.
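The ambiguity of '(c)' can be reproduced with a deliberately naive pattern. The Python sketch below is not the expression used by licensecheck; the pattern NAIVE_COPYRIGHT and the name "Jane Doe" are made up for this illustration.

```python
import re

# Deliberately naive pattern: the word "copyright" or the "(c)" marker,
# followed by the rest of the line. NOT licensecheck's real expression.
NAIVE_COPYRIGHT = re.compile(r"(?:copyright|\(c\))\s*(.*)", re.IGNORECASE)

lines = [
    "/* (c) 2015 Jane Doe */",  # hypothetical genuine copyright statement
    "function foo(c) { ...",    # code line from the example above
]

for line in lines:
    match = NAIVE_COPYRIGHT.search(line)
    if match:
        # Both lines match: the second one yields code, not an owner.
        print(repr(match.group(1)))
```

Both lines match, and nothing in the text itself tells the parameter list of foo apart from a copyright marker. That is why the file has to be skipped by hand.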
A file that trips the licensecheck bug described above may also sit in a directory that contains no copyright information, i.e. no file in there has a copyright statement. The garbage then shows up in the entry of the whole directory. You should use licensecheck to identify the file tripping the bug and set fill.copyright.blanks.yml to ignore it.
For instance, with a wrong entry like:
Files: source/AYUGens/AY_libayemu/src/*
Copyright: V_Soft and Lion 17 static int Lion17_YM_table [32] = / V_Soft and Lion 17 static int Lion17_AY_table [16] = / Hacker KAY
License: LGPL-2+
Run:
$ licensecheck -r --copyright source/AYUGens/AY_libayemu/src/
source/AYUGens/AY_libayemu/src/ay8912.c: UNKNOWN
[Copyright: V_Soft and Lion 17 static int Lion17_YM_table [32] = / V_Soft and Lion 17 static int Lion17_AY_table [16] = / Hacker KAY]
And add this in fill.copyright.blanks.yml:
source/AYUGens/AY_libayemu/src/:
  skip: '1'