CPAN Digger
The goal of the CPAN Digger project is to understand and help improve the code that is on CPAN.
A few steps:
Weekly report in the Perl Weekly
Before setting out to make the small improvements I can offer, I wanted to measure the current state and I wanted to have a way to measure progress.
In the Perl Weekly newsletter I've started to share weekly statistics from MetaCPAN.
What I saw is that the numbers fluctuate quite a lot (even the percentages). About 20-25% of the distributions have no link to GitHub or another public VCS. About 40-60% of those that have a link don't have a CI system configured.
Link to Public Version Control System
If a distribution does not have a link to a public VCS in its META files, then it will be difficult to contribute to that distribution. Some authors might expect a patch by e-mail, but these days very few people know how to do that. A lot more know how to send a pull-request. A public repository is also much better because the potential contributor can easily see the changes made since the most recent release that are still only in the repository. For an open source project, having an accessible public version control system seems like a very sensible option. Linking to it in the meta-data of the package makes it easy for other tools, e.g. MetaCPAN, to display it.
Also, without a link to a VCS it is outright impossible to see whether the distribution has a CI system configured.
So the first step is to locate CPAN distributions that don't include a link to their VCS (Version Control System) and suggest to the author to add one. See the article "How to convince Meta CPAN to show a link to the version control system of a distribution?". This has to be done by contacting the author personally.
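As an illustration of what "having a link" means in practice, here is a minimal Python sketch that looks for the repository entry in a parsed META.json. It is not how CPAN::Digger itself is implemented. The function name `repository_link` and the sample distributions are hypothetical. It assumes the usual layout of the CPAN META specification, where version 2 stores the repository as a mapping with `url` and `web` keys and the older META.yml format stores a plain string.

```python
import json


def repository_link(meta):
    """Return the public VCS link recorded in parsed META data, or None."""
    resources = meta.get("resources") or {}
    repository = resources.get("repository") or {}
    # Older META.yml files (spec 1.4) store the repository as a plain string.
    if isinstance(repository, str):
        return repository or None
    # META spec version 2 stores a mapping; prefer the browsable "web" URL.
    return repository.get("web") or repository.get("url")


# Hypothetical metadata, for illustration only.
with_link = json.loads("""
{
  "name": "Example-Dist",
  "resources": {
    "repository": {
      "type": "git",
      "url": "https://github.com/example/Example-Dist.git",
      "web": "https://github.com/example/Example-Dist"
    }
  }
}
""")
without_link = {"name": "Other-Dist", "resources": {}}

for meta in (with_link, without_link):
    print(meta["name"], repository_link(meta))
```

A distribution for which this returns nothing is a candidate for a friendly e-mail to the author.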
Configure Continuous Integration (CI) for the project
Using a hosted Continuous Integration system helps the author catch many issues before the distribution reaches CPAN. It can catch issues caused by changing dependencies even while the code itself does not change, and it can help us notice whether our changes would break any of the downstream dependencies before we release the code to CPAN.
In this step we need to locate distributions that have a link to a VCS, but don't have CI configured. Ask the author if they are interested and send a pull-request to set up CI.
CPAN Testers are awesome, I wrote about them a number of times. They test most of the modules uploaded to CPAN on various platforms. However, they only run on modules already uploaded to CPAN. A CI system configured for your repo can run every time you push a change. It can also run every time someone sends a pull-request, shortening the feedback loop for potential contributors. It can run on a schedule, for example once a day, to check that a change in one of your dependencies did not break your code. Again, the point is to get feedback as soon as possible.
If the project is hosted on GitHub there are a number of options such as GitHub Actions, Travis-CI, Appveyor, Circle-CI, Azure Pipelines.
If the project is hosted on GitLab, they provide the GitLab pipelines.
If the project is hosted on Bitbucket, they provide their own pipelines.
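Whether a cloned repository has CI configured can be guessed from the configuration files these services conventionally look for. The following Python sketch shows the idea. It is not taken from CPAN::Digger, the names `CI_MARKERS` and `detect_ci` are hypothetical, and the file list only covers the default locations (some services also accept other file names).

```python
import tempfile
from pathlib import Path

# Conventional configuration locations of some hosted CI systems.
CI_MARKERS = {
    "GitHub Actions": [".github/workflows"],
    "Travis-CI": [".travis.yml"],
    "Appveyor": ["appveyor.yml", ".appveyor.yml"],
    "Circle-CI": [".circleci/config.yml"],
    "Azure Pipelines": ["azure-pipelines.yml"],
    "GitLab pipelines": [".gitlab-ci.yml"],
    "Bitbucket pipelines": ["bitbucket-pipelines.yml"],
}


def detect_ci(repo_dir):
    """Return the names of the CI systems whose config exists in repo_dir."""
    root = Path(repo_dir)
    found = []
    for name, candidates in CI_MARKERS.items():
        for relative in candidates:
            path = root / relative
            if path.is_file():
                found.append(name)
                break
            # A workflows directory only counts if it contains a YAML file.
            if path.is_dir() and any(
                p.suffix in (".yml", ".yaml") for p in path.iterdir()
            ):
                found.append(name)
                break
    return found


# Demo on a throwaway directory standing in for a cloned repository.
with tempfile.TemporaryDirectory() as repo:
    workflows = Path(repo, ".github", "workflows")
    workflows.mkdir(parents=True)
    (workflows / "ci.yml").write_text("name: CI\n")
    print(detect_ci(repo))
```

An empty result does not prove that there is no CI, only that none of the usual files were found.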
A few articles:
Enable Travis-CI for Continuous Integration
Using Travis-CI and installing Geo::IP on Linux and OSX
Link to the desired issue-tracking system
By default MetaCPAN will link to the Request Tracker, but you might prefer that your users submit bug reports and feature requests via some other issue-tracking system, for example the one that comes with your version control system.
Help authors configure the Meta-data that links MetaCPAN to the issue-tracking system they prefer to use.
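To find the distributions where this applies, one could check which issue tracker, if any, the META data declares. The Python sketch below assumes the layout of the CPAN META specification, where version 2 keeps a `bugtracker` mapping with `web` and `mailto` keys under `resources` and the older format keeps a plain URL string. The function name `issue_tracker` and the sample data are hypothetical.

```python
def issue_tracker(meta):
    """Return the declared issue tracker, or None if the META data has none.

    None means the author has not expressed a preference, which is the case
    where, according to the text above, MetaCPAN links to the Request Tracker.
    """
    resources = meta.get("resources") or {}
    tracker = resources.get("bugtracker") or {}
    # Older META.yml files (spec 1.4) store the bugtracker as a plain URL.
    if isinstance(tracker, str):
        return tracker or None
    return tracker.get("web") or tracker.get("mailto")


# Hypothetical metadata, for illustration only.
declared = {
    "resources": {
        "bugtracker": {"web": "https://github.com/example/Example-Dist/issues"}
    }
}
print(issue_tracker(declared))
print(issue_tracker({"resources": {}}))
```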
License field
The license field in the META data of a CPAN package provides an easy way to automatically check the license of each package.
How to add the license field to the META.yml and META.json files on CPAN?
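An automatic check of this field could look like the following Python sketch. It assumes the CPAN META specification, where version 2 stores `license` as a list of short identifiers such as `perl_5`, uses `unknown` when no license was declared, and where older META.yml files store a single string. The function name `declared_licenses` is hypothetical.

```python
def declared_licenses(meta):
    """Return the license identifiers declared in parsed META data.

    An empty list means the field is missing or set to "unknown".
    """
    value = meta.get("license")
    if value is None:
        return []
    # Older META.yml files (spec 1.4) store a single string.
    if isinstance(value, str):
        value = [value]
    return [name for name in value if name and name != "unknown"]


print(declared_licenses({"license": ["perl_5"]}))
print(declared_licenses({"license": ["unknown"]}))
print(declared_licenses({"name": "No-License-Field"}))
```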
Tools
You can install the CPAN::Digger module and run
cpan-digger --author SZABGAB --report
replacing my PAUSE ID with yours. This will give you a list of your distributions that do not have a link to a Version Control System.
cpan-digger --author SZABGAB --report --github --sleep 3 --limit 30
This will also check whether the 30 most recently uploaded distributions have a CI system. It clones each repository into a temporary directory, so you might want to set the "--sleep" flag to hammer GitHub a bit less frequently.
cpan-digger --recent 30 --report --github --sleep 3
Finally, with "--recent" you can ask for information about the N most recently uploaded distributions by any author. This can be useful if you would like to help others link to their VCS or set up a CI system.
TODO
- Set up a separate site where we collect the Meta information about CPAN distributions.
- Run Perl Critic on the source code of the modules and show statistics which rules are usually followed and which not. See Kritika
- Run Perl Tidy on the source code and see what layout is usually used.
- Check Cyclomatic Complexity of the code - Perl::Metrics::Simple
Log
Emails suggesting to add a link to the VCS. There will always be people who don't want to share the link to their public version control. That's fine. We should not bother them again.
On the other hand, where the email bounced or where there was no response, we might try to find another way to contact the author.
2020.11.21 | Jason Carty | Email bounced |
2020.11.18 | Guido Socher | Not using public VCS |
2020.11.18 | Olly Betts | Link added |
2020.11.18 | Mathias Weidner | Links added |
2020.11.15 | Strzelecki Łukasz | |
2020.11.15 | Brian Kelly | |
2020.11.15 | Louis Strous | |
2020.11.15 | Ludovico Stevens | Not using public VCS |
2020.11.15 | Marcus Holland-Moritz | Pull-request sent |
2020.11.15 | John Heidemann | Not using public VCS |
2020.11.14 | Mike Taylor | |
2020.11.14 | Pete Ratzlaff | |
2020.11.14 | S. Falempin | Email bounced |
2020.11.14 | Sano Taku | |
2020.11.14 | Scott T. Hardin | |
2020.11.12 | Bruce Schuck | |
2020.11.12 | Michael R. Davis | Moving once private repos to GitHub |
2020.11.12 | Wang Fan | |
2020.11.12 | Armin Fuerst | Email bounced |
2020.11.09 | Franck Giacomoni | |
2020.11.09 | David Dick | On GitHub now |
2020.11.07 | LE GALL Thierry | |
2020.11.07 | Tomohiro Yamashita | |
2020.11.07 | Philip Gwyn | Not using public VCS |
2020.11.07 | Pete Krawczyk | |
2020.11.07 | Jerrad Pierce | Not using public VCS |
2020.11.05 | Jim Turner | |
2020.11.05 | Karl Gaissmaier | |
2020.11.03 | Christoph Halbartschlager | |
2020.11.03 | John Gravatt | John added the VCS link within a few hours |
2020.11.03 | Oleg Pronin | |
2020.10.31 | Dustin La Ferney | |
2020.08.25 | Vincent van Dam | Pull-request sent |
2020.08.25 | Vitaliy V. Tokarev | Will fix in the next issue |
2020.10.25 | Jeffrey Ratcliffe | Did not sound enthusiastic about the idea. |
2020.10.25 | NLnet Labs | It is in the README and they prefer not to add to the META data. see also |
2020.10.25 | Robert Acock | |
2020.10.25 | Philip R Brenan | positive reply |
2020.08.19 | Roland van Ipenburg | success |
2020.08.19 | Theo van Hoesel | projects were internal on their way to public VCSs. |
2020.08.19 | Szymon Nieznański | projects were internal on their way to public VCSs. |
2020.08.19 | Juan Jose San Martin | no response |
2020.08.19 | Paul "LeoNerd" Evans | no objection, but does not see much value in the links. - result: no links. |
2020.08.10 | Jacques Degues | positive response, partial success |
2020.08.09 | Mike Jones | success |
2020.08.09 | Know Zero | Email bounced |
2020.08.08 | Kang-min Liu | success: it was just an oversight. |
Published on 2020-10-24