The goal of the CPAN Digger project is to understand and help improve the code that is on CPAN.

A few steps:

Weekly report in the Perl Weekly

Before setting out to make the small improvements I can offer, I wanted to measure the current state and to have a way to measure progress.

In the Perl Weekly newsletter I've started to share weekly statistics from MetaCPAN.

What I saw is that the numbers fluctuate quite a lot (even the percentages). About 20-25% of the distributions have no link to GitHub or another public VCS. About 40-60% of those that have a link don't have a CI system configured.

Link to Public Version Control System

If a distribution does not have a link to a public VCS in its META files, it will be difficult to contribute to that distribution. Some authors might expect a patch by e-mail, but these days very few people know how to send one. A lot more know how to send a pull-request. A public repository is also much better for potential contributors, as they can easily see the changes made since the most recent release that are still only in the repository. For an open source project, having an accessible public version control system seems like a very sensible option. Linking to it in the meta-data of the package makes it easy for other tools, e.g. MetaCPAN, to display it.

Also, without a link to a VCS it is outright impossible to see whether the distribution has a CI system configured.

So the first step is to locate CPAN distributions that don't include a link to their VCS (Version Control System) and to suggest that the author add one. See the article: How to convince Meta CPAN to show a link to the version control system of a distribution? This has to be done by contacting the author personally.
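Locating such distributions can be sketched in a few lines of Python. This is a minimal illustration, not necessarily how CPAN::Digger does it: the function name `repository_link` and the sample data are hypothetical, and it assumes the META file of the distribution has already been parsed. In version 2 of the CPAN META spec `resources.repository` is a map with `url`, `web` and `type` keys; in the older 1.x format it was a plain URL string.

```python
import json


def repository_link(meta):
    """Return the VCS link declared in parsed META data, or None."""
    resources = meta.get("resources") or {}
    repo = resources.get("repository")
    if isinstance(repo, str):
        # META spec 1.x: repository is a plain URL string
        return repo or None
    if isinstance(repo, dict):
        # META spec 2: prefer the browsable "web" link, fall back to "url"
        return repo.get("web") or repo.get("url")
    return None


# Hypothetical META.json content, for illustration only
with_link = json.loads("""
{
  "name": "Example-Dist",
  "resources": {
    "repository": {
      "type": "git",
      "url": "https://github.com/example/Example-Dist.git",
      "web": "https://github.com/example/Example-Dist"
    }
  }
}
""")
without_link = {"name": "Other-Dist", "resources": {}}

print(repository_link(with_link))
print(repository_link(without_link))
```

A distribution for which this returns nothing is a candidate for a friendly e-mail to its author.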

Configure Continuous Integration (CI) for the project

Using a hosted Continuous Integration system helps the author catch many issues before the distribution reaches CPAN. It can catch issues caused by changing dependencies even while the code itself does not change, and it can help us notice whether our changes would break any of the downstream distributions that depend on our code before we release it to CPAN.

In this step we need to locate distributions that have a link to a VCS but don't have CI configured, ask the author if they are interested, and send a pull-request to set up CI.

CPAN Testers are awesome; I have written about them a number of times. They test most of the modules uploaded to CPAN on various platforms. However, they only test modules already uploaded to CPAN. A CI system configured for your repo can run every time you push a change. It can run every time someone sends a pull-request, shortening the feedback loop for potential contributors. It can also run on a schedule, for example once a day, to check that a change in one of your dependencies did not break your code. Again, the point is to get feedback as soon as possible.

If the project is hosted on GitHub there are a number of options such as GitHub Actions, Travis-CI, Appveyor, Circle-CI, Azure Pipelines.

If the project is hosted on GitLab, they provide the GitLab pipelines.

If the project is hosted on Bitbucket, they provide their own pipelines.
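Each of these services is configured by a file committed to the repository, so a cloned repository can be checked for CI by looking for those files. The following Python sketch does this for the services listed above. The helper names `has_workflows` and `detect_ci` are hypothetical, the list of file names covers only the conventional default locations, and CPAN::Digger may well do this differently.

```python
import tempfile
from pathlib import Path

# Conventional configuration file names of the services mentioned above.
CI_FILES = {
    "Travis-CI": [".travis.yml"],
    "Appveyor": ["appveyor.yml", ".appveyor.yml"],
    "Circle-CI": [".circleci/config.yml"],
    "Azure Pipelines": ["azure-pipelines.yml"],
    "GitLab": [".gitlab-ci.yml"],
    "Bitbucket": ["bitbucket-pipelines.yml"],
}


def has_workflows(root):
    """GitHub Actions keeps its workflows as YAML files in .github/workflows."""
    workflows = root / ".github" / "workflows"
    return workflows.is_dir() and any(
        path.suffix in (".yml", ".yaml") for path in workflows.iterdir()
    )


def detect_ci(repo_dir):
    """Return the names of the CI services configured in a cloned repository."""
    root = Path(repo_dir)
    found = []
    if has_workflows(root):
        found.append("GitHub Actions")
    for service, names in CI_FILES.items():
        if any((root / name).is_file() for name in names):
            found.append(service)
    return found


# Build a throw-away "repository" to demonstrate the check
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / ".github" / "workflows").mkdir(parents=True)
    (repo / ".github" / "workflows" / "ci.yml").write_text("name: CI\n")
    (repo / ".travis.yml").write_text("language: perl\n")
    print(detect_ci(repo))
```

An empty result means the repository is a candidate for a pull-request that sets up CI.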

A few articles:

Enable Travis-CI for Continuous Integration

Using Travis-CI and installing Geo::IP on Linux and OSX

Link to the desired issue-tracking system

By default MetaCPAN will link to the Request Tracker, but you might prefer that your users submit bug reports and feature requests via some other issue-tracking system, for example the one that comes with your version control system.

Help authors configure the Meta-data that links MetaCPAN to the issue-tracking system they prefer to use.
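To see which issue tracker a distribution declares, one can look at `resources.bugtracker` in its parsed META data. In version 2 of the CPAN META spec this is a map with optional `web` and `mailto` keys; in the older 1.x format it was a plain URL string. The Python sketch below assumes the META file is already parsed; the function name `bugtracker_link` and the sample data are hypothetical.

```python
def bugtracker_link(meta):
    """Return the issue tracker declared in parsed META data, or None."""
    resources = meta.get("resources") or {}
    tracker = resources.get("bugtracker")
    if isinstance(tracker, str):
        # META spec 1.x: bugtracker is a plain URL string
        return tracker or None
    if isinstance(tracker, dict):
        # META spec 2: a web address and/or an e-mail address
        if tracker.get("web"):
            return tracker["web"]
        if tracker.get("mailto"):
            return "mailto:" + tracker["mailto"]
    return None


# Hypothetical META data, for illustration only
meta = {
    "name": "Example-Dist",
    "resources": {
        "bugtracker": {"web": "https://github.com/example/Example-Dist/issues"}
    },
}

print(bugtracker_link(meta))
print(bugtracker_link({"name": "Other-Dist"}))
```

When nothing is declared, MetaCPAN falls back to its default, as described above.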

License field

The license field in the META data of a CPAN package makes it easy to automatically check the license of each package.

How to add the license field to the META.yml and META.json files on CPAN?
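An automatic check of this kind can be sketched as follows. In version 2 of the CPAN META spec the `license` field is a list of short license identifiers such as `perl_5`, and `unknown` marks a distribution whose license was not declared; in the older 1.x format it was a single string. The function name `declared_licenses` and the sample data are hypothetical.

```python
def declared_licenses(meta):
    """Return the license identifiers declared in parsed META data."""
    value = meta.get("license")
    if value is None:
        return []
    if isinstance(value, str):
        # META spec 1.x: a single string instead of a list
        value = [value]
    # "unknown" means that no license was declared
    return [name for name in value if name and name != "unknown"]


# Hypothetical META data, for illustration only
print(declared_licenses({"name": "Example-Dist", "license": ["perl_5"]}))
print(declared_licenses({"name": "Other-Dist", "license": ["unknown"]}))
```

An empty list identifies a distribution whose author could be asked to add the license field.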

Tools

You can install the CPAN::Digger module and run

cpan-digger --author SZABGAB --report

replacing my PAUSE ID with yours. This will give you a list of your distributions that do not have a link to a Version Control System.

cpan-digger --author SZABGAB --report --github --sleep 3 --limit 30

This will also check whether the 30 most recently uploaded distributions have a CI system. It clones each repository into a temporary directory, so you might want to set the "--sleep" flag to hammer GitHub a bit less frequently.

cpan-digger --recent 30 --report --github --sleep 3

Finally, you can ask for information about the N most recently uploaded distributions by any author. This can be useful if you would like to help others link to their VCS or set up a CI system.

TODO

  • Set up a separate site where we collect the Meta information about CPAN distributions.
  • Run Perl Critic on the source code of the modules and show statistics on which rules are usually followed and which are not. See Kritika.
  • Run Perl Tidy on the source code and see what layout is usually used.
  • Check Cyclomatic Complexity of the code - Perl::Metrics::Simple

Log

Emails suggesting that the author add a link to the VCS. There will always be people who don't want to share the link to their public version control. That's fine. We should not bother them again.

On the other hand, where the email bounced or where there was no response, we might try to find another way to contact the author.

Date        Author                     Result
2020.11.21  Jason Carty                Email bounced
2020.11.18  Guido Socher               Not using public VCS
2020.11.18  Olly Betts                 Link added
2020.11.18  Mathias Weidner            Links added
2020.11.15  Strzelecki Łukasz
2020.11.15  Brian Kelly
2020.11.15  Louis Strous
2020.11.15  Ludovico Stevens           Not using public VCS
2020.11.15  Marcus Holland-Moritz      Pull-request sent
2020.11.15  John Heidemann             Not using public VCS
2020.11.14  Mike Taylor
2020.11.14  Pete Ratzlaff
2020.11.14  S. Falempin                Email bounced
2020.11.14  Sano Taku
2020.11.14  Scott T. Hardin
2020.11.12  Bruce Schuck
2020.11.12  Michael R. Davis           Moving once private repos to GitHub
2020.11.12  Wang Fan
2020.11.12  Armin Fuerst               Email bounced
2020.11.09  Franck Giacomoni
2020.11.09  David Dick                 On GitHub now
2020.11.07  LE GALL Thierry
2020.11.07  Tomohiro Yamashita
2020.11.07  Philip Gwyn                Not using public VCS
2020.11.07  Pete Krawczyk
2020.11.07  Jerrad Pierce              Not using public VCS
2020.11.05  Jim Turner
2020.11.05  Karl Gaissmaier
2020.11.03  Christoph Halbartschlager
2020.11.03  John Gravatt               John added the VCS link within a few hours
2020.11.03  Oleg Pronin
2020.10.31  Dustin La Ferney
2020.08.25  Vincent van Dam            Pull-request sent
2020.08.25  Vitaliy V. Tokarev         Will fix in the next issue
2020.10.25  Jeffrey Ratcliffe          Did not sound enthusiastic about the idea.
2020.10.25  NLnet Labs                 It is in the README and they prefer not to add to the META data.
2020.10.25  Robert Acock
2020.10.25  Philip R Brenan            positive reply
2020.08.19  Roland van Ipenburg        success
2020.08.19  Theo van Hoesel            projects were internal on their way to public VCSs.
2020.08.19  Szymon Nieznański          projects were internal on their way to public VCSs.
2020.08.19  Juan Jose San Martin       no response
2020.08.19  Paul "LeoNerd" Evans       no objection, but does not see much value in the links. - result: no links.
2020.08.10  Jacques Degues             positive response, partial success
2020.08.09  Mike Jones                 success
2020.08.09  Know Zero                  Email bounced
2020.08.08  Kang-min Liu               success: it was just an oversight.