News & Views

Posted by Barbie
on 25th November 2010

If you've ever had a look at the Status page on the CPAN Testers Reports site, you will likely have noticed that typically the graphs show 4-5 lines on any given day. This has been pretty much the case since I added this monitoring feature, and supported the fact that it could take up to 5 days for a less common page to be rebuilt.

However, over the last 2 weeks that has been changing, to the point that from about 4pm CET yesterday (24/11/2010) the builder only had requests less than 24 hours old. It appears there are three reasons for this.

The first is that we have seen a reduction in report submissions over the past few weeks. Having said that, submissions during October were rather substantial, topping 500,000, so it's not too surprising to see a reduction. And to be fair, looking at the Monthly Stats, we have already had over 300,000 report submissions this month, so it's not been a quiet month either.

Secondly, the Reports site has had some alterations to reduce the hits from robots. With around 15-20 crawlers starting to hit the site at once, processing was occasionally affecting other areas of the build process, as well as the backup processes. As such, 'rel="nofollow"' has been added as an attribute to links for RSS, YAML and JSON files. This had a dramatic effect, as can be seen in the CPU graph, to the point that the server load dropped to under 2.0 for long periods for the first time since I set up the server! The change essentially means crawlers now only reference just under 10 million pages, rather than over 20 million, and don't pull several gigabytes of storage data off the server each day.
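As a rough illustration of the 'rel="nofollow"' change, here is a minimal Python sketch of a link helper. It is not the Reports site's actual code: the function name `render_link`, the example URLs and the file extensions used to spot RSS, YAML and JSON links are all assumptions made for the example.

```python
import html
from urllib.parse import urlparse

# Assumed extensions for the feed and data files; the real site may name them differently.
FEED_EXTENSIONS = (".rss", ".yaml", ".yml", ".json")

def render_link(href: str, text: str) -> str:
    """Return an anchor tag, adding rel="nofollow" when the link targets a feed or data file."""
    path = urlparse(href).path.lower()
    rel = ' rel="nofollow"' if path.endswith(FEED_EXTENSIONS) else ""
    return f'<a href="{html.escape(href, quote=True)}"{rel}>{html.escape(text)}</a>'

# Hypothetical usage: only the JSON link gets the attribute.
print(render_link("/distro/A/Acme.html", "HTML"))
print(render_link("/distro/A/Acme.json", "JSON"))
```

Bear in mind that 'rel="nofollow"' is only a hint, so it only helps with crawlers that choose to honour it.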

Thirdly, the bzip2 process to archive the backup databases now only happens once a day. With the reduction in server hits, this now takes far less time to process, and no longer has a prolonged effect on the build process. Previously it could take over 2 hours to compress the archive 6 times a day, and now takes about 25 minutes once a day.
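For anyone curious what a single archive run might look like, here is a minimal Python sketch using the standard `bz2` module. It is an assumption-laden illustration rather than the script running on the server: the function name `compress_to_bz2`, the chunk size and the temporary-file naming are all made up for the example.

```python
import bz2
import shutil
from pathlib import Path

def compress_to_bz2(source: Path, dest: Path, chunk_size: int = 1024 * 1024) -> None:
    """Compress source into dest with bzip2, reading in chunks to keep memory use flat."""
    # Write to a temporary name first so downloaders never see a half-written archive.
    tmp = dest.with_name(dest.name + ".tmp")
    with source.open("rb") as src, bz2.open(tmp, "wb") as out:
        shutil.copyfileobj(src, out, chunk_size)
    # Swap the finished archive into place in one step.
    tmp.replace(dest)
```

Running something like this once a day, from whatever scheduler is to hand, would match the new routine described above. Writing to a temporary file and renaming it at the end is a design choice for the sketch, so that a download in progress is not handed a partial file.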

In addition there have been some minor tweaks to the build process, but the major changes are still waiting in the wings, as the current data stores need to fully update to allow me to implement them.

Every so often we get asked why a particular report hasn't appeared on the Reports site. Depending on the sync process it can be anything between a few minutes and an hour. However, as most people watch for reports via the Distro or Author pages, previously it could take up to 5 days to appear. That's now down to less than a day. However, that still isn't quite quick enough, which is why some further improvements will hopefully be implemented over the next week.

Expect some more updates soon on the next set of changes and some of the proposed changes for the future.

Posted by Barbie
on 24th November 2010

"...kidney bingos organ fun"

Congratulations to BinGOs, aka Chris Williams, on reaching 3 million test reports submitted. Chris alone now accounts for slightly under one third of all the test reports submitted to CPAN Testers!

Since joining the CPAN Testers community, Chris has been a valuable asset, both in terms of the diversity of the testing platforms, and also for his ability to push the testing infrastructure beyond the limits we anticipated.

With the stagnation of the CPAN-YACSmoke distribution, Chris eagerly stripped it down and rebuilt it into CPANPLUS-YACSmoke, providing a stable basis for testing with CPANPLUS once again. Since then Chris has expanded his knowledge of distributed testing, and developed more applications and modules to support various styles of smoke testing, from his POE plugins to smokebrew.

Well done Chris, and here's to the next 3 million!

 

Back in January 2008 we were celebrating the one millionth post submitted to CPAN Testers. Although that article proclaimed it to be the one millionth report, many initial posts to the mailing list also included discussions and announcements of uploads. It wasn't until I created the Interesting Stats page that we started to see the true picture. However, we only had to wait until March 2008 for the real one millionth report to be posted. Now some 2 years and 7 months later we've had the nine millionth report submitted. It took 9 years to produce 1 million reports, but only a further 2½ years to produce another 8 million reports. The rate at which CPAN Testers has been able to get people involved in the project has been phenomenal. We are now submitting over 500,000 reports a month, so I have no doubt we will pass the 10 millionth mark before the end of the year .. probably just before Christmas :)

In the comments to my nine millionth post on Perl Blogs, John Napiorkowski asked how testing of packages compares in other languages, particularly Python and Ruby. Chris Williams provided some links to the testing setups for those languages, and the sites prove rather interesting. From the perspective of trying to find information about test results, CPAN Testers wipes the floor with both, as I found both the Cheesecake and Firebrigade sites awkward to follow. The Cheesecake site seems to be aiming more for a site like CPANTS, which is probably a good first step to encourage a testing culture. The Firebrigade site seems to have tried to take on the idea of CPAN Testers, but in trying to also be different they've actually made things hard for themselves. I also dispute the Ruby claim of "Firebrigade tests every gem ever made on every platform under the sun". On the front page it lists that it only has tests on 45 platforms. CPAN Testers would never make such a bold or false claim, but with over 100 platforms, and 74 alone during October 2010, I think Perl's sun must be much bigger than Ruby's, and CPAN Testers are still only scratching the surface. It will be a long time before any other language can compete with CPAN Testers, and with only 20405 reports in nearly 4 years, the Ruby team have a long way to go to catch up.

CPAN Testers should be immensely proud of the work they have put into the project, whether as a developer, tester or even those with just the odd suggestion to help improve the eco-system. Every contribution has helped to make CPAN Testers worthwhile and valued by the Perl community, as well as respected, imitated and/or envied by other language communities. And we're still improving.

Talking of improvements, there have been several performance improvements to the applications which produce the web pages for several sites. The CPAN Testers Statistics site was suffering from the vast amount of number crunching it performed, and was previously using up as much as 3GB of RAM, and often taking over an hour to produce its results. With a rework to save a snapshot of the data, and restart each time from where we left off, the application now uses less than 1GB of RAM and processing takes about 40 minutes. The CPAN Testers Reports page builder has also seen some tweaks, and again the pages have seen a dramatic improvement in build times. Some author pages were taking over an hour to build, but now even RJBS and ADAMK only take 5-15 minutes. There is still room for improvement, and better use of on-disk storage is planned.
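The snapshot idea can be sketched in a few lines of Python. This is only an illustration of the general technique, not the Statistics site's code: the function name `update_totals`, the JSON snapshot layout and the assumption that reports arrive as (id, grade) pairs in ascending id order are all invented for the example.

```python
import json
from pathlib import Path

def update_totals(reports, snapshot_path: Path) -> dict:
    """Count reports per grade, resuming from a saved snapshot if one exists.

    reports: iterable of (report_id, grade) pairs, assumed to be in ascending id order.
    """
    if snapshot_path.exists():
        # Pick up where the previous run left off.
        state = json.loads(snapshot_path.read_text())
    else:
        state = {"last_id": 0, "totals": {}}

    for report_id, grade in reports:
        if report_id <= state["last_id"]:
            continue  # already counted in an earlier run
        state["totals"][grade] = state["totals"].get(grade, 0) + 1
        state["last_id"] = report_id

    # Save the snapshot so the next run only has to process newer reports.
    snapshot_path.write_text(json.dumps(state))
    return state
```

In a real setup the query feeding `reports` would presumably start from `last_id` rather than rereading everything, which is where most of the time and memory saving would come from.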

Still the biggest drain on resources is bzip2. It is a memory and CPU hog at the best of times, and with it holding IO on occasions, it often has a significant impact on other applications. As such I am taking time to review how the bzip2 files are produced. Part of that is to review how often they need to be generated. Tellingly, any one IP grabs the two most popular archive files (uploads.db.bz and cpanstats.db.bz2) just once a day. Currently the gzip archive, cpanstats.db.gz, has a similar popularity. As such, over the next few days expect the archive creations to happen in the early hours of the morning (CET), gathering up the previous day's stats. Initially I was planning to move the archiving to another server, but with the archives not being in high demand, I will now look at running one complete archive process a day and see how that affects the server performance. If the change in timestamp is likely to cause problems, please let me know and I'll see what I can do to help.

In the next few weeks David Golden and I are planning to chat about the future of CPAN Testers. Now that CT2.0 is live, where do we go from here? There are some obvious improvements we can now start to look at, such as expanding the metadata we record, but we have other plans to make CPAN Testers even more reliable and current. Once we've had a chance to discuss the ideas and point them in the right direction, we'll let you know more.

In other news, it is likely that the Preferences site's SSL certificate will fail very soon. For the past 2 years we've been able to qualify for GoDaddy's OpenSource scheme which donates a 1 year certificate for any verified Open Source project. Sadly, despite them considering CPAN Testers an Open Source project for the last 2 years, we have now been rejected for not being an Open Source project! Yes, the response surprised me too, but despite attempts to understand why we no longer qualify, they've now closed the request ticket and have effectively ended the discussion. As such, I'll be looking to purchase a new SSL certificate from another vendor shortly, who hopefully have a better support policy.

I was intrigued to see Yanick Champoux's recent blog post: Generating RT bugs out of CPAN Testers' Reports. Yanick was looking for an effective way to submit a test report into his RT queue. Unfortunately I can't add a button to the site as suggested, as at the current time the site doesn't verify that you are the author of a distribution, and opening it up to all would be a nightmare waiting to happen. I did wonder whether this was something that could be added to the Preferences site, but with potentially hundreds of reports coming in, trying to decide whether they are applicable for RT or not could also turn into a nightmare. As such, if you're interested in doing this yourself for your own RT queue, read Yanick's post and see how you get on.

Moving away from CPAN Testers and looking at CPAN, we passed another milestone last month. On 8th October 2010 ETHER became the 5,000th PAUSE user to upload a distribution to CPAN. Although we currently have 8482 PAUSE users (as of 02/11/2010), it is surprising how many have used their ID for other CPAN related activities. After holding the top spot for some considerable time, Adam Kennedy has now been overtaken by Ricardo Signes for the most current distributions attributed to a single PAUSE user. Some years ago it was considered quite a feat to reach 100 distributions, but with 230 active distributions currently to his name, I'm not surprised Ricardo created Dist::Zilla to help him manage them all :)

Finally, we have some more mappings, with 40 new address mappings, of which 22 are for new testers. Until next time, happy testing :)
