Splitting off from #1274766: Collect stats on enabled sub-modules, not just projects
See also #1036780: Drupal.org should collect stats on enabled sub-modules and core modules

Once updates.d.o is actually collecting, storing, and aggregating data about the sub-components included in a given project (e.g. the enabled modules inside a given Drupal project) we'll need some way to display this data on d.o so people can see it. I think it's going to overwhelm the existing UI at e.g. https://drupal.org/project/usage/3060 so we're going to have to spend some effort figuring out how to display this in a useful way.


Comments

sreynen’s picture

In addition to individual projects, https://drupal.org/project/usage will need an update to show the new data. That UI could probably stay pretty much the same, just with more rows.

Bojhan’s picture

Priority: Normal » Critical

This is really important as it could influence what we keep in core for D8.

Bojhan’s picture

@dww Can you explain how someone else can make this happen?

dww’s picture

Version: 6.x-1.x-dev » 7.x-2.x-dev

A) We need a decision as to whether d.o is going to continue using Project* or not. Stay tuned for a blog post from the DA about that. :/

B) Assuming we're going to keep using Project*, we need to decide if this is happening in D6 or waiting for D7. Generally, we're supposed to be in a D6 code freeze for d.o, and only working on the D7 port. Therefore, I'm bumping this to D7 for now (although we could move it back if necessary/desired).

C) We need a UI design/plan/mockup(s). Anyone can work on that, although I think this is harder than it might appear at first. Some projects have dozens of sub-components (e.g. core). I don't think comment #1 makes any sense at all, unfortunately:

That UI could probably stay pretty much the same, just with more rows.

I don't know how that UI could stay the same with another order of magnitude more rows. The site-wide overview is already close to unusable. IMHO, we should leave the site-wide project comparison page alone, and only try to deal with displaying this new data when you've narrowed down to a specific project. I don't really see the value in trying to compare sub-components across projects, anyway.

D) We need data. That's what #1274766: Collect stats on enabled sub-modules, not just projects is for. bdragon is the single point of failure on that. I don't know anything about the details of the web cache log parsing code, the mongo DB that processes the data, and how the summaries are injected into the d.o database for display. I don't even know where any of that code lives, much less how it works. Sorry.

E) We need code to implement the plan in C.

Cheers,
-Derek

Bojhan’s picture

I am happy to work on C), but I don't feel much for working hard on the UI without any progress on the code parts. Unless bdragon can work on moving that forward?

dww’s picture

See #1274766: Collect stats on enabled sub-modules, not just projects. Looked like he was having success. You'd have to ping him directly.

Personally, I don't feel much for working hard on this without any progress on A. That's why I listed that first...

Cheers,
-Derek

sreynen’s picture

I'm probably underestimating how many sub-modules we have. Core has a lot, but my impression is most modules have 0 or 1 submodule. Might be good to do D) before C) so the UI is informed by real data.

Alan D.’s picture

imho, having the sub-module info here is not overly useful to end users at https://drupal.org/project/usage. I'd limit it to the individual project usage pages.

On the individual project pages (e.g. https://drupal.org/project/usage/views) I can see two obvious options.

1) Enhance the first view by adding more rows (and reducing the number of weeks shown)

Week            5.x    6.x      7.x      8.x  Total
July 14, 2013   3,270  183,525  458,526  0    645,321
- Views         3,270  183,525  458,526  0    645,321
- Views UI      2,270  123,525  258,526  0    245,321
July 7, 2013    3,273  184,206  454,714  0    642,193
- Views         3,270  183,525  458,526  0    645,321
- Views UI      2,270  123,525  258,526  0    245,321

Or adding an additional table as per https://drupal.org/project/usage with all sub-modules listed.

Project        Jul 14, 2013  Jul 7, 2013  Jun 30, 2013  Jun 23, 2013
Drupal core    910,654       906,840      906,941       904,338
  Aggregator   23,000        23,000       23,000        23,000
  Block        453,000       453,000      453,000       453,000
  Blog         203,000       203,000      203,000       203,000
  Book         123,000       123,000      123,000       123,000
  Color        12,000        12,000       12,000        12,000

Hopefully this does get ported. I personally have 3 sub-modules that I want to drop, but I'm waiting on the 8.x port as I currently have no way of knowing these stats.

Great work everyone, much appreciated!!!!

klonos’s picture

Besides the main project graph, I would like to be able to dig deeper into submodules by visiting https://drupal.org/project/usage/drupal/aggregator, https://drupal.org/project/usage/drupal/blog and so forth and be able to see their respective spark line graphs over time.

Ideally, the main project lines would also be displayed on the same graph but grayed out. So for example if you see the main graph of a project go up over time while the individual submodule is either going down or is flat-lined, you'd get the point that the specific feature is not used by people. Having only the submodule's graph would not be enough to make useful comparisons without the overall usage there as well. Hope that makes sense.

MustangGB’s picture

Priority: Critical » Major

kattekrab’s picture

Ok. Apparently we have 12 months' worth of data now on loghost! That's pretty cool.

What are the next steps to move this forward so we can start to see it?

Does it need to be always visible on project pages? Should it be a report that gets generated periodically somewhere?

TR’s picture

Now that we have the data, it would be great to immediately push out some version of this, even if it isn't ideal. The best way to make it useful is to make it available and solicit feedback.

For the first iteration, I agree 100% with Alan D. from #8. Specifically:
1) "I'd limit it to the individual project usage pages."
Yes, this is project-specific data, that's where it belongs.

2) Present it as "an additional table as per https://drupal.org/project/usage with all sub-modules listed." For example,

Project      Jul 14, 2013  Jul 7, 2013  Jun 30, 2013  Jun 23, 2013  Jun 16, 2013  Jun 9, 2013
Drupal core  910,654       906,840      906,941       904,338       901,623       899,127
Aggregator   23,000        23,000       23,000        23,000        23,000        23,000
Block        453,000       453,000      453,000       453,000       453,000       453,000
Blog         203,000       203,000      203,000       203,000       203,000       203,000
Book         123,000       123,000      123,000       123,000       123,000       123,000
Color        12,000        12,000       12,000        12,000        12,000        12,000

3) While this table format only goes back 6 weeks, that's because of the large column width caused by the long-format date. The current minimum column width is enough to show 1 billion users. That's excessive. If the dates were written as just "Month, day" instead of "Month, day, year", that would give us an extra two or three columns easily.

4) If a longer history is desired, instead of one column per week we could show one column per month. That would take us back 8-9 months with the column width fixed as in 3). The most important thing for me is to see the relative usages of the submodules; the week-to-week variation is unimportant, but seeing the long-term change month-to-month might have some use. So for example:

Project      Jul 2013  Jun 2013  May 2013  Apr 2013  Mar 2013  Feb 2013  Jan 2013  Dec 2012  Nov 2012
Drupal core  910,654   906,840   906,941   904,338   901,623   899,127   895,511   891,943   888,887
Aggregator   23,000    23,000    23,000    23,000    23,000    23,000    23,000    23,000    23,000
Block        453,000   453,000   453,000   453,000   453,000   453,000   453,000   453,000   453,000
Blog         203,000   203,000   203,000   203,000   203,000   203,000   203,000   203,000   203,000
Book         123,000   123,000   123,000   123,000   123,000   123,000   123,000   123,000   123,000
Color        12,000    12,000    12,000    12,000    12,000    12,000    12,000    12,000    12,000
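
For anyone who wants to experiment with the monthly view in 4), here is a minimal Python sketch of the roll-up. It assumes weekly counts keyed by the week's start date, and it assumes a month's column is the floor of the mean of that month's weeks; nothing in this thread says how the weeks would actually be combined. The function name `monthly_columns` is hypothetical, and the sample numbers are the mock Drupal core figures from the weekly table above, not real data.

```python
from collections import defaultdict
from datetime import date


def monthly_columns(weekly):
    """Collapse {week_start_date: count} into {(year, month): mean count}.

    Assumption: a month's value is the floor of the mean of its weeks.
    """
    buckets = defaultdict(list)
    for week_start, count in weekly.items():
        buckets[(week_start.year, week_start.month)].append(count)
    # Sort by (year, month) so the columns come out in chronological order.
    return {
        month: sum(counts) // len(counts)
        for month, counts in sorted(buckets.items())
    }


# Mock weekly numbers for "Drupal core", copied from the example table.
weekly_core = {
    date(2013, 6, 23): 904_338,
    date(2013, 6, 30): 906_941,
    date(2013, 7, 7): 906_840,
    date(2013, 7, 14): 910_654,
}
monthly_core = monthly_columns(weekly_core)
print(monthly_core)
```

Taking the last week of each month instead of the mean would be just as defensible; the point is only that the monthly table can be derived from the weekly data without collecting anything new.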
catch’s picture

What about an additional tab like /project/usage/drupal/components?

Then we could drop the overall usage stats for the project and just show the components at the same level.

One issue with http://drupal.org/project/usage is there's no way to show usage by major core version, although it's possible to filter https://www.drupal.org/download/?f[0]=drupal_core%3A7234 by it.

The same would be useful for the components as well - i.e. I don't think we need usage by tag displayed, just usage by date then a major core version filter would be pretty good.

Wim Leers’s picture

This would be very valuable to help decide what to do with experimental modules: https://www.drupal.org/core/experimental.

TR’s picture

Yes. Let me reiterate what I said in #13:

"Now that we have the data, it would be great to immediately push out some version of this, even if it isn't ideal. The best way to make it useful is to make it available and solicit feedback."

kattekrab’s picture

I like @catch's suggestion of putting it in a separate tab.

But yes, it would be great to start accessing this data.

Alan D.’s picture

Even a simple CSV export would be very very useful :)
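
A rough Python sketch of what such an export could look like, assuming one row per component and one column per week. The function name `usage_to_csv`, the column layout, and the ISO date headers are all made up for illustration, and the numbers are the mock ones from the tables earlier in this thread.

```python
import csv
import io


def usage_to_csv(weeks, rows):
    """Render component usage as CSV text: one row per component, one column per week."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["component", *weeks])
    for name, counts in rows:
        writer.writerow([name, *counts])
    return buf.getvalue()


# Mock data only; the layout is an assumption, not an existing d.o export.
weeks = ["2013-07-14", "2013-07-07"]
rows = [
    ("Aggregator", [23000, 23000]),
    ("Block", [453000, 453000]),
]
csv_text = usage_to_csv(weeks, rows)
print(csv_text)
```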

Wim Leers’s picture

Bump.

Is there some way we can help push this forward?

kattekrab’s picture

We need to either get it on the DA team roadmap, or supported as community initiative.

A working prototype might also help?

Alan D.’s picture

Pinging the thread to bump it up again. While I initially was mostly interested from the developer perspective, this is actually highly valuable info from a site builder's perspective as well :)

i.e. for checking how much real-world usage core sub-modules have already had. For workflow, all we have is:

Workbench Moderation: 6,140 users
Drupal > Content Moderation: ?? users

So currently we can't figure out if Content Moderation is some totally unused module that is potentially unstable and littered with numerous undiscovered issues or seemingly stable if there are thousands of users with only 30 or so known bugs.

effulgentsia’s picture

Anyone up for writing a patch that implements #14?

fgm’s picture

Is this data, even if not really up-to-date, available somewhere to do UI experiments with it?

chr.fritsch’s picture

Is there documentation on how to set up a D7 site with the project module and dummy data to get a local dev environment? Or how does developing for d.o. normally work?

Mixologic’s picture

https://www.drupal.org/drupalorg/docs/build/development-environments/dev... is how you go about getting a drupal.org dev site.

Since this would require changes to the project_usage module, we'd probably want to add in an additional drush command to process submodule stats, because I looked at the submodule usage stats and they show that project_usage itself is used by about 38 sites.

Attached is the file format for what we have on our loghost for the most recent week's worth of data. Inside are two files: 1518307200.submodule_project_counts, which shows raw counts per submodule name, and 1518307200.submodule_release_counts, which is broken apart by individual release.
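
A small Python sketch for working with those file names, assuming the numeric prefix is a Unix timestamp in seconds (UTC) marking the start of the week. That is a guess based on the value itself, not something stated here, and `parse_stats_filename` is a hypothetical helper. The layout of the file contents is not described in this thread, so the sketch only handles the name.

```python
from datetime import datetime, timezone


def parse_stats_filename(filename):
    """Split '<timestamp>.<kind>' into (week start date, kind).

    Assumption: the prefix is a Unix timestamp in seconds, interpreted as UTC.
    """
    stamp, _, kind = filename.partition(".")
    week_start = datetime.fromtimestamp(int(stamp), tz=timezone.utc).date()
    return week_start, kind


week_start, kind = parse_stats_filename("1518307200.submodule_project_counts")
print(week_start.isoformat(), kind)
```

Under that assumption the prefix decodes to a Sunday, which lines up with the Sunday week starts shown on the existing usage pages.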

fgm’s picture

Very interesting data, thanks. I notice quite a number of modules are not listed: I suppose this is only available above a certain threshold?

Mixologic’s picture

Any module that has at least 1 user requesting update data is listed.

@fgm which modules are missing? There are 92370 modules in that file, which is pretty close to all of them.

Mixologic’s picture

Oh, also, the 'submodule counts' only contain modules whose module name does not match the project name, as those are tracked in the regular stats.

It's yet another mismatch between Drupal.org's data model for a "project" and core's lack of knowledge of what a 'project' is.
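
To make the rule above concrete, here is a minimal Python sketch, assuming the data can be viewed as (project, module) pairs of machine names. The pair representation and the helper name `is_submodule` are assumptions for illustration; the real processing code is not shown in this thread.

```python
def is_submodule(project, module):
    """A module counts as a sub-module when its name differs from its project's name."""
    return module != project


# Illustrative pairs only; (project machine name, module machine name).
pairs = [
    ("views", "views"),        # same name: already covered by the regular project stats
    ("views", "views_ui"),     # different name: appears in the submodule counts
    ("drupal", "aggregator"),  # core module: appears in the submodule counts
]
submodules = [pair for pair in pairs if is_submodule(*pair)]
print(submodules)
```

So anyone building a combined per-project table would need to merge the regular project stats (for the module that shares the project's name) with the submodule counts (for everything else).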

kattekrab’s picture

Hello!!

So... nearly a year since an update on this. Any chance we could get some movement here?

Sam152’s picture

@Mixologic, is there any chance of getting another dump of the raw data from #25?

Sam152’s picture

Poke! Still interested in getting an updated snapshot of those stats if they're available.

2dareis2do’s picture

+1