YTEP-0029: Extension Packages

Abstract

Created: January 25, 2016

Author: Matthew Turk

The yt project is not the same as the yt codebase. However, the bundling of analysis modules and maintaining a monolithic repository tends to blur the distinction between the two. This YTEP is an attempt to identify how we can make the codebase more friendly to extension modules, and to encourage an ecosystem of independently developed packages that utilize yt.

Status

Completed

Detailed Description

This YTEP proposes a few courses of action designed to promote the idea of yt as a dependency, rather than a destination, for analysis modules and external projects.

As it stands, many (the author included) have viewed yt as the “place” to put things, whether or not they contribute to its core mission. There are several good reasons for this:

  • Distribution: yt is largely a monolithic codebase, and if you have downloaded and installed yt, you have access to all of the analysis modules. This makes discoverability and distribution much easier.
  • Infrastructure: yt has testing infrastructure which gets run on all pull requests and new revisions. As such, if the analysis module is in the primary codebase, it too will be tested. If functionality in yt changes, the analysis module will track this. Additionally, we provide the ability to run tests on managed infrastructure.
  • Hosting: We will allow these projects to host data on hub.yt as well as have their websites either as subrepositories in the main yt-project/website repository or as subdirectories. This will allow them to share our hosting infrastructure without developing their own, if they so wish.
  • Prestige: (maybe?) Having something be a part of yt adds to the sense that it’s part of the bigger project, and that it’s somehow graduated to a stable component. (I am not sure this is real, but it’s been mentioned to me.)
  • Social infrastructure: There’s a mailing list for yt, so folks know where to go.

But, there are several – sometimes quite large – downsides to being part of the mainline yt codebase.

  • Project Standards: The methods right now for developing in yt require several iterations of code review as well as conformance to standard development practices. For instance, this could include style or where the code is developed (e.g., Bitbucket, GitHub, Kallithea).
  • Credit: While the situation is certainly improving, and there is nothing to stop a contributor from writing a paper describing a new analysis module (in fact, this is encouraged), the module is still part of a “project” codebase rather than a standalone entity. This can diminish the perception of individual contributions, particularly by people not directly affiliated with the mainline yt project.
  • Timeline: The acceptance of a change into yt can often take weeks or longer, depending largely on the time between issuing a pull request and the next PR triage. For modules that are developed with much higher frequency, this can be very cumbersome and costly in developer effort.
  • Dependencies: yt attempts to keep the number of dependencies at a minimum; if a package needs additional dependencies, it often has to either import them on demand or vendor them with the source. We want to encourage packages to utilize new technologies and experiment, but at present we can’t allow that into mainline.
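The on-demand option mentioned under Dependencies can be illustrated with a small proxy that defers importing an optional package until it is first used. This is a minimal sketch, not yt's own implementation; the class name OnDemandModule is hypothetical, and the standard-library json module stands in for a heavy optional dependency.

```python
import importlib


class OnDemandModule:
    """Proxy that imports an optional dependency on first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # Only called for attributes not found normally, so _name and
        # _module (set in __init__) never reach this method.
        if self._module is None:
            try:
                self._module = importlib.import_module(self._name)
            except ImportError as exc:
                raise ImportError(
                    f"This feature requires the optional package "
                    f"{self._name!r}, which is not installed."
                ) from exc
        return getattr(self._module, attr)


# 'json' is a stand-in for an optional dependency; nothing is imported here.
json_on_demand = OnDemandModule("json")

# The real import happens on first use.
encoded = json_on_demand.dumps({"a": 1})
```

The cost of this pattern is that a missing dependency is only reported when the feature is used, not at install or import time, which is part of why vendoring or a separate package can be preferable.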

A solution to this problem would be to encourage an ecosystem of discoverable packages that build on yt as a dependency. Three such packages exist at the time of writing: powderday, trident, and astroblend.

What this YTEP proposes is to restructure our discussion of yt as a project to emphasize these types of packages and projects as citizens in the community, and to provide a mechanism for people to find them.

This will take a few forms:

  • Re-designing the website to emphasize the burgeoning ecosystem of projects, perhaps similar to the way astropy.org is set up.
  • Offering, but not mandating, that projects direct questions to the yt-users mailing list.
  • Splitting out some particularly isolated projects from analysis modules. This may include photon simulator and the sz generator.
  • Providing entry points in the code for people to find out about available analysis modules. (Discussed below.)
  • Utilizing intersphinx documentation links for analysis modules, or offering to host documentation on the main yt homepage.
  • Determining mechanisms for recognizing projects as extensions, independent, related, etc.
  • Exploring testing support for extensions under some criteria (for example: does the extension depend on public APIs, and do its test failures constitute upstream breakages?). Adding such support can be difficult and labor intensive.
  • Accepting either subrepos or subdirectories in yt-project/website.

At present, people can find analysis modules by doing something like:

>>> import yt
>>> dir(yt.amods)

This returns a list of analysis modules. This could be extended to also parse a small list of known external projects which could be installed. Importing one that was not installed would report that it wasn’t installed and describe a method for installing it.
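One way this extension could look is sketched below, assuming a small registry object whose dir() output includes known external projects and whose attribute access reports when a project is not installed. The class ModuleIndex, its arguments, and the install hint text are all hypothetical, and the sketch assumes each project's import name matches its project name. The standard-library statistics module stands in for a bundled analysis module so that the example runs without yt.

```python
import importlib


class ModuleIndex:
    """Lists bundled and known external modules; imports them on access."""

    def __init__(self, bundled, external):
        self._bundled = dict(bundled)    # attribute name -> module name
        self._external = dict(external)  # attribute name -> (module name, hint)

    def __dir__(self):
        # External projects are listed whether or not they are installed.
        return sorted(set(self._bundled) | set(self._external))

    def __getattr__(self, name):
        if name.startswith("_"):
            raise AttributeError(name)
        if name in self._bundled:
            return importlib.import_module(self._bundled[name])
        if name in self._external:
            module_name, hint = self._external[name]
            try:
                return importlib.import_module(module_name)
            except ImportError as exc:
                raise ImportError(
                    f"{name!r} is a known external project but is not "
                    f"installed. {hint}"
                ) from exc
        raise AttributeError(name)


amods = ModuleIndex(
    bundled={"stats": "statistics"},  # stdlib stand-in for a bundled module
    external={
        # Assumption: import name equals project name; hints are placeholders.
        name: (name, f"See the {name} documentation for installation instructions.")
        for name in ("powderday", "trident", "astroblend")
    },
)

listing = dir(amods)  # includes both bundled and external names
```

Accessing an attribute such as amods.trident then either returns the imported package or raises an ImportError carrying the install hint.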

Depending on the type of project, we may want to distinguish between yt extensions and projects that share the yt ecosystem. For the former, we could evaluate using yt.extension as the namespace, if yt.analysis_modules is not appropriate. Similar to how Flask manages extension modules, this would provide a naming scheme (say, ytext_projectname) for packages that would then be imported into yt.extension. This is not necessary, but is a possibility within the scope of this YTEP.
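The naming scheme could be wired up roughly as follows. This is a sketch under stated assumptions: the helper make_extension_namespace is hypothetical, the namespace object is built by hand rather than being a real yt.extension module, and a fake ytext_demo package is registered in sys.modules so the example runs without any extension installed. It relies on module-level __getattr__ (PEP 562, Python 3.7+).

```python
import importlib
import sys
import types

PREFIX = "ytext_"  # naming scheme suggested in the text


def make_extension_namespace(qualified_name):
    """Build a module whose attributes resolve to ytext_<name> packages."""
    namespace = types.ModuleType(qualified_name)

    def _resolve(name):
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return importlib.import_module(PREFIX + name)
        except ImportError as exc:
            raise AttributeError(
                f"no extension package {PREFIX + name!r} is installed"
            ) from exc

    # A function stored as __getattr__ in a module's namespace is called
    # for attributes the module does not define (PEP 562).
    namespace.__getattr__ = _resolve
    return namespace


# Fake installed extension, for demonstration only.
demo = types.ModuleType("ytext_demo")
demo.greet = lambda: "hello from ytext_demo"
sys.modules["ytext_demo"] = demo

extension = make_extension_namespace("yt.extension")
message = extension.demo.greet()  # resolves to the ytext_demo package
```

With this arrangement an extension author only has to publish a package following the ytext_ prefix; nothing in the core codebase needs to change when a new extension appears.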

Backwards Compatibility

If we split out existing modules, they would need to be API compatible. We would also want to make sure that they are well-supported by our infrastructure.
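One possible way to keep old import paths working after a split is a forwarding shim that emits a deprecation warning. The sketch below is illustrative only: install_compat_shim, legacy_module, and ytext_example are hypothetical names, and the modules are created in memory so the example is self-contained.

```python
import sys
import types
import warnings


def install_compat_shim(old_name, new_module):
    """Register a module under old_name that forwards to new_module."""
    shim = types.ModuleType(old_name)

    def _forward(attr):
        if attr.startswith("_"):
            raise AttributeError(attr)
        warnings.warn(
            f"{old_name} has moved to {new_module.__name__}; "
            "please update your imports.",
            DeprecationWarning,
            stacklevel=2,
        )
        return getattr(new_module, attr)

    # Module-level __getattr__ (PEP 562) handles names the shim lacks.
    shim.__getattr__ = _forward
    sys.modules[old_name] = shim
    return shim


# Stand-in for a module that was split out into its own package.
new_home = types.ModuleType("ytext_example")
new_home.compute = lambda: 42

shim = install_compat_shim("legacy_module", new_home)

import legacy_module  # the old import path still resolves

result = legacy_module.compute()  # forwarded, with a DeprecationWarning
```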

Alternatives

One possible alternative would be to make the yt.analysis_modules namespace a free-for-all of modifications, with much shorter timescales and essentially autonomous operation by developers.