Package development process
A software package development process is a system for developing software packages. Packages make it possible to reuse and share code, e.g., via a software repository. A formal system for package checking usually exposes bugs, thereby potentially making it easier to produce trustworthy software (Chambers' prime directive).[1]
Discussion
In this context, a package is a collection of functions written for use in a single language such as Python or R, bundled with documentation. For many programming languages, there are software repositories where people share such packages.
For example, a Python module combines documentation, code, initial setup, and possibly examples that can be used as unit tests in a single file with a ".py" extension; a Python package is a directory that groups such modules.
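The idea of documentation examples doubling as unit tests can be sketched with Python's standard doctest module. This is only a minimal illustration: the function mean and the helper run_documented_examples are hypothetical names invented for this example, not part of any particular package.

```python
import doctest


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers.

    The examples below serve as documentation, and doctest can also
    execute them as tests.

    >>> mean([1, 2, 3])
    2.0
    >>> mean([10])
    10.0
    """
    return sum(values) / len(values)


def run_documented_examples():
    """Run every docstring example of mean and return the result counts."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # Collect the examples found in the docstring and execute each one.
    for test in finder.find(mean, "mean", globs={"mean": mean}):
        runner.run(test)
    # summarize() returns a (failed, attempted) named tuple.
    return runner.summarize(verbose=False)


if __name__ == "__main__":
    results = run_documented_examples()
    print(f"attempted={results.attempted}, failed={results.failed}")
```

If the documented output ever disagrees with what the code returns, the failed count becomes nonzero, so a stale example is reported rather than silently misleading readers.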
By contrast, an R package has documentation with examples in files separate from the code, possibly bundled with other material such as sample data sets and introductory vignettes. The source code for an R package is contained in a directory with a master "DESCRIPTION" file and separate subdirectories for documentation, code, optional data sets, suites for unit or regression testing, and perhaps others.[2] A formal package compilation process [3][4] checks for errors of various types. These checks include looking for syntax errors in both the documentation markup language and the code, and comparing the arguments described in the documentation with those defined in the code. Examples in the documentation are run, and they produce error messages if they fail. This can serve as a primitive form of unit testing; more formal unit tests and regression tests can also be included.
Such checking can improve software development productivity by making it easier to find bugs while the code is being developed. In addition, the documentation makes it easier to share code with others, and easier for a developer to reuse code written months or even years earlier.
Routine checks are made of packages contributed to the Comprehensive R Archive Network (CRAN) and of packages under development on the companion open-source collaborative development web site, R-Forge. These checks compile the packages repeatedly on different platforms under different versions of the core R language, and the results are made available to package maintainers. In this way, package contributors become aware of problems they might otherwise never encounter, because they would not have easy access to those alternative test environments.
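The comparison of arguments between documentation and code described above can be approximated in Python as a rough analogy. The sketch below is not how the R checking tools are implemented. It assumes docstrings use the Sphinx-style ":param name:" field convention, and the names documented_arguments, check_arguments, and scale are hypothetical, introduced only for illustration.

```python
import inspect
import re

# Matches Sphinx-style field lines such as ":param x: description".
# Assumption: the documentation uses this convention.
_PARAM_FIELD = re.compile(r"^\s*:param\s+(\w+)\s*:", re.MULTILINE)


def documented_arguments(func):
    """Return the argument names listed in ':param name:' docstring fields."""
    doc = inspect.getdoc(func) or ""
    return _PARAM_FIELD.findall(doc)


def check_arguments(func):
    """Compare documented argument names with the actual signature.

    Returns a dict with two sorted lists: arguments present in the code
    but missing from the documentation, and arguments documented but
    absent from the code.
    """
    in_code = set(inspect.signature(func).parameters)
    in_docs = set(documented_arguments(func))
    return {
        "undocumented": sorted(in_code - in_docs),
        "not_in_code": sorted(in_docs - in_code),
    }


def scale(values, factor, offset=0.0):
    """Scale each value and add an offset.

    :param values: numbers to transform
    :param factor: multiplier applied to each value
    :param shift: constant added after scaling (stale name for offset)
    """
    return [v * factor + offset for v in values]


if __name__ == "__main__":
    # Reports 'offset' as undocumented and 'shift' as not in the code.
    print(check_arguments(scale))
```

A check of this kind catches documentation that has drifted from the code, for instance after an argument is renamed, which is the sort of mismatch a formal package checking process is meant to surface before the package is shared.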
An interesting research question would be to compare the quality of contributions to different software repositories and try to relate that to features of the language and accompanying package development process. This could include trying to compare the rate of growth of contributed software to the degree of formality and enforcement of standards for documentation, testing, and coding.
See also
- Package management system for combining software packages in different languages into an operating system.
- Software repository for collections of packages to share.
- Software development process or Software development methodology for a more general discussion of software development.
References
- Chambers, John M. (2008). Software for Data Analysis: Programming with R. Springer. ISBN 978-0-387-75935-7.
- Writing R Extensions.
- Leisch, Friedrich. "Creating R Packages: A Tutorial" (PDF).
- Graves, Spencer B.; Dorai-Raj, Sundar. "Creating R Packages, Using CRAN, R-Forge, And Local R Archive Networks And Subversion (SVN) Repositories" (PDF).