LXR Cross Referencer
LXR Cross Referencer, usually known as LXR, is a general-purpose source code indexer and cross-referencer for code comprehension that provides web-based browsing of source code, with links to the definition and usage of any identifier.
Initial release | c. 1994[1] |
---|---|
Stable release | 2.3.5 / March 20, 2019 |
Repository | |
Written in | Perl |
Type | Indexer and cross-referencer |
License | GNU GPL 2 |
Website | lxr |
History
LXR grew out of the need for a tool that gives an overview of the Linux kernel during its development; its name originally stood for "Linux Cross-Referencer". Such a tool is all the more necessary because documentation is scarce and the number of contributors is high.
Two Norwegian students, Arne Georg Gleditsch and Per Kristian Gjermshus, curious about the architecture of Linux, began writing a small program that displayed its files in a web browser and showed the usages of a variable after a click on its name. Aware of the general interest, they quickly posted it on SourceForge (possibly as early as 1994[1]).
Over time, enthusiasts joined the development team and made the code more mature; however, their number never exceeded ten.[2] In these respects LXR is a typical SourceForge-hosted project, but it has had an exceptionally long life for a small project.
One of the original creators explored new technologies, which resulted in the LXRng spin-off. This experimental development does not contain all the features of the traditional version and departs notably from the founding principles of LXR.
Although the tool was never really publicized, LXR has been covered in some print publications, e.g. Linux Journal.[3] However, references to LXR on the Internet are ambiguous: they may denote the tool itself or an instance of LXR displaying indexed source code (since many sites use "LXR" in its original sense of "Linux Cross-Referencer").
After adopting LXR to index the source code for the Mozilla Application Suite, Mozilla forked LXR into MXR (the Mozilla Cross Reference). MXR was forked in order to meet the needs of Mozilla development, namely code navigation of a mixed C++ and JavaScript codebase. After years of MXR use, Mozilla began work on a new tool with a focus on better static analysis and a dynamic Ajax UI. The result is DXR (the Dehydra Cross Reference[4]). After DXR reached maturity, the MXR instance at mxr.mozilla.org was decommissioned.
Technology
LXR is minimalist and adheres to the least-effort principle.
The deliberate bias towards minimalism avoids using too many different technologies. Thus, it limits the dependencies and the software can be supported by many configurations without special adaptation.
- The design choices include strict HTML 4.01 conformance and a bar on interpreted languages (such as Java or JavaScript).
The least-effort principle forbids writing a tool if one already exists (at least as open source).
- This results in using a web browser for display (HTML and CSS allow for elaborate page layout), storing definitions and references in an available relational database, and parsing files with the Exuberant ctags tool.
LXR is written in Perl, a handy choice for CGI scripts but not really suited to lexical or syntactic parsing.[5]
LXR tries to impose as few constraints as possible:
- several database choices: MySQL, PostgreSQL, SQLite or Oracle,
- choices for full text search between Glimpse and SWISH-E,
- free choice for HTTP server provided it can execute CGI scripts (instructions are given for Apache, Cherokee, lighttpd, Nginx and thttpd),
- source files stored in a real directory or in a version management system repository (a choice[6] between CVS, Git,[7] Mercurial and Subversion).
Usage
After software installation, which is not a trivial task but does not require expertise, source code must be pre-processed and LXR configured to display it.
- The different source code versions are stored as sub-directories.
- Alternatively, the source code is stored in a version management system.
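The sub-directory layout can be pictured with a minimal Python sketch. It is only an illustration, not LXR's code (LXR is written in Perl): the function names `list_versions` and `path_in_version` are hypothetical, and the sketch assumes that every sub-directory of the source root holds exactly one version.

```python
import os

def list_versions(source_root):
    """Return the names of the sub-directories of source_root, one per version.

    Names are sorted as plain strings, so "2.10" sorts before "2.9".
    """
    return sorted(
        entry.name for entry in os.scandir(source_root) if entry.is_dir()
    )

def path_in_version(source_root, version, relative_path):
    """Build the on-disk path of a file inside one version's tree."""
    return os.path.join(source_root, version, relative_path)
```

With this layout, the same relative path can be resolved in any version simply by changing the `version` argument.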
Code is indexed during a second phase: identifiers are gathered and their locations entered in a database. Reindexing is only necessary when source code is modified or a new version is added.
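The principle of the indexing phase can be illustrated with a minimal Python sketch. It is not LXR's implementation: a crude regular expression stands in for a real parser such as Exuberant ctags, SQLite is used only because it needs no server, and the table name `refs` and the functions `create_index`, `index_source` and `lookup` are hypothetical.

```python
import re
import sqlite3

# Crude stand-in for a real parser such as Exuberant ctags:
# every C-like word (keywords included) counts as an identifier occurrence.
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def create_index():
    """Create an in-memory database holding one hypothetical table of occurrences."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE refs (name TEXT, file TEXT, line INTEGER)")
    db.execute("CREATE INDEX refs_name ON refs (name)")
    return db

def index_source(db, filename, text):
    """Record the file and line number of every identifier occurrence."""
    rows = [
        (match.group(0), filename, lineno)
        for lineno, line in enumerate(text.splitlines(), start=1)
        for match in IDENT.finditer(line)
    ]
    db.executemany("INSERT INTO refs VALUES (?, ?, ?)", rows)
    db.commit()

def lookup(db, name):
    """Return the sorted (file, line) locations where an identifier occurs."""
    cur = db.execute(
        "SELECT DISTINCT file, line FROM refs WHERE name = ? ORDER BY file, line",
        (name,),
    )
    return cur.fetchall()
```

Once the table is filled, answering "where is this identifier used?" is a single indexed query, which is why browsing requires no re-parsing until the source code changes.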
Afterwards, all that is needed is to launch a web browser with a URL corresponding to the source code and to navigate across files through the hyperlinks associated with identifiers.
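How identifiers become hyperlinks can be sketched in a few lines of Python. This is an assumption-laden illustration rather than LXR's rendering code: the function name `linkify` and the URL layout `/ident?i=NAME` are invented for the example, and the set `known` stands for the identifiers found during indexing.

```python
import html
import re
from urllib.parse import quote

IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def linkify(line, known, base_url="/ident"):
    """Escape one source line and wrap each indexed identifier in a link."""
    out = []
    pos = 0
    for match in IDENT.finditer(line):
        # Text between identifiers is escaped so that "<" or "&" stay literal.
        out.append(html.escape(line[pos:match.start()]))
        word = match.group(0)
        if word in known:
            out.append('<a href="%s?i=%s">%s</a>' % (base_url, quote(word), word))
        else:
            out.append(word)
        pos = match.end()
    out.append(html.escape(line[pos:]))
    return "".join(out)
```

Only identifiers present in the index are linked; any other word is emitted as plain text.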
Capabilities and limitations
Source code can be written in any language that Exuberant ctags can handle, but parsers are not equally fine-grained.
Two versions of the same file can be compared side by side with the differences highlighted (through the diff command launched by LXR).
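A side-by-side comparison can be approximated with the following Python sketch. LXR itself launches the external diff command; here the standard `difflib` module is swapped in so that the example is self-contained, and the function name `side_by_side` is hypothetical.

```python
import difflib
from itertools import zip_longest

def side_by_side(old_lines, new_lines):
    """Pair up the lines of two versions of a file.

    Each row is (tag, left, right), where tag is 'equal', 'replace',
    'delete' or 'insert'; a missing side is an empty string.
    """
    rows = []
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        left_block = old_lines[i1:i2]
        right_block = new_lines[j1:j2]
        for left, right in zip_longest(left_block, right_block, fillvalue=""):
            rows.append((tag, left, right))
    return rows
```

A renderer could then place `left` and `right` in two table columns and use `tag` to choose the highlight colour of each row.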
Besides hyperlinks under variables, a form allows searching for an identifier typed by the user.
To work around the limitations of the indexing phase, any character sequence may be searched for (full-text search) at the cost of an extensive traversal of the source files.
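The cost of such a traversal is easy to see in a naive Python sketch. It does not reproduce how LXR delegates full-text search to Glimpse or SWISH-E; it only shows a plain scan of every file, and the function name `full_text_search` is hypothetical.

```python
import os

def full_text_search(root, needle):
    """Scan every file under root and return sorted (relative path, line number) hits."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as handle:
                    for lineno, line in enumerate(handle, start=1):
                        if needle in line:
                            hits.append((os.path.relpath(path, root), lineno))
            except OSError:
                # Unreadable files (for example, wrong permissions) are skipped.
                continue
    return sorted(hits)
```

Every query reads the whole tree again, whereas an identifier lookup only consults the database built during indexing.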
The limitations of LXR are those of its support tools, mainly Exuberant ctags. In practice, however, most difficulties come from incorrect file access permissions.
Another limitation comes from the design choice to perform only static code analysis, in contrast to other solutions that perform semantic analysis as a compile step.
An advanced user may change the layout and rendering of LXR by customizing the page templates (written in HTML) and the cascading style sheet (CSS).
LXR collections
- LXR itself
- Linux kernel browsing
- LXR (formerly "the Linux Cross Referencer") (running the experimental LXRng fork provided by lxr.linux.no)
- Linux kernels browsing (ran a very old LXR version until 2017)
- Glibc 2.3.2
- (archive only shows directory structure - March 2016)
- Mozilla Cross Reference, for several projects from Mozilla.org Archived 2018-07-05 at the Wayback Machine
- LXR for Apache HTTPD
- (archive only shows directory structure - March 2016)
- (archive not available - March 2016)
References
- According to dates in SourceForge's CVS repository
- "LXR Cross Referencer Open Source Project on Open Hub: Contributors".
- Kamran Soomro (June 1, 2007). "Read source code the HTML way".
- "Dehydra". MDN Web Docs. Retrieved 2020-11-13.
- A finite state automaton usually scans text (or source code) from left to right without backtracking. Using regular expressions in Perl risks scanning the text several times, with spurious replacements in already processed fragments.
- It was initially possible to use BitKeeper, but support stopped (around 2005) when its license became proprietary.
- Git support has been fixed in release 1.0.