GitHub, a code-hosting website built atop the Git version-control system, hosts hundreds of millions of repositories of code uploaded by more than 65 million developers. The Semantic Code team at GitHub builds and operates a suite of technologies that power symbolic code navigation on github.com. Symbolic code navigation lets developers click on a named identifier in source code to navigate to the definition of that entity, as well as the reverse: given an identifier, they can list all the uses of that identifier within the project.
This system is backed by a cloud object-storage service, to which it migrated from a multi-terabyte sharded relational database, and serves more than 40,000 requests per minute across both read and write operations. The static analysis stage itself is built on an open source parsing toolkit called Tree-sitter, implements some well-known computer science research, and integrates with the github.com infrastructure in order to extract name-binding information from source code.
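To make the two navigation queries concrete, here is a minimal sketch of a name-based symbol index. It is an illustration only, not GitHub's implementation: it uses Python's standard `ast` module in place of Tree-sitter, handles only Python source, and matches identifiers purely by name without any scoping rules. The function name `index_symbols` and the sample program are hypothetical.

```python
import ast
from collections import defaultdict


def index_symbols(source: str):
    """Build a toy symbol index for one Python file.

    Returns (definitions, references), each mapping an identifier to a
    sorted list of 1-based line numbers. Matching is by name only; a real
    system would also have to respect scopes and imports.
    """
    definitions = defaultdict(list)
    references = defaultdict(list)
    for node in ast.walk(ast.parse(source)):
        # Function and class statements introduce a named definition.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            definitions[node.name].append(node.lineno)
        # A Name node in Load context is a use of an identifier.
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            references[node.id].append(node.lineno)
    return (
        {name: sorted(lines) for name, lines in definitions.items()},
        {name: sorted(lines) for name, lines in references.items()},
    )


# Hypothetical sample input for the sketch.
SAMPLE = '''\
def greet(name):
    return "hello " + name

print(greet("world"))
print(greet("again"))
'''

definitions, references = index_symbols(SAMPLE)
```

With this index, "go to definition" for `greet` is a lookup in `definitions`, and "find all references" is a lookup in `references`. Storing the two maps separately mirrors the two queries described above, but the actual storage layout of the production system is not something this sketch attempts to reproduce.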