Dashboard UI for the Code Scanner (GSoC task)

From Software Heritage Wiki

Revision as of 16:35, 13 February 2022

Introduction

The Software Heritage archive is the most comprehensive open data knowledge base about source code that has been published openly.

As such, it can be used to scan local source code bases and detect which parts of them come from public code, including Free and Open Source Software.

The Software Heritage scanner (swh-scanner) (documentation, code) is a command-line tool for doing so.

Task description

swh-scanner is currently an experimental tool that works well in practice but needs a real dashboard user interface to be useful. Several output options are currently available when invoking the swh scanner scan command, in particular batch output in textual and JSON formats, and an interactive dashboard (with the -i/--interactive option).

The interactive view currently works by producing a local HTML file and opening it using the local browser. The goal of this project is to improve the interactive view, making it a serious dashboard-style UI to peruse scanning results.

The following improvements are suggested, although more can be proposed (and even more could be discovered during the project work):

  • Technology: generating a local HTML file is not necessarily the best way to render results. Alternative solutions should be explored, including a self-hosted web app that renders results with state-of-the-art frontend web frameworks (HTML/CSS/JavaScript).
  • Scalability: rendering currently doesn't work when scanning large code bases such as the Linux kernel. Rendering should be made lazy, loading only the data that needs to be shown.
  • Functionality: dashboard rendering should be integrated with the possibility of opening the local source code files that have been scanned. For example, users will want to open, in the browser, files that have been detected as known/unknown in order to figure out why.
  • Functionality: in the future, additional information will be added to scanning results, including license and provenance information. While this is not yet available due to backend limitations, the proposed UI should plan ahead for how and where to display such information.
  • Paper cuts: various issues (https://forge.softwareheritage.org/tag/code_scanner/) affect the usability of swh-scanner; fixing them would be welcome as part of this project.
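
As an illustration of the lazy rendering suggested under Scalability, here is a minimal sketch in Python. It is not taken from swh-scanner: the LazyScanTree class, its children() method, and the flat {path: known} input mapping are all hypothetical names and formats chosen for this example. The idea is that a dashboard backend indexes the scan results once and then hands the frontend only the entries of the directory the user is expanding, rather than the whole tree.

```python
from collections import defaultdict
from pathlib import PurePosixPath


class LazyScanTree:
    """Serve scan results one directory level at a time (hypothetical helper)."""

    def __init__(self, results):
        # `results` is an assumed flat mapping {relative_path: known_flag};
        # the real swh-scanner output format may differ.
        self._known = dict(results)
        self._children = defaultdict(set)
        for path in self._known:
            node = PurePosixPath(path)
            # Link each path to its parent, then the parent to its own parent,
            # up to the root ("."), so any directory can be expanded later.
            for parent in node.parents:
                self._children[str(parent)].add(str(node))
                node = parent

    def children(self, directory="."):
        """Return only the immediate entries of `directory`."""
        entries = []
        for child in sorted(self._children.get(directory, ())):
            is_dir = child in self._children
            entries.append({
                "path": child,
                "type": "directory" if is_dir else "file",
                # Directories carry no flag of their own in this sketch.
                "known": None if is_dir else self._known[child],
            })
        return entries
```

A frontend would call children() for the root when the page loads, and again with a directory path each time the user expands that directory, so the amount of data rendered at once stays proportional to what is visible.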

Desirable skills

  • Python 3 and Git are a must to work on any Software Heritage project
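  • Basic understanding of the Software Heritage data model (https://docs.softwareheritage.org/devel/swh-model/data-model.html) and of SWHID identifiers (https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html)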
  • HTML/CSS/JavaScript and web development in general
  • Working knowledge of UI/UX design principles

Potential mentors

  • Stefano Zacchiroli <zack@upsilon.cc> (zack on IRC)