machine-data-hub/Roadmap

Roadmap

Near term

  • machinedatahub.ai site live
  • Metadata fully populated
    • Make sure the JSON schema is the same from dataset to dataset
    • Change "Dataset 1", "Dataset 2", etc. to "File 1", "File 2", etc.
  • Rebrand GitHub group/repos to match
  • (In Progress) Netlify Open Source plan application submitted
    • License
    • Code of Conduct at the top level directory of the project repository or prominently in the documentation (with a link in the navigation, footer, or homepage)
    • Must feature a link to Netlify service
    • (In Progress) Review that all conditions are met, fill out the form and submit
  • Nested dataset schema
    • each dataset can contain multiple files
    • break out per file metrics vs. dataset metrics
  • Submit a Dataset fully functioning
    • Front end form
    • Back end saves suggestion to GitHub API (preferred) or Postgres
  • machine-data-hub published to PyPI
    • unit testing runs on every push
    • Sphinx documentation pushes to Read the Docs on tag
    • library builds and pushes to PyPI on tag
    • release notes section added to Sphinx documentation
  • Blog functionality added to web app
    • blog content can be added to repo in markdown format
  • (In Progress) create a getting started page
    • Add general step by step process
    • Add why people should use it
    • Add python package section
  • Three documented examples of ML model built from a dataset
    • Get working ML model in notebook
    • Write blog post tutorial with example
    • Get feedback from LM mentors
    • Implement feedback from LM mentors and update on website
  • (In Progress) UW ML Course students use machine-data-hub as data source for class project
    • Talk to UW 416 Course Instructor and send out email about response
    • Follow up with Wes for Feedback
  • Web app receives a 90+ rating from [Lighthouse](https://developers.google.com/web/tools/lighthouse) for performance
    • (In Progress) Fix slow image loading
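
As an illustration of the "Nested dataset schema" and "Make sure the JSON schema is the same" items above, here is a minimal Python sketch. The class names, field names, and the `inconsistent_keys` helper are all hypothetical and are not taken from the machine-data-hub code base; they only show one way per-file metrics could be separated from dataset-level metrics, and how metadata records could be checked for matching keys.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class DataFile:
    """One file inside a dataset (hypothetical fields)."""
    name: str
    url: str
    size_bytes: int = 0  # per-file metric
    downloads: int = 0   # per-file metric


@dataclass
class Dataset:
    """A dataset that groups one or more files (hypothetical fields)."""
    name: str
    files: List[DataFile] = field(default_factory=list)
    likes: int = 0  # dataset-level metric

    @property
    def total_downloads(self) -> int:
        # Dataset-level metric derived from the per-file metrics.
        return sum(f.downloads for f in self.files)


def inconsistent_keys(records: List[dict]) -> Dict[int, Set[str]]:
    """Map each record index to the keys that differ from the first record.

    Records whose top-level keys match the first record are omitted.
    """
    if not records:
        return {}
    expected = set(records[0])
    return {
        i: expected ^ set(rec)  # keys missing from or extra in this record
        for i, rec in enumerate(records)
        if set(rec) != expected
    }
```

A check like `inconsistent_keys` only compares top-level keys; validating nested fields or value types would need a fuller schema definition.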

Longer term

  • machine-data-hub CLI does local ETL on at least three of the datasets
  • Web App automated end to end testing
  • Auth-N (Authentication) implemented
  • Up Voting datasets
    • mitigation plan for duplicate votes (i.e. require Auth-N to cast a vote)
    • Dataset content pre-rendered, only user interaction elements (upvote controls and counts) load after hydration
  • machinedatahub analytics (page views, dataset download counts) with Postgres
  • User trial with survey and reward to get feedback from potential users (possibly also used to incentivize the UW students mentioned above)
  • External user submits a new dataset
  • First pull request merged from non-original team member
  • Academic Paper Published
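
The duplicate-vote mitigation mentioned under "Up Voting datasets" could look roughly like the following Python sketch. It assumes that Auth-N supplies a stable user ID for each voter; `VoteStore` and its methods are hypothetical names, and the in-memory dictionary stands in for whatever persistent store (for example Postgres) the project ends up using.

```python
from typing import Dict, Set


class VoteStore:
    """In-memory stand-in for a persistent vote table (hypothetical)."""

    def __init__(self) -> None:
        # dataset ID -> set of user IDs that have already voted
        self._voters: Dict[str, Set[str]] = {}

    def upvote(self, dataset_id: str, user_id: str) -> bool:
        """Record a vote; return False if this user already voted."""
        if not user_id:
            # Assumption: anonymous users are not allowed to vote.
            raise ValueError("an authenticated user ID is required to vote")
        voters = self._voters.setdefault(dataset_id, set())
        if user_id in voters:
            return False  # duplicate vote, ignored
        voters.add(user_id)
        return True

    def count(self, dataset_id: str) -> int:
        """Number of distinct users who upvoted the dataset."""
        return len(self._voters.get(dataset_id, set()))
```

Storing one entry per (dataset, user) pair means the vote count is always the number of distinct voters, so a repeated request from the same user cannot inflate it.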

Maybe

  • Auth-Z (Authorization) - allow private datasets
