Let's Build the Ultimate Aspirin for the 7 Pains of Dataset Management [Open-source]

Hacktoberfest Online Meetup

Wednesday October 28, 2020
5:00pm to 7:00pm PDT
Event is hosted online

Event Description

Managing datasets can be a hassle, especially at large scale. With unstructured data in particular, storing, accessing, and version-controlling the data is hard. Davit Buniatyan, Activeloop CEO, battled with managing petabyte-scale data during his time at the Princeton Neuroscience lab. In this meetup, he will cover the critical stack needed to resolve the biggest pains of dataset management. He will also present the open-source Hub package, which he is building to be the SQL for images.

After a brief brainstorming session, we will get hacking: improving our free dataset visualization tool (https://app.activeloop.ai/datasets/explore) and the open-source Hub package (the fastest way to access and manage datasets for PyTorch and TensorFlow).

Can't make it to this online meetup? Hacktoberfest is virtual and open to participants from around the globe. Sign up to participate today.

Rules and Rewards

First, sign up on the Hacktoberfest site at https://hacktoberfest.digitalocean.com.

1+ merged PR: stickers
3+ merged PRs: T-shirt plus a choice of stickers or face mask
5+ merged PRs: T-shirt, stickers, and a face mask
Best Contributor Award: SONY WH-CH710 noise-canceling headphones
All contributors get a contributor badge!

To qualify for the official limited-edition Hacktoberfest shirt from the Hacktoberfest team, you must register and make four pull requests between October 1 and 31.

Event code of conduct

Hacktoberfest meetups are welcoming, open, and inclusive. Please read our Events Code of Conduct before attending. Happy hacking!

Who can attend?

Locations have different requirements for who can attend. This location is open to the following:


Welcome (5:00 - 5:05 PDT) - Intro to the Activeloop team and Hacktoberfest
Networking (5:05 - 5:10) - icebreaker
Dataset Management pains intro (5:10 - 5:17) - explain what issues we are solving
Intro to Hub - the fastest way to access and stream datasets for TensorFlow and PyTorch (5:17 - 5:25)
Visualization tool demo (5:25 - 5:30)
Brainstorming (5:30 - 5:45)
Get hacking (5:45 - 6:45) - teams are assigned based on the brainstorming
Show and tell (6:45 - 7:00)

Links & Resources


Chat platform where all the activities and fun happen.


Hosted by

Mikayel Harutyunyan and Davit Buniatyan

Activeloop (www.activeloop.ai) is a company backed by Y Combinator and a member of NVIDIA’s prestigious Inception program. Activeloop’s dataset management system streamlines data aggregation and preparation for data scientists and optimizes the training of machine learning models. The company’s open-source Hub package is the fastest way to access and manage datasets for PyTorch and TensorFlow. With the package, you can build scalable data pipelines in no time.