FiftyOne Computer Vision Tips and Tricks — Oct 14, 2022

Welcome to our weekly FiftyOne tips and tricks blog where we recap interesting questions and answers that have recently popped up on Slack, GitHub, Stack Overflow, and Reddit.

Wait, what’s FiftyOne?

FiftyOne is an open source machine learning toolset that enables data science teams to improve the performance of their computer vision models by helping them curate high quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster.

Ok, let’s dive into this week’s tips and tricks!

Sorting samples in a collection by fields or expressions

Community Slack member Sybil Lyu asked,

“Can I use SortBy with an expression in FiftyOne via the add stage button in the FiftyOne App?”

Today, the best way to sort by an expression is via Python. If you run some of the examples from the Docs, you’ll see the equivalent JSON that you’d need to type into the SortBy stage in the view bar of the App to create the view.
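As a rough sketch of what sorting by an expression can look like, the helper below builds a MongoDB aggregation expression that counts detections and passes it to sort_by(). It assumes a Detections field named ground_truth (a hypothetical name) and relies on sort_by() accepting a raw MongoDB expression as its documentation describes; the function names are made up for illustration.

```python
def num_detections_expr(field="ground_truth"):
    """Build a MongoDB aggregation expression that counts the detections in `field`.

    `$ifNull` substitutes an empty list when the field is missing, so `$size`
    always receives an array.
    """
    return {"$size": {"$ifNull": [f"${field}.detections", []]}}


def sort_by_num_detections(dataset, field="ground_truth", reverse=True):
    """Return a view of `dataset` sorted by detection count (largest first by default).

    `dataset` is assumed to be a FiftyOne dataset or view.
    """
    return dataset.sort_by(num_detections_expr(field), reverse=reverse)
```

In Python the same sort is more commonly written with a view expression, for example F("ground_truth.detections").length() after from fiftyone import ViewField as F.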

Connecting the FiftyOne client to MongoDB

Community Slack member Naman Gupta asked,

“Is it possible to connect to an already running local FiftyOne instance from a Jupyter notebook without having to spin up another FiftyOne server? I want to connect to the locally running instance and either load, filter or export datasets on the same machine.”

You can run MongoDB in a separate container and then configure your FiftyOne client in the Jupyter container to connect to it. Other than MongoDB, there’s no FiftyOne “server” in the open source package. FiftyOne Teams, on the other hand, provides a centralized MongoDB database and a FiftyOne App server allowing everyone on your team to easily load the same datasets in Python and the App.
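One way to point the client at an existing MongoDB instance is the FIFTYONE_DATABASE_URI environment variable. The sketch below assumes MongoDB is reachable at mongodb://localhost:27017 (a placeholder address) and that the variable is set before fiftyone is first imported; the helper names are hypothetical.

```python
import os

# Placeholder address; replace with wherever your MongoDB container is reachable
MONGODB_URI = "mongodb://localhost:27017"


def point_fiftyone_at(uri):
    """Set the database URI that FiftyOne reads from the environment."""
    os.environ["FIFTYONE_DATABASE_URI"] = uri


def list_shared_datasets():
    """Import FiftyOne lazily (after configuration) and list datasets in that database."""
    import fiftyone as fo  # requires `pip install fiftyone`

    return fo.list_datasets()


# Configure first; import fiftyone only afterwards
point_fiftyone_at(MONGODB_URI)
```

Every Python process or notebook configured with the same URI should then see the same datasets.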

Learn more about working with data and notebooks in the FiftyOne Docs.

Mapping ground truth labels in a bounding box problem

Community Slack member Raghav Mecheri asked,

“I’m trying to map a set of ground truth labels to a broader set of categories for a bounding box problem. For example turning bounding boxes that have labels for “audi”, “bmw”, “mercedes” all into “car”. I could iterate through each image as I load it, but I feel that there’s probably a “right” way to do this in FiftyOne — any good starting points?”

You can use map_labels() for this!

view = dataset.map_labels(...)

This gives you a view that dynamically renames the labels when you iterate over it or visualize it in the App. If you want to save the changes to the actual dataset, just add:

view.save()
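Putting the two calls together for the example in the question, a minimal sketch might look like the following. It assumes the boxes live in a field named ground_truth (a hypothetical name), and the helper and mapping names are made up for illustration.

```python
# Mapping built from the labels named in the question
BRAND_TO_CATEGORY = {"audi": "car", "bmw": "car", "mercedes": "car"}


def map_to_categories(dataset, label_field="ground_truth", mapping=None, persist=False):
    """Return a view of `dataset` whose labels in `label_field` are renamed via `mapping`.

    `dataset` is assumed to be a FiftyOne dataset or view. When `persist` is
    True, the renamed labels are also written back with `view.save()`.
    """
    mapping = BRAND_TO_CATEGORY if mapping is None else mapping
    view = dataset.map_labels(label_field, mapping)
    if persist:
        view.save()
    return view
```

Leaving persist=False keeps the original labels untouched, which is handy for checking the result in the App before committing to it.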

Merging samples and updating labels

Community Slack member Jason Barbee asked,

“I cloned a dataset, changed the ground_truth labels on samples, but running main_dataset.merge(working_dataset) doesn’t seem to overwrite my existing labels. Is there a replace_sample type API?”

By default, merge_samples() uses merge_lists=True, which means that list fields such as detections are merged element by element based on label ID, rather than the working dataset's labels overwriting all of the main dataset's labels. If you set merge_lists=False, the existing labels in those fields are discarded and only the labels from the dataset being merged in are kept.
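For the scenario in the question, the call could be sketched as below. It assumes samples are matched on filepath (the default key_field) and that only a field named ground_truth should be replaced; the helper name is hypothetical.

```python
def replace_labels(main_dataset, working_dataset, label_field="ground_truth"):
    """Overwrite `label_field` in `main_dataset` with the labels from `working_dataset`.

    Both arguments are assumed to be FiftyOne datasets. `merge_lists=False`
    replaces list fields wholesale instead of merging them by label ID.
    """
    main_dataset.merge_samples(
        working_dataset,
        key_field="filepath",  # match samples by their file path
        fields=[label_field],  # restrict the merge to the edited field
        merge_lists=False,
    )
    return main_dataset
```

Restricting the merge with fields is optional; it simply avoids touching other fields that may differ between the two datasets.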

Learn more about merge_samples in the FiftyOne Docs.

Using FiftyOne datasets with the PyTorch dataloader

Community Slack member Sidney Guaro asked,

“Is it possible to use a FiftyOne dataset in Pytorch dataloader?”

We have integrations with PyTorch Lightning Flash, as well as a Detectron2 tutorial. You can also integrate FiftyOne datasets directly into PyTorch dataloaders (check out this blog). The blog includes an example that sets up a torch dataset from FiftyOne.
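As a rough illustration of the idea (not the blog's code), the class below exposes a FiftyOne collection through the __len__ and __getitem__ methods that PyTorch's map-style datasets rely on. It assumes a Classification field named ground_truth with a label on every sample, and it takes an image loader as a parameter; the class name and its arguments are hypothetical.

```python
class FiftyOneClassificationDataset:
    """Map-style dataset over a FiftyOne collection (a sketch, not an official API)."""

    def __init__(self, samples, label_field="ground_truth", load_image=None):
        # Pull file paths and label strings out of the collection in bulk
        self.filepaths = samples.values("filepath")
        self.labels = samples.values(f"{label_field}.label")
        # Assign each class name a stable integer index
        self.classes = sorted(set(self.labels))
        self.class_to_idx = {name: i for i, name in enumerate(self.classes)}
        # By default the file path itself is returned in place of an image
        self.load_image = load_image if load_image is not None else (lambda path: path)

    def __len__(self):
        return len(self.filepaths)

    def __getitem__(self, idx):
        image = self.load_image(self.filepaths[idx])
        target = self.class_to_idx[self.labels[idx]]
        return image, target
```

In practice you would subclass torch.utils.data.Dataset, pass a load_image callable that opens the file and converts it to a tensor, and wrap the result in torch.utils.data.DataLoader.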

What’s next?

If you like what you see on GitHub, give the project a star
Get started! We’ve made it easy to get up and running in a few minutes
Join the FiftyOne Slack community; we’re always happy to help


Jimmy Guerrero