Exercises

👤 Individual

👤 Individual

📨 Deliverable 1 - Bags of Visual Words [25 pts]

Purpose of this Question: (i) Develop understanding and intuition for DBoW. (ii) Read a seminal work in the field.

Read the DBoW paper paper. Then provide answers to the following questions. When complete, your writeup should be 1+ page long – aim for each response to be 200-250 words. Be sure to give speific examples in your response and include pictures or diagrams as needed. Your goal is to convince the grader you deeply understand this topic.

Explain which components provide robustness against illumination and 3D viewpoint changes in a basic BoW-based place recognition system. Why?
Explain the purpose of Inverse Document Frequency (IDF) term (i.e., tf-idf). What would happen without this term and why?
- Hint: Additional resources for building intuition include the citation provided in the DBoW paper or this tutorial.
How does the vocabulary size in BoW-based systems affect the performance of the system? Consider performance metrics such as: (i) computational cost, (ii) precision, (iii) recall.
- Hint: How would adding words to the vocabulary change the ability to identify 2 documents/images are similar? Are more words always better?

📨 Deliverable 2 - Setting Up Docker [5 pts]

Purpose of this Question: (i) Make sure your computer is setup for next week. (ii) Ensure you are familiar with the core concepts of Docker. In an academic setting, the impact of releasing research code is more important than ever, and technologies like Docker are important creating reproducable results, especially in robotics.

On the next assignment, we will be running some code in Docker. Docker is such an important tool in modern software engineering it is worth being at least being familiar with it.

Follow the installation instructions listed here. Specifically, you should follow the instructions listed in the “Install Using the Repository” section. Attach a screenshot of your hello-world Docker image running on your computer.
Read through the following tutorial – you do not need to run the code (unless you want to)! In about 200-250 words total, answer the following questions:
- What is a Dockerfile?
- In the context of docker, what is an image?
- In the context of docker, what is a container?
- How might you use Docker as part of a research publication? Why might this be better than other methods?
- When might you not want to use Docker?

Yes, we know you could probably just get ChatGPT to generate reasonable answers to these questions, but you should really take the 20min to answer these questions for yourself.

📨 Deliverable 3 - Running YOLO w/ Ultralytics [15 pts]

Purpose of this Question: YOLO is a seminal work in our domain. Especially given how easy it is to install/run these days, everyone working in this space should know how to do this.

Since the introduction of YOLO (You Only Look Once) in 2015 (yes, a seminal work is named after a meme), YOLO and its evolutions (aptly named YOLOv2, YOLOv3, etc.) have become a staple in object detection, image sementation, etc.

These days, you can easily install YOLOv11 through the Ultralytics package through pip, docker, etc. Once installed, you can easily perform detection, image sementation, etc. through easy code wrappers (e.g., Python) or via command line.

Explain the following tasks in your own words:
- Detection
- Segmentation
- Classification
- Pose Detection
- Oriented Bounding Box Detection
- Hint: Checkout the official YOLOv11 documentation
Install YOLOv11 through the Ultralytics package. We recommend using pip since it is the most straight forward.
- Hint: If you install via pip to the user and your command line doesn’t find yolo, you might need to include ~/.local/bin on your terminal’s $PATH variable. You can do this by adding PATH="$PATH:~/.local/bin" to your ~/.bashrc file and resourcing with source ~/.bashrc.
- Hint: If yolo is not detecting your file, try giving an absolute or relative path. It appears the source= command line parameter does not expand ~ to $HOME.
Select some pictures of your choice (taking your own pictures is highly encouraged!) – don’t just use the example picture in the Ultralytics documentation (i.e., https://ultralytics.com/images/bus.jpg). Use YOLOv11 (we recommend the CLI since it is the most straight forward) to perform the 5 tasks outlined above. Make sure to select images that make it clear the system is working (e.g., an image without a person in it won’t make sense for pose detection). Include your processed images in your PDF – be sure to state which image corresponds with which task.
(Optional) Extra Credit (5 pts): Install and run YOLOv11 through the official Docker image. How did you “get your image” into the Docker image? How did you “get the processed image” out of the Docker image? Provide a short explaination and a screenshot(s) of it running on your computer.

[Optional] Provide feedback (Extra credit: 3 pts)

This assignment involved a lot of code rewriting, so it is possible something is confusing or unclear. We want your feedback so we can improve this assignment (and the class overall) for future offerings. If you provide actionable feedback to the Google Form link, you can get up to 3 pts of extra credit. Feedback can be positive, negative, or something in between – the important thing is your feedback is actionable (i.e., “I did’t like X” is vague, hard to address, and thus not actionable).

The Google Form is pinned in the course Piazza (we do not want to post the link publically, which is why it is not provided here).

Note that we cannot give extra points for anonymous feedback while maintaining your anonymity. That being said, you are still encouraged to provide anonymous feedback if you’d like!