Deliverables#
General remarks on the project reports#
Project reports should cover not only the final results but also the process and the decisions made along the way. We expect you to explain the reasons behind your final decisions, for instance why you chose X over alternatives such as Y.
Clarify the objectives and goals of your project. What do you want to do, and why are your questions important to us?
Provide a detailed description of the data you will use. Where were the data collected, and how were they compiled and preprocessed for your analysis? What are the data types of your focal features, and which features do you think are relevant to your analysis?
Determine the appropriate methods. Given the data types and the information you aim to present, which methods could be suitable? Discuss the methods used in previous studies, and explore the approaches others have taken when working with similar datasets.
Clarify the limitations and advantages of your approach. They stem from both the data and the methodology, and must be discussed in light of existing work. For instance, suppose you want to develop a link prediction algorithm for a social network based on the common-neighbor approach. What are the fundamental assumptions underlying the algorithm? When does it fail? What advantages does it have over alternatives such as graph neural networks?
Embrace failures. As Thomas Edison famously said, “I have not failed. I’ve just found 10,000 ways that won’t work.” A finished analysis often appears to follow a single path, but that path is only one of the many that people have tried, and many of the others turned out to be unsuccessful. It is crucial to try out multiple candidates and, more importantly, to document your failures and understand why they did not work. Consider using fake data, small subsets, mock-ups, and sketches. These help you iterate quickly and refine your approach, which ultimately leads to more successful outcomes.
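The common-neighbor example mentioned above can be tried on a tiny, made-up graph, which is also the kind of fake data that makes quick iteration cheap. The sketch below is only an illustration under stated assumptions: the edge list and the function names `build_adjacency` and `common_neighbor_scores` are hypothetical, and the score is the plain count of shared neighbors rather than any particular published variant.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set of neighbors}."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def common_neighbor_scores(adjacency):
    """Score every unlinked node pair by its number of shared neighbors."""
    scores = {}
    for u, v in combinations(sorted(adjacency), 2):
        if v in adjacency[u]:
            continue  # already linked, so there is nothing to predict
        scores[(u, v)] = len(adjacency[u] & adjacency[v])
    return scores


# Hypothetical toy friendship network, invented for illustration only.
edges = [
    ("ann", "bob"), ("ann", "cat"), ("bob", "cat"),
    ("bob", "dan"), ("cat", "dan"), ("dan", "eve"),
]

adjacency = build_adjacency(edges)
scores = common_neighbor_scores(adjacency)

# Rank candidate links: highest score first, ties broken alphabetically.
ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
for pair, score in ranked:
    print(pair, score)
```

On this toy graph the pair `("ann", "dan")` ranks first with two shared neighbors, while `("ann", "eve")` scores zero. The zero hints at one limitation worth discussing in a report: a pure common-neighbor score cannot suggest a link between two nodes that share no neighbors at all.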
Proposal#
The proposal should include the following sections:
Project Title
Team Members (1-4 people; keep in mind that a larger team is expected to accomplish more than a smaller one)
Abstract: A concise summary of your project.
Introduction: Provide motivation, background, and objectives for your project. Explain why it is important or interesting and why others should care. Review and discuss relevant existing works, particularly those that have inspired your project. Critique these works substantively. Remember, there is always a wealth of relevant work available.
Questions or Objectives: Specify the methods you plan to create and what you hope to discover from the data.
Datasets and Methods: Identify the dataset you will be using. If you have not yet obtained one, I strongly encourage you to reconsider your project, because obtaining and cleaning datasets can be time-consuming. Describe the dataset, including its structure and data types if it is tabular. Explain the methods you plan to apply and why you have chosen them. Finally, provide enough detail about the dataset to argue convincingly that it is suitable for your project and proposed methods.
References
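For the Datasets and Methods section, a quick summary of each column's inferred type and missing values is one way to back up the description of a tabular dataset. The following is a minimal sketch using only the Python standard library. The sample CSV, its column names, and the helpers `infer_type` and `summarize_columns` are all made up for illustration, and the type inference is deliberately crude.

```python
import csv
import io
from collections import Counter


def infer_type(value):
    """Classify a raw CSV cell as 'missing', 'int', 'float', or 'str'."""
    text = (value or "").strip()
    if text == "":
        return "missing"
    for caster, name in ((int, "int"), (float, "float")):
        try:
            caster(text)
            return name
        except ValueError:
            continue
    return "str"


def summarize_columns(rows):
    """Return {column: Counter of inferred cell types} for dict rows."""
    summary = {}
    for row in rows:
        for column, value in row.items():
            summary.setdefault(column, Counter())[infer_type(value)] += 1
    return summary


# Made-up sample standing in for a real file. With real data, pass an
# open file object to csv.DictReader instead of this in-memory string.
sample_csv = """user_id,age,city,income
1,34,Lyon,52000.5
2,,Osaka,61000
3,29,Lima,
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
summary = summarize_columns(rows)
for column, counts in summary.items():
    print(column, dict(counts))
```

A column that mixes `int` and `float` cells, such as `income` here, or one with many `missing` cells, is worth mentioning explicitly when arguing that the dataset suits the proposed methods.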
Final presentation#
Please create a 10-minute video (and adhere to the time limit) and upload it to YouTube. You may either publish it or make it unlisted. The video can be in any format you prefer. Make sure to include a thorough analysis while also making it interesting and enjoyable! The video will be evaluated based on three criteria: (i) the strength of the case you present, (ii) the quality of your analysis, and (iii) the production and delivery of your presentation.
Once you have completed your video, feel free to share it on Slack and receive feedback from your fellow students and instructors. It’s always beneficial to see what others have accomplished, so I highly encourage you to share your work!
Final report#
You will need to submit your code and a report on your work. Ideally, your code will be in well-documented Jupyter notebooks (e.g. see Peter Norvig’s notebooks or good Kaggle exploratory data visualization kernels).
The report has no minimum or maximum length, but you need to make sure all the topics are thoroughly addressed in clear writing. The format and ingredients for the final report will depend on the types of projects that you do.
If the project is more about creating a software package or a website, then the report may focus more on the technical aspects of the project.
Idea sketch template#
The following is the list of questions I personally use before starting a project. Every idea is nebulous when it first comes to mind. We can materialize it by writing it down. Writing an idea down for the first time is surprisingly hard, and you will realize a lot of things in the process. In sum, writing is thinking. The list serves as scaffolding for thinking through a research project. It is a living document that you will update constantly as the project progresses.
Answer each question in 2 to 3 sentences. I usually set a timer for 15 minutes per question. If a question takes more than 15 minutes, that points to a weakness of the idea in its current form.
Project Overview: What is the core focus of your project? Are you developing something new or testing existing ideas?
Project Value: What makes this work meaningful and worth pursuing?
Research Gaps: What key questions or problems remain unsolved in this area?
Novel Approach: What makes your proposed solution unique and different from existing methods?
Necessity: Why develop a new solution if existing methods exist? What advantages does your approach offer?
Success Metrics: How will you define and measure success for this project?
Validation Strategy: What specific criteria or tests will demonstrate that your solution works?
Broader Impact: How could this work benefit fields beyond your immediate research area?
Implementation Plan: Break down each project goal into ~3 concrete, actionable tasks.