Ochirsaikhan's Research Notebook #16
9-18-2023
Completed
Prompt structure 1
ChatGPT Answer 1
Prompt structure 2
ChatGPT Answer 2
Prompt structure 3
ChatGPT Answer 3
Each prompt structure produced a completely different answer from ChatGPT; even a small tweak to the prompt's structure would change the response entirely.
To Do
Week 4 & 5
During Weeks 4 and 5, I met multiple times with my first comp reader, Prof. Kapfhammer, to further discuss the feasibility of my previous senior thesis idea, polish it, and brainstorm new ideas. After meeting with Prof. Kapfhammer, I noticed that my previous idea lacked a clear direction. However, after reading papers he suggested, I narrowed down my comp idea further: I will now focus on fine-tuning LLMs such as OpenAI's ChatGPT, Anthropic's Claude, and Google's Bard, and run an experimental analysis of which customized/fine-tuned model produces the best results in explaining a piece of code to novice programmers. Before, I was focusing only on fine-tuning the model; now, I'm more interested in running experiments with real students to find out whether a fine-tuned LLM can help them understand a piece of code better than a general LLM or plain Googling. My experiment will likely involve students using three tools so that I can compare their effectiveness: my fine-tuned model, a general LLM, and plain old Googling. The research papers I read include the following:
After reading two papers written by Philip Guo, a leading researcher in areas spanning human-computer interaction, data science, programming tools, and online learning, I learned that ChatGPT is a really good "doer" but not a good "explainer". Therefore, I decided to improve the explaining power of ChatGPT and other LLMs by fine-tuning/customizing them with an additional training set, and to research whether such a model is more effective at explaining code. Next, I'll focus on reading papers and doing more research on fine-tuning a Large Language Model, which will involve reading OpenAI's documentation and finding other scholarly articles about customizing an LLM. Moreover, during Week 5, my mentor notified me that he will connect me with a leading researcher on LLMs, and I'm hoping that our meeting will be fruitful for discussing the feasibility and quality of my senior thesis idea.
Week 6 & 7
During Weeks 6 & 7, my progress was somewhat hindered by the flu, but I still managed to make significant strides in my project. I continued my research on fine-tuning ChatGPT-3.5 Turbo, delved into OpenAI's Fine-Tuning documentation, and started experimenting with the model using a $10 credit. Additionally, I successfully connected to the OpenAI API through my API key and demonstrated a working prototype in my project repository.
Research and Learning
3. **Practical Experimentation**: To facilitate practical experimentation with ChatGPT-3.5 Turbo, I purchased a $10 credit for access to the model. This will enable me to test and fine-tune the model for my specific project requirements.
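As a concrete starting point for that experimentation, OpenAI's fine-tuning endpoint for gpt-3.5-turbo expects training data as JSONL, one chat-formatted example per line. A minimal sketch of preparing that file (the example data here is illustrative, not my actual training set):

```python
import json

# Illustrative fine-tuning examples in OpenAI's chat JSONL format; this is a
# hypothetical dataset, not the one used in the project.
examples = [
    {
        "messages": [
            {"role": "system", "content": "You explain code to novice programmers."},
            {"role": "user", "content": "Explain: [x * 2 for x in nums]"},
            {
                "role": "assistant",
                "content": "This list comprehension builds a new list by "
                           "doubling each element of nums.",
            },
        ]
    },
]

def to_jsonl(records):
    """Serialize records as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(record) for record in records)

jsonl = to_jsonl(examples)
```

The resulting string would then be written to a file and uploaded with the fine-tuning API before a training job is created.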
Next Steps
With these recent developments and my continued commitment to research and experimentation, I am making steady progress towards my senior thesis goal of enhancing the explaining power of ChatGPT-3.5 Turbo and similar language models. I am excited about the potential impact of my project and look forward to the challenges and opportunities that lie ahead.
On November 6th, 2023, OpenAI announced the introduction of GPTs, a more tailored version of ChatGPT. This solves the problem of sharing structured prompts: with GPTs, each GPT works from its internal prompt, meaning we no longer have to copy and paste long prompts every time we want ChatGPT to perform a specific task. OpenAI also announced the GPT Store, where people can share their custom GPTs and potentially make money if many people use them. So, there is a chance for me to incorporate customized GPTs into my senior thesis project. According to OpenAI, "GPTs will continue to get more useful and smarter, and you'll eventually be able to let them take on real tasks in the real world." Developers can also connect GPTs to the real world through APIs, which I can explore to make my senior thesis project more comprehensive. Moreover, ChatGPT Plus (GPT-4) now includes fresh information up to April 2023 instead of January 2022 (GPT-3.5 Turbo).
Research Article
I wanted to get an expert's opinion on GPTs, so I read an article by Ethan Mollick, a professor at the Wharton School of the University of Pennsylvania who studies what our new "AI-haunted" era means for work and education. From the article, I understood that GPTs are currently a powerful tool for sharing a "GPT" that someone built with their own prompt. Essentially, they are a more convenient way to share structured prompts, which are programs written in plain English that can get AI to do useful things. Even though a GPT can learn from a proprietary dataset, it can still hallucinate and make information up. This is just the start of an era; in the future, more GPTs and agents will have access to more systems, meaning they can act more on their own. Since this is a new technology, I'm curious about how I can use it productively in my senior thesis project.
Custom GPTs vs Fine-Tuning
What's the difference?
According to a user named "Paul.lene" on OpenAI's community forum:
The one misleading difference between fine-tuning and GPTs or other agents/bots/AIs (they can be called a multitude of ways) is that in both cases you give it new data. It is really the way the data is used by the AI that makes the big difference. In the first case the AI is modified in its core, while in the second case it is really about providing instructions to guide the existing AI (without modifying the core). Why do you want to use one or the other? Context & cost. Both methods aim to specialize an AI, but the outcomes are different. Fine-tuning is more complex and expensive: you need new quality data that will lead to consolidation of knowledge for a tiny part of the AI system; you literally improve the system in a very incremental way (if done correctly). The custom GPTs approach is much less expensive and more accessible (no-code/low-code: everybody can do it). You do not improve the system, but you activate the proper part of the AI brain to get the best out of it.
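The "instructions, not new weights" point above can be sketched in a few lines: a custom GPT effectively prepends fixed instructions (and any uploaded knowledge) to every request, leaving the base model untouched. The function and example text below are purely illustrative, not OpenAI's internal mechanism:

```python
# A custom GPT, conceptually: fixed instructions plus optional reference
# material are injected as the system message on every request, while the
# underlying model's weights stay unchanged.

def build_custom_gpt_request(user_message, instructions, knowledge=()):
    """Assemble a chat request the way a custom GPT effectively does."""
    system = instructions
    if knowledge:
        system += "\n\nReference material:\n" + "\n".join(knowledge)
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_message},
    ]

messages = build_custom_gpt_request(
    "Explain this stack trace.",
    instructions="You are a patient tutor for novice programmers.",
    knowledge=["Course glossary: a stack trace lists active function calls."],
)
```

Fine-tuning, by contrast, would bake such behavior into the model itself through a training job, with no per-request instructions required.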
Hello @Ochirsaikhan, your research notebook does not meet several of the baseline requirements for research notebooks in CMPSC 600. Please meet with me during the start of the Spring 2024 semester so that we can develop a plan for improving your work in this area. I encourage you to refer to the syllabus for more details about the requirements for a research notebook in a computer science senior thesis.
February 20, 2024
Last week, I met with Prof. Kapfhammer and my academic advisor, Prof. Luman, to discuss the shift in my senior thesis project. After much contemplation and consideration, I decided to pivot my senior thesis idea from Code Explanation with Large Language Models to a Comprehensive Stock Trading System with Large Language Models. I am fully aware that switching ideas this late in the semester is risky and might complicate the timeline of my senior thesis project. However, I firmly believe the potential benefits outweigh the challenges for the following reasons:
After talking to my comp readers and my academic advisors about the aforementioned reasons, they approved my new idea. Moreover, I discovered that my true passion lies in the stock market, programming, and AI rather than in code explanation and pedagogy; since the pivot, I have actually started to enjoy working on my senior comp. With this newfound passion, I hope to catch up with the current course schedule by March 11, 2024 (after Spring Break), and I'm planning to fully commit to catching up on my comp during Spring Break.
Completed
TODO
New Goal 🚀: By March 11, revise Chapters 1 & 2, finish writing Chapters 3 & 4, and catch up with the course.
February 22, 2024
Meeting with JJ 🖥️
I had a productive meeting with my second comp reader today. I introduced the new focus of my comp idea, and JJ provided clear guidance on my next actionable steps so that I can catch up with the course by March 11. She suggested that I focus on building the artifact and directed me on how to conduct my experiment and evaluation. Specifically, I will focus on building a comprehensive trading system, backtesting it on historical data to test its long-term trading prospects, and using it in real life to test its short-term prospects. JJ also gave me specific guidance on how to test it. Overall, the meeting was really productive, and I cleared up my confusion on a lot of things.
Completed ✅
TODO 🔨
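The historical-evaluation path JJ suggested can be prototyped with a tiny backtest loop. The prices and signals below are synthetic placeholders standing in for real historical data and LLM-generated trading signals:

```python
# Minimal backtest sketch for a signal-driven strategy: hold the asset while
# the signal says "buy", move to cash on "sell", ignore "hold".

def backtest(prices, signals, cash=1000.0):
    """Return the final portfolio value after replaying signals over prices."""
    shares = 0.0
    for price, signal in zip(prices, signals):
        if signal == "buy" and shares == 0.0:
            shares = cash / price
            cash = 0.0
        elif signal == "sell" and shares > 0.0:
            cash = shares * price
            shares = 0.0
    # Mark any remaining position to market at the last price.
    return cash + shares * prices[-1]

prices = [100.0, 110.0, 105.0, 120.0]
signals = ["buy", "hold", "hold", "sell"]
final = backtest(prices, signals)  # buys at 100, sells at 120 -> 1200.0
```

A real evaluation would add transaction costs and compare the result against a buy-and-hold baseline over the same period.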
February 27, 2024
Meeting with Prof. Kapfhammer 🖥️
I met with my first comp reader, Prof. Kapfhammer, today, and we talked about the implementation details of my senior thesis project. Specifically, we discussed how I should store all of my relevant data, such as company news, company fundamentals, and macroeconomic news, then retrieve it, aggregate it, and give it to the LLM. He recommended I use vector databases and embeddings in the last step to ensure the LLM has the most relevant information. We also talked about the token limit on the LLM's context, which is around 8,192 tokens for GPT-4. So, I have to implement logic that keeps the user input plus the retrieved context from overflowing the token limit of the given LLM. Lastly, we discussed what I can incorporate into the future work section of my comp and how I would experimentally evaluate my model on historical data and in real life.
Completed ✅
TODO 🔨
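The retrieval-plus-token-budget step Prof. Kapfhammer described can be sketched end to end. The bag-of-words "embeddings" below are a toy stand-in for a real embedding model and vector database, and the ~4-characters-per-token estimate is a rough heuristic (a real implementation would use an actual tokenizer):

```python
import math

def embed(text):
    """Toy embedding: a word-count vector, standing in for a learned embedding."""
    vec = {}
    for word in text.lower().split():
        vec[word] = vec.get(word, 0) + 1
    return vec

def cosine(a, b):
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def estimate_tokens(text):
    # Rough heuristic: roughly 4 characters per token for English text.
    return len(text) // 4 + 1

def build_context(query, documents, budget_tokens):
    """Pick the documents most similar to the query that fit the token budget."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    context, used = [], 0
    for doc in ranked:
        cost = estimate_tokens(doc)
        if used + cost <= budget_tokens:
            context.append(doc)
            used += cost
    return context

docs = [
    "Microsoft reported strong quarterly earnings.",
    "Weather forecast: sunny with light winds.",
]
# Only the Microsoft article fits the 12-token budget.
context = build_context("Microsoft earnings news", docs, budget_tokens=12)
```

In the real system, the budget would be the model's context window (e.g. 8,192 tokens for GPT-4) minus the tokens consumed by the user's question and the prompt template.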
March 05, 2024
Progress Note 👨🏻💻
I've been working on my artifact and making progress.
Completed ✅
TODO 🔨
March 06, 2024
Progress Note 👨🏻💻
Today, I focused on researching the best Stock Market News & Fundamentals APIs other than EODHD to get exposure to other API providers. I've found a few other compelling stock API providers, such as Alpha Vantage and Polygon. These APIs will be the data source my system uses to make decisions, so the quality of the API is almost directly correlated with the quality of the system; therefore, I need to be careful and thoughtful in choosing the best provider. Moreover, I've dug deeper into the LangChain and OpenAI documentation. Specifically, I learned more about RAG (Retrieval-Augmented Generation) and building complex Agents with Chains. In the upcoming few days, I need to focus on building the pipeline for the system.
Completed ✅
TODO 🔨
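At its core, the RAG pattern I've been reading about is "stuff the retrieved documents into a prompt template before calling the LLM". A minimal sketch without LangChain (which provides ready-made chains for exactly this; the template wording here is my own illustration):

```python
# RAG prompt assembly: retrieved documents are joined into a context block
# and inserted into a fixed template alongside the user's question.
PROMPT_TEMPLATE = (
    "Answer the question using only the context below.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}"
)

def make_rag_prompt(question, retrieved_docs):
    """Combine retrieved documents and a question into a single LLM prompt."""
    context = "\n---\n".join(retrieved_docs)
    return PROMPT_TEMPLATE.format(context=context, question=question)

prompt = make_rag_prompt(
    "Did Microsoft beat earnings expectations?",
    ["Microsoft reported earnings above analyst expectations."],
)
```

The finished prompt would then be sent as the user message of a chat-completion request.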
March 07, 2024
Progress Note 👨🏻💻
I have explored in depth what each stock & financial market API provider offers. I now have API keys for Polygon, Alpha Vantage, Finnhub Stock API, and EODHD. I compared each provider's strengths and weaknesses and identified the areas in which each one excels. For example, I found that only EODHD provides stock news articles in their full length, which is crucial for letting the LLM summarize them on its own rather than relying on summaries already provided by the API. So, I've purchased EODHD's premium All-in-One Package to get access to their endpoints and make progress on my artifact. Moreover, the quality of the API response also differs depending on the endpoint. For instance, I found that when queried with "MSFT" as the ticker, Alpha Vantage's news API returned an article that mentioned Microsoft only once (and only Microsoft Teams, which is the product, not the company), meaning that their API was not returning relevant and valid news for my system to use. That's why I've chosen EODHD for the news-summarizer section of my system.
Completed ✅
TODO 🔨
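The Alpha Vantage problem above suggests a simple defensive check worth adding to the pipeline: reject articles that barely mention the company. A sketch with a hypothetical mention-count heuristic (the threshold and example texts are mine, not from any provider):

```python
# Relevance filter for news articles: an article that mentions the company
# only once (possibly just via a product name, like "Microsoft Teams") is
# probably not useful input for the trading system.

def is_relevant(article_text, company_names, min_mentions=2):
    """Count case-insensitive mentions of any of the company's names."""
    text = article_text.lower()
    mentions = sum(text.count(name.lower()) for name in company_names)
    return mentions >= min_mentions

good = ("Microsoft raised guidance. Analysts expect Microsoft's cloud "
        "segment to keep growing.")
bad = "The meeting moved to Microsoft Teams after the video call dropped."

assert is_relevant(good, ["Microsoft", "MSFT"])      # two mentions
assert not is_relevant(bad, ["Microsoft", "MSFT"])   # one product-name mention
```

A production version might also check that the ticker appears in the article's tagged symbols, but even this crude count would have flagged the irrelevant Alpha Vantage result.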
Hello @Ochirsaikhan, please note that the minimum number of research notebook entries for the Spring 2024 semester was 8, and you had 6 in total.
This is the start of my research notebook for my senior comp.