Posts

Week of 4/24

Throughout the weeks, we worked to refine our knowledge of Q-learning and all that jazz. Mr. Lee asked Elaine to help us create an aesthetically pleasing and functional dots and boxes interface. We saw the initial version, and to say that it's impressive would be an understatement. It looks really nice and well-made, and it works very well, so honestly, I have no room to complain. There were a few aesthetic improvements we wanted to touch up, and I'm honestly so thankful to Elaine for her help. The game is going great and I'm excited to see what we can work on in the future when we integrate the interface with the actual game itself. It's gonna be epic.

Here's what it looks like right now. Exciting, right?

Since I kind of have to show y'all the code, here's an excerpt:

Whoever told us that Q-learning only took 1-2 lines of code freaking lied to us. There's variables to initialize, moves to define, ...

Week of 4/10

As per my last blog post, no one showed up to the last Caltech meeting, so we really did not have much instruction as to what we should be doing next. We had a lot of questions about implementing Q-learning and understanding its reward function, and these questions are critical to turning our knowledge into code for the actual project. Therefore, we had to find a way to communicate with either the professor or the grad student soon in order to proceed with our project. In an effort to get the questions answered as soon as possible, I emailed Dr. Hassibi our questions and asked whether he could come in sometime and help answer them. While we waited for him to respond, we still tried to delve a little deeper into what we already had. Whoever told us that Q-learning only took 1-2 lines of code freaking lied to us. There's variables to initialize,...

Week of 4/2

Hi pals! Welcome to another installment of Keeping it Up With the Beavers. Just kidding. But I come bearing some good news! After months and months of waiting, we finally got our Caltech emails and logins. Even though there's only like two months before school ends, I think that there's a lot of things I can do with my email before they expire it. For instance, I already obtained free Mathematica and Wolfram Alpha accounts. I also got a Caltech VPN! This is honestly pretty exciting-- and I'm planning to file for an ID card so I can brag about it to my friends. Highly exciting!

On Thursday, we had Club Rush. The campus was closed, and the bus didn't initially come. I had to help with my club, National Arts Honors Society (NAHS), while we were waiting for the bus, and my fingers were covered in chocolate dust from the tiramisu tortes. Mr. Lee had originally arranged for a 12:50pm rendezvous time for the bus, so I planned to ...

Week of 3/27

Since no one showed up to the meeting last Thursday, we didn't really get the guidance we wanted to go further with our assignment. But no worry! We decided to continue to work on our projects. I'm currently trying to figure out the code we have, and I want to expand the game from a 4x4 grid to a 5x5 grid. This requires that I redo the Q-learning table, which is surprisingly difficult to do. Everyone told me that it'd just be two lines of code, but actually it's like 500 lines of code that are configured to the specific game/function that the algorithm has to use to generate the Q-values. This is really complicated and I have no idea what I'm doing. I'm trying to dissect the code I have right now and see how I can modify it to create new Q-values, but there's a new error every time I try to run it. Sigh. This is kind of rough.

Due to the fact that these few weeks have mostly just been us trying to understand the code, t...
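For what it's worth, the "two lines" people mention are probably just the core Q-value update; everything else is game-specific plumbing. Here is a minimal sketch of that update, not our actual project code. The names `q_table`, `q_update`, `ALPHA`, and `GAMMA` are made up for illustration, the learning rate and discount values are arbitrary, and it ignores the dots and boxes rule that completing a box gives you another turn.

```python
from collections import defaultdict

ALPHA = 0.1  # learning rate (assumed value, not from our project)
GAMMA = 0.9  # discount factor (assumed value, not from our project)

# Q-table: maps (state, action) pairs to estimated values, defaulting to 0.0.
q_table = defaultdict(float)

def q_update(state, action, reward, next_state, next_actions):
    """Apply one tabular Q-learning update and return the new Q-value."""
    # Best value reachable from the next state; 0.0 if the game is over.
    best_next = max((q_table[(next_state, a)] for a in next_actions), default=0.0)
    # These two lines are the "two lines of code" everyone talks about.
    td_target = reward + GAMMA * best_next
    q_table[(state, action)] += ALPHA * (td_target - q_table[(state, action)])
    return q_table[(state, action)]
```

The states and actions here can be anything hashable, such as a tuple of drawn edges and an edge index. Defining those, plus the legal moves and rewards, is where the other few hundred lines go.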

Week of 3/11-- Technical Journal

Our original goal for this week (these two weeks?) was to find a dataset we could use to train a supervised learning algorithm to play tic-tac-toe. We planned to create the game and then have Mr. Lee's 4th period AP Computer Science class play it to generate the game states and the data for the algorithm to learn from. This, however, turned out to be more trouble than it was worth. So we went online to find some datasets of tic-tac-toe game states, which would make the supervised learning process much easier. However, the data that we found was either formatted really weirdly or only represented endgame states. This created some problems. First, endgame data alone won't teach the algorithm anything, since it doesn't give the algorithm any information about optimal game states or the plays/actions that it can make. Additionally, the strange formatting on some of the data made it unusable for us. We had to format the dat...

Week of 3/7

Woah, hello there pals! This week (or these two weeks, I suppose) has been quite interesting. I've settled into my new home (after staying for what felt like weeks at my aunt's apartment), and after throwing some stuff out, I finally found space in my room (and in the house). But never mind that. Let's talk about the progress I've made so far with my super epic research. While I was expanding my knowledge base of Q-learning, I discovered DQN. DQN is also known as Deep Q-Learning. It's kind of like Q-learning but it uses neural networks, hence the name "deep". I think that it's a much better fit for us than regular Q-learning, since regular Q-learning takes up way too much memory storing every game state and every game state/action pair. Robert says that he's looking into a deep Q-learning specific application for our dots and boxes game, but that's the extent of our progress on it so far. Deep Q- Learning. Regular Q-le...
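To put a rough number on the memory problem, here is a small sketch that counts the edges on a dots and boxes board and gives an upper bound on how many entries a full Q-table could need. The function names are made up for illustration, and I'm assuming the board size is measured in boxes and that a state is just "which edges are drawn" (ignoring scores and whose turn it is). If your grid size counts dots instead, pass one less in each direction.

```python
def edge_count(box_rows, box_cols):
    """Number of drawable edges on a board of box_rows x box_cols boxes."""
    horizontal = (box_rows + 1) * box_cols  # rows of horizontal edges
    vertical = box_rows * (box_cols + 1)    # columns of vertical edges
    return horizontal + vertical

def max_table_entries(box_rows, box_cols):
    """Upper bound on (state, action) pairs for a full Q-table."""
    edges = edge_count(box_rows, box_cols)
    states = 2 ** edges    # each edge is either drawn or not drawn
    return states * edges  # at most one action per edge in each state

for size in (2, 3, 4, 5):
    print(size, edge_count(size, size), max_table_entries(size, size))
```

Each extra edge doubles the number of possible states, so the bound blows up fast as the board grows. That's the reason a neural network that approximates the Q-values looks more attractive than storing all of them.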

Week of 2/20

Dearest welcome, humble reader. Today, I will be discussing my progress on the dots and boxes game.

Our group split into two research groups-- one with me, Robert, and Aston, and the other with Robert, Edmond, and Puja. The other group focused on building a working game using Python while our group focused on researching reinforcement learning. Initially, we didn't really have much of a direction in what to research, but after a while, we found some hope. From my research, I learned that reinforcement learning is pretty applicable to building a dots and boxes player. However, the biggest question is what algorithm to use and how to implement it. I did some preliminary research and I stumbled on Q-learning. Q-learning is a pretty interesting algorithm-- and although it's not incredibly practical for us to use it for our game, I think that the algorithm ...