Latest Posts
Using GPT for Qualitative Analysis
A new topic we are investigating at my company is Topic Generation and Classification. This is an extensive experimentation process on Topic Modelling and GPT-3.5/4 for qualitative analysis. I start with topic modelling: how well can we actually describe topics using a simple BERTopic model, which is currently a state-of-the-art topic model? We then investigate how strong this model actually is compared to our human experts. After discussing the strengths and weaknesses of this approach, we investigate how GPT can help us improve this performance....
Tackling Unconventional Data: A Guide to Optimizing Sentiment Analysis Models for Atypical Text
I recently had the opportunity to develop a sentiment analysis tool for my company. Although I had some prior experience in this area, I quickly realized that I had more to learn. After extensive research and experimentation, we achieved the desired results. In this post, I’ll share my journey, thought process, and the techniques I employed to meet our objectives. Identifying the Issue & Setting the Stage Our startup specializes in delivering top-notch qualitative coding services to businesses, presenting the results on a user-friendly dashboard for our clients....
Object Detection ABCs - Setting Up Metrics
These are my notes on refreshing my object detection knowledge. We will start with bounding boxes for localization and cover everything we need before jumping in to implement YOLO algorithms. This tutorial answers the following questions: What is localization? What are a bounding box and a sliding window? How do we measure the success of a predicted bounding box (Intersection over Union)? How do we get rid of extra bounding boxes (non-max suppression)?...
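As a quick illustration of the Intersection over Union metric mentioned in this excerpt, here is a minimal sketch in plain Python. The function name `iou` and the `(x1, y1, x2, y2)` corner format are assumptions made for this example, not necessarily the conventions used in the full post.

```python
def iou(box_a, box_b):
    """Intersection over Union for two axis-aligned boxes.

    Assumption: each box is (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    """
    # Corners of the overlapping rectangle (if there is one).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Width and height are clamped at 0 so disjoint boxes give no overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1).
overlap_score = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

A score of 1.0 means the predicted box matches the ground truth exactly, and 0.0 means the boxes do not overlap at all.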
NLP Series: ABC of Sentiment Analysis
GitHub Repo | Full-code notebook. In this post, I will work my way through basic Sentiment Analysis methods and experiment with some techniques. I will use the IMDB review dataset acquired from Kaggle. We will go over the following: data preprocessing for sentiment analysis; two different feature representations (sparse vector representation and word frequency counts); and a comparison using logistic regression and Naive Bayes. Feature Representation: Your model will be, at most, as good as your data, and your data will be only as good as your understanding of it, hence the features....
Lecture Notes - Makemore Part 3: Activations & Gradients, BatchNorm
This post includes my notes from the lecture “Makemore Part 3: Activations & Gradients, BatchNorm” by Andrej Karpathy. Video link: https://www.youtube.com/watch?v=P6sfmUTpUmc Initialization. Fixing the initial loss: the initial loss should start at its expected value (which depends on the problem); in our case, that is the loss of a uniform probability distribution. When initializing, make sure the numbers do not take extreme values (for example, scale them by 0.01). Do not initialize to 0. Having a lot of 0s and 1s in softmax is really bad, since the gradient will be 0 (vanishing gradient)....
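To make the "uniform probability" point concrete, here is a small sketch in plain Python of the loss expected at initialization. The helper `cross_entropy` and the value `n_classes = 27` (a character vocabulary of 26 letters plus one special token) are assumptions for this example. Note that the equal values below are the output logits, not the weights.

```python
import math


def cross_entropy(logits, target):
    """Negative log-likelihood of `target` under softmax(logits)."""
    # Subtract the max before exponentiating for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


n_classes = 27  # assumption: 26 letters + 1 special token

# Equal logits give a uniform distribution, so the loss is log(n_classes).
uniform_loss = cross_entropy([0.0] * n_classes, target=0)

# One extreme logit on the wrong class gives a much larger initial loss.
confident_wrong_loss = cross_entropy([10.0] + [0.0] * (n_classes - 1), target=1)
```

If the loss measured at the first training step is far above `math.log(n_classes)`, the network is starting out confidently wrong, which is what scaling down the initial values is meant to avoid.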